[Openais] Split brain when using EVS library

Arne Eriksson R arne.r.eriksson at ericsson.com
Tue Sep 9 03:27:34 PDT 2008


Hi,
We have a cluster with 6 processors using openais stable version 0.80.3.

For some reason our cluster splits up into two rings.
The scenario is:
node1(n1) n2 n3 n4 n5 n6 are in the ring.

Suddenly the ring splits into two rings:
n1 n2 n3 got leave msg from n4 n5 n6
n4 n5 n6 got leave msg from n1 n2 n3

After a few milliseconds the two rings join again:
n1 n2 n3 got join msg from n4 n5 n6
n4 n5 n6 got join msg from n1 n2 n3

The two rings are then merged back into one ring:
node1(n1) n2 n3 n4 n5 n6 are in the ring.

The question is whether this is normal behaviour for EVS in the openais
implementation.

The problem is that the application needs to distinguish between two
kinds of joins: the "normal" join, where two rings/nodes join for the
first time, and the "abnormal" join, where a ring has split and
re-joined (without any nodes being restarted). The first case typically
requires only a sync of some nodes (bringing their history up to date).
The second case requires a merge, i.e. selecting a losing side, with
the loser discarding its history.
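
To make the distinction concrete, here is a rough application-level sketch
in Python (not openais code, and not the EVS API). It assumes each node
generates a per-restart identifier, called boot_id here, and announces it
in an application message after every join; EVS does not provide this by
itself, as far as I know. The names JoinClassifier, on_join and pick_loser
are made up for illustration, and the "smaller side loses, lowest node id
breaks ties" rule is just one possible policy.

```python
from dataclasses import dataclass, field


@dataclass
class JoinClassifier:
    """Remembers the last boot_id seen per node (hypothetical helper)."""

    # node id -> boot_id announced by that node the last time we saw it
    known: dict = field(default_factory=dict)

    def on_join(self, node_id, boot_id):
        """Classify a join as 'sync' (normal) or 'merge' (abnormal)."""
        previous = self.known.get(node_id)
        self.known[node_id] = boot_id
        if previous is not None and previous == boot_id:
            # Same incarnation came back without a restart: the ring
            # must have split and re-joined, so histories may diverge.
            return "merge"
        # Never seen before, or restarted since we last saw it.
        return "sync"


def pick_loser(side_a, side_b):
    """Return the side that should discard its history.

    Both sides evaluate the same rule on the same two sets, so they
    reach the same answer regardless of argument order.
    """
    if len(side_a) != len(side_b):
        return side_a if len(side_a) < len(side_b) else side_b
    # Equal sizes (e.g. 3 vs 3): the side holding the lowest id wins.
    return side_b if min(side_a) < min(side_b) else side_a
```

For the 3/3 split described above, pick_loser({1, 2, 3}, {4, 5, 6}) selects
n4 n5 n6 as the side that discards its history, and it gives the same
result when called with the arguments swapped.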

Best regards,
Arne

