[Openais] Split brain when using EVS library

Robert Wipfel RAWIPFEL at novell.com
Tue Sep 9 05:12:02 PDT 2008


>>> On 9/9/2008 at  4:27 AM, in message
<63E39ADA42BF8B49BEAE3666683A248407342715 at esealmw107.eemea.ericsson.se>, "Arne
Eriksson R" <arne.r.eriksson at ericsson.com> wrote:
> Hi,
> We have a cluster with 6 processors using openais stable version 0.80.3.
> 
> For some reason our cluster splits up into two rings.
> Scenario is:
> node1(n1) n2 n3 n4 n5 n6 are in the ring.
> 
> Suddenly the ring splits into two rings:
> n1 n2 n3 got leave msg from n4 n5 n6
> n4 n5 n6 got leave msg from n1 n2 n3
> 
> After a few milliseconds the two rings join again:
> n1 n2 n3 got join msg from n4 n5 n6
> n4 n5 n6 got join msg from n1 n2 n3
> 
> The two rings are joined into one ring again:
> node1(n1) n2 n3 n4 n5 n6 are in the ring.
> 
> The question is whether this is normal EVS behavior in the openais
> implementation.
> 
> The problem is that the application needs to detect the difference
> between two kinds of joins: The "normal" join where the two rings/nodes
> join for the first time and the "abnormal" joins where a ring has split
> and re-joined (without any nodes being restarted). The first case
> typically requires only a sync of some nodes (bringing the history up to
> date). The second case requires a merge, i.e. selection of a losing
> side, with the loser discarding its own history.

Sidebar: if a shared disk is available somewhere, it can serve as a
second communication channel for detecting split-brain conditions:
http://wiki.linux-ha.org/SBD_Fencing
The idea is for the partitions to share membership information over the
disk and so detect that a partition exists. Just a thought. Hopefully
nothing bad happened while the partitions were split in the second
case ;-)
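
For telling the two kinds of joins apart inside the application, one
possible approach is sketched below. It assumes each node announces an
"incarnation" value that changes only when the node restarts (a boot
counter, for instance). That value is hypothetical: it is not something
the EVS API hands you, so the application would have to exchange it
itself after each configuration change. JoinClassifier and its method
names are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class JoinClassifier:
    """Remembers the last incarnation seen per node (hypothetical helper)."""

    # node id -> incarnation announced the last time the node was a member
    known: Dict[str, int] = field(default_factory=dict)

    def on_join(self, node: str, incarnation: int) -> str:
        """Return "sync" for a normal join, "merge" for a rejoin after a split."""
        previous = self.known.get(node)
        self.known[node] = incarnation
        if previous is not None and previous == incarnation:
            # Same incarnation as before: the node left and came back
            # without restarting, so both sides may have diverged.
            return "merge"
        # Never seen before, or restarted since: bring its history up to date.
        return "sync"


classifier = JoinClassifier()
first = classifier.on_join("n4", 1)      # first time n4 is seen
rejoin = classifier.on_join("n4", 1)     # n4 back with the same incarnation
restarted = classifier.on_join("n4", 2)  # n4 back after a restart
```

Leave messages are deliberately ignored here, so the remembered
incarnation survives the split. Choosing which side loses in the
"merge" case is a separate policy decision that this sketch does not
attempt.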

Hth,
Robert


