[Openais] correlating events

David Teigland teigland at redhat.com
Fri Sep 11 08:41:19 PDT 2009

On Thu, Sep 10, 2009 at 04:11:28PM -0700, Steven Dake wrote:
> IMO the proper way to do this is to ensure whatever ringid was delivered in
> a callback to the application is the current ring id returned by the api.
> This gets rid of any races you describe above.

I can't really think of any races that would concern me.

I described two different queries using one function; maybe it would be
clearer if I described them using two separate functions.

1. cpg_ringid_confchg_cb(&id1)
   id1 is the ringid associated with the last cpg confchg callback delivered
   to the app via cpg_dispatch().  If I call cpg_ringid_confchg_cb() from
   within a callback, I will be able to know the ringid associated with
   each confchg.

Of course cpg confchgs (joins/leaves) can happen without a change in the
ringid.  And likewise, the ringid can change without any corresponding cpg
confchg.  Cman, on the other hand, is always in step with each ringid change.
What I want my app to do is wait until it knows that cpg and cman are in sync
with each other:

1. If cpg has more recent events than cman, then wait for cman to catch up.
   (the cpg_ringid_confchg_cb call above will solve this one)
2. If cman has more recent events than cpg, then wait for cpg to catch up.
   (still looking for a way to do this one)

So the next function is trying to solve 2, and I figured using ringids again
might be good.  What makes it tricky is that the most recent ringid returned
by cman may not cause a cpg confchg.  The last ringid returned by
cpg_ringid_confchg_cb() may be less than the cman ringid, and waiting for them
to match won't work.  When the cman ringid is greater than the cpg ringid, the
app doesn't know if it's because the cpg callbacks just haven't been delivered
yet, or because there are no cpg callbacks for that ringid.

Functions of various forms could tell us, though.  One possibility:

2. rv = cpg_ringid_done(ringid)
   (I'd pass in the ringid from cman)
   rv would be 0 if there are any undelivered confchgs to the app for the
   ringid provided
   rv would be 1 if all confchgs have been delivered to the app up to and
   including the ringid provided
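
A minimal sketch of the semantics described for cpg_ringid_done(), written
as a standalone model rather than as code against the real cpg library.
The struct pending_confchgs and the function name ringid_done() are
hypothetical, and the sketch assumes ringids can be ordered with a plain
integer comparison, which has not been confirmed in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: the ringids attached to confchgs that cpg has
 * queued but not yet delivered to the app through cpg_dispatch(). */
struct pending_confchgs {
    const uint64_t *ringids;
    size_t count;
};

/* Model of the proposed cpg_ringid_done(ringid).
 * Returns 0 if any undelivered confchg belongs to the given ringid or an
 * earlier one, and 1 if everything up to and including that ringid has
 * been delivered.  Assumes ringids increase over time. */
static int ringid_done(const struct pending_confchgs *p, uint64_t ringid)
{
    for (size_t i = 0; i < p->count; i++) {
        if (p->ringids[i] <= ringid)
            return 0;
    }
    return 1;
}
```

Note that a ringid with no confchgs at all reports 1 right away, which is
the case that simply comparing ringids cannot detect.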

Or, something like I mentioned in the previous mail where cpg returns the
latest ringid it has seen for which all confchgs (if any) have been delivered
to the app.
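
With that second form, the check in the app could be sketched roughly as
follows.  All names here are hypothetical: cpg_confchg_ringid stands for
the ringid of the last confchg delivered to the app, and cpg_done_ringid
for the latest ringid cpg has seen for which all confchgs (if any) have
been delivered.  The sketch also assumes ringids only increase, which is
still an open question.

```c
#include <stdint.h>

enum sync_action { IN_SYNC, WAIT_FOR_CMAN, WAIT_FOR_CPG };

/* Decide whether cpg and cman are in step with each other.
 * The three inputs are plain values the app is assumed to have
 * collected already; no real cpg or cman call is made here. */
static enum sync_action sync_check(uint64_t cman_ringid,
                                   uint64_t cpg_confchg_ringid,
                                   uint64_t cpg_done_ringid)
{
    /* Case 1: cpg has delivered a confchg from a ring cman has not
     * reported yet, so wait for cman to catch up. */
    if (cpg_confchg_ringid > cman_ringid)
        return WAIT_FOR_CMAN;

    /* Case 2: cpg has not yet finished delivering everything up to
     * the ring cman reported, so wait for cpg to catch up. */
    if (cpg_done_ringid < cman_ringid)
        return WAIT_FOR_CPG;

    return IN_SYNC;
}
```

For example, sync_check(12, 8, 12) is IN_SYNC: ring 12 produced no cpg
confchg, so the last confchg ringid stays at 8, but cpg has nothing left
to deliver for ring 12.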

> > Chrissie pointed out that libcman only returns the 64 bit ringid as uint32,
> > but I doubt we'll see ringid's bigger than that.... even if we do I'm just
> > comparing consecutive id's so the lower 32 bits should be fine.
> > 
> Once the ring id is greater than 32 bits, you would always be comparing 0.

I don't follow.
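
For reference, a small sketch of what comparing only the lower 32 bits
does, assuming the 64 bit value is narrowed with an ordinary C conversion
to uint32_t (that is an assumption about libcman, not something checked
against its source).  The helper names are hypothetical.

```c
#include <stdint.h>

/* Narrowing a uint64_t to uint32_t in C keeps the low 32 bits
 * (the value is reduced modulo 2^32). */
static uint32_t ringid_low32(uint64_t ringid)
{
    return (uint32_t)ringid;
}

/* Equality test that only looks at the low 32 bits of each ringid. */
static int low32_equal(uint64_t a, uint64_t b)
{
    return ringid_low32(a) == ringid_low32(b);
}
```

Under that assumption, two ringids above 2^32 that are close together
still have different low halves, so an equality test between them works.
The comparison only gives a false match when the two ids differ by an
exact multiple of 2^32.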

> Looks like cman needs this error corrected, along with the addition of the
> ring leader node id.
> A ring id is uniquely identified by the nodeid of the ring leader and
> the 64 bit value of the ringid.  Need both values in the comparison.

I'm mainly interested in an equal comparison of ringids, but it might be
convenient to know if one came after another.  Would the ringid sequence
number ever not increase and in what situations?
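
An equality comparison that uses both values might look like the sketch
below.  The struct layout and the names ring_id, leader_nodeid and seq
are hypothetical, chosen only to illustrate the idea; they are not taken
from the cman or cpg headers.

```c
#include <stdint.h>

/* Hypothetical representation of a full ring id: the nodeid of the
 * ring leader plus the 64 bit sequence value. */
struct ring_id {
    uint32_t leader_nodeid;
    uint64_t seq;
};

/* Two ring ids are the same ring only if both fields match. */
static int ring_id_equal(const struct ring_id *a, const struct ring_id *b)
{
    return a->leader_nodeid == b->leader_nodeid && a->seq == b->seq;
}
```

This covers only the equal comparison; whether seq alone can be used to
tell which of two rings came later depends on the answer to the question
above.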

