[Openais] [PATCH corosync trunk] When sync is aborted clear the "my_ring_id" variable.

Angus Salkeld asalkeld at redhat.com
Mon Apr 12 03:17:41 PDT 2010


This patch fixes crashes found by repeated pacemaker CTS SimulStart
tests. When you bring up the nodes together, it can cause a lot of
configuration changes, and sync gets started and aborted many times.

When abort is called the ring_id is not changed, which means that any
sync packets that arrive from that point on will be accepted as valid.
I have seen old barrier messages cause the processing index to increment,
which later causes an array out-of-bounds access.

This patch memsets the ring_id to 0, so that the ring_id in a late packet and
my_ring_id no longer match and the packet is not accepted as valid.
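
To illustrate the idea outside of corosync, here is a minimal standalone
sketch. It is not the syncv2.c code: the sketch_* names and the simplified
ring id struct are hypothetical, and it assumes a real ring id is never
all zeros. It shows only that clearing the stored ring id on abort makes a
late barrier message fail the comparison, so the processing index is left
alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified ring id; the real struct memb_ring_id differs. */
struct sketch_ring_id {
	unsigned int rep;
	unsigned long long seq;
};

struct sketch_ring_id sketch_my_ring_id;
int sketch_processing_idx;

/* Start a sync round for the given ring (assumed to be non-zero). */
void sketch_start (unsigned int rep, unsigned long long seq)
{
	sketch_my_ring_id.rep = rep;
	sketch_my_ring_id.seq = seq;
	sketch_processing_idx = 0;
}

/* Abort: clear the stored ring id so no in-flight message can match it. */
void sketch_abort (void)
{
	memset (&sketch_my_ring_id, 0, sizeof (struct sketch_ring_id));
}

/* Deliver a barrier message; it only counts if its ring id matches. */
bool sketch_barrier_deliver (const struct sketch_ring_id *msg_ring_id)
{
	if (msg_ring_id->rep != sketch_my_ring_id.rep ||
	    msg_ring_id->seq != sketch_my_ring_id.seq) {
		return false;	/* stale or foreign message, ignored */
	}
	sketch_processing_idx++;
	return true;
}

/* Walk through the scenario described above; returns 0 on success. */
int sketch_demo (void)
{
	struct sketch_ring_id old_ring = { 1, 4 };

	sketch_start (1, 4);
	assert (sketch_barrier_deliver (&old_ring));	/* current ring: accepted */
	assert (sketch_processing_idx == 1);

	sketch_abort ();
	assert (!sketch_barrier_deliver (&old_ring));	/* late message: rejected */
	assert (sketch_processing_idx == 1);		/* index did not move */
	return 0;
}
```

Without the memset in sketch_abort (), the second delivery in sketch_demo ()
would be accepted and the index would advance past what the aborted round
expects.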


Signed-off-by: Angus Salkeld <asalkeld at redhat.com>
 exec/syncv2.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/exec/syncv2.c b/exec/syncv2.c
index 57b501b..559e199 100644
--- a/exec/syncv2.c
+++ b/exec/syncv2.c
@@ -665,6 +665,11 @@ void sync_v2_abort (void)
 		schedwrk_destroy (my_schedwrk_handle);
 		my_service_list[my_processing_idx].sync_abort ();
+	/* this will prevent any "old" barrier messages from causing
+	 * problems.
+	 */
+	memset (&my_ring_id, 0,	sizeof (struct memb_ring_id));
 void sync_v2_memb_list_determine (const struct memb_ring_id *ring_id)
