[Openais] [PATCH corosync trunk] When sync is aborted clear the "my_ring_id" variable.
Angus Salkeld
asalkeld at redhat.com
Mon Apr 12 03:17:41 PDT 2010
Hi
This patch fixes crashes found by repeated pacemaker CTS SimulStart
tests. Bringing all the nodes up together can cause a lot of
configuration changes, so sync gets started and aborted
many times.
When abort is called the ring_id is not changed, which means that any
sync packets that arrive from that point on will be accepted as valid.
I have seen old barrier messages cause the processing index to increment,
which later leads to an out-of-bounds array access.
This patch memsets my_ring_id to 0, so that the ring_id in a stale packet
and my_ring_id no longer match.
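To illustrate the idea outside of corosync, here is a minimal standalone sketch. It assumes that incoming sync messages are accepted only when their ring id equals the locally stored one, as described above. The `struct ring_id` layout and all `sketch_*` function names are hypothetical simplifications, not the real `struct memb_ring_id` or the code in exec/syncv2.c, and it assumes that a valid ring never has an all-zero id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for corosync's struct memb_ring_id. */
struct ring_id {
    uint32_t rep;   /* representative node id (simplified) */
    uint64_t seq;   /* ring sequence number */
};

/* Ring id of the sync round currently in progress. */
struct ring_id my_ring_id;

/* Start a sync round: remember which ring it belongs to. */
void sketch_sync_start(const struct ring_id *ring_id)
{
    assert(ring_id != NULL);
    my_ring_id = *ring_id;
}

/* Abort a sync round: zero the stored ring id so that messages
 * still in flight for the aborted round stop matching. */
void sketch_sync_abort(void)
{
    memset(&my_ring_id, 0, sizeof(my_ring_id));
}

/* Accept a message only if it carries the current ring id.
 * Fields are compared one by one so struct padding is ignored. */
bool sketch_accept(const struct ring_id *msg_ring_id)
{
    return msg_ring_id->rep == my_ring_id.rep &&
           msg_ring_id->seq == my_ring_id.seq;
}
```

With this sketch, a barrier message stamped with ring id {1, 8} is accepted while that round is active and rejected once `sketch_sync_abort()` has run, so it can no longer advance any processing index.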
Regards
Angus
Signed-off-by: Angus Salkeld <asalkeld at redhat.com>
---
exec/syncv2.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/exec/syncv2.c b/exec/syncv2.c
index 57b501b..559e199 100644
--- a/exec/syncv2.c
+++ b/exec/syncv2.c
@@ -665,6 +665,11 @@ void sync_v2_abort (void)
schedwrk_destroy (my_schedwrk_handle);
my_service_list[my_processing_idx].sync_abort ();
}
+
+ /* this will prevent any "old" barrier messages from causing
+ * problems.
+ */
+ memset (&my_ring_id, 0, sizeof (struct memb_ring_id));
}
void sync_v2_memb_list_determine (const struct memb_ring_id *ring_id)
--
1.6.6.1