[Openais] firewire
ray klassen
julius_ahenobarbus at yahoo.co.uk
Tue Mar 8 13:12:27 PST 2011
MCP is not really mentioned anywhere except ClusterGuy's blog (maybe you're him),
but from that I'm assuming you mean starting pacemaker separately, via
/etc/init.d/pacemaker. So I removed the /etc/corosync/service.d/pcmk file. I
also (following ClusterGuy's page on 'MCP',
http://theclusterguy.clusterlabs.org/post/907043024/introducing-the-pacemaker-master-control-process-for
) added 'cman' (yum install cman, for mailing list readers yet to come), as in
alternative 2 there.
And it does work. I can now see a 'partition with quorum' in crm_mon, over
firewire, with udpu.
I just don't really know how it works. How does pacemaker communicate with the
stack? Unix sockets? Shared memory? And how does corosync communicate with the
stack?
----- Original Message ----
From: Steven Dake <sdake at redhat.com>
To: ray klassen <julius_ahenobarbus at yahoo.co.uk>
Cc: openais at lists.linux-foundation.org
Sent: Tue, 8 March, 2011 10:02:28
Subject: Re: [Openais] firewire
First off, I'd recommend using the "MCP" process that is part of
Pacemaker rather than the plugin.
Second, if you could run corosync-objctl and put the output on the list,
along with your /etc/corosync/corosync.conf, that would be helpful.
Regards
-steve
On 03/08/2011 09:19 AM, ray klassen wrote:
> what I'm finding on further investigation is that all the pacemaker
> child processes are dying on startup
>
>
> Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
> process lrmd exited (pid=6356, rc=100)
> Mar 08 08:15:28 corosync [pcmk ] notice: pcmk_wait_dispatch: Child
> process lrmd no longer wishes to be respawned
> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Leaving
> born-on unset: 308
> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Local update:
> id=168430090, born=0, seq=308
> Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now
> has process list: 00000000000000000000000000111302 (1118978)
> Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
> process cib exited (pid=6355, rc=100)
> Mar 08 08:15:28 corosync [pcmk ] notice: pcmk_wait_dispatch: Child
> process cib no longer wishes to be respawned
> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Leaving
> born-on unset: 308
> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Local update:
> id=168430090, born=0, seq=308
> Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now
> has process list: 00000000000000000000000000111202 (1118722)
> Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
> process crmd exited (pid=6359, rc=100)
> Mar 08 08:15:28 corosync [pcmk ] notice: pcmk_wait_dispatch: Child
> process crmd no longer wishes to be respawned
> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Leaving
> born-on unset: 308
> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Local update:
> id=168430090, born=0, seq=308
> Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now
> has process list: 00000000000000000000000000111002 (1118210)
> Mar 08 08:15:28 corosync [TOTEM ] mcasted message added to pending queue
> Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
> process attrd exited (pid=6357, rc=100)
> Mar 08 08:15:28 corosync [pcmk ] notice: pcmk_wait_dispatch: Child
> process attrd no longer wishes to be respawned
> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Leaving
> born-on unset: 308
> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Local update:
> id=168430090, born=0, seq=308
> Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now
> has process list: 00000000000000000000000000110002 (1114114)
> Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
> process pengine exited (pid=6358, rc=100)
> Mar 08 08:15:28 corosync [pcmk ] notice: pcmk_wait_dispatch: Child
> process pengine no longer wishes to be respawned
> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Leaving
> born-on unset: 308
> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Local update:
> id=168430090, born=0, seq=308
> Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now
> has process list: 00000000000000000000000000100002 (1048578)
> Mar 08 08:15:28 corosync [TOTEM ] mcasted message added to pending queue
> Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
> process stonith-ng exited (pid=6354, rc=100)
> Mar 08 08:15:28 corosync [pcmk ] notice: pcmk_wait_dispatch: Child
> process stonith-ng no longer wishes to be respawned
> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Leaving
> born-on unset: 308
> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Local update:
> id=168430090, born=0, seq=308
> Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now
> has process list: 00000000000000000000000000000002 (2)
> Mar 08 08:15:28 corosync [TOTEM ] mcasted message added to pending queue
> Mar
>
>
>
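A note for archive readers decoding the log above: the "process list" field appears to be a bitmask printed in hex, with the same value in decimal in parentheses (0x111302 = 1118978), and one bit drops out each time a child exits. The Python sketch below is not part of the original thread. It pairs each "exited" line with the bit that disappeared from the next process list, and it converts id=168430090 to dotted-quad form on the assumption that the node id was derived from the ring 0 IPv4 address because no nodeid was configured. The function names are made up for illustration, and the sample lines are re-joined from the wrapped log.

```python
import ipaddress
import re

# Sample lines copied from the quoted log, re-joined where the mail client wrapped them.
LOG = """\
Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child process lrmd exited (pid=6356, rc=100)
Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Local update: id=168430090, born=0, seq=308
Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now has process list: 00000000000000000000000000111302 (1118978)
Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child process cib exited (pid=6355, rc=100)
Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now has process list: 00000000000000000000000000111202 (1118722)
Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child process crmd exited (pid=6359, rc=100)
Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now has process list: 00000000000000000000000000111002 (1118210)
"""

EXIT_RE = re.compile(r"Child process (\S+) exited \(pid=\d+, rc=(\d+)\)")
MASK_RE = re.compile(r"process list: ([0-9a-fA-F]+) \((\d+)\)")
ID_RE = re.compile(r"Local update: id=(\d+),")


def summarize(log):
    """Return ([(child, rc, cleared_bit_or_None), ...], {node ids seen})."""
    exits = []
    node_ids = set()
    prev_mask = None   # last process-list mask seen
    pending = None     # (child, rc) waiting for the next process-list line
    for line in log.splitlines():
        m = EXIT_RE.search(line)
        if m:
            pending = (m.group(1), int(m.group(2)))
            continue
        m = ID_RE.search(line)
        if m:
            node_ids.add(int(m.group(1)))
            continue
        m = MASK_RE.search(line)
        if m:
            mask = int(m.group(1), 16)
            if mask != int(m.group(2)):
                raise ValueError("hex and decimal process lists disagree: " + line)
            # Bits set in the previous mask but not in this one.
            cleared = None if prev_mask is None else prev_mask & ~mask
            if pending is not None:
                exits.append((pending[0], pending[1], cleared))
                pending = None
            prev_mask = mask
    return exits, node_ids


def node_id_to_ip(node_id):
    """Assumption: the id is the IPv4 address read as a big-endian 32-bit integer."""
    return str(ipaddress.IPv4Address(node_id))


if __name__ == "__main__":
    exits, node_ids = summarize(LOG)
    for child, rc, cleared in exits:
        bit = "unknown (no earlier mask in sample)" if cleared is None else hex(cleared)
        print(f"{child}: rc={rc}, cleared bit {bit}")
    for node_id in sorted(node_ids):
        print(f"id={node_id} -> {node_id_to_ip(node_id)}")
```

On the three sample exits this attributes bit 0x100 to cib and 0x200 to crmd. Applying the same subtraction by hand to the rest of the log gives 0x1000 for attrd, 0x10000 for pengine and 0x100000 for stonith-ng, leaving only 0x2 set at the end. The id decodes to 10.10.10.10; all four octets are equal, so the byte-order assumption does not affect the result here.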
> ------------------------------------------------------------------------
> *From:* Dan Frincu <df.cluster at gmail.com>
> *To:* openais at lists.linux-foundation.org
> *Sent:* Tue, 8 March, 2011 2:45:00
> *Subject:* Re: [Openais] firewire
>
>
>
> On Tue, Mar 8, 2011 at 2:07 AM, ray klassen
> <julius_ahenobarbus at yahoo.co.uk> wrote:
>
> well I have the 1.3.0 version of corosync seemingly happy with udpu and
> firewire. The logs report connection back and forth between the two
> boxes. But now crm_mon never connects. Does pacemaker not support udpu yet?
>
>
> Pacemaker is the Cluster Resource Manager, so it doesn't really care
> about the underlying method that the Messaging and Membership layer uses
> to connect between nodes.
>
> I've had this issue (crm_mon not connecting) when I performed an upgrade
> from openais-0.80 to corosync-1.3.0 with udpu; I eventually solved it by
> rebooting the servers. In your case I doubt it's caused by an upgrade between
> software versions, since you've reinstalled.
>
> My 2 cents.
>
>
>
> pacemaker-1.1.4-5.fc14.i686
> (I switched to fedora from debian to get the latest version of corosync)
>
>
>
>
> ----- Original Message ----
> From: Steven Dake <sdake at redhat.com>
> To: ray klassen <julius_ahenobarbus at yahoo.co.uk>
> Cc: openais at lists.linux-foundation.org
> Sent: Thu, 3 March, 2011 16:56:21
> Subject: Re: [Openais] firewire
>
> On 03/03/2011 05:45 PM, ray klassen wrote:
> > Has anyone had any success running corosync with the firewire-net module? I
> > want to set up a two-node router cluster with a dedicated link between the
> > routers. The only problem is that I've run out of ethernet ports, so I've got
> > IP configured on the firewire ports. Pinging between the addresses is no
> > problem. The funny thing is that on one of them (and they're really
> > identical) corosync starts up with no problem at all and stays up. On the
> > other one corosync fails with "ERROR: ais_dispatch: Receiving message body
> > failed: (2) Library error: Resource temporarily unavailable (11)."
> >
> > Reading up on the firewire-net mailing list's outstanding issues turned up
> > that multicast wasn't fully implemented, so my corosync.conf files both say
> > "broadcast: yes" instead of setting mcastaddr.
> >
> > Firewire-net was emitting "fwnet_write_complete: failed: 10" errors, so I
> > pulled down the latest vanilla kernel, 2.6.37.2, and am running that, with
> > far fewer of those errors.
> >
> > Otherwise, versions are:
> > Debian Squeeze
> > Corosync Version: 1.2.1-4
> > Pacemaker 1.0.9.1+hg15626-1
> >
> > Is this a hopeless case? I've got a debug log from corosync that doesn't
> > seem that helpful. If you want, I can post that as well.
> >
> > Thanks
> >
>
> I'm hesitant to suggest using firewire as a transport as you're the first
> person that has ever tried it. If multicast is broken on your hardware,
> you might try the "udpu" transport, which uses unicast UDP only and so does
> not depend on multicast support.
>
> Regards
> -steve
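For archive readers who want to try the udpu suggestion, here is a minimal Python sketch (not from the thread) that prints a totem section with "transport: udpu" and one member block per node. The helper name, the 10.10.10.x addresses and the bindnetaddr are placeholders chosen for illustration, and 5405 is only the usual default port. The directive names (transport, interface, member, memberaddr, ringnumber, bindnetaddr, mcastport) follow the udpu example as understood for corosync 1.3; check corosync.conf(5) for the version you run.

```python
import ipaddress


def render_udpu_totem(bindnetaddr, members, mcastport=5405):
    """Render a totem section for the udpu transport.

    All arguments are caller-supplied; nothing here is read from a live
    cluster. Addresses are validated so typos fail early.
    """
    ipaddress.ip_address(bindnetaddr)      # raises ValueError if malformed
    for addr in members:
        ipaddress.ip_address(addr)

    lines = [
        "totem {",
        "\tversion: 2",
        "\ttransport: udpu",
        "\tinterface {",
    ]
    # With udpu every node is listed explicitly instead of using a multicast group.
    for addr in members:
        lines += ["\t\tmember {", f"\t\t\tmemberaddr: {addr}", "\t\t}"]
    lines += [
        "\t\tringnumber: 0",
        f"\t\tbindnetaddr: {bindnetaddr}",
        f"\t\tmcastport: {mcastport}",
        "\t}",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder addresses for a two-node link; substitute your own.
    print(render_udpu_totem("10.10.10.0", ["10.10.10.10", "10.10.10.11"]), end="")
```

The member list would need to be identical on both nodes, and it replaces the "broadcast: yes" or mcastaddr setting rather than being combined with it.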
>
> >
> >
> > _______________________________________________
> > Openais mailing list
> > Openais at lists.linux-foundation.org
> <mailto:Openais at lists.linux-foundation.org>
> > https://lists.linux-foundation.org/mailman/listinfo/openais
>
>
>
> --
> Dan Frincu
> CCNA, RHCE
>