[Openais] RE: GMI problem as node does not see other nodes
Steven Dake
sdake at mvista.com
Fri Jul 9 14:10:03 PDT 2004
Cool, sounds good!
Thanks
-steve
On Fri, 2004-07-09 at 13:03, Sabharwal, Atul wrote:
> I seem to have found the problem. I had two syslogd instances running,
> and bind on the socket was failing. Now it gets the token and reaches
> CONSENSUS.
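A failure like the one described above can be reproduced with plain
sockets. The following is only a minimal sketch, assuming two plain UDP
sockets without SO_REUSEADDR; bind_udp_port and local_udp_port are
hypothetical helper names and are not taken from syslogd or the gmi code.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a UDP socket to `port` on all interfaces.
 * Returns the descriptor, or -1 with errno set if socket() or bind() fails. */
int bind_udp_port(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        int saved = errno;      /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Hypothetical helper: local port of a socket in host byte order, or -1. */
int local_udp_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

If a second process calls bind_udp_port with a port that is already held,
the call is expected to return -1 with errno set to EADDRINUSE, so checking
the return value of bind and logging errno makes this kind of conflict
visible. Note that binding port 514 itself normally requires root.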
>
> Thanks,
>
> Atul
>
> -------------------------------------------------------------
> P.S: All opinions are my personal opinion(s) & responsibility and do
> not represent the view of my employer ( Intel Corporation ).
>
>
> >-----Original Message-----
> >From: Sabharwal, Atul
> >Sent: Friday, July 09, 2004 12:49 PM
> >To: 'sdake at mvista.com'; openais at lsts.osdl.org
> >Subject: RE: GMI problem as node does not see other nodes
> >
> >I had one instance where the two nodes joined. It seems to be
> >a UDP port issue.
> >In network.conf, I have UdpPort set to 514. Now, one machine
> >uses port 514 for both source and destination, while the other
> >uses it for the destination field only.
> >I ran ethereal for windows to get a packet trace for it.
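The difference in source ports seen in the trace is what one would expect
when one sender binds its socket before sending and the other does not.
Here is a small sketch under that assumption; source_port_used is a
hypothetical helper, and sending to the loopback address is only a
stand-in for the real multicast destination.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: send one datagram to 127.0.0.1:dest_port and
 * report the source port that was used.
 * If src_port is nonzero the socket is bound to it first; otherwise the
 * kernel picks an ephemeral source port when the datagram is sent.
 * Returns the source port in host byte order, or -1 on error. */
int source_port_used(unsigned short src_port, unsigned short dest_port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in local, dest;
    socklen_t len = sizeof(local);
    int result = -1;

    if (src_port != 0) {
        memset(&local, 0, sizeof(local));
        local.sin_family = AF_INET;
        local.sin_addr.s_addr = htonl(INADDR_ANY);
        local.sin_port = htons(src_port);
        if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0)
            goto out;
    }

    memset(&dest, 0, sizeof(dest));
    dest.sin_family = AF_INET;
    dest.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    dest.sin_port = htons(dest_port);
    if (sendto(fd, "x", 1, 0, (struct sockaddr *)&dest, sizeof(dest)) < 0)
        goto out;

    /* After the send, the socket always has a local port assigned. */
    if (getsockname(fd, (struct sockaddr *)&local, &len) == 0)
        result = ntohs(local.sin_port);
out:
    close(fd);
    return result;
}
```

A sender that never binds shows an arbitrary source port in a packet
trace, while a sender that binds first shows the configured port in both
the source and destination fields.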
> >
> >The tarball for inetutils is attached. Please look at
> >syslogd.c; to run it in debug mode, do syslogd -d --multicast.
> >Somehow the short option for popt is not working.
> >
> >Thanks,
> >
> >Atul
> >
> >-------------------------------------------------------------
> >P.S: All opinions are my personal opinion(s) & responsibility
> >and do not represent the view of my employer ( Intel Corporation ).
> >
> >
> >>-----Original Message-----
> >>From: Steven Dake [mailto:sdake at mvista.com]
> >>Sent: Friday, July 09, 2004 12:04 PM
> >>To: Sabharwal, Atul; openais at lsts.osdl.org
> >>Subject: Re: GMI problem as node does not see other nodes
> >>
> >>On Fri, 2004-07-09 at 11:51, Sabharwal, Atul wrote:
> >>> Hi Steve,
> >>>
> >>> I have a problem where I have GMI running on two machines. Both do
> >>> gmi_join but do not find each other and create their own groups.
> >>> Do you know what could be wrong?
> >>> Both the nodes are in the same multicast group and have the same
> >>> multicast address and
> >>
> >>One problem that can happen is the lack of root permissions (which
> >>gmi_init requires to bind to a specific interface).
> >>
> >>Another problem is that your routing table may not have a default
> >>route. The kernel multicast code will not work without a default
> >>route. I'm not sure why the kernel has this requirement, since it
> >>should multicast over a bound interface. But I don't know a lot
> >>about networking, so there may be some reason.
> >>
> >>Another problem is that multiple groups may not work as expected
> >>(bug). See end of message.
> >>
> >>Another problem is that the machines won't find each other on join, but
> >>instead when poll_run is started. The gmi_join function sets up the
> >>poll abstraction to work properly for group messaging, but the final
> >>call to poll_run is required.
> >>
> >>Send me a tarball of the code and I'll look for immediate errors
> >>and run it on my test network. It sounds like you're doing the
> >>right things though.
> >>
> >>> port number for the destination field (from the
> >>> /etc/ais/network.conf file).
> >>>
> >>> Also, some comments on GMI :
> >>> 1. If logger is not initialised, gmi_init segfaults.
> >>
> >>yes I thought about this :) but was too lazy to immediately fix it.
> >>
> >>> 2. gmi_mcast prototype is misleading. It has field iov_len when it
> >>> should be iov_count, i.e. the number of iovectors.
> >>
> >>You're right, I'll change the prototypes.
> >>
> >>> 3. gmi_mcast with group name as NULL segfaults. In exec/main.c,
> >>> gmi_join does not have a group name but never does gmi_mcast, so
> >>> it does not show up. Maybe it should not segfault but return -1.
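Points 2 and 3 above can be illustrated together. This is a hedged
sketch only: example_mcast is a hypothetical stand-in and is NOT the
real gmi_mcast prototype. It assumes an interface that takes a group
name plus an array of iovecs, names the array length iov_count, and
returns -1 on bad arguments instead of dereferencing NULL.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical stand-in for a multicast send call (not the gmi API).
 * iov_count is the number of iovec entries, not a byte length.
 * Returns the total number of payload bytes, or -1 on invalid input. */
long example_mcast(const char *group_name,
                   const struct iovec *iov, int iov_count)
{
    /* Validate arguments up front rather than crashing later. */
    if (group_name == NULL || iov == NULL || iov_count <= 0)
        return -1;

    long total = 0;
    for (int i = 0; i < iov_count; i++)
        total += (long)iov[i].iov_len;   /* per-entry byte length */

    /* A real implementation would hand the vectors to the transport here. */
    return total;
}
```

Naming the parameter iov_count keeps it distinct from the iov_len member
of struct iovec, which is the byte length of a single entry.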
> >>>
> >>
> >>Yes, the group name stuff is a little wacky at the moment. Currently
> >>there is a big "TODO" around group names.
> >>
> >>What should happen is when you do a join, all message sends should
> >>include that group name and all receives should only select messages
> >>with that group name for the joined handle. This probably doesn't
> >>happen correctly today, as the gmi code as used by openais only has
> >>one user, one group, and seems to work correctly for that task.
> >>If you have a patch for this I'd merge it.
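The intended behaviour described above (tag every send with the joined
group name, deliver only matching messages to a handle) could look
roughly like the sketch below. All names here (example_handle,
example_message, example_join, example_tag, example_should_deliver) are
hypothetical and are not part of the gmi code.

```c
#include <stdio.h>
#include <string.h>

#define EXAMPLE_GROUP_NAME_MAX 32

/* Hypothetical per-join state: the group this handle joined. */
struct example_handle {
    char group[EXAMPLE_GROUP_NAME_MAX];
};

/* Hypothetical message: payload plus the group name it was sent to. */
struct example_message {
    char group[EXAMPLE_GROUP_NAME_MAX];
    const char *payload;
};

/* Record the group name on join. Returns 0, or -1 if the name is
 * missing or too long to store. */
int example_join(struct example_handle *h, const char *group)
{
    if (h == NULL || group == NULL ||
        strlen(group) >= EXAMPLE_GROUP_NAME_MAX)
        return -1;
    snprintf(h->group, sizeof(h->group), "%s", group);
    return 0;
}

/* Send side: stamp the outgoing message with the handle's group name. */
void example_tag(const struct example_handle *h,
                 struct example_message *m, const char *payload)
{
    snprintf(m->group, sizeof(m->group), "%s", h->group);
    m->payload = payload;
}

/* Receive side: deliver only messages whose group matches the handle. */
int example_should_deliver(const struct example_handle *h,
                           const struct example_message *m)
{
    return strncmp(h->group, m->group, EXAMPLE_GROUP_NAME_MAX) == 0;
}
```

With a single user and a single group every message matches, which is
consistent with the current code appearing to work for that case.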
> >>
> >>
> >>> Thanks,
> >>>
> >>> Atul
> >>>
> >>> -------------------------------------------------------------
> >>> P.S: All opinions are my personal opinion(s) & responsibility and do
> >>> not represent the view of my employer ( Intel Corporation ).
> >>>
> >>
> >>
> >>