[Openais] RE: GMI problem as node does not see other nodes

Steven Dake sdake at mvista.com
Fri Jul 9 14:10:03 PDT 2004


Cool, sounds good!
Thanks
-steve
On Fri, 2004-07-09 at 13:03, Sabharwal, Atul wrote:
> I seem to have found the problem. I had two syslogd instances running, and
> bind on the socket was failing. Now it gets the token and reaches CONSENSUS.
> 
> Thanks,
> 
> Atul
> 
> -------------------------------------------------------------
> P.S:  All opinions are my personal opinion(s) & responsibility and do
> not represent the view of my employer ( Intel Corporation ).
>  
> 
> >-----Original Message-----
> >From: Sabharwal, Atul 
> >Sent: Friday, July 09, 2004 12:49 PM
> >To: 'sdake at mvista.com'; openais at lsts.osdl.org
> >Subject: RE: GMI problem as node does not see other nodes
> >
> >I had one instance where the two nodes joined. It seems to be a UDP port
> >issue. In network.conf, I have UdpPort set to 514. One machine uses port
> >514 for both source and destination, while the other uses it for the
> >destination field only. I ran Ethereal for Windows to get a packet trace.
> >
> >The tarball for inetutils is attached. Please look at syslogd.c; to run it
> >in debug mode, do syslogd -d --multicast. Somehow the short option for popt
> >is not working.
> >
> >Thanks,
> >
> >Atul
> >
> >-------------------------------------------------------------
> >P.S:  All opinions are my personal opinion(s) & responsibility 
> >and do not represent the view of my employer ( Intel Corporation ).
> > 
> >
> >>-----Original Message-----
> >>From: Steven Dake [mailto:sdake at mvista.com] 
> >>Sent: Friday, July 09, 2004 12:04 PM
> >>To: Sabharwal, Atul; openais at lsts.osdl.org
> >>Subject: Re: GMI problem as node does not see other nodes
> >>
> >>On Fri, 2004-07-09 at 11:51, Sabharwal, Atul wrote:
> >>> Hi Steve,
> >>> 
> >>> I have a problem where I have GMI running on two machines. Both do
> >>> gmi_join but do not find each other and create their own groups. Do you
> >>> know what could be wrong?
> >>> Both nodes are in the same multicast group and have the same multicast
> >>> address and
> >>
> >>One problem that can happen is the lack of root permissions (which
> >>gmi_init requires to bind to a specific interface).
> >>
> >>Another problem is that your routing table may not have a default
> >>route.  The kernel multicast code will not work without a default route.
> >>I'm not sure why the kernel has this requirement since I should be able
> >>to multicast over a bound interface.  But I don't know a lot about
> >>networking so there may be some reason.
> >>
> >>Another problem is that multiple groups may not work as expected (bug).
> >>See end of message.
> >>
> >>Another problem is that the machines won't find each other on join, but
> >>instead when poll_run is started.  The gmi_join function sets up the
> >>poll abstraction to work properly for group messaging, but the final
> >>call to poll_run is required.
> >>
> >>Send me a tarball of the code and I'll look for immediate errors and run
> >>it on my test network.  It sounds like you're doing the right things
> >>though.
> >>
> >>> port number for the destination field (from the /etc/ais/network.conf
> >>> file).
> >>> 
> >>> Also, some comments on GMI :
> >>> 1. If logger is not initialised, gmi_init segfaults.
> >>
> >>Yes, I thought about this :) but was too lazy to fix it immediately.
> >>
> >>> 2. The gmi_mcast prototype is misleading. It has a field iov_len when it
> >>>    should be iov_count, i.e. the number of iovectors.
> >>
> >>You're right, I'll change the prototypes.
> >>
> >>> 3. gmi_mcast with group name as NULL segfaults. In exec/main.c, gmi_join
> >>>    does not have a group name but never does gmi_mcast, so it does not
> >>>    show up. Maybe it should not segfault but return -1.
> >>> 
> >>
> >>Yes, the group name stuff is a little wacky at the moment.  Currently
> >>there is a big "TODO" around group names.
> >>
> >>What should happen is when you do a join, all message sends should
> >>include that group name and all receives should only select messages
> >>with that group name for the joined handle.  This probably doesn't
> >>happen correctly today, as the gmi code as used by openais only has one
> >>user, one group, and seems to work correctly for that task.  If you have
> >>a patch for this I'd merge it.
> >>
> >>
> >>> Thanks,
> >>> 
> >>> Atul
> >>> 
> >>> -------------------------------------------------------------
> >>> P.S:  All opinions are my personal opinion(s) & responsibility and do
> >>> not represent the view of my employer ( Intel Corporation ).
> >>> 
> >>
> >>
> >>
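
The root cause found in this thread (a second process trying to bind a UDP
port that another process already holds) can be reproduced in a few lines.
The sketch below is not GMI or syslogd code. It uses only the POSIX socket
calls, binds on the loopback address, and lets the kernel pick a free port
instead of 514 so that it runs without root. The helper names
bind_udp_loopback and second_bind_errno are made up for illustration.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a new UDP socket to 127.0.0.1:port (port in host byte order).
 * Returns the descriptor, or -1 with errno preserved from the failing call. */
static int bind_udp_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        int saved = errno;      /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Hypothetical helper: bind one socket to a kernel-chosen port, then try to
 * bind a second socket to the same port. Returns the errno of the second
 * bind (0 if it unexpectedly succeeded), or -1 if the setup itself failed. */
int second_bind_errno(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int first, second, err = 0;

    first = bind_udp_loopback(0);   /* port 0: kernel picks a free port */
    if (first < 0)
        return -1;
    if (getsockname(first, (struct sockaddr *)&addr, &len) < 0) {
        close(first);
        return -1;
    }
    second = bind_udp_loopback(ntohs(addr.sin_port));
    if (second < 0)
        err = errno;                /* the conflict shows up here */
    else
        close(second);
    close(first);
    return err;
}
```

On Linux the second bind is expected to fail with EADDRINUSE, since neither
socket sets SO_REUSEADDR or SO_REUSEPORT. Checking the return value of bind
and logging strerror(errno) at startup would make this kind of conflict
visible right away.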




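On comment 2 above (iov_len versus iov_count): in the standard scatter/gather
interfaces, iov_len is a member of struct iovec and holds the byte length of
one buffer, while the number of buffers is passed separately. The sketch
below shows the distinction with POSIX writev over a pipe. It does not use
the gmi_mcast prototype, which is not shown in this thread, and gather_demo
is a made-up name for illustration.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Gather-write three buffers into a pipe with one writev call, then read
 * them back into out. Returns the number of bytes read, or -1 on error. */
ssize_t gather_demo(char *out, size_t out_size)
{
    char a[] = "gmi", b[] = "_", c[] = "mcast";
    struct iovec iov[3];
    /* iov_count: how many iovec entries there are, not how many bytes */
    int iov_count = (int)(sizeof(iov) / sizeof(iov[0]));
    int fds[2];
    ssize_t got;

    if (out_size == 0 || pipe(fds) < 0)
        return -1;

    /* iov_len: the byte length of each individual buffer */
    iov[0].iov_base = a; iov[0].iov_len = strlen(a);
    iov[1].iov_base = b; iov[1].iov_len = strlen(b);
    iov[2].iov_base = c; iov[2].iov_len = strlen(c);

    if (writev(fds[1], iov, iov_count) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);

    got = read(fds[0], out, out_size - 1);
    close(fds[0]);
    if (got < 0)
        return -1;
    out[got] = '\0';
    return got;
}
```

Naming the count parameter iov_count (or iovcnt, as writev does) keeps it
from being confused with the per-buffer iov_len member.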