nmblookup problems [Includes patch for Samba bug]
David Collier-Brown
davecb at canada.sun.com
Wed May 10 17:49:35 GMT 2000
The patch is at the very end...
Ron Alexander wrote:
> > socket 7 bound to addr:port 0.0.0.0:0
> > Socket 7 opened for port 0.
> > querying * on 255.255.255.255
> > FD 7 is set before select
> > Num FDS ready=1, errno=0 ()
> > FD 7 is set after select
> > Got a positive name query response from 134.111.57.12 ( 134.111.220.160 )
> > FD 7 is set before select
> > Num FDS ready=0, errno=1081 (Timeout period has expired.)
> > FD 7 is set after select
> >
> > At this point it waits forever!
and then:
> Here is what I see in my debugger.
>
> 1: # 10: read_udp_socket (line 179 in module util_sock)
> 1: # 9: read_packet (line 693 in module nmblib)
> 1: # 8: receive_packet (line 947 in module nmblib)
> 1: # 6: name_query (line 277 in module namequery)
> 1: # 5: query_one (line 100 in module nmblookup)
> 1: # 3: main (line 271 in module nmblookup)
>
> It seems simple, the select has indicated that a socket is ready for reading
> and when we go to read it we hang.
What I claim is supposed to happen is:
Time Command and Args Returns
===== ======================== =======
0.0334 so_socket(2, 1, 0, "", 1) = 4 -- socket calls this on solaris
0.0340 bind(4, 0xFFBEEA88, 16, 3) = 0 -- bind fd 4
0.0375 sendto(4, "07040110\001\0\0\0\0\0\0".., 50, 0, 0xFFBEE638, 16) = 50 -- send the packet out
0.0545 poll(0xFFBEE918, 1, 90) = 1 -- select calls this on Solaris
0.0548 recvfrom(4, "070485\0\0\0\001\0\0\0\0".., 576, 0, 0xFFBEE704, 0xFFBEE714) = 62 -- receive first response
0.0553 poll(0xFFBEE918, 1, 90) = 1 -- select again
0.0555 recvfrom(4, "070485\0\0\0\001\0\0\0\0".., 576, 0, 0xFFBEE704,
-- and so on, getting 62 bytes
-- each time, for 9 times on
-- my network ...
0.1491 poll(0xFFBEE918, 1, 90) = 0 -- time out
0.2391 poll(0xFFBEE918, 1, 90) = 0 -- again
0.3292 poll(0xFFBEE918, 1, 90) = 0 -- and again
-- then start printing the messages
doing parameter guest account = guest
doing parameter map to guest = bad user
pm_process() returned Yes
added interface ip=129.155.8.39 bcast=129.155.8.255 nmask=255.255.255.0
added interface ip=127.0.0.1 bcast=127.255.255.255 nmask=255.0.0.0
querying * on 255.255.255.255
Got a positive name query response from 129.155.8.113 ( 129.155.8.113 )
Got a positive name query response from 129.155.8.24 ( 129.155.8.24 )
Got a positive name query response from 129.155.8.213 ( 129.155.8.213 )
Got a positive name query response from 129.155.8.55 ( 129.155.8.55 )
Got a positive name query response from 129.155.8.38 ( 129.155.8.38 )
Got a positive name query response from 129.155.8.1 ( 129.155.8.1 )
Got a positive name query response from 129.155.8.34 ( 129.155.8.34 )
Got a positive name query response from 129.155.8.71 ( 129.155.8.71 )
Got a positive name query response from 129.155.8.79 ( 129.155.8.79 )
129.155.8.113 *<00>
129.155.8.24 *<00>
129.155.8.213 *<00>
129.155.8.55 *<00>
129.155.8.38 *<00>
129.155.8.1 *<00>
129.155.8.34 *<00>
129.155.8.71 *<00>
129.155.8.79 *<00>
0.3300 _exit(0) -- and quit
I suspect a problem either in getting the data or in the
select. The critical code is in libsmb/nmblib.c, at line 935:
---
struct packet_struct *receive_packet(int fd,enum packet_type type,int t)
{
  fd_set fds;
  struct timeval timeout;
  FD_ZERO(&fds);
  FD_SET(fd,&fds);
  timeout.tv_sec = t/1000;
  timeout.tv_usec = 1000*(t%1000);
  sys_select(fd+1,&fds,&timeout);
  if (FD_ISSET(fd,&fds))
    return(read_packet(fd,type));
  return(NULL);
}
---
It's not a failed timeout, as the bit corresponding to fd
is being returned set, and FD_ISSET succeeds...
Looking at sys_select, and assuming you have select and not poll,
we get:
---
int sys_select(int maxfd, fd_set *fds, struct timeval *tval)
{
  struct timeval t2;
  int selrtn;
  do {
    if (tval) memcpy((void *)&t2,(void *)tval,sizeof(t2));
    errno = 0;
    selrtn = select(maxfd,SELECT_CAST fds,NULL,NULL,tval?&t2:NULL);
  } while (selrtn<0 && errno == EINTR);
  return(selrtn);
}
---
This basically retries if we get EINTR, which is quite sane.
HOWEVER, we should not call read_packet unless sys_select
returns a positive value.
This looks like a Samba (portability?) bug, because select is
defined to fail with any of EINTR, EBADF or EINVAL and we only
handle EINTR. Check whether the return is -1, and if so return NULL.
Without that check, we could be getting -1 back, and since we set
the bit in fds ourselves just before the call, FD_ISSET will
still succeed.
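To check that claim on a given platform, here is a minimal sketch.
It is not Samba code: probe_closed_fd and struct probe_result are
names made up for this illustration. It calls select() on a
descriptor that has already been closed and reports the return
value, errno, and whether the bit is still set in the fd_set
afterwards. POSIX says the sets are left unmodified on failure, but
some man pages call their contents undefined, so treat still_set as
something to observe rather than rely on.

```c
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical result record for this illustration only. */
struct probe_result {
  int rc;        /* return value of select(), or -2 if pipe() failed */
  int err;       /* errno captured right after the call */
  int still_set; /* 1 if FD_ISSET is still true after the call */
};

/* Call select() on a descriptor we know is closed. */
struct probe_result probe_closed_fd(void)
{
  struct probe_result r = { 0, 0, 0 };
  struct timeval tv = { 0, 0 };  /* zero timeout: do not block */
  fd_set fds;
  int p[2];

  if (pipe(p) != 0) {
    r.rc = -2;
    r.err = errno;
    return r;
  }
  /* Close both ends so p[0] is a stale descriptor number. */
  close(p[0]);
  close(p[1]);

  FD_ZERO(&fds);
  FD_SET(p[0], &fds);            /* the caller sets the bit itself */
  errno = 0;
  r.rc = select(p[0] + 1, &fds, NULL, NULL, &tv);
  r.err = errno;
  r.still_set = FD_ISSET(p[0], &fds) ? 1 : 0;
  return r;
}
```

If rc comes back as -1 while still_set is 1, a caller that only
tests FD_ISSET would go on to read from a descriptor that was never
reported ready.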
I rather strongly recommend:
  if (sys_select(fd+1,&fds,&timeout) == -1) {
    /* errno should be EBADF or EINVAL. */
    DEBUG(0,("select returned -1, errno = %s (%d)\n",
             strerror(errno), errno));
    return NULL;
  }
  else if (FD_ISSET(fd,&fds)) {
    /* Get the data. */
    return(read_packet(fd,type));
  }
  else {
    /* Nothing waiting. */
    return NULL;
  }
For the VOS port, I'd print the return from sys_select, and
see if we're passing it something bogus.
--dave
=========== PATCH ===========
*** nmblib.c.old Wed May 10 13:43:03 2000
--- nmblib.c Wed May 10 13:38:09 2000
***************
--- 942,961 ----
timeout.tv_sec = t/1000;
timeout.tv_usec = 1000*(t%1000);
! if (sys_select(fd+1,&fds,&timeout) == -1) {
! /* errno should be EBADF or EINVAL. */
! DEBUG(0,("select returned -1, errno = %s (%d)\n",
! strerror(errno), errno));
! return NULL;
! }
! else if (FD_ISSET(fd,&fds)) {
! /* Get the data. */
return(read_packet(fd,type));
! }
! else {
! /* Nothing waiting. */
! return NULL;
! }
}
============ END PATCH ===================
David Collier-Brown, | Always do right. This will gratify some people
185 Ellerslie Ave., | and astonish the rest. -- Mark Twain
Willowdale, Ontario | //www.oreilly.com/catalog/samba/author.html
Work: (905) 415-2849 Home: (416) 223-8968 Email: davecb at canada.sun.com