Samba hanging up after about 15 days
Stephen Eickhoff
operagost at email.com
Mon Oct 27 20:23:23 GMT 2003
I apologize for the wide margins, but I have some system output I need to post.
I wonder if anyone has had Samba VMS on a VAX hang up on them after about two weeks.
The hung process is causing BACKUP to hang, and any process that tries to run TCPIP
hangs as well.
The system in question is a VAXstation 4000/60 with 32 MB RAM running VMS 7.1 and
TCPIP 5.1, ECO 5.
NMBD is going into MUTEX and hanging up my system so badly that I have to reboot it.
Take a look at SHOW SYSTEM:
OpenVMS V7.1 on node ORFF 27-OCT-2003 15:02:53.91 Uptime 16 16:48:37
Pid Process Name State Pri I/O CPU Page flts Pages
20200081 SWAPPER HIB 16 0 0 00:00:44.72 0 0
20200086 CONFIGURE HIB 10 6 0 00:00:03.62 6644 167
20200088 IPCACP HIB 10 6 0 00:00:00.13 6019 101
20200089 ERRFMT HIB 8 11036 0 00:00:34.25 1784 119
2020008A CACHE_SERVER HIBO 16 -- swapped out -- 121
2020008B CLUSTER_SERVER HIB 8 11 0 00:00:00.05 192 281
2020008C OPCOM HIB 7 22437 0 00:00:47.87 6013 169
2020008D AUDIT_SERVER HIB 10 44564 0 00:00:53.23 3290 397
2020008E JOB_CONTROL HIB 10 28477 0 00:00:41.15 10181 163
2020008F QUEUE_MANAGER HIB 9 8537 0 00:00:43.73 13399 549
20200090 SECURITY_SERVER HIB 10 7540 0 00:02:08.84 131238 603
20200091 SMISERVER HIB 9 35 0 00:00:00.66 7951 71
20200092 TP_SERVER HIB 9 96640 0 00:08:43.59 36033 158
20200093 TCPIP$TNS2 HIBO 4 -- swapped out -- 381
20200094 TCPIP$TNS1 HIBO 4 -- swapped out -- 399
20200095 TCPIP$INETACP HIB 8 25761 0 00:01:03.77 16604 589
20200096 TCPIP$BIND_1 LEF 9 1182971 0 00:48:14.88 189439 1813 N
20200097 TCPIP$PORTM_1 LEF 10 110 0 00:00:00.74 7484 59 N
20200098 TCPIP$FTP_1 LEF 10 207 0 00:00:01.29 8642 1172 N
20200099 TCPIP$LBROKER_1 LEF 9 3381919 0 00:56:27.58 203993 691 N
2020009A TCPIP$METRIC_1 LEF 10 556766 0 00:13:09.13 43868 183 N
2020009B TCPIP$NFS_1 HIB 8 152 0 00:00:21.60 12167 59 N
2020009C TCPIP$MOUNTD_1 LEF 10 240 0 00:00:01.36 7027 65 N
2020009D TCPIP$NTP_1 LEF 9 1481698 0 00:02:06.88 101058 339 N
2020009F TCPIP$POP_1 HIB 10 25064 0 00:01:55.96 25131 1207 N
202000A0 SMTP_ORFF_01 HIB 6 20353 0 00:02:13.01 28122 2003
202000A3 TNT_SERVER HIB 6 10137 0 00:09:12.23 212289 1268
20204824 SMBD_BG1152 RWAST 8 259 0 00:00:01.98 2635 2660 N
202000A5 CircleMUD LEF 6 192364 0 00:01:37.58 24704 489
202000A6 NMBD MUTEX 9 690146 0 03:31:10.51 117217 744
20203C27 ZAP_BRANAGEN LEF 8 5674 0 00:00:10.00 2477 616
202000AB TNT1202000A3 LEFO 1 -- swapped out -- 495 S
202000AF SYSTEM LEF 5 2715 0 00:00:29.42 30293 268
202049B0 CYRIL LEF 5 35421 0 00:04:36.08 2859 1940
20204A31 OPERAGOST RWAST 6 579 0 00:00:01.80 1289 329
20203EB2 XOO6 LEF 9 11859 0 00:00:45.03 4273 1742
20204833 SWAT_BG1177 RWAST 6 135 0 00:00:01.62 2232 1892 N
20204934 _VTA1256: CUR 4 821 0 00:00:04.02 3887 322
20204A39 TCPIP$SM_BG3533 LEF 8 143 0 00:00:01.57 2328 1517 N
20204743 BATCH_553 LEF 6 1360 0 00:00:05.19 1391 1288 B
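To make the stuck processes easier to spot in a listing this long, here is a minimal Python sketch that filters SHOW SYSTEM output by scheduler state. It is only an illustration: the function name `stuck_processes` and the choice of MUTEX and RWAST as the "suspect" states are assumptions taken from this listing, and the parser assumes process names contain no embedded spaces.

```python
import re

# Wait states seen on the stuck processes in the listing above (an assumption,
# not a complete list of VMS scheduler states worth checking).
SUSPECT_STATES = {"MUTEX", "RWAST"}

# PID (8 hex digits), process name (assumed to have no spaces), state.
LINE_RE = re.compile(r"^([0-9A-F]{8})\s+(\S+)\s+([A-Z]+)\s")


def stuck_processes(show_system_text):
    """Return (pid, name, state) tuples for processes in a suspect state."""
    found = []
    for line in show_system_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group(3) in SUSPECT_STATES:
            found.append(m.groups())
    return found


# A few lines copied from the SHOW SYSTEM output above.
SAMPLE = """\
20200096 TCPIP$BIND_1 LEF 9 1182971 0 00:48:14.88 189439 1813 N
20204824 SMBD_BG1152 RWAST 8 259 0 00:00:01.98 2635 2660 N
202000A6 NMBD MUTEX 9 690146 0 03:31:10.51 117217 744
20204A31 OPERAGOST RWAST 6 579 0 00:00:01.80 1289 329
20204833 SWAT_BG1177 RWAST 6 135 0 00:00:01.62 2232 1892 N
"""

if __name__ == "__main__":
    for pid, name, state in stuck_processes(SAMPLE):
        print(pid, name, state)
```

Run against the full listing, this should pick out NMBD (MUTEX) plus the SMBD, SWAT, and OPERAGOST processes (RWAST).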
Action taken: first, Samba wasn't responding, so I tried to get into SWAT to restart it.
SWAT hung up in the middle of bringing up the web page. So I went in and tried to
kill NMBD. Of course this didn't work. I tried killing the SWAT and SMBD processes
in the hope that would free up something. They just went into RWAST. I tried to
run TCPIP so I could disable Samba, but it hung up before giving a prompt,
putting that process into RWAST as well.
Here's what NMBD looks like with SHOW PROCESS in SDA:
Process index: 0026 Name: NMBD Extended PID: 202000A6
---------------------------------------------------------
Status : 00140023 res,delpen,respen,phdres,login
Status2: 00000001 quantum_resched
PCB address 81EE6B40 JIB address 81E86600
PHD address 83639800 Swapfile disk address 00000000
Master internal PID 00010026 Subprocess count 0
Internal PID 00010026 Creator internal PID 00000000
Extended PID 202000A6 Creator extended PID 00000000
State MUTEX Termination mailbox 0000
Current priority 9 AST's enabled KESU
Base priority 4 AST's active ES
UIC [00001,000004] AST's remaining 21
Mutex count 0 Buffered I/O count/limit 16/18
Waiting EF cluster 0 Direct I/O count/limit 17/18
Starting wait time 1B011B1B BUFIO byte count/limit 128/896
Event flag wait mask 81E86600 # open files allowed left 10
Local EF cluster 0 C0000001 Timer entries allowed left 8
Local EF cluster 1 80000000 Active page table count 0
Global cluster 2 pointer 00000000 Process WS page count 560
Global cluster 3 pointer 00000000 Global WS page count 184
and SHOW PROCESS /CHANNEL:
Process index: 0026 Name: NMBD Extended PID: 202000A6
---------------------------------------------------------
%SDA-W-NOACCESS, process not accessible (swapped out or suspended)
Process active channels
-----------------------
Channel Window Status Device/file accessed
------- ------ ------ --------------------
0010 00000000 DKA0:
%SDA-E-NOREAD, unable to access location 8363B7EC
Here's SHOW PROCESS for the SMBD process that was left running:
Process index: 0024 Name: SMBD_BG1152 Extended PID: 20204824
----------------------------------------------------------------
Status : 00240023 res,delpen,respen,phdres,netwrk
Status2: 00000001 quantum_resched
PCB address 81EF9100 JIB address 81EAC700
PHD address 8368B800 Swapfile disk address 00000000
Master internal PID 00900024 Subprocess count 0
Internal PID 00900024 Creator internal PID 00000000
Extended PID 20204824 Creator extended PID 00000000
State RWAST Termination mailbox 0013
Current priority 8 AST's enabled KESU
Base priority 6 AST's active S
UIC [00001,000004] AST's remaining 4195
Mutex count 0 Buffered I/O count/limit 511/512
Waiting EF cluster 0 Direct I/O count/limit 4094/4096
Starting wait time 1B011919 BUFIO byte count/limit ******/2046848
Event flag wait mask 00000001 # open files allowed left 294
Local EF cluster 0 80000000 Timer entries allowed left 30
Local EF cluster 1 80000000 Active page table count 0
Global cluster 2 pointer 00000000 Process WS page count 2321
Global cluster 3 pointer 00000000 Global WS page count 339
WHAT'S UP WITH THE ASTERISKS IN BUFIO?
And SHOW PROCESS /CHANNEL:
Process active channels
-----------------------
Channel Window Status Device/file accessed
------- ------ ------ --------------------
0010 00000000 DKA0:
0020 81DD4FC0 DKA0:[SAMBA.BIN]SMBD.EXE;1
0030 81DCB700 DKA0:[VMS$COMMON.SYSLIB]SECURESHRP.EXE;1 (section file)
0040 81DCE080 DKA0:[VMS$COMMON.SYSLIB]SECURESHR.EXE;1 (section file)
0050 81DD06C0 DKA0:[VMS$COMMON.SYSLIB]LIBRTL.EXE;1 (section file)
0060 81DC8940 DKA0:[VMS$COMMON.SYSEXE]DCL.EXE;1 (section file)
0070 81DC5040 DKA0:[VMS$COMMON.SYSLIB]UVMTHRTL.EXE;1 (section file)
0080 81DE1340 DKA0:[VMS$COMMON.SYSLIB]DCLTABLES.EXE;81 (section file)
0090 81F0F600 Busy DKA0:[SYS0.SYSMGR]SMBD_STARTUP.LOG;210
00A0 81E9B300 DKA0:[SAMBA.BIN]SMBD_STARTUP.COM;7
00B0 81DD1780 DKA0:[VMS$COMMON.SYSLIB]DECC$SHR.EXE;3 (section file)
00C0 81DD1980 DKA0:[VMS$COMMON.SYSLIB]CMA$TIS_SHR.EXE;1 (section file)
00D0 81DD1740 DKA0:[VMS$COMMON.SYSLIB]UCX$IPC_SHR.EXE;1 (section file)
00E0 81DCFF40 DKA0:[VMS$COMMON.SYSLIB]TCPIP$ACCESS_SHR.EXE;1 (section file)
Process index: 0024 Name: SMBD_BG1152 Extended PID: 20204824
----------------------------------------------------------------
Channel Window Status Device/file accessed
------- ------ ------ --------------------
00F0 00000000 BG1152:
0100 81E8F540 DKA0:[VMS$COMMON.SYSEXE]RIGHTSLIST.DAT;1
0110 81E5C780 DKA0:[SAMBA.PRIVATE]SECRETS.TDB;1
0120 81E9B480 DKA0:[SAMBA]LOG.SMBD;1
0130 00000000 Busy DKA0:
DKA0: is my system disk, so dismounting it just to free this process isn't an option!
I don't get it. I assume NMBD is what's hung up, but I don't know what SDA means
by "%SDA-E-NOREAD, unable to access location 8363B7EC", and it doesn't show anything busy!
It also doesn't look like either process has exhausted any quota.
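For anyone who wants to double-check the quota fields, here is a minimal Python sketch that pulls the count/limit pairs out of the SDA SHOW PROCESS output and flags any count that is not a plain number (such as the asterisks in BUFIO). It deliberately makes no assumption about whether "count" means used or remaining; the function name `quota_pairs` and the list of labels are just illustrative choices based on the fields shown above.

```python
import re

# Field labels as they appear in the SDA SHOW PROCESS output above.
LABELS = (
    "Buffered I/O count/limit",
    "Direct I/O count/limit",
    "BUFIO byte count/limit",
)

# Label, then "count/limit"; count may be non-numeric (e.g. "******").
PAIR_RE = re.compile(
    r"(" + "|".join(re.escape(label) for label in LABELS) + r")\s+(\S+)/(\d+)"
)


def quota_pairs(sda_text):
    """Map each label to (count, limit); count is None if it is not numeric."""
    pairs = {}
    for label, count, limit in PAIR_RE.findall(sda_text):
        pairs[label] = (int(count) if count.isdigit() else None, int(limit))
    return pairs


# Lines copied from the SMBD_BG1152 display above.
SAMPLE = """\
Mutex count 0 Buffered I/O count/limit 511/512
Waiting EF cluster 0 Direct I/O count/limit 4094/4096
Starting wait time 1B011919 BUFIO byte count/limit ******/2046848
"""

if __name__ == "__main__":
    for label, (count, limit) in quota_pairs(SAMPLE).items():
        shown = "unreadable" if count is None else count
        print(f"{label}: {shown} / {limit}")
```

A count of `None` only means the displayed value could not be parsed as a number; it says nothing about what the underlying value actually is.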
----------------------------------
Stephen Eickhoff
operagost at email.com
----------------------------------