[linux-cifs-client] OOPS in 2.6.26

Wed Jul 16 17:25:03 GMT 2008

(Resending without attachments, as I think my post was automatically
rejected. Jeff: the attachments follow in an off-list email.)

On Wed, Jul 16, 2008 at 10:26:20AM -0400, Jeff Layton wrote:

>> I just upgraded to 2.6.26. On copying a large file from my server, my
>> client Oops'ed, and eventually caused my system to become unusable.
>> Here's the message I got:
>> 
>>     BUG: unable to handle kernel paging request at f8001d6f
>>     IP: [<f91c4b42>] :cifs:CIFSSMBQAllEAs+0x242/0x340
>>     *pde = 00000000 
>>     Oops: 0000 [#1] SMP 
>>     Modules linked in: nls_iso8859_1 cifs nls_base mmc_block b43 ssb rng_core mac80211 crc32 led_class input_polldev rfkill_input rfkill aes_i586 aes_generic libafs(P) e1000e i915 drm fuse ipv6 autofs4 ipt_recent ipt_addrtype xt_multiport xt_mac xt_state xt_tcpudp ipt_REJECT ipt_LOG xt_limit iptable_nat nf_nat nf_conntrack_ipv4 iptable_filter ip_tables xt_iprange x_tables nf_conntrack_ftp nf_conntrack snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device ext2 mbcache arc4 ecb crypto_blkcipher loop 8250_pnp 8250 serial_core acpi_cpufreq sg sr_mod cdrom usbhid usb_storage scsi_mod intelfb fb i2c_algo_bit cfbcopyarea intel_agp i2c_core button sdhci mmc_core video backlight output wmi battery ehci_hcd ac snd_hda_intel agpgart uhci_hcd snd_pcm snd_timer snd_page_alloc snd_hwdep cfbimgblt cfbfillrect snd serio_raw usbcore soundcore evdev [last unloaded: ricoh_mmc]
>>     
>>     Pid: 1417, comm: cp Tainted: P       A  (2.6.26 #1)
>>     EIP: 0060:[<f91c4b42>] EFLAGS: 00210282 CPU: 1
>>     EIP is at CIFSSMBQAllEAs+0x242/0x340 [cifs]
>>     EAX: f8001d6e EBX: f8001d6e ECX: 006d6495 EDX: 1a59604f
>>     ESI: d9cd003d EDI: 00000000 EBP: f8001d72 ESP: ed3cdec0
>>     DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
>>     Process cp (pid: 1417, ti=ed3cc000 task=cf105b80 task.ti=ed3cc000)
>>     Stack: d9cd0000 ed3cdee4 00000000 ef9e97c8 f76abf40 d9f99c00 000008bf 006d6495 
>>     dc44d000 0000004f d9cd0000 d9cd0000 f76abf40 ffffffa1 00000000 00000000 
>>     f91de6da 00000000 00000000 f8b96ce0 00000000 000008bf f76e01c0 d9f99c00 
>>     Call Trace:
>>     [<f91de6da>] cifs_listxattr+0xba/0x180 [cifs]
>>     [<f91de620>] cifs_listxattr+0x0/0x180 [cifs]
>>     [<c0194514>] vfs_listxattr+0x24/0x40
>>     [<c01947e0>] listxattr+0x50/0xb0
>>     [<c01948c9>] sys_llistxattr+0x39/0x50
>>     [<c01c7f40>] reiserfs_file_write+0x0/0xc0
>>     [<c0103bd9>] sysenter_past_esp+0x6a/0x91
>>     [<c0300000>] migration_call+0x340/0x460
>>     =======================
>>     Code: 85 d8 00 00 00 0f b6 43 01 0f b7 4b 02 29 c2 83 ea 05 29 ca 85 d2 0f 8e 63 ff ff ff 8d 44 05 01 01 c8 89 c3 8b 4c 24 1c 8d 68 04 <0f> b6 43 01 8d 44 08 06 39 44 24 48 89 44 24 1c 7e bc a1 34 f8 
>>     EIP: [<f91c4b42>] CIFSSMBQAllEAs+0x242/0x340 [cifs] SS:ESP 0068:ed3cdec0
>>     ---[ end trace f626a9f8ae856e81 ]---
> 
> Not a panic that I've seen before. Is this reproducible?

Ah. Haven't tried reproducing. Since it's the machine I use primarily
for work, intentionally crashing it will have to wait till the
weekend...

If I free up some disk space, I could try reproducing it under kvm.
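
In the meantime, the oops itself can be cross-checked by hand. The bytes
at the <..> marker in the Code: line are the instruction at EIP, and
"0f b6 43 01" reads as a movzbl from 1(%ebx). A minimal sketch in Python
that checks this against the registers; it decodes only this one
instruction form, and the helper names (faulting_bytes, movzx_ebx_disp8)
are made up for illustration:

```python
# Values copied from the oops above.
FAULT_ADDR = 0xF8001D6F   # "unable to handle kernel paging request at ..."
EBX = 0xF8001D6E          # EBX from the register dump
# Excerpt of the Code: line around the <..> marker (the byte at EIP).
CODE = "89 c3 8b 4c 24 1c 8d 68 04 <0f> b6 43 01 8d 44 08 06"


def faulting_bytes(code_line, count=4):
    """Return `count` bytes starting at the <..> marker, i.e. at EIP."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:start + count]]


def movzx_ebx_disp8(insn):
    """Decode only `0f b6 /r` with mod=01 and rm=ebx.

    Returns the signed 8-bit displacement, or None for any other encoding.
    """
    if insn[0] != 0x0F or insn[1] != 0xB6:
        return None
    modrm = insn[2]
    mod, rm = modrm >> 6, modrm & 0x7
    if mod != 0b01 or rm != 0b011:   # mod=01: disp8 follows; rm=011: EBX
        return None
    disp = insn[3]
    return disp - 256 if disp >= 128 else disp


if __name__ == "__main__":
    disp = movzx_ebx_disp8(faulting_bytes(CODE))
    print("displacement:", disp)
    print("computed address: %08x" % (EBX + disp))
    print("matches fault address:", EBX + disp == FAULT_ADDR)
```

So the fault is a one-byte read at EBX+1, which matches the address in
the BUG line. That only confirms which access faulted, not why EBX
pointed there.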

>> Finally, I should mention that I have a few send / receive errors in my
>> /var/log/messages too:
>> 
>>     CIFS VFS: server not responding
>>     CIFS VFS: No response to cmd 46 mid 323
>>     CIFS VFS: Send error in read = -11
>>     CIFS VFS: Write2 ret -11, wrote 0
>>     CIFS VFS: No response for cmd 162 mid 14620
>>     CIFS VFS: No response to cmd 47 mid 14626
>>     CIFS VFS: No response to cmd 47 mid 14627
>>     CIFS VFS: Write2 ret -11, wrote 0
>>     CIFS VFS: No response for cmd 162 mid 14633
>>     CIFS VFS: No response to cmd 47 mid 14632
>>     CIFS VFS: Write2 ret -11, wrote 0
>>     CIFS VFS: No response to cmd 47 mid 14639
>>     CIFS VFS: No response to cmd 47 mid 14640
> 
> cmd 47 (0x2f) is SMB_COM_WRITE_ANDX. 46 (0x2e) is SMB_COM_READ_ANDX.
> Those errors mainly just mean that your server is being slow to
> respond here. SMBQueryAllEAs uses SMB_COM_TRANSACTION2 (0x32) and I
> don't see any of those in the logs above. Still though, I suppose it
> could be related to retransmissions of those calls. 

Yes, my server is known to be slow. The Samba throughput (on Mac/Linux)
is much lower than the hard disk speed and channel capacity. Thus a lot
of complicated operations (e.g. running mke2fs, or rsync-ing a large
directory) tend to fail.

This is the first time I've had trouble with cp though...
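
For reference, the cmd numbers in the log excerpt above can be mapped to
SMB command names mechanically. A minimal sketch in Python; the table
holds only the three codes Jeff named plus 162 (0xa2), which as far as I
know is SMB_COM_NT_CREATE_ANDX, and the names SMB_COMMANDS and
decode_line are made up for illustration:

```python
import re

# Small subset of the SMB1 command table: the codes discussed in this
# thread, plus 0xa2 (162), which also shows up in the log.
SMB_COMMANDS = {
    0x2E: "SMB_COM_READ_ANDX",
    0x2F: "SMB_COM_WRITE_ANDX",
    0x32: "SMB_COM_TRANSACTION2",
    0xA2: "SMB_COM_NT_CREATE_ANDX",
}

# Matches both "No response to cmd N mid M" and "No response for cmd N mid M".
LOG_PATTERN = re.compile(r"No response (?:to|for) cmd (\d+) mid (\d+)")


def decode_line(line):
    """Return (command name, mid) for a 'No response' line, else None."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    cmd, mid = int(match.group(1)), int(match.group(2))
    return SMB_COMMANDS.get(cmd, "unknown (0x%02x)" % cmd), mid


if __name__ == "__main__":
    samples = [
        "CIFS VFS: No response to cmd 46 mid 323",
        "CIFS VFS: No response for cmd 162 mid 14620",
        "CIFS VFS: No response to cmd 47 mid 14626",
        "CIFS VFS: Write2 ret -11, wrote 0",
    ]
    for sample in samples:
        print(sample, "->", decode_line(sample))
```

The -11 in the read and Write2 lines is -EAGAIN on Linux, which fits the
picture of a server that is slow to answer.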

>> Any idea what's going on? My server is a Vantec NexStar LX, and my
>> client runs Gentoo (if that makes any difference). I attach my kernel
>> configuration,
> 
> Any chance you could bzip2 cifs.ko module and send it to me? It would
> be nice to disassemble it and see if we can tell where it fell down.

Of course. I have no control over the server. But my cifs.ko and
System.map are both attached. Let me know if you need anything else,

GI

-- 
'Worry' -- The interest you pay on trouble before it comes.