Question on continuous EIOs in CIFS protocol
Richard Sharpe
realrichardsharpe at gmail.com
Fri Sep 19 16:18:25 MDT 2014
On Thu, Sep 18, 2014 at 8:09 PM, sandeep nag <sandeepnagamalli at gmail.com> wrote:
> In any event, why is read getting EINTR so many times in a row?
> This is what I am trying to figure out. Due to log rotation, a few logs
> were lost on the customer box; I will work on finding the reason for this.
>
> Is it because of the SIGNALs they are using for write completions?
> I will look at the code and try to find this.
Is it possible you have disturbed the thread creation code and are no
longer blocking signals in the non-Samba threads? You can lose signals
that way and cause other problems.
> On Thu, Sep 18, 2014 at 11:53 AM, Richard Sharpe
> <realrichardsharpe at gmail.com> wrote:
>>
>> On Thu, Sep 18, 2014 at 11:49 AM, Volker Lendecke
>> <Volker.Lendecke at sernet.de> wrote:
>> > Hi!
>> >
>> > From this snippet I can't really tell what your big picture
>> > looks like. Can you please post the whole thing somewhere?
>> > I'd be happy with your patchset, as long as I can cleanly
>> > apply this to 3.5.15.
>>
>> From what I recall this is deep in their VFS module ...
>>
>> In any event, why is read getting EINTR so many times in a row?
>>
>> Is it because of the SIGNALs they are using for write completions?
>>
>> > Thanks,
>> >
>> > Volker
>> >
>> > On Thu, Sep 18, 2014 at 11:28:34AM -0700, sandeep nag wrote:
>> >> We are using 3.5.15 + Async IO + a few patches.
>> >>
>> >> Here is the code:
>> >> In the write() call path we call 'cifs_wd_to_buf()'
>> >>
>> >> static cifs_error_t
>> >> cifs_wd_to_buf(
>> >> wdata_t *wd,
>> >> char *buf,
>> >> size_t n)
>> >> {
>> >> size_t bytes_read = 0;
>> >> size_t num_continuous_interupts = 0;
>> >> ssize_t read_ret = -1;
>> >> cifs_error_t err = CIFS_OK;
>> >>
>> >> if (wd->recv_data == false)
>> >> {
>> >> if (buf != wd->u.data_buf)
>> >> {
>> >> memcpy(buf, wd->u.data_buf, n);
>> >> }
>> >> CIFS_DONE();
>> >> }
>> >>
>> >> bytes_read = 0;
>> >> while(bytes_read != n)
>> >> {
>> >> read_ret = read(wd->u.rfd, buf + bytes_read, n - bytes_read);
>> >> if (read_ret == 0)
>> >> {
>> >> read_ret = -1;
>> >> CIFS_NOTICE("%s: Short read from socket, n %ld, bytes_read %ld,"
>> >> " errno %d", __FUNCTION__, n, bytes_read, errno);
>> >> if (errno == 0)
>> >> {
>> >> errno = EIO;
>> >> }
>> >> }
>> >>
>> >> if (read_ret == -1)
>> >> {
>> >> if (errno == EINTR)
>> >> {
>> >> num_continuous_interupts++;
>> >> if(num_continuous_interupts > 100)
>> >> {
>> >> errno = EIO; /* <==== This is the EIO error I was speaking about */
>> >> }
>> >> else
>> >> continue;
>> >> }
>> >>
>> >> assert(errno);
>> >> err = ERRNO_TO_CIFS_ERROR(errno);
>> >> CIFS_ERROR_DONE(err, "%s: Error reading file data from socket",
>> >> __FUNCTION__);
>> >> }
>> >>
>> >> bytes_read += read_ret;
>> >> }
>> >>
>> >> assert(bytes_read == n);
>> >>
>> >> done:
>> >> return err;
>> >> }
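
Two things in the loop above may be worth checking. The counter
num_continuous_interupts is never reset after a successful read(), so it
counts every EINTR seen during the call rather than only consecutive ones.
Also, read() returning 0 means end of file and does not set errno, so the
errno logged in that branch is whatever an earlier call left behind. A
minimal sketch of the same loop with both points addressed; read_full()
and MAX_CONSECUTIVE_EINTR are hypothetical names, and the CIFS_* macros
are left out because their definitions were not posted:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_CONSECUTIVE_EINTR 100 /* same limit as the posted code */

/* Hypothetical stand-in for the read loop in cifs_wd_to_buf().
 * Returns 0 once n bytes are read, or -1 with errno set on failure. */
static int read_full(int fd, char *buf, size_t n)
{
    size_t done = 0;
    size_t consecutive_eintr = 0;

    assert(buf != NULL || n == 0);

    while (done < n) {
        ssize_t r = read(fd, buf + done, n - done);

        if (r > 0) {
            done += (size_t)r;
            consecutive_eintr = 0; /* progress was made, start counting again */
            continue;
        }
        if (r == 0) {
            /* End of file before n bytes arrived; read() leaves errno alone. */
            errno = EIO;
            return -1;
        }
        if (errno == EINTR) {
            if (++consecutive_eintr <= MAX_CONSECUTIVE_EINTR)
                continue; /* interrupted before any data was read, retry */
            errno = EIO;  /* too many interruptions in a row, give up */
        }
        return -1;
    }
    return 0;
}
```

This only changes how interruptions are counted. It would not explain why
read() is interrupted so often in the first place.
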
>> >>
>> >>
>> >> On Thu, Sep 18, 2014 at 10:17 AM, Richard Sharpe <
>> >> realrichardsharpe at gmail.com> wrote:
>> >>
>> >> > On Thu, Sep 18, 2014 at 10:08 AM, Volker Lendecke
>> >> > <Volker.Lendecke at sernet.de> wrote:
>> >> > > Hi!
>> >> > >
>> >> > > Windows servers expect writes to succeed unless there's a
>> >> > > real error like a full disk or permission denied or so.
>> >> > > There won't be a retry. The retry logic needs to live within
>> >> > > smbd completely.
>> >> > >
>> >> > > For a VFS that goes out to a socket and not to a local
>> >> > > syscall, I would recommend that you look at our async I/O.
>> >> > > With Samba 4.1 we have implemented very flexible VFS
>> >> > > operations (pread_send/recv and pwrite_send/recv) that are
>> >> > > able to handle this nicely. In case you're sitting on an
>> >> > > earlier version, you need to take a look at
>> >> > > aio_read/aio_write/aio_return. Not as flexible, but you can
>> >> > > also implement async retry logic there.
>> >> > >
>> >> > > It would be best if you would post your VFS module source
>> >> > > somewhere (it's GPL in the end :-)) so that we can make
>> >> > > recommendations.
>> >> >
>> >> > What he didn't tell you is that it is Samba 3.5.15 or so and they are
>> >> > using the Async IO stuff in that version.
>> >> >
>> >> > --
>> >> > Regards,
>> >> > Richard Sharpe
>> >> > (何以解憂?唯有杜康。--曹操)
>> >> >
>> >
>> > --
>> > SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
>> > phone: +49-551-370000-0, fax: +49-551-370000-9
>> > AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
>> > http://www.sernet.de, mailto:kontakt at sernet.de
>>
>>
>>
>> --
>> Regards,
>> Richard Sharpe
>> (何以解憂?唯有杜康。--曹操)
>
>
--
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)