Bogus rsync "Success" message when out of disk space

jw schultz jw at pegasys.ws
Sun Apr 27 10:42:14 EST 2003


On Sat, Apr 26, 2003 at 03:09:29PM -0500, John Van Essen wrote:
> Patches welcome, eh, Paul?
> 
> Upon further (belated) investigation, there are 2 affected places
> in receiver.c with this error message.  Both call write_file().
> And write_file is called only in those two places.  So that is the
> appropriate location to patch.  Especially since the obvious fix is
> to use the rewrite code already there for the sparse file writes.

This looks correct to me.  On partial write we need to
retry.  The callers to write_file only test for
completeness.

I'd drop the added comments or put a generic
"loop until all is written or error" at the top of the loop.
The inline comments make the code harder to read.

> -------------------------------------8<-------------------------------------
> --- fileio.c.orig       Fri Jan 25 15:07:34 2002
> +++ fileio.c    Sat Apr 26 12:16:25 2003
> @@ -69,25 +69,28 @@
>         return len;
>  }
>  
>  
>  
>  int write_file(int f,char *buf,size_t len)
>  {
>         int ret = 0;
>  
> -       if (!sparse_files) {
> -               return write(f,buf,len);
> -       }
> -
>         while (len>0) {
> -               int len1 = MIN(len, SPARSE_WRITE_SIZE);
> -               int r1 = write_sparse(f, buf, len1);
> +               int r1;
> +               if (sparse_files) {
> +                       int len1 = MIN(len, SPARSE_WRITE_SIZE);
> +                       r1 = write_sparse(f, buf, len1);
> +               } else {
> +                       /* Normal writes are in this loop, too, so that */
> +                       /*   retries of partial writes will set errno.  */
> +                       r1 = write(f, buf, len);
> +               }
>                 if (r1 <= 0) {
>                         if (ret > 0) return ret;
>                         return r1;
>                 }
>                 len -= r1;
>                 buf += r1;
>                 ret += r1;
>         }
>         return ret;
> -------------------------------------8<-------------------------------------
> 
> Test of unpatched rsync (bogus error message):
> 
> write failed on jvejunk2 : Success
> rsync error: error in file IO (code 11) at receiver.c(243)
> rsync: connection unexpectedly closed (200 bytes read so far)
> rsync error: error in rsync protocol data stream (code 12) at io.c(165)
> 
> 
> Test of patched rsync (correct error message):
> 
> write failed on jvejunk2 : No space left on device
> rsync error: error in file IO (code 11) at receiver.c(243)
> rsync: connection unexpectedly closed (200 bytes read so far)
> rsync error: error in rsync protocol data stream (code 12) at io.c(165)
> 
> -- 
>         John Van Essen <vanes002 at umn.edu>
> 
> 
> 
> 
> On Thu, 7 Nov 2002 12:37:09, Green, Paul <Paul.Green at stratus.com> wrote:
> >John Van Essen [mailto:vanes002 at umn.edu] wrote, in partial:
> >> In receiver.c there are three places with code similar to:
> >>
> >>    if (fd != -1 && write_file(fd,data,i) != i) {
> >>        rprintf(FERROR,"write failed on %s : %s\n",fname,strerror(errno));
> >>        exit_cleanup(RERR_FILEIO);
> >>    }
> >>
> >> A partial write due to running out of room (as opposed to a write
> >> that fails to write any data at all) apparently does not set an
> >> error code (errno stays 0).  This is on RedHat 7.2 btw.
> >[snip]
> >> So the rprintf lines need to be changed to something like this:
> >> 
> >>         rprintf(FERROR,"write failed on %s : %s\n",fname,
> >>           strerror(errno ? errno : ENOSPC));
> >
> >I disagree.  If you read the POSIX standard, you will see that the error
> >code is returned on the NEXT call to write the data. POSIX requires that the
> >"short write" not return any error.  (It can't, because it has to set the
> >return value to -1 to indicate the error, but it also has to set the return
> >value to the number of bytes written in the short write; hence, two calls
> >are needed).  So, there are more problems with this code than just the way
> >it handles errors.
> >
> >See paragraph 6.4.2.1 of the POSIX-1996 standard.  To quote the relevant
> >paragraph here:
> >
> >"If a write() requests that more bytes be written than there is room for
> >(for example, the physical end of a medium), only as many bytes as there is
> >room for shall be written. For example, suppose there is space for 20 bytes
> >more in a file before reaching a limit. A write of 512 bytes would return
> >20. The next write of a nonzero number of bytes would give a failure return
> >(except as noted below)."
> >
> >The exception has to do with getting interrupted by a signal and is not
> >interesting here.
> >
> >"Upon successful completion, write() shall return an integer indicating the
> >number of bytes actually written. Otherwise, it shall return a value of -1
> >and set errno to indicate the error."
> >
> >"[ENOSPC] There is no free space remaining on the device containing the
> >file."
> >
> >So, RedHat is correct in not setting errno in this case.  And you are
> >correct that ENOSPC is the proper error code for this case, but you only
> >know it is out of space via delphic knowledge.  rsync can't possibly know
> >that.  I would not want to see rsync assume it knows the proper error code;
> >that's the job of the OS.  Instead, rsync should be recoded to continue
> >attempting to write data until get it gets a nonzero error code, not until
> >it gets a short write.
> >
> >Patches welcome.
> >
> >PG
> >--
> >Paul Green, Senior Technical Consultant, Stratus Technologies.
> >Voice: +1 (978) 461-7557; FAX: +1 (978) 461-3610
> >Speaking from Stratus not for Stratus
> 
> -- 
> To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
> Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
> 

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt


More information about the rsync mailing list