improper check on written in non mmap path of tdb_write, tdb 1.2.9

Rusty Russell rusty at rustcorp.com.au
Mon Feb 7 20:36:01 MST 2011


On Tue, 8 Feb 2011 12:27:24 am Michael Adam wrote:
> Hi,
> 
> sorry for the late reply.
> 
> Rusty, you are right in that I did those changes.
> Actually they were done in oder to treat only the correct
> error conditions on pwrite as errors. But it seems
> something is not quite correct here. Thanks, James,
> for spotting this.
> 
> There is another place where I did the first patch like this:
> it is tdb_expand_file() in common/io.c
> But there, I think I did it right or at least better
> (commits d568e2b and af8432e in master):
> 
> 263         /* now fill the file with something. This ensures that the
> 264            file isn't sparse, which would be very bad if we ran out of
> 265            disk. This must be done with write, not via mmap */
> 266         memset(buf, TDB_PAD_BYTE, sizeof(buf));
> 267         while (addition) {
> 268                 size_t n = addition>sizeof(buf)?sizeof(buf):addition;
> 269                 ssize_t written = pwrite(tdb->fd, buf, n, size);
> 270                 if (written == 0) {
> 271                         /* prevent infinite loops: try _once_ more */
> 272                         written = pwrite(tdb->fd, buf, n, size);
> 273                 }
> 274                 if (written == 0) {
> 275                         /* give up, trying to provide a useful errno */
> 276                         TDB_LOG((tdb, TDB_DEBUG_FATAL, "expand_file write "
> 277                                 "returned 0 twice: giving up!\n"));
> 278                         errno = ENOSPC;
> 279                         return -1;
> 280                 } else if (written == -1) {
> 281                         TDB_LOG((tdb, TDB_DEBUG_FATAL, "expand_file write of "
> 282                                  "%d bytes failed (%s)\n", (int)n,
> 283                                  strerror(errno)));
> 284                         return -1;
> 285                 } else if (written != n) {
> 286                         TDB_LOG((tdb, TDB_DEBUG_WARNING, "expand_file: wrote "
> 287                                  "only %d of %d bytes - retrying\n", (int)written,
> 288                                  (int)n));
> 289                 }
> 290                 addition -= written;
> 291                 size += written;
> 292         }
> 
> At least it is better in two respects:
> - it tries more than once
> - it correctly corrects the summed up size of what has been written
> maybe the made-up errno of ENOSPC is not good.
> 
> Doing these patches was triggered by a real error situation
> over which I stumbled, in which using
> (return code of pwrite) != (bytes to be written)
> really was wrong. I can't quite remember what it was right now, but
> I will try to reproduce it.

This is still voodoo programming.  And it's due to bad commit messages,
so we don't know what was actually being fixed :(

I suspect you were actually hitting quota or equiv, but I'm nervous in case
we're wrong.  I'm tempted to apply your patches, but *not* do it for TDB2,
where I will just report a failure.

Thanks,
Rusty.


More information about the samba-technical mailing list