[clug] stdio library, feof(3) can't detect EOF in a script [was Re: using 'dd' in a pipeline.]

steve jenkin sjenkin at canb.auug.org.au
Mon Nov 24 03:30:56 MST 2014

On 24 Nov 2014, at 2:33 pm, steve jenkin <sjenkin at canb.auug.org.au> wrote:

> feof(3) from the stdio library detects EOF.
> Not sure about the interaction of syscalls for open/read and library calls to stdio streams.
> The function feof() tests the end-of-file indicator for the stream pointed to by stream

feof(3) cannot work in a shell script to detect EOF on stdin.

A read call from the “stdio” library, such as fread(3), has to be made before the ‘stream’ is initialised, and only the act of reading can detect EOF.
The stdio ‘stream’ pre-reads a buffer (could be large).

There are two problems with that:
 - the size of the read has to be > zero, so the test for EOF itself consumes bytes
 - even if you try to read 1 byte at a time, much like fgetc(3), the library has already sucked in a buffer-full.

Exiting the process closes the file descriptor and discards the unread bytes, pretty much the exact opposite of what I want.
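
A minimal sketch of the read-ahead problem, assuming a POSIX system and a fully buffered stream. The helper name bytes_left_after_one_fgetc() is made up for illustration. It puts a few bytes in a pipe, asks stdio for one byte, then bypasses stdio to count what the kernel still holds for the file descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: put `total` bytes in a pipe, read ONE byte through
 * stdio, then count how many bytes read(2) can still get from the fd.
 * *eof_before_read receives what feof() reported before any read was made.
 * Returns the byte count left in the pipe, or -1 on a setup failure. */
long bytes_left_after_one_fgetc(size_t total, int *eof_before_read)
{
    /* Keep the test data well below the pipe capacity so write(2) cannot block. */
    assert(total > 0 && total <= 4096);

    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    for (size_t i = 0; i < total; i++) {
        char c = 'x';
        if (write(fds[1], &c, 1) != 1) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }
    close(fds[1]);              /* reader will see EOF after `total` bytes */

    FILE *in = fdopen(fds[0], "r");
    if (in == NULL) {
        close(fds[0]);
        return -1;
    }

    /* Nothing has been read yet, so the end-of-file indicator cannot be set. */
    *eof_before_read = (feof(in) != 0);

    /* Ask stdio for a single byte; the library fills its buffer behind our back. */
    if (fgetc(in) == EOF) {
        fclose(in);
        return -1;
    }

    /* Bypass stdio: whatever read(2) returns now is what stdio did NOT take. */
    long left = 0;
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        left += n;

    fclose(in);
    return left;
}
```

Called with a small total (say 100), I would expect feof() to report "not EOF" before the read, and far fewer than total - 1 bytes to be left for read(2), because the bytes are sitting in the stream's private buffer. Those buffered bytes are what gets thrown away when the process exits.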

The only process that knows when the system call read(2) hits EOF is the one consuming the data, in this case ‘dd’.

That means either modifying ‘dd’, or creating a minimal program that copies blocks from STDIN to STDOUT, where STDOUT can be a named pipe that ‘dd’ reads from.
Which sounds a lot like writing a stripped down, modified ‘dd’, but not nearly as good :(
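
A rough sketch of what such a minimal copier could look like, assuming plain POSIX read(2)/write(2) and no stdio in the data path. The function name copy_until_eof() and the 64kB buffer limit are my own choices for illustration, not anything ‘dd’ does.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: copy in_fd to out_fd in reads of at most block_size
 * bytes. It returns only after read(2) itself has returned 0, so the caller
 * knows EOF was really seen. Returns total bytes copied, or -1 on error. */
long copy_until_eof(int in_fd, int out_fd, size_t block_size)
{
    static char buf[65536];
    assert(block_size > 0 && block_size <= sizeof buf);

    long total = 0;
    for (;;) {
        ssize_t n = read(in_fd, buf, block_size);
        if (n == 0)
            return total;               /* EOF: the only reliable signal */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, try again */
            return -1;
        }

        /* write(2) may accept fewer bytes than asked, so loop until done. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
}
```

A main() would just call copy_until_eof(STDIN_FILENO, STDOUT_FILENO, 4096) and turn the return value into an exit status. Because nothing is buffered in user space between calls, no input is consumed beyond what has been written out.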

John Mills’ solution is correct & complete and just proves how far ahead of the game Ousterhout’s TCL was :(
[PERL can do everything, probably up to and including “reboot Universe;”, I know that as well.]

Eyal’s suggestion of capturing the STDERR of ‘dd’ and parsing the output is good, but presents quite a few challenges, not the least of which is extracting all possible messages and writing a reasonable parser for them. That could be just grep, but might not be.

My test files are exactly 25 blocks (of 4kB) long; they don’t finish with a short block. Because the length is an exact multiple of the block size, no other hint is given.
EOF is only detected next iteration when ‘dd’ reads, and writes, zero bytes, which causes an extra hash to be written.
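
To make the extra iteration concrete, here is a small sketch that records what each read(2) returns on a regular file. Both helper names (trace_reads() and make_test_file()) are hypothetical, and the test file is an unnamed temporary file filled with zero bytes rather than my real data.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: record the return value of every read(2) on fd,
 * stopping after the first 0 (EOF) or after max_calls calls.
 * Returns the number of read(2) calls made, or -1 on a read error. */
int trace_reads(int fd, size_t block_size, ssize_t *sizes, int max_calls)
{
    char buf[4096];
    assert(block_size > 0 && block_size <= sizeof buf);

    int calls = 0;
    while (calls < max_calls) {
        ssize_t n = read(fd, buf, block_size);
        if (n < 0)
            return -1;
        sizes[calls++] = n;
        if (n == 0)
            break;                      /* EOF shows up only as a zero-byte read */
    }
    return calls;
}

/* Hypothetical helper: temporary file holding `blocks` blocks of 4096 zero
 * bytes, with the underlying fd positioned back at the start. */
FILE *make_test_file(int blocks)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return NULL;

    char block[4096] = {0};
    for (int i = 0; i < blocks; i++) {
        if (fwrite(block, 1, sizeof block, f) != sizeof block) {
            fclose(f);
            return NULL;
        }
    }
    /* Push everything to the kernel, then rewind the fd itself. */
    if (fflush(f) != 0 || lseek(fileno(f), 0, SEEK_SET) < 0) {
        fclose(f);
        return NULL;
    }
    return f;
}
```

With make_test_file(25) and a 4096 byte block size, the first 25 reads each return a full block and it takes a 26th call, returning zero, to learn that the file has ended. That 26th call is the iteration that produces the extra hash.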

Clearly others have solved this problem in the past. I need to keep chipping away.

