[clug] using 'dd' in a pipeline

Eyal Lebedinsky eyal at eyal.emu.id.au
Sun Nov 23 20:49:37 MST 2014

On 24/11/14 13:56, steve jenkin wrote:
> I've been looking at deliberate MD5 collisions and went to write a simple script based on 'dd', only to fall flat on my face - detecting 'end of file' from STDIN.
> 'dd' doesn't consider EOF as an error and returns 'success'.
> I could use dd's 'iseek', but that ties me to named files in the file system :(
> I'd prefer to use a pipeline - current script below, but it requires I first know the size of the file, exactly the same issue as 'iseek'.
> One option is to catch the output of 'dd' in a temporary file, then test for its size, ending when the temp file is zero length.
> I tried using the bash 'read' with "-n 0", but that reads an entire line [not good for a binary file]
> 'IFS='' read -r -n 1' runs, but consumes a byte of the input and discards binary '\0' bytes :(
> The stat() and read() system calls either don't report EOF, or return zero-length at EOF.
> I could test if a program that does a read() of zero-length, then uses 'select' with a short timeout, would reliably detect EOF on STDIN if it was a pipe or file.
> It doesn't seem reliable & portable to me, but it may be.
> ioctl() and fcntl() didn't seem to have
> Anyone got ideas that I can use in a shell script? I'd prefer not to have to write code :)
> There may already be extended versions of 'dd' that do this (haven't checked these: dc3dd dcfldd dd_rescue ddd)
> Or suggestion on existing tools or how to implement a simple test for EOF on STDIN (pipe or file) in PERL, PHP or other scripting languages.
> Thanks in Advance
> steve
> =====================
> Two files with same MD5 hash (e06723d4961a0a3f950e7786f3766338)
> <http://natmchugh.blogspot.co.at/2014/10/how-i-created-two-images-with-same-md5.html>
> Files:
> brown.jpg <http://www.fishtrap.co.uk/james.jpg.coll>
> white.jpg <http://www.fishtrap.co.uk/barry.jpg.coll>
> =====================
> #!/bin/bash
> bs=4096
> blocks=1    # blocks read per dd call
> for file in white.jpg brown.jpg
> do
>    echo "$file blocksize=${bs}"
>    # number of bs-sized blocks in the file, rounded up to a whole number
>    size=$(ls -l "$file" | awk -v bs="$bs" '{print int(($5 + bs - 1) / bs)}')
>    cat "$file" |\
>    while [[ $size -gt 0 ]]
>    do
>      echo -n "$size "
>      dd bs=$bs count=$blocks 2>/dev/null | md5sum -
>      size=$(( size - blocks ))
>    done
>    echo
> done

Maybe I am just slow... is the need to detect a short 'dd' read in the code above?
Would this do it:
	dd bs=$bs count=$blocks 2>dd.2 |md5sum -
	read bytes xxx <<<"`grep bytes dd.2`"
	unset xxx
	echo "processed $bytes bytes"
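A minimal sketch of how that check might drive the loop, so the file size is not needed up front. It assumes GNU dd (for iflag=fullblock, which keeps reading from a pipe until the block is full) and md5sum; chunk_md5 is a made-up helper name, not an existing tool. As far as I know dd may print the singular "1 byte" for a one-byte chunk, so the pattern accepts both spellings, and LC_ALL=C keeps the summary line in English.

```shell
#!/bin/bash
# chunk_md5 (hypothetical helper): md5 each bs-byte chunk of stdin and
# stop when dd reports that it copied 0 bytes, i.e. stdin is at EOF.
chunk_md5() {
    local bs=${1:-4096} err bytes sum n=0
    err=$(mktemp) || return 1
    while :; do
        # dd writes its statistics to stderr; keep them in a temp file.
        sum=$(LC_ALL=C dd bs="$bs" count=1 iflag=fullblock 2>"$err" | md5sum)
        # The summary line starts with the byte count, e.g. "4096 bytes ...".
        bytes=$(awk '/^[0-9]+ bytes? /{print $1}' "$err")
        [[ ${bytes:-0} -gt 0 ]] || break
        n=$(( n + 1 ))
        echo "$n $bytes ${sum%% *}"    # chunk number, bytes read, md5
    done
    rm -f "$err"
}
```

Usage would be something like "cat white.jpg | chunk_md5 4096", printing one line per chunk.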

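The temporary-file option mentioned in the quoted message avoids parsing dd's messages altogether. A rough sketch, again assuming GNU dd and md5sum, with chunk_md5_tmp as a made-up name: dd truncates its output file on each call, so an empty file after the call means nothing was left on stdin.

```shell
#!/bin/bash
# chunk_md5_tmp (hypothetical helper): md5 each bs-byte chunk of stdin,
# ending when a chunk comes back zero length.
chunk_md5_tmp() {
    local bs=${1:-4096} tmp n=0
    tmp=$(mktemp) || return 1
    while :; do
        # Copy at most one block from stdin into the temp file.
        dd bs="$bs" count=1 iflag=fullblock of="$tmp" 2>/dev/null
        [[ -s $tmp ]] || break    # zero-length chunk: stdin is at EOF
        n=$(( n + 1 ))
        echo "$n $(wc -c <"$tmp" | tr -d ' ') $(md5sum <"$tmp" | cut -d' ' -f1)"
    done
    rm -f "$tmp"
}
```

This costs an extra write per chunk but does not depend on the wording or locale of dd's summary line.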

> --
> Steve Jenkin, IT Systems and Design
> 0412 786 915 (+61 412 786 915)
> PO Box 48, Kippax ACT 2615, AUSTRALIA
> mailto:sjenkin at canb.auug.org.au http://members.tip.net.au/~sjenkin

Eyal Lebedinsky (eyal at eyal.emu.id.au)
