[clug] using 'dd' in a pipeline

steve jenkin sjenkin at canb.auug.org.au
Sun Nov 23 23:46:49 MST 2014


On 24 Nov 2014, at 4:43 pm, Scott Ferguson <scott.ferguson.clug at gmail.com> wrote:

> xy...
> 
> Why not use Marc Stevens' tools instead?
> 
> Are you planning on renting an AWS large GPU service for this - or do
> you have your own GPU farm?
> 
> Kind regards

Links for Marc, for anyone playing along at home:
<http://marc-stevens.nl/research/>
<https://code.google.com/p/hashclash/>

1. I’m looking at the other side of the equation: not breaking MD5s, but building checks that are robust in the face of deliberately constructed collisions.
   Currently, the best I can do is to use MD5 or SHA1 as a ‘possible match’ for blocks,
   then use other methods to confirm “not a match”,
   with a byte-by-byte comparison needed for a conservative and definitive result. Slower, but correct beats fast.
   The list of hashes I build _must_ include provision for collisions, i.e. keep a list per ‘bucket’, not a single entry.

  I don’t need to rent time on a super-computer to do that :)
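A minimal sketch of that candidate/confirm scheme in shell (the file names and helper functions are my own illustration, not from the post): equal hashes are treated only as a hint of a match, and cmp(1) gives the conservative, byte-by-byte answer.

```shell
#!/bin/sh
# Sketch: treat equal MD5s only as a *candidate* match, then confirm
# byte-by-byte with cmp. File and function names here are illustrative.
set -e

printf 'hello world\n' > block_a
printf 'hello world\n' > block_b
printf 'different\n'   > block_c

hash_of() { md5sum "$1" | cut -d' ' -f1; }

candidate_match() {
    # Hash equality is only a hint: collisions are possible.
    [ "$(hash_of "$1")" = "$(hash_of "$2")" ]
}

confirmed_match() {
    # Conservative, definitive check: byte-by-byte comparison.
    candidate_match "$1" "$2" && cmp -s "$1" "$2"
}

if confirmed_match block_a block_b; then echo "a == b"; fi
if ! confirmed_match block_a block_c; then echo "a != c"; fi
```

The same shape works with a real bucket list: on a hash hit, walk every file already stored under that bucket and only declare a duplicate after cmp agrees.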

2. I want the generic solution to my question on splitting files with ‘dd’. Marc’s tools don’t do this; they do cryptanalysis.
   ‘dd’ has been around in some form since the early days of Unix, and Steve Bourne’s shell for nearly as long.
   I’d be very surprised if I’m the first person to have wanted to stick ‘dd’ in a pipeline and do something like this.
   No point in “reinventing the wheel” when someone has already built a better mousetrap, if you’ll excuse the mixed metaphor.
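For anyone wanting a concrete starting point, here is a minimal sketch of ‘dd’ pulling one block out of a stream inside a pipeline (the file name, block size, and counts are my own illustration; iflag=fullblock is a GNU dd extension):

```shell
#!/bin/sh
# Sketch: use dd inside a pipeline to extract and hash one block,
# without a temporary file. Sizes and names are illustrative.
set -e

# Build a small 8 KiB test file.
dd if=/dev/zero bs=1024 count=8 2>/dev/null | tr '\0' 'x' > sample.dat

# Extract the third 1 KiB block (skip 2, take 1) and hash it.
dd if=sample.dat bs=1024 skip=2 count=1 2>/dev/null | md5sum

# The same idea with dd reading from a pipe instead of a seekable file:
# cat stands in for any upstream producer. On a pipe, skip= reads and
# discards blocks; GNU dd's iflag=fullblock guards against short reads.
cat sample.dat | dd bs=1024 skip=2 count=1 iflag=fullblock 2>/dev/null | md5sum
```

Both invocations hash the same 1 KiB block, so their md5sums agree; the second form is the one that generalises to splitting an arbitrary stream.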

--

