[clug] using 'dd' in a pipeline

Scott Ferguson scott.ferguson.clug at gmail.com
Mon Nov 24 00:30:53 MST 2014


Please don't email me directly and CC the list, or vice versa. I
subscribe to the list[*1], and it breaks my one-click "Reply to list"
ability.

On 24/11/14 17:46, steve jenkin wrote:
> 
> On 24 Nov 2014, at 4:43 pm, Scott Ferguson <scott.ferguson.clug at gmail.com> wrote:
> 
>> xy...
>>
>> Why not use Marc Stevens' tools instead?
>>
>> Are you planning on renting an AWS large GPU service for this - or do
>> you have your own GPU farm?
>>
>> Kind regards
> 
> Links for Marc, for anyone playing along at home:
> <http://marc-stevens.nl/research/>
> <https://code.google.com/p/hashclash/>
> 
> 1. I’m looking at the other side of the equation, not breaking MD5’s, but building checks that are robust in the face of deliberately constructed collisions.

Again, xy problem?
https://marc-stevens.nl/research/software/download.php?file=libdetectcoll-0.2.zip

>   Currently, the best I can do is to use MD5 or SHA1 as ‘possible match’ of blocks,
>    then use other methods to confirm “not a match”,
>    with a byte-by-byte comparison needed for a conservative and definitive comparison. Slower, but correct is better.
>    The list of hashes I built _must_ include provision for collisions, i.e. keep a list, not a single entry per ‘bucket’.
> 
>   Don’t need to rent time on a super-computer to do that :)
> 
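
For anyone following along, the "list per bucket" idea might be sketched
as below. This is only my reading of the description above, not Steve's
code: the block size, the add_block() name and the choice of MD5 as the
first-pass hash are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size, pick whatever suits the data


def add_block(index, block):
    """Store block unless an identical one is already indexed.

    Returns True if the block was new, False if it was a confirmed
    duplicate. The hash only nominates candidates; equality decides.
    """
    digest = hashlib.md5(block).digest()
    # Each bucket is a list, so different blocks with the same digest
    # (a deliberate collision) can sit side by side.
    for candidate in index[digest]:
        if candidate == block:  # byte-by-byte comparison is definitive
            return False
    index[digest].append(block)
    return True


index = defaultdict(list)
print(add_block(index, b"a" * BLOCK_SIZE))  # True, first time seen
print(add_block(index, b"a" * BLOCK_SIZE))  # False, confirmed duplicate
```

A constructed collision then costs one extra comparison in that bucket
rather than producing a false "match".
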
> 2. I want the generic solution to my question on splitting files with ‘dd’. 

The problem 'appears' to be that you need a known outcome to feed dd, and
in this instance you won't have that until you've done the analysis.
Corrections to my 'understanding' welcomed.
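
To illustrate what I mean: once a start offset and a length are known,
pulling that range out of a stream is the easy part. The sketch below
does roughly what `dd bs=1 skip=N count=M` does, in Python. The
copy_range() name and its parameters are mine, purely for illustration,
and it assumes the offsets have already been worked out somewhere else.

```python
import io


def copy_range(src, dst, skip, count, chunk=65536):
    """Copy count bytes from src to dst after discarding skip bytes.

    Works on non-seekable streams such as a pipe, so the skipped bytes
    are read and thrown away rather than seek()ed over.
    Returns the number of bytes actually copied.
    """
    remaining = skip
    while remaining > 0:
        data = src.read(min(chunk, remaining))
        if not data:
            return 0  # input ended before the start offset
        remaining -= len(data)
    copied = 0
    while copied < count:
        data = src.read(min(chunk, count - copied))
        if not data:
            break  # short input: copy what there was
        dst.write(data)
        copied += len(data)
    return copied


src = io.BytesIO(b"0123456789")
dst = io.BytesIO()
copy_range(src, dst, skip=3, count=4)
print(dst.getvalue())  # b'3456'
```

In a real pipeline src and dst would be sys.stdin.buffer and
sys.stdout.buffer. The hard part, knowing skip and count in advance, is
untouched by this.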

> Marc’s tools don’t do this, they do cryptanalysis.

Please see the supplied link above.

>    ‘dd’ has been around in some form since 1970, Steve Bourne's shell for a similar time.

Yes.

>     I’d be very surprised if I’m the first person to have wanted to stick ‘dd’ in a pipeline and do something like this.

Undoubtedly you are correct. One approach that allows for the nature of
JPEGs *'might'* be to search for the end-of-image marker, 0xFF 0xD9.
But... I don't really understand your approach to the problem. :(
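
A rough sketch of that marker search, in Python, in case it helps. The
find_eoi_offsets() name and the sample bytes are made up for
illustration. Treat every hit as a candidate only: the same byte pair
can turn up inside an embedded thumbnail, so it does not prove where a
file ends.

```python
JPEG_EOI = b"\xff\xd9"  # JPEG end-of-image marker


def find_eoi_offsets(data):
    """Return the offset just past each 0xFF 0xD9 pair found in data."""
    offsets = []
    pos = data.find(JPEG_EOI)
    while pos != -1:
        offsets.append(pos + len(JPEG_EOI))
        pos = data.find(JPEG_EOI, pos + 1)
    return offsets


# two tiny fake "images" back to back: SOI, filler, EOI
sample = b"\xff\xd8AAAA\xff\xd9\xff\xd8BB\xff\xd9"
print(find_eoi_offsets(sample))  # [8, 14]
```

Each offset is a possible cut point, i.e. the sort of number dd would
need for skip= and count= (with bs=1).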



>     No point in “reinventing the wheel”, when someone has already built a better mousetrap, if you’ll excuse the mixed metaphor.

Agreed - hence my suggestions. Though, the exercise may have
serendipitous results, and even failure is not without benefits - it's
the weighting of the results and benefits that's especially problematic.

> 
> --
> 

Kind regards


[*1] Meant gently - and mainly for the benefit of other (future)
readers. Established netiquette is not to CC a poster unless they ask.

