[clug] January CLUG Programmers' Special Interest Group meeting

Mike Carden mike.carden at gmail.com
Mon Jan 8 22:41:58 GMT 2007

On 1/9/07, Steve Walsh <steve at nerdvana.org.au> wrote:
> Abstract:
>                 The CLUG's answer to thedailywtf.com

As a fan of the daily wtf and since I can't make it Thursday evening,
I'll share a wee wtf with you all here via the list.

Imagine, if you will, a Java application that creates text files with a
metadata wrapper. A typical file will contain a bunch of XML headers,
a content block (often base64 ASCII) and a bunch of XML footers.
Further imagine that one of the metadata elements in the footer is a
checksum.

The checksum is part of the metadata so that the integrity of the
file's contents can be checked at any time. So far, so good.

All this had been in place for some years, was subjected to a rigorous
testing regime and all was declared to be meeting specifications.

Then bring along a clueless newbie. Let's call him 'crash.'

Crash was idly looking at some nice fresh files one day and
daydreaming his way through the metadata. His roving eye lit upon the
checksum tag and he gazed at the familiar looking jumble of letters
and numbers it contained. Just for fun, he decided to manually run the
content through md5sum to verify the checksum. Oops, no match.
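
For anyone who wants to repeat Crash's check without reaching for
md5sum, here is a minimal sketch in Java. The class name, method names
and sample content are made up for illustration; the real file layout
isn't shown in this post, so treat it as a rough equivalent of
"md5sum the content and compare", nothing more.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ChecksumCheck {
    // Hex-encoded MD5 of the given bytes, lowercase like md5sum prints it.
    static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest(data)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // True when the checksum recorded in the metadata matches the content.
    // Assumes the content block is plain ASCII (e.g. base64 text).
    static boolean matches(String content, String recorded)
            throws NoSuchAlgorithmException {
        byte[] bytes = content.getBytes(StandardCharsets.US_ASCII);
        return md5Hex(bytes).equalsIgnoreCase(recorded.trim());
    }

    public static void main(String[] args) throws Exception {
        String content = "aGVsbG8gd29ybGQ=";  // hypothetical content block
        String recorded = args.length > 0 ? args[0] : "";  // value from the checksum tag
        System.out.println("computed: "
                + md5Hex(content.getBytes(StandardCharsets.US_ASCII)));
        System.out.println("match:    " + matches(content, recorded));
    }
}
```

Note that the answer depends entirely on which bytes you feed in
(content only, content plus tags, trailing newline or not), which is
exactly the guessing game described next.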

No problem. It's probably a checksum for the content plus its XML
tags. Try again. Nope. Maybe it should include the headers? No. The
footers? No. Random combinations of headers, footers, content and the
number of days in the month? No.

Oh, maybe it isn't MD5 after all. Try all of the above with a couple
of other checksum tests. Nope. Hmmmm.... "Fellas!"

A minor panic ensues. WTF does the checksum represent? Yes, it is an
md5sum and it is certainly different for each and every source file.

Ooooh. Small issue discovered with the way Java's Simple API for XML
(SAX) does its parsing of tags. Turns out the checksum in each case was
in fact an MD5 checksum of... an uninitialised byte array in the Java
virtual machine. Yes, a nice bit of pseudo-randomness.
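
The exact bug isn't spelled out here, so the following is only a
hedged sketch of how content can be collected with SAX before hashing
it, not the actual code from the application. The element name
"content" and all class and method names are assumptions. The point it
illustrates is a known SAX detail: characters() hands the handler the
parser's own buffer plus a start and a length, and may deliver one
element's text over several calls, so a handler that hashes some other
buffer, or the whole array, is not hashing the content.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class ContentDigest {
    // Collects the text of a hypothetical <content> element.
    static class CollectingHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private boolean inContent = false;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            if ("content".equals(qName)) inContent = true;
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("content".equals(qName)) inContent = false;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Only ch[start .. start+length) belongs to this callback;
            // append that slice and nothing else.
            if (inContent) text.append(ch, start, length);
        }
    }

    // Returns the text inside <content>, accumulated across callbacks.
    static String contentOf(String xml) throws Exception {
        CollectingHandler handler = new CollectingHandler();
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                handler);
        return handler.text.toString();
    }

    // Hex-encoded MD5 of the collected content, taken as UTF-8 bytes.
    static String md5OfContent(String xml) throws Exception {
        byte[] bytes = contentOf(xml).getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        for (byte b : MessageDigest.getInstance("MD5").digest(bytes)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<file><header>h</header><content>aGVsbG8=</content></file>";
        System.out.println(md5OfContent(xml));
    }
}
```

Hashing the accumulated text only after endElement has fired means the
digest is computed over bytes that were actually read from the file.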

Needless to say, the checksum now does what it's meant to, and even
more importantly the XML metadata now leaves no doubt as to the
algorithm used and the exact bits that it should be applied to. And
the developer who originally wrote it and the tester who passed it
have both moved on to rewarding careers in other domains, so everyone
is happy.

