[clug] Unique Id's and CD's

Andrew Janke a.janke at gmail.com
Fri May 8 02:46:50 GMT 2009

2009/5/8 steve jenkin <sjenkin at canb.auug.org.au>:
> Something I'm unclear about, does each CD have just one image file or
> many?? Only images, or other stuff too?

Lots of (dicom) image files.

161609	/media/cdrom0/dicom/12345
578	/media/cdrom0/dicom/12345/22/40095876
578	/media/cdrom0/dicom/12345/22/40095892
578	/media/cdrom0/dicom/12345/22/40095908
578	/media/cdrom0/dicom/12345/22/40095924

(about 1000-2000 per CD depending on the sequences used).

Each file represents a single "slice" of a 3D imaging acquisition, and
you then have to read the headers to sort out which groups of files
belong together and whether a set of images is complete. This is
done downstream from the CD extraction though.
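
A minimal sketch of that downstream grouping step, not the actual
script. It assumes the headers have already been parsed into dicts by
some DICOM reader (not shown), that the standard SeriesInstanceUID and
InstanceNumber attributes are present, and that a complete series is
numbered 1..N without gaps, which is only a heuristic. The function
name group_slices is hypothetical.

```python
from collections import defaultdict


def group_slices(headers):
    """Group slice headers by series and list apparent gaps.

    `headers` is an iterable of dicts, one per file, holding at least
    the (assumed) keys "SeriesInstanceUID" and "InstanceNumber".
    """
    series = defaultdict(list)
    for h in headers:
        series[h["SeriesInstanceUID"]].append(int(h["InstanceNumber"]))

    report = {}
    for uid, numbers in series.items():
        present = set(numbers)
        # Heuristic: a complete series is numbered 1..max with no gaps.
        expected = set(range(1, max(present) + 1))
        report[uid] = {
            "count": len(numbers),
            "missing": sorted(expected - present),
        }
    return report
```

A series whose "missing" list is empty looks complete under this
assumption; trailing slices that never made it onto the CD would not be
detected this way.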

And then there is also the "dicomdir" meta-data file, but you cannot
always rely on it. Some CDs will also contain other things that are not
important to me. DICOM is an image file format from which you can dump
headers, but these CDs are database dumps for a specific patient from a
RIS-PACS hospital system, so the headers will contain differing
information for separate dumps of the same data.

> Something to test on your existing set of image files is the uniqueness
> of MD5's in the first 128Kb, 512Kb, 1Mb, 4Mb, ...
> And/or the last fraction of the file.

Aye, but given the DICOM header problem above, this would likely always
return a different checksum for CDs that have the same imaging data on
them but were dumped on different days/hours.
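
For anyone who wants to try the prefix test on their own files anyway,
here is a rough sketch using only the Python standard library. The
function names prefix_md5 and prefix_collisions are made up for this
example, and the 128 KiB default is just the first size suggested
above.

```python
import hashlib
from collections import defaultdict


def prefix_md5(path, nbytes):
    """MD5 of the first `nbytes` bytes of a file (whole file if shorter)."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read(nbytes)).hexdigest()


def prefix_collisions(paths, nbytes=128 * 1024):
    """Return groups of paths whose first `nbytes` bytes hash identically."""
    by_digest = defaultdict(list)
    for p in paths:
        by_digest[prefix_md5(p, nbytes)].append(p)
    # Only digests shared by more than one file are interesting.
    return [group for group in by_digest.values() if len(group) > 1]
```

Because the header sits at the start of each file, two dumps of the
same slice with different header contents would hash differently here,
which is the problem described above.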

> If your data providers decide to change media - like dual-layer DVD or
> USB HDD's, they could be sending you many objects on a single media :-(

Yup! I already use this same script for dumps of DVDs and USB (flash)
drives, depending on how the data is sent to me. The eventual goal is to
use proper DICOM transfer protocols, but it is not easy getting a network
connection into a hospital network. :)

Andrew Janke
(a.janke at gmail.com || http://a.janke.googlepages.com/)
Canberra->Australia    +61 (402) 700 883
