[clug] Digital Corpora - Govdocs1

Brad Hards bradh at frogmouth.net
Wed Jul 17 03:33:44 MDT 2013

I find myself in need of some data for testing an indexing application. It's an
open source project (DDF - https://github.com/codice/ddf), but we don't have a 
(public, common) shared data set.

Then I came across http://digitalcorpora.org/corpora/files

I'm happy to download it, but I'm guessing it's going to be ~300G to 400G,
which will take a while.

So before I start downloading (all of) it, I wondered if anyone in Canberra 
already had it, and would be willing to put it onto (my) removable media?

Also, if anyone wants it, and is willing to wait for me to download it (or
copy it), just let me know.

