[clug] Perl (or python) page scraping tools

Michael James clug at james.st
Tue Aug 8 07:18:26 GMT 2006

The grubby task of HTML page-scraping is rearing its ugly head.

The first task is to snarf a 2-column table:
 the first column is the variable name,
 the second its value.

Ideally I'd like to get back a hash:   $table{$name} = $value

Sounds simple, but the HTML::TableParse module
 returns a structure that is complicated and far more general than I need.

Anyone got any recommendations of modules for scraping HTML?
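
For concreteness, here is a rough sketch of the result being asked for,
done in Python with only the standard library's html.parser. The names
TwoColumnTable, scrape_table and SAMPLE are made up for illustration, and
the sketch assumes the page holds a single simple table with no nesting;
rows that do not have exactly two cells are skipped.

```python
from html.parser import HTMLParser


class TwoColumnTable(HTMLParser):
    """Collect every two-cell table row as a name -> value entry."""

    def __init__(self):
        super().__init__()
        self.table = {}     # resulting name -> value mapping
        self._row = None    # cells of the current row, None outside a row
        self._cell = None   # text pieces of the current cell, None outside

    def _end_cell(self):
        # Close the open cell (if any) and store its stripped text.
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _end_row(self):
        # Close the open row (if any); keep it only if it has two cells.
        self._end_cell()
        if self._row is not None and len(self._row) == 2:
            name, value = self._row
            self.table[name] = value
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._end_row()          # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._end_cell()         # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._end_cell()
        elif tag in ("tr", "table"):
            self._end_row()


def scrape_table(html):
    """Return a dict built from the two-column rows found in html."""
    parser = TwoColumnTable()
    parser.feed(html)
    parser.close()
    return parser.table


# Made-up sample input, purely for illustration.
SAMPLE = """
<table>
  <tr><td>hostname</td><td>gateway</td></tr>
  <tr><td>uptime</td><td>42 days</td></tr>
</table>
"""

if __name__ == "__main__":
    print(scrape_table(SAMPLE))
```

Calling scrape_table(page) on the fetched page text gives back a plain
dict, so a lookup is just table["hostname"]. Real pages with nested
tables or several tables would need more care than this.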


There is no perl one line hack
 that a page of java won't do more elegantly.
