[clug] searching for a corporate search system

Mark Triggs mst at dishevelled.net
Sat Jul 18 05:15:33 MDT 2009


Lucene is a popular open-source indexing engine, and Nutch builds on top
of Lucene to provide a crawler/indexer that can handle HTTP, filesystems
and various document types:

  http://wiki.apache.org/nutch/

I must admit I don't have a huge amount of experience with administering
Nutch, but I hear good things about it.  Lucene itself is certainly fast
and very flexible.
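
For anyone wondering what an "indexing engine" actually does, here is a
toy sketch in Python of the core idea, an inverted index that maps each
term to the documents containing it.  To be clear, this is not Lucene's
or Nutch's API, and the document names and the helper functions
(tokenize, build_index, search) are made up for illustration.  A real
engine adds ranking, stemming, on-disk storage and much more.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Start with the matches for the first term, then intersect the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical documents standing in for crawled wiki pages and SMB files.
docs = {
    "wiki/search.html": "Intranet wiki page about search",
    "share/report.txt": "Quarterly report stored on a network share",
    "wiki/backup.html": "Wiki page about backup of the network share",
}

index = build_index(docs)
print(sorted(search(index, "wiki page")))
print(sorted(search(index, "network share")))
```

Roughly speaking, a crawler like Nutch would supply the contents of
"docs" (after fetching pages and extracting text from PDF, DOC and so
on), and Lucene would take care of the index and query side.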

Cheers,

Mark


Andrew <andrew at donehue.net> writes:

> Hi All,
>
> I am considering purchasing a Google Mini for a corporate environment
> (searching intranet wikis, a few other types of web pages, and
> SMB/network shares).  Types of documents that need to be searched (at a
> minimum) would be text, HTML, PDF, DOC.
>
> Before buying the mini, I thought I should check to see what open source
> systems are available, and I found lots.  To cut through the clutter I
> decided to ask this list... has anyone else here had a good
> experience with an open source search/indexing system that is fast, easy
> to use and easy to administer? Any pointers are greatly appreciated.
>
>
>
> Cheers,
> Andrew.

-- 
Mark Triggs
<mst at dishevelled.net>

