[clug] searching for a corporate search system
Mark Triggs
mst at dishevelled.net
Sat Jul 18 05:15:33 MDT 2009
Lucene is a popular open-source indexing engine, and Nutch builds on top
of Lucene to provide a crawler/indexer that can handle HTTP, filesystems
and various document types:
http://wiki.apache.org/nutch/
I must admit I don't have a huge amount of experience with administering
Nutch, but I hear good things about it. Lucene itself is certainly fast
and very flexible.
Cheers,
Mark
Andrew <andrew at donehue.net> writes:
> Hi All,
>
> I am considering purchasing a Google Mini for a corporate environment
> (searching intranet wikis, a few other types of web pages, and
> SMB/network shares). Types of documents that need to be searched (at a
> minimum) would be text, html, pdf, doc.
>
> Before buying the mini, I thought I should check to see what open source
> systems are available, and I found lots. To cut through the clutter I
> decided to ask this list... has anyone else here had a good
> experience with an open source search/indexing system that is fast, easy
> to use and easy to administer? Any pointers are greatly appreciated.
>
> Cheers,
> Andrew.
--
Mark Triggs
<mst at dishevelled.net>