"Nutch is open source web-search software. It builds on Lucene Java, adding web-specifics, such as a crawler, a link-graph database, parsers for HTML and other document formats, etc."
"The CDL eXtensible Text Framework (XTF) is a flexible indexing and query tool that supports searching across collections of heterogeneous data and presents results in a highly configurable manner."
"Project Ungava is a test-bed for innovative search, indexing, navigation and visualization of library catalog and scholarly journal article metadata and full-text."
"Montezuma is a Common Lisp port of Ferret. Ferret is a Ruby port of Lucene. Lucene is sort of Doug Cutting's Java version of Text Database (TDB), which he and Jan Pedersen developed at Xerox PARC, and which, to complete the circle, was written in Common Lisp."
LIUS is a Java indexing framework based on the Jakarta Lucene project. The LIUS framework adds to Lucene indexing support for many file formats, including MS Word, MS Excel, MS PowerPoint, RTF, PDF, XML, HTML, TXT, the OpenOffice suite, and JavaBeans.
Xapian is a Lucene-like engine. Michael Salib gave a presentation on Xapian at EuroPython titled "Stupidity and laser cat toys: Indexing the US Patent Database with Xapian and Twisted" that has not yet been published on the Web.