
Apache Solr Content Extraction Library (Solr Cell)

Introduction
------------

Apache Solr Extraction provides a means of extracting and indexing content contained in "rich" documents, such
as Microsoft Word and Adobe PDF files. (Each name is a trademark of its respective owner.) This contrib module
uses Apache Tika to extract content and metadata from the files, which can then be indexed. For more information,
see http://wiki.apache.org/solr/ExtractingRequestHandler

Getting Started
---------------

You will need Solr up and running. Then add the extraction JAR file, plus the Tika dependencies (in the ./lib folder),
to your Solr Home lib directory. See http://wiki.apache.org/solr/ExtractingRequestHandler for more details on hooking it in
and configuring it.
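
The copy step described above could be scripted roughly as follows. This is only a sketch: the function name
install_extraction_jars is hypothetical, and it assumes the extraction JAR sits directly in the contrib
directory with the Tika dependencies under its ./lib folder. Adjust the paths to match your installation.

```python
import shutil
from pathlib import Path


def install_extraction_jars(contrib_dir, solr_home):
    """Copy the extraction JAR and the Tika dependency JARs into <solr_home>/lib.

    Assumed layout (not confirmed by this README): the extraction JAR is
    directly inside contrib_dir and the Tika dependencies are in contrib_dir/lib.
    Returns the list of copied file names.
    """
    contrib_dir = Path(contrib_dir)
    target = Path(solr_home) / "lib"
    # Create the Solr Home lib directory if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)

    # Module JAR(s) first, then every JAR found under ./lib.
    jars = sorted(contrib_dir.glob("*.jar")) + sorted((contrib_dir / "lib").glob("*.jar"))

    copied = []
    for jar in jars:
        # copy2 preserves timestamps; an existing file of the same name is overwritten.
        shutil.copy2(jar, target / jar.name)
        copied.append(jar.name)
    return copied
```

For example, install_extraction_jars("contrib/extraction", "/path/to/solr/home") would copy the JARs into
/path/to/solr/home/lib (both paths are placeholders). Solr typically needs a restart before it picks up
newly added JARs.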