I am using a MySQL database and have been using database-driven search. What are the advantages and disadvantages of database engines compared with the Lucene search engine? I would like suggestions about when and where to use each of them.
asked Jan 9 '11 at 10:01
I suggest you read Full Text Search Engines vs. DBMS. A one-liner would be: if the bulk of your use case is full-text search, use Lucene. If the bulk of your use case is joins and other relational operations, use a database. You may use a hybrid solution for a more complicated use case.
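One way to picture a hybrid solution is the rough sketch below: the text index returns matching ids, and the database applies the relational filter to those ids. Everything here is an assumption made for illustration: the `articles` table, the `fake_text_index` function (a stand-in for a real Lucene query, not Lucene's API), and the use of SQLite instead of MySQL so the example runs on its own.

```python
import sqlite3

# Hypothetical data: in a real setup the bodies would be indexed by Lucene
# and the rows would live in MySQL.
ROWS = [
    (1, "alice", "Lucene full text search"),
    (2, "bob", "full text search in MySQL"),
    (3, "alice", "relational joins in MySQL"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, author TEXT, body TEXT)")
conn.executemany("INSERT INTO articles VALUES (?, ?, ?)", ROWS)


def fake_text_index(query):
    """Stand-in for the text engine: ids of bodies containing every query word."""
    words = query.lower().split()
    return [doc_id for doc_id, _, body in ROWS
            if all(w in body.lower().split() for w in words)]


def hybrid_search(conn, query, author):
    """Text search first, then a relational filter on the matching ids."""
    hits = fake_text_index(query)
    if not hits:
        return []
    placeholders = ",".join("?" for _ in hits)
    sql = (f"SELECT id, author FROM articles "
           f"WHERE id IN ({placeholders}) AND author = ?")
    rows = conn.execute(sql, (*hits, author)).fetchall()
    by_id = {row[0]: row for row in rows}
    # Keep the order returned by the text engine (its ranking).
    return [by_id[i] for i in hits if i in by_id]


hybrid_result = hybrid_search(conn, "full text", "alice")
```

The point of the split is that each side does what it is good at: the text engine decides which documents match and in what order, and the database handles the filtering and joins.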
Use Lucene when you want to index textual documents (of any length) and search for text within those documents, returning a ranked list of documents that match the search query. The classic example is a web search engine such as Google, which uses a text indexer comparable to Lucene to index and query the content of web pages.
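To make "index documents, then return a ranked list" concrete, here is a minimal sketch of an inverted index in plain Python. It is not how Lucene is implemented; the function names and the ranking rule (count of distinct query terms matched) are assumptions chosen to keep the example short.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return document ids ranked by how many distinct query terms they contain."""
    hits = defaultdict(int)
    for term in set(tokenize(query)):
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    # Most matching terms first; ties broken by document id.
    return sorted(hits, key=lambda d: (-hits[d], d))


docs = {
    1: "Lucene is a full text search library",
    2: "MySQL is a relational database",
    3: "Full text search in a relational database",
}
index = build_index(docs)
results = search(index, "full text database")
```

Here `results` lists document 3 first because it contains all three query terms, then document 1 (two terms), then document 2 (one term). A real engine replaces the simple count with a proper similarity score.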
The advantages of using Lucene over a database like MySQL for indexing and searching text are:
- For the developer: tools to analyse, parse and index textual information (e.g. stemming, plurals, synonyms, tokenisation) in multiple languages. Lucene also scales very well for text search.
- For the user: quality search results. Lucene uses a very good similarity function (to compare the search query against each document), at the heart of which are cosine similarity and term frequency / inverse document frequency (TF-IDF) weighting. This gives good search results with very little tweaking required upfront.
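The similarity idea mentioned above can be sketched in a few lines of Python. This is a simplified illustration of TF-IDF weighting with cosine similarity, not Lucene's actual scoring formula; the smoothed IDF variant and all function names are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (an assumed variant).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(tokens).items()}
               for tokens in tokenized]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(docs, query):
    """Return indices of matching documents, most similar to the query first."""
    vectors, idf = tfidf_vectors(docs)
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scores = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    return [i for s, i in sorted(scores, key=lambda p: (-p[0], p[1])) if s > 0]


docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stock markets fell sharply",
]
ranking = rank(docs, "cat mat")
```

For the query "cat mat", the first document ranks above the second because it matches both terms, and "mat" is rarer in the collection than "cat", so it carries more weight. The third document shares no terms with the query and is dropped.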
There is a lot of useful info on Lucene here.
We used SQL Server at work for some queries that relied on full-text search. With large amounts of data, SQL Server performs an inner join between the result set returned by the full-text search and the rest of the query, which can be slow if the database runs on a low-powered machine (2 GB of RAM for 20 GB of data). Switching the same query to Lucene improved speed considerably.