This article contains an overview of the most important classes in Lucene.

Documents (Lucene.Net.Documents.Document)

Documents are the central entity in Lucene; they are what gets stored in the index. A document can represent any kind of information you want: database records, emails, HTML pages, Word documents, and so on. A document has no required attributes; it is basically just a list of [fields]. Optionally, you can set a boost for a document, which raises or lowers its weight when search results are ranked.

Fields (Lucene.Net.Documents.Field)

Fields are used to describe a [document]. A field is basically a key/value pair containing the name of the field and its value. Fields behave differently depending on their type: there are Keyword fields, UnIndexed fields, UnStored fields, and Text fields. You can use the static factory methods on Field, listed in the table below, to instantiate a field.

Field types

Field method/type                 Analyzed  Indexed  Stored  Usage
Field.Keyword(string, string)     -         x        x       URLs, nicknames, social security numbers
Field.Keyword(string, DateTime)   -         x        x
Field.UnIndexed(string, string)   -         -        x       Document type, when not used for search criteria
Field.UnStored(string, string)    x         x        -       Document titles and content
Field.Text(string, string)        x         x        x       Document titles and content
Field.Text(string, TextReader)    x         x        -       Document titles and content
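
The table boils down to three independent flags per field. The following is a minimal sketch in Python that models those flags; it is not Lucene.Net code, and the names FieldSpec, FIELD_TYPES, searchable and retrievable are hypothetical, chosen only for this illustration. The "Text" entry models the Field.Text(string, string) overload.

```python
from collections import namedtuple

# Hypothetical model of the three flags from the table; not the Lucene.Net API.
FieldSpec = namedtuple("FieldSpec", ["analyzed", "indexed", "stored"])

FIELD_TYPES = {
    "Keyword":   FieldSpec(analyzed=False, indexed=True,  stored=True),
    "UnIndexed": FieldSpec(analyzed=False, indexed=False, stored=True),
    "UnStored":  FieldSpec(analyzed=True,  indexed=True,  stored=False),
    "Text":      FieldSpec(analyzed=True,  indexed=True,  stored=True),
}


def searchable(field_type):
    """A field can serve as a search criterion only if it is indexed."""
    return FIELD_TYPES[field_type].indexed


def retrievable(field_type):
    """The original value can be read back from a result only if it is stored."""
    return FIELD_TYPES[field_type].stored
```

Read this way, an UnIndexed field is retrievable but not searchable, and an UnStored field is the opposite.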

IndexWriter (Lucene.Net.Index.IndexWriter)

The index writer is responsible for writing [documents] to an index. The index can be stored either in a directory on disk (Lucene.Net.Store.FSDirectory) or in memory (Lucene.Net.Store.RAMDirectory). It uses an [analyzer] to break up text before a [document] is stored in the index. You can either create a new index or modify an existing one.

Analyzer (Lucene.Net.Analysis.Analyzer)

This class and its derivatives are responsible for breaking text down into single words or terms and processing them. For example: removing commonly used words such as ‘the’, ‘and’ and ‘a’, reducing words to a base form (walked -> walk), or even adding synonyms.
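
To make the idea concrete, here is a toy analyzer written in Python. It is only a sketch of the concept, not how Lucene.Net implements it: the function names analyze and naive_stem are made up for this example, the stop-word list is just the three words mentioned above, and the "stemming" is a deliberately crude suffix rule rather than a real stemming algorithm.

```python
import re

# Stop words taken from the examples in the text; a real list is much longer.
STOP_WORDS = {"the", "and", "a"}


def naive_stem(word):
    """Crude illustration of stemming: strip a trailing 'ed' from longer words."""
    if word.endswith("ed") and len(word) > 4:
        return word[:-2]
    return word


def analyze(text):
    """Lowercase the text, split it into tokens, drop stop words, stem the rest."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [naive_stem(token) for token in tokens if token not in STOP_WORDS]


print(analyze("The dog walked and barked"))  # prints ['dog', 'walk', 'bark']
```

The same analysis has to be applied to the search text as to the indexed text, otherwise a search for ‘walked’ would never match the stored term ‘walk’.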

Term (Lucene.Net.Index.Term)

A term is a key/value pair on which you want to search, where the key is the name of the [field] and the value is the text to search for in that field.

IndexSearcher (Lucene.Net.Search.IndexSearcher)

The index searcher performs the actual search: it opens the index, searches through it using the search [query], and returns the [hits] matching the [query].
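
How terms and searching fit together can be sketched with a tiny in-memory index in Python. This is an assumption-laden toy, not the Lucene.Net API or its index format: the ToyIndex class and its methods are hypothetical, every field is split on whitespace and lowercased, and results are returned in insertion order instead of being ranked.

```python
from collections import defaultdict


class ToyIndex:
    """Hypothetical in-memory index mapping (field, word) terms to documents."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field name, word) -> set of doc ids
        self.documents = []               # doc id -> stored fields

    def add_document(self, fields):
        """Store the document and index every word of every field."""
        doc_id = len(self.documents)
        self.documents.append(fields)
        for name, value in fields.items():
            for word in value.lower().split():
                self.postings[(name, word)].add(doc_id)
        return doc_id

    def search(self, field, value):
        """Return the stored documents that contain the given term."""
        doc_ids = sorted(self.postings.get((field, value.lower()), ()))
        return [self.documents[doc_id] for doc_id in doc_ids]


index = ToyIndex()
index.add_document({"title": "Lucene in Action", "type": "book"})
index.add_document({"title": "Search engines", "type": "article"})
hits = index.search("title", "lucene")
```

After running this, hits holds only the first document, because it is the only one whose title field contains the term ‘lucene’.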

Query (Lucene.Net.Search.Query)

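A query describes what you want to find in the index. Query itself is an abstract base class; concrete subclasses such as TermQuery (match a single [term]) and BooleanQuery (combine other queries) express the actual search criteria.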

Hits (Lucene.Net.Search.Hits)

A list of all the documents returned from a search operation performed by the [indexsearcher].

That is it for this article. I hope you learned how Lucene works and that you understand the basic concepts. If you have additional information, comments or questions, please don’t hesitate to respond.

In my next article I will show you how to implement a basic indexing application that can index a bunch of plain text files from disk. It is not rocket science, but that is not the point.
