Xindice API
version 1.2m1

org.apache.xindice.core.indexer
Class LuceneIndexer

java.lang.Object
  extended by org.apache.xindice.core.indexer.LuceneIndexer
All Implemented Interfaces:
Configurable, DBObject, Indexer, Named

public final class LuceneIndexer
extends Object
implements Indexer, DBObject

LuceneIndexer maintains full-text indexes. It operates on documents rather than elements and allows searching for documents using native Lucene queries. There can be only one LuceneIndexer per collection; however, it may have more than one IndexPattern.

Every IndexPattern corresponds to a Lucene document field. For every Xindice document, the values of all matching elements are indexed in a single Lucene document, which makes it possible to search across patterns.

Sample LuceneIndexer configuration:

 <index name='fulltext' class='org.apache.xindice.core.indexer.LuceneIndexer'
                        analyzer='org.apache.lucene.analysis.SimpleAnalyzer'>
   <pattern pattern='meta@title' alias='title'/>
   <pattern pattern='description' alias='text'/>
 </index>

To search over this sample index, one could issue a query "title:tutorial AND text:xml".
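
As a rough illustration of how such a fielded query string is put together, the sketch below joins alias/term pairs with AND. The class FieldQuerySketch and its method allOf are hypothetical names introduced here; they are not part of Xindice or Lucene. The sketch only builds text, does not call either library, and assumes every term is a single word that needs no escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Xindice or Lucene: it only builds query text.
final class FieldQuerySketch {
    private FieldQuerySketch() {}

    /**
     * Joins alias:term pairs with AND, in the iteration order of the map.
     * Assumption: terms are single words that need no query-syntax escaping.
     */
    static String allOf(Map<String, String> termsByAlias) {
        StringJoiner joiner = new StringJoiner(" AND ");
        for (Map.Entry<String, String> entry : termsByAlias.entrySet()) {
            joiner.add(entry.getKey() + ":" + entry.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Aliases taken from the sample configuration: 'title' and 'text'.
        Map<String, String> terms = new LinkedHashMap<>();
        terms.put("title", "tutorial");
        terms.put("text", "xml");
        System.out.println(allOf(terms)); // prints title:tutorial AND text:xml
    }
}
```

The field names in the query are the pattern aliases from the configuration, not the element names of the Xindice documents.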

For more details about LuceneIndexer configuration, see the documentation for setConfig(org.apache.xindice.util.Configuration).

Version:
$Revision: 586647 $, $Date: 2007-10-19 20:32:43 -0400 (Fri, 19 Oct 2007) $
Author:
Andy Armstrong

Field Summary
static String DEFANALYZER
           
static String KEYNAME
           
 
Fields inherited from interface org.apache.xindice.core.indexer.Indexer
STYLE_FULLTEXT, STYLE_NODENAME, STYLE_NODEVALUE
 
Constructor Summary
LuceneIndexer()
           
 
Method Summary
 boolean close()
          close closes the DBObject
 boolean create()
          Creates necessary resources.
 boolean drop()
          drop instructs the DBObject implementation to remove itself from existence.
 boolean exists()
          exists returns whether or not a physical representation of this DBObject actually exists.
 void flush()
          flush forcefully flushes any unwritten buffers to disk.
 org.apache.lucene.analysis.Analyzer getAnalyzer()
           
 Configuration getConfig()
          getConfig retrieves the configuration information for the Configurable object instance.
 IndexerEventHandler getIndexerEventHandler()
          Creates new instance of a handler to listen to indexer events.
 String getIndexStyle()
          getIndexStyle returns the Index style.
 String getName()
          getName retrieves the contextually important name of the object
 String getPatternAlias(IndexPattern pattern)
          Returns the alias for the given pattern.
 IndexPattern[] getPatterns()
          Returns this Indexer's patterns.
 boolean isOpened()
          isOpened returns whether or not the DBObject is opened for business.
 boolean open()
          open opens the DBObject
 IndexMatch[] queryMatches(IndexQuery query)
          queryMatches retrieves a set of IndexMatch instances that match the supplied query.
 IndexMatch[] queryMatches(org.apache.lucene.search.Query query)
          Same as Indexer.queryMatches(IndexQuery), but accepts a compiled Lucene query as the parameter.
 void setCollection(Collection collection)
          setCollection tells the Indexer who its parent is.
 void setConfig(Configuration config)
          Configures LuceneIndexer instance.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

KEYNAME

public static final String KEYNAME
See Also:
Constant Field Values

DEFANALYZER

public static final String DEFANALYZER
See Also:
Constant Field Values
Constructor Detail

LuceneIndexer

public LuceneIndexer()
Method Detail

getIndexStyle

public String getIndexStyle()
Description copied from interface: Indexer
getIndexStyle returns the Index style. Different query languages will need to draw from different indexing styles. For example, a query written in Quilt will require XPath indexing.

Specified by:
getIndexStyle in interface Indexer
Returns:
The index style

getPatterns

public IndexPattern[] getPatterns()
Returns this Indexer's patterns. LuceneIndexer may have more than one pattern.

Specified by:
getPatterns in interface Indexer
Returns:
Indexer's patterns

getPatternAlias

public String getPatternAlias(IndexPattern pattern)
Returns the alias for the given pattern. If this exact pattern is not indexed, the method looks for a matching indexed pattern.

Parameters:
pattern - IndexPattern
Returns:
Alias for the closest matching pattern, or null if there is none

setConfig

public void setConfig(Configuration config)
               throws XindiceException
Configures LuceneIndexer instance.
index
Top Indexer configuration element. Can have one or more pattern child elements. Its attributes:
  • name - Indexer name. Required.
  • class - Indexer class. Required. org.apache.xindice.core.indexer.LuceneIndexer for full text index.
  • analyzer - Analyzer to use for indexing. Optional, org.apache.lucene.analysis.SimpleAnalyzer by default.
pattern
Child element. Indexer must have at least one pattern. Its attributes:
  • pattern - IndexPattern. For acceptable formats, see Indexer.getPatterns()
  • alias - Name of the field to store/search values for that pattern.
default
Child element. Optional. Its attributes:
  • alias - Indicates the pattern alias that will be used as the default field for search. If omitted, there is no default field and a search query has to include a field name for every term.

Specified by:
setConfig in interface Configurable
Parameters:
config - Configuration to apply
Throws:
XindiceException - Configuration does not have the required information, or the Analyzer could not be instantiated.
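
The configuration layout described above can be illustrated with a small reader built on the JDK's XML classes. This is a sketch only: IndexConfigSketch, parse and required are hypothetical names, and this is not the code Xindice uses to process its Configuration. It assumes that the pattern and alias attributes of each pattern element are both mandatory, which the description above does not state explicitly.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical reader for the <index> layout; uses JDK classes only, not Xindice.
final class IndexConfigSketch {
    static final String DEFAULT_ANALYZER = "org.apache.lucene.analysis.SimpleAnalyzer";

    final String name;
    final String analyzer;
    final Map<String, String> aliasByPattern;
    final String defaultAlias; // null when there is no <default> element

    private IndexConfigSketch(String name, String analyzer,
                              Map<String, String> aliasByPattern, String defaultAlias) {
        this.name = name;
        this.analyzer = analyzer;
        this.aliasByPattern = aliasByPattern;
        this.defaultAlias = defaultAlias;
    }

    static IndexConfigSketch parse(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Configuration could not be parsed", e);
        }
        Element index = doc.getDocumentElement();
        String name = required(index, "name");
        required(index, "class");
        // The analyzer attribute is optional and falls back to SimpleAnalyzer.
        String analyzer = index.getAttribute("analyzer");
        if (analyzer.isEmpty()) {
            analyzer = DEFAULT_ANALYZER;
        }
        NodeList patterns = index.getElementsByTagName("pattern");
        if (patterns.getLength() == 0) {
            throw new IllegalArgumentException("At least one <pattern> is required");
        }
        Map<String, String> aliases = new LinkedHashMap<>();
        for (int i = 0; i < patterns.getLength(); i++) {
            Element pattern = (Element) patterns.item(i);
            // Assumption: both attributes are mandatory on every <pattern>.
            aliases.put(required(pattern, "pattern"), required(pattern, "alias"));
        }
        NodeList defaults = index.getElementsByTagName("default");
        String defaultAlias = defaults.getLength() == 0
                ? null
                : required((Element) defaults.item(0), "alias");
        return new IndexConfigSketch(name, analyzer, aliases, defaultAlias);
    }

    private static String required(Element element, String attribute) {
        String value = element.getAttribute(attribute); // "" when the attribute is absent
        if (value.isEmpty()) {
            throw new IllegalArgumentException(
                    "<" + element.getTagName() + "> needs attribute '" + attribute + "'");
        }
        return value;
    }
}
```

Calling IndexConfigSketch.parse on a configuration without an analyzer attribute yields the SimpleAnalyzer class name, and a configuration without any pattern element is rejected, mirroring the rules listed above.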

    getConfig

    public Configuration getConfig()
    Description copied from interface: Configurable
    getConfig retrieves the configuration information for the Configurable object instance.

    Specified by:
    getConfig in interface Configurable
    Returns:
    The configuration Node

    exists

    public boolean exists()
    Description copied from interface: DBObject
    exists returns whether or not a physical representation of this DBObject actually exists. In the case of a HashFiler, this would check for the file, and in the case of an FTPFiler, it might perform a connection check.

    Specified by:
    exists in interface DBObject
    Returns:
    Whether or not the physical resource exists

    create

    public boolean create()
                   throws DBException
    Creates necessary resources.

    Specified by:
    create in interface DBObject
    Returns:
    true, if successful
    Throws:
    DBException - A low-level IOException prevented the index from creating resources.
    DuplicateIndexException - Parent collection already has full text index

    open

    public boolean open()
                 throws DBException
    Description copied from interface: DBObject
    open opens the DBObject

    Specified by:
    open in interface DBObject
    Returns:
    Whether or not the DBObject was opened
    Throws:
    DBException - if operation failed

    isOpened

    public boolean isOpened()
    Description copied from interface: DBObject
    isOpened returns whether or not the DBObject is opened for business.

    Specified by:
    isOpened in interface DBObject
    Returns:
    The open status of the DBObject

    close

    public boolean close()
                  throws DBException
    Description copied from interface: DBObject
    close closes the DBObject

    Specified by:
    close in interface DBObject
    Returns:
    Whether or not the DBObject was closed
    Throws:
    DBException - if operation failed

    drop

    public boolean drop()
                 throws DBException
    Description copied from interface: DBObject
    drop instructs the DBObject implementation to remove itself from existence. The DBObject's parent is responsible for removing any references to the DBObject in its own context.

    Specified by:
    drop in interface DBObject
    Returns:
    Whether or not the DBObject was dropped
    Throws:
    DBException - if operation failed

    getName

    public String getName()
    Description copied from interface: Named
    getName retrieves the contextually important name of the object

    Specified by:
    getName in interface Named
    Returns:
    The object's name

    setCollection

    public void setCollection(Collection collection)
    Description copied from interface: Indexer
    setCollection tells the Indexer who its parent is.

    Specified by:
    setCollection in interface Indexer
    Parameters:
    collection - The owner Collection

    getAnalyzer

    public org.apache.lucene.analysis.Analyzer getAnalyzer()

    flush

    public void flush()
               throws DBException
    Description copied from interface: Indexer
    flush forcefully flushes any unwritten buffers to disk.

    Specified by:
    flush in interface Indexer
    Throws:
    DBException

    getIndexerEventHandler

    public IndexerEventHandler getIndexerEventHandler()
    Creates a new instance of a handler to listen to indexer events. For every document that is being added, there will be a separate handler that assembles all relevant values into a single Lucene document.

    Specified by:
    getIndexerEventHandler in interface Indexer
    Returns:
    new instance of IndexerEventHandler
    See Also:
    IndexerEventHandler

    queryMatches

    public IndexMatch[] queryMatches(IndexQuery query)
                              throws DBException
    Description copied from interface: Indexer
    queryMatches retrieves a set of IndexMatch instances that match the supplied query. The matches are then used by the QueryEngine in co-sequential processing. If this indexer doesn't support the passed value, it should return 'null'. If no matches are found, it should return an empty set. queryMatches will typically be used in XPath processing.

    Specified by:
    queryMatches in interface Indexer
    Parameters:
    query - The IndexQuery to use
    Returns:
    The resulting matches
    Throws:
    DBException
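
The return-value contract described above distinguishes a null result (the indexer does not support the query) from an empty result (the query is supported but nothing matched). A caller might separate the two cases roughly as follows. MatchResultSketch, Outcome and classify are hypothetical names, and Object[] stands in for IndexMatch[] so that the sketch compiles without Xindice on the classpath.

```java
// Hypothetical caller-side helper; Object[] stands in for IndexMatch[].
final class MatchResultSketch {
    enum Outcome { UNSUPPORTED, NO_MATCHES, MATCHES }

    private MatchResultSketch() {}

    /** Interprets a queryMatches result according to the documented contract. */
    static Outcome classify(Object[] matches) {
        if (matches == null) {
            // The indexer does not support the supplied query.
            return Outcome.UNSUPPORTED;
        }
        // A supported query with no hits yields an empty set.
        return matches.length == 0 ? Outcome.NO_MATCHES : Outcome.MATCHES;
    }

    public static void main(String[] args) {
        System.out.println(classify(null));
        System.out.println(classify(new Object[0]));
        System.out.println(classify(new Object[] {"match"}));
    }
}
```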

    queryMatches

    public IndexMatch[] queryMatches(org.apache.lucene.search.Query query)
                              throws DBException
    Same as Indexer.queryMatches(IndexQuery), but accepts a compiled Lucene query as the parameter.

    Parameters:
    query - Compiled Lucene query.
    Returns:
    The resulting matches
    Throws:
    DBException - if an IOException prevented the indexer from executing the query.

    Copyright (c) 1999-2007 The Apache Software Foundation. All Rights Reserved.