
Integrating a new index/search back-end into the Broker

To add a new index/search back-end to the Broker, you must define twelve routines for indexing, querying, and administering the Broker/indexer interface. These routines collectively define a standard for object indexing, index consistency maintenance, and querying. Depending on the system you are trying to integrate, some of these functions will likely be null calls, and much of the functionality might reside in a few of the other calls.

The code for this interface is in harvest/src/broker/index.c and harvest/src/broker/index.h. To define a new indexing interface, create new index.[ch] files, add your routines to the Indexer_Routines variable in main.c, and update the Indexer_Init routine in main.c. You can start with the skeleton files in src/broker/Skeleton/. If you create the routines for a system we do not currently support and are willing to provide those routines to us (possibly with copyright restrictions), please email harvest-dvl@cs.colorado.edu.
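
As an illustration of how a table of routines lets some entries be null calls, here is a minimal sketch in C. The struct layout, the names sketch_routines, sketch_table, and sketch_select, and the "0 means success" return convention are all assumptions made for this example; they are not the actual definition of Indexer_Routines in main.c, which you should consult directly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical routine table; NOT the real Indexer_Routines layout. */
struct sketch_routines {
    const char *name;            /* label used to select a back-end */
    int (*initialize)(void);
    int (*index_start)(void);
    int (*index_flush)(void);
};

/* A null call: does nothing and reports success (assumed to be 0). */
static int null_call(void) { return 0; }

/* A back-end that only does real work at flush time. */
static int flush_calls;
static int my_flush(void) { flush_calls++; return 0; }

static const struct sketch_routines sketch_table[] = {
    { "null",  null_call, null_call, null_call },
    { "batch", null_call, null_call, my_flush  },
};

/* Look up a back-end by name; returns NULL if it is not in the table. */
const struct sketch_routines *sketch_select(const char *name)
{
    for (size_t i = 0; i < sizeof sketch_table / sizeof sketch_table[0]; i++)
        if (strcmp(sketch_table[i].name, name) == 0)
            return &sketch_table[i];
    return NULL;
}
```

In this sketch the "batch" back-end leaves initialization and session start as null calls and concentrates its work in the flush routine.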

We discuss each of the routines below. More details about the Broker design and implementation are available in William Camargo's thesis [8]. The following functions define the indexing interface between the Broker and the indexer:

int IND_Index_Start()
Prepare for indexing a stream of objects.  

int IND_Index_Flush()
Finish indexing a stream of objects and flush updates.  

int IND_New_Object(reg_t *entry)
Index a new object from its registry entry.  

int IND_Destroy_Obj(reg_t *entry)
Remove an object from the indexer.  

int IND_Index_Full()
Completely reindex all objects.  

int IND_Index_Incremental()
Do a full incremental update of the index.

  This Broker/indexer interface is designed to support both object-at-a-time (incremental) and batch (non-incremental) indexers. An indexing session begins with a call to IND_Index_Start, where you can call initialization routines for your indexer. For each update, the Collector (a part of the Broker) calls IND_New_Object or IND_Destroy_Obj. A batch indexer should just queue the request, whereas an object-at-a-time indexer can use the call to update the object index immediately. When a stream of updates is finished, the Collector calls IND_Index_Flush to process any queued updates. Note that if the Broker fails before an index is flushed, updates may be lost. To overcome any resulting inconsistency in the database, the Broker forces a garbage collection that removes and reindexes all objects. For more details about the Collector interface, see Section 5.9.
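
The queue-then-flush behavior of a batch indexer can be sketched as follows. This is only an illustration under stated assumptions: the reg_t defined here is a stand-in with a single made-up uid field (the real type comes from the Broker sources), the Sketch_ prefix and the helper names are hypothetical, and the convention that 0 means success and -1 means failure is assumed rather than taken from the Broker code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the Broker's registry entry. The real reg_t
 * is defined in the Broker sources and has different fields. */
typedef struct reg_t { int uid; } reg_t;

enum sketch_op { OP_ADD, OP_DELETE };
struct pending { enum sketch_op op; int uid; };

static struct pending *queue = NULL;     /* requests waiting for a flush */
static size_t queue_len = 0, queue_cap = 0;
static size_t applied_adds = 0, applied_deletes = 0;

/* Append one request to the queue, growing the array as needed. */
static int enqueue(enum sketch_op op, const reg_t *entry)
{
    if (queue_len == queue_cap) {
        size_t cap = queue_cap ? queue_cap * 2 : 16;
        struct pending *p = realloc(queue, cap * sizeof *p);
        if (p == NULL)
            return -1;                   /* assumed failure code */
        queue = p;
        queue_cap = cap;
    }
    queue[queue_len].op = op;
    queue[queue_len].uid = entry->uid;
    queue_len++;
    return 0;
}

/* Begin a session with an empty queue. */
int Sketch_IND_Index_Start(void) { queue_len = 0; return 0; }

/* A batch indexer only records the request; no index work happens here. */
int Sketch_IND_New_Object(reg_t *entry)  { return enqueue(OP_ADD, entry); }
int Sketch_IND_Destroy_Obj(reg_t *entry) { return enqueue(OP_DELETE, entry); }

/* Apply every queued request in one pass, then empty the queue. A real
 * back-end would invoke its indexer here; this sketch only counts. */
int Sketch_IND_Index_Flush(void)
{
    for (size_t i = 0; i < queue_len; i++) {
        if (queue[i].op == OP_ADD)
            applied_adds++;
        else
            applied_deletes++;
    }
    queue_len = 0;
    return 0;
}
```

Because the queue in this sketch lives only in memory, anything queued but not yet flushed disappears if the process dies, which is the lost-update case described above. An object-at-a-time indexer would instead do its index update directly inside the New_Object and Destroy_Obj routines and leave the flush routine as a null call.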

  The Broker supports configuration through the broker.conf configuration file and the administrative interface. For more details about the administrative interface, see Section 5.5. An indexer can be configured through two routines:

int IND_initialize()
Initialize interface to indexer; called when the Broker is initialized.  

int IND_config(char *value, char *tag)
Set the value of indexer specific variables; called when the Broker configuration file is loaded and when a variable is set through the administrative interface.  
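
A configuration routine of this shape might look like the sketch below. The tag names Sketch-Index-Dir and Sketch-Max-Results, the variables they set, the Sketch_ prefix, and the return convention (0 if the tag was accepted, -1 otherwise) are all invented for illustration; the tags a real back-end accepts are defined by that back-end.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Indexer-specific settings (hypothetical) with their defaults. */
static char index_dir[256] = ".";
static long max_results = 100;

/* Called once when the Broker starts: restore the defaults. */
int Sketch_IND_initialize(void)
{
    snprintf(index_dir, sizeof index_dir, "%s", ".");
    max_results = 100;
    return 0;
}

/* Called for each (value, tag) pair from the configuration file or the
 * administrative interface. Note the argument order: value first. */
int Sketch_IND_config(char *value, char *tag)
{
    if (value == NULL || tag == NULL)
        return -1;

    if (strcmp(tag, "Sketch-Index-Dir") == 0) {
        if (strlen(value) >= sizeof index_dir)
            return -1;                   /* path would not fit */
        strcpy(index_dir, value);
        return 0;
    }
    if (strcmp(tag, "Sketch-Max-Results") == 0) {
        char *end;
        long n = strtol(value, &end, 10);
        if (end == value || *end != '\0' || n <= 0)
            return -1;                   /* not a positive integer */
        max_results = n;
        return 0;
    }
    return -1;                           /* tag not known to this indexer */
}
```

Rejecting a malformed value without touching the current setting keeps a bad entry in the configuration file, or a mistyped administrative command, from leaving the indexer in a half-configured state.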

  The most complicated routines for most indexers are the query processing routines. Since most indexers have different query languages, the Broker translates a query into an intermediate form, which the Broker/indexer interface then translates into an indexer-specific query. The query results are analyzed and a list of UIDs is returned. The following routines define the query interface to the indexer:

int IND_do_query(qlist_t *q, int socket, int type, int time)
Process a query. Three types of queries may be processed as specified by the type parameter: USER, BULK, and DELETE. For BULK queries used by the Broker-to-Broker interface, the time parameter restricts the objects that can be returned. The socket is passed to object display routines for sending results.    

void IND_Init_Flags()
Initialize indexer-specific query parser flags.  

void IND_Set_Flags(char *tag, char *value)
Set indexer-specific query parser flag.  
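
To make the translation step concrete, the sketch below turns a small intermediate form into an indexer-specific query string, taking one parser flag into account. Everything here is hypothetical: the sketch_term list is not the Broker's qlist_t, the "case" flag and the "nocase:", " & ", and " | " syntax do not belong to any real indexer, and the Sketch_ and sketch_ names are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical intermediate form: a flat list of terms, each joined to
 * the previous one by AND or OR. The real qlist_t is different. */
enum sketch_op { SK_AND, SK_OR };
struct sketch_term {
    enum sketch_op op;           /* how this term joins the previous one */
    const char *word;
    struct sketch_term *next;
};

static int case_insensitive;     /* hypothetical query parser flag */

/* Reset the parser flags to their defaults. */
void Sketch_IND_Init_Flags(void) { case_insensitive = 0; }

/* Set one flag; unknown tags are silently ignored in this sketch. */
void Sketch_IND_Set_Flags(char *tag, char *value)
{
    if (tag != NULL && value != NULL && strcmp(tag, "case") == 0)
        case_insensitive = (strcmp(value, "insensitive") == 0);
}

/* Write the indexer-specific query into out. Returns 0 on success and
 * -1 if the result does not fit in size bytes. */
int sketch_translate(const struct sketch_term *q, char *out, size_t size)
{
    size_t used = 0;
    int n;

    if (size == 0)
        return -1;
    out[0] = '\0';
    if (case_insensitive) {
        n = snprintf(out, size, "nocase:");
        if (n < 0 || (size_t)n >= size)
            return -1;
        used = (size_t)n;
    }
    for (const struct sketch_term *t = q; t != NULL; t = t->next) {
        /* The first term has no separator in front of it. */
        const char *sep = (t == q) ? "" : (t->op == SK_AND ? " & " : " | ");
        n = snprintf(out + used, size - used, "%s%s", sep, t->word);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

A query routine built along these lines would call the translation first, hand the resulting string to the indexer, and then map whatever the indexer returns back to UIDs before sending results on the socket.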






Darren Hardy
Mon Apr 3 15:22:37 MDT 1995