Lextek International Onix Full Text Indexing and Retrieval Toolkit

About Records

Onix expects you to divide the text you index into "records".  A record is like a page of a book: just as a book's index lists the pages on which a word occurs, the indexes which Onix generates list the records in which a specific word appears.  A record of text can be just about any size.  It may be only a sentence or two in length (such as a verse from the Bible or the Koran), a paragraph (as you might choose for a piece of literature), or a whole file (as you might choose for a document management system or a web crawler).

Choosing how large a record should be is an important decision when building your application, though the circumstances usually make it fairly easy to decide how to divide up your text.  Remember that certain operators, such as the boolean operators "AND", "OR", and "NOT", operate at the record level and return the records which match your boolean expression.  The smaller a record is, the more specific your index will be; the larger a record is, the less specific your index will be.  File size is also a consideration: the smaller your records, the larger your index will be, and the larger your records, the smaller your index will be.
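To make the record-level behavior of "AND" concrete, here is a minimal sketch in C.  It does not use the Onix API; the function name records_and() and the idea of representing each word's matches as an ascending list of record numbers are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Onix): a record-level AND.
   a and b are ascending lists of the record numbers in which two
   words occur.  Matching record numbers are written to out, which
   must have room for at least min(na, nb) entries.
   Returns the number of matching records. */
size_t records_and(const unsigned *a, size_t na,
                   const unsigned *b, size_t nb,
                   unsigned *out)
{
    size_t i = 0, j = 0, n = 0;

    assert(a != NULL && b != NULL && out != NULL);

    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;                /* record a[i] lacks the second word */
        } else if (b[j] < a[i]) {
            j++;                /* record b[j] lacks the first word */
        } else {
            out[n++] = a[i];    /* both words occur in this record */
            i++;
            j++;
        }
    }
    return n;
}
```

If each record is a whole file, two words match whenever they appear anywhere in the same file.  If each record is a paragraph, they match only when they share a paragraph, which is what makes smaller records more specific.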

During the indexing phase, it is common to write a pointer file which stores information on how to find each record indexed.  This pointer file can be as simple as a list of 4-byte integers, each giving the offset in a file at which a record begins.  It can also be as complex as two different files: one holding a variable-length field (such as a file name) and the other specifying how far into the first file each variable-length field begins.  What you store in your pointer file, and how, depends on your application.  Onix has some of this logic built in, and you can optionally take advantage of it.  Three functions let you store this pointer information in your index and retrieve the relevant record information:  ixStoreRecordData(), ixRetrieveRecordData() and ixRetrieveMoreRecordData().  If you want to use this functionality, you will need to create the index with the function ixCreateIndexEx(), which gives you more control over the index creation process than ixCreateIndex() does.
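The simplest pointer file described above, a flat list of 4-byte offsets, might look like the following sketch.  It manages the file itself with standard C I/O rather than the Onix record data functions, whose signatures are not shown here.  The function names, the 0-based record numbering, and the use of native byte order are all assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: append the byte offset at which the next
   record begins.  Offsets are written in native byte order, so the
   file is only portable between machines of the same endianness.
   Returns 0 on success, -1 on a write error. */
int pointer_file_append(FILE *pf, uint32_t offset)
{
    assert(pf != NULL);
    return fwrite(&offset, sizeof offset, 1, pf) == 1 ? 0 : -1;
}

/* Hypothetical helper: look up where record number recno begins.
   Record numbers are assumed to be 0-based here.  Because every
   entry is 4 bytes, entry recno sits at byte recno * 4.
   Returns 0 on success, -1 if the record does not exist or the
   read fails. */
int pointer_file_lookup(FILE *pf, uint32_t recno, uint32_t *offset)
{
    assert(pf != NULL && offset != NULL);
    if (fseek(pf, (long)recno * (long)sizeof *offset, SEEK_SET) != 0)
        return -1;
    return fread(offset, sizeof *offset, 1, pf) == 1 ? 0 : -1;
}
```

Fixed-width entries keep the lookup to one seek and one read.  Variable-length data such as file names does not fit this layout directly, which is why the two-file arrangement exists.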

So you can visualize how it all fits together, the diagram traces the following flow.  The user submits a query to the index.  The index returns a list of record numbers which match the query.  The pointer table is then consulted to find out where the text of each record is located.  Finally, the text itself is retrieved and (usually) displayed to the end user.
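The last two steps of that flow, consulting the pointer table and retrieving the text, can be sketched as follows.  This is an illustration under stated assumptions, not Onix code: struct record_ptr, fetch_record(), and the in-memory text buffer are hypothetical, and the record numbers would in practice come from the index rather than being supplied by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer table entry: where one record's text lives. */
struct record_ptr {
    size_t offset;  /* byte offset of the record within the text */
    size_t length;  /* length of the record in bytes */
};

/* Hypothetical helper: copy the text of record recno into buf as a
   NUL-terminated string.  Record numbers are assumed to be 0-based.
   Returns 0 on success, -1 if recno is out of range or buf is too
   small to hold the record plus the terminator. */
int fetch_record(const char *text, const struct record_ptr *table,
                 size_t nrecords, size_t recno,
                 char *buf, size_t bufsize)
{
    assert(text != NULL && table != NULL && buf != NULL);

    if (recno >= nrecords || table[recno].length >= bufsize)
        return -1;

    memcpy(buf, text + table[recno].offset, table[recno].length);
    buf[table[recno].length] = '\0';
    return 0;
}
```

An application would call fetch_record() once for each record number the index returns and then display the contents of buf to the user.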