Text Retrieval

I presented a paper at the Summer 1994 Usenix conference in Boston. It gives general background on the field and then discusses my own implementation of a text retrieval system:

A Text Retrieval System for the Unix Operating System, Liam Quin, in Usenix Technical Proceedings, Boston, Summer 1994.

The software discussed has a home page.


View the whole paper
A chapter at a time...

Abstract

This paper describes lq-text, an inverted index text retrieval package written by the author. Inverted index text retrieval provides a fast and effective way of searching large amounts of text. It works by building an index of all the natural-language words that occur in the text. The text itself remains unaltered in place, or, if desired, can be compressed or archived; the index allows rapid searching even if the data files have been removed altogether.
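
As a rough illustration of the general inverted index idea described above, the following minimal Python sketch maps each word to the places where it occurs and then answers a phrase query from the index alone. It is not lq-text's code, data structures, or on-disk format; the function names `build_index` and `find_phrase`, the sample file names, and the simple letters-only definition of a word are all assumptions made for this example.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercased word to a list of (document name, word position) pairs."""
    index = defaultdict(list)
    for name, text in documents.items():
        # Assumption: a "word" is a run of ASCII letters; real systems use richer rules.
        for position, match in enumerate(re.finditer(r"[A-Za-z]+", text)):
            index[match.group().lower()].append((name, position))
    return index


def find_phrase(index, phrase):
    """Return sorted (document name, start position) pairs where the phrase's words occur consecutively."""
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", phrase)]
    if not words:
        return []
    # Start from every occurrence of the first word, then keep only those
    # followed by each later word at the expected offset.
    candidates = set(index.get(words[0], []))
    for offset, word in enumerate(words[1:], start=1):
        postings = set(index.get(word, []))
        candidates = {(name, pos) for (name, pos) in candidates
                      if (name, pos + offset) in postings}
    return sorted(candidates)


# Hypothetical sample data, for illustration only.
documents = {
    "a.txt": "Inverted index text retrieval is fast.",
    "b.txt": "Plain text can be searched with grep.",
}
index = build_index(documents)
print(find_phrase(index, "text retrieval"))
```

Once the index has been built, `find_phrase` never looks at the original text, which is why a search of this kind can still succeed after the data files have been compressed, archived, or removed.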

The design and implementation of lq-text are discussed, and performance measurements are given for comparison with other text searching programs such as grep and agrep. The functionality provided is compared briefly with other packages such as glimpse and zbrowser.

The lq-text package is available in source form, has been successfully integrated into a number of other systems and products, and is in use at over 100 sites.


Table of Contents

Abstract
Introduction
Design Goals
Technology Overview
The lq-text Design
The lq-text Implementation
Programs and Algorithms
Performance
Ongoing and Future Work
Acknowledgements
Conclusions
References
Appendix I - Source for lqphrase
Appendix II - Sample lq session