A Text Retrieval Package for the Unix Operating System

Liam R. E. Quin

SoftQuad Inc. (lee at sq.com)

Note: the author of this paper has moved to liamquin at interlog dot com


Table of Contents

Abstract
Introduction
Design Goals
Technology Overview
The lq-text Design
The lq-text Implementation
Programs and Algorithms
Performance
Ongoing and Future Work
Acknowledgements
Conclusions
References
Appendix I - Source for lqphrase
Appendix II - Sample lq session

Abstract

  This paper describes lq-text, an inverted-index text retrieval package written by the author. Inverted-index text retrieval provides a fast and effective way of searching large amounts of text. It works by building an index of all the natural-language words that occur in the text. The text itself remains unaltered in place or, if desired, can be compressed or archived; the index allows rapid searching even if the data files have been removed altogether.
  The design and implementation of lq-text are discussed, and performance measurements are given for comparison with other text searching programs such as grep and agrep. Its functionality is compared briefly with that of other packages such as glimpse and zbrowser.
  The lq-text package is available in source form, has been successfully integrated into a number of other systems and products, and is in use at over 100 sites.  
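
  As a rough illustration of the inverted-index idea summarised above, the following minimal sketch maps each word to the files and word positions where it occurs, and then answers a query from the index alone. It is not lq-text's actual data structure or file format; the names build_index and search, the sample file names, and the simple letters-only definition of a word are all assumptions made for this example.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each lowercased word to a list of (doc_name, word_position) pairs."""
    index = defaultdict(list)
    for name, text in documents.items():
        # Assumption: a "word" is simply a run of ASCII letters.
        for position, match in enumerate(re.finditer(r"[A-Za-z]+", text)):
            index[match.group().lower()].append((name, position))
    return dict(index)

def search(index, word):
    """Return the names of documents containing word, using only the index."""
    return sorted({name for name, _ in index.get(word.lower(), [])})

# Hypothetical sample data, for illustration only.
docs = {
    "a.txt": "Inverted index text retrieval is fast",
    "b.txt": "grep scans the text each time",
}
index = build_index(docs)
del docs  # the original text is no longer needed to answer word queries
print(search(index, "text"))  # prints ['a.txt', 'b.txt']
```

  Because every query is answered by a lookup in the index rather than a scan of the files, search time depends on the number of matches and not on the total amount of text, which is the property the comparison with grep relies on.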

