<!DOCTYPE File PUBLIC "-//Liam Quin//DTD C API Documentation v1.1//EN" "doc.dtd"><File>
<Entry dir="liblqtext">
<Function File="../../src/liblqtext/readword.c">
<Name>LQT_ReadWordFromStringPointer</Name>
<Class>Database/Retrieval, Database/Documents
<Purpose>
<P>Returns the next natural-language word from the given
NUL-terminated string.</P>
<P>The definition of a word for the purpose of this routine is
determined partly by the definitions for LQT_StartsWord,
LQT_OnlyWithinWord and LQT_EndsWord in the header file
<h>wordrules.h</h>, and partly by the configuration
file in the database directory, where indexnumbers, minwordlength
and maxwordlength may be set.</P>
<P>If all of the arguments are null, the routine is reset, ready
to start a new string; the value returned in that case is not
useful.</P>
<P>The Flags argument must be either zero or the bitwise OR of
one or both of LQT_READWORD_IGNORE_COMMON and
LQT_READWORD_WILDCARDS.</P>
<P>Characters are read from the string, incrementing *Stringpp as each
byte is processed, until a recognised word is found.
If the LQT_READWORD_IGNORE_COMMON flag is set in Flags,
LQT_ReadWordFromStringPointer continues until either a word is
found that has not been registered as being too common to index,
or the end of the string is reached.</P>
<P>If Startp is not a NULL pointer, *Startp is set to point to
the first character in the word that has been found in the given
Stringpp (not to the malloc'd copy in the result).</P>
<P>If Endp is a NULL pointer, the string is considered to be
terminated by the first zero byte reached; otherwise, Endp must
point to the first character not in the string; normally, Endp
would be set to point to the terminating NUL byte.</P>
<P>If the LQT_READWORD_WILDCARDS flag is set, the `Wild Card'
characters * and ? are allowed within words.  Such characters
do not count as punctuation for the returned WordInfo flags.</P>
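<P>The pointer contract described above for Stringpp, Startp and Endp
can be sketched in C. This is only an illustration under stated
assumptions: demo_next_word and demo_count_words are hypothetical
stand-ins, not part of lq-text, and they treat any run of alphanumeric
bytes as a word instead of applying the rules in <h>wordrules.h</h>
or the database configuration. They also return a length, where the
real routine returns a WordInfo.</P>
<![CDATA[
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in, NOT the lq-text routine: a "word" here is just a
 * run of alphanumeric bytes.  It only mirrors the pointer contract that
 * the text describes for Stringpp, Startp and Endp.
 * Returns the length of the word found, or 0 if there are no more words. */
static size_t
demo_next_word(char **stringpp, char **startp, const char *endp)
{
    char *p;
    char *start;

    assert(stringpp != NULL && *stringpp != NULL);
    p = *stringpp;

    /* Skip bytes that cannot start a word, stopping at Endp if one was
     * given, or at the first NUL byte if Endp is NULL. */
    while ((endp ? p < endp : *p != '\0') && !isalnum((unsigned char) *p)) {
        p++;
    }
    start = p;

    /* Consume the word itself under the same end-of-string rule. */
    while ((endp ? p < endp : *p != '\0') && isalnum((unsigned char) *p)) {
        p++;
    }

    *stringpp = p;          /* the caller's cursor has advanced */
    if (p == start) {
        return 0;           /* end of string reached, no word found */
    }
    if (startp != NULL) {
        *startp = start;    /* points into the caller's buffer, not a copy */
    }
    return (size_t) (p - start);
}

/* Count the words in the first `limit` bytes of text, or up to the first
 * NUL byte if limit is 0 (which corresponds to passing a NULL Endp). */
static int
demo_count_words(char *text, size_t limit)
{
    char *cursor = text;
    char *start = NULL;
    const char *endp = limit ? text + limit : NULL;
    int n = 0;

    while (demo_next_word(&cursor, &start, endp) > 0) {
        n++;
    }
    return n;
}
```
]]>
<P>In this sketch, passing a nonzero limit to demo_count_words
corresponds to supplying Endp, and passing zero corresponds to a NULL
Endp, so that scanning stops at the first NUL byte.</P>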
<Returns>
the next WordInfo on success, or zero if there are no more words
to read in the string.
<Notes>
<P>All client programs and library routines which parse words
use this routine or the companion LQT_ReadWordFromFileInfo routine.
This is very important, because lq-text relies on the word counts
within each block of text being the same on retrieval as they
were on indexing; if different routines parsed the data each
time, discrepancies could arise.</P>
<Bugs>
The interface to this routine is somewhat ugly, and may be changed
in the next release with the addition of a Reset routine and a
block offset counter.
</Function>
<Decl>
API t_WordInfo *
LQT_ReadWordFromStringPointer(db, Stringpp, Startp, Endp, Flags)
    t_LQTEXT_Database *db;
    char **Stringpp;
    char **Startp;
    CONST char *Endp;
    unsigned int Flags;
</Decl>
</Entry>
