htfuzzy

ht://Dig Copyright © 1995-2002 The ht://Dig Group
Please see the file COPYING for license information.


Synopsis

htfuzzy [-c configfile] [-v] algorithm ...

Description

Htfuzzy creates indexes for different "fuzzy" search algorithms. These indexes can then be used by the htsearch program.

Options

-c configfile
Use the specified configuration file instead of the default.
-v
Verbose mode. Used once, it provides progress feedback; used more than once, it will overflow even the biggest buffers. :-)

Algorithms

Indexes for the following search algorithms can currently be created:
soundex
Creates a slightly modified soundex key database. It differs from the standard soundex algorithm as follows:
  • Keys are 6 digits.
  • The first letter is also encoded.
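The two differences above can be illustrated with a minimal sketch. It assumes the classic soundex letter groups, zero padding, and the usual treatment of h and w; none of these details are taken from the htfuzzy source, and the name soundex6 is hypothetical.

```python
# Illustrative sketch only, not the htfuzzy implementation.
# Assumption: classic soundex letter groups.
GROUPS = {"bfpv": "1", "cgjkqsxz": "2", "dt": "3", "l": "4", "mn": "5", "r": "6"}
CODES = {ch: digit for letters, digit in GROUPS.items() for ch in letters}


def soundex6(word, length=6):
    """Return an all-digit key of `length` digits; the first letter is encoded too."""
    key = []
    last = ""
    for ch in word.lower():
        code = CODES.get(ch, "")
        # Emit a digit only when it differs from the previous letter's digit.
        if code and code != last:
            key.append(code)
        # Assumption: h and w do not separate letters with the same digit.
        if ch not in "hw":
            last = code
    # Truncate or pad with zeros (assumed padding) to a fixed length.
    return "".join(key)[:length].ljust(length, "0")
```

With these assumptions, words that sound alike, such as "Robert" and "Rupert", map to the same six-digit key, where standard soundex would produce a letter followed by three digits.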
metaphone
Creates a metaphone key database. This algorithm is more specific to English than soundex, but it produces fewer "weird" matches.
accents
Creates an accents key database. This algorithm will map all accented letters to their unaccented counterparts, so that a search for the unaccented word will yield all variations of this word with accents.
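The mapping idea can be sketched as follows. This is an approximation using Unicode decomposition; htfuzzy's own accent table may cover a different set of characters, and the function names strip_accents and build_accent_index are hypothetical.

```python
# Illustrative sketch only, not the htfuzzy implementation.
import unicodedata


def strip_accents(word):
    """Map accented letters to their unaccented counterparts."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_accent_index(words):
    """Group every indexed word under its unaccented key."""
    index = {}
    for word in words:
        index.setdefault(strip_accents(word), set()).add(word)
    return index
```

A lookup of the unaccented word in such an index returns every accented variation that was indexed, which is the behavior described above.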
endings
Creates two databases that can be used to match common word endings. Creating these databases requires a list of affix rules and a dictionary that uses those affix rules. Both files use the formats of the ispell program. The distribution includes affix rules for English and a fairly small English dictionary. Other languages can be supported by obtaining the appropriate affix rules and dictionaries. These are available for many languages; check the ispell distribution for more details.
synonyms
Creates a database of synonyms that htsearch can then use, built from a text database of synonyms. Each line of the text database is a list of words; the first word on the line is given all the other words on that line as synonyms.
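The line format can be illustrated with a small parsing sketch. It assumes the words are separated by whitespace and that lines with fewer than two words are skipped; both are assumptions, and parse_synonyms is a hypothetical name rather than part of htfuzzy.

```python
# Illustrative sketch only, not the htfuzzy implementation.
def parse_synonyms(lines):
    """Build a table mapping the first word of each line to the rest."""
    table = {}
    for line in lines:
        words = line.split()  # assumption: whitespace-separated words
        if len(words) < 2:
            continue  # assumption: a line needs a word and at least one synonym
        head, *rest = words
        table.setdefault(head, []).extend(rest)
    return table
```

For example, the line "car auto automobile" would give "car" the synonyms "auto" and "automobile". The relation runs one way only: "auto" gains no synonyms unless it starts a line of its own.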

Files

CONFIG_DIR/htdig.conf
The default configuration file.

See Also

htdig, htmerge, htsearch, Configuration file format, and ispell.

Last modified: $Date: 2002/01/27 05:33:20 $