Objects already treated
- under work... rather: tests/fstr
- BDb (new)
- Configuration
- Database / DB2_db: replace with BDb (htlib -- not yet deleted)
- Dictionary, htlib (deleted)
- DocMatch
- HtURLRewrite
- IntObject (deleted)
- List, htlib (deleted)
- Parser, htsearch
- qstrings (was: QuotedStringList)
- ResultList
- htsearch
- Stack
- strlist (was: StringList)
- WeightWord
- WordList
- WordRecord
- WordReference
- HtWordType
Functions, examples

Objects already treated

BDb (htlib)

Replaces DB2

Configuration, htlib

Compiled, but not completed yet.
Uses UnicodeString::getBuffer (maybe questionable).

Compiled.
Done with QuotedStringList.
Next: handle the List argument of checkSyntax, invoked from htsearch.cc. Maybe a list<WeightWord>? No: these are UnicodeStrings, so either a list or a vector of them. Maybe done.
Made result a vector<ResultList> (was a List*). A vector rather than a list, because of operator[].
Now, WordList: done.
So continuing in parser.
Made partial changes for Berkeley DB, also in the Makefile, but not in the template.
perform_push: I don't understand what should happen there. The role of wildcard is not clear.
The key is not checked!?
I'm just slurping the db...
Not sure where I push to...
In the old code, was p the key or the data? And what is the data? An index, convertible to an int? It depends on the db...
I decided to assume that the key is a (now Unicode) string and that the data in doc_index is (convertible to) an int (in dbf, it is a WordRecord).
temp is compared to the key.
Not sure what to store as the key from the Unicode string. What about the value returned by getBuffer?
I have already done that in WordList and Configuration...
Note that the key is truncated to maximum_word_length.

qstrings (was: QuotedStringList), htlib

Done, for Parser.
This is a vector rather than a list.

ResultList, htsearch

Inherited from Dictionary (deleted)
map<char const*, DocMatch>
Only const members... May be a problem?
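
A separate point worth checking with a map keyed on char const*: with the default comparator, keys are ordered by pointer value, so two buffers holding the same text count as different keys. The sketch below is only an illustration of one way around that, a content-based comparator. DocMatch here is a hypothetical stand-in with a single field, and CStrLess, ResultsByContent and found_through_copy are names invented for the example, not taken from the code base.

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <string>

// Hypothetical stand-in for the real DocMatch class.
struct DocMatch {
    double score;
};

// Orders C strings by content rather than by address.
struct CStrLess {
    bool operator()(char const* a, char const* b) const {
        return std::strcmp(a, b) < 0;
    }
};

// Keys are compared by content; the caller must keep the strings alive
// for as long as they are stored in the map.
using ResultsByContent = std::map<char const*, DocMatch, CStrLess>;

// Returns true if `key` is found when looked up through a separate buffer
// holding the same characters (same text, different address).
bool found_through_copy(ResultsByContent const& results, char const* key) {
    std::string copy(key);
    return results.find(copy.c_str()) != results.end();
}
```

With a plain map<char const*, DocMatch>, the same lookup through a copy would normally miss, since the copy lives at another address.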

Stack, htlib

Deleted. In parser, replaced by a vector<ResultList>, since stack::pop doesn't return anything.
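
A minimal sketch of using a vector as a stack, assuming the parser wants to get the popped element back. ResultList is a hypothetical stand-in here, and pop_result is a helper name made up for the example.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real ResultList class.
struct ResultList {
    int id;
};

// Removes and returns the top element, which std::stack::pop() cannot do
// in a single call. Precondition: the vector is not empty.
ResultList pop_result(std::vector<ResultList>& stack) {
    assert(!stack.empty());
    ResultList top = std::move(stack.back());
    stack.pop_back();
    return top;
}
```

Pushing is push_back, the top is back(), and operator[] remains available for looking below the top.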

strlist, htlib

vector<UnicodeString> with parse/split constructor.
Skips white space and punctuation.
Use through iterators.
Uses vector in order to have operator[]
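
A rough sketch of the idea, under loud assumptions: std::string stands in for UnicodeString, classification is ASCII-only through <cctype>, and the class name strlist_sketch is invented. Inheriting from std::vector is a shortcut for the example; the real class may be built differently.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in: the real strlist holds UnicodeString elements.
class strlist_sketch : public std::vector<std::string> {
public:
    // Splits `text` into tokens, skipping white space and punctuation.
    explicit strlist_sketch(std::string const& text) {
        std::string token;
        for (unsigned char c : text) {
            if (std::isspace(c) || std::ispunct(c)) {
                // A separator ends the current token, if there is one.
                if (!token.empty()) {
                    push_back(token);
                    token.clear();
                }
            } else {
                token += static_cast<char>(c);
            }
        }
        if (!token.empty()) push_back(token);
    }
};
```

For instance, strlist_sketch("foo, bar;  baz!") holds the three elements foo, bar and baz, reachable through iterators or operator[].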

WeightWord, htsearch

Done, for use in htsearch

WordList, htcommon

Done, for htsearch
Built around a map<UnicodeString, WordReference>
For valid_word:


alpha> ./alpha foo
text: foo
alpha1: 1, alpha2: 1
alpha> ./alpha foo2
text: foo2
alpha1: 0, alpha2: 0
alpha> ./alpha foo!
text: foo!
alpha1: 0, alpha2: 0
alpha> ./alpha таня
text: таня
alpha1: 1, alpha2: 1
alpha> ./alpha foo_bar
text: foo_bar
alpha1: 0, alpha2: 0
alpha> ./alpha foo-bar
text: foo-bar
alpha1: 0, alpha2: 0
alpha> ./alpha Épaminondas
text: Épaminondas
alpha1: 1, alpha2: 1

Done. (Small doubt about u_fopen_u, which may not support the append mode of fopen; it is a wrapper, though, so it should work.)
Otherwise, open with "r+" only, then fseek(fl, 0, SEEK_END);
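
The fallback (open for update, then seek to the end) can be sketched with plain C stdio and the standard "r+" update mode. This uses std::fopen rather than u_fopen_u, so whether the ICU wrapper behaves the same way remains the open question; append_via_seek is a name made up for the example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Appends `text` to an existing file without using append mode:
// open for update ("r+"), then seek to the end before writing.
// Returns false if the file cannot be opened or written.
bool append_via_seek(char const* path, std::string const& text) {
    std::FILE* fl = std::fopen(path, "r+");  // fails if the file does not exist
    if (!fl) return false;
    bool ok = std::fseek(fl, 0, SEEK_END) == 0 &&
              std::fputs(text.c_str(), fl) >= 0;
    return std::fclose(fl) == 0 && ok;
}
```

Unlike "a", "r+" does not create the file, so a first write would need a separate creation step.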

Functions, examples

mystrncasecmp / mystrcasecmp


	if (mystrncasecmp(word, "exact:", 6) == 0)
	{
	    word += 6;
	    isExact = 1;
	}

becomes:


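  // indexOf returns -1 when not found and 0 for a match at the start.
  // Unlike mystrncasecmp, this is case-sensitive and matches anywhere.
  while ((pos = str.indexOf(UnicodeString("exact:"))) != -1) {
    str.remove(pos, 6);
    isExact = true;
  }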

operator<<


error << ' ' << boolean_keywords[1] << " '"
      << boolean_keywords[1] << "'";

Added in htcommon/uhelper.h

Using getBuffer() to store UnicodeString to BDb

In htsearch/parser.cc, not too sure this is a good idea...


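  temp.toLower();
  // Truncate in UChar units; the UTF-16 buffer is not a C string.
  if (temp.length() > maximum_word_length) temp.truncate(maximum_word_length);
  // Assuming key is a Dbt: give the byte count explicitly.
  key.set_data((void*)temp.getBuffer());
  key.set_size(temp.length() * sizeof(UChar));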
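
One reason for the doubt: a UnicodeString holds UTF-16 code units, so its buffer contains zero bytes for plain ASCII letters, and a key built from it needs an explicit byte count rather than a NUL terminator. A minimal sketch of that idea, with std::u16string standing in for UnicodeString and a hypothetical KeyBytes struct standing in for the database key (neither is the real API):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a database key: a pointer plus a byte count.
struct KeyBytes {
    void const* data;
    std::size_t size;
};

// Truncates `word` to at most `maximum_word_length` UTF-16 code units and
// describes its buffer as raw bytes. The result points into `word`, so
// `word` must outlive the key.
KeyBytes make_key(std::u16string& word, std::size_t maximum_word_length) {
    if (word.size() > maximum_word_length) word.resize(maximum_word_length);
    return KeyBytes{word.data(), word.size() * sizeof(char16_t)};
}
```

Truncating by code units can split a surrogate pair, and keys stored this way depend on the byte order of the machine, so converting to UTF-8 before storing may be the more portable choice.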

Marc Girod
