Here are some specifics:
I have a 144-item Inbox totaling 67Mbytes that is synced to DTTG. I ran the following experiment: I opened the DTTG backup, took the last word from each ZFULLTEXT field, searched for that word in DTTG, and tracked how many correct hits I got. The result was 95 not found out of 144, for about a 66% miss rate. I believe the database has all the basic DT types: TXT, RTF, RTFD, HTML, Webarchive, PDF, etc. Almost all are under 20Kbytes, as that is where my maximum index setting is pegged.
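For anyone who wants to repeat the extraction step, here is a minimal sketch in Python. It assumes the DTTG backup opens as an ordinary SQLite file. The table name is a placeholder you would have to look up in your own backup, and `last_words` is just an illustrative helper; only the ZNAME and ZFULLTEXT column names come from my data.

```python
import sqlite3

def last_words(db_path, table):
    """Return (ZNAME, last word of ZFULLTEXT) for each row that has text.

    `table` is an assumption: pass whatever table in your backup
    actually holds the ZNAME and ZFULLTEXT columns.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            f'SELECT ZNAME, ZFULLTEXT FROM "{table}"'
        ).fetchall()
    finally:
        conn.close()

    result = []
    for name, text in rows:
        # Split on whitespace; skip rows with NULL or blank text.
        words = (text or "").split()
        if words:
            result.append((name, words[-1]))
    return result
```

Each word returned this way is then typed into DTTG's search by hand; the sketch only automates pulling the words out, not the searching.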
You can see my results in greater detail here:
http://vze26m98.net/devon/dttg-data.numbers.pdf
The columns are:
ZNAME: the name of the file
ZFILESIZE: the size of the document in bytes on disk
ZTYPE: the type of file, PDF, HTML, TXT, etc.
len(ZFULLTEXT): the length in bytes of the ZFULLTEXT field of the DTTG SQL database. Presumably this is the data that DTTG builds its index from.
len(plain text): as the documents are also stored on the Mac in DTPO, this is the length of the “plain text” field for each record. Again, presumably this is what DT builds its index from.
% difference: the ratio of the two text lengths above, in bytes. In other words, how much of the text DT has to index is also stored in DTTG. For most documents DTTG holds 95% or more of what DTPO has, which I presume is no issue, although I haven’t looked into it. A few hold only 88% of what DT has stored, and above a 20Kbyte file size the stored text is truncated radically to only 1024 bytes. So for files over your index limit, DTTG stores only 1Kbyte of text to index.
Last Word in ZFULLTEXT: This is the last word that DTTG thinks is in each document. Presumably DTTG should find this word on a search, because it’s in the database, but it can’t find the last word in 95 out of 144 tries for a 66% failure rate.
NF: a dot in this column marks a search for the last word (above) that failed, i.e. Not Found.
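The two derived figures are simple arithmetic. As a rough sketch in Python (the function names are mine, made up for illustration, and are not anything from DT or DTTG):

```python
def text_ratio(len_zfulltext, len_plain_text):
    """Percent of DTPO's plain text that DTTG has stored in ZFULLTEXT."""
    if len_plain_text == 0:
        return None  # nothing to compare against
    return 100.0 * len_zfulltext / len_plain_text

def miss_rate(not_found, total):
    """Percent of last-word searches that returned no hit."""
    return 100.0 * not_found / total

# 95 failed searches out of 144 documents
print(round(miss_rate(95, 144)))  # prints 66
```

For a document with 20480 bytes of plain text in DTPO but only the truncated 1024 bytes in DTTG, `text_ratio(1024, 20480)` comes out at 5.0.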
This is a continuation of an investigation I started over the weekend, for which I have posted a bug report. Although I didn’t get any response to my request to hear from others who are having problems, it’s clear that I’m not the only one with search difficulties.
Eric sounds as if this is the first he’s heard of this issue, but hopefully there will be some action soon. Syncing may be a more spectacular bit of trouble, but it’s a lot more forgivable than a broken search mechanism: the breakage is easily verifiable (as I have done) and, more importantly, search is the value proposition for the application.
HTH, Charles