Search Function - Not finding anything

ColinMcNairney · October 28, 2010, 1:23pm

Can any of you guys get the search function working?

I can only seem to be able to get hits returned when I search for text that I know is in a plain text document. If I search for text in any other form of document I get 0 hits.

I’ve increased the index size per item setting (in the DTTG settings) to 20 KB, but this hasn’t helped at all.

eboehnisch · October 29, 2010, 1:51pm

Are the other documents full-text searchable on the Mac? What kinds of documents are we talking about?

Eck · October 29, 2010, 5:14pm

Having the same problem…

Setup:
Mac DT Pro Office version 2.0.5 syncing with iPad DT To Go.
The document is a PDF (780 KB) of type “PDF + Text” in DT Pro Office, and the term “open” is found within the document (not part of the title) with the Search function of DT Pro Office.

DT To Go does not show the document in the search results when “open” is the search term and the “Name & Contents” tab is selected.
Interestingly, a different term that is not part of the title but appears on the first page of the document is found…

If needed, here is the link to the document (it’s an official UK Government NHS paper):

http://www.dh.gov.uk/prod_consum_dh/groups/dh_digitalassets/@dh/@en/documents/digitalasset/dh_120598.pdf

Edit: Raised this one in the bug tracker system, apologies for the ‘cross-posting’!

cb3048 · November 3, 2010, 12:44pm

Same problem here: PDF+Text. Search finds the information on the Mac but finds nothing on the iPad.

cturner · November 3, 2010, 11:59pm

Here are some specifics:

I have a 144-item Inbox totaling 67 MB that is synced to DTTG. I ran the following experiment: I opened the DTTG backup, took the last word from each ZFULLTEXT field, searched for it in DTTG, and tracked how many correct hits I got. The result was 95 not found out of 144, a miss rate of about 66%. I believe the database has all the basic DT types: TXT, RTF, RTFD, HTML, Webarchive, PDF, etc. Almost all are under 20 KB, as that is where my maximum index setting is pegged.
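For anyone who wants to repeat this kind of check, here is a minimal sketch of the last-word extraction and the miss-rate arithmetic. It assumes the backup is an SQLite file; the table name `ZITEM` is hypothetical (only the ZNAME and ZFULLTEXT column names appear in this thread), and the rows below are made up for the demo.

```python
import sqlite3


def last_words(conn, table="ZITEM"):
    """Map each ZNAME to the last whitespace-separated word of its ZFULLTEXT."""
    # "ZITEM" is a hypothetical table name; only the ZNAME and ZFULLTEXT
    # column names come from the post.
    result = {}
    for name, fulltext in conn.execute(f"SELECT ZNAME, ZFULLTEXT FROM {table}"):
        words = (fulltext or "").split()
        if words:  # skip rows with no stored text
            result[name] = words[-1]
    return result


def miss_rate(not_found, total):
    """Fraction of probe searches that returned no hit."""
    return not_found / total


# Tiny in-memory stand-in for the backup database (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ZITEM (ZNAME TEXT, ZFULLTEXT TEXT)")
conn.executemany(
    "INSERT INTO ZITEM VALUES (?, ?)",
    [("a.txt", "alpha beta gamma"), ("b.pdf", "one two"), ("empty.rtf", None)],
)

probes = last_words(conn)          # words to type into the DTTG search field
print(probes)
print(f"{miss_rate(95, 144):.0%}")  # the 95-of-144 figure from the post
```

The searches themselves still have to be run by hand in DTTG; the script only produces the probe words and the final percentage.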

You can see my results in greater detail here:

http://vze26m98.net/devon/dttg-data.numbers.pdf

The columns are:

ZNAME: the name of the file

ZFILESIZE: the size of the document in bytes on disk

ZTYPE: the type of file, PDF, HTML, TXT, etc.

len(ZFULLTEXT): the length in bytes of the ZFULLTEXT field of the DTTG SQL database. Presumably this is the data that DTTG builds its index from.

len(plain text): as the documents are also stored on the Mac in DTPO, this is the length of the “plain text” field for each record. Again presumably what DT builds its index from.

% difference: the ratio of the two text fields above, in bytes. In other words, how much of the text DT has to index is also stored in DTTG. DTTG mostly has at least 95% of what DTPO has, which I presume is no issue, although I haven’t looked into it. A few have only 88% of what DT has stored, and above a 20 KB file size, what is stored truncates radically to only 1024 bytes. So for files over your index limit, DTTG stores only 1 KB of text to index.

Last Word in ZFULLTEXT: This is the last word that DTTG thinks is in each document. Presumably DTTG should find this word on a search, because it’s in the database, but it can’t find the last word in 95 out of 144 tries for a 66% failure rate.

NF: a dot in this column marks a search (as described above) that failed.
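The “% difference” column and the truncation pattern can be sketched as follows. This is an illustration under assumptions, not how DTTG works internally: the function names are hypothetical, the 20 KB limit is taken as 20 × 1024 bytes, and the sample rows are invented.

```python
def text_ratio(len_fulltext, len_plain):
    """Percentage of DTPO's plain text that DTTG has stored (None if DTPO has none)."""
    if len_plain == 0:
        return None
    return 100.0 * len_fulltext / len_plain


def looks_truncated(filesize, len_fulltext, index_limit=20 * 1024, stub=1024):
    """True when a file over the index limit has at most a 1024-byte stub stored."""
    # index_limit assumes "20 KB" means 20 * 1024 bytes; adjust if it means 20,000.
    return filesize > index_limit and len_fulltext <= stub


# Invented rows: (name, ZFILESIZE, len(ZFULLTEXT), len(plain text))
rows = [
    ("small.txt", 4_000, 3_800, 4_000),
    ("big.pdf", 780_000, 1_024, 52_000),
]
for name, size, n_dttg, n_dtpo in rows:
    print(name, text_ratio(n_dttg, n_dtpo), looks_truncated(size, n_dttg))
```

Rows flagged by `looks_truncated` are the ones where a failed search is expected, since most of the text never reaches the device; failures on unflagged rows are the more interesting ones.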

This is a continuation of an investigation I started over the weekend, and I have posted a bug report. Although I didn’t get any response to my request to hear from others who are having problems, it’s clear that I’m not the only one with search difficulties.

Although it sounds like this is the first Eric has heard of this issue, hopefully there will be some action soon. Syncing trouble is more spectacular, but it’s a lot more forgivable than a broken search mechanism: the failure is easily verifiable (as I have done) and, more importantly, search is the value proposition for the application.

HTH, Charles

jlehet · November 5, 2010, 4:37pm

I’m finding, or rather not finding, this as well. Sometimes search works, and sometimes DTTG won’t find documents it certainly should.

I’m thrilled by how fast it is though. That is a good start, the speed, anyway.

One UI enhancement I’d like to see would be what’s becoming a common “discoverable” interface element in many iOS apps, where in any particular view of a list of files, scrolling the list up, above the fold, reveals the search field. You could see that in Notebooks, for instance. This would save clicks back to where it is now in DTTG, and also prevent the context-shift.

callesjonell · November 22, 2010, 8:23pm

I have the same issue. I have done a number of searches in DTP Office Pro and the same in DTTG, and it seems like the fuzzy logic part is turned off. Unless it is an exact match, DTTG doesn’t see it, but DTP Office Pro turns up nice items where only a few of the words are picked up. I use a lot of .webarchive files.

I ran a test. DTP Office Pro can search into the entire .webarchive file, while DTTG can only read the regular layout HTML part.

As an example, make a webarchive of any TED talk on TED.com and you can see that DTTG only reads the title and the text on the left side of the page, but ignores the text and the transcript in the right grey box (which DTP Office Pro reads correctly).
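One way to see which HTML parts a .webarchive actually carries is to walk it as a property list. The sketch below assumes the Safari layout, where the file is a binary plist with a `WebMainResource` entry and optional nested `WebSubframeArchives`; whether the TED transcript lives in such a subframe is not confirmed here, and the sample archive is built in memory purely for illustration.

```python
import plistlib


def archive_html_parts(archive):
    """Yield (url, html_bytes) for the main resource and every nested subframe."""
    main = archive.get("WebMainResource", {})
    if main.get("WebResourceMIMEType", "").startswith("text/html"):
        yield main.get("WebResourceURL", ""), main.get("WebResourceData", b"")
    # Subframes are themselves archives, so recurse into each one.
    for sub in archive.get("WebSubframeArchives", []):
        yield from archive_html_parts(sub)


# Made-up archive standing in for a real file read with open(path, "rb").
sample = plistlib.dumps(
    {
        "WebMainResource": {
            "WebResourceURL": "http://example.com/",
            "WebResourceMIMEType": "text/html",
            "WebResourceData": b"<p>main</p>",
        },
        "WebSubframeArchives": [
            {
                "WebMainResource": {
                    "WebResourceURL": "http://example.com/frame",
                    "WebResourceMIMEType": "text/html",
                    "WebResourceData": b"<p>frame</p>",
                }
            }
        ],
    },
    fmt=plistlib.FMT_BINARY,
)

parts = list(archive_html_parts(plistlib.loads(sample)))
for url, html in parts:
    print(url, len(html))
```

If text that is missing from DTTG search results turns up only in the subframe entries, that would be consistent with an indexer that reads just the main HTML resource.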