Search function not working properly

Here is the problem: I have many PDF files that were created with a ScanSnap S510M scanner and OCRed with the bundled FineReader. When I search these documents with Preview or Acrobat, I can find what I am searching for. If I search for “xyz123”, it finds it. However, if I import that same PDF into DT Pro Office (current beta, downloaded today), the search will not find “xyz123”. Any ideas? I have the Spotlight index OFF and the database is on an encrypted sparsebundle. Thanks.

cemjack.

I assumed you were talking about the actual string “xyz123”, so I imported your post into a DTPO2 database, both as rich text and as PDF. Searching for that string was successful.

Select your PDF and choose Data > Convert > to Rich Text. Does the string “xyz123” appear correctly, i.e., with no extraneous spaces or other characters?

If so, look for that term in the Concordance (Tools > Concordance). The string should appear there as “Xyz123”.

Try the search again in the full Search window (Tools > Search). Check the settings to make certain that they are correct, e.g., not limited to a group that doesn’t contain the string.
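To show why the Concordance check is useful, here is a minimal Python sketch of a word list with counts. It is only an illustration, not how DEVONthink builds its Concordance; the function name and the sample strings are made up, and the capitalized display form is assumed from the “Xyz123” example above.

```python
from collections import Counter
import re

def concordance(text):
    """Count each word, ignoring case, keyed by a capitalized display form."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return Counter(w.capitalize() for w in words)

# Clean OCR: the string is stored as one word.
good = concordance("Search for xyz123 and XYZ123 again")

# Hypothetical bad OCR: a stray space splits the string into two words.
bad = concordance("Search for xyz 123 again")

print("Xyz123" in good, good["Xyz123"])
print("Xyz123" in bad)
```

If the OCR layer splits or garbles a string, the intact word never reaches the word list, so a search for it comes up empty even though the text looks fine on the page.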

I may not have explained things correctly.

I have a pdf file that has been created with ScanSnap S510M. That pdf file undergoes OCR using Finereader for ScanSnap. Now, I open that file using Adobe Acrobat or Preview and search for a specific word, let’s say that word is “stenosis”. The word is found 3 separate times in the pdf file.

Now, I import that PDF file into DT, and when I search for “stenosis” it returns “no items found”.

Now, this does not happen with all PDF files, but with enough of them that I do not feel I can rely on the search function. Maybe I am using things incorrectly?

So, now that I have said all that, what I also need to know is the following: I have 1000+ PDF files that I need to be able to search. I currently have them in folders on an encrypted sparsebundle. What I have now realized is that because the sparsebundle is encrypted, Spotlight will not index the files, and they are therefore not searchable in the manner detailed above. So my thought was to place the DT database within an encrypted sparsebundle. Then I could open the sparsebundle, run DT, open the database and have access to all files, including the ability to search. Will this work? Does anyone else have any other suggestions on how to keep files secure but searchable too? Thanks.

Well that last post was long winded…

I think I realized my problem. See if I have things correct. One way I was searching was for specific check numbers. For example, the PDF document had “Check # 00004044”.

I was searching for “4044” and getting no files found. I then tried searching for “*4044” and sure enough it found the pdf document.

So, is that what I was doing wrong - I needed to add a wildcard at the start of the search string?

If that is it, then I am excited that my plan to use an encrypted sparsebundle to hold the DT database will work. Any further comments/help or other ideas would be appreciated.

Yes, the use of the Wildcard marker was correct for your check number search, because you were not searching for the full string.
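The difference between the two searches can be sketched in a few lines of Python. This is a simplified illustration that assumes whole-word matching on indexed terms, not DEVONthink's actual search code; the tokenize and search helpers are hypothetical.

```python
from fnmatch import fnmatchcase
import re

def tokenize(text):
    """Split text into lowercase word tokens (letters and digits only)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def search(tokens, query):
    """Return the tokens that match the whole query; '*' is a wildcard."""
    query = query.lower()
    return [t for t in tokens if fnmatchcase(t, query)]

tokens = tokenize("Check # 00004044")

print(search(tokens, "4044"))   # "4044" is only part of the token "00004044"
print(search(tokens, "*4044"))  # the wildcard covers the leading zeros
```

In this sketch the term “4044” fails because it has to match a complete word, while “*4044” succeeds because the wildcard stands in for the leading “0000”.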

No, you don’t need to use the Wildcard operator for a term like “stenosis”. It so happens that I put together a database for my cousin, who is a medical doctor. I did a search for “stenosis” and got 33 results in 0.002 seconds.

Running your database from an (opened) encrypted disk image works well, and we recommend that for maximum security of databases.

If you know that a term exists in your PDFs, but got null results in a search for that term, open the main Search window (Tools > Search). Inspect your settings to make certain that you are indeed searching the proper database. Here’s a screenshot of the way my search was set up:
Stenosis search.jpg