Search by Author and Date

I have just downloaded Devonthink Office and imported, with OCR, some sample documents into it, and have a couple of questions I hope someone might be able to answer.

Is it possible to search by the Author text that I entered when prompted on import?

How can I search for keywords in documents created within a certain date range only, for instance when I know a document was authored by someone in the past two weeks?

Can I restrict my search to, say, only the Keywords field or only the Subject field?

Thanks!

At the moment, the Keywords you enter are imported into the Comments of the database record. The other data is written to the PDF file but, as with any other PDF file, is not currently used, apart from the file creation and modification dates.

Also, the creation date of an OCR-ed PDF file is set to the original file creation date.

Date range searches are not possible in the Find window, but the History tool can be helpful.

The History tool offers a number of features worth mentioning, although they may not all be relevant to your query.

The History view presents a flat view of your entire database, initially sorted by Age on the Modification Date. You can use other Sort options from View > Sort, and/or you can add other sortable columns to your History view using View > Columns. Select an item and press Command-R to see its location in your organizational structure.

Click on an item and Control-click (right-click). Contextual menu options will let you move or replicate the document to a location, add or remove a label on one or more items, and so on.

Remember also that there’s a script in DT Pro’s Scripts menu that will let you create a smart group that will display all of the content in your DT Pro database that’s been changed during the past week.

Ok thanks for that. It seems that this software might not be suitable for our business after all. It’s a shame because apart from the search limitations, it’s otherwise a great piece of software, and much easier to use and implement than some of the heavier document management systems around.

Thanks for your help!

Comments.

DTPO uses Apple’s PDFKit to display PDFs. But Preview doesn’t show the Document Properties information (which you can enter during the OCR process). Maybe in the next OS X release?

If you have the ‘set attributes’ option checked in DTPO’s Preferences > OCR you can enter the Document Properties information, and that information can be seen if the resulting PDF file is opened in Acrobat.

If the Author information is important, just enter it in the Keywords field as well. It will be saved into the document’s Info panel Comment field. And yes, the Comment field can be searched independently of the other metadata and content of a document. Note also that search results can be sorted by date. As I keep the Info panel permanently open as the right-most window on my display, I can very quickly check dates even in the search window.

Of course, if the author’s name is also in the body of the document, it will be found by a content/Phrase search.

Finally, if one needs really tricky sorting and filtering of the data, one can replicate the search results to a group created for that purpose (usually temporary). I usually use the Vertical Split view in that case, and additional sort columns can be added to the view using View > Columns. One can then create subsets of that group through searches and sorts, if needed. For example, although NOT isn’t available in the current set of search operators, it can be emulated by such techniques. Quicker done than described.
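To make the NOT emulation concrete: the logic amounts to taking one set of search results and dropping everything that also turns up in a second search. The Python below is only an illustration of that idea, not anything DEVONthink itself provides; the function name and the document names are made up for the example.

```python
def emulate_not(wanted_results, unwanted_results):
    """Return items from wanted_results that do not appear in unwanted_results.

    Both arguments are plain lists of document names (hypothetical data);
    the original order of wanted_results is preserved.
    """
    unwanted = set(unwanted_results)
    return [name for name in wanted_results if name not in unwanted]


# Hypothetical example: everything found for "invoice", minus whatever
# a second search for "draft" also found.
invoice_hits = ["invoice_acme.pdf", "invoice_draft.pdf", "invoice_2006.pdf"]
draft_hits = ["invoice_draft.pdf", "memo_draft.pdf"]

remaining = emulate_not(invoice_hits, draft_hits)
print(remaining)
```

In the application itself the same effect comes from replicating the first search's results to a temporary group, then searching or sorting within that group to pick out and remove the unwanted items.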

To tell the truth, I’ve unchecked the ‘set attributes’ option in DTPO Preferences > OCR. When I’m feeding a bunch of documents into my ScanSnap’s feeder I don’t want to have to stop and fill in the Document Properties information. Worse, that stops the OCR processing queue until the form is completed. Instead, I’ll depend on DTPO to help me find anything I need. I’ve set ScanSnap Manager to give each PDF a name such as “2006_11_02_05_11_07.pdf”. That tells me the date and time the PDF was created. I may or may not change it to a descriptive name later on.
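Since names like that carry the scan date, they can also stand in for a date range search outside the application. Here is a rough Python sketch, assuming the fields run year, month, day, hour, minute, second (my reading of the example name, not something confirmed above); the function names are made up for illustration.

```python
from datetime import datetime

# Assumed field order: year_month_day_hour_minute_second, then ".pdf".
FILENAME_FORMAT = "%Y_%m_%d_%H_%M_%S.pdf"


def scan_time(filename):
    """Parse the scan date and time out of a timestamp-style file name."""
    return datetime.strptime(filename, FILENAME_FORMAT)


def scanned_between(filenames, start, end):
    """Return the file names whose timestamp lies within [start, end]."""
    return [name for name in filenames if start <= scan_time(name) <= end]


# Hypothetical file names in the same pattern.
names = [
    "2006_10_15_09_30_00.pdf",
    "2006_11_02_05_11_07.pdf",
    "2006_11_20_14_00_00.pdf",
]

# Everything scanned in the two weeks up to 5 November 2006.
recent = scanned_between(names, datetime(2006, 10, 22), datetime(2006, 11, 5))
print(recent)
```

Once a file is renamed to something descriptive the timestamp is gone from the name, so this only works for PDFs that still carry the ScanSnap-style name.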