Hi. Newbie here. I’m looking into using this software to scan and organize the mound of paperwork that perpetually grows on my desk. I’ve played around with the demo and I’m able to use my Canon MX700 printer scanner along with ExactScan Capture to scan documents into DTP Office. Right now I’m using it mostly for bills I’m paying. Instead of manually writing “paid” on a paper bill and throwing it into one of the 9 boxes in my closet, I’d like to scan it, store it in DTP, shred/recycle the original and have it marked as paid in DTP along with searchable “fields” showing the amount paid, date due, date paid, name of vendor etc. Can this type of workflow be automated more or less in DTP Office?
There are a number of ways to approach those objectives, although not with ‘searchable fields’ added to the PDF.
A bill that has not yet been paid might be flagged, labelled or tagged as such. When I pay a bill by check, I’m in the habit of scanning the check and/or receipt. Within the database I can then merge the bill PDF and the check/receipt PDF into a single document and remove the marker (Flag, Label or Tag) for an unpaid bill.
Another approach could be to create a rich text note that’s associated by a hyperlink to the PDF of the bill. The smart template ‘Annotation’ (based on a script) might be a model that could be modified for that purpose, with appropriate cues for information such as Vendor, Amount, Date Paid, etc. If you don’t wish to get into scripting, just lay out the format of such a rich text document and Export as Template. Now the format will be available for use.
If you do quarterly reports, you might create a Tag such as ‘2010 Qtr 2’ and apply it to your bills.
Still another approach might be to use Sheets. Although you cannot insert the PDF of a bill into a Sheet, you can create a column for names of bills stored in your database, and use the Lookup command (Command-/) to automatically create a new Search window with that name already entered, then press Return to display the bill in the results list. Sheets will allow you to compute totals, and can be exported and read by Excel or Numbers.
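For anyone who wants to crunch the numbers outside of a Sheet, here is a rough Python sketch of the totals idea. It assumes the sheet has been exported as comma-separated text; the column names (Name, Vendor, Amount), the sample rows and the function name `totals_by_vendor` are all invented for illustration and are not anything DEVONthink prescribes.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Hypothetical export: the column names and rows are invented for illustration.
exported = """Name,Vendor,Amount
Acme Fireworks 1204,Acme Fireworks,149.95
Acme Fireworks 1310,Acme Fireworks,20.05
City Water 0310,City Water,38.10
"""

def totals_by_vendor(csv_text):
    """Sum the Amount column per Vendor, using Decimal to avoid float rounding."""
    totals = defaultdict(Decimal)  # Decimal() starts each vendor at 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["Vendor"]] += Decimal(row["Amount"])
    return dict(totals)

totals = totals_by_vendor(exported)
for vendor, amount in sorted(totals.items()):
    print(f"{vendor}: {amount}")
```

Decimal is used rather than float so that money amounts add up exactly.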
As the legendary Senator Russell Long used to say, “There are more ways to kill a cat than by stuffing it with butter.”
Experiment a bit and work out a systematic approach that will meet your needs. DEVONthink will not only allow you to reduce the paper stored in those 9 boxes, it will allow you to find and use the information when you need it.
OK, thanks. I guess I need to get away from the searchable fields on a PDF concept and think more in terms of scanning (I guess OCR isn’t even really necessary in that case?) and tagging the PDF or attaching some kind of note/template to it that is then searchable…
Can a note, an annotation, or something else that can be attached to a PDF have its own custom fields created by a template? In other words, can I have some template that I can attach to a PDF that would contain fields like Vendor, Date Due, Date Paid, Amount Paid, etc.?
Not “fields” in the sense of distinct database entities. I’m talking about text in a rich text note.
Example: Create a template note that has the following ‘cue’ strings:
Insert Item Link to the PDF
"Billed to "
"Due Date "
"Invoice Date "
"Date Paid "
Now I select that note and choose File > Export > as Template and store it as a template.
Now a bill arrives. I scan and OCR the bill to produce a searchable PDF. I open the appropriate template and copy information from the PDF and paste it into the relevant cue areas:
Use a convention for document Name such as VendorName InvoiceNumber.
Insert Item Link to the PDF
“Item Backpack rocket engine #1204”
“Billed to Wile E. Coyote”
“Vendor Acme Fireworks”
“Invoice Date 20100204”
“Due Date 20100304”
"Date Paid "
“Comments Strap on backpack and ignite rocket to run faster than the Roadrunner”
Why the quotation marks? If I select the entire string including the quotes (a triple-click will select the entire paragraph) and press Command-/, the string will be entered as a phrase query in a new Search window. Press Return and all documents that contain that exact string will be displayed in the results. Example: I can quickly find all the bills from Acme Fireworks by a search for “Vendor Acme Fireworks”.
Why the date expressed as YYYYMMDD? I can search for all the bills for 2010 by searching for Invoice Date 2010*. Or all the bills due in March, 2010 by searching for Due Date 201003*.
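To see why the YYYYMMDD form works so well, here is a minimal Python sketch. It is not DEVONthink functionality; the sample lines and the helper names `to_cue_date` and `lines_matching` are made up for illustration. Because the year comes first, then the month, a simple prefix match picks out a year or a month, and plain alphabetical sorting is also chronological.

```python
from datetime import date

def to_cue_date(d: date) -> str:
    """Format a date as YYYYMMDD, the convention used in the cue strings."""
    return d.strftime("%Y%m%d")

def lines_matching(lines, cue, prefix):
    """Return the lines that start with the cue followed by the date prefix,
    mimicking a wildcard search such as 'Due Date 201003*'."""
    target = f"{cue} {prefix}"
    return [line for line in lines if line.startswith(target)]

# Hypothetical cue lines collected from several notes.
lines = [
    "Due Date " + to_cue_date(date(2010, 3, 4)),
    "Due Date " + to_cue_date(date(2010, 3, 28)),
    "Due Date " + to_cue_date(date(2010, 11, 2)),
    "Due Date " + to_cue_date(date(2009, 3, 15)),
]

march_2010 = lines_matching(lines, "Due Date", "201003")  # bills due in March 2010
all_2010 = lines_matching(lines, "Due Date", "2010")      # bills due any time in 2010
oldest_first = sorted(lines)                              # string order is date order
```

A format such as 3/4/2010 would not behave this way, since neither a prefix match nor a sort would follow the calendar.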
One could Tag such notes as ‘unpaid’ and then remove the tag when the bill is paid. That allows a quick review of outstanding bills.
DEVONthink searches treat the Name of a document as one field and the Content as a separate field. But the above example illustrates how one can use simple tricks to emulate many ‘fields’ within the Content of a text note. I could, for example, find all the invoices from Acme Fireworks for the year 2010 very easily. I could then save the results as a smart group, and filter (Smart Group Editor) for the Tag ‘unpaid’ to find any that remain unpaid.
This is just a quick and dirty illustration of how one might apply some structure (cue strings, quotation marks) to an otherwise free-form text note to allow quite powerful slicing and dicing of information.
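If you ever want to process such notes outside of DEVONthink, the cue strings are also easy to pick apart by program. The following Python sketch is only an illustration under stated assumptions: the list `CUES` mirrors the cue strings in the example above, and the function names `parse_note` and `matches` are hypothetical. It turns one note into a dictionary of emulated fields, then applies the same kind of filter as the search described above (vendor plus year, and unpaid status).

```python
# Cue strings taken from the example note above.
CUES = ["Item", "Billed to", "Vendor", "Invoice Date", "Due Date", "Date Paid", "Comments"]

def parse_note(text):
    """Split a cue-string note into a dict mapping each cue to its value."""
    fields = {}
    for raw in text.splitlines():
        # Drop surrounding whitespace and straight or curly quotation marks.
        line = raw.strip().strip('"“”')
        for cue in CUES:
            if line == cue or line.startswith(cue + " "):
                fields[cue] = line[len(cue):].strip()
                break
    return fields

def matches(fields, vendor, year):
    """True if the note is from the given vendor and invoiced in the given year."""
    return fields.get("Vendor") == vendor and fields.get("Invoice Date", "").startswith(year)

note = '''"Item Backpack rocket engine #1204"
"Billed to Wile E. Coyote"
"Vendor Acme Fireworks"
"Invoice Date 20100204"
"Due Date 20100304"
"Date Paid "
'''

fields = parse_note(note)
is_unpaid = fields.get("Date Paid", "") == ""  # an empty Date Paid cue means unpaid
acme_2010 = matches(fields, "Acme Fireworks", "2010")
```

Lines that do not begin with a known cue, such as the Item Link, are simply ignored.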
Wow. Thanks for the detailed reply. Where can I find out more about creating templates and “cue” strings?
Just type ‘cue’ terms as in the example above. Save and name your template note.
As noted above, the command File > Export > as Template will store your template note for future use.
To call up the template, choose Data > New from Template > and choose the desired one from the list. Now fill in the information relevant to a bill.
You can find more information about templates, including smart templates, in Help and in the user documentation PDF.
OK, I’ve been checking the documentation on templates and from what I can tell, if I want to modify a “smart template”, I need to do so in Script Editor. This is where I get lost as I’m not sure where I would enter these cues. For instance I’ve copied and am trying to modify the “annotation” template.
The example I illustrated doesn’t involve scripting at all. It’s really, really easy. Just create a rich text file with the information ‘headings’, save and name it, then choose File > Export > as Template.
Sometimes things are so easy they seem hard.
OK, got that much. Now how do I attach that to a PDF? If I select “New with Template” and choose that template, nothing seems to happen. So basically I have my scanned bill open in DTP. Now I Want to attach a note or annotation based on this template I created to it.
Open the PDF and choose Edit > Copy Item Link.
Open your new rich text note and paste in the Item Link.
Now your text note contains a working hyperlink to the PDF. And as that Item Link is the same text string as the Name of the PDF, if you do an ‘All’ search for the Name of the PDF, the search will also find the associated text note.
The PDF is ‘attached’ to the text note via the Item Link.
The text note is ‘attached’ to the PDF by a search of the PDF name, using the All search.
As illustrated, you can track the paid/unpaid status of your bills, quickly find all the bills from any vendor, quickly list all the bills invoiced in, for example, the first quarter of 2010, and so on, using a simple set of notes derived from a template and ‘filled in’ with specific information from each bill.
Or, as noted, you could use a Sheet instead of a rich text note, or even an Excel or Numbers spreadsheet that you’ve set up as a template in your database (allowing number-crunching, and also containing searchable information extracted from the scanned bills). A slightly different approach would be used to ‘attach’ a row of information in an Excel sheet to the corresponding PDF, using the Name of the PDF for insertion in a search query. Again, the point is that you can devise a flexible means of aggregating/disaggregating information contained in your searchable collection of PDFs to suit your needs.
From an auditing perspective, you can demonstrate the source documentation – the scanned bills/invoices, etc. During the year, I scan in business expenses and at tax time collate the information and give it to my tax accountant. My approaches could easily satisfy an IRS audit.
These techniques of associating or ‘attaching’ documents to each other work for documents of all file types, not just PDFs.
OK, I think I’m getting the hang of it. Thank you for all your help.
I’ve been scanning years and years of statements, receipts, and other clutter I’ve had piled up in various cubby holes, closets, and drawers in my house.
Bills and statements and other oddball things started to auto-classify rather quickly after putting a few samples in the right place.
Receipts were tough for me, because I’m too lazy to look at the receipt and copy paste, copy paste. So I wrote a script to do it for me.
It’s nowhere near comprehensive, but it does a lot of heavy lifting, and with some tweaking I can see it doing this automatically for you as well. I’d be interested in seeing new tweaks for finding information automatically with regular expressions and some good old-fashioned problem solving.
Check out the script here: viewtopic.php?f=20&t=10246
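To give a flavor of the regular-expression idea, here is a small Python sketch. It is not the linked script, just an illustration under assumptions: that the OCR text has a line starting with “Total” followed by an amount with two decimals, and that dates are printed US-style as MM/DD/YYYY. The patterns, the function name `extract_receipt_fields` and the sample receipt are all invented.

```python
import re

# Assumed layout: a line beginning with "Total" (or "Grand Total"), then an amount.
TOTAL_RE = re.compile(
    r"^\s*(?:grand\s+)?total\b[^\d\n]*(\d+(?:,\d{3})*\.\d{2})",
    re.IGNORECASE | re.MULTILINE,
)
# Assumed date format: MM/DD/YYYY.
DATE_RE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")

def extract_receipt_fields(ocr_text):
    """Pull the total amount and the date (as YYYYMMDD) out of OCR'd receipt text.
    Either value is None when no match is found."""
    total = TOTAL_RE.search(ocr_text)
    when = DATE_RE.search(ocr_text)
    amount = total.group(1).replace(",", "") if total else None
    day = None
    if when:
        month, dom, year = when.groups()
        day = f"{year}{int(month):02d}{int(dom):02d}"
    return {"amount": amount, "date": day}

sample = """ACME FIREWORKS
02/04/2010  14:32
Subtotal      18.50
Tax            1.49
TOTAL        $19.99
"""

result = extract_receipt_fields(sample)
```

Real receipts vary a lot, so expect to tweak the patterns for each store; OCR errors (an “O” for a “0”, say) will also defeat a strict pattern.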
Well, I thought I was obsessive (I mean, “organized”) but I see from the responses you are getting, I am nowhere near other users of DTP.
Anyway, if it helps I just got started organizing bills in DTPO, and I have a few pointers that might help. But I bow to the DTP maven Bill!
I am a “grouper” not a “tagger” in the terms of “Taking Control of Getting Started with DevonThink”, so I use groups (folders) for structure, but I then use tags for “actionable” states—like “Need Pay”, “Taxes 2010”, etc. Then I can look in that tag and find all the actionable items when I need them.
I then set up my folder structure so that it is not too deep (no more than 3 levels), and I try to enforce the rule that a folder contains either folders or files, but not both. This is an analogy with my physical files (which I am transitioning away from): a drawer contains dividers, a divider contains folders, and a folder contains documents. I try to start the names of common folders (those in the same divider) with a shared word. This is solely so that I don’t have to remember all the folder names, but only the divider names; type-ahead will reveal the names when I need them. I also enforce the rule suggested in “Taking Control of Getting Started” that 2 folders never have the same name!
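Those three rules (no more than 3 levels, folders hold either folders or documents, no repeated folder names) are simple enough to check mechanically. Here is a Python sketch of such a check. It does not talk to DEVONthink at all: the group structure is represented as a plain nested dictionary, and the function name `check_groups` and the sample trees are made up for illustration.

```python
def check_groups(tree, max_depth=3):
    """Check a nested dict of groups against the three rules.

    `tree` maps a name to a dict (a group) or to None (a document).
    Returns a list of problems; an empty list means all rules hold.
    """
    problems = []
    seen = set()

    def walk(name, children, depth):
        if depth > max_depth:
            problems.append(f"'{name}' is nested deeper than {max_depth} levels")
        if name in seen:
            problems.append(f"group name '{name}' is used more than once")
        seen.add(name)
        # True for child groups, False for child documents.
        kinds = {isinstance(child, dict) for child in children.values()}
        if kinds == {True, False}:
            problems.append(f"'{name}' mixes groups and documents")
        for child_name, child in children.items():
            if isinstance(child, dict):
                walk(child_name, child, depth + 1)

    for top_name, top in tree.items():
        if isinstance(top, dict):
            walk(top_name, top, 1)
    return problems

good = {"Finance": {"Bills Electric": {"2010-03.pdf": None},
                    "Bills Water": {"2010-03.pdf": None}}}
bad = {"Finance": {"Bills": {"stmt.pdf": None}, "loose.pdf": None},
       "Home": {"Bills": {}}}

good_problems = check_groups(good)
bad_problems = check_groups(bad)
```

In the `bad` tree, “Finance” holds both a group and a loose document, and the name “Bills” appears twice, so two problems are reported.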
Finally, I uncheck “Exclude from tagging” for all the folders that contain documents (which turns them yellow).
All this is a prelude to saying that I try to completely classify a document when I scan it (I am using the ScanSnap which integrates nicely). When the document is scanned and OCR’d, a dialog comes up where I give it a name, and then use the tags field to both tag the document (“unpaid bill”) and to classify it (by “tagging” with the folder name, or folder names if it fits in more than one).
After I have scanned a bunch of documents, I then look in my in-box and all the replicants (in red) are now nicely filed, and all I have to do is delete the replicants and the documents are now in their proper folders.
I find this much faster than trying to drag files from my inbox to various folders, especially on a small screen.
I’m also assessing information managers, as I move to a paperless home office with my brand spanking new ScanSnap S1500.
Like Newbie, I’m also primarily using this (initially) for a ton of invoices, receipts and other financial documents I need to track.
It’s come down to DevonThink Pro Office vs NeatWorks, each seemingly with its own strengths. I’d be interested in hearing anyone else’s assessments. I recognize this will be biased, since I’m on a DevonThink board, but that’s ok.
I won’t go through all the differences, but here are a few key differences that are keeping me dithering.
DevonThink Pro Office
- can add any type of document, not just PDFs. This is the biggest benefit vs NeatWorks for me
- feels rock solid
- like seeing all DBs in the sidebar
- groups seem to allow only alphabetical ordering; I’d really prefer to choose the order for my groups (maybe I’m missing something?)
- this one’s the bigger issue: I really like to add some structured and free-form notes to a PDF (like Date, Amount, Notes). This is one of the great advantages of a DB over just organizing things in Finder. The solutions suggested in this thread make it possible in DTPO, but are a lot clumsier and involve more effort than merely adding some info to fields, like I can do in NeatWorks.
NeatWorks
- more flexible in ordering groups
- dead simple to add other data about each PDF
- can only handle PDFs and RTFs
- some minor quirks in UI
- this one’s just aesthetic - it doesn’t look quite as clean as DTPO
Am I missing something? Can anyone help me stop dithering?
@hab777, re groups: under views/sort you can change to various sorts, including unsorted, which allows you to position groups in your preferred order.
I had figured that out for documents but didn’t see how to do it for groups (duh - I should have probed more).
Any advice on the other issue - easily adding free-form data and structured data like one can do in NeatWorks? I really don’t need DTPO’s sophisticated AI - I just need a place to easily file snippets and documents.
I’m thinking that if I have to put my free-form notes in Spotlight comments, I might as well just use Finder, QuickLook and Spotlight, along with the bundled OCR software that came with my ScanSnap. That would be a “good enough” solution for my purposes and would save the $150.