Workflow / scan -> e-mail -> inbox

Hi.

I have a Xerox copier as my only scanner. It doesn’t work as a desktop scanner: it’s pretty big and quite far from my computer. It can only scan to e-mail or to a folder on a server. I would like to find a way for DEVONthink to automatically OCR every file that I scan with it and import the result to the inbox. This requires it to poll either an e-mail account or a folder on a server. Is there a script that already does that, or will I have to code it myself?

Assuming that the server connection is stable and that you can attach a Folder Action script to the folder on the server, here’s how:

  1. Create a new Finder folder that is to receive your scanner’s output. For the sake of illustration, I’ll call that folder “Harry”.

  2. In the Finder, Control-click on “Harry” and choose the contextual submenu “Services”, then select “Folder Actions Setup”.

  3. Choose the script named “DEVONthink - Import, OCR & Delete”. (It should be listed among the available options. It is located at ~/Library/Scripts/Folder Action Scripts/.)

  4. Attach that script to “Harry”.

Now operate your scanner using its provided driver software after configuring it to save scanner output to “Harry”. DT Pro Office should be running.

Each time a new scanner output file is saved into “Harry”, the attached Folder Action script will send it to DT Pro Office for OCR and storage of the resulting searchable PDF, then send the original image file to the Trash. The folder “Harry” will therefore be emptied as each image-only PDF is sent to it and then forwarded to DT Pro Office for Import and OCR.

Especially if the new file sent to “Harry” is a large one, nothing may seem to have happened for a time. To verify that the script has sent the image file to DT Pro Office for OCR, switch to DT Pro Office and choose in the menubar Window > OCR Activity. When processing is complete, the script will then delete the image file from “Harry”.

I tried it already. It didn’t work. However, if I remove the file from the folder and then put it back in using the mouse, it works. It appears that folder actions do not get triggered by FTP uploads.

OK. The problem appears to come from folder actions starting before the file transfer is complete. So I need to modify the script so it only launches if the file transfer is complete.

I don’t know how the FTP server writes the uploaded files:

  • either it writes them bit by bit
  • or it creates the file empty, and only writes to it once it has received all the data in a temp file

If it is the second way, I only need to test for empty. If it is the first way, I need to check that no other app has a write lock on the file.
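To illustrate the "is the upload finished?" test, here is a minimal sketch in Python. It covers both cases by waiting until the file is non-empty and its size has stopped changing between two readings. The function name and parameters are hypothetical, and it does not check for write locks, so a stalled transfer could still look finished.

```python
import os
import time


def wait_until_stable(path, interval=10.0, max_checks=30):
    """Return the file size once two consecutive readings agree and are non-zero.

    Returns None if the size never settles within max_checks readings.
    (Hypothetical helper, not part of DEVONthink or any FTP server.)
    """
    previous = -1
    for _ in range(max_checks):
        current = os.path.getsize(path)
        # An empty file is treated as "not yet written", whatever the previous reading was.
        if current > 0 and current == previous:
            return current
        previous = current
        time.sleep(interval)
    return None
```

A caller would hand the file over for OCR only when the function returns a size, and skip or retry when it returns None.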

Hm. This was not it. I checked by deliberately putting incomplete/empty PDF files in the directory; this generates an error, which is logged.

Finally I found the problem: when the screensaver is on, folder actions are disabled. Why in the world? Well, this is Apple’s decision, so I’ll have to write a script and a launchd plist with a WatchPaths entry so that it gets launched when something changes in the folder. I’ll run it with osascript, which should be enough. I’ll update here when it works.

OK. This is how I did it:

  • I put this script in /Library/Scripts/Scan/scan.scpt:
-- Outer try: if the folder is empty (launchd also fires on deletions), exit quietly.
try
        set theFolder to "Macintosh HD:Users:jabial:Documents:scan" as alias
        tell application "Finder"
                set theItem to first file of folder theFolder
        end tell
        set thePath to theItem as text
        try
                -- Wait until the file size stops changing, i.e. the upload is complete.
                set previous to -1
                tell application "Finder" to set current to size of theItem
                repeat while previous ≠ current
                        set previous to current
                        delay 10
                        tell application "Finder" to set current to size of theItem
                end repeat
                log "Trying to OCR now"
                -- Send the file to DEVONthink for OCR; the result goes to the inbox.
                tell application id "com.devon-technologies.thinkpro2"
                        set theRecord to ocr file thePath to incoming group
                end tell
                log "OCR successful (I think), deleting file now"
        on error
                log "No way to OCR the file, deleting it now"
        end try
        -- The scanned file is moved to the Trash whether or not OCR succeeded.
        tell application "Finder" to delete theItem
end try
  • I put this launchd description in /Library/LaunchDaemons/org.jabial.scan.plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Disabled</key>
    <false/>
    <key>Label</key>
    <string>org.jabial.scan</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/osascript</string>
        <string>/Library/Scripts/Scan/scan.scpt</string>
    </array>
    <key>WatchPaths</key>
    <array>
        <string>/Users/jabial/Documents/scan</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>OnDemand</key>
    <true/>
</dict>
</plist>

Then run launchctl load /Library/LaunchDaemons/org.jabial.scan.plist, and it works.
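For anyone adapting the job description to their own paths, here is a small sketch in Python that builds an equivalent plist with the standard plistlib module and reads it back to confirm it parses. The helper names render_job and check_job are hypothetical, the paths are the ones from this thread, and only the keys the job relies on are included.

```python
import plistlib

# Job definition mirroring the plist in this thread; adjust the paths for your own setup.
job = {
    "Label": "org.jabial.scan",
    "ProgramArguments": ["/usr/bin/osascript", "/Library/Scripts/Scan/scan.scpt"],
    "WatchPaths": ["/Users/jabial/Documents/scan"],
    "RunAtLoad": True,
}


def render_job(job):
    """Serialize the job dictionary to XML plist bytes, keeping the key order."""
    return plistlib.dumps(job, fmt=plistlib.FMT_XML, sort_keys=False)


def check_job(xml_bytes):
    """Parse the plist and return the names of any keys this workflow needs but lacks."""
    parsed = plistlib.loads(xml_bytes)
    needed = ("Label", "ProgramArguments", "WatchPaths")
    return [key for key in needed if key not in parsed]
```

Writing the bytes from render_job(job) to a .plist file and getting an empty list back from check_job is a quick sanity check before handing the file to launchctl.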

Pff. Considering the price, such a workflow should have been included in the app. I hereby authorize Devon Technologies to use and distribute this code with any modifications they see fit. I hope for the sake of others who are less computer literate that they will.