Auto-import scanned files

Hi,

Trial user here. Turns out I’m down a rabbit hole…

  • I hear that my ScanSnap ix500 is not supported directly. I use Vuescan, which works great. I don’t use OCR in Vuescan.
  • I want to scan documents and get them OCR’d and imported into my encrypted database in DT
  • I see there are hundreds of forum messages about similar topics. But half of them are ancient, half of them use Hazel (sp? - which I don’t), half of them state it shouldn’t be necessary to use Hazel, half of them suggest using the Inbox, but don’t have examples…
  • I don’t understand the idea of using the Inbox because it’s a black hole - I tried to create a folder “scans” inside it, but then I can only see that inside DT and not in Finder (I created it there and I saw the folder “vanish” within a fraction of a second!) or Vuescan - so dropping things inside the “scans” subfolder seems impossible. On the other hand, I expect I may want to use the Inbox for other things in the future, so it seems wrong to attach a “watch” rule to the whole of it.
  • I found the suggestion of indexing a separate folder and then watching that. I tried that, but it appears that my smart rule that I created to run OCR on documents created in that separate folder was never run (or at least I didn’t see any results - no idea how to make sure whether the rule is being run or not), whatever I did. I used the “On Creation” trigger, but clearly this did not work. How to debug?
  • I see crazy complicated scripts being discussed and passed around in this forum. I know my way around programming languages, so I would consider that. However, I have no idea where the APIs supported by DT are documented. Pointers would be appreciated.
  • I was sort of expecting to find a standard recipe for the whole “watch a folder and process/grab what you find in it” job since I guess this is a pretty common problem. I keep seeing comments along the lines of “you can watch the Inbox and do stuff”, but I never see examples of how that would be done. What am I missing?

Lunchtime! Hoping for a few comments - I find DT very impressive, but some things are less than intuitive… thanks for reading!

Oli

I have one of those here; it is supported by macOS 12.5 and transfers files directly to DT.

Interesting. If you search this forum, you’ll easily find what I found – as I understand, the standard software was dropped several MacOS versions ago. If it’s back then I didn’t know that!

In any case I have to admit I don’t use any of the standard software because it’s horrific bloatware that bangs around with several background services at all times… horrible. I tried it years ago and happily got rid of it in favor of Vuescan. I can say that my scanner is not recognized by MacOS Image Capture, and there doesn’t seem to be a simple driver for the ix500 that doesn’t involve installing a 100MB package (edit: make that 875MB - just checked). In case you have such a driver, I’d be delighted to hear about it! Off to search for it now…

  • I don’t understand the idea of using the Inbox because it’s a black hole - I tried to create a folder “scans” inside it, but then I can only see that inside DT and not in Finder (I created it there and I saw the folder “vanish” within a fraction of a second!) or Vuescan - so dropping things inside the “scans” subfolder seems impossible. On the other hand, I expect I may want to use the Inbox for other things in the future, so it seems wrong to attach a “watch” rule to the whole of it.

The Global Inbox alias is connected to the database itself. It’s a dynamic location, not a static one. Items put in there will be imported into the database.

  • I found the suggestion of indexing a separate folder and then watching that. I tried that, but it appears that my smart rule that I created to run OCR on documents created in that separate folder was never run (or at least I didn’t see any results - no idea how to make sure whether the rule is being run or not), whatever I did. I used the “On Creation” trigger, but clearly this did not work. How to debug?

You’d use the On Import event trigger.

  • I see crazy complicated scripts being discussed and passed around in this forum. I know my way around programming languages, so I would consider that. However, I have no idea where the APIs supported by DT are documented. Pointers would be appreciated.

Check out the Automation chapter of the Help and manual. Also, check the Automation section of our forums for a ton of discussion over the years.

  • I was sort of expecting to find a standard recipe for the whole “watch a folder and process/grab what you find in it” job since I guess this is a pretty common problem. I keep seeing comments along the lines of “you can watch the Inbox and do stuff”, but I never see examples of how that would be done. What am I missing?

There are many examples and discussions on the forums as it’s a common thing to do.

Thanks… trying to make sense of what you say.

The Global Inbox alias is connected to the database itself. It’s a dynamic location, not a static one. Items put in there will be imported into the database.

So when you say “the database”, you mean the auto-generated “inbox” database, right? I wasn’t thinking along these lines - “the database” for me is “my database”. But I get it, hence the “on import” trigger. That said, I found that “on creation” works fine.

Meanwhile - so this means that it’s impossible to use a file path to distinguish what needs to be done with files dropped into the inbox - correct? I thought I should be able to have a subfolder “INBOX/scans” for incoming scans, and potentially other subfolders for other purposes going forward. I guess I could use a “magic” filename for this specific purpose, though a folder would have been much nicer.

Check out the Automation chapter of the Help and manual. Also, check the Automation section of our forums for a ton of discussion over the years.

Cool stuff about the manual! I read the help up and down, but I never noticed a link to that manual anywhere. I’ll be sure to check that out.

There are many examples and discussions on the forums as it’s a common thing to do.

Well I found tons of contradictory information and discussions documenting how things worked and changed throughout several years… unfortunately that’s not the same thing as an easy-to-follow and up-to-date recipe for the new user.

In any case I have now managed to make some progress. I decided to ignore the issue of sub-dividing the Inbox and attach my watch rule there.

This worked, with one caveat:

  • After I dropped my sample file into the inbox, it sat there for a very long time – so long that I was quite convinced my rule was not being used. After a long while I noticed that the entry was suddenly gone and that my rule had totally worked! Good surprise, but it makes me wonder how I can see this sort of thing happening for a more deterministic approach in the future. Is there a place in DT where I can see a trigger being triggered as soon as it happens?

I will change the rule to use “on import” based on your feedback. Thanks for the help!

See Software Applications That Can Be Used with the ScanSnap for the software you may be looking for.

You’re welcome.

  • After I dropped my sample file into the inbox, it sat there for a very long time – so long that I was quite convinced my rule was not being used. After a long while I noticed that the entry was suddenly gone and that my rule had totally worked! Good surprise, but it makes me wonder how I can see this sort of thing happening for a more deterministic approach in the future. Is there a place in DT where I can see a trigger being triggered as soon as it happens?

Under normal circumstances, you’re not going to be watching for imports and smart rules to trigger. I suggest you just operate normally and check back occasionally to spot-check the results.

Meanwhile - so this means that it’s impossible to use a file path to distinguish what needs to be done with files dropped into the inbox - correct? I thought I should be able to have a subfolder “INBOX/scans” for incoming scans, and potentially other subfolders for other purposes going forward. I guess I could use a “magic” filename for this specific purpose, though a folder would have been much nicer.

I’m not sure what a magic filename is. Sounds like a Shortcuts convention.

You can certainly target a subgroup in a database, yours or the Global Inbox.

Thanks… but I don’t want to do anything like that. Unfortunately I’m stuck with the ScanSnap for the time being, but I don’t want to use a scanner that ignores standard system driver interfaces in favor of a huge vendor specific software suite. Glad it works for those who use the standard software! :slight_smile:

I’m not sure what a magic filename is. Sounds like a Shortcuts convention.

What I mean is that I can call the files coming from Vuescan incoming scan xxx.pdf and then I can watch for files starting with incoming scan and make sure that I don’t grab files that aren’t meant to be handled.
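To make the idea concrete, here is a minimal Python sketch of that filter. It only illustrates the matching logic, not anything DT itself runs; the prefix and the function names are hypothetical, and inside DT the equivalent would be a smart rule condition on the name.

```python
from pathlib import Path

# Hypothetical prefix; VueScan's output file name would have to match it.
SCAN_PREFIX = "incoming scan"


def is_incoming_scan(filename: str, prefix: str = SCAN_PREFIX) -> bool:
    """Return True for PDFs whose name starts with the magic prefix."""
    path = Path(filename)
    return (
        path.suffix.lower() == ".pdf"
        and path.name.lower().startswith(prefix.lower())
    )


def select_scans(filenames, prefix: str = SCAN_PREFIX):
    """Keep only the files a scan-handling rule should touch."""
    return [name for name in filenames if is_incoming_scan(name, prefix)]
```

The match is case-insensitive on purpose, so a file called Incoming Scan 2.PDF is still picked up while anything else in the inbox is left alone.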

You can certainly target a subgroup in a database, yours or the Global Inbox.

Target, yes - but if I have a subfolder scans inside the inbox, there’s no way to actually drop anything into that folder from outside DT, so that approach doesn’t work.

I actually tried this by appending scans to the output folder in the Vuescan config. But it doesn’t work, the files arrive on the top level of the inbox instead.

Look again. This is a Fujitsu ScanSnap site with their software.

That is both correct and incorrect. ScanSnap Manager is legacy software and - for a while at least - was not available for newer iterations of macOS. That was a problem for owners of the 1500M, for example. The ix500 has always been compatible with ScanSnap Home, which followed on from ScanSnap Manager and is fully compatible with current macOS versions.

I put a piece of paper in my ix500, press the scan button, and the scan appears in DT, where it goes through OCR. I’m using macOS 12.5, DT 3.8.5 and an ix500 with ScanSnap Home.

I don’t understand what you’re getting at.

Fujitsu uses a huge software package that is completely of their own making – it does not include TWAIN drivers, and more importantly it does not include Apple Image Capture drivers. What it does include is background processes and custom updaters that hang around on my system even when I’m not scanning anything.

What I want is a driver, nothing more. Canon offers this, and so does Epson (and so do others). Fujitsu does not.

In yet other words, even when you install the 875MB behemoth they offer, you still can’t see a ScanSnap in the standard Image Capture utility on MacOS.

That’s as much as I know – I’m certainly interested if I should be wrong about any of these details. But I don’t want to use a huge proprietary vendor-specific software package that ignores standard interfaces like Image Capture – that’s what I meant.

Right, understood. I don’t believe that many forum posts I found actually included that detail, but I found out later.

Still, I don’t want to use ScanSnap Home (or Manager, as it were).

Ok

Correct; DT uses the principle of a global inbox. There are several options: you can get VueScan to put the scans in a specific folder on your SSD and then index that folder (with the resulting group called Inbox/Scans for example). Alternatively you can drop the files to a specific group using the import command in AppleScript (so you could run a script from e.g. Hazel, telling DT to import to a specific group; I don’t have VueScan, so don’t know if it offers to run a script after scanning, which would allow you to do the same thing). The next alternative is to use a rule with an on import trigger that simply moves all files with a magic name to a group of your choosing. Or you could use the global inbox as just that: the place where things arrive.
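As a rough sketch of the scripted import option, the Python below builds a small AppleScript and hands it to macOS’s osascript tool. It is written under assumptions, not taken from the manual: the AppleScript terms (`inbox`, `create location`, `import … to`) are what I understand DT 3’s scripting dictionary to offer, so check them in Script Editor first, and the group name `/Scans` and all function names are made up for illustration.

```python
import subprocess


def applescript_string(value: str) -> str:
    """Quote a Python string as an AppleScript string literal."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_import_script(pdf_path: str, group_location: str = "/Scans") -> str:
    """Build AppleScript that imports one file into a group of the Global Inbox.

    Assumption: DEVONthink 3's scripting dictionary (`inbox`,
    `create location`, `import ... to`). Verify before relying on it.
    """
    return "\n".join([
        'tell application id "DNtp"',
        # Creates the group if it does not exist yet (assumed behaviour).
        f"    set theGroup to create location {applescript_string(group_location)} in inbox",
        f"    import {applescript_string(pdf_path)} to theGroup",
        "end tell",
    ])


def import_scan(pdf_path: str, group_location: str = "/Scans") -> None:
    """Run the generated script with osascript (macOS only)."""
    script = build_import_script(pdf_path, group_location)
    subprocess.run(["osascript", "-e", script], check=True)
```

Something like `import_scan("/path/to/incoming scan 001.pdf")` would then be called by whatever watches the VueScan output folder. This only files the document; OCR would still be left to a smart rule on the target group.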

DT is extremely flexible - you can get it to do most things if it’s important enough to you. It is worth stopping and checking to what degree you can conform to the basic principles of the software though; and then change what is necessary. DT grows on you.


Ok. I use Fujitsu’s ScanSnap Manager software all the time with DEVONthink and I just thought it would help. Guess not.


Same here. Works great, I’d be lost without it.


Like many others, I use my Fujitsu ScanSnap to scan directly into DT. You can use either ScanSnap Manager or ScanSnap Home, both work equivalently (I prefer ScanSnap Home). In either case, you create a profile to scan into an application, and the application you choose is DEVONthink. With that profile selected, you load the paper, push the button, and the paper is scanned and OCRed and placed into the Global Inbox. The scanned file will use the naming conventions that you define in ScanSnap Manager or Home. From the Global Inbox, you can rename it if you want and file or move it anywhere you like, either manually or automatically using DT’s AI capabilities (See Also & Classify).


One more refinement: In Preferences → Files → Import you can change the Destination to allow you to browse to any destination for the scanned file.
