OCR images

Sorry if I simply didn’t find the way to do this:

I am importing nearly 10,000 Evernote notes in order to switch to DTP. Most documents are in JPG format, not PDF. In Evernote they were searchable (but I couldn’t access the whole text, to copy it for example).

How can I automatically convert all these JPGs into searchable PDFs in DTPO?

Thank you

I hope you’re not just dumping 10,000+ JPGs into a database. It would be best to do this in batches to go easy on system resources and keep an eye on things.
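
As a rough illustration of the batching advice, here is a minimal Python sketch that splits one big folder of exported images into numbered batch folders, so each batch can be imported and checked separately. The folder layout, the `batch_001` naming and the default batch size of 500 are assumptions made for this example, not anything DEVONthink requires.

```python
import shutil
from itertools import islice
from pathlib import Path
from typing import Iterator, List, Sequence


def batches(items: Sequence, size: int) -> Iterator[List]:
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def stage_in_batches(source: Path, staging: Path, size: int = 500) -> List[Path]:
    """Move the files in `source` into staging/batch_001, batch_002, ...

    Returns the batch folders that were created, in order.
    """
    files = sorted(p for p in source.iterdir() if p.is_file())
    created = []
    for number, chunk in enumerate(batches(files, size), start=1):
        target = staging / f"batch_{number:03d}"
        target.mkdir(parents=True, exist_ok=True)
        for f in chunk:
            shutil.move(str(f), str(target / f.name))
        created.append(target)
    return created
```

Each resulting batch folder could then be imported on its own, and the next one started only once the previous batch looks right.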

You can select files in DEVONthink and choose Data > Convert > To Searchable PDF.
You could also attach the OCR, Import & Delete folder action to a folder in the Finder and process them that way.

(Sorry, I’m pressed for time so short answers.)

Thank you for your quick answer.

No, I’m not dumping 10,000 JPGs into the DTP database in one batch, don’t worry :wink:

I began by importing some notebooks (these are the “groups” in Evernote).
I am very careful and everything stays backed up. By the way, it’s a little tricky, as you cannot synchronize Evernote -> DTPO: if I have to import some notes again, DTPO will import everything as duplicates.

The problem for OCR is that DTPO imports the Evernote database as HTML (an empty note with a link) + JPG (the image, the content). But these JPGs no longer have a JPG extension. This leaves the “OCR” button inactive…
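
If the missing extension really is what keeps the images from being recognised, one possible approach outside DEVONthink is to put the extension back before importing. The Python sketch below assumes the images have been exported to an ordinary Finder folder; it identifies JPEGs by their first three bytes and appends `.jpg` to files that have no extension at all. The function names are made up for this example, and whether this is enough to enable OCR after import is not confirmed anywhere in this thread.

```python
from pathlib import Path
from typing import List

# A JPEG file starts with the SOI marker FF D8, followed by another FF marker.
JPEG_MAGIC = b"\xff\xd8\xff"


def is_jpeg(path: Path) -> bool:
    """Return True if the file content starts with the JPEG signature."""
    with path.open("rb") as fh:
        return fh.read(3) == JPEG_MAGIC


def restore_jpg_extensions(folder: Path) -> List[Path]:
    """Append '.jpg' to extension-less JPEG files in `folder`.

    Files that already have an extension, are not JPEGs, or would
    collide with an existing name are left untouched.
    """
    renamed = []
    for path in sorted(folder.iterdir()):
        if not path.is_file() or path.suffix:
            continue
        if not is_jpeg(path):
            continue
        target = path.with_name(path.name + ".jpg")
        if target.exists():
            continue
        path.rename(target)
        renamed.append(target)
    return renamed
```

It would be wise to run this on a copy of the exported folder first, since it renames files in place.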

It is not possible in DEVONthink to directly convert .enex, HTML, Formatted Notes, Rich Text, or Plain Text to PDFs.

Try a service such as this, which will export your Evernote notes as PDFs to Dropbox (for example), from where they can be imported or indexed into DEVONthink.

Still no ideas, especially from the support staff?

The only workaround that I have found is to copy the image, paste it as a new document, and then OCR it.

Impossible to do this with all my notes…

@korm sorry, I didn’t see the service link.

This might be an idea.

I will try, thank you.

Meanwhile I tried another workaround: in Evernote, I exported the attachments to a folder (these are the notes in JPG form) and then imported the folder into DTPO.

Problem: I lose the notebook names (group names), and I have to press “Enter” for every note to acknowledge its name after OCR…

Evernote doesn’t make it easy to get your data out of their application / system.

Please, couldn’t you add this feature in a future version? The only thing you would have to do, while importing Evernote notes, is to OCR (on request) every image that is attached to a note and add the searchable PDF to the note.

I tried a lot of things, like

  • exporting all the attachments from a notebook in Evernote into a folder, then importing the images from that folder into DTPO with OCR -> Works fine, but I only get the notes that had an attachment, not the notes without one that were also in the Evernote notebook.

  • the workaround with the web service cloudHQ doesn’t work for my purposes: all notes lose their creation date, which becomes the date of the import into Dropbox.
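
To see which notes the attachment-export route leaves behind, the notebook could also be exported as an .enex file and scanned for notes that have no attachment. The Python sketch below assumes the usual ENEX layout, where each `<note>` element carries a `<title>`, a `<created>` timestamp and one `<resource>` element per attachment; the function name is hypothetical, and the layout should be checked against a real export before relying on it.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def notes_without_attachments(enex_xml: str) -> List[Tuple[str, str]]:
    """Return (title, created) for every note that has no <resource> child.

    `enex_xml` is the text of an Evernote .enex export. The element names
    used here are assumptions about that format.
    """
    root = ET.fromstring(enex_xml)
    result = []
    for note in root.iter("note"):
        if note.find("resource") is None:
            title = note.findtext("title", default="(untitled)")
            created = note.findtext("created", default="")
            result.append((title, created))
    return result
```

The returned titles and creation timestamps give a checklist of notes that still need to be brought over by other means.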

For me, this is really the big obstacle to switching definitively from Evernote to DTP, because right now I lose all possibility of searching the scanned notes that were added to Evernote as JPGs.

I have to keep the whole Evernote database on my computer in order to find things.

(emphasis mine)

This is not a trivial thing to ask, if it’s even feasible at all. Transitioning that amount of data may require sacrifices on your part. Again, Evernote does not make it easy to get your own data out intact. (Hint: They don’t want you to switch.) Have you posted on their forums about this?

I will ask in their forum, but I don’t expect much help from Evernote specialists with leaving their community… :wink:

But please, once again, help me understand.

DTPO has been able to import one of my Evernote notebooks, OK.
The notes in it have the right timestamps (the creation date in Evernote).
DTPO shows these HTML notes (HTML + images) without any problem in the preview window. OK.

The only thing that DTPO doesn’t do is let me select the image and OCR it.
Why not? DTPO knows that it’s an image…
At this stage, it is no longer an Evernote problem. DTPO has done the import very well; now I only want to handle the contents…

If I try to select the image and drag and drop it to another group, I once again get an HTML note and cannot extract the image.

If I could do this manually for the important notes, I would already be happy…

Is there still no way to automatically OCR all pictures in a folder (e.g. the Screenshots folder in Finder)?

“automatic” still requires something to be set up, like adding a Folder Action, for instance.

I don’t understand?

As DEVONthink doesn’t think in “folders” but in “groups”, what should I do, please?

DEVONthink is not watching the folder, but you can attach a Folder Action to a Finder folder. Right-click > Services > Folder Action Setup in the Finder is used to attach the action to the desired folder.

Thank you!

Am I right that if I don’t want DTPO to delete the item, I can delete this line in the script:
if exists theRecord then tell application "Finder" to delete theItem

Right?

Thank you

No problem. Yes, you could.

Thank you very much, all works great.

Before, I used Hazel to copy files automatically into the “Inbox”, but this didn’t OCR them.
Now I am happy :smiley:

Glad to hear it :smiley: