Am I missing something here?

rhkennerly · February 4, 2016, 12:59am

I’ve been vaguely uneasy with Evernote for awhile, the company seems more unstable by the week while all the development energy goes into getting stuff into EN & nothing has changed about actually using the data on the backend since it was developed. Essentially one note at a time goes out, no way to arrange, brainstorm or storyboard or use the data you’ve collected to create something new.

So some folks over at the EN forum had said some good things about DTPO, so I’ve got a trial version set up.

My first shock was a test importing my 300+ business cards from EN. Did I do something wrong? dT stripped out all of the OCR & hand edited data EN had gotten from the photo scan in the EN iOS app. All I have are pix now.

So I took a pix of a business card with the dT iOS app (after I’d connected it & synced). I got a pix.

I tried a Voice Memo in the DT iOS app, and it doesn’t take advantage of the iOS native speech to text Conversion.

So I tried importing my entire EN dB to see how the rest of the conversions go. DT has been at it all day. In DT’s defense, there are 8300 notes to bring over (I’m a Researcher).

Anyway, I’ve read the manual and I’m backtracking through it again. But at this point, compared to the other systems, DT just seems “primitive.” Particularly the mobile systems.

What am I missing? (Oh, and I had a 24 yr. career as a SQL dB admin in another life).

BLUEFROG · February 4, 2016, 1:39am

DEVONthink didn’t do anything. Evernote keeps that data to itself. They don’t make it easy to get your data out of their system and there’s only a limited Applescript interface to pull data with.

Also, the current version of DEVONthink To Go was made as a capturing companion to DEVONthink. Version 2 is a completely new application that has far more capabilities and acts more standalone than version 1.

And DEVONthink is hardly “primitive”.

rhkennerly · February 4, 2016, 1:55am

So I still have the cards in a file & the ocr software installed. How come DT hasn’t detected what they are & started OCR & recreating the contact data?

In EN I have to fight with it to keep it from trying to convert any B Card sized pix into an OCR file.

BLUEFROG · February 4, 2016, 2:07am

Part of Evernote’s hype is their “OCR everything to our servers”.

DEVONthink is not actively polling your databases looking for things to OCR. Files can be converted via right-click > Convert > To Searchable PDF as needed.

Files not coming through this process can also be OCR’d on import through:

- File > Import > Images (with OCR)…
- The scanning interface
- A capture in DTTG synced to the Desktop, when DEVONthink’s Preferences > OCR > Incoming Scans > Convert to searchable PDF is checked.

Bill_DeVille · February 4, 2016, 2:40am

I’m a researcher, too. The database in which I spend most time holds some 30,000 documents, ranging from abstracts to book-length, and rivals the total word count of Encyclopedia. All of those documents remain in their native filetypes, and can be recovered or exported in those various filetypes at any time (unlike Evernote).

DEVONthink makes Evernote seem very primitive by providing artificial intelligence assistants to help file new content, or to compare the contextual relationships in a document to all other documents in the database and make suggestions of others that may be similar. I do my draft writing within DEVONthink, because all the information content of the database is at my fingertips. For example, I can select a few paragraphs I’ve just written, Control-click (right click) on the selection and choose the AI assistant, See Related Text. DEVONthink will suggest other documents that may be contextually similar, so that I can explore how others have dealt with the topic or concept at hand. That’s not only useful for exploration, but a great way to overcome writer’s block. Those suggestions typically pop up within a second, and I can examine them and “attach” those I find useful as new tabs in my draft note.

For good reasons, DEVONthink Pro Office doesn’t automatically OCR every PDF added to a database. But my favorite scanner, the ScanSnap iX500, is paired with DEVONthink Pro Office via ScanSnap Manager, and I do want all my scans to be OCRed as they are created.

If you wish to OCR PDFs already captured to a DEVONthink Pro Office database, that’s easy. In the full Search window (Tools > Search) define a search set that looks for PS/PDF documents that have a zero word count, and save the results as a new smart group. Open the smart group and select all the items listed (if there aren’t thousands), then choose Data > Convert > to searchable PDF. For such an exercise, I would set Preferences > OCR to delete the original document after text recognition, but leave unchecked the option to add metadata after text recognition (as I want the batch processing to go ahead without stopping at each new document and waiting for me to do something). I would also set the dpi of the saved searchable PDFs within a range of 130 to 200 dpi and 50% image quality, to result in view/print quality that I find acceptable and with relatively small file sizes.

As you gain experience with DEVONthink Pro Office, I think you will come to agree with my own opinion of Evernote in comparison to DEVONthink. Evernote is far less sophisticated and far more limited as a document/information tool. Moreover, Evernote is designed like a roach motel; it’s easy to add documents but difficult and sometimes impossible to get them back out in usable form. (I think that was deliberately done by Evernote’s developers.)

gg378 · February 4, 2016, 3:42am

Bill summarized it all.

Concerning the business cards: I believe that has a lot more to do with EN than DT. I assume with “pix” you mean that the cards are scanned into an image format. As far as I know, DT and other major IM software only consider OCR text layers in pdfs. I believe (but don’t know for sure) that the common image formats cannot provide a text layer. At most you can store the recognized text in the metadata. So searches will find the card, but there will be no way to point in the document to the specific location of the text. Of course, since most image files are single sheet, that’s not too severe. Nevertheless, I’d call that half-a**ed OCR. Other programs, including DT, don’t do this and only consider pdf for OCR. Pdf is the archive-safe, transportable, standard solution for OCR. While this is clearly inconvenient for EN converts, I think this is not unreasonable.

EN makes it hard (but apparently not impossible) to export OCR’ed pdfs. That probably makes sense from their point of view, because otherwise the free version of EN could be used as a convenient, free, high-quality OCR engine (within their free upload limits, which are not too generous).

DT will not auto-initiate anything on your imported docs, and that is GOOD! Nothing should touch your important data, until you say so. What if the pdf is an important legal document, and then it was suddenly fiddled with? Also, OCR is CPU-intensive; many of the incoming scanned pdfs don’t need OCRing in the first place. So it is best to leave this to the user.

You can “semi-automate” this easily:
-Make a smart group with the conditions
— Kind is PDF/PS
— Word count is less than 5

This will show you in a nice list all the un-OCR’ed docs. You can then batch-convert them. I go further:

Replicate all pdfs that fall into the smart group above but which, for whatever reason, you don’t want/need to be OCR’ed, into a group named “dontOCR”.

— then add the condition “tag is not dontOCR” to the smart group

Then those unwanted, un-OCR’ed files will not clog that OCR to-do list.
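
To make the logic of those three conditions concrete, here is a minimal Python sketch. It does not use any DEVONthink API; the `Record` class, its fields, and the sample file names are all invented for illustration, and it only models how the smart group conditions combine.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical stand-in for a database item (not a DEVONthink type)."""
    name: str
    kind: str          # e.g. "PDF", "PS", "JPEG"
    word_count: int
    tags: set = field(default_factory=set)


def needs_ocr(record, max_words=5, skip_tag="dontOCR"):
    """Mirror the smart group: PDF/PS kind, fewer than 5 words, not tagged dontOCR."""
    return (
        record.kind in {"PDF", "PS"}
        and record.word_count < max_words
        and skip_tag not in record.tags
    )


# Invented sample data.
records = [
    Record("scan-001.pdf", "PDF", 0),
    Record("report.pdf", "PDF", 1200),
    Record("signed-contract.pdf", "PDF", 0, {"dontOCR"}),
    Record("card.jpg", "JPEG", 0),
]

todo = [r.name for r in records if needs_ocr(r)]
print(todo)  # ['scan-001.pdf']
```

The image file is left out because only PDF/PS kinds are considered, and the tagged contract is left out because of the exclusion condition, which is what keeps the to-do list short.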

My suggestion is to play with DT for a while and seek some advice on the forum and you will get good answers.

eboehnisch · February 4, 2016, 8:11am

That is correct. The iOS speech recognition works from the little microphone symbol in the keyboard and is technically restricted to that. Apple has not opened it up so that you could, e.g., send it a sound file (voice memo) and get a text transcript back. Other products run their own online service for this, technically recreating what’s already there.

FROBGOBLIN · February 4, 2016, 11:32am

can you export ocr’d stuff from evernote? yes.
discussion.evernote.com/topic/4 … -evernote/
are you locked into evernote? no. in the menu, there is an easy way to export everything as .html
is any of this easy? it depends on what you have in your account. for exporting ocr’d pdfs, it appears to be relatively easy, but you may have to do it note by note to get those ocr’d files (i’ve done it this way before). for exporting images (.jpeg, .png, etc.) you can use the api, but that might not be a terribly enjoyable experience. i always assume i will have to leave an app someday (unfortunately, i’ve always been correct) and i designed my usage so that i could easily get out of evernote – that meant ocr on my own for pdfs before i put them in there and making no use of the image ocr or tags. i’m not a huge fan of relying on stuff that i can’t (easily) take with me later.
is devonthink primitive? i don’t think so. it’s different. that’s all. many tasks are simply impossible to accomplish in evernote. if i decided to abandon devonthink tomorrow, it would literally take me no time to move out of it, because everything is “indexed.” If I drop 100 files into it, I can probably sort them into groups in a couple of minutes using the “artificial intelligence.” if i want to sync securely (no cloud) with my ipad even when i have no internet connect, i just pair my phone and use “bluetooth sync.” there are all kinds of things that evernote cannot do, will not do, and doesn’t want to do. the same could be said for devonthink. play with it a while, keep asking questions, and see how it works for you.

as far as “portability” goes, i’d say devonthink is about as portable as you can get and, while evernote does a pretty good job, a lot of things are unique to it and simply won’t make the trip out of it into another app very well. that’s the price you generally pay with any app. devonthink is a rare one that has no barriers at all. in this sense, it might be the most “advanced” app out there

by the way, you’ve got the head of the company replying to you in the post above mine. it’s nice to have access to the folks at devonthink when you have a problem / concern / question. they may not always agree with you, but they are here, and they will listen.

rhkennerly · February 4, 2016, 6:18pm

I’ll respond to the other comments later. I’m in the field today.

However, I checked the MBP (latest OSX) at lunch & 28 hrs after starting the import process it is acting as though it is STILL importing my 8300 EN files. Neither DTPO nor EN is shown as having stopped responding, but file totals in DTPO have not updated since the process began. There are still pop-ups for EN & DTPO appearing & disappearing & the EN icon bounces.

I got impatient the last time I tried to import EN and, on rebooting, had thousands of orphaned files in DTPO, so I determined I’d be patient this time.

Any advice? Abort? Let it run? Other ideas?

Whoever said EN was the “Roach Motel” of data systems got that right.

BLUEFROG · February 4, 2016, 7:04pm

If the dock icon is still bouncing, I’d let it run. I have no metrics to determine the time on an import of this size, but if it appears to be responding, I’d let it go. Sorry.

rhkennerly · February 4, 2016, 7:38pm

Tnx.

rhkennerly · February 4, 2016, 10:14pm

I might be the first person in history to timeout your 150-hr trial period while still importing EN. ;->

A couple of points on the “primitive” comment: I was referring to the iOS app. Unfortunately, most of my work is field work in a mobile office: iFon, iPad, and a Bluetooth Keyboard, a couple of pretty good quality Zoom mics for the iPhone (one for interviews, the other for videos), some tiny tripods, all in a rucksack.

I’ve come to depend on “on the fly” sync from the field to safeguard my material, wifi when I can, cellular when I must.

I don’t doubt the sophistication of DTPO, but don’t doubt the ease or sophistication of EN either. You can decide where and if “primitive” applies:

Attached are two screen shots of one business card I took with the EN iOS iFon app and the automatic OCR (a top & bottom half) plus adding it to my Contacts. This was detected effortlessly.
Today I took a scan of a Death Certificate using the Evernote Scanable app. This certificate had all of the usual state paper filigree printing and security paper issues. I’d just read the comment on the board that EN “claimed” to scan all the PDFs with OCR, so I found the name of the cemetery “Resthaven” and searched through my entire database with no hits. I scanned the document and sync’ d up & searched again. No results. However, three hours later I checked and EN returned “Resthaven” in the PDF of the Death Certificate.
Also, I read the “File This” thread with interest last night, since I use it. And today a question came up about a charge I’d made back in September. Even though I’d never opened the PDF VISA statement that File This had swept into my EN account, when I searched the VISA folder in EN for the name of the company, the charge was highlighted when I downloaded the PDF to my iPad in the field.

I think it’s clear EN OCRs & indexes PDFs (at least those of Premium account holders).

So, as I said, every effort that EN has put into the front end of sweeping up and indexing information is top drawer. There is nothing wrong with admitting that.

Their problem is, as someone above put it, the Roach Motel status of the backend. If you collect to no purpose, or not much purpose, or are only concerned with archiving, EN is a great product.

It remains to be seen if DTPO is going to be a spectacular backend. My problem, like most contractors, is that I live and die by the job. Taking time out to learn the ins and outs of DTPO is going to cost me big time if I change over.

I really wish you had a pre-populated sandbox to test drive.

What I’m really interested in is the “connecting the dots” aspects of DTPO, I spend a lot of time drawing out timelines and diagrams of actions/inactions or missed opportunities before I create my final reports. That’s where EN really fails, at the production end.

rhkennerly · February 4, 2016, 10:16pm

Didn’t realize I could only add one photo. Here’s the other:

BLUEFROG · February 4, 2016, 11:19pm

Just a note: If I snapped a picture of a business card with DEVONthink To Go on my iPhone and synced it to the Desktop, DEVONthink would OCR the file when it received it. It will not add it to my Contacts. This doesn’t mean there aren’t things for us to consider but that kind of thing is generally not the focus of our software or our User base.

In fact, I just imported with OCR the JPEG you posted and immediately found the card with a search string of frank m*aci (because I couldn’t recall his last name but remembered it started with “m” and ended with “aci” )
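
For anyone unfamiliar with wildcard searches, here is a rough Python sketch of the idea behind a query like that. It uses Python’s standard `fnmatch` module, not DEVONthink’s search engine, whose exact syntax and matching rules may differ; the `matches` helper and the card text (including the surname) are invented for illustration.

```python
from fnmatch import fnmatchcase


def matches(query, text):
    """True if every query term matches some word in text; * stands for any run of characters."""
    words = text.lower().split()
    return all(
        any(fnmatchcase(word, term) for word in words)
        for term in query.lower().split()
    )


# Invented OCR text; the surname is made up for this example.
card_text = "Frank Marinaci Sales Manager"

print(matches("frank m*aci", card_text))                     # True
print(matches("frank m*aci", "Frank Miller Sales Manager"))  # False
```

The point is only that the pattern pins down the start and end of the word and leaves the middle open.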

rhkennerly · February 5, 2016, 9:49am

That very well could be…my laptop is still grinding on the EN transfer so I haven’t tried to sync it with the MBP & run the DT OCR over it. But, as noted, I’m really focused on mobile office field work & preserving data.

As I mentioned, the MBP is still grinding on the EN import. I don’t have time to fool with it. I need to be on the road in about 15 min. I may have the rest of the afternoon to look at it, but I can’t let it go on much longer.

Two things I noted. First, it’s so occupied I couldn’t get past the screensaver login screen in a reasonable amount of time.

Second, I went to the EN web interface & noticed my file count has more than doubled from 4300 to 9660. I couldn’t find a file that looked “swollen” so it must be hidden.

All in all, looks ominous.

BLUEFROG · February 5, 2016, 2:19pm

I was referring to our mobile product, not syncing Mac to Mac.

I would force quit it then. Also, note that DEVONthink is doing nothing more than asking Evernote for data then making records in its own database.

FROBGOBLIN · February 5, 2016, 2:37pm

conversions are sometimes tricky and you have to budget a lot of time not only for the process of moving from app a to app b, but also for getting the hang of a new app. it isn’t necessarily going to be accomplished without bumps along the way. word to pages, pages to pdf, etc. often run into difficulties. and, each app has different strengths and weaknesses so that a seemingly simple and obvious feature in one (vertical text support for asian languages in word) is surprisingly absent in another (pages). it’s the nature of the beast.

devonthink mobile is adequate for my needs, but it isn’t designed to stand on its own, and it hasn’t received an overhaul since it was originally released years ago, so it may not perform as well for your stuff. evernote’s approach to development is to be everywhere with amazing sync and regular overhauls of the app (the rapid development cycle often resulted in product releases with major deficiencies). devonthink generally takes a more cautious, methodical approach, and the focus has clearly been on the desktop experience. it is not well-suited to a mobile-driven workflow and i think devonthink in general is going to be a tough fit for such a use case – it isn’t cloud based. i consider this a huge strength. others might consider it a weakness. It’s a matter of perspective.