I realize this subject has been raised before, but I wondered whether there is any new thinking on using DEVONthink (DT) as the primary tool for converting handwritten notes into plain text files.
Currently I’m using ChatGPT to transcribe scanned JPEG files. I haven’t done a proper quality check, but a superficial look at the results indicates a high degree of accuracy.
My concern with this approach, given how little I know about the subject, is that I have no idea what happens to the processed material on the ChatGPT side. To be on the safe side I would prefer a local solution. DEVONthink 3 is my preferred tool for the job, but my experiments indicate that its OCR cannot successfully convert handwritten notes to text files.
Does anyone have a working solution in DEVONthink that will transcribe images of handwritten notes to plain text files?
NOTE: In my case I’m using images scanned from a paper notebook. I’m not dealing with soft/electronic handwritten notes.
What’s your handwriting like? That’s going to determine how “easy” it is to OCR, and therefore how much horsepower a successful tool will need.
Also, what do you need the results for? I’ve struggled with this question for pretty much my entire career, thanks to a combination of a terrible scrawl and a strong preference for handwriting. In my experience, the GoodNotes engine does a decent job for paragraph-length material, but wholesale conversions just aren’t worth the bother. Topical tagging is almost as searchable with much less effort.
I have tried writing on iPad with pencil, ‘paper like’ tablets, pens that supposedly pick up your writing as you use them on special paper, but found nothing better than pen and paper itself. Sadly, my handwriting is like a spider on drugs and, at times, even I struggle to read it (a legacy of too many years at a keyboard). OCR most often gives up or comes out with some hilarious interpretations.
I keep my scanned writing as it is and either annotate/summarise or create a typed version if it is important. This forces me to condense my scribbles to what is essential. Think of it like people condensing their ideas into atomic notes.
I still write as, for me, somehow the act of writing (rather than typing) gets ideas into my memory.
None of this helps your original question, but as @kewms points out, it is mainly a question of how good or bad your handwriting is. I have a long-term goal to re-learn how to write neatly.
Handwriting: Poor, but everything is in capitals so there is no cursive to deal with.
Reason: Fiction writing. My preferred method is ink on paper. I need to transfer my originals into text for either DEVONthink 3 or Scrivener.
Quantity: With ChatGPT I’ve uploaded as many as six handwritten pages as JPEGs. They are processed as one file. On screen the results come back as Markdown. I’ve also had the results returned as a plain text file.
For note taking, typing works well for me or perhaps using the Apple Pencil/iPad but for fiction writing I find the slow pace of my handwriting helps with forming ideas as I go.
I’ve also found that dictating handwritten material works well, but just giving the files to ChatGPT takes far less time and requires less clean-up editing.
FWIW, I’ve found the “transcription edit” of handwritten material is an important part of my process. It’s the first time I see what I actually have, as opposed to the image in my head that I may or may not have successfully captured. YMMV.
Thank you for the files! On macOS Ventura, the transcription of handwritten notes contains a lot of the usual OCR issues (e.g. wrong case or mixing up 1, I and l). On Sequoia it’s more accurate but refuses to recognize unclear notes (e.g. different orientation or color), and it’s unclear how this will behave in the public release.
Additional Thought
In my limited experience the prompt is also important in getting the output you want. In my case there are at least four processes to produce what I want:
1. Transcribe the handwritten text
2. Convert the upper case to lower case
3. Capitalize appropriately
4. Combine a series of images into one file
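For anyone who would rather do the clean-up locally instead of in the prompt, the last three steps above can be sketched in a few lines of Python. This is only a rough sketch: it assumes each page has already been transcribed to a string by whatever tool handles the first step, and the function names (`combine_pages`, `sentence_case`) are made up for illustration. The capitalization is purely rule-based (sentence starts and the pronoun "I"), so names and places would still need a manual pass.

```python
import re


def combine_pages(pages):
    """Join per-page transcriptions into one continuous document."""
    # Drop empty pages and trim stray whitespace at page boundaries.
    return " ".join(page.strip() for page in pages if page.strip())


def sentence_case(text):
    """Lowercase the text, then capitalize sentence starts and the pronoun 'I'."""
    lowered = text.strip().lower()
    # Capitalize the first letter of the text and of anything that follows
    # '.', '!' or '?' plus whitespace, allowing for an opening quote mark.
    capitalized = re.sub(
        r"(^|[.!?]\s+)([\"'“‘]?)([a-z])",
        lambda m: m.group(1) + m.group(2) + m.group(3).upper(),
        lowered,
    )
    # A standalone "i" (including "i'm", "i’ve") becomes "I".
    return re.sub(r"\bi\b", "I", capitalized)


if __name__ == "__main__":
    # Hypothetical all-caps transcriptions of two notebook pages.
    pages = [
        "THE RAIN STOPPED. I WALKED HOME",
        "AND SLEPT. WAS IT LATE? I'M NOT SURE.",
    ]
    print(sentence_case(combine_pages(pages)))
```

Pages are joined with a single space because a page break in a notebook often falls mid-sentence. The combined string can then be saved as a plain text file and imported into DEVONthink or Scrivener like any other document.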
Here is the guidance from ChatGPT 4o, although I found I had to add an instruction to apply appropriate capitalization.
Putting It All Together
Objective: I need the transcription of my files to be in plain text format.
Formatting Details:
1. The entire text should be in lowercase letters.
2. The transcription should be returned as one single document, not separated into pages or sections.
3. There should be no code formatting or windows (e.g., CSS, SQL).
Example of Desired Output:
Also laughing about “spider on drugs”. This is not about OCR scrambling the conversion of handwriting, but about handwriting itself.
I was taught cursive in school, but my handwriting had deteriorated after decades on a keyboard. Une amie introduced me to notebooks used by French schoolchildren with two faint lines between darker ones, which is astonishingly effective in teaching/improving cursive. Within a matter of weeks, writing away at random hours when I couldn’t face a keyboard for another minute, I was writing like I used to – haven’t tested it with OCR, but I bet it can handle it (or at least better). I searched on handwriting, looked at Images, and picked a sample I liked and copied upper and lower case, etc. Then picked a poem or a few paragraphs of text and copied that, and Voilà – readable writing!
If you’re curious, look for Clairefontaine, French-Ruled (Séyès) notebooks. And BTW, Clairefontaine paper has been made in France since 1858, and is preferred by anyone who likes a fountain pen. If you want a nice notebook, see Rhodia; they use Clairefontaine paper: no bleed, no shadow on the other side. Beautiful to write on.