Is there a way to OCR inline images?

Mostly only for the work one.

@cgrunenberg : Interestingly, you can use a PDF with an image link, e.g., ![](Assets/Screen%20Shot%202022-01-19%20at%2009.42.22%20AM.pdf) and it will display as an image :thinking:, however, you can’t drag and drop a PDF and have it insert like images do. Not sure if this is something we want to add support for… or if it’s even intended.

I wouldn’t worry so much about the Assets groups. Your screencaps are all using a common naming convention, so I’d use that as the criteria.
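To make the "common naming convention" idea concrete, here is a minimal sketch of matching on the file name alone. The pattern is an assumption based on the default macOS screenshot names (such as the one in the link above); `SCREENSHOT_RE` and `is_screencap` are hypothetical names, not anything built into DEVONthink, and a smart rule would express the same criterion in its own terms.

```python
import re
from urllib.parse import unquote

# Assumed naming convention for default macOS screenshots, for example
# "Screen Shot 2022-01-19 at 09.42.22 AM.png" or "Screenshot 2022-01-19 at 09.42.22.png".
SCREENSHOT_RE = re.compile(
    r"^Screen ?[Ss]hot \d{4}-\d{2}-\d{2} at \d{1,2}\.\d{2}\.\d{2}(?:\s?[AP]M)?\.\w+$"
)


def is_screencap(filename: str) -> bool:
    """Return True if the (possibly URL-encoded) file name follows the assumed convention."""
    return SCREENSHOT_RE.match(unquote(filename)) is not None
```

If your screenshots are renamed by another tool, the pattern would need to be adjusted to whatever convention that tool uses.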

Here’s an example with a standard macOS screencap…

And the tagging is optional.

This is a serious testament as to just how flexible DevonThink is. The trick with OCRing images is really slick! Thank you for the suggestion and all the details.

But…

While this is a great hack to get closer to the end goal of effortless note taking, it’s still a hack that leaves some issues on the table:

  • I don’t think in Markdown when I’m taking notes. I’ve really tried, but I just can’t use it in real time. (Too many years with WYSIWYG tools have broken me…)
  • Images should be inline (and OCRed) in the context of the document they are included in for searching context. Using the Links inspector to see what document is using an image is useful - but loses that context.
  • Lastly, if you delete a note that has images referenced in this way, you orphan a bunch of images instead of deleting them. Yes, you can probably search for orphaned images and delete them, but now I’m doing yet more work the computer should be doing for me.
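For the orphaned-images point, the search could be sketched roughly like this outside of DEVONthink, assuming the notes are plain `.md` files on disk (for example an indexed folder) and the images live in a single Assets folder. The function names are hypothetical, and images are matched by file name only, so two different images with the same name would be treated as one.

```python
import re
from pathlib import Path
from urllib.parse import unquote

# Matches Markdown image links such as ![](Assets/shot.png) and captures the link target.
IMAGE_LINK_RE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)")


def referenced_assets(notes_dir: Path) -> set[str]:
    """Collect the file names of all images linked from Markdown notes under notes_dir."""
    names = set()
    for note in notes_dir.rglob("*.md"):
        text = note.read_text(encoding="utf-8", errors="replace")
        for target in IMAGE_LINK_RE.findall(text):
            # Links are often URL-encoded (%20 for spaces), so decode before comparing.
            names.add(Path(unquote(target)).name)
    return names


def orphaned_assets(notes_dir: Path, assets_dir: Path) -> list[Path]:
    """Return files in assets_dir that no note links to any more."""
    used = referenced_assets(notes_dir)
    return sorted(p for p in assets_dir.iterdir() if p.is_file() and p.name not in used)
```

This only lists candidates; whether to delete them is still a manual decision, which is exactly the extra work being complained about.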

I have a few thoughts for how to proceed with editing.

  • Make the markdown editor truly WYSIWYG in real time and allow for inline embedding of objects including images. Think of something like the Craft (note app) or Slab (content manager) editors.
  • Or put in an enhanced RTF editor that doesn’t depend on Apple’s libraries, so you can deal with all the limitations.
  • I would love to see OCR on all images as a built-in system function. Including the cute piglet pictures and every single other image, including embedded, anywhere in DevonThink. Too much good data is getting lost in unsearchable images and CPU is cheap, especially if done in the background.

Thank you again to everyone that helped on this thread.

Why not? OCR will (obviously) find nothing and then mark the picture as done. A few seconds of CPU per image is relatively cheap.

Enhancement: Do OCR in the background or when the machine is idle to minimize user impact.

Have a great day!

You can inline images in MD or formatted notes. However, this increases the size of the documents considerably. Also, how exactly do you imagine the “OCRed” part to work? Currently, OCRing images converts them to PDF, AFAIK. Do you want to inline the PDF? Doable, with the same consequences as for images, but then you have a soup of Base64 characters that is not searchable.

As I said before: You can embed anything with a mime type in a MD file (using data URIs). But that will make editing the MD file incredibly hard because you’ll have huge blobs of incomprehensible stuff in them.
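To illustrate the data URI point, here is a minimal sketch that turns raw image bytes into an inline Markdown image. The helper names are made up for this example. It also shows why the document grows: Base64 needs 4 characters for every 3 bytes of input.

```python
import base64


def image_to_data_uri_markdown(data: bytes, mime_type: str = "image/png", alt: str = "") -> str:
    """Build a Markdown image whose target is a Base64 data URI instead of a file path."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"![{alt}](data:{mime_type};base64,{encoded})"


def base64_length(n_bytes: int) -> int:
    """Length of the Base64 text for n_bytes of input (4 characters per 3 bytes, padded)."""
    return 4 * ((n_bytes + 2) // 3)
```

A 300 KB screenshot therefore becomes roughly 400 KB of text on a single line of the Markdown file, and none of that text is meaningful to a search index.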

What you want has been discussed already multiple times. MD is a format for structured, semantic markup. Like HTML. It is not, was never meant to be and hopefully never will be, a replacement for wysiwyg formats. If someone wants wysiwyg, they should use a format that was explicitly meant for that: PDF, Pages, Word, Star/Open/LibreOffice, whatever. Forcing MD into something it was not meant to be is not the way, in my opinion.

In my opinion, RTF is a dead end. There are enough formats already permitting the same functionality (and then some). Today, even Word is not as proprietary as it was, so what would be the point of improving on a format that is rapidly losing its usefulness?

AFAICT, you can already use OCR on all images. Built-in.

After some time and trying different things suggested above, I would like to revise my original requests.

Use case update: Rapidly taking notes during online or in-person meetings. This includes the inclusion of screenshots, PDFs, and other artifacts. PDFs are already OCRed. I would also like images OCRed, just like PDFs, so I can search everything. Lastly, I want to be able to “just take notes” in the moment. I can’t be fiddling with the tool.

I agree with prior comments that RTF is a technological dead end. Markdown is “where it’s at” moving forward. But as noted above, Markdown, by itself, is insufficient.

Some really cool suggestions were made above how to make Markdown and images mostly work. Thank you for all the thought that went into them.

So, what went wrong?

After weeks of trying, I just don’t think in markdown codes. (I don’t think in HTML, Troff, or LaTeX either!) Having to stop and think what the codes are breaks the note taking flow.

A single central Assets directory for images will grow very large over time. Also, if notes are deleted, there is no good way that I can find to automatically delete the associated images in the Assets folder. Also, in search, OCRed assets are disconnected from the Markdown document that references them.

Revised Requests to make taking notes “just flow”:

  1. Evolve the built-in Markdown editor a bit more to realize and display markdown formatting in real time. No more separate preview window. Typora is one existing example of this. So is the Slab content management system.

  2. For us markdown impaired folks, please add a proper formatting toolbar for headings, lists, bold/underline/strikeout, etc. to apply the markdown formatting. Craft.do and the Slab content management system do this well - copy from them. Continue to allow direct markdown input too.

  3. OCR for images should happen just like OCR for PDFs, i.e., automatically, wherever they are located in DevonThink’s database.

  4. Finally, full textbundle format support in DevonThink. Now markdown and the linked assets (images and other included artifacts) can all be kept together and treated as a unit. And searched as a unit.

RESULT: Markdown becomes notably better than RTF for taking notes. Taking notes and including artifacts becomes a very simple workflow for the user. DevonThink hides the complexity under the covers so the user (me!) can “just take notes”. And it’s all cross-platform portable since textbundle and markdown are both well-known formats.

Before anyone says it: I am aware that DevonThink can “kind of” support some of this manually today - look no further than the excellent suggestions earlier in this discussion thread. The problem is that DevonThink currently “gets in the way” of the note taking flow instead of streamlining the process.

——

Thank you for listening - and thank you for making a product that I like enough to spend this much time commenting on!

Amen to that! I am also struggling with exactly this. I like the tool, but for quick note taking these additions would be so great.


What is good is that you can pick almost any note-taking tool you prefer and save the files (or index them) in DEVONthink. The tool just has to save files (and not have its own special file saving method).

But the problem remains when you are taking a lot of screenshots… you cannot easily add these to DT if you are using Markdown or something else.

If I was doing a lot of screenshots into a notes document, I’d use Pages or Word, which are both designed to take embedded images and save those files somewhere (eventually reaching DEVONthink, I guess). Other products are available that do the same. Screenshots (image files) and MD, as discussed so often here, are sort of an issue, as MD is not really designed for that sort of thing. But of course some products do their kludging to make it work.

Up to you how hard you want to make it. Me … I focus on the note-taking and content in those notes.


Yes, I could use any tool of choice for taking notes. Folks above have suggested Pages or Word. Not only is that using a cannon to kill a mosquito, but they aren’t the best tools for taking notes to begin with.

Why should I use “yet another tool” when I really should be able to do this function natively in DevonThink? It’s not a huge stretch from what already exists. The textbundle format, as documented in http://textbundle.org/, cleanly addresses how to keep the markdown text and all the related assets such as images together in a standardized manner.
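As a rough illustration of how little is involved, here is a sketch of writing a textbundle by hand. It assumes the layout as I understand it from the spec: a folder ending in `.textbundle` that contains `info.json`, a `text.markdown` file, and an `assets/` subfolder. The `write_textbundle` helper is hypothetical, and only a few of the `info.json` keys are filled in, so check the spec before relying on it.

```python
import json
import shutil
from pathlib import Path


def write_textbundle(bundle: Path, markdown: str, assets: list[Path]) -> Path:
    """Write a note and its images as one .textbundle folder (layout assumed from the spec)."""
    assets_dir = bundle / "assets"
    assets_dir.mkdir(parents=True, exist_ok=True)

    # The note text lives in a single file named "text" plus an extension.
    (bundle / "text.markdown").write_text(markdown, encoding="utf-8")

    # Minimal metadata; other keys described by the spec are left out of this sketch.
    info = {"version": 2, "type": "net.daringfireball.markdown", "transient": False}
    (bundle / "info.json").write_text(json.dumps(info, indent=2), encoding="utf-8")

    # Images travel with the note, so the Markdown can link to them as assets/<name>.
    for asset in assets:
        shutil.copy2(asset, assets_dir / asset.name)
    return bundle
```

Because the note and its images share one folder, deleting the bundle removes the images as well, which would take care of the orphaned-assets problem from earlier in the thread.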

In the end, we still go right back to the original problem: The process and tools “getting in the way” of the note taking flow. Can we please address that?

Thank you!

Why should you? DT is a document manager that also lets you take notes. You can store images in it too – should it therefore become a Photoshop clone?

Different tools have different (dis-)advantages and focus on different things. Since you can integrate other programs easily with DT, I don’t see the point of copying their features in it.


Define this a bit more clearly, please, e.g., if you’re rapid fire screencapping things or taking one at a time in a more leisurely manner.

chrillek

Why should you? DT is a document manager that also lets you take notes. You can store images in it too – should it therefore become a Photoshop clone?

The photoshop clone comment is a bit much. :wink: That would be making DevonThink into something it’s not. This is taking what DevonThink already has and building on it to make something new and powerful.

Why would we want this?

  • Simplicity for users (me for one…)
  • Equivalent capability doesn’t appear to exist elsewhere.
  • Enabling a more modern replacement for RTF in DevonThink.
  • macOS / iOS portability.
  • Useful capability, which is good for acquiring new users and keeping current users happy.
    • Money: New capabilities help justify product upgrade fees, which are eventually needed to help keep Devon Technologies a going concern.

I’m not gnaffle, but I have the same issue.

I tend to take notes during videoconference meetings (a LOT of them these past 2 years) and also during training. The process is grabbing a screenshot of a slide or sequence of slides interspersed with the text notes. In a presentation, it can be several screenshots per minute if the presenter is hustling. I’ve been using RTF notes for this, but it’s definitely limiting.

Markdown is not designed to do what you want. Use something fit for purpose if you want fast, efficient mixing of images and text in the same document/file.

Good news is that DEVONthink already supports use of other tools. We’ve said that before.


Full support for this format is actually planned for future releases.


The less friendly word for it would be “feature creep”.

Regardless, I’m not so sure what “This” really is. DT already can automatically run OCR on imported images (cf. “File/Import/Images with OCR” in the menu).

Then there was this request

Which raises the question of how I would see the markup in the markdown file. In the case of Typora the answer is simple: I do not see it. It’s just there, and I can remove it by pressing backspace – contrary to any idea of discoverability.

Which is in my opinion the worst thing you can do: have a textual markup and hide it from the user. They’d have everything literally at their fingertips and it is taken away from them? Thanks, but no thanks.

So moving your hand away from the keyboard to reach for a toolbar with the goal of achieving a fast workflow? I can’t think of anything that slows me down more than being forced to move the mouse cursor away from the current line of text in order to click on a button.

In my version of DT, OCR does not happen automatically for PDFs. I can make it happen with a smart rule or by right clicking on the record. But so can you with an image. What exactly makes OCR happen automatically for you with PDF?


For reading it does work now.
I used to have a script that copied the markdown to one folder and the assets to another.
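I no longer have that script, but the idea was roughly the following. This is a reconstruction under assumptions, not the original: it expects the usual textbundle layout (a `text.*` file plus an `assets/` folder), and `split_textbundle` is a made-up name.

```python
import shutil
from pathlib import Path


def split_textbundle(bundle: Path, notes_dir: Path, assets_dir: Path) -> Path:
    """Copy a bundle's text file into notes_dir and its assets into assets_dir."""
    notes_dir.mkdir(parents=True, exist_ok=True)
    assets_dir.mkdir(parents=True, exist_ok=True)

    # The text file is named "text" with some extension (raises StopIteration if missing).
    text_file = next(p for p in sorted(bundle.iterdir()) if p.is_file() and p.stem == "text")
    note_path = notes_dir / f"{bundle.stem}{text_file.suffix}"
    shutil.copy2(text_file, note_path)

    bundle_assets = bundle / "assets"
    if bundle_assets.is_dir():
        for asset in bundle_assets.iterdir():
            if asset.is_file():
                shutil.copy2(asset, assets_dir / asset.name)
    return note_path
```

Note that the image links inside the copied Markdown still point at `assets/…`, so they only resolve if the target assets folder is reachable under that relative path.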

But that is good news
