"Summarize Highlights" doesn't work properly on PDFs

Hi there,

I’m pretty new to DT Pro (attempting to switch from Evernote) and still finding my way around.

I find “Summarize Highlights” potentially very useful, but it doesn’t seem to work as I would expect.
This is what I do:

  1. I print an email from Gmail (Save as PDF)

  2. I read and highlight in DT

  3. I run “Summarize Highlights”. I tried all three formats (sheet, rich text, md), but some letters are missing, which makes the summary not useful

  4. However, if I run OCR on the PDF first and then “Summarize Highlights”, it’s fine.

Question: is there a way to skip step 4 and still have useful highlights?

Thanks
Cheers

Welcome @Abdt

Using the web interface of Gmail isn’t the same as using an email client. The file coming out of the browser does not appear to have a proper text layer.

We strongly suggest using a proper email client. Apple Mail is the best of breed for inter-application communication.

Ahem……MailMate

MailMate is a great app, but certainly (1) the UI and experience are not for the faint of heart, and (2) its abilities in inter-application communication are not for the uninitiated.

Apple Mail has a simple and straightforward UI, and we have an Apple Mail plugin that works with it specifically.

Very good points, sir. I just have never thought of Apple Mail as best at anything, let alone “best of breed” :grinning:

I’d suggest taking a second look then. It’s not perfect (but what app is?), but it’s still a solid performer.

  • Flashy? Nope.
  • Trying to be something it’s not? Nope.
  • A good, reliable email client that plays well in the Mac ecosystem natively? Absolutely. :slight_smile:

May I add that since its inception this program has not been able to manage IMAP subscriptions (though there’s even a dialog for them ;-). On the plus side, I seem to remember support for S/MIME signing/encryption. Which is probably even more esoteric than IMAP subscriptions :wink:

Isn’t Mail one of the programs made by Apple that lack a “share” menu/button? Right, it is. How to save an e-mail to Notes, another part of the Mac ecosystem? I know, I could pseudo-print it to Preview and share to Notes from there. Which is a hack, in my opinion, not an example of “playing well”. And the scripting part of Mail is … well, not so well, I’d say. But maybe it is only badly documented :wink: In any case, there’s no interface to programmatically sign/encrypt an e-mail, it seems. But who’s doing that, anyway :sob:

I’d say things like signing / encrypting emails are a minority activity. :slight_smile:

As I said: who’s doing that, anyway

Hey @BLUEFROG

Thanks for your reply.

I understand that Apple Mail is the native tool; however, for my corporate Gmail account this is a no-go.
That means I will use “Summarize Highlights” with limited capabilities (no summary), unfortunately :frowning:

cheers

Could you please send one or more of such highlighted documents to cgrunenberg - at - devon-technologies.com? In addition, which version of macOS do you use? Thank you!

To be precise: You will be using Summarize Highlights working as expected—just on damaged PDFs.

I am mentioning this not because I’m pedantic but to point out that it’s not DEVONthink’s fault you don’t get a proper summary from PDFs generated from GMAIL mails. On the contrary: It’s DEVONthink that will solve your problem!

I’d suggest the solution you already found, just automated:

  • Create a folder in Finder where you save all your GMAIL PDFs to.
  • Have DEVONthink index that folder.
  • Create a Smart Rule in DEVONthink that runs OCR on every new PDF in that folder.
  • Let the same rule move the OCRed PDF into your database and into the Global Inbox or wherever you want it to be. You might even have a number of Smart Rules sorting the PDFs by content to different locations.

The OCR will take a moment, true, but unless somebody sends you whole novels by e-mail it won’t be significantly more than just a moment. Worth it, I’d say, as then you will have a collection of PDFs with congruent image and text layers. Which is crucial for every future use case, DEVONthink involved or not.
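For what it’s worth, the order of operations in those four steps can be sketched outside DEVONthink in a few lines of Python. This is only an illustration of the flow (watch a folder, OCR every PDF, then move it), not how a Smart Rule is implemented. The names `process_watch_folder` and `run_ocr` are made up for this sketch, and `run_ocr` is a placeholder for whatever actually performs the OCR (DEVONthink, in this thread).

```python
import shutil
from pathlib import Path
from typing import Callable, List


def process_watch_folder(
    watch: Path, inbox: Path, run_ocr: Callable[[Path], None]
) -> List[Path]:
    """OCR every PDF found in `watch`, then move it to `inbox`.

    `run_ocr` is a hypothetical placeholder; no real OCR happens here.
    """
    inbox.mkdir(parents=True, exist_ok=True)
    moved = []
    # Only files ending in .pdf are picked up; everything else is left alone.
    for pdf in sorted(watch.glob("*.pdf")):
        run_ocr(pdf)                         # step 3: OCR each new PDF
        target = inbox / pdf.name
        shutil.move(str(pdf), str(target))   # step 4: move the result
        moved.append(target)
    return moved
```

Running OCR unconditionally mirrors the suggestion above: the Gmail PDFs do have a text layer, just a damaged one, so every PDF in that folder gets OCRed.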

@suavito hi,

thanks for your reply, really appreciated. The suggested steps make sense and I will try that approach.

Though I’m pretty new to DT, I do believe it’s a very powerful tool. However, I was not aware, tbh, that Gmail generates damaged PDFs … (ok, just another entry in my list of “frustrations with Gmail”)

Cheers

hi @cgrunenberg ,

I use:

  • macOS Catalina 10.15.7
  • Google Chrome Version 90.0.4430.212 (Official Build) (x86_64)

I am not able to send you those particular PDFs, but I will send some mail to myself, print it to PDF, highlight it, and send those.

thanks
cheers

This would be great, thank you!

It’s been a while …

So I thought I had this working, but today it seems it crashed. I tried to recreate it, but ended up pulling my hair out:

So this is my rule:

[Image: screenshot of the smart rule]

“OCR_for_Devon_Local” is a folder that has been added to the Devon database via “Index Files and Folders”

What I do:
From Chrome I print an email and save it as a PDF into the above folder.
Then alerts are displayed and, as a result, I have the PDF file in the Devon Inbox.
I make a few highlights and then try to summarize highlights, but obviously the PDF has not been OCRed.

If I OCR manually and then highlight → summarize highlights, it’s OK.

What am I doing wrong?

Thanks guys
Cheers

For a start, I think you want All of the following are true, followed by Kind is PDF/PS and Word Count is 0 in order to trigger the rule on an unscanned pdf.

I’m a little confused by the folder into which you are importing and its relation to the folder you quote. I’d import into the global inbox and, once the file is OCRd, move from there to the folder you quote—but perhaps I’ve misunderstood something.

Stephen

Hi @Stephen_C , thanks for your reply.

First, why does Word Count need to be 0?

Second, I think it should be Any of the following are true, because sometimes I want to drop an image in there and OCR it (so I get functionality similar to Evernote).

On the folder: it is my dedicated folder, so I put in there emails printed from Gmail and other stuff that needs OCR. It seems that print-to-PDF from Gmail produces PDFs that need to be OCRed.

The global inbox, for me, is for other things that don’t need to be OCRed, including PDFs that don’t need OCR.

Hope this makes sense

Cheers

First, why does Word Count need to be 0?

Because not all PDFs need to have OCR done on them. This criterion eliminates matching files that already have a text layer. This is just a generally useful assumption to make.

Second, I think it should be Any of the following are true, because sometimes I want to drop an image in there and OCR it (so I get functionality similar to Evernote).

This is acceptable.

so I put in there emails printed from Gmail and other stuff that needs OCR. It seems that

Emails printed from Chrome or anywhere else should not need OCR done on them.

PS:

  • Why are you printing emails to PDF from a browser instead of using a proper email application?
  • And why Chrome? Have you tried output from Safari?
  • If you are running the Pro or Server editions of DEVONthink, you can import those emails into your databases.

I suggest this criteria block in the smart rule…

Hold the Option key to change the + buttons to an ellipsis (…) and create subcriteria.
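The screenshot of the criteria block is not reproduced in this text. As a rough sketch only, here is one reading of the discussion above (OCR images, plus PDFs whose Word Count is 0), written as a Python predicate with a nested "all of" group inside an "any of" group. The `Item` class and `needs_ocr` function are hypothetical names for illustration; the rule in the actual screenshot may differ.

```python
from dataclasses import dataclass


@dataclass
class Item:
    kind: str        # hypothetical labels, e.g. "pdf", "image", "markdown"
    word_count: int  # 0 when no text layer was found


def needs_ocr(item: Item) -> bool:
    # Any of the following is true:
    #   Kind is Image
    #   All of the following are true (the subcriteria group):
    #     Kind is PDF
    #     Word Count is 0
    is_image = item.kind == "image"
    is_textless_pdf = all([item.kind == "pdf", item.word_count == 0])
    return any([is_image, is_textless_pdf])
```

With this logic, a PDF whose text layer exists but is damaged has a non-zero word count and would be skipped, so it would still need a manual OCR run.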

Hold the Option key to change the + buttons to an ellipsis (…) and create subcriteria.

^ good tip :slight_smile: , thanks !

Emails printed from Chrome or anywhere else should not need OCR done on them.

^ actually, earlier in this thread we concluded that when I summarize highlights from a Gmail email printed to PDF, the summarized highlights are broken. If I OCR first and then summarize highlights, all is good

  • Why are you printing emails to PDF from a browser instead of using a proper email application?
  • And why Chrome? Have you tried output from Safari?
  • If you are running the Pro or Server editions of DEVONthink, you can import those emails into your databases.

^ these are corporate emails, so for some reason I can’t use an email app or Safari.
I run Pro, btw

Thanks
Cheers