DT3: no results when searching in PDF+Text

  1. Open a PDF file in DT3 in the preview window.
  2. Open the search sidebar.
  3. Enter a search term.

→ no results

Just because there is text visible in the preview doesn’t mean the file is searchable.

What type of file is reported in the Info inspector - PDF or PDF+Text?

Also, there are many variations on PDFs made over the years. A PDF is not necessarily a good or usable PDF.

I see there is a URL partially shown. Where is the file from?

I actually have one that’s the opposite - the file is “PDF” but it’s searchable. :crazy_face:

Just proves my point :stuck_out_tongue:

  1. As described in the title, the file is marked as PDF + Text.
  2. PDF file was opened in Firefox 70.0.1 and stored in DT3 via print function
    Here is an excerpt of the file when converting it to plain text:

7 %&58 1,9:- 2.;(9.< =->6 2?9@A.B- 12’+&%)&&56 C@B. 5D4 %&586 '.EF :–GFHH///60,9:->69?AH
!"#$ %&&&I (< , J.K(<-.J.> -J,>.A,JL ?M -:. !,-(?B,N ",J(B. #N.9-J?B(9< $<<?9(,-(?B6 O.,P,NL !+ (< , J.K(<-.J.> -J,>.A,JL ?M 3,0A,J(B. QR =(A(-.>6 +,JA(BI (< , J.K(<-.J.> -J,>.A,JL ?M +,JA(B =->6

3$.4’.4#
SB-J?>@9-(?B
‘,JJ,B-0 ,B> P.9:B(9,N O@GG?J-
!" #$%&’() +,(-.(/)-%0
SS6 SB<-,NN,-(?B ?M 2.;(9. ,B> V?BB.9-(?B -? !"#$ %&&& !.-/?JL
SSS6 '()
( O.–(BK<
!1" 2%0.3’$/)-%0 %4 5++6-(/)-%0 #$%)%(%67
X6 !“#$ &5YZ ,B> !”#$ %&&& ".<<,K. *(N-.J<
X

  1. To check the original file, I downloaded it and opened it with Adobe Acrobat:
    the file was created with Acrobat 5.x, PDF version 1.5, all fonts included.
  2. The file is searchable in my Acrobat X (version 10.1.16).
  3. I then imported this PDF into DT3 → searchable too.
  4. It seems that something goes wrong on the way from “show content of pdf” in Firefox to “save PDF to DEVONthink 3”.
  5. I tested the same procedure with Safari 13.0.1, and the PDF file is searchable too.
  6. The problem also exists when loading the PDF from the local disk in Firefox.

It’s obviously a problem between the Firefox version and the DT3 add-on.
Marking the file as “PDF + Text” is incorrect.

By the way: if you want to save the file from Safari to DT3, you cannot assign or change a file name (the field is missing).

And the URL of the file?

PDF file was opened in Firefox 70.0.1 and stored in DT3 via print function

Why did you print the PDF to DEVONthink?
This is not 1:1 with downloading the PDF.

How do I download a PDF file in Firefox while I am viewing it in the browser?

With the DT Add-on only the URL is saved, not the file itself. If this is later deleted, you have no access anymore.

With the DT Add-on only the url is saved not the file itself

That would only be true if you’re capturing it as a Bookmark.

Also note we can’t control if a browser will allow clipping or not. This clipped in Safari as PDF (Paginated), though the site is noted as Not Secure.

The DT Add-ons for Firefox and Safari do not actually save a browsed PDF file to DT3.

In Firefox, no dialog is displayed, and there is no reaction at all when you click on the add-on.
In Safari, there is no option to save a PDF file in DT3.

So I don’t understand your question “Why did you print the PDF to DEVONthink?”

The text layer of the document is broken and/or not compatible with macOS’ PDFKit framework.
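
If you want to check this yourself, one rough way is to convert the document to plain text and see how much of the result looks like real words. Below is a minimal sketch in Python. The helper names `wordlike_ratio` and `looks_garbled` are made up for illustration, and the 0.5 threshold is an arbitrary assumption; this is not how DEVONthink or PDFKit decide anything.

```python
import re

# Hypothetical helpers for illustration only; not part of DEVONthink or PDFKit.

# A token counts as word-like if it is made of letters only,
# optionally joined by an inner hyphen or apostrophe.
_WORD = re.compile(r"^[A-Za-z]+(?:['’-][A-Za-z]+)*$")
_VOWEL = re.compile(r"[aeiouyAEIOUY]")
_PUNCT = ".,;:!?()[]\"'“”‘’"


def wordlike_ratio(text: str) -> float:
    """Fraction of whitespace-separated tokens that look like ordinary words."""
    tokens = text.split()
    if not tokens:
        return 0.0
    good = 0
    for tok in tokens:
        core = tok.strip(_PUNCT)  # ignore punctuation around the token
        if _WORD.match(core) and _VOWEL.search(core):
            good += 1
    return good / len(tokens)


def looks_garbled(text: str, threshold: float = 0.5) -> bool:
    """True if fewer than `threshold` of the tokens look like words (assumed cutoff)."""
    return wordlike_ratio(text) < threshold


# Lines taken from the plain-text excerpt posted earlier in this thread.
garbled = "SB-J?>@9-(?B ‘,JJ,B-0 ,B> P.9:B(9,N O@GG?J-"
# An ordinary English sentence, used only as a comparison sample.
clean = "The file is searchable in Acrobat and in Safari."

print(looks_garbled(garbled))
print(looks_garbled(clean))
```

A low ratio only says that the extracted text does not look like English-style words; it will misjudge documents that are mostly numbers, code, or non-Latin scripts.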

In Safari no option to save a pdf file in DT3.

Indeed there is when you’re using the Safari browser extension…

Nice … but what should I do with a “one” page PDF?
Maybe the translation was not so perfect, so I didn’t use this option (but it actually works).

But I do not want to use Safari either.
Therefore, the question arises for me why the add-on does not work properly in Firefox.

That’s showing the option in the dropdown. It’s not the only option, as is easily shown.

Therefore, the question arises for me why the add-on does not work properly in Firefox.

That would be a question for the Firefox people.

Did you try re-OCRing the PDF? I have encountered several cases of PDFs that were marked “PDF + Text” but turned out not to be searchable. In earlier times, DT did not permit re-OCR as far as I remember, but for the last year or two I did it from time to time and it worked.

(@Knappe , I didn’t assign correctly before)