I’m sure of this: I tested with multiple PDFs of different kinds, and it is easy to reproduce with Obsidian and that popular add-on.
Other apps don’t seem to suffer from this issue, but this library in particular can’t index the text inside the PDF.
@chrillek I have correctly reported other DT bugs in the past, so I didn’t come here as a first resort.
Before posting here I talked with the plug-in developer (here: [BUG] can't index and find inside pdfs · Issue #163 · scambier/obsidian-omnisearch · GitHub); in that discussion you can find a report of the “error”. By “indexing” I mean the ability to read the text inside the PDF in order to search for specific words.
I didn’t think the problem was in the library, because the same PDFs don’t have this problem if they never pass through DT, so I was wondering what DT changes in the PDF structure.
Well, we’re talking about this issue. And after I read through the issue thread on GitHub, I still don’t see why that’s a DT problem. You said yourself that other apps can find and use the text layer in your PDF. So, it seems quite obvious to me that this library can’t find the text layer. Others can.
I’d suggest that you create a single, reproducible case where
– a PDF is indexable by this Obsidian plug-in outside of DT (where exactly is it residing, then?)
– you import the same PDF into DT and then export it again under a different name into the same folder as the original one.
Now, run cksum -l original.pdf transited.pdf in Terminal – what is the result? If the results are different for the two files, DT did something to them. Otherwise, it didn’t.
In addition, you might want to run xattr -l original.pdf transited.pdf in Terminal as well.
Now check if the plug-in can index the second file, i.e. the one that you exported from DT.
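If you'd rather script the byte-level comparison, here is a rough Python sketch of the same idea. It assumes the two files are named original.pdf and transited.pdf as above; sha256_of and same_bytes are hypothetical helper names of mine, not part of DT or the plug-in, and it uses SHA-256 instead of cksum. It does not look at extended attributes, so xattr is still worth running separately.

```python
import hashlib
import sys
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large PDFs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(first, second):
    """True if both files have identical content (same size and digest)."""
    if Path(first).stat().st_size != Path(second).stat().st_size:
        return False
    return sha256_of(first) == sha256_of(second)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python compare.py original.pdf transited.pdf
    original, transited = sys.argv[1], sys.argv[2]
    print(original, sha256_of(original))
    print(transited, sha256_of(transited))
    if same_bytes(original, transited):
        print("Identical content: the file data was not changed.")
    else:
        print("Content differs: something rewrote the file.")
```

Identical digests mean the file data itself is unchanged, so any difference in indexing would have to come from somewhere else (metadata, location, or the plug-in).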