DT3 searching markdown footnotes fails

I just came across a problem when searching my database for a word that is in a markdown file footnote. The search did not find it.

For instance
[^2]: Trovisco é o nome vulgar de várias espécies de plantas arbustivas da família Thymelaeaceae encontradas na ásia e Europa muito cultivadas para a utilização em jardins. Incluem-se _Daphne lauerola_ e _Daphne gnidium_ bem como _Thymelaea villosa_, as quais podem ser encontradas em Portugal. A maior parte destas plantas são tóxicas.

I replicated the behaviour in all markdown files. Nothing in the footnote text seems to be found.

This should not happen. Can it be fixed?

I’m on DT3 3.6.3.

A screen capture of your search would be helpful.

I had to experience the exact same situation quite a while ago. I knew that what I was searching for was definitely in the footnotes of markdown records which I created from OCRed books but it didn’t show up in DEVONthink’s results.

I then did the same as you now, I tested and …

… with the same result …

This happened shortly after the DEVONthink 3.5.1 update so I searched the release notes. All that’s mentioned there is:

With Markdown documents, non-rendered words are no longer indexed. For existing documents, this requires reimporting the files or rebuilding the database to ensure the index is accurate.

Without it being explicitly mentioned, the ability to search markdown footnotes has been removed.

I was so upset I didn’t even try to start a thread, asking

  • why the ability to search footnotes has been removed
  • why it’s not mentioned in the release notes
  • why footnotes are considered to be "non-rendered words"

I haven’t started such a thread since then. Too upset.


Edit:

  • The lost ability to search markdown footnotes is just a bug.

  • Everything below is still true: since DEVONthink 3.5.1 we lost the ability to search for metadata and comments in Markdown.

  • GREAT NEWS:


Some months later a user asked why searching for x-devonthink-item links in markdown records doesn’t work anymore.

I replied

and in turn got this reply

I honestly couldn’t believe it.

Before that I thought the loss was due to some technical decision or whatever. It wasn’t.

Because some users requested that non-rendered markdown be removed from the index, other users lost the ability to search

  • markdown meta data
  • x-devonthink-item links
  • footnotes

I still don’t understand how that could have happened.

Without your thread I probably would have never mentioned this because I still get upset when I recap.

Of course DEVONtech can do whatever they find the right thing to do, even removing capabilities users relied on, but it might be a good idea to at least inform users properly.

It would be even better if implementing feature requests didn’t remove capabilities but added options without breaking other users’ usage.

Reverse feature request: @cgrunenberg please bring back the capability to search the whole markdown source.

I’m not sure why footnotes are considered non-rendered but there’s no indication we have many people searching the source of Markdown files for things like MMD metadata or item links. We do have data people were having problems searching Markdown since it was searching the source and also bloating the concordance.

Although I’m not directly concerned, my 0,02€ on it: „not-rendered“ text is completely bogus in the context of Markdown. What is rendered or not is fully under the control of the renderer. Which DT knows nothing about. If I were to use
pre {display: none}
in a CSS, would that make all fenced code in my MD files disappear from DT’s index?
Conversely, if in my CSS I chose to display a DT link in plaintext after the link text, would that make the link appear in DT’s index?

Although it might be boring to reiterate: MD is about semantics, _not_ representation.

The next release will fix this.


@cgrunenberg I understand why some users don’t want non-rendered Markdown to be indexed.

For years DEVONthink used to index the whole source, one can say that it has been the default. I relied on this and never imagined that Markdown could ever become a problem in DEVONthink.

From one day to the next, you’ve canceled the normal Markdown search behavior. Simply removing the ability to search what I have in my databases is an absolute no go.

It’s not too much to ask that the old functionality as we knew it for years become available again. Make a hidden option or whatever.


Which one exactly?

We can’t search the whole source anymore

Usually DEVONthink indexes only contents, not sources. In which scenario does this cause problems?

What exactly is non-rendered markdown?

Anyway, I just tried searching for “x-devonthink-item” (i.e. internal URLs): DT finds the pseudo protocol as well as the parts of the URL path. So maybe not finding footnotes is more a bug than a feature?

I guess @pete31 means “the source of the Markdown file”. Which is its content.


Due to the recent change, DEVONthink indexes the rendered HTML, not the source. So you couldn’t search for H1 in the source but it also wouldn’t be added to the concordance. And the fact is, the majority of users want to index the rendered content, not the underlying Markdown code.

Content, yes, but only rendered content? Footnotes are content and so is metadata even though the latter is not visible in the rendered HTML.


Rendered by DT, using my stylesheet? That would explain why I was able to find an x-devonthink-item:// URL when I used CSS to output it.

Since “H1” is not part of the MD source but of the markup for the rendering, that makes perfect sense.
However, I’m not so sure about the idea to search the result of the rendering. I don’t even think that there’s any clear definition of that.

  1. I can search successfully for x-devonthink-item: although it is not rendered in the HTML
  2. I can equally successfully search for a white text on white background (which is arguably rendered, but I just can’t see it)
  3. And I can also successfully search for text that is not visible at all because its display attribute is none.

So rendered could mean “it’s there, but you don’t see it because the browser doesn’t show it”, “it’s there, but you don’t see it because it’s background-on-background” or “it’s not there, the browser doesn’t even bother to render it, but DT considers it rendered.”
In all of these cases, one gets more hits than expected. Except in the case of footnotes, apparently :wink:

So maybe there’s not really that much of a difference between indexing the source or the rendered version. Except for MD metadata, of course.
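To make the three cases above concrete, here is a minimal Python sketch of a naive text extractor. It is only an illustration under my own assumptions, not DEVONthink's actual indexer: the `TextCollector` class, the `extract_text` function and the sample HTML (including the placeholder item link `ABC`) are all made up. Because it collects text nodes without ever looking at CSS, white-on-white text and `display: none` text both end up in the extracted text. A URL that lives only in an `href` attribute is not collected by this particular extractor, which is one way two indexers could plausibly differ.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects every text node and ignores styling entirely (hypothetical)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags; attributes such as href or style
        # never reach this method.
        if data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> str:
    """Return the concatenated text nodes of an HTML fragment."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return " ".join(collector.chunks)


# Made-up sample; "ABC" stands in for a real item UUID.
SAMPLE = (
    "<p>Visible text</p>"
    '<p style="color: white; background: white">white on white</p>'
    '<p style="display: none">hidden paragraph</p>'
    '<a href="x-devonthink-item://ABC">link text</a>'
)

indexed = extract_text(SAMPLE)
print(indexed)
```

Whether a word counts as "rendered" here depends purely on the extractor's rules, not on what a browser would show.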


At least in recent years DEVONthink indexed the Markdown source, not only the content. That’s what I’ve built my whole Markdown system on.

I can’t search comments anymore.
I can’t search meta data anymore.

I used both extensively, mainly because

  • I like the idea of having everything in plain text which should never be a problem to access.

  • I don’t want to solely rely on e.g. an app’s meta data. Apps are discontinued etc.

Incidentally, the fact that DEVONthink no longer treats Markdown the way it did in recent years just confirmed my fear of relying on an app.

„Unfortunately“ there’s no other app I could use as there’s simply nothing that would even come close to DEVONthink.

I won’t give exact use cases, all I can say is that DEVONthink used to search the whole Markdown file, I’ve built everything around that fact and suddenly what was good for years has been replaced.

I really don’t get why the new behavior replaced the old one. Adding would have been fine. Replacing is still unbelievable for me.

It’s not actually the real rendering; rather, the parser strips certain elements which wouldn’t be part of the rendering. In the case of footnotes this is just a bug and has been fixed.

We might add a hidden preference but you would have to rebuild the database afterwards.
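As a rough illustration of what "stripping elements before indexing" could look like, here is a minimal Python sketch. Everything in it is an assumption made for the example and not DEVONthink's parser: the regular expressions, the function names, the `index_source` flag (a stand-in for the kind of hidden preference mentioned above) and the sample document with its placeholder link `ABC`. It drops a MultiMarkdown-style metadata header, HTML comments and link targets, but keeps the footnote text.

```python
import re

# Hypothetical patterns, chosen only for this sketch.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
INLINE_LINK = re.compile(r"\[([^\]]*)\]\([^)]*\)")   # [text](url) -> text
FOOTNOTE_MARK = re.compile(r"\[\^[^\]]+\]:?")        # [^2] and [^2]:
METADATA_LINE = re.compile(r"^[A-Za-z][\w ]*:\s.*$")  # "Key: value"


def strip_metadata_header(source: str) -> str:
    """Drop leading 'Key: value' lines if a blank line terminates them."""
    lines = source.split("\n")
    i = 0
    while i < len(lines) and METADATA_LINE.match(lines[i]):
        i += 1
    if 0 < i < len(lines) and lines[i].strip() == "":
        return "\n".join(lines[i + 1:])
    return source


def text_to_index(source: str, index_source: bool = False) -> str:
    """Return the text a hypothetical indexer would see."""
    if index_source:
        # Old behaviour: the whole Markdown source is indexed unchanged.
        return source
    text = strip_metadata_header(source)
    text = HTML_COMMENT.sub("", text)
    text = INLINE_LINK.sub(r"\1", text)   # keep link text, drop the URL
    text = FOOTNOTE_MARK.sub(" ", text)   # drop markers, keep footnote text
    return text


# Made-up sample document; "ABC" stands in for a real item UUID.
SAMPLE = """Title: Plants
Author: Someone

Trovisco[^2] is described [here](x-devonthink-item://ABC).
<!-- private comment -->

[^2]: Trovisco is the common name of several shrubs.
"""

stripped = text_to_index(SAMPLE)
full = text_to_index(SAMPLE, index_source=True)
print(stripped)
```

With a flag like this, the two behaviours could coexist, although switching it would change which text gets indexed, which fits the point above about having to rebuild the database afterwards.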


Yes, only the rendered content. As I mentioned previously…

I’m not sure why footnotes are considered non-rendered

@cgrunenberg

Just noticed some extra strange behaviour. When I make a search targeting a text in the footnote, sometimes it does recognise it; other times it does not.

I’m attaching two screenshots that exemplify this. First I searched for Gibraltar and the term was found. Then I searched for Marinid (also in the same footnote text) and the term was NOT found.

Hopefully these examples will help on the issue.

We might add a hidden preference but you would have to rebuild the database afterwards.

That would be very helpful, Christian, thank you.

In the meantime, it would be helpful to have a full article on the implications of the changes, explaining exactly which elements of markdown files are no longer searchable, and whether this applies to all forms of search / smart groups / rules etc. At the moment, it’s not really clear what ‘rendered content’ means, and people are having to do their own testing. It would be helpful if there was a definitive statement.

Thanks.
