DT3 searching markdown footnotes fails

pete31 · March 17, 2021, 1:41pm

At least in recent years, DEVONthink indexed the Markdown source, not only the content. That’s what I’ve built my whole Markdown system on.

I can’t search comments anymore.
I can’t search meta data anymore.

I used both extensively, mainly because

I like the idea of having everything in plain text which should never be a problem to access.
I don’t want to solely rely on e.g. an app’s meta data. Apps are discontinued etc.

Incidentally, the fact that DEVONthink no longer treats Markdown the way it did in recent years just confirmed my fear of relying on an app.

“Unfortunately” there’s no other app I could use, as there’s simply nothing that would even come close to DEVONthink.

I won’t give exact use cases; all I can say is that DEVONthink used to search the whole Markdown file, I’ve built everything around that fact, and suddenly what was good for years has been replaced.

I really don’t get why the new behavior replaced the old one. Adding it would have been fine. Replacing is still unbelievable to me.

cgrunenberg · March 17, 2021, 1:43pm

It’s actually not the real rendering; rather, the parser strips certain elements which wouldn’t be part of the rendering. In the case of footnotes this is just a bug, and it is fixed.

We might add a hidden preference but you would have to rebuild the database afterwards.

BLUEFROG · March 17, 2021, 3:10pm

Yes, only the rendered content. As I mentioned previously…

I’m not sure why footnotes are considered non-rendered

valente · March 17, 2021, 5:59pm

@cgrunenberg

Just noticed some extra strange behaviour. When I make a search targeting a text in the footnote, sometimes it does recognise it; other times it does not.

I’m attaching two screenshots that exemplify this. First I searched for Gibraltar and the term was found. Then I searched for Marinid (also in the same footnote text) and the term was NOT found.

Hopefully these examples will help on the issue.

brookter · March 17, 2021, 7:40pm

“We might add a hidden preference but you would have to rebuild the database afterwards.”

That would be very helpful, Christian, thank you.

In the meantime, it would be helpful to have a full article on the implications of the changes, explaining exactly which elements of Markdown files are no longer searchable, and whether this applies to all forms of search, smart groups, rules, etc. At the moment, it’s not really clear what ‘rendered content’ means, and people are having to do their own testing. It would be helpful if there were a definitive statement.

Thanks.

cgrunenberg · March 18, 2021, 8:58am

Basically URLs of links/images (including the ones in the footnotes) and HTML tags. This affects all index-based operations (search, see also, classify and concordance) and is actually intended to improve the results of these features in particular and make them more consistent with RTF/HTML.
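
To get a feel for what that stripping leaves behind, here is a minimal Python sketch. It is only a regex approximation written for illustration, not DEVONthink's actual parser; the function name `indexable_text` and both patterns are assumptions, and it only handles inline links/images and simple HTML tags.

```python
import re

# Hypothetical approximation for illustration only; NOT DEVONthink's parser.
# Matches inline links [title](url) and inline images ![alt](url).
INLINE_LINK = re.compile(r"!?\[([^\]]*)\]\([^)]*\)")
# Matches simple HTML tags such as <b> or </b>.
HTML_TAG = re.compile(r"<[^>]+>")


def indexable_text(markdown: str) -> str:
    """Return the text left after dropping link/image URLs and HTML tags."""
    text = INLINE_LINK.sub(r"\1", markdown)  # keep only the title / alt text
    text = HTML_TAG.sub("", text)            # drop the tags, keep their content
    return text


sample = (
    "[This is a DT-Link](x-devonthink-item://1E23F171-736F-4D63-9F91-376EA58CE131)"
    " and <b>bold</b> text"
)
print(indexable_text(sample))
```

Under these assumptions, a search for the link title would still match, while a search for “x-devonthink-item” would only match if the URL also appears as plain text elsewhere in the file.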

chrillek · March 18, 2021, 9:23am

I seem to be able to search successfully for x-devonthink-item URLs, though, regardless of whether they’re visible in the rendered file or not.

cgrunenberg · March 18, 2021, 9:25am

This might be possible if the files were indexed by older versions or if the link is part of the Markdown’s text.

chrillek · March 18, 2021, 9:32am

It seems that it is also possible in other cases. In my Markdown file, I have this line:

[This is a DT-Link](x-devonthink-item://1E23F171-736F-4D63-9F91-376EA58CE131)

If I search for “x-devonthink-item”, I get this same file as one of the hits:

As you can see, the creation date is quite recent, and the last changed date even more recent (that’s when I added the DT link to the file).

brookter · March 18, 2021, 9:41am

Thanks.

But that doesn’t include Metadata, which are neither URLs nor HTML.

Is internal metadata searched for or not? There’s a suggestion up thread that it isn’t, but I seem to be able to do it fine. This is a search for journal entries on Fridays…

Is that because I haven’t (knowingly) rebuilt this database? Or is metadata in fact still searchable?

Secondly, as far as I can see URLs are picked up — a search for ‘https’ or ‘x-devonthink-item’ seems to work fine. (It’s the ‘seems’ which is worrying… are some being missed?)

Again, is this by design (you do search certain parts of links), or will this (helpful) behaviour go away if I rebuild the database?

So, I think we need a far more detailed explanation for the changes than a sentence or two in this forum.

At the moment, this is causing some people to query whether the searches they have made are accurate or not — rightly or wrongly. DT3 is built on the idea of robust searching — a change in that process needs to be fully and loudly documented, even if (especially if) the concerns are largely illusory. I’m sure you don’t take these decisions lightly, but the consequences and implications need to be explained properly.

Finally, please could we have a preference to turn back to the old behaviour soon?

Many thanks.

cgrunenberg · March 18, 2021, 9:45am

It should be. Otherwise try a rebuild or, if that doesn’t help, please post the source of the document.

brookter · March 18, 2021, 9:55am

Thanks, Christian.

So, as far as you’re concerned, Metadata is searchable and will remain so?

That’s a relief, but it only underlines my point — someone coming across this thread will be 26 posts in before they get that confirmed.

BTW: it seems to go against the idea in the release notes and earlier in the thread that only rendered text is searched — or it uses a definition of rendered which some will not have expected (does ‘rendered’ mean shown on the screen or not?).

It seems to me that at the very least the help files need an explicit note in both the Markdown and Search sections listing the new restrictions (not just saying ‘rendered’, as that’s open to interpretation). Apologies if it’s there, but I couldn’t find anything.

An article in the regular feed would also be helpful — if you’ve been using markdown search successfully for years, it’s unlikely you’ll check the help.

Thanks!

cgrunenberg · March 18, 2021, 10:00am

At least there are no plans to change this right now.

pete31 · March 18, 2021, 10:05am

Are you saying metadata should still be searchable?

I can’t search metadata since DEVONthink 3.5.1.

metadata: findMe

### Test

Text

cgrunenberg · March 18, 2021, 10:11am

I’m sorry, I shouldn’t handle too many things at the same time. But metadata is indeed currently not indexed, just like links.
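
For anyone who wants to see which part of a file this concerns, the following Python sketch separates a leading block of `Key: Value` lines from the body, using the test file posted above. It is a simplified assumption about what counts as metadata (MultiMarkdown-style header lines at the very top of the file), not DEVONthink's implementation; `split_metadata` and `METADATA_LINE` are hypothetical names.

```python
import re
from typing import Dict, Tuple

# Simplified assumption: metadata is a run of "Key: Value" lines at the
# very top of the file. Real parsers have more rules (e.g. multi-line values).
METADATA_LINE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9 _-]*:\s*\S")


def split_metadata(markdown: str) -> Tuple[Dict[str, str], str]:
    """Split leading 'Key: Value' lines from the rest of the document."""
    lines = markdown.splitlines()
    metadata: Dict[str, str] = {}
    index = 0
    while index < len(lines) and METADATA_LINE.match(lines[index]):
        key, _, value = lines[index].partition(":")
        metadata[key.strip().lower()] = value.strip()
        index += 1
    if not metadata:
        return {}, markdown  # no header found, leave the text untouched
    body = "\n".join(lines[index:]).lstrip("\n")
    return metadata, body


sample = "metadata: findMe\n\n### Test\n\nText\n"
metadata, body = split_metadata(sample)
print(metadata)
print(body)
```

In this sketch the word “findMe” ends up only in `metadata`, not in `body`, which mirrors the reported behaviour: if only the body is indexed, a search for “findMe” finds nothing.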

pete31 · March 18, 2021, 10:12am

No problem, you can be sure that I’ll point out that it is not searchable.

Please make it searchable again, I really need this.

cgrunenberg · March 18, 2021, 10:22am

The next release will include a hidden preference.

pete31 · March 18, 2021, 10:26am

THANK YOU CHRISTIAN!

brookter · March 18, 2021, 10:44am

Hang on, now I’m really confused… The position seems to be changing and I can’t keep up.

Please can we have a definitive statement:

Can we currently search markdown files for metadata: i.e. data at the beginning of a file with the form Keyword: Data. Yes or no?
Are words within markdown links in the form [title](url) searchable or not? E.g. can we search for all items which contain say ‘https:’ or ‘x-devonthink-item’ within the (url) part of the link? Yes or no?
If the answer to either is No, then why can I find examples of both where the search definitely works? Is it because my databases haven’t been rebuilt yet (I assume)? Will I lose this behaviour if I do rebuild them?

The problem is, forum threads really aren’t a good way of getting accurate information out about this sort of issue: that’s why it needs to be addressed in one place as a specific, definitive response to which people can be pointed.

Thanks again.

cgrunenberg · March 18, 2021, 10:53am

No.

Only the title is indexed.