DT3 searching markdown footnotes fails

brookter · March 18, 2021, 1:13pm

Thanks! That explains the x-devonthink-item issue nicely.

But how does that explain the screenshots?

Searching for items known to be within a metadata line:

1. Text: “Markdown search test” both returns the file and highlights the text within the file, including the string “Title: ” (screenshot 1).
2. Text: “Title: ” or “Title: Markdown search test” returns only the file, but does not highlight the text within the file (even though search 1 does!) (screenshot 2).

So DT3 is clearly finding the text on both occasions (the file is returned), but only highlights it in the first case.

This seems a bit confusing.

cgrunenberg · March 18, 2021, 1:30pm

Due to the --- lines, the parser indexes the lines that follow; call it a bug or a feature.

brookter · March 18, 2021, 3:07pm

So, basically, for the time being we should treat any markdown search on existing data as being unreliable: negative results don’t mean there is nothing useful to find in the metadata (unless the metadata was in YAML format).

Can we assume that a detailed explanation of the problem will be added to the Help file, so that users who need such searches can use external tools (grep, etc.)? A Tips and Tricks article would be useful, too.
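As an illustration of the "external tools" route, here is a minimal Python sketch of a grep-like search over the raw source of Markdown files. It assumes the files are reachable as ordinary files on disk (for example in an indexed folder); the function name `search_raw_markdown` and the extension list are illustrative choices, not anything DEVONthink provides.

```python
from pathlib import Path


def search_raw_markdown(root, needle, extensions=(".md", ".markdown")):
    """Return paths of Markdown files under root whose raw text contains needle.

    The match is a plain case-insensitive substring test on the whole file,
    so metadata lines, comments and links are searched like any other text.
    """
    needle = needle.lower()
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in extensions:
            text = path.read_text(encoding="utf-8", errors="replace")
            if needle in text.lower():
                hits.append(path)
    return hits
```

Calling `search_raw_markdown("/path/to/folder", "Title: ")` would list every Markdown file containing that string anywhere in its source, whether or not the line is rendered.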

But thanks for promising that the old behaviour will be returned soon (a hidden preference is fine by me) — that’s very welcome and I appreciate it.

Regards.

BLUEFROG · March 18, 2021, 3:13pm

I wouldn’t assume it but this will be discussed and considered.

Also, please clarify why you “need such searches”. Thanks!

brookter · March 18, 2021, 4:58pm

Jim,

Why would people put metadata into their markdown documents according to the established format and NOT want to search for it? That doesn’t seem to make any sense, does it?

Metadata is part of the content of the document and if it’s there, it’s there for a reason and it should be retrievable by a search.

TBH, I’m struggling to see any benefit at all in ignoring it for searches compared to the serious disadvantage of making DT’s search results unreliable for existing documents, which is what the new behaviour in effect does.

Thanks.

BLUEFROG · March 18, 2021, 5:05pm

MultiMarkdown metadata isn’t used in one single way so I guess it depends on what it’s being used for individually.

And again, since it’s not rendered, we have had requests for it to not be considered when searching.

Beyond that, it’s Development’s call on implementation.

brookter · March 18, 2021, 5:14pm

I’m sure you have: I have a lot of respect for the team and your desire to fulfil user requirements.

But as it stands, “metadata isn’t used in one single way”, but you’ve just removed the possibility of using it for one of its (intended) purposes…

A change like this, which breaks a fundamental part of DT’s appeal for many people (searches are a reliable guide to content of the markdown file) without a detailed explanation of exactly what is excluded and a warning to users that searches may no longer work according to expectation isn’t helpful.

I like Christian’s idea of a preference to revert to the old behaviour but I still think you need to explain in detail to users what has changed, and not expect them simply to ask the forum when they get an unexpected result.

Thanks.

chrillek · March 18, 2021, 5:27pm

As a borderline case where unrendered metadata is in fact rendered, consider the usage of Hugo (a static webpage generator). They employ the MD metadata (“frontmatter” in their parlance) to contain all sorts of data which may (or may not) be rendered.

Imagine a Hugo user indexes their MD files in DT (not completely unthinkable, is it?). Now, wouldn’t they expect to be able to find words for example in the “Title:” or “Description” field of said frontmatter if they search for it in DT?

Yes, it might be a borderline case. But (I know, I’m repeating myself): MD is a text format. If and what parts of it are “rendered” has nothing to do with the file itself. Something might be rendered here today and not there tomorrow.
I’d rather have software not second-guessing me.

OogieM · March 23, 2021, 4:26pm

You do not know all the use cases that your users have.

I’m in the process of moving all my plain text documents into Markdown, adding metadata and doing other things to make them more compatible with other tools, since DEVONthink is moving further away from keeping old things working and is becoming far less important. It’s a painful process. I’ve invested heavily in DT over the years, but the current things like DTTG breaking and loss of documents, the loss of the 3-pane view that was central to how I worked, and the inability to even acknowledge that there are use cases for features that you have not considered have me rethinking my whole system. I would have fully expected to be able to search ANY text in the newly created Markdown files, and the only reason you haven’t heard from me on this bug is that I haven’t yet moved enough into Markdown to experience it. Given that I know it’s yet another “DEVONthink deciding to remove features” issue, I will probably move faster to get those items out of DT and into something else where the tools support such searches.

BLUEFROG · March 23, 2021, 5:12pm

You do not know all the use cases that your users have.

I didn’t say we did. I said, “there’s no indication we have many people searching the source of Markdown files for things like MMD metadata or item links.” I didn’t say “No one is doing this…”. I said we have no indication, i.e., not a bunch of forum posts and support tickets, etc. pointing to this as a normal behavior or complaining about it not working the way described in this thread.

On the contrary, we have had requests for the current behavior, understood the logic of the suggestion, and have responded.

As they say, “It’s a definite ‘No’, if you don’t ask.”

brookter · March 23, 2021, 10:28pm

Jim,

The change was announced in a short release note, without giving details of exactly what had changed. (You still haven’t been explicit about what has changed — it varied over the course of this thread — despite requests for a formal article.) Most people will probably have missed it — I did.

Of course people didn’t call you beforehand to say they used metadata and comments for searching — it never occurred to me you’d do something so daft as to make data I’d put into a file unreachable to me, let alone that you’d make a fundamental change without asking users whether it would cause any problems.

I haven’t called you since to say that searches were failing, because I hadn’t noticed that they had — if a search returns 30 files, I’d have to know in advance that the real figure should be 33, or otherwise notice that a particular file was missing. That’s the problem: the results of searches can no longer be trusted.

AFAICS DT3 now gives me no way of knowing how many of my 3,000+ markdown files contain metadata at all, never mind what’s in that metadata, so I can’t even upgrade the files to ensure that data is no longer ignored, without going through each file manually.
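Outside DEVONthink, a rough count is possible with a script. The following is a minimal Python sketch, assuming the Markdown files are available as plain files on disk. It uses a heuristic that is my own assumption, not DEVONthink's parser: a file counts as having YAML front matter if its first line is `---`, and as having MultiMarkdown-style metadata if its first line looks like `Key: value`. The names `metadata_kind` and `files_with_metadata` are hypothetical.

```python
import re
from pathlib import Path

# Heuristic (an assumption): a "Key: value" first line, as in MultiMarkdown
# metadata. The colon must be followed by whitespace or the end of the line,
# so a leading URL such as "http://..." is not mistaken for a key.
MMD_KEY_LINE = re.compile(r"^[A-Za-z][\w \-]*:(\s|$)")


def metadata_kind(text):
    """Classify the leading metadata block of a document: 'yaml', 'mmd' or None."""
    lines = text.lstrip("\ufeff").splitlines()  # drop a byte-order mark if present
    if not lines:
        return None
    first = lines[0].rstrip()
    if first == "---":
        return "yaml"
    if MMD_KEY_LINE.match(first):
        return "mmd"
    return None


def files_with_metadata(root):
    """Map each *.md file under root that appears to start with metadata to its kind."""
    found = {}
    for path in sorted(Path(root).rglob("*.md")):
        kind = metadata_kind(path.read_text(encoding="utf-8", errors="replace"))
        if kind:
            found[path] = kind
    return found
```

`len(files_with_metadata(folder))` gives the count, and filtering the result for `"mmd"` lists the files whose metadata is not in YAML format. Because this is only a first-line check, an ordinary opening sentence such as "Note: ..." would be reported as metadata, so the output is a list of candidates to review rather than a definitive answer.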

If I’m missing something obvious and there is a simple workaround to correct this problem, please let us know asap. (I’m talking about the interim while we wait for Christian’s very welcome offer of a hidden default to restore the old expected behaviour.)

And, please, treat this issue for what it is: a serious risk to trust in DT3’s reputation for reliable searches.

Thank you.

pete31 · April 28, 2021, 8:36am

What’s the name of the hidden preference? It’s not mentioned in DEVONthink 3.7 help.

cgrunenberg · April 28, 2021, 8:39am

IndexRawMarkdownSource

pete31 · April 29, 2021, 5:18am

For other users who want to search the raw Markdown source:

1. Quit DEVONthink
2. Open Terminal.app
3. Run this command:

   defaults write com.devon-technologies.think3 IndexRawMarkdownSource -string yes

Afterwards it’s possible to search the raw source of new Markdown records.

For existing records it’s necessary to rebuild the database. From the help:

Rebuild Database: Completely rebuilds the database by exporting all items to a temporary folder in the file system, creating an empty database, and reimporting all items. This removes any structural problems. Depending on the size of your database, this can take from a few seconds to several hours. This option is typically only used in a troubleshooting situation.

DEVONthink menu File > Rebuild Database

pete31 · April 29, 2021, 5:25am

Thanks so much! I’m a very happy DEVONthink user again. Honestly, I couldn’t enjoy new releases anymore, but now I’m looking forward to making use of all the nice things you’ve put into DEVONthink 3.7. Congratulations!

mattts · May 27, 2021, 2:48am

I’m new to DEVONthink and was excited to see Markdown frontmatter highlighted like DEVONthink expected it. My blind hope was that frontmatter might be indexed similarly to DEVONthink’s explicitly-defined custom metadata fields, which would be useful to me for two reasons:

1. In a Markdown-centric database, custom fields wouldn’t be relegated to (often-obscured) sidebar UI but always front and center; more visible and easier to edit.
2. I could define custom fields per database rather than having to create global custom fields that would be irrelevant to the databases where I have no plan to use them.

I don’t have a clue how DEVONthink’s indexing works under the hood and imagine complex YAML arrays could be a can of nightmare worms, but even one-dimensional key/value indexing would be useful. (With the keys being searchable properties and the values being indexed as content.)
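To make the "one-dimensional key/value" idea concrete, here is a minimal Python sketch that splits a document into flat front matter fields and body text. It is only an illustration of the concept and says nothing about how DEVONthink's indexing actually works. It assumes front matter delimited by `---` lines, understands only top-level `key: value` pairs, and is not a YAML parser; the function name `parse_flat_frontmatter` is hypothetical.

```python
def parse_flat_frontmatter(text):
    """Split text into (fields, body).

    Only unindented 'key: value' lines between the opening '---' and the
    closing '---' (or '...') are read. Indented lines, i.e. nested YAML
    structures, are skipped rather than interpreted.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter at the top of the document
    fields = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() in ("---", "..."):
            return fields, "\n".join(lines[i + 1:])
        if line[:1].isspace():
            continue  # nested or continued value: out of scope for this sketch
        key, sep, value = line.partition(":")
        if sep and key.strip():
            # Keys act as searchable property names, values as content.
            fields[key.strip().lower()] = value.strip().strip("\"'")
    return {}, text  # opening delimiter without a closing one: treat as plain text
```

The returned dictionary corresponds to the "keys as searchable properties, values as content" split described above, and the body is whatever follows the closing delimiter.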

My use case is moving a home inventory into its own DEVONthink database. I’ve got CSV exports from an app’s SQLite database, and I’m using some rudimentary AppleScript to split items out into uniform Markdown files that are mostly frontmatter.
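The same kind of split can be sketched in Python rather than AppleScript. This is a minimal illustration, not the script described above: the column name `name` used for file names, the function name `csv_to_markdown_files`, and the output layout (front matter followed by a single heading) are all assumptions.

```python
import csv
import io
from pathlib import Path


def csv_to_markdown_files(csv_text, out_dir, name_column="name"):
    """Write one Markdown file per CSV row, with every column as a front matter field.

    Values are written verbatim as 'key: value'; values containing colons,
    quotes or line breaks would need proper YAML quoting, which this sketch
    does not attempt. Rows with the same name overwrite each other.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for n, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        front = "\n".join(f"{key}: {value}" for key, value in row.items())
        title = row.get(name_column) or f"item-{n}"
        # Replace characters that are awkward in file names.
        safe = "".join(c if c.isalnum() or c in " -_" else "_" for c in title).strip()
        path = out / f"{safe or f'item-{n}'}.md"
        path.write_text(f"---\n{front}\n---\n\n# {title}\n", encoding="utf-8")
        written.append(path)
    return written
```

For a CSV with the columns `name,room`, the row `Desk lamp,Office` becomes a file `Desk lamp.md` whose front matter holds both fields.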

Wound up here with empty search results and I’m relieved there’s a way to get all that stuff indexed.

I’ve worked with a handful of web-based content management systems and static site generators that lean heavily on Markdown frontmatter as a structured way of storing content, taxonomy, and other details beyond a blob of text content. So it’d also be useful to have DEVONthink be the source of truth for that content in its existing format and simply export it wherever it needs to be published.

I could well be an edge case, but I’d love it if Markdown frontmatter indexing could be exposed as a proper setting (index/ignore) and would be even more thrilled if any of that YAML could be indexed like a light version of the fancier custom metadata fields DEVONthink already offers.

cgrunenberg · May 27, 2021, 6:51am

Certain (but not all) MultiMarkdown metadata is indeed indexed and can be e.g. viewed in the Document > Properties inspector.

After enabling the hidden preference IndexRawMarkdownSource, the unfiltered source of Markdown documents is indexed, but that’s only available for text searches. Indexed metadata, by contrast, is available in the Properties inspector, can be viewed in List view columns, and can be used by smart groups/rules or placeholders.

mattts · May 27, 2021, 11:36am

Thanks! I’ve enabled that preference and I’m a happy camper being able to search everything.