DEVONthink 3.8 - Added script Scripts > Edit > Replace Text in Documents

hertzs · October 7, 2021, 2:17pm

Added script Scripts > Edit > Replace Text in Documents. This script supports replacing text in both plain and rich text documents.

Is there any advantage in using the script over the search inspector?

BLUEFROG · October 7, 2021, 2:24pm

The script does a global replace.
The search inspector allows you to change instance-by-instance or even just in the current selection.

cgrunenberg · October 7, 2021, 2:25pm

And the script supports multiple selected documents, whereas the inspector can only replace occurrences in the current document.

hertzs · October 7, 2021, 2:27pm

Thanks

Peter_Gallagher · July 1, 2022, 2:31am

I guess it’s obvious to most users – but it caused me some puzzlement for a while – that markdown files (.md) in the database are not of type ‘txt’ (nor ‘rtfd’) and so are not found by the script.

I had to change the line in the main loop of the script that searches for ‘txt’ type files to:

else if type of theRecord is txt or type of theRecord is markdown then

to make it work for me.

P

BLUEFROG · July 1, 2022, 6:22am

Noted.

cgrunenberg · July 4, 2022, 2:17pm

This is intentional as it doesn’t support the Markdown syntax and therefore might cause unexpected results.

Peter_Gallagher · July 4, 2022, 11:57pm

I know what you mean, Christian: the script doesn’t contain any safeguards against changing markdown tags.

Still, .md files are plain text, and it was a formatting string that I needed to correct (to ensure that DT found the images in a sub-folder).

“caveat emptor…” as always.

mhucka · October 25, 2022, 3:30pm

I just ran into this same problem today – I wanted to replace in Markdown documents, tried the script and found it ~~failed~~ did nothing at all, couldn’t understand why, finally looked at the source code and (like @Peter_Gallagher) realized what was going on. Now I see that it is in fact documented behavior in the DEVONthink manual (p. 243 of the PDF version of the user manual for version 3.8.4) that it only acts on documents of type plain text or rich text.

I guess the reason that the script could have unexpected results for markdown lies in how the script performs the text replacement. The code for Replace Text in Documents uses a common approach to doing text replacement:

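-- Replace every occurrence of find in theString with replace.
on replaceText(theString, find, replace)
	local od
	-- Save the current delimiters, then split the string on the search text.
	set {od, text item delimiters of AppleScript} to {text item delimiters of AppleScript, find}
	set theString to text items of theString
	-- Rejoin the pieces, using the replacement text as the delimiter.
	set text item delimiters of AppleScript to replace
	set theString to "" & theString
	-- Restore the original delimiters.
	set text item delimiters of AppleScript to od
	return theString
end replaceText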

It’s not immediately obvious (to me) what could happen, but I guess the problem must be that it could behave unexpectedly for some combinations of user input and file content and delimiters. I don’t know enough AppleScript to figure out what combination that would be. Anyway, I wonder if there might be some ways to make this more robust so that it would work safely for Markdown documents too.
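For readers who do not know AppleScript, the handler above amounts to a split-and-join: cut the text at every occurrence of the search string, then glue the pieces back together with the replacement. A minimal JavaScript sketch of the same idea follows. The name replaceTextLiteral is made up for illustration and is not part of the DEVONthink script. One hedged caveat: as far as I know, AppleScript's delimiter matching follows the active considering/ignoring settings (and so may ignore case), whereas split is always case-sensitive.

```javascript
// Hypothetical JavaScript counterpart of the AppleScript replaceText handler.
// It knows nothing about Markdown: every occurrence of `find` is replaced,
// whether it sits in body text, a heading marker, or a metadata line.
function replaceTextLiteral(text, find, replace) {
  // Splitting on "" would break the text into single characters, so skip it.
  if (find === "") return text;
  // Split on the search string, then rejoin with the replacement.
  return text.split(find).join(replace);
}

console.log(replaceTextLiteral("images/a.png and images/b.png", "images/", "assets/"));
```

Nothing in this logic is fragile by itself. The risk comes only from what the search string happens to match in the document.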

cgrunenberg · October 25, 2022, 3:49pm

Additional details like the find/replace strings and the source of the document would be great, thanks!

chrillek · October 25, 2022, 3:51pm

Well, I’d rewrite it in JavaScript because that avoids the weird text item delimiters gymnastics necessary in AS. Something like:

function replaceText(string, find, replace) {
  // Note: find is interpreted as a regular expression pattern, not a literal string.
  const RE = new RegExp(find,"g");
  return string.replaceAll(RE, replace);
}

Boring, I know. Not even worth putting in a function, in my opinion. But a lot more versatile than the AS version since it can handle regular expressions …
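One thing to watch with the regular-expression version: a search string such as "a.b" or "(1)" is treated as a pattern, so it can match more (or less) than the literal text. If a literal replacement is wanted, the search string can be escaped first. The sketch below is only an illustration; escapeRegExp and replaceLiteral are hypothetical helper names, not part of any DEVONthink script.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so the search string is matched literally.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Literal, global find/replace built on top of a RegExp.
function replaceLiteral(text, find, replace) {
  // An empty pattern would match at every position, so treat it as a no-op.
  if (find === "") return text;
  const re = new RegExp(escapeRegExp(find), "g");
  // A replacer function keeps "$&", "$1" etc. in `replace` from being
  // interpreted as special replacement patterns.
  return text.replace(re, () => replace);
}

console.log(replaceLiteral("see a.b but not axb", "a.b", "X"));
```

Without the escaping step, the pattern "a.b" would also match "axb", because the dot matches any character.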

mhucka · October 25, 2022, 4:11pm

When I wrote “failed”, I simply meant it did nothing (because the script explicitly ignores documents of type markdown). Sorry to be unclear; I’ll try to edit my post if it will let me.

chrillek · October 25, 2022, 4:17pm

You could make a copy of the script and change this line
else if type of theRecord is txt then
to
else if type of theRecord is txt or type of theRecord is markdown then

But of course, that risks mangling MD metadata or tags, if your find/replace strings are buggy.

mhucka · October 25, 2022, 4:22pm

Apparently I’m not being clear enough.

I know the script condition would have to be changed to consider markdown documents as well. @Peter_Gallagher already described the same change you just posted.

What I’m trying to focus on is why @cgrunenberg wrote the script the way it is (i.e., to explicitly exclude markdown) in the first place. @cgrunenberg already stated upthread that the reason was the potential for unexpected behavior of text replacement when the content is markdown.

I’m trying to understand what the “unexpected results” might be. It’s obviously going to be due to the way the text replacement is being done in AppleScript. That’s why I focused on that in my posting.

AW2307 · October 25, 2022, 4:34pm

Could this script posted by Chris a while back be the solution? It does explicitly note at the top that it’s applicable to plain text as well as Markdown documents.

I have used this several times with no unexpected side effects to batch replace text in Markdown (which of course doesn’t necessarily mean it’s safe for all use cases).

chrillek · October 25, 2022, 4:40pm

I suppose Markdown is excluded to protect users. Suppose you were to replace ‘author’ with ‘authors’ throughout a file – that could also change the metadata line author:. Or, stupid example, changing ‘#’ to ‘number’ – that would mangle all headings. Even dumber: replace “fig” with “peach” to break all HTML figure and figcaption elements (and MD may contain HTML).

There is, I think, no technical reason to exclude MD files. Especially since the other script mentioned by @AW2307 uses the same logic to handle markdown and text files.
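The examples above are easy to reproduce. The following sketch runs a plain split-and-join replacement over a small, made-up Markdown document; the sample text and the helper name naiveReplace are purely illustrative.

```javascript
// A made-up Markdown document with a metadata line, a heading and inline HTML.
const doc = [
  "author: Jane Doe",
  "",
  "# Notes on authorship",
  "",
  "The author added <figure><figcaption>A fig tree</figcaption></figure>.",
].join("\n");

// Plain global find/replace with no knowledge of Markdown syntax.
function naiveReplace(text, find, replace) {
  return text.split(find).join(replace);
}

// The metadata key becomes "authors:" and "authorship" becomes "authorsship".
console.log(naiveReplace(doc, "author", "authors"));

// The heading marker is replaced, so the line is no longer a heading.
console.log(naiveReplace(doc, "#", "number"));

// The HTML element names are rewritten along with the word in the caption.
console.log(naiveReplace(doc, "fig", "peach"));
```

In each case the replacement does exactly what was asked; the damage comes from matches the user did not have in mind.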

mhucka · October 25, 2022, 4:40pm

It’s basically the same script, with the condition changed to work on markdown files, not just plain text and rich text. I’m not sure why @cgrunenberg had posted that version before, but the (newer, I think?) version included with DEVONthink does not handle markdown documents. This exclusion is by design, due to an explicit if-then condition in the script itself. This condition is easily changed – but that’s not the issue I’m trying to focus on.

Like you, I’ve used a version of this script to do text replacement in markdown documents and it has worked fine for what I’ve done so far. When I tried the one included with DEVONthink, it ignored markdown files. How it ignores markdown files is obvious from the code. What I’m trying to learn is why it was designed that way, and maybe find ways of making the replacement procedure more robust so that the script conditions can be changed to include markdown files, and thus make the script more widely applicable.

mhucka · October 25, 2022, 4:44pm

Ah, yes, you’re right. Those are very good examples :-). It would be hard to guard against these possibilities.

chrillek · October 25, 2022, 4:46pm

You’d have to write an MD parser and an API that returns chunks of text. Way too much work for a simple task like find/replace. I’d simply use the script as it stands (the one working on MD) and try to take care. Or use an editor like VS Code, CodeRunner, etc. that shows the matches of a search string before doing anything to them.
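The "show the matches first" idea can also be approximated in a script without a full parser: list every hit with its line and column, and only run the replacement once the list looks right. A minimal sketch, assuming a literal, case-sensitive search; previewMatches is a hypothetical helper, not something DEVONthink provides.

```javascript
// List every literal occurrence of `find` with its 1-based line and column,
// plus the full line as context, without changing the text.
function previewMatches(text, find) {
  const hits = [];
  if (find === "") return hits; // nothing sensible to search for
  text.split("\n").forEach((lineText, index) => {
    let from = 0;
    let at;
    // Walk through the line, collecting non-overlapping matches.
    while ((at = lineText.indexOf(find, from)) !== -1) {
      hits.push({ line: index + 1, column: at + 1, context: lineText });
      from = at + find.length;
    }
  });
  return hits;
}

const sample = "author: Jane Doe\n\nThe author wrote this.";
for (const hit of previewMatches(sample, "author")) {
  console.log(`${hit.line}:${hit.column}  ${hit.context}`);
}
```

A hit on line 1 of a document with a metadata header is a good hint that the search string needs to be narrowed before replacing anything.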

cgrunenberg · October 25, 2022, 4:50pm

The Markdown syntax is indeed not supported at all by the script (e.g. metadata, escaping etc.) and therefore a simple find/replace might break these things.
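As a partial safeguard for the metadata case only, a replacement could skip a leading metadata block. The sketch below assumes MultiMarkdown-style metadata, i.e. a first line of the form "key: value" with the block ending at the first blank line. That assumption, and the helper name replaceOutsideMetadata, are mine rather than anything the DEVONthink script does, and escaping, HTML and other Markdown syntax remain unprotected.

```javascript
// Replace literal text in the body of a document, leaving a leading
// "key: value" metadata block (assumed to end at the first blank line) alone.
function replaceOutsideMetadata(text, find, replace) {
  if (find === "") return text;
  const lines = text.split("\n");
  let bodyStart = 0;
  // Assumption: a metadata block exists only if line 1 looks like "key: value".
  if (/^[A-Za-z][\w -]*:(\s|$)/.test(lines[0])) {
    const blank = lines.indexOf("");
    // With no blank line, the whole document is treated as metadata.
    bodyStart = blank === -1 ? lines.length : blank + 1;
  }
  const head = lines.slice(0, bodyStart);
  const body = lines
    .slice(bodyStart)
    .map((line) => line.split(find).join(replace));
  return head.concat(body).join("\n");
}

const note = "author: Jane Doe\n\nThe author wrote this.";
console.log(replaceOutsideMetadata(note, "author", "authors"));
```

This handles only one of the problems named in the thread; headings, inline HTML and escaped characters would still need the care described above.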