Building a word dictionary as a DEVONthink database

I’m trying to figure out how to make a word dictionary (for foreign words and phrases) in Devonthink. As there’s no notion of a beginning of a line or wild card, is the only way for me to put a start-of-line character in front of each entry, such as $སྟོང་ or #སྟོང་ ?

How good is the fuzzy search? Can it go beyond single words and find fuzzy matches for longer strings such as partial sentences? And does it work reliably with UTF-8 characters outside the usual range (the odd letters above are Tibetan)? Thanks.

No idea how to do that but it’s a very interesting idea.

Special characters that are indexed

(but you know that from the other thread)

Thanks. Hmm, that means fuzzy searches are somewhat restricted, then. And I need to custom-edit the various glossaries to add the beginning-of-line indicator. Oh well.
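In case it helps anyone preparing glossaries the same way, here is a minimal Python sketch of the prefixing step. It assumes plain-text glossaries with one entry per line; the function name `add_line_markers` and the sample layout (headword followed by a gloss) are made up for illustration, and the marker is a parameter so either $ or # works.

```python
def add_line_markers(text: str, marker: str = "$") -> str:
    """Prefix every non-empty line with `marker`.

    Lines that already start with the marker are left alone, so the
    function can be run twice on the same glossary without doubling up.
    """
    out = []
    for line in text.splitlines():
        entry = line.strip()  # drop stray leading/trailing whitespace
        if entry and not entry.startswith(marker):
            entry = marker + entry
        out.append(entry)  # blank lines are kept as separators
    return "\n".join(out)


# Hypothetical sample glossary: one "headword gloss" pair per line.
glossary = "སྟོང་ empty\n\nབཀྲ་ཤིས་བདེ་ལེགས། hello\n$already marked"
print(add_line_markers(glossary))
```

This only rewrites the text. Importing the result into a database is a separate step that the sketch does not cover.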

Yes, it seems to be possible (if I understood correctly what you’re trying to do).

An asterisk between your start-of-line character and the word will find this.

An asterisk followed by a space and another asterisk will find sentences (and all other results from above).

Thanks, yes, I know the regex-style * notation. Fuzzy search is closer to how TMX records work in translation software: you type in a string and get both exact and close-enough matches, ranked by similarity. I was hoping fuzzy search in DEVONthink would work along similar lines, but most likely I need to experiment more.

PS: If I need to custom prepare glossaries I will use a $ prefix, as a nod to the regex line anchors (strictly speaking, ^ marks the start of a line in regex and $ the end)…
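To make the TMX-style ranking concrete, here is a rough Python sketch of what I mean by exact and close-enough matches ordered by score. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure; this is an assumption for illustration, not how DEVONthink or any particular translation tool scores matches. The function name `rank_matches`, the 0.5 threshold and the sample entries are all hypothetical.

```python
from difflib import SequenceMatcher


def rank_matches(query, entries, threshold=0.5):
    """Return (score, entry) pairs at or above `threshold`, best first.

    The score is SequenceMatcher.ratio(), a value between 0.0 and 1.0,
    where 1.0 means the two strings are identical.
    """
    scored = []
    for entry in entries:
        score = SequenceMatcher(None, query, entry).ratio()
        if score >= threshold:
            scored.append((score, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Hypothetical glossary headwords.
entries = ["སྟོང་", "སྟོང་པ་", "བཀྲ་ཤིས་བདེ་ལེགས།"]

for score, entry in rank_matches("སྟོང་", entries):
    print(f"{score:.2f}  {entry}")
```

One caveat: the comparison runs over Unicode code points, not over syllables or stacked letters, so scores for Tibetan may not line up with what a reader would call "close". A syllable-aware comparison (splitting on the tsheg first) might rank better, but I have not tested that.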

?
:stuck_out_tongue:

Interesting translation, especially as སྟོང་ means empty in Tibetan. Suspect the translation part picked up some other encoding than bo (Tibetan).

FYI I’m trying to find a good app for Tibetan translators that deal with lots of unstructured glossaries and TMX records (not ready to write this myself yet…).

Interesting translation, especially as སྟོང་ means empty in Tibetan. Suspect the translation part picked up some other encoding than bo (Tibetan).

Haha! I just roughed that in manually, as a proof-of-concept. I’m not a Tibetan language scholar, so it was just words typed as an example.

No problem, let me know if you find a good template. I suspect there are others who would like to make DevonThink databases as open-ended language dictionary systems. Or use བཀྲ་ཤིས་བདེ་ལེགས། for hello!

I suspect there are others who would like to make DevonThink databases as open-ended language dictionary systems.

If so, it’s definitely a new contingent of users. :slight_smile:

Better than grep. (Or ripgrep that I’m using).

What’s better than grep?

Oh, you have not found ripgrep yet? :-).

Never heard of it.

Also, professionally speaking, I rarely use third-party tools like this for work purposes. Dependencies lead to fragile situations and extra, uncontrollable variables. If it’s something our development team or Apple hasn’t included out of the box, I will only rarely use (or advocate) it.

Pretty much, if it involves statements like, “You just need to download this first…” or "Open Terminal and run brew cask…", I shy away from it for all of you. :slight_smile:

Fun stuff off to the side? For sure.
Fun for everyone, including people expecting us to explain why it’s not working or worse, to fix it? Nope. :stuck_out_tongue:

Speaking as an ex-fruit company engineer for 16.5 years, we used a lot of tools outside Xcode.
ripgrep is robust, about 10x faster than grep or more, and it is written in a proper language, Rust.

But I can’t recommend this to Tibetologists who could barely open a terminal window. That’s why I’m looking for alternatives, or in the worst case I’ll have to write one myself.

Understood.
Yeah, tools are fun indeed. Fun and often terribly dangerous :stuck_out_tongue:

I have to think of a time when a command line tool bit me… hmm, can’t remember. Maybe I’m getting too old… If it does not work, then it just crashes. Most tools won’t format the hard drive or install viruses. And if they are curated, on GitHub, widely used and so on, it’s hard to sneak in something nasty.

Especially with ripgrep, whose author also wrote the regex support in Rust.

Years ago, I misplaced a single forward slash in an rsync command and arrogantly didn’t add -n or --dry-run before testing it… and blew away three years of research in about 30 seconds :open_mouth: :stuck_out_tongue:

I agree, I don’t use rsync for that reason, it’s too powerful and requires careful testing. Depending on the copying it’s only useful for shallow trees with few files being updated. Otherwise it eats up time where a plain cpio pipe works better - such as copying files in larger build systems.

Now, if I remember correctly, Xcode uses rsync internally (unless they removed it…). It’s a common trick in large builds for moving files around, where cpio pipelines (see above) might work better. Anyway, now we are digressing…