Semantic Search in DT (Not Generative AI)

mmoren10 · January 24, 2024, 2:28pm

There have been a few recent posts (and heated debates) on the integration of generative AI in DEVONthink. Privacy and sub-optimal results have come up as cons.

I am no expert in the topic, but I think it might be worth distinguishing between mere search/retrieval of relevant documents and generation of a response based on the retrieved documents. The first one might fit DT better, avoiding privacy and quality (hallucinations) concerns.

DT is currently my best bet for finding a document in my ever-growing collection. Wildcards and proximity operators (although unreliable when the query has several words) are extremely useful. Perhaps the recent AI boom could make DT search even better.

Again, no expert here, but I imagine using local embeddings to improve DT’s search and ranking algorithms. I do not want to chat with my documents, but to find every single document (or even better, a specific chunk within a document) that’s semantically relevant to my search terms. A simple example: searching for “contract” would retrieve documents or specific chunks that contain semantically similar terms, like “agreement”.
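To make the idea concrete, here is a minimal sketch of embedding-based ranking. It is only an illustration, not how DT works or would work: the three-dimensional vectors in `TOY_EMBEDDINGS` are made up by hand to stand in for what a real local embedding model would produce, and `rank_chunks` is a hypothetical helper name.

```python
import math

# ASSUMPTION: toy, hand-written vectors standing in for the output of a real
# local embedding model. The numbers are invented for illustration only.
TOY_EMBEDDINGS = {
    "contract": [0.9, 0.1, 0.0],
    "agreement": [0.8, 0.2, 0.1],
    "invoice": [0.5, 0.7, 0.1],
    "recipe": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_chunks(query, chunks):
    """Hypothetical helper: order chunks from most to least similar to the query."""
    query_vec = TOY_EMBEDDINGS[query]
    scored = [(cosine_similarity(query_vec, TOY_EMBEDDINGS[c]), c) for c in chunks]
    return [chunk for _, chunk in sorted(scored, reverse=True)]


if __name__ == "__main__":
    # "agreement" ranks first even though it shares no characters with the query.
    print(rank_chunks("contract", ["recipe", "invoice", "agreement"]))
```

With these toy vectors, “agreement” ranks above “invoice” and “recipe” for the query “contract”, which is the behaviour a keyword or wildcard search cannot give. Everything stays local: no text is generated and nothing leaves the machine.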

Would that be possible and, unlike generative AI for now, within DT’s future plans?

Thx!

cgrunenberg · January 24, 2024, 2:54pm

We might consider this for future releases but as usual no promises.

mmoren10 · July 3, 2024, 9:39pm

Resurrecting this thread to share a free app that uses local embeddings and might be a useful reference when considering a feature like this: BeyondPDF