DT4b2 with LM Studio, observations and questions

My normal usage of DT is on engineering and other databases, but with DT4 I’ve been giving LM Studio a try on a 64GB Mac Studio. Beyond document summaries and interactive chat-style probing of a subject, DT4 plus an LLM appears able to do more targeted web searches than would be possible with a traditional search engine. I have two general questions about using DT4 with a local LLM:

  1. With databases - it seems the most efficient use of the LLM is to sift for the relevant documents with a DT4 word search, then select the bundle of them and hope the selection fits within the context limit of the LLM and hardware. Are there other ways to use an LLM with the database - perhaps with scripting? I couldn’t find a list of the DT4 LLM commands - just “perform_web_search.”

  2. With web searches, there is the AI setting to choose which general sources it uses. There doesn’t seem to be a global setting for how much searching I want it to do - à la DEVONagent. Have other users here developed general strategies for depth of search when they don’t know which particular URLs to aim at?

General observation: With 64GB of memory and 50GB allocated to VRAM, I can efficiently run 32B-parameter models, and even the 70B-parameter Llama 3.3 with a constrained context. I have had a model generate a report containing a hallucinated source - from a journal that doesn’t have an issue in the month that was “cited.” So, as expected, the user has to ride herd on the tool. But LLMs are a useful and impressive tool.

Which search options are enabled in Settings > AI > Chat? Scripting is of course possible; see the get chat response for message AppleScript command. Finally, what’s actually the desired usage scenario?
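As a minimal, untested sketch of that command (the prompt text and the dialog are just illustrations, and the exact parameter names should be checked against DEVONthink’s own script suite):

```applescript
-- Untested sketch: ask the configured chat model about the frontmost document.
-- Assumes DEVONthink 4 with a chat model already set up in Settings > AI.
tell application id "DNtp"
	set theRecord to content record of think window 1
	set theAnswer to get chat response for message ¬
		"Summarize the key points of this document." record theRecord
	display dialog theAnswer
end tell
```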

The depth depends on Settings > AI > Chat > Usage and on the context window of the model. Depending on the model and the prompt, multiple web searches might be performed (e.g. Claude 3.7 Sonnet does this frequently after being told to perform deep web research and/or to discuss a paper/article/document in detail).

Hi Christian:

For working with the database, here’s an example: I want to know about the characteristics of an amplifier circuit architecture called “nested Miller” - the variations of that architecture, and the range of performance of certain parameters of each variant. A Boolean phrase search in DT4 could find the journal articles, but their number might be large enough to exceed the LLM context length (presently limited in DT4 to 32K, which I think is something like 35,000 words). So - I might need to have the LLM go through each article sequentially with some scripting commands. I don’t know how to do that - maybe your site has some scripting tutorials with LLM usage examples. Also - I’m not sure why DT4 constrains itself to 32K when many smaller models can handle 128K. If I were scanning through an e-book, the larger length could be helpful.

For web searches, there’s not much information about where it’s specifically going if I have “Web” and “Wikipedia” checked as selections. I find that when I prompt it in the chat to “cite sources,” depending on the model, it will give me generic root URLs and often stale “error 404” links. With this, it’s hard for me to tell whether the LLM is drawing on its own training or actually using the info it finds when I tell it to “perform_web_search.” I do see (not always, depending on the model) the LLM form a search string and state that it’s searching the web. That the depth of the search is related to the context length doesn’t really tell me much in quantitative terms. Maybe there is a way to have the LLM list every URL it drew information from? I haven’t figured out how to do that yet.

As for DT4’s AI settings - I’ve only used the “Default” Role. Per the DT4 Help document, it looks like I need to go through the AI Assisted Scripting section to learn to handle larger numbers of documents, and possibly assistance in setting conditions on when to continue or stop web searches. I haven’t seen a glossary of all the scripting commands available.

Also in settings, the “Custom” field is pre-filled with a template I’ve not touched. It starts with “Make a summary of %@ …” I assume that %@ is the chat prompt response, but the syntax is unfamiliar. Is there guidance or examples of prompt templates?

Thanks,

Steve

Actually, DEVONthink doesn’t contain such a limit; you can enter larger context windows in Settings > AI > Chat, but only if the size is supported by the model and configured the same way in LM Studio (only Ollama supports specifying the context window via the API). And of course larger context windows require much more time.

Usually invalid links are not search results, as DEVONthink both anonymizes and simplifies (item) links and email addresses before sending them to LLMs, then restores the originals in the response - especially as LLMs have a hard time handling things like session identifiers, UUIDs, or complex parameters in URLs.

Just drop the DEVONthink.app onto the Script Editor.app in the Dock or Finder to view its complete script suite.
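For the “go through each article sequentially” idea above, something along these lines might serve as a starting point (an untested sketch - the search query, the prompt, and the parameter names are illustrative and should be verified against the script suite):

```applescript
-- Untested sketch: run one chat request per search result, so no single
-- request has to fit every article into the model's context window.
tell application id "DNtp"
	set theResults to search "\"nested Miller\""
	set theReport to ""
	repeat with theRecord in theResults
		set theAnswer to get chat response for message ¬
			"List the nested Miller variants and key performance figures discussed." record theRecord
		set theReport to theReport & (name of theRecord) & ":" & linefeed & theAnswer & linefeed & linefeed
	end repeat
	set the clipboard to theReport
end tell
```

Looping per document trades wall-clock time for staying inside the context limit; the collected answers could then be summarized in a second pass.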

This is only used by Edit > Summarize via Chat… and Tools > Summarize Documents via Chat… (in the second case only if Settings > AI > Chat > Summaries is set to Custom).
