Therefore the performance of a small, local model like Gemma 3:12b isn’t really surprising in the end.
As discussed before, I have been using gemma-3-27b-it for the last few tests.
Tool calls (and also, e.g., errors) are logged to the file ~/Library/Application Support/DEVONthink/Chat.log.
I checked this in conjunction with the console logs from LM Studio. The log file says this:
2025-04-10 12:39:28,981 INFO: Element Labs (gemma-3-27b-it): 1854 input, 120 output tokens used.
2025-04-10 12:39:28,982 INFO: Element Labs (gemma-3-27b-it): Tool call 'perform_database_search' (642877806): {
    query = "name:Lochinvar AND (conveyancing OR completion OR \"property transaction\")";
}
2025-04-10 12:39:42,148 INFO: Element Labs (gemma-3-27b-it): 1733 input, 136 output tokens used.
Clearly this is in response to these log items from LM Studio:
2025-04-10 12:39:17 [INFO]
[LM STUDIO SERVER] Running chat completion on conversation with 6 messages.
2025-04-10 12:39:22 [INFO]
[LM STUDIO SERVER] Accumulating tokens ... (stream = false)
2025-04-10 12:39:28 [INFO]
[LM STUDIO SERVER] [gemma-3-27b-it] Generated prediction: {
  "id": "chatcmpl-6nmlgx0adxgbkrn628zsd",
  "object": "chat.completion",
  "created": 1744285157,
  "model": "gemma-3-27b-it",
  "choices": [
    {
      "index": 0,
      "logprobs": null,
      "finish_reason": "tool_calls",
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "642877806",
            "type": "function",
            "function": {
              "name": "perform_database_search",
              "arguments": "{\"query\":\"name:Lochinvar AND (conveyancing OR completion OR \\\"property transaction\\\")\"}"
            }
          }
        ]
      }
    }
  ],
  "usage": {
    "prompt_tokens": 1854,
    "completion_tokens": 120,
    "total_tokens": 1974
  },
  "stats": {},
  "system_fingerprint": "gemma-3-27b-it"
}
2025-04-10 12:39:28 [INFO]
[LM STUDIO SERVER] Client disconnected. Stopping generation... (If the model is busy processing the prompt, it will finish first.)
2025-04-10 12:39:28 [INFO]
[LM STUDIO SERVER] Running chat completion on conversation with 8 messages.
2025-04-10 12:39:34 [INFO]
[LM STUDIO SERVER] Accumulating tokens ... (stream = false)
2025-04-10 12:39:42 [INFO]
[LM STUDIO SERVER] [gemma-3-27b-it] Generated prediction: {
  "id": "chatcmpl-7kav12aqkmaph6z3x52rpm",
  "object": "chat.completion",
  "created": 1744285168,
  "model": "gemma-3-27b-it",
  "choices": [
    {
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Okay, I apologize for the previous overly specific queries. It seems we're still not finding what you need. \n\nLet’s try a broader approach. Could you please provide me with any text related to this house purchase? Even a few sentences or keywords from a relevant document would be helpful. I can then use that information to search more effectively, or even attempt to extract the solicitor's name and timeframe using natural language processing if enough context is available.\n\nAlternatively, do you remember *where* in DEVONthink you might have stored documents related to this purchase (e.g., a specific group)? If so, I can focus my search there.\n\n\n\n"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 1733,
    "completion_tokens": 136,
    "total_tokens": 1869
  },
  "stats": {},
  "system_fingerprint": "gemma-3-27b-it"
}
2025-04-10 12:39:42 [INFO]
[LM STUDIO SERVER] Client disconnected. Stopping generation... (If the model is busy processing the prompt, it will finish first.)
2025-04-10 13:39:42 [INFO]
Unloading model gemma-3-27b-it due to TTL expiration.
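For reference, the round trip these logs show appears to be the standard OpenAI-style tool-call loop: the first completion ends with finish_reason "tool_calls", the client executes the tool and appends the result, and the second completion (now 8 messages) produces the text answer. A minimal sketch of that loop, assuming LM Studio's OpenAI-compatible endpoint on localhost:1234; the tool name comes from the log, but run_database_search and everything else here is hypothetical, not DEVONthink's actual code:

```python
# Sketch of the tool-call round trip visible in the logs above.
# Assumes LM Studio's OpenAI-compatible server on localhost:1234;
# run_database_search() is a hypothetical stand-in for DEVONthink's
# real 'perform_database_search' implementation.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

tools = [{
    "type": "function",
    "function": {
        "name": "perform_database_search",  # name as seen in Chat.log
        "description": "Search the document index with a boolean query.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

def run_database_search(query: str) -> str:
    return json.dumps({"results": []})  # hypothetical stand-in

messages = [{"role": "user", "content":
             "What was the name of the solicitor who handled the house "
             "purchase for Lochinvar?"}]

while True:
    response = client.chat.completions.create(
        model="gemma-3-27b-it", messages=messages, tools=tools)
    choice = response.choices[0]
    if choice.finish_reason != "tool_calls":
        print(choice.message.content)  # the final text answer
        break
    # The model requested a tool call: execute it, append the result
    # as a 'tool' message, and loop for the next completion.
    messages.append(choice.message)
    for call in choice.message.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": run_database_search(**args),
        })
```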
Not sure what can be learned from this log, but it seems to me that the tool call itself succeeds. What still makes no sense, I think, is that the part name:Lochinvar assumes that “Lochinvar” must be part of the file name. I don’t know why this is assumed…
A general comment: since you don’t have a vector database (any reply to my query above about this?) and rely on your own index queries instead, successful queries will need to be very… I don’t know, detailed? Creative? You know, to simulate the fuzzy, relation-based results you would get from a vector database. So I’m surprised by what I see here so far. The query above makes the inexplicable assumption that the search string must be part of the document name, which kills all results in one swoop. (I had another example where I asked about invoices for camping products, and the query I saw assumed that there was a tag “camping”. Hm.)
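Just to illustrate what I mean by fuzzy, relation-based retrieval: with embeddings, a document about conveyancing and solicitors scores high for a question about a house purchase even without sharing a single literal term. A minimal sketch, assuming the sentence-transformers package; the model name is just a common default, and the sample documents are made up:

```python
# Minimal sketch of fuzzy retrieval via embeddings, as a contrast to
# literal index queries. Assumes the sentence-transformers package;
# the model name is a common default, not a recommendation.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Made-up sample documents.
docs = [
    "Completion statement from Smith & Co Solicitors re: Lochinvar.",
    "Invoice for camping gear: tent, sleeping bag, stove.",
]
query = "Who was the solicitor for the Lochinvar house purchase?"

doc_emb = model.encode(docs, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

# Cosine similarity: the conveyancing document ranks first even though
# it never contains the literal phrase "house purchase".
scores = util.cos_sim(query_emb, doc_emb)[0]
for score, doc in sorted(zip(scores.tolist(), docs), reverse=True):
    print(f"{score:.2f}  {doc}")
```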
And while you can see that there are a few possible terms queried at the same time (i.e. conveyancing, completion, “property transaction”), the model only started doing that after I sent it this prompt:
Well… I think you’re doing this wrong. I’m talking about a house purchase, but why would you think that legal documents about this would contain the specific string “house purchase”?
Prior to that, “house purchase” was the only string it was searching for at all! So… what I would expect is that the query would combine lots of relevant synonymous terms to cover as much ground as possible. I ran a test and asked my own model (still gemma-3-27b-it) for interesting search terms like this:
Let’s say you have a large database of indexed documents. A user says to you “What was the name of the solicitor who handled the house purchase for Lochinvar? And in what timeframe did this process take place?” – it is now your job to answer this question, and the first step is to search the index for terms that are relevant to the query. Show me at least 20 such terms, but add more if you think they’re relevant. Cover all aspects of the query posed by the user.
I won’t repeat the results here (you can run it yourself), but they are pretty good even though they need refining, and so I should think that a more useful query would perhaps look like this:
Lochinvar AND (house OR estate OR conveyancing OR completion OR property OR transaction OR purchase OR transfer OR solicitor OR lawyer OR attorney OR "legal counsel" OR "completion date" OR "exchange of contracts" OR "date of completion" OR "offer accepted" OR mortgage OR "stamp duty" OR "property tax" OR "land registry" OR "title deed")
I wonder why such detailed queries are not created. I ran this one and it returns pretty good results. I’m not saying it’s the end of the story; it needs refining again to exclude irrelevant items and so on… but the contrast to the non-functional queries I actually see running is pretty severe.
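Generating such a broad query automatically doesn’t seem hard, either. A rough sketch of what I have in mind, again assuming LM Studio’s OpenAI-compatible endpoint; expand_query and the condensed prompt are my own invention, and a real implementation would need to validate the model’s JSON output:

```python
# Sketch: let the model expand the user's question into search terms,
# then assemble one broad boolean query like the example above.
# Assumes LM Studio's OpenAI-compatible endpoint; all names here are
# hypothetical, not part of any existing API.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

def expand_query(question: str, anchor: str) -> str:
    prompt = ("List at least 20 short search terms relevant to this "
              "question, as a JSON array of strings, nothing else:\n"
              + question)
    response = client.chat.completions.create(
        model="gemma-3-27b-it",
        messages=[{"role": "user", "content": prompt}],
    )
    # A real implementation would validate this; models sometimes wrap
    # JSON in markdown fences or add commentary.
    terms = json.loads(response.choices[0].message.content)
    # Quote multi-word terms, OR them together, and AND the result with
    # the one term that must appear (here: the property name).
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return f'{anchor} AND ({" OR ".join(quoted)})'

print(expand_query(
    "What was the name of the solicitor who handled the house purchase "
    "for Lochinvar? And in what timeframe did this process take place?",
    anchor="Lochinvar",
))
```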
Anyway, enough said. I understand this is a work in progress, which is why I’m spending time offering suggestions and feedback. I hope this functionality can be evolved to the point where queries such as mine are simply answered without problems! And please consider using a vector database!