Hi! First off, I've owned DT for a while, but the lack of AI integration has been off-putting.
With it, DT is potentially game-changing… once the wrinkles have been ironed out.
I'm trying to understand why my AI queries yield different, inconsistent results.
For example: "Look at the list of bills in the bills directory and make a list of the amount due for all the bills." There are about 68 bills in that directory. The first query gave me 2 bills in response, and the second gave me back 7. I'm trying to understand if there is anything I can do to at least make the results consistent. Am I exceeding an individual context window? I wonder if that could be tracked per query, with a warning when it is exceeded.
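Something like the back-of-the-envelope check below is all I mean by "tracked with a warning". A rough Python sketch; the folder path, the 128k window, and the chars-per-token heuristic are all assumptions on my part:

```python
from pathlib import Path

BILLS_DIR = Path("~/Bills").expanduser()  # assumed location of the bills folder
CONTEXT_TOKENS = 128_000                  # e.g. a 128k-token model; varies by provider
CHARS_PER_TOKEN = 4                       # rough heuristic, not a real tokenizer

total_chars = sum(len(p.read_text(errors="ignore")) for p in BILLS_DIR.glob("*.txt"))
estimated_tokens = total_chars // CHARS_PER_TOKEN
if estimated_tokens > CONTEXT_TOKENS:
    print(f"Warning: ~{estimated_tokens:,} estimated tokens exceed the "
          f"{CONTEXT_TOKENS:,}-token window; documents will likely be truncated or dropped.")
```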
As it stands, I don't think the results can be trusted.
I use AI a lot in other domains, and as things stand it would be simpler to just drag the files into an external chat window and query them there.
For queries like the one above, it looks like Google AI (Gemini Flash) is the way to go, as I can only assume the context window is the issue. It wasn't smart enough to collate the information and create a file, though; I had to use Pro for that.
Perhaps the data could be fed to the AI in a way that wouldn't burst the context limit of the smarter AIs.
So perhaps my actual request is: could we please have better error reporting, rather than silently limiting the output of an AI query, so we can better debug why something isn't working as expected?
Other than that: bravo. Clearly, NOT folding in some kind of LLM integration would have been a death knell for DT. I appreciate this might not have been an easy decision.
AI is optional and there are many people who don’t want to use AI in DEVONthink at all. And DEVONthink 4 has not become “an AI application”. It has some access to external AI as a complement to its already powerful (and still improving) internal AI.
Also, LLMs are not static repositories of information. Their responses can and do vary, so the same response from multiple queries shouldn't be expected. Can some operate that way? Sure, some may… at times. But certainly not all, all the time.
PS: Did you read the Getting Started > AI Explained section of the Help?
All I wanted to do was extract the "Issue Date:" from each bill and rename the bill with it.
If I asked the LLM to do it, it would do it perfectly, but only for a few files before stopping. So I asked the LLM to describe EXACTLY how to set up a smart rule, or EXACTLY the process it used for its steps. Sadly it could not, and I burnt through hours trying to get it to work. In the end I found someone else on this forum who had done something similar. (Side note: it still took me a little while to understand the intricacies.)
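For what it's worth, because each bill carries a literal "Issue Date:" label, a plain script can do the rename deterministically without any LLM. A rough sketch; the folder path and date format are assumptions:

```python
import re
from pathlib import Path

BILLS_DIR = Path("~/Bills").expanduser()  # assumed location of the bills
# Assumes each bill's text contains a line like "Issue Date: 2024-03-17"
DATE_RE = re.compile(r"Issue Date:\s*(\d{4}-\d{2}-\d{2})")

for bill in BILLS_DIR.glob("*.txt"):
    match = DATE_RE.search(bill.read_text(errors="ignore"))
    if match:
        # Prefix the filename with the extracted issue date
        bill.rename(bill.with_name(f"{match.group(1)} {bill.name}"))
    else:
        # Report failures instead of stopping silently
        print(f"No issue date found in {bill.name}")
```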
I'm really just asking if there could be better error reporting for AI queries. I don't care that the responses differ (I get it; I use LLMs A LOT), but I do care that they are correct. The individual responses were correct, but the overall result was very incomplete, and that isn't always going to be obvious. TO BE CLEAR: the "inconsistent" aspect of my query is that the number of files processed varied, not the processing of any individual file.
As a programmer I'm thinking: each note is much smaller than the context window, and I'm not asking the LLM to do anything that requires the context of all the files (i.e., just extract a date and rename the file). So I'm wondering whether multi-file queries could be tweaked to make this work. Perhaps ALL the files are being loaded into one context window before the query is executed?
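To make the per-file idea concrete, here's roughly what I have in mind, sketched against an OpenAI-style chat API (the model name, prompt, and folder path are placeholders):

```python
from pathlib import Path
from openai import OpenAI  # any OpenAI-compatible client works the same way

client = OpenAI()                          # assumes OPENAI_API_KEY is set
BILLS_DIR = Path("~/Bills").expanduser()   # hypothetical path

amounts = {}
for bill in sorted(BILLS_DIR.glob("*.txt")):
    text = bill.read_text(errors="ignore")
    # One call per document: each prompt stays far below the context limit,
    # no matter how many bills are in the directory.
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{
            "role": "user",
            "content": f"Extract the amount due from this bill. Reply with the amount only.\n\n{text}",
        }],
    )
    amounts[bill.name] = reply.choices[0].message.content.strip()

for name, amount in amounts.items():
    print(f"{name}: {amount}")
```

Each call carries one document, so 68 bills means 68 small prompts rather than one oversized one, and every file gets processed.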
Please don't take the "death knell" comment personally. Your application is excellent, but in my line of work my overall productivity and enjoyment have improved many-fold from using AI. Compute and storage costs will be such that you can just dump ALL your files into a RAG (or whatever), and you'll probably be able to replicate a lot of the search functionality in many PKM systems. I suspect many in this space are finding ways to reinvent themselves as designers of AI-augmented PKM systems.
A smart rule and the document date placeholders might actually be sufficient, and they would be much more reliable. Also, it is unclear which AI you are using.
Who are “expert users”? That would be a highly subjective title to apply to someone, and likely only to our CTO. And our AI is proprietary, predating LLMs by two decades.
And no offense detected or taken. Just clarifying not everyone is a fan of or wants to use AI, and it is not the core focus of DT4. Also, if you read the Help section I pointed you to, you’ll see you’re not going to be “dumping all your documents” into some RAG setup and making queries about them all.
Yep, but it was a useful test case to see how well the LLM is integrated into your system. The point of the query wasn't really the query itself, but whether it could do it at all, i.e., am I going to use your system or look for something else?
About using smart rules and date placeholders: for sure, that would be more reliable.
Honestly (for me at least), the knowledge of how the results of Scan Text are reused later wasn't easy to find. I did find it on the forum, and then everything was straightforward, but the AI wasn't able to give me correct answers on how the data was captured. (And yes, I did look at your Help.)
I do hope we can get the ability to query multiple documents. In the UI, it feels natural to select a few documents or a group and ask questions about them, just as you would with one document. I understand you're feature-locked for 4.0, but it would be a nice feature that doesn't have to wait for 5.0.
Well, yes, but I'd think the typical user has related documents that all have some text content, e.g., research reports about the same industry, a folder of news articles about a local issue, or a folder of bills as in the OP's case.
It does work now, but I had to try the various AIs. IMO the error reporting needs a tune-up: currently a failing query terminates silently, with nothing output to the log.
All I ask is that we at least have error reporting. I fully accept that LLMs come without warranty, but if they are there, we are going to use them.