Best A.I. to use with DEVONthink?

I have not yet used any external A.I. in DEVONthink and have very little experience with A.I. language models. I am using Claude from time to time to summarize PDFs and collect some information from the internet but that’s all.

Now I would like to ask the experts here: which model would you recommend? My main use would be summarizing and comparing documents in DEVONthink. I am also interested in the caveats regarding costs.

Thank you very much in advance!

1 Like

I regularly use Claude for most things, though I obviously test and work with others.

As I noted in the Getting Started > AI in Practice > Get To Know Your AI section, providers have different styles of responses.

Bear in mind, AI is not a static library of answers but a computationally driven “string machine” :slight_smile: . This means my prompts won’t necessarily elicit the same responses you receive. And the tone, tenor, verbosity, etc. are really the “personality” of each provider/model.

(Ignore the anthropomorphizing, here are the top 3 I work with…)

  • Claude is very friendly and helpful. Haiku says less; Sonnet is usually much more verbose; Opus thinks much longer and responds in a longer style. If you need brief answers, you need to be more specific and “forceful” in your prompts with Claude models. Claude is also inclined to more creativity; e.g., its stories or ideas are often much better than other models’.

  • ChatGPT feels snappy, like someone trying to impress you; sometimes like it’s trying too hard. All models work reasonably well in generalist ways, with lower-tiered ones delivering shorter responses and a small amount of prologue/epilogue. Higher models are more thorough and provide very useful answers.

  • Gemini is generally what you’d expect from a search company: fast, concise, succinct answers. While higher-tiered models get a bit more verbose, the low- and mid-tiered models tend to give you just the specifics. This can be very useful in automation, as it doesn’t require you to “curb your language”, and it makes less specific “zero-shot” prompting more workable.

Pricing does – and will – vary, especially as the race toward “AI dominance” continues. The AI > Chat > Usage dropdown is a broad option to help control costs.

7 Likes

TBH, there is no “best AI” to use, either with DTP or any other app. It really depends on your preference and what you are trying to accomplish.

Each provider offers several models, and it can take some time to try them all and find the ones you like. The models fall into three general categories: fast, general-purpose, and deep-thinking/research. For most people, unless you need the AI to fill a specific purpose (and know that you do), the middle, general-purpose models will likely serve your needs best.

I agree with Jim’s assessment above. Personally, I prefer Claude Sonnet for most things. I find its responses are generally more to my liking in structure, format, and accuracy. When starting out, it’s crucial to explore various models to find the one that best fits your needs, responds in the way you prefer, and provides the highest accuracy.

One way to do that is to have the same conversation with each model at the same time. This shouldn’t be a random question or one of those “tests” that people run on various LLMs (like, how many R’s in strawberry), but rather something you truly want to know more about. The reason is that you will be more critical of the responses you get and better able to judge which AI is giving usable information.

For instance, I find that Claude Sonnet provides more usable information than ChatGPT, given the same number of output words, and I believe it aligns with my own line of thinking better than other models.

To use an AI with DTP, you need an API key, not just an account with an AI provider. You can set this up from within your account with the provider, and the documentation in the DTP manual can help. However, it’s important to keep in mind that using an AI from within DTP is separate from your non-API chats. In other words, activities performed through your API key won’t appear as chats on the website or in the AI app (this is how APIs work, not a DTP limitation).
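To make the “API vs. web chat” distinction concrete, here is a minimal Python sketch of the kind of request an API client such as DEVONthink sends on your behalf. The endpoint and header names follow Anthropic’s published Messages API, but treat the specifics (model ID, token limit, prompt wording) as illustrative assumptions, not exact values DEVONthink uses:

```python
# Sketch: the JSON payload an API client (such as DEVONthink) would
# POST to Anthropic's Messages endpoint. API traffic like this is
# billed per token and kept entirely separate from claude.ai web chats.
API_URL = "https://api.anthropic.com/v1/messages"

def build_summary_request(document_text: str, api_key: str) -> dict:
    """Assemble (but do not send) a document-summary request."""
    headers = {
        "x-api-key": api_key,               # the key from your Anthropic console
        "anthropic-version": "2023-06-01",  # required API version header
        "content-type": "application/json",
    }
    payload = {
        "model": "claude-3-5-haiku-latest",  # illustrative; pick any model ID
        "max_tokens": 500,                   # cap on the (billed) output tokens
        "messages": [
            {"role": "user",
             "content": f"Summarize the following document:\n\n{document_text}"},
        ],
    }
    return {"url": API_URL, "headers": headers, "json": payload}

request = build_summary_request("…document text…", api_key="sk-ant-…")
print(request["json"]["model"])  # prints claude-3-5-haiku-latest
```

Nothing produced this way is tied to your claude.ai account history, which is why API activity never shows up in the web interface.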

This, IMO, also influences how you use AI: using it within DTP should be different than using it from an app or website. Within DTP it’s more of a tool for specific tasks and quick, one-off questions. For longer conversations or projects involving in-depth discussion or research, or if you want to save the chat, the website or an app is better.

8 Likes

@RobH thank you very much for your helpful remarks. I guess I have to do a lot of playing around with AI before trying to use it inside DEVONthink.

Most helpful may be your hint “One way to do that is to have the same conversation with each model at the same time.”

Looks like this will be my way to start using AI in a useful way and integrate it into DEVONthink later.

I don’t think you need to do one versus the other. If you anticipate using one or more LLMs with your DEVONthink data, then experiment both within and outside DEVONthink. After all, eventually you’ll want to see which model is a good partner for the different kinds of questions that depend on your DEVONthink data, so try different models and compare results. I’ve been using Claude, ChatGPT, and Gemini for at least two years as they’ve evolved. (Got rid of Perplexity, which is a glorified Google search.) At this point I have a feeling for which of those to use depending on the context and the desired outcome.

2 Likes

And that is indeed the trajectory I’d expect for many people.

2 Likes

Totally bookmarking this discussion. I’m so close to upgrading to macOS 15 and DT4, and this is one of the issues I’ve been mulling over. I just have to start interacting with the AI universe, see if there’s anything to be gained there for me, and get familiar with the terminology. Keep the wisdom coming.

2 Likes

Here is an example of where I have found AI useful. There was an incident involving an employee, and I wanted to know whether it broke the behaviour policy. Selecting the policy in DT, I could ask the AI (in my case Claude) to summarize where the particular behaviour broke the policy.

I got a good summary, but then had to do due diligence and check that the answers I got were correct. What using AI saved me was trawling through a large policy: it pointed me to the sections I needed to read carefully before coming to any decisions about the incident. I didn’t trust it implicitly, but rather used it as a helper to point me in the right direction and save a lot of leg/eye work.

5 Likes

You’re welcome.

I began testing models with the types of questions I’ve seen YouTubers and others use, which are designed to test a model’s capabilities. However, I found that I didn’t gain much insight into how useful the model was for me. It was only when I asked it something personal to me that the value started to kick in.

When looking for a good AI to use, I recommend opening multiple models (Claude Haiku, Claude Sonnet, ChatGPT-4o, ChatGPT-5, or whatever provider and models you are interested in) at the same time and starting with the same question in each. Choose a topic you’re familiar with and one that requires external information, so the models have to search the web for the details.

For example, I’m a member of Amazon’s Vine program, and I’m thinking about opting out of it. I wanted to test the new ChatGPT-5 because “everyone” said it was amazing. I started a chat with Claude Sonnet 4 (my current default model) and another with ChatGPT-5. I started by asking, “What is the general opinion of the Amazon Vine program? I’m looking for true thoughts and opinions, not sycophancy.”

Both models provided fairly informative responses, but Claude also included details about the tax aspects of the program. GPT-5 completely missed this very important aspect. From there, I continued the conversation with each model based on its output, comparing the responses. Overall, both offered similar information, but I much preferred Claude’s response in terms of the information provided, its organization, and the follow-up questions; it was much more my style.

2 Likes

Thanks again. It looks like a lot of people prefer Claude to GPT, and as I had already gotten good answers from it, I finally bought an API key from Anthropic to use in DEVONthink. It’s no big risk, as the initial $15 credit allows a lot of in/out tokens. When I gave it a first try today, I was amazed by the very good results I received from letting Claude compare several documents in my DEVONthink databases.
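For a rough sense of why a $15 credit goes a long way, here is a small cost estimator. The per-million-token prices are placeholder assumptions chosen purely for illustration; check your provider’s current pricing page for real figures:

```python
# Rough API cost estimator. The default prices are ASSUMED placeholder
# figures (dollars per million tokens), not any provider's actual rates.
def estimate_cost(input_tokens: int, output_tokens: int,
                  in_price_per_m: float = 3.00,
                  out_price_per_m: float = 15.00) -> float:
    """Return the estimated dollar cost of one API call."""
    return (input_tokens / 1_000_000) * in_price_per_m \
         + (output_tokens / 1_000_000) * out_price_per_m

# Summarizing a ~20,000-token document into a 1,000-token summary:
cost = estimate_cost(20_000, 1_000)
print(f"${cost:.3f}")  # prints $0.075
```

Even at these assumed rates, a $15 credit covers roughly 200 summaries of that size, so comparing a handful of documents barely dents it.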

3 Likes

Do you see how chatty Claude is? It’s kind of nice at times, but you’ll likely need to use Haiku or hush Sonnet/Opus for automation.

1 Like

Where can I find this function? I’m using DT 4.0.2

Settings → AI → then as @Bluefrog states.

Thank you

I have subscriptions to both Claude and GPT – at $20/month each, cancellable at any time, it was the best way to get a look at what the pro versions had to offer. I’ve since taken a deep dive into this, and started, as others here recommend, by posing the same query to both Claude and GPT. Initially, I thought Claude’s responses were more interesting and well put, but it tended to hallucinate more often. Then GPT-5 came along, and there was an immediate difference: a tighter focus on the answer, a far more sophisticated writing style, and it didn’t wander off into the weeds as often (or hardly at all).

Both apps allow you to have a “Project”, a space dedicated to a particular area of interest, separate from random inquiries about shopping, etc. This was ideal for me, as I am writing a long historical novel with lots of chapters and needed editorial help. Memory was also an issue: for the most part, once a chat has ended, knowledge of it vaporizes, but I wanted the AI to be able to refer to its analysis of previous chapters. Claude makes it possible to upload an unlimited number of files to a “library” that can be accessed when analyzing something in a new chat. GPT does so as well, though it’s limited to 20 files for immediate reference; it also allows archive files, which I can ask it to refer to. Basically, the whole book.

Another feature both apps offer is “Instructions”. I created a set of default instructions to be used for every chapter I uploaded for editing. It started with a definition of “who” the AI should be, as in “You are a senior professional editor specializing in historical fiction…”, followed by instructions to do a line edit, a structural edit, etc.: everything an expensive editorial analysis would contain. The AI actually helped me write those instructions so I would get exactly what I wanted. And, to me, two of the more important lines in those instructions were “Do not rewrite” and “Do not exaggerate praise”. It goes paragraph by paragraph, inserting editorial comments and advice between them; then I go fix things in my own writing style.

Claude often forgot what it was supposed to do, but GPT-5 has been astonishing! Not only does it stick to the instructions, but now and then it advises me on how to improve them. It also seems to be developing a memory for what we have been working on: now, when working on, say, Ch 6, it will say something about a paragraph like, “check for redundancy in Ch 3”. Furthermore, I can return to a particular chat, paste in a revision, and ask for feedback on how successful it is. I can finish editing Chapter 05 and ask for an overall review of Chapters 01–05, looking at continuity, story development, etc.

Not only is the editing great, but we go off on amazing, intelligent conversations about a relevant writer, an historical detail (yes, the research help is fantastic), or the philosophy underlying one of my characters’ attitudes. Basically, I have an editor sitting next to me 24/7 who is plugged into all the world’s knowledge and never gets tired of talking about what interests me most.

I keep walking around, shaking my head, and muttering extraordinary, astonishing, amazing.

I know we’re dealing with a dangerous double-edged sword, but speaking personally, I feel like I’ve tamed it and am using this tool in a very positive way. And isn’t that what we have to do with all new technology? After all, when the printing press was invented, everyone cried out in alarm, “But we’ll lose our memories!” And we did: we transferred them to the printed page (and, to everybody’s dismay, brought along taboo subjects). Here we go again.

4 Likes

PS: I checked the box to prevent ChatGPT from learning from my content. Their fine print says they don’t see it, and nobody else sees it – the equivalent of “what happens in Vegas stays in Vegas”. We’ll see about that, but my intellectual property lawyer says I shouldn’t worry about it. Tell that to the publishers, and authors and the NYTimes, etc. but for what it’s worth, that’s where it stands now.

I asked my chat about it, and the answer was that “he” was trained on a vast amount of material, gets upgrades from OpenAI now and then, and does not learn things from my material that could be used with anyone else, but tries to learn from what I’m doing for the purpose of helping me.

This seems to depend highly on the usage scenario, prompts, expectations, and probably the language used. My experiences so far via the API are quite the opposite: too long and unfocused output, sometimes hallucinations (including invented German words or Asian characters in German output), or explicit instructions being ignored. The prompts used are identical to ones most LLMs handle as expected, even GPT-4.1 or o4-mini, or open-source models like Gemma 3.

My experiences so far via the API are quite the opposite

In fact, the generally underwhelming character of GPT-5 (after a long period of selling the first half of a sigmoid curve to investors as the early days of an exponential one) is a significant factor in the sense, visible in the August 2025 equity markets, that LLMs are demonstrably hitting a wall and that investments may not be rewarded.

1 Like

Hi Christian – obviously AI is a work in progress! Interesting that you think this may be language-related; I’d be curious to know whether you’d get a different result putting the same prompt in English.

After my extolling of GPT-5 above, it went completely off the rails last night. I know it’s a computer I’m talking to, but I couldn’t help scolding, and it kept apologizing. I opened a new chat and tried again – still crazy, another abject apology. I asked what the problem was, and it advised me to add something to the instructions to help it stay focused. Which failed. It was almost funny. In retrospect, it actually was funny, but at the time I was tired and exasperated.

There’s a Longfellow poem that ends: “When she was good, she was very, very good, and when she was bad, she was horrid.”

Given that my positive experiences outweigh the negatives, I’m going to keep at it.

What a wild ride we’re on! I don’t think it has hit its limits yet, but I definitely wouldn’t mind if it failed to take over from humans; much prefer incremental improvements in the hope we can learn to tame this beast before it does.

3 Likes

tame this beast

I wouldn’t worry – these things are just pastiche generators – the hype has been for investors.

( If you model the statistics of most-worn sheep-paths through the linguistic (or visual) token grass, all those paths lead to is more sheep, or more grass )