Really impressed by AI implementation in DT4 beta

Charles56 · April 7, 2025, 4:47pm

I’ve been using DT4’s AI implementation, or portions of it, for a couple of days now. I’m really pleased. I’ve been going back through my collection of PDFs on a specific academic topic, producing terse summaries and key points lists and also posing questions about the articles and saving ChatGPT’s answers. It’s been a really helpful way to bring myself back up to speed on the articles I’ve worked with this way. Of course, I had already read them and, in many cases, annotated them in some detail. But I felt like DT4’s AI component helped nail down a number of ideas which I’d mentioned in my notes. I’d say it brought some structure to what had begun to seem like a huge morass of information. It’s been great not to have to leave DT in order to work with ChatGPT, and it’s been really helpful to be able to easily save ChatGPT’s responses alongside my PDFs and existing notes.

So far I’ve been using ChatGPT 4o and ChatGPT 4.5 (preview). When I have some spare time, I might experiment with the other available models, since I know each model has its own strengths.

Thanks, DEVONthink devs!

BLUEFROG · April 7, 2025, 4:50pm

Thanks for the nice feedback!
Do note 4.5 is pretty pricey

As of 04.2025…

Charles56 · April 7, 2025, 4:52pm

Thanks for mentioning the price difference. For my purposes, I think I can get by just fine with ChatGPT 4o and possibly 4o-mini. I just wanted to try out the cutting-edge tool for a bit.

cgrunenberg · April 7, 2025, 5:02pm

The chat assistant is also able to summarize the annotations of PDF documents, by the way.

Charles56 · April 7, 2025, 5:14pm

Now my problem is that I’m already hooked on DT 4 and don’t really want to go back to DT 3, as nice as it is. But DT 4 is installed on a test machine, and I can’t spend the day there, just an hour or so here and there. The rest of my life is on the machine where I use DT 3.

I suppose if that’s my biggest problem in life, I’m doing OK.

rkaplan · April 7, 2025, 5:27pm

+1 to what Jim says about the cost of OpenAI 4.5.

For querying academic PDF documents, I think you will find Claude 3.7 to be the sweet spot between good analysis, minimal hallucination, and reasonable price.

Grogol · April 7, 2025, 5:50pm

Same boat. I’d love to know when it might be OK to throw everything at DT4, and whether anyone here has already done that.

Charles56 · April 7, 2025, 5:51pm

I’ll test that out. Thx.

Charles56 · April 7, 2025, 5:52pm

I’ll try Claude in the next day or so. I haven’t seen obvious hallucinations in the way I’m using ChatGPT with these PDFs, maybe because I’m not turning it loose on the entire web.

BLUEFROG · April 7, 2025, 6:07pm

The beta is stable, and some people have already done it, but that does not relieve you of the responsibility for making the choice, no different than ignoring any other warnings.

And if you haven’t read it yet…

rkaplan · April 7, 2025, 6:13pm

To test for hallucination, ask questions for which the document has no answer.

Charles56 · April 7, 2025, 6:27pm

Ha! I’ll try that. So, it can’t admit that it just doesn’t know something?

On the other hand, not long ago I asked ChatGPT about a bit of epic oral poetry from Siberia, and it told me it needed more information to give a good answer. (I’m not pretending to be a specialist in Siberian epic poetry; I was only proofreading a friend’s doctoral thesis/dissertation.)

Grogol · April 7, 2025, 6:33pm

Thanks, @BLUEFROG Jim, yes, I tried it out under those conditions and was suitably impressed, hence wanting to go all in. It seems to help remove the process I use now, throwing PDFs at apps outside DT, and seems to be offering some real benefit from AI, rather than some of the less well-thought-through bolt-ons elsewhere.

BLUEFROG · April 7, 2025, 6:34pm

Just make sure your local backups are on schedule, for my sanity as much as yours, and make your decision.

NickLowe · April 7, 2025, 7:50pm

There seem to be a lot of people here who are all in on DT4 now, and no reports that I’ve seen of any data loss or corruption. I threw caution to the winds on Saturday and it hasn’t so far blown back in my face. But obviously make sure your backup regime is bulletproof.

rkaplan · April 7, 2025, 8:53pm

There is a parameter called Temperature which I think defaults to 0.3 - that allows some creativity/variation in the LLM response. Setting the Temperature to 0 will reduce hallucination though the response may not appear as “human.”
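As a rough illustration of what the Temperature setting does, here is a minimal Python sketch of how temperature is commonly applied when an LLM picks its next token. It is not DEVONthink's or any provider's actual code; the function name and the example scores are hypothetical, and treating a temperature of 0 as "always pick the top token" is an assumption about typical behavior.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw token scores into sampling probabilities at a given temperature."""
    if temperature <= 0:
        # Assumption: temperature 0 is treated as greedy decoding, so all
        # probability goes to the single highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
example_logits = [2.0, 1.0, 0.5]

for t in (1.0, 0.3, 0.0):
    probs = softmax_with_temperature(example_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures push more of the probability onto the top-scoring token, which makes responses more repeatable. It does not by itself make the top-scoring token factually correct, so checking citations still matters.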

I have found that when querying academic journals, an LLM's hallucinations can often be subtle but significant, so you always need to be on guard. One of the more common situations I have seen is where the AI generally knows the answer to your question but does not have an academic reference to support it. Thus it may answer your question correctly but fake the citations.

What is really interesting, though, is that a fake academic citation may be subtle enough to fool you. I have seen both ChatGPT and Gemini provide a reference with the name of an author known in the field but with the article name and details fake - the familiar name can be enough to fool even someone knowledgeable in the field involved.

rkaplan · April 7, 2025, 8:54pm

Data integrity seems as good as DT3. Be cautious though with letting the AI change your metadata or content. That is the one area where a misunderstood AI prompt might lead to some unexpected results.

celsee · April 7, 2025, 10:19pm

Like others, I went ahead and just jumped into the “realtime” environment, knowing I can revert to backups if something catastrophic happens. Haven’t had any issues yet.

jonmoore · April 8, 2025, 6:46am

@BLUEFROG

Regarding specific LLM integration. I’ve increasingly been using Gemini, which was already excellent with “v2.0 Flash Thinking”, and since last week have moved to using “v2.5 Pro”, which is relatively cost-effective in comparison to premium OpenAI models.

It would be great if the next beta update to DEVONthink 4 could add v2.5 Pro to the default AI options (2.0 Pro is the current maximum Gemini configuration available in DT4’s settings).

On a separate but connected note, my PDF workflow now includes NotebookLM+ analysis, as this has very generous usage quotas as part of the Google One subscription. It would be great to see NotebookLM integrated into DT4. I don’t believe that Google offer a direct API for NotebookLM integration, but all that’s needed in the short term is the ability to submit PDFs directly to NotebookLM from DT4. Integrating analysis findings back into DT4 could be accomplished via the Clip to DEVONthink web browser extension, although, in my case, I create summary PDFs first before integration back into my knowledge management databases, as NotebookLM has no problem creating detailed chapter-by-chapter summaries of references containing 1000+ pages.

cgrunenberg · April 8, 2025, 6:49am

This is indeed planned for an upcoming beta.