OpenAI in DEVONthink?

I can understand your frustration reading through this thread.
As for the comment about incompetence of doctors, all I will say is that very uncommon conditions will usually have a delayed diagnosis, and AI will definitely help doctors and their patients. Anyone who thinks otherwise simply does not know what they don’t know, and is probably not a doctor.
The DT team will come on board eventually, I’m sure, when the time is right. It’s important to understand, though, that one of the benefits of DT is its privacy protections, so uploading any particular set of documents would have to be selective and specified. Privacy would be my principal concern.
As for how AI works, yes it was designed to predict the next words, but there is also an element of mystery how it functions, and the black box nature of it is a bit unsettling. Interesting to see how it evolves given the incredible leaps forward we have seen.


A few years ago Google’s DeepMind developed AlphaZero, an AI chess engine (that is, a program tasked with finding the best move in any given position) that they claimed was better than the best “conventional” chess engine, Stockfish. They achieved that by using much better hardware for their own AI engine in the competition. However, Lc0, the open-source engine modeled on AlphaZero, has not been stronger than Stockfish (which is also open-source) in engine tournaments that mandate comparable hardware for all participants.

AI is hyped. AI is impressive. AI seems to be the future. But AI systems are not necessarily better than their conventional, non-AI counterparts, especially when you do not own and operate a data center.

A media trick being employed more and more these days: collect a “fact” from some sheet or Substack or whatever, and then the “fact” is endlessly replicated as “news.” This is made all the worse by the devaluation and/or laying off of actual journalists or qualified researchers.


My personal rule of thumb: If there is any hype boiling, wait until things have cooled down before making important decisions.

I think the artificial intelligence that is already built into DEVONthink is enough for most use cases, especially because it works in a way that does not stop me from thinking for myself (as ready-made, well-formulated replies tend to do).

And nothing stops us from asking ChatGPT or Bard or any other AI if we want; it’s only two clicks away. But I wouldn’t buy an upgrade that lets an external AI read all my documents, no thanks!


Whatever will be part of a future version, it’s going to be optional and has to be activated first by the user (and then used via commands, smart rules, scripts or whatever the user has in mind).


I’d like to share a little bit more about my interest in AI working with DT. There are many compelling arguments above, both for and against. And yes, my eyes are wide open about receiving back incorrect information. But say you are an expert on the subject you are researching, and are looking to squeeze out any missed ideas from your collection of research and personal writings. I expect AI to return as much as 99% garbage to me, and because I am the authority on the subject I can often tell hallucination from reality. No matter what you get back, you need to fact-check thoroughly! That said, once in a while an AI system will give you a kernel of an idea, a thread to pull on, that will lead you someplace new and great. But it won’t happen without putting in the hard work. No one will ever be replaced by AI. Read why in John Seely Brown’s thoughts on centaur chess. What I’m searching for is also similar to Steven Johnson’s ideas about serendipity in research.




I started digging. I’m not sure exactly what the technical approach would be. It might be that we extract all text from the PDFs (recursively; ChatGPT can write the script for this), clean it (we might need to pay humans for manual review and correction), then “fine-tune”. It might be too big to undertake; I’m not sure, because I’m too much of a novice at ML.

But I found these.

And then ask questions of the model using LM Studio.
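
For what it's worth, the "extract recursively, then clean" part could look roughly like the sketch below. It is only an illustration under assumptions: `find_pdfs`, `clean_text` and `build_corpus` are hypothetical names, the cleanup rules are guesses at what a first pass might need, and the actual text extraction is left as a function you pass in (a library such as pypdf could fill that role). Fine-tuning itself is not covered.

```python
import re
from pathlib import Path
from typing import Callable, Dict, Iterator


def find_pdfs(root: Path) -> Iterator[Path]:
    """Yield every PDF below root, recursing into subfolders."""
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() == ".pdf":
            yield path


def clean_text(raw: str) -> str:
    """Light, automatic cleanup; manual review would still be needed."""
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)  # re-join words hyphenated across lines
    text = re.sub(r"[ \t]+", " ", text)          # collapse runs of spaces and tabs
    text = re.sub(r" ?\n ?", "\n", text)         # drop spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)       # keep at most one blank line
    return text.strip()


def build_corpus(root: Path, extract: Callable[[Path], str]) -> Dict[str, str]:
    """Map each PDF's file name to its cleaned text.

    `extract` is supplied by the caller (assumption: it returns the raw
    text of one PDF), so this sketch does not depend on any PDF library.
    """
    return {p.name: clean_text(extract(p)) for p in find_pdfs(root)}
```

Keeping the extractor as a parameter means the folder walking and cleanup can be tried on a handful of files before committing to a particular PDF library or paying for manual correction.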

I absolutely hate the “all it does is guess a string of letters” viewpoint. Knowing how something works behind the scenes doesn’t render it useless.

You ask the AI a question. It looks at your words, how they’re positioned relative to each other, how punctuation modifies them, then builds a query. It then checks its database of years or decades of acquired knowledge, finding where the answer for this query could be. It pulls from the sources it has, then generates an answer. It then looks through its language database, uses knowledge of how words pair together both locally and in the context of a greater sentence or paragraph, and it constructs a sentence. It also references the original query to determine which emotion to use in the response, which it adds in by looking at everything it knows about how emotion modifies the words and structure of a piece of communication. Done.

You ask a human a question. They listen to your words, how they’re positioned relative to each other, how punctuation modifies them, then build a query. They then check their brain full of years or decades of acquired knowledge, finding where the answer for this query could be. They pull from the knowledge they have, then generate an answer. They then look through all of their memory and knowledge of language, use that knowledge of how words pair together both locally and in the context of a greater sentence or paragraph, and construct a sentence. They also remember the original query, determining which emotion to use in the response, which they add in by remembering everything they’ve learned since birth about how emotion modifies the words and structure of a piece of communication. Done.

It’s easy to get caught up in our own intelligence, forgetting that the word salad that pours out of our mouths doesn’t just magically appear. We have our own internal database of sources that tell us about language, past events, emotion, etc. We get things wrong all the time when we don’t have the right information. When we aren’t sure of something, we often hallucinate an answer. In our case, though, it’s called “guessing”. Our emotions are built on the information we’ve accumulated on how to respond to certain input. That’s why a toddler will throw a screaming fit when frustrated, while an adult may just take a breath and try again. The adult has more data on how to handle situations where the correct answer can’t be found.

Again, it’s easy to say “I know how AI works. It just tries to determine word order based on patterns.” Having knowledge of the inner workings makes it easy to dismiss it. But our brains work the same way; we just don’t think about it because it happens behind our own scenes. Much like AI doesn’t think about how it works unless you make it reference that data by asking about it.

I still think that there’s a difference between stringing words together because of some probabilistic inference and understanding the meaning of the words. And there are already enough examples out there showing that current “AI” does not understand anything.
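
To make "stringing words together by probabilistic inference" concrete, here is a deliberately tiny sketch: a bigram model that counts which word follows which and then samples the next word from those counts. This is an assumption-laden toy, not how ChatGPT or any real system is built (those are neural networks working on tokens with far more context); the function names and the little corpus are made up for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probs(counts, word):
    """Turn the follower counts of one word into probabilities."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: n / total for w, n in followers.items()}


def generate(counts, start, length, seed=0):
    """String words together by repeatedly sampling a likely next word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:  # dead end: this word was never followed by anything
            break
        words, weights = zip(*followers.items())
        out.append(rng.choices(words, weights=weights)[0])
    return " ".join(out)


corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigrams(corpus)
print(next_word_probs(model, "the"))  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
print(generate(model, "the", 6))
```

The model produces plausible-looking word sequences without any notion of what a cat or a mat is, which is the distinction being argued about here; whether that distinction still holds at the scale of current systems is exactly what this thread disagrees on.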

Also, there’s a difference between “getting things wrong” and hallucinating. That might have become a bit blurry with certain politicians recently (think the infamous 350 million £ available to the NHS after Brexit). But lying is still not the same as erring.


I’d be interested in hearing how you view “understanding”. From my view, when you understand a word you know what it means, how it’s used, how it can be casually used even if it doesn’t match the exact definition, how it impacts the subject you’re speaking about, the history that informs how the word is used, and of course when it’s appropriate to use the word.

All of that is achieved by AI. It has all the information and queries it when needed. It even knows how the word has been used over time, so in the case of language evolving, slang, etc., it figures that out too, based on the knowledge available to it.

How do you see the concept of understanding as different? (Legit curious, as it’s an interesting concept)

The net is full of funny examples. Just take this one:

Well, it does not. Trivial: It can, at best, have the digitized information that it is allowed to see. Which is, I’d guess, far less than “all”.

Understanding is not about a single word. Take “ball” or “suppe” (German for either “soup” or, colloquially, a dense fog) – neither I nor an “AI” would know what it means out of context. And no, I can’t tell you what “understanding” is – but we both know, I think, that the average English-speaking person could tell you how many “e”s there are in the word “ketchup”. Because we understand the question.
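
For comparison, the letter-counting task itself is trivial for a conventional program that works on characters. A minimal sketch (the helper name `count_letter` is made up for illustration):

```python
def count_letter(word: str, letter: str) -> int:
    """Count how often a letter occurs in a word, ignoring case."""
    return word.lower().count(letter.lower())


print(count_letter("ketchup", "e"))  # 1
```

One commonly cited reason chatbots stumble here is that they process text as multi-character tokens rather than individual letters, so the miss may say more about their input format than about "understanding" either way.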


And just now…

LOL :stuck_out_tongue: I guess it needs some help counting those "e"s

On a side note: While the first response is legitimate, it’s 100% literal. It also shows how AI doesn’t extrapolate: because (1) it doesn’t understand anything, and (2) it has no experiential framework to know what I’m asking, it can’t go beyond the question and infer the intent of my query. Whether written or spoken, most first or second graders would be able to answer, or at least understand the question. :slight_smile:


You’re also quoting out of date information. GPT-4 is fully capable of counting letters.

It understands the question exactly how it’s supposed to.

As for having all the information, I meant the information available to it. Just like no human has all the information out there, just what’s available to them.

As for “ball”, you’re correct that it’s not just about a single word. It’s about the context, which both AI and a human would use to inform their response.

It is difficult to define what “understanding” means, but I feel like my attempted definition is pretty solid. I can’t think of a definition that doesn’t match up to what both AI and humans can do.

In GPT’s defense, I was confused too. It reads like a riddle. Without the context of our counting conversation, it looks like an incomplete sentence.

That being said, it looks like you also potentially used GPT-3.5. Here’s what I got when I asked GPT-4 out of the blue, in a new conversation:

It knows what you want it to do, but is aware that there isn’t any context to it. If I was sent that message, I’d also ask for more context.

I also did an experiment to see how it would handle being asked that after asking it to count “e”s in another word. The output is reasonable:

(Ignore that middle part, I accidentally hit the “tell me more” button.) Directly following a counting of the “e”s, when you ask how many “e”s are in between, it reasonably assumes you mean in the original word, which is the same assumption I would make.

Agh, accidentally used the same screenshot twice. Here’s the intended second screenshot:

I don’t see any AI going to a library and requesting a book or a journal. Nor calling a person to ask them something. Humans have more ways to gather information. And they sometimes know that they don’t know, instead of simply inventing. Humans can also understand by conducting an experiment, for example.

What is the purpose of calling a friend here? It’s to gain information you don’t already have in your brain’s data set. The newest AIs have access to the internet and absolutely call when they need information.

As for checking out a book, again, what’s the point? It has data you need to add to your data set as you haven’t read it yet. AI has access to hundreds of thousands of full books already processed inside its model. There’s no need for it.

With all that though, I will say, it rarely ever gives up. Even if it doesn’t have internet access. It will tell you “I don’t have that info. Here’s historical context with what I do have, and you can check out these books/sites/etc for more up to date info.” With internet access it of course looks it up for you. So it does know when it’s reached a limit of its information, it does reach out for help, and it even tries to give you the info needed to get to the answer.

Forgot about the experiment thing. All an experiment does is give more data points. AI already has access to way more data than we do. The only reason it doesn’t experiment is because it doesn’t have the tools. It can already interact with documents and other input. It can output images, video, and audio. All you need to do is hook it up to something that takes in input and spits out output. It already knows what needs to be compared, it just sends it into the tool and reports what comes out. As many times as needed for more data.

It’s not a limit of AI, it’s a limit of what we’ve hooked it up to so far.

Or it will simply hallucinate. Which seems to happen more often than saying “I don’t have that info”.