A Proposal for the Integration of DEVONthink and ChatGPT API

cgrunenberg · May 3, 2023, 12:37pm

Which data do you actually have in mind?

thother · May 3, 2023, 1:00pm

My 2c. But first, for context. I’m a long time DEVONthink user and fan. If I’m cranky on here about issues it’s just my personality; please know that I am frankly in awe of the scope and endurance of this project. Dt, Emacs, and Firefox are basically the only software I regularly use.

ALSO, I have a very strong personal and professional interest in what’s going on with AI, I spend a lot of my recreation time fiddling with downloading models from huggingface and chatting with OpenAI’s model both through the browser and via api.

ALL THAT SAID:

The situation here is fluid and moving quickly. We are at a point where it is probably trivial to set something up scripting wise for Dt to interact with ChatGPT. But it’s an open question what things are going to look like in 6 months, a year, etc. This tech may end up being locked down so that only big players like MSFT can incorporate it into their software. Or we may be looking at a world where good enough models are running on everyone’s box as part of the OS. Who knows. In the meantime, I would not advise anyone to build a business model around assumptions about what OpenAI’s api is going to look like pricing or access wise in 2024.

rmschne · May 3, 2023, 1:21pm

See from @thekok

pacificera · May 3, 2023, 5:41pm

Meanwhile, I’m enjoying using the SmartConnections chatgpt plugin for Obsidian:

rkaplan · May 4, 2023, 12:31am

I have not seen anyone suggest a business model that depends on OpenAI.

But pretty soon software that does not integrate with AI may be at a disadvantage. That is clearly the way the industry is moving at present.

thother · May 4, 2023, 10:49am

The entire thread is about integrating their api with Dt, so I’m kind of confused by your comment. In any event, the way the industry is moving I think not having some kind of rushed to market integration with an LLM is going to be a positive differentiator soon.

I am very excited about the potential of these models. But the current context window size and per token pricing scheme puts a pretty hard limit on what is useful to do with the 10G+ buckets of pdf we’re all lugging around. And that’s what we’re all dreaming about. The “write a letter to mom” stuff is going to be baked into your favorite text editor by the end of the month, if it hasn’t been already.

thother · May 4, 2023, 10:56am

A little confused about this Obsidian plugin. It uses the user’s api key, but advertises GPT-4. Surely that’s available only if the user has api access to 4? I’m still on the wait list for the api, and they appear to be rolling it out pretty slow; they put a cap on ChatGPT usage.

chrillek · May 4, 2023, 11:06am

Not to curb your enthusiasm, but I seem to remember “the industry” moving towards Blockchain before. And towards XML. And towards XHTML. And to SOAP.
Just because something is talked about a lot does not mean that it is financially or technically viable.
That’s not to say that these text analyzing programs are not going to take off. But “industry” interest in them is not a strong argument for their eventual success, IMHO.

thother · May 4, 2023, 11:15am

If I’m wrong, someone please correct me, but as I understand it the issue is the context window, which is currently vanishingly small for GPT3 and a bit bigger for GPT4 (the 32k version seems huge rn, but…) The way “chat with your pdf” and the Obsidian plugin linked about work is they put a larger text corpus in an indexed database, translate your chat prompt into search on the database, and then feed the search results only to the LLM, ie., much smaller snippets of text. For some use cases this could be sort of useful. But I currently cannot, for example, ask the LLM to interact with an entire legal opinion meaningfully. Much less the average user’s Dt database.

thother · May 4, 2023, 11:17am

I think I’m the one with the curbed enthusiasm here, and hard agree.

rkaplan · May 4, 2023, 12:49pm

That stuff was theorized by the computing media. But very few apps actually added those features.

With AI take a look at ProductHunt - the number of new and existing apps with AI features is stunning.

rkaplan · May 4, 2023, 12:55pm

Yes that is an issue. Short-term you can work around that with recursive summarization. GPT-4 will also have a larger context window. And surely it will continue to grow.

No question a context window that can hold an entire PDF is the holy grail for many uses and that has not arrived yet. But still there are significant uses of AI at present.

If legal applications are of interest, certainly current AI is advanced enough that you are at a disadvantage against your opponents if you do not use AI in addition to standard legal research techniques. It’s not advanced enough at this point to replace Lexis and other traditional legal tools; but at the same time it clearly has capabilities right now which surpass anything else out there for legal research.

[For legal research BTW - Bing AI Chat is far superior to ChatGPT since it has access to the Internet and gives references. Perplexity.AI integrates ChatGPT with the internet and is another option. The soon to be widely released plugins to ChatGPT may turn out to be superior to either of those options.]

rkaplan · May 4, 2023, 12:59pm

Obviously false. Betting the farm on AI is of course a bad strategy. But software is going to be dinged for not simply having the option to interface with AI.

Like so much other technology, the tool itself is neither good nor bad; it’s a matter of how it is used. Give the user the credit and option to choose what is best for his use case.

thother · May 4, 2023, 2:25pm

As I said, I am very interested in AI and am actively exploring its use. On the other hand, I don’t think my contention that the market is about to be flooded with poorly thought out, rushed integrations to the api is “obviously false.”

To the legal point, historically my main advantage against my adversaries in litigation has been the fact that I take the time to actually read and understand cases in their entirety and in context instead of just looking at whatever section of the opinion the search engine points me at, and let’s just say I don’t expect that to change any time soon as a result of AI uptake given current constraints. And currently, the problem with scaling the context window has been that the resource use scales exponentially, although I think there are interesting results out there that may change that soon. Re progressive summarization, I’m extremely dubious, although some of the stuff re AI assisted context input compression looks very interesting.

I like perplexity’s results a lot more than Bing, but my understanding is both use more or less the same OpenAI product? Bard is the differentiated product in search. I have access to the web plugin for ChatGPT as well and it is…slow. Even without search you can get some pretty impressive results, although the truth is what you are getting is basically a summary of blog posts on law firm websites as opposed to anything I would call actual legal research. And I am already seeing evidence that they are training the open access models not to answer legal questions, with I’m sure a “safety” justification but it’s difficult not to be aware that they are working on products to monetize the AI for legal research and don’t want the competition from an open model.

rkaplan · May 4, 2023, 2:52pm

Sure some of the AI integrations are likely of questionable use, but surely that is not a negative for the software - just don’t use it if the feature is not helpful to you. I do agree though that in the software market where AI is 99% of the purpose, e.g. AI “blog writers” to write 5,000 blog posts a day, many of those will fail - as they should.

I agree 10,000% about reading cases in context and in detail… FWIW, I am not an attorney but rather a subject expert in litigation. I find Bing and Perplexity (and some custom scripts I have written for my specific uses cases) to help me find medical or legal citations that I otherwise would not have found or would not have found as quickly. Yes, I still have to read them completely and sometimes they turn out to not be useful- just as with a Google, Lexis, or PubMed search.

I like the Perplexity.AI interface as well but it operates by feeding finite results of a Bing web search into ChatGPT. Bing AI Chat on the other hand has access to all of the Bing search database. So I find on obscure points Bing AI Chat tends to be more likely to find stuff that Perplexity.AI does not.

Yes it is all changing and there may turn out to be business reasons to stratify access to data. Though it may become more costly, from a business/professional perspective if someone can make professional search data more easily/accurately available then I would be glad to pay that cost; the ROI no doubt will be worthwhile.

thother · May 4, 2023, 2:56pm

Interesting re the distinction between Perplexity and Bing, I was wondering why they give such different results. Can you say more about what you are using this for? Do you work as a retained expert in litigation matters relating to e.g. some sort of engineering specialty?

rkaplan · May 4, 2023, 2:59pm

I am a retained expert in medicine related to future treatment needs of individuals involved in catastrophic injuries (“life care planning”) and related to estimation of life expectancy.

rkaplan · May 4, 2023, 3:12pm

Forgot to comment on this.

Bard has no access to the Internet and I know of know plug-in or workaround or API to resolve that.

For this and other reasons, I consider Bard to be [surprisingly] light years behind ChatGPT. A very surprising fail for Google.

pacificera · May 4, 2023, 3:16pm

I think using the DT corpus as multi-shot ammo to ChatGPT is worth exploring.

thother · May 4, 2023, 3:16pm

BTW, I work with life care planners all the time.

And yeah, you’re mostly right about Bard, except of course that it has “access to the internet” in the sense that it’s originally trained on a massive, unspecified amount of material from Google’s vast storehouse. It does turn your results into a proposed search input, which is basically what Bing and Perplexity are doing, it just isn’t turning around and feeding search results back into the model. It’s an interesting question whether they are doing that intentionally crippled or whether they are holding back something that is integrated more tightly.

Also, of course, we have no real idea how much AI is going on behind the scenes in generating “normal” search results at Google, but one can assume quite a bit.