A Proposal for the Integration of DEVONthink and ChatGPT API

As for Bard: aside from the issue of Internet connectivity, it does not give any references as to where its information comes from (just like ChatGPT without a plugin or extension). For that reason it's DOA for information gathering, although it may be useful in many cases for summarizing or for editing suggestions.

I mean, this is one of the issues with these models: the information doesn't "come from" anywhere except the entire training set. Attribution is a whole complicated problem that we will be litigating in the IP space for years.

True - but Bing AI Chat and Perplexity do a great job offering references to sources.

Thus I consider those to be the only potential options at this time if using AI to search for information.

ChatGPT (and maybe Bard, a distant second) may be useful for editing or summarization, but not for seeking actual data or references.

Besides the described use case of helping to organise documents, it would be useful to be able to query a bunch of documents and extract data from them in the form of CSV / Excel / JSON.

Using the invoice database example, I can imagine I might want to categorise each item in my invoices for further analysis. This could be as easy as asking ChatGPT something along these lines: "Check the invoice below and fill in the following JSON for each invoice line: {"company" : "<the company I'm paying to>", "amount": "<the net amount I paid>", "item_category":"subscription, purchase_physical, purchase_virtual, shipping, purchase_services", "date":"<invoice_date>", "service":"<what kind of service I'm getting in 3-4 words>"}".
I could obviously ask my accountant for that info, but asking ChatGPT is just faster, and sometimes I don't need exact values.
You can go wild with the simple ability to first filter the documents (using embeddings or ChatGPT) and then ask ChatGPT to do the data extraction and formatting for you; a rough sketch of that extraction step follows below.
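
To make the extraction step concrete, here is a minimal sketch using the OpenAI Python client. The model name, prompt wording, and `extract_invoice_lines` helper are illustrative assumptions, not part of any existing DEVONthink integration, and in practice you may need to strip Markdown fences from the reply before parsing it.

```python
# Minimal sketch: send one invoice's text to the chat API with the JSON
# template above and parse the reply. Model name and prompt are assumptions.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Check the invoice below and fill in the following JSON for each invoice line: "
    '{"company": "<the company I\'m paying to>", "amount": "<the net amount I paid>", '
    '"item_category": "subscription, purchase_physical, purchase_virtual, shipping, purchase_services", '
    '"date": "<invoice_date>", "service": "<what kind of service I\'m getting in 3-4 words>"}. '
    "Return only a JSON array, with one object per invoice line."
)

def extract_invoice_lines(invoice_text: str) -> list[dict]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any capable chat model will do
        temperature=0,
        messages=[
            {"role": "system", "content": "You extract structured data from invoices."},
            {"role": "user", "content": f"{PROMPT}\n\n{invoice_text}"},
        ],
    )
    return json.loads(response.choices[0].message.content)
```

From there, dumping the collected rows to CSV or Excel is a one-liner with the csv module or pandas.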

If you combine this with reliable PDF/HTML-to-Markdown conversion, so that ChatGPT consumes PDFs in a better format, you get quite a useful tool that can do the text manipulation and data extraction. What is even more interesting is that once you have the data extracted, you can ask ChatGPT to produce a summary for you.

I recently spent 30 minutes trying to turn a PDF of my child's blood tests into a Markdown table so that I could feed it to GPT-4 for an initial analysis. It was as accurate as the checkup with the physician the next day :). But the ability to save those 30 minutes through good PDF/HTML-to-Markdown conversion is something that could be part of DT, as the medical checkup ends up there anyway.
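
For what it's worth, here is a rough sketch of that conversion step. It assumes the pypdf and markdownify packages; note that plain-text extraction from a PDF usually loses table structure (which is exactly the painful part), so a dedicated table extractor may still be needed.

```python
# Rough sketch of PDF/HTML to Markdown-ish text. pypdf only yields plain text
# (tables generally need a dedicated extractor); markdownify handles HTML.
from pypdf import PdfReader
from markdownify import markdownify as md

def pdf_to_text(path: str) -> str:
    """Extract plain text from a PDF, page by page."""
    reader = PdfReader(path)
    return "\n\n".join(page.extract_text() or "" for page in reader.pages)

def html_to_markdown(html: str) -> str:
    """Convert an HTML document to Markdown."""
    return md(html, heading_style="atx")
```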

This is way more exciting now that I can run quite powerful models on my M1 Pro without any network access. I can see ChatGPT being a stepping stone to a really powerful and private assistant that can be built later on, once the use cases are figured out.

Yeah, that is about right. It is one way to get around the length limitation of the transformer, but it is a bit more nuanced than that. Firstly, you are not searching using crude keyword search: those databases are able to understand the text and the query on a level similar to ChatGPT-3.5, so they are more accurate at finding the correct snippets than you could manage with regular keyword search. (The database could be used to improve DT's search as well.) Secondly, GPT-3 can/should be used to formulate the right search query, so that your prompt is trimmed to just the relevant sentences before being passed to the database to search for snippets. (Not sure if Obsidian is doing this, though.)
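
As a minimal sketch of the embedding idea (the model name, the chunking into snippets, and the in-memory search are assumptions; a real setup would use a vector database):

```python
# Embed the snippets once, embed the question, and return the closest snippets
# by cosine similarity. In practice the vectors would live in a vector database.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

def top_snippets(question: str, snippets: list[str], k: int = 5) -> list[str]:
    doc_vectors = embed(snippets)
    query_vector = embed([question])[0]
    # cosine similarity = dot product of L2-normalised vectors
    doc_vectors /= np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    query_vector /= np.linalg.norm(query_vector)
    scores = doc_vectors @ query_vector
    return [snippets[i] for i in np.argsort(scores)[::-1][:k]]
```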

But you are right: it is impossible to feed the whole database to ChatGPT, and this is a limiting factor in analysing a long legal opinion. The point is to try things out for yourself and find out what works and how to work around the limitations. The snippet search is only part of what can be done. Another way to work with long documents is to first ask ChatGPT to summarise them from some perspective and then do the analysis on the summary.
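
That two-pass idea can be sketched in a few lines (again, the model name and prompts are just placeholders):

```python
# Two-pass sketch: summarise a document from a given perspective, then run the
# actual question over the summary instead of the full text.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        temperature=0,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def analyse(document: str, perspective: str, question: str) -> str:
    summary = ask(f"Summarise the following document from the perspective of {perspective}:\n\n{document}")
    return ask(f"Using only this summary, {question}\n\nSummary:\n{summary}")
```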

It takes a considerable amount of time, even when you do it for a single document. But it is a skill that I think is worth mastering, and it would be lovely if DT would help us with this.

Regarding the "6 months perspective", open-source versions of ChatGPT are being built, and we can run the first versions of them on our laptops right now (OpenLLaMA, StableLM, OpenAssistant, to name a few). It won't be long before you have the option of ChatGPT-like responses running completely on your own system, which would alleviate much of the risk.
The OpenAI API gives you the ability to figure out what your work will look like in the future.

Yeah, I've been playing with the open models; they aren't ready for prime time just yet, but progress has been rapid. Progress is happening pretty fast with the enterprise models too, though! And I feel like we aren't 100% sure to what extent improvements in the models will adapt to the resource constraints of home systems. I frankly suspect that alternative models on various cloud platforms will be where the interesting growth is.

I get that embeddings are a little different. It's still a hack, though, and what you get isn't worth the effort right now for me. Like, I can feed the API a 3-page form through LangChain, and it's amazing that it can chat over the form, but frankly, if it's a 3-page document it's not worth the trouble as anything except a proof of concept. Right now. And I'm not convinced the skills I'm acquiring in the process are going to be useful to me next year; am I really going to be playing around in a Jupyter notebook to use this stuff in 2024?

Excellent examples - completely agree

I really hope not :). This is the whole point of this thread.

But maybe Obsidian is the way to go, as Markdown seems to be the format of choice for the time being.

I don't use Obsidian; I haven't gotten much use out of the "network of note snippets" approach they seem to embody. Are they doing anything that would help me, say, summarize a 200-page deposition transcript?

A 5-hour webinar yesterday, in Apple style, by box.com for their new "Box AI"

Huge marketing push: "Enterprise-class document AI at petabyte scale"

Incredible potential to shake up the AI world - and take on Dropbox at the same time

But when the dust settles… they have no clue when the product will be released, even in beta. And they have no clue how long a document it will actually be able to summarize.

So total vaporware

Disappointing but unsurprising.

Hopefully this makes you appreciate our “silence” and approach regarding tech, especially nascent tech. :slight_smile:

I can dream :slight_smile:

In the spirit of full disclosure, I also concede that while I was successful in writing a script to summarize short documents or abstracts (up to 4,000 characters), so far my follow-up effort to recursively summarize larger documents has not been practical, given response-time issues with the OpenAI API.
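
For reference, a recursive summarizer along these lines can be sketched as follows; the chunk size and model name are placeholders, and the repeated API calls are exactly where the response-time issues come from.

```python
# Sketch of recursive ("map-reduce") summarization: summarize fixed-size
# chunks, then summarize the combined summaries until the text fits in one
# request. Every chunk costs one API round-trip, hence the latency problem.
from openai import OpenAI

client = OpenAI()
CHUNK_CHARS = 4000  # roughly the short-document limit mentioned above

def summarize(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        temperature=0,
        messages=[{"role": "user", "content": f"Summarize the following text:\n\n{text}"}],
    )
    return response.choices[0].message.content

def summarize_recursively(text: str) -> str:
    if len(text) <= CHUNK_CHARS:
        return summarize(text)
    chunks = [text[i:i + CHUNK_CHARS] for i in range(0, len(text), CHUNK_CHARS)]
    combined = "\n\n".join(summarize(chunk) for chunk in chunks)
    return summarize_recursively(combined)
```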

I’ve tried it with the open source models and it’s just brutally slow, when you can even get it to work. OpenAI is sort of almost usable but you burn a lot of tokens to get basic answers you could just as easily get by scanning the document yourself. I do think there’s insane potential here, but it’s early days.

Speak of the devil: Transformers Agent.

That's an interesting experimental API at Hugging Face, but as I read it, it can answer questions about short PDF documents in image format but not long documents. Am I missing something?

haven’t played with it yet

Sounds like bullshit bingo at its best :wink:

Readwise's Reader app has an AI (called Ghostreader) built on GPT-3 that will summarise entire PDFs or parts of them (and perform other tasks). It is designed to do one job (help you read) and do it well. I think the future of AI is in this sort of focussed skillset, personally.

I just tried it on a book and it condensed the entire book into one paragraph in about 1 min. Obviously there’s not really any value in doing that with a book (the publisher already performs the same task on the cover), but you’d mentioned wanting to do it for big files so a book seemed an obvious test. I’ve thrown a few academic papers at it just to see how it works (I’m a scientist) and it’s fine.

It has far more useful applications, though (I think summaries are at best just a precursor to actual reading, and I'm not convinced they save me much time). It can offer in-text references, offer Q&As, etc. I have to say, though, that so far, other than playing with it, it hasn't actually changed how I read and think at all. I can see it being easily abused by someone wanting to get the answers without doing the work (though to be fair, if you just need to find out how to do X, that's a very reasonable use), but for many knowledge workers their value is presumably in the context, connections and ideas they produce, and you can't do that work without having done the reading yourself (i.e. you need to be an expert in your field, and AI can't do that for you).

On an entirely separate point, I agree that we have no idea how this is going to shake out in the wash and I think @thother has summed it up well, but no one in this thread has pointed out that much of the public does not quite support the use of this tech at present. Polls from the EU, for example, suggest the public wants far tighter regulation (e.g.). It's very likely that responsive democracies (the EU at least) will bring legislation, and we don't know how that would affect apps that integrate AI. Some countries have already banned ChatGPT (e.g. Italy), but I suspect that over the longer term the legislation will be far more nuanced, and tech companies that want to implement AI will have to figure out how to navigate that, and could potentially waste millions implementing an integration which ultimately becomes illegal.

Whilst I don’t think AI is a fad, it’s worth noting as well of course that every few years the tech industry heralds some new product as “the future”, and most of that has not come to pass.

You had me quite interested for a moment so I tried it out.

I understand and agree with many of your thoughts, in that the mainstream media is over-hyping AI, in particular by promoting its use for the wrong tasks. That said, summarization is immensely useful in specific, defined workflow situations. As but one example, a "Summary" that includes every reference to the word "angiogram" in a medical record, with links to those pages, can be extremely useful. That does not mean that a doctor should not read the whole record; but it does mean it is possible to very quickly get a big picture of which documents may have pertinent information in some cases.
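
To illustrate that kind of "summary", here is a small sketch that collects every occurrence of a term, with its page number and a bit of context, from a PDF using pypdf. The term, the library choice, and the context window are assumptions; an LLM step could then be layered on top of these hits.

```python
# Collect every occurrence of a term in a PDF, with page numbers and context.
import re
from pypdf import PdfReader

def find_term(path: str, term: str, context: int = 60) -> list[tuple[int, str]]:
    hits = []
    reader = PdfReader(path)
    for page_number, page in enumerate(reader.pages, start=1):
        text = page.extract_text() or ""
        for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
            start = max(0, match.start() - context)
            snippet = text[start:match.end() + context].replace("\n", " ")
            hits.append((page_number, snippet))
    return hits

# Example: find_term("medical_record.pdf", "angiogram")
```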

I gave Reader a try. Unfortunately, it has locked down its implementation of GPT to a very limited set of conditions. It will not accept an arbitrary prompt, but instead restricts its use to pre-fab or custom questions likely to produce fairly limited output. You could not use it, as @thother and I have both proposed elsewhere, to "summarize" by extracting specific occurrences of certain text in CSV or JSON format.
