DT and Large Language Models

Altostratus · October 23, 2024, 3:29pm

I’ve been using Obsidian alongside DT for a long time for research and writing and it has been working beautifully (using DT item links in markdown). I’m curious as to the current possibilities for LLM/AI analysis of PDFs which could complement DT (until DT 4 or 5 integrates an LLM, that is…). Is there a local, non-subscription, integrated and non-technical macOS app which can ingest dozens or hundreds of PDFs, and then answer questions about them, with references? Searching for keywords seems so 2010 all of a sudden.

Thanks.

BLUEFROG · October 23, 2024, 3:32pm

Searching isn’t being replaced by LLMs. In fact, searches are still far more efficient.

“which can ingest dozens or hundreds of PDFs”

I think this is an overreach for the technology and would be prohibitively expensive for most if using public LLMs like ChatGPT. Even a single document may be better served in portions, especially for long documents.
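To illustrate what serving a document "in portions" can look like, here is a minimal sketch, assuming the PDF text has already been extracted to a plain string. The function name `split_into_chunks` and the word-based sizes are hypothetical choices for illustration, not anything DEVONthink or a particular LLM tool prescribes.

```python
def split_into_chunks(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into word-based chunks that overlap slightly.

    The overlap keeps a sentence that straddles a boundary visible
    in both neighbouring chunks.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    chunks = []
    step = max_words - overlap  # how far the window advances each time
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for chunk in split_into_chunks(sample, max_words=4, overlap=1):
        print(chunk)
```

Each chunk could then be sent to a model separately, which keeps every request small at the cost of losing some cross-chunk context.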

Most local options are technical in their installation or use. You’re also limited by your hardware. LLMs use extraordinary amounts of electricity to run.

You could try Ollama, which supports multiple models, some of which can process a document. But bear in mind, this requires multi-gigabyte models to be downloaded to your local machine.

Altostratus · October 23, 2024, 3:37pm

Efficient, maybe. But IMHO going forward it doesn’t seem like science fiction anymore to have a prompt such as “Which one was older, Mr. Smith or Mr. Brown, when they met Mr. Jones?”, assuming the answer lies in cross-referencing textual data from different sources.

BLUEFROG · October 23, 2024, 3:38pm

The context of that question is not the same as a normal search in DEVONthink. Look at the search prefixes. Those are about discrete properties of items, not asking for conclusions to be drawn.

cgrunenberg · October 23, 2024, 3:40pm

You might have a look at GPT4All.

Searching has its advantages (fast, precise, local and low energy consumption).

Altostratus · October 23, 2024, 3:41pm

I know that. I was asking whether current technologies exist that could make such a prompt feasible in the context of a personal PDF collection (or in DT in the future). Asking basic questions has made ChatGPT a Google replacement (for some tasks), and it would be wonderful to be able to ask such questions of a specific body of texts.

Altostratus · October 23, 2024, 3:42pm

Thanks, I have. It had some rendering issue on my MacBook Pro which made all dialog boxes unusable, and I couldn’t be bothered to troubleshoot it. I was hoping there were other applications around…

cgrunenberg · October 23, 2024, 3:45pm

LM Studio is another alternative; it’s possible to attach documents to chats (but I never tried this). Or AnythingLLM.

BLUEFROG · October 23, 2024, 3:45pm

Many of the new crop of apps are AI aggregators, putting a pretty face on connecting to online LLMs. HuggingFace may have up to date info.

Have you looked at AnythingLLM?

Altostratus · October 23, 2024, 3:49pm

No, thank you both, I’ll have a look at that.

BLUEFROG · October 23, 2024, 3:51pm

You’re welcome and apologies if I sounded terse.

I could swear I just ran into another option recently but the name escapes me. I’ll poke around a bit later.

rkaplan · October 23, 2024, 4:11pm

I can give you some detailed suggestions on how to do exactly what you want.

But none of them will be local.

All will require uploading documents to cloud AI of some sort if you are dealing with hundreds of PDFs.

None of them will be subscription-free - the infrastructure cost of AI makes that prohibitive.

As a first pass I would suggest you look at PopAI

Altostratus · October 23, 2024, 4:14pm

Thank you! I will have a look at PopAi as well. I understand the issue with running it locally, I don’t follow the technology very closely and I didn’t know how far we’d come in a year.

Please do. Thanks.

rkaplan · October 23, 2024, 4:28pm

Welcome

BTW you will come across a number of local AI wrappers which run a variety of local LLM models.

But for the use case you describe I believe you will be sorely disappointed with the performance.

BLUEFROG · October 23, 2024, 5:05pm

mmoren10 · October 23, 2024, 10:47pm

This looks awesome! Wondering if you or anyone else here has any experience with Msty.

A similar one is BoltAI. Also looking forward to reading anyone’s experience with it and how it might compare with Msty.

Altostratus · October 24, 2024, 9:22am

Thanks a lot for all the above suggestions. Here are my first experiences with some of them, in case someone else is interested (please bear in mind that although I code, I’m not involved with LLMs in any way; my comments are those of a complete beginner in the field):

popai

No pricing mentioned on the front page. I don’t like pages that ask you to “create an account” and only then tell you how much it will cost. Also, it might be nitpicking, but I don’t like being invited to “read PDFs with AI”, because I can read perfectly fine by myself…

anythingllm

Looks very promising, easy install + YouTube tutorial but needs LM Studio for local LLM. Unfortunately, LM Studio only runs on Windows or Apple Silicon, and I’m still on Intel. Wouldn’t mind upgrading to a new Windows machine but DT has become essential to the way I work (I believe it’s the only app I use that I can’t run on Windows).

msty

Best experience for a non-specialist: fast and easy installation, very user-friendly, can add a folder of PDFs (or even an Obsidian vault) to a “knowledge stack”, although import is painfully slow. I tried 30 PDFs, 100 MB total, all of them OCRed or text-based. The Intel MacBook Pro fan almost drilled a hole through the keyboard, and after 30 minutes it was at 6 of 30 files. So no. Had to abort the import.

I tried importing a single PDF. Performance was also painfully slow when chatting, as suggested above, but gave good results with references to the original document. Maybe on Windows with a couple of RTX 5090 cards, I don’t know. It does look very promising.

It’s very interesting to try out new possibilities with these tools, but I guess we’re not there yet for what I had in mind, not for individual & local use anyway. Maybe next year.

Thanks for all the suggestions.

rkaplan · October 24, 2024, 1:42pm

Popai pricing is on their pricing page

“Read PDFs with AI” is precisely what I thought you were seeking - answering questions about them with references

The local LLM apps you are discussing have terrific user interfaces and are very helpful for generic “Chat” - they have a very nice place there. But when it comes to querying or summarizing a PDF file, that takes quite a bit of processing power, so for now that requires uploading to the cloud.

I do suspect that in the near future we may see some hybrid software evolve which simplifies the process of sending the text layer of your local files to the cloud for analysis - that would give the practical appearance of a local app but still utilize the cloud.

kewms · October 24, 2024, 3:59pm

That’s something I would hope operating system-level integration – either Mac or Windows – would facilitate.

For any kind of AI “reading” or “summarization,” I’d recommend first testing on a sample of material that you’ve already read yourself. In my testing, results have ranged from “college student with decent study skills” to “are we even reading the same article?”

As noted, the computational resources required for this sort of task are enormous. I wouldn’t expect to see “query dozens or hundreds of PDFs” working on the desktop anytime soon.

rkaplan · October 24, 2024, 4:18pm

Totally agree there

I have tried all sorts of LLMs and apps and written some custom interfaces to LLMs to summarize documents.

At least for my purposes reviewing high-level medical and legal material, Claude Sonnet 3.5 is by far the winner and indeed the only one that I have found which works on a professional level.

The prompt given to AI is key. Even with Claude, the prompt I use with a medical record is different from the one I use with a legal deposition. But if you are willing to spend the time getting the prompt optimized, the results are impressive.

Obviously I would not rely on any of these summaries without reviewing the original. Indeed I only consider LLMs which have the ability to give links back to the source pages. That said - at this point the number of times Claude finds small but very pertinent details far exceeds the number of times that Claude misleads me. I have also written scripts/apps to retrieve and summarize peer-reviewed academic literature using AI and the results are equally useful. No question Claude can help in such a high-level academic or professional workload.
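As a rough illustration of what "links back to the source pages" involves, here is a minimal sketch that keeps the document name and page number attached to every piece of text, so any hit can be cited. It uses plain keyword overlap rather than an LLM, and it is not the scripts mentioned above. The names `Page`, `rank_pages` and `format_reference` are hypothetical, and the text is assumed to be already extracted per page.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    document: str  # file name of the source PDF (illustrative)
    number: int    # 1-based page number within that document
    text: str      # text layer of the page, assumed already extracted


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep only alphanumeric runs."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_pages(question: str, pages: list[Page], top_k: int = 3) -> list[tuple[Page, int]]:
    """Score each page by how often the question's terms occur on it."""
    query_terms = set(tokenize(question))
    scored = []
    for page in pages:
        counts = Counter(tokenize(page.text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((page, score))
    # Highest score first; pages with no matching term are never returned.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def format_reference(page: Page) -> str:
    """Render a citation a reader can follow back to the original."""
    return f"{page.document}, p. {page.number}"


if __name__ == "__main__":
    pages = [
        Page("a.pdf", 1, "Smith was born in 1900."),
        Page("b.pdf", 4, "Brown met Jones in 1930; Brown was thirty."),
        Page("c.pdf", 2, "Unrelated weather notes."),
    ]
    for page, score in rank_pages("When did Brown meet Jones?", pages):
        print(format_reference(page), score)
```

A real pipeline would hand the top-ranked pages to the model together with their references, so that every claim in the answer can be checked against the original page.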

That said - I view this purely as a “Google search on steroids” - it is for gathering information, NOT for medical diagnosis or legal analysis or original thinking. And surely it is not meant for any type of situation requiring serious risk-benefit judgment.