I like the AI features that come with DEVONthink4, but I am somewhat underwhelmed by the results of AI tagging, AI labeling, and AI rating (although the rating is somewhat secondary).
One thing that I note in the application preferences is that only the summarization allows for a custom prompt; the other capabilities do not seem to be configurable.
Do you have a similar experience with ways to improve things without relying on the most expensive LLM?
I’m not sure what you’re expecting, but any kind of automated metadata like this is non-configurable. If you have specific ideas about what should and shouldn’t be tagged, you should handle these things manually or work to perfect prompts that deliver results closer to what you hope to see.
Which model did you actually use? This can make a huge difference.
Thanks for your thoughts. What you mention, “…or work to perfect prompts that deliver results closer to what you hope to see.”, is exactly what I’m asking about: I don’t see a way to modify the prompts that generate tagging, labeling, and rating.
I’ve tried Ollama with small models, and indeed qwen3:0.6b gives (as expected) very poor results. I moved to GPT-4.1 nano and the results were better but, in my view, still subpar. It’s also true that I don’t know exactly what is fed into the LLM, so the prompt might have limited impact.
Tagging depends highly on personal preferences, which might also change over time. E.g. nobody in this company would tag like I do, and I don’t tag like I did in the past.
One possibility to improve automatic tagging is to create the desired tags first and then enable the option to use only existing tags; see Settings > Files > Tags.
Another possibility is to use batch processing or smart rules; here’s a basic example:
Thanks a lot, I will try this approach! Your help is highly appreciated!
Your approach works wonders! Now I’m wondering how I could apply it to labels and ratings. In the case of labels, how is it possible to provide the list of possible labels as configured in DT4 so that the LLM understands them and can answer correctly? And how should the response then be formatted so that DT4 can apply it properly? Alternatively, could you help me find the answer myself, so that I can learn by doing?
Labels and ratings would require scripting (see the `get chat response for message` command and the `label names` property), as it’s not possible to set them via batch processing or smart rules to a placeholder (like the chat response).
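To make the idea concrete, here is a rough, untested AppleScript sketch. It only assumes what was mentioned above (the `get chat response for message` command and the `label names` property); the exact parameter names and the way the selected record is obtained are assumptions, so verify them against DEVONthink’s actual scripting dictionary before use:

```applescript
-- Hypothetical sketch: ask the chat model to pick a label for the selected
-- record, then map its answer back to DT4's numeric label index.
-- Assumes "get chat response for message" takes the prompt text directly;
-- check the scripting dictionary for the real parameters.
tell application id "DNtp"
	set theLabels to label names -- the label titles as configured in DT4
	set theRecord to item 1 of (selected records)
	set thePrompt to "Reply with exactly one label from this list that best fits the document: " & (theLabels as string)
	set theAnswer to get chat response for message thePrompt
	-- DT4 stores the label as a number, so find the matching index
	repeat with i from 1 to (count of theLabels)
		if theAnswer contains (item i of theLabels) then
			set label of theRecord to i
			exit repeat
		end if
	end repeat
end tell
```

The same pattern should carry over to ratings: constrain the prompt to a small fixed set of answers (e.g. the numbers 0–5), then parse the response and set the record’s rating property accordingly.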
Thanks a lot! That’s very helpful! I’ll dive into it!