Is there a way to see and set the timeout limit on AI requests? I'm using some complicated models via Ollama running on a local machine, so I know some can be quite slow compared to online AI services. I keep getting timed out on a few queries and would rather see what the model comes up with than get timed out, at least in some use cases. If it's in the AI settings then I keep overlooking it.
Thanks
No, there are no such controls and it’s the first request of its kind. Development would have to assess it.
What kind of Mac and which models do you use, with how many parameters? Currently DEVONthink uses a timeout of 10 minutes for reasoning models or 5 minutes otherwise.
Hmmm, perhaps something else is amiss then. That seems long enough. I'm using a MacBook Pro with an M3 Max and 64 GB of unified memory. I realize I'm pushing the limits with some of the local models, which is part of my testing/experiments. The model I was using when the timeout occurred is llama4:17b-scout-16e-instruct-q4_K_M, which is about the largest model I can run (albeit slowly) via OpenWebUI/Ollama. I don't know what the timeout is there, but it does eventually respond rather than giving up.
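For anyone wanting to rule DEVONthink out, the raw model latency can be timed directly against Ollama's HTTP API. A minimal sketch (the prompt is just a placeholder, and localhost:11434 is Ollama's default address):

```python
import time

import requests  # third-party HTTP client: pip install requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

payload = {
    "model": "llama4:17b-scout-16e-instruct-q4_K_M",
    "prompt": "Summarize this document in three bullet points.",  # placeholder prompt
    "stream": False,  # return the full response at once so it can be timed end to end
}

start = time.monotonic()
response = requests.post(OLLAMA_URL, json=payload, timeout=1200)  # generous 20-minute client-side timeout
response.raise_for_status()
elapsed = time.monotonic() - start

print(f"Full response took {elapsed:.1f}s")
print(response.json()["response"][:500])  # first 500 characters of the reply
```

If the raw request regularly takes longer than the 10-minute limit mentioned above, the timeout is simply being hit honestly rather than something being broken.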
Is it possible that the MacBook Pro went to sleep? Did you use a larger context window or the default size of 4096 tokens?
Yes, I'm using a large context window (32K) as I'm working with longer documentation, and I've had issues with smaller windows. This is likely the reason. I had a case yesterday where a particularly long session timed out using qwen3:32b. My uses/experiments are very much an edge case, I'm sure.
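For reference, the context window is a per-request option in the Ollama API, so the effect of a 32K window on latency can be tested in isolation. A minimal sketch (the prompt and model name are placeholders; num_ctx overrides the 4096-token default mentioned above):

```python
import requests

payload = {
    "model": "qwen3:32b",
    "prompt": "Answer using the documentation below: ...",  # placeholder prompt
    "options": {"num_ctx": 32768},  # request a 32K context window instead of the 4096 default
    "stream": False,
}

# A larger window means a much larger KV cache, so allow a long client-side timeout.
reply = requests.post("http://localhost:11434/api/generate", json=payload, timeout=1800)
reply.raise_for_status()
print(reply.json()["response"])
```

Worth noting: the KV cache grows with the window size, so a 32K context can push a machine that comfortably runs the same model at 4096 tokens right up against its memory limit.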
Could you share the prompt you used?
Sorry, I didn't save that one, but if it happens again I'll share it if I can.
Just a follow-up: I suspect Ollama is the issue. I'm now experiencing similar problems in OpenWebUI with Ollama and have tracked them down to memory issues. I think they've changed some things in their recent releases, as I've found other people reporting performance problems where there used to be none. In any case, increasing the timeout in DEVONthink would not fix the issue.
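For anyone chasing the same memory problem: Ollama exposes a /api/ps endpoint that lists the loaded models and how much of each is resident in VRAM versus spilled to system memory. A quick sketch (field names as documented in the current Ollama API; a model only partially in VRAM is a common cause of sudden slowdowns):

```python
import requests

# Ask the local Ollama server which models are loaded and where they live.
ps = requests.get("http://localhost:11434/api/ps", timeout=10)
ps.raise_for_status()

for model in ps.json().get("models", []):
    total_gb = model["size"] / 1e9             # total bytes the loaded model occupies
    vram_gb = model.get("size_vram", 0) / 1e9  # bytes resident in GPU/unified memory
    print(f'{model["name"]}: {total_gb:.1f} GB total, {vram_gb:.1f} GB in VRAM')
```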