Is there a way to see and set the timeout limit on AI requests? I'm using some complicated models via Ollama running on a local machine, so I know some can be quite slow compared to online AI services. I keep getting timed out on a few queries and would rather see what the model comes up with than get timed out, at least in some use cases. If it's in the AI settings then I keep overlooking it.
Thanks
No, there are no such controls and it’s the first request of its kind. Development would have to assess it.
What kind of Mac and which models do you use, with how many parameters? Currently DEVONthink uses a timeout of 10 minutes for reasoning models or 5 minutes otherwise.
Hmmm, perhaps something else is amiss then. That seems long enough. I'm using a MacBook Pro with an M3 Max and 64 GB of unified memory. I realize I'm pushing the limits with some of the local models, which is part of my testing/experiments. The model I was using when the timeout occurred is llama4:17b-scout-16e-instruct-q4_K_M, which is about the largest model I can run (albeit slowly) via OpenWebUI/Ollama. I don't know what the timeout is there, but it does eventually respond rather than giving up.
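For anyone wanting to rule DEVONthink out, the raw model latency can be timed directly against Ollama's HTTP API. A minimal sketch (the prompt is just a placeholder, and localhost:11434 is Ollama's default address):

```python
import time

import requests  # third-party HTTP client: pip install requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

payload = {
    "model": "llama4:17b-scout-16e-instruct-q4_K_M",
    "prompt": "Summarize this document in three bullet points.",  # placeholder prompt
    "stream": False,  # return the full response at once so it can be timed end to end
}

start = time.monotonic()
response = requests.post(OLLAMA_URL, json=payload, timeout=1200)  # generous 20-minute client-side timeout
response.raise_for_status()
elapsed = time.monotonic() - start

print(f"Full response took {elapsed:.1f}s")
print(response.json()["response"][:500])  # first 500 characters of the reply
```

If the raw request regularly takes longer than the 10-minute limit mentioned above, the timeout is simply being hit honestly rather than something being broken.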
Is it possible that the MacBook Pro went to sleep? Did you use a larger context window or the default size of 4096 tokens?
Yes, I'm using a large context window (32K) as I'm working with longer documentation, and I've had issues with smaller windows. This is likely the reason. I had a case yesterday where a particularly long session timed out using qwen3:32b. My uses/experiments are very much an edge case, I'm sure.
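For reference, the context window is a per-request option in the Ollama API, so the effect of a 32K window on latency can be tested in isolation. A minimal sketch (the prompt and model name are placeholders; num_ctx overrides the 4096-token default mentioned above):

```python
import requests

payload = {
    "model": "qwen3:32b",
    "prompt": "Answer using the documentation below: ...",  # placeholder prompt
    "options": {"num_ctx": 32768},  # request a 32K context window instead of the 4096 default
    "stream": False,
}

# A larger window means a much larger KV cache, so allow a long client-side timeout.
reply = requests.post("http://localhost:11434/api/generate", json=payload, timeout=1800)
reply.raise_for_status()
print(reply.json()["response"])
```

Worth noting: the KV cache grows with the window size, so a 32K context can push a machine that comfortably runs the same model at 4096 tokens right up against its memory limit.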
Could you share the prompt you used?
Sorry, I didn't save that one, but if it happens again I'll share it if I can.
Just a follow-up: I suspect Ollama is the issue. I'm now experiencing similar problems in OpenWebUI with Ollama and have tracked them down to memory issues. I think they've changed some things in their recent releases, as I've found other people reporting performance problems where there used to be none. In any case, increasing the timeout in DEVONthink would not fix the issue.
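For anyone chasing the same memory problem: Ollama exposes a /api/ps endpoint that lists the loaded models and how much of each is resident in VRAM versus spilled to system memory. A quick sketch (field names as documented in the current Ollama API; a model only partially in VRAM is a common cause of sudden slowdowns):

```python
import requests

# Ask the local Ollama server which models are loaded and where they live.
ps = requests.get("http://localhost:11434/api/ps", timeout=10)
ps.raise_for_status()

for model in ps.json().get("models", []):
    total_gb = model["size"] / 1e9             # total bytes the loaded model occupies
    vram_gb = model.get("size_vram", 0) / 1e9  # bytes resident in GPU/unified memory
    print(f'{model["name"]}: {total_gb:.1f} GB total, {vram_gb:.1f} GB in VRAM')
```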