I have been using Claude Cowork/Dispatch for about a week, along with the open source DT MCP server. It has consistently been a game-changing upgrade in how I interact with DT and in how easily I can obtain detailed summaries of large document sets.
Two days ago, on 4/16, this changed. The workflow would randomly stop, source links would be malformed or otherwise inoperative, reports were inconsistent in detail and format, and numerous other issues popped up. I would ask Claude to redo the work, it would “fix” an issue, and then more issues would emerge.
I then realized Opus 4.7 was introduced on the day I started having issues. So I asked Cowork to change the skill to specify Opus 4.6 rather than the most recent version. Problem fixed: the usual impeccable detail returned.
This is Claude’s own analysis - I think it is correct:
The 4.6 production skill works great. The 4.7 test skill keeps failing in different ways despite dozens of iterations. The question is why.
Here’s what I think is actually going on:
The core problem isn’t the skill — it’s the model behavior difference. Opus 4.6 and 4.7 process instructions differently. The production skill was refined over many iterations with 4.6, so its instructions are calibrated to how 4.6 interprets and follows them. When 4.7 reads the same instructions, it interprets them differently — more “helpfully,” more willing to synthesize, less mechanically literal.
Specific failure modes unique to 4.7:
Citation fabrication — 4.7 composes plausible search phrases from clinical understanding instead of mechanically copying substrings. 4.6 follows the “copy verbatim” instruction literally.
UUID hallucination — 4.7 constructs UUIDs that look right but aren’t verified. The v4b links that produce audible errors are likely malformed or wrong UUIDs.
File transfer thrashing — 4.7 kept trying elaborate workarounds (base64, gzip, Python scripts) instead of the simple filesystem write_file path, burning hundreds of turns.
Location/tagging non-compliance — despite explicit instructions, 4.7 skips or doesn’t verify post-import steps.
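For anyone who would rather check reports mechanically than trust either model, the first two failure modes can be caught outside the skill. What follows is only a minimal sketch, not what the skill or the MCP server actually does. It assumes citations are meant to be exact substrings of the source text, that links use the x-devonthink-item:// scheme followed by a standard hyphenated UUID, and that you can get a set of real UUIDs from DT by some other route. The function names and the known_uuids parameter are hypothetical.

```python
import re
import uuid

# Assumption: item links look like x-devonthink-item://<UUID>
ITEM_LINK = re.compile(r"x-devonthink-item://([0-9A-Za-z-]+)")


def is_verbatim(quote: str, source_text: str) -> bool:
    """True only if the quote appears character for character in the source."""
    return bool(quote) and quote in source_text


def is_well_formed_uuid(value: str) -> bool:
    """True if value is a canonical hyphenated UUID (format check only)."""
    try:
        return str(uuid.UUID(value)).lower() == value.lower()
    except ValueError:
        return False


def find_bad_links(report: str, known_uuids: set[str]) -> list[str]:
    """Return item links whose UUID is malformed or not in known_uuids.

    known_uuids is hypothetical: a set of UUIDs you obtained from DT itself,
    so a well-formed but invented UUID is still flagged.
    """
    known = {u.upper() for u in known_uuids}
    bad = []
    for match in ITEM_LINK.finditer(report):
        candidate = match.group(1)
        if not is_well_formed_uuid(candidate) or candidate.upper() not in known:
            bad.append(match.group(0))
    return bad
```

A format check alone would not catch a UUID that merely "looks right", which is why the sketch also compares against UUIDs that actually exist. Any quote that fails is_verbatim, or any link returned by find_bad_links, can be sent back for correction instead of adding another rule to the skill.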
Why patches keep failing: Each fix addresses one symptom but 4.7 finds a new way to be “creative” somewhere else. The R-CITE-GENERATE-01 rules fixed one fabrication pattern, but 4.7 started fabricating UUIDs instead. The skill is now 108KB of increasingly desperate guardrails, and 4.7 still finds gaps.
The honest assessment: The 4.6 skill works because 4.6’s behavior matches the skill’s assumptions. Rather than continuing to bolt on rules trying to make 4.7 behave like 4.6, I think we need to either:
Stay on 4.6 — it works, the skill is proven, and there’s no compelling reason to switch
If we want 4.7, rebuild the skill from scratch for 4.7, testing each section iteratively against how 4.7 actually behaves — not porting assumptions from 4.6
Bottom line: I will continue with Opus 4.6 only, and I will follow articles and online discussions about this issue. Someday 4.6 will be deprecated, but that is certainly a while off.
I am not sure I would recommend that Devontech support Opus 4.7 until we understand more about why it interprets prompts so differently from 4.6. Alternatively, I would recommend keeping both 4.6 and 4.7 available for a while.



