I’ve had quite a busy week, and am about to head off on holiday, so haven’t had much time to keep up with the AI news this week, let alone write this newsletter. So for this week, I’m just sharing a few notes on articles that caught my attention.

Normal service will resume next week!

Claude 4.7 Tokenizer Cost Analysis

CLAUDECODECOMP.COM

We all know that LLMs require a considerable amount of compute to operate, and that the investment in OpenAI, Anthropic et al. means the service is heavily subsidised. Recently we’ve started seeing companies make small tweaks to model behaviour (e.g. reduced cache time, changes in ‘thinking’ behaviour) that result in significant cost increases. Anthropic recently changed the Claude tokenizer (the module that turns words into numeric tokens), resulting in more tokens per word. This site demonstrates that the change has led to a ~36% increase in costs.
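The cost impact of a denser tokenizer is easy to reason about: for a fixed volume of text, cost scales linearly with tokens per word. A minimal sketch (all figures below are illustrative, not Anthropic’s actual pricing or token counts):

```python
# Hypothetical illustration of how a tokenizer change maps to a cost increase.
# The numbers are made up for demonstration only.

def monthly_cost(words: int, tokens_per_word: float, price_per_mtok: float) -> float:
    """Cost of processing a fixed volume of text at a given tokenization density."""
    tokens = words * tokens_per_word
    return tokens / 1_000_000 * price_per_mtok

WORDS = 10_000_000   # fixed monthly workload, in words
PRICE = 3.00         # $ per million tokens (illustrative)

before = monthly_cost(WORDS, 1.30, PRICE)  # old tokenizer: ~1.30 tokens/word (assumed)
after = monthly_cost(WORDS, 1.77, PRICE)   # new tokenizer: ~1.77 tokens/word (assumed)

increase = (after - before) / before
print(f"before: ${before:.2f}, after: ${after:.2f}, increase: {increase:.0%}")
```

With these assumed densities, the same workload costs ~36% more, even though the per-token price never changed.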

An update on recent Claude Code quality reports

ANTHROPIC.COM

More evidence that Anthropic are moving fast and ‘breaking things’: a report on recent bugs and hidden changes:

  • Bug 1 (Mar 4–Apr 7): Default reasoning effort quietly downgraded from high to medium to reduce latency, making Claude feel “less intelligent”
  • Bug 2 (Mar 26–Apr 10): Caching bug continuously dropped reasoning blocks on every turn during idle sessions — Claude became “forgetful and repetitive” and users saw unexpected token usage drain
  • Bug 3 (Apr 16–20): New system prompt instruction capping responses to “≤25 words” between tool calls combined with other changes for a ~3% performance drop

The community are rightly frustrated at the lack of transparency around the above changes.

Changes to GitHub Copilot individual plans

GITHUB.BLOG

We’re making these changes to ensure a reliable and predictable experience for existing customers

GitHub are making a number of changes to Copilot in order to preserve a quality experience, including pausing new sign-ups for the Pro, Pro+ and Student plans, tightening usage limits, and removing Opus models from Pro plans. The company has introduced weekly token limits to prevent heavy users from incurring costs that exceed plan prices; VS Code and the Copilot CLI now show usage warnings at 75% consumption, and existing subscribers can request refunds through May 20.

Coding Models Are Doing Too Much

GITHUB.IO

There is a lot of attention on the quality of model output, with a focus on whether it has solved a specific problem. This blog post looks at something else: the tendency of models to ‘over-edit’, making changes that provide no functional benefit. Interestingly, adding “preserve the original code” to the prompt helps address this. Which is a bit annoying - I really hate ‘magic prompts’!
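In practice the mitigation is just string concatenation: append the guard instruction to whatever edit prompt you were going to send. A minimal sketch (the helper and wording are my own illustration, not from the post):

```python
# Hypothetical sketch: appending an over-editing guard to a coding prompt.
# The prompt-building helper is an assumption for illustration, not any real API.

GUARD = "Preserve the original code; only change what is strictly necessary."

def build_edit_prompt(task: str, code: str) -> str:
    """Combine the task description, the code to edit, and the guard instruction."""
    return f"{task}\n\n```\n{code}\n```\n\n{GUARD}"

prompt = build_edit_prompt(
    "Fix the off-by-one error in this loop.",
    "for i in range(1, len(xs)):\n    total += xs[i]",
)
print(prompt)
```

The resulting string is what you would pass as the user message to whichever model you are using.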

Introducing GPT‑5.5

OPENAI.COM

And we have another new model, with performance increasing on key benchmarks like TerminalBench and Expert-SWE (the harder replacement for SWE-Bench) by a pretty decent amount, ~7%. We’ve not reached a performance ceiling yet!

Right … I’m off … 🏝️