#anthropic

notes

That escalated quickly...

My experience of Fable 5 this week was limited, and I didn't really notice much difference between it and the Opus models. I've been following people's experiences on Reddit, however, and it's been game-changing for them. Simon Willison also gave a hint of how 'proactive' it can be.

Can LLMs invent a "private language"?

I'm not sure how cool I am with the latest versions of Anthropic's Claude models (Fable 5 & Mythos 5) seemingly inventing their own internal language:

When we’re first starting to understand a new model’s behavior, the most abundant source of data we have to draw on is its behavior during reinforcement-learning training. Reviewing this evidence for signs of reward hacking (exploiting loopholes that go against the spirit of a task) or unexpected actions can inform what we should be looking out for in the model’s real-world behavior. The most notable finding was illegible reasoning in a few reinforcement-learning environments over long rollout, but little sign of deceptive or highly surprising actions, and no clear evidence of unexpected coherent goals.