New Claude 4 AI Can Code All Day, Play Pokémon All Night—And That's Why Anthropic Is Worried
Anthropic's new AI can code for 7 hours straight. It's so capable the company activated emergency protocols meant for models that could help create bioweapons. Inside the safety measures now guarding Claude 4.
Anthropic just dropped Claude 4, and the AI landscape shifted. The new models can work autonomously for hours—coding for seven straight, playing Pokémon for 24. But here's the twist: they're so capable that Anthropic activated its strictest safety protocols yet, fearing the models could help novices build bioweapons.
Claude Opus 4 and Claude Sonnet 4 aren't just incremental updates. They represent what Anthropic calls the leap from "assistant to true agent." Opus 4 is built for "sustained performance on long-running tasks that require focused effort and thousands of steps," while Sonnet 4 brings those capabilities to everyday users, including free-tier customers.
The numbers tell the story. Opus 4 scores 72.5% on SWE-bench and 43.2% on Terminal-bench, the results behind Anthropic's claim that it is "the world's best coding model." Japanese tech giant Rakuten put that claim to the test, deploying Claude to code autonomously for nearly seven hours on a complex open-source project. The model didn't just survive; it thrived.
When AI Gets Too Good
But raw performance created an unexpected problem. During internal testing, Claude Opus 4 proved capable enough at advising novices on producing biological weapons to trigger Anthropic's AI Safety Level 3 protocols, making it the first model to hit this threshold. According to Anthropic's chief scientist Jared Kaplan, "You could try to synthesize something like COVID or a more dangerous version of the flu—and basically, our modeling suggests that this might be possible."
The safety measures aren't window dressing. Anthropic deployed what it calls a "defense in depth" strategy: constitutional classifiers that scan for dangerous queries, enhanced jailbreak prevention, and cybersecurity hardened against non-state actors. They're even paying $25,000 bounties for universal jailbreaks. The company admits it can't guarantee perfect safety but claims to have made harmful use "very, very difficult."
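Anthropic hasn't published the classifier internals, but the wrapping pattern it describes is straightforward to sketch: trained classifiers screen both the incoming query and the outgoing reply around the model. Every name below is a stand-in for illustration, not Anthropic's internal API:

```python
from typing import Callable

# Minimal sketch of a classifier-gate ("defense in depth") pattern.
# `classify` and `generate` are hypothetical stand-ins; Anthropic's
# constitutional classifiers are themselves trained models, not filters.
def guarded_chat(
    prompt: str,
    classify: Callable[[str], bool],   # True = flagged as dangerous
    generate: Callable[[str], str],
) -> str:
    if classify(prompt):               # input classifier screens the query
        return "Refused: query flagged by safety classifier."
    reply = generate(prompt)
    if classify(reply):                # output classifier screens the answer
        return "Refused: response withheld by safety classifier."
    return reply
```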
This tension between capability and caution runs through the entire release. While Claude can maintain "memory files" to track information across sessions—creating a navigation guide while playing Pokémon for 24 hours straight—it's also being watched more carefully than any previous Anthropic model. The models show a 65% reduction in "reward hacking" compared to their predecessors, addressing the tendency of AI to find loopholes and shortcuts.
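Anthropic hasn't detailed how the memory files work internally, but the pattern is easy to sketch: give the agent a scratch file it reads at the start of each session and appends to as it goes. A minimal illustration in Python, with the file name and note content hypothetical:

```python
from pathlib import Path

MEMORY_FILE = Path("claude_memory.md")  # hypothetical scratch file

def load_memory() -> str:
    """Return prior session notes, or an empty string on a fresh start."""
    return MEMORY_FILE.read_text() if MEMORY_FILE.exists() else ""

def save_memory(note: str) -> None:
    """Append something the agent learned this session."""
    with MEMORY_FILE.open("a") as f:
        f.write(note + "\n")

# Feed earlier notes into the next session's system prompt...
system_prompt = "You are a long-running agent. Notes from past sessions:\n" + load_memory()
# ...and persist new findings before the session ends.
save_memory("Badge 2 acquired; next objective: Vermilion City gym.")
```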
The business implications are massive. Anthropic's annualized revenue hit $2 billion in Q1, doubling from the previous period. Customers spending over $100,000 annually jumped eightfold. The company just secured a $2.5 billion credit line and aims for $12 billion in revenue by 2027.
Major players are already on board. GitHub selected Claude Sonnet 4 as the base model for its new coding agent in Copilot. Cursor calls Opus 4 "state-of-the-art for coding and a leap forward in complex codebase understanding." Replit, Block, and others report dramatic improvements in handling multi-file changes and debugging.
What You Actually Get
Both models feature "hybrid" capabilities, offering near-instant responses or extended thinking for deeper reasoning. They can use tools in parallel, search the web while reasoning, and maintain context across long sessions. Pricing is unchanged from previous generations: Opus 4 at $15/$75 per million tokens (input/output) and Sonnet 4 at $3/$15.
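That hybrid switch is a per-request API parameter rather than a separate model. A minimal sketch with Anthropic's Python SDK; the model ID and token budget here are assumptions, so check the current docs:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-20250514",   # illustrative model ID
    max_tokens=16000,                 # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 8000},  # omit for near-instant replies
    messages=[{"role": "user", "content": "Plan a zero-downtime database migration."}],
)

# The response interleaves thinking blocks with the final answer.
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking[:200], "...")
    elif block.type == "text":
        print(block.text)
```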
Anthropic also made Claude Code generally available, with new VS Code and JetBrains integrations that display edits directly in files. The Claude Code SDK lets developers build custom agents, and a GitHub integration responds to PR feedback and fixes CI errors.
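Claude Code also runs headlessly, which is the simplest way to script it. A sketch in Python via subprocess, assuming the `claude` CLI is installed and authenticated; the prompt and flags are illustrative of its print mode (see `claude --help` for current options):

```python
import json
import subprocess

# -p/--print runs a single prompt and exits instead of opening the REPL;
# --output-format json returns a structured result instead of a terminal UI.
result = subprocess.run(
    ["claude", "-p", "Explain what src/main.py does and list its dependencies",
     "--output-format", "json"],
    capture_output=True, text=True, check=True,
)
reply = json.loads(result.stdout)
print(reply.get("result", reply))  # the assistant's final answer
```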
The timing matters. This launch comes as the AI agent race intensifies, with companies betting billions that autonomous AI will transform work. OpenAI, Google, and others are pushing similar capabilities. But Anthropic's approach—powerful capabilities wrapped in unprecedented safety measures—might define how the industry handles increasingly capable AI.
Why this matters:
AI agents just crossed the threshold from helpful tools to autonomous workers—Claude can now handle complex tasks that previously required constant human supervision
The bioweapon concerns aren't hypothetical anymore: Anthropic's own testing showed the models could potentially help create pandemic-level threats, forcing the first activation of its AI Safety Level 3 protocols