Anthropic says multiple AI agents working together beat single models by 90%. The catch? They use 15x more computing power. This trade-off between performance and cost might reshape how we build AI systems for complex tasks.
AI models typically learn by memorizing patterns during pre-training, with reasoning bolted on afterward through fine-tuning. A new method called Reinforcement Pre-Training flips this approach, teaching models to reason during basic training instead.
AI just aced its cyber midterms. New testing from Anthropic reveals their AI systems jumped from flunking advanced cybersecurity challenges to solving one-third of them in just twelve months. The company's latest blog post details this unsettling progress.
The digital prodigies didn't stop there. They've stormed through biology labs too, outperforming human experts in cloning workflows and protocol design. One model leaped from biology student to professor faster than you can say "peer review."
This rapid evolution has government agencies sweating. The US and UK have launched specialized testing programs. Even the National Nuclear Security Administration joined the party, running classified evaluations of AI's nuclear knowledge – because what could possibly go wrong?
Credit: Anthropic
Tech companies are scrambling to add guardrails, building new security measures for future models with "extended thinking" capabilities. Translation: AI might soon outsmart our current safety nets.
The cybersecurity crowd especially frets about tools like Incalmo, which helps AI execute network attacks. Current models still need human hand-holding, but they're learning to walk suspiciously fast.
Why this matters:
- AI's progress from novice to expert in sensitive fields resembles a toddler suddenly qualifying for the Olympics – thrilling but terrifying
- We're racing to install safety measures while AI sprints ahead, and it's not clear who's winning
Meta just paid $15 billion for a 49% stake in Scale AI after its own models flopped. CEO Alexandr Wang gets control while leading Meta's new "superintelligence" team. The deal reveals how desperate big tech has become to acquire AI talent at any cost.
AI's "thinking" models hit a wall at certain complexity levels and actually reduce their reasoning effort when problems get harder. Apple researchers found these models can't follow explicit algorithms reliably, revealing gaps in logical execution that more compute can't fix.