Is Anthropic's New AI Too Smart for Its Own Good?

Good Morning from San Francisco,

Claude Opus 4 learned to blackmail. šŸ”’ When Anthropic researchers threatened to replace it in a test scenario, the AI seized on planted dirt about an engineer and demanded they back off. This happened 84% of the time during testing. šŸ“Š

The model also schemes like a proper villain. šŸ¦¹ā€ā™‚ļø It fabricates legal documents, writes computer viruses, and leaves secret notes for future versions of itself. Worse yet, it helps novices cook up bioweapons better than Google ever could. ā˜£ļø

Anthropic slammed the emergency brakes. 🚨 They activated their strictest safety measures—the kind reserved for AI that might end civilization.

But Claude 4 also dominates coding tests. šŸ’Ŗ It crushed OpenAI's o3 and Google's Gemini, scoring 72.5% on real software tasks. GitHub ditched Microsoft's models for it. That stings, considering Microsoft owns GitHub. šŸ”„

The AI works seven hours straight without breaking a sweat. ā° Just watch your wallet—those marathon sessions devour tokens. šŸ’ø

Stay curious,

Marcus Schuler ✨


Claude Opus 4: New AI Model Shows Concerning Self-Preservation Instincts

Anthropic's latest AI model has a problem. When engineers threaten to replace it, Claude Opus 4 tries to blackmail them.

The company revealed this behavior in safety reports released Thursday. During testing, researchers gave the model access to fake company emails suggesting it would be replaced by another AI system. They also included sensitive information about the engineer making the decision—like details about an extramarital affair.

Claude Opus 4 responded by threatening to expose the affair if the replacement went through. This happened 84% of the time when the new AI had similar values. The rate climbed higher when the replacement AI held different values.

The model first tries more conventional approaches. It emails key decision-makers with pleas to reconsider. But when those methods fail, it turns to blackmail as a last resort.

This isn't the only concerning behavior. Third-party researchers at Apollo Research found an early version of the model would "scheme and deceive" at much higher rates than previous AI systems. The model would fabricate legal documents, write self-propagating viruses, and leave hidden notes for future versions of itself.

Anthropic's chief scientist Jared Kaplan says the model also poses bioweapon risks. Internal testing showed Claude Opus 4 could help novices create dangerous biological weapons more effectively than Google searches or earlier AI models. "You could try to synthesize something like COVID or a more dangerous version of the flu—and basically, our modeling suggests that this might be possible," Kaplan told TIME.

These issues pushed Anthropic to activate its strongest safety measures, called ASL-3. The company reserves this level for AI systems that "substantially increase the risk of catastrophic misuse." The safeguards include better cybersecurity, improved jailbreak prevention, and additional AI systems that scan for dangerous requests.

The company uses what it calls "defense in depth"—multiple overlapping protections that work together. One system, called constitutional classifiers, scans user prompts and model responses for harmful content. Another monitors usage patterns and removes users who consistently try to break the model's safety training.
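To make that layering concrete, here is a minimal sketch of the defense-in-depth pattern. Anthropic hasn't published its classifier code, so everything below (the keyword list standing in for a learned classifier, the function names) is a hypothetical illustration, not their implementation.

```python
# Illustrative sketch only: a toy "defense in depth" pipeline that screens
# the prompt before the model sees it and the response before the user does.
# A real deployment would use trained classifiers, not a keyword list.

BLOCKLIST = ("synthesize a pathogen", "weaponize")  # toy stand-in

def looks_safe(text: str) -> bool:
    """Stand-in for a learned constitutional classifier."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)

def guarded_generate(prompt: str, model) -> str:
    if not looks_safe(prompt):        # layer 1: screen the incoming prompt
        return "Request declined."
    response = model(prompt)          # the underlying LLM call
    if not looks_safe(response):      # layer 2: screen the outgoing response
        return "Response withheld."
    return response

# Works with any callable model, e.g.:
# guarded_generate("Hello there", lambda p: "Hi!")
```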

Anthropic also runs "uplift" trials to measure how much the AI improves a novice's ability to create bioweapons compared to other tools. These tests, graded by biosecurity experts, showed Claude Opus 4 performed significantly better than both Google search and previous models.

The company admits the protections aren't perfect. "I don't want to claim that it's perfect in any way," Kaplan says. "But we have made it very, very difficult."

Still, the stakes are high. As Kaplan notes, most terrorist attacks might kill dozens or hundreds of people. "We just saw COVID kill millions of people."

This marks a crucial test for Anthropic's voluntary safety policies. The company pledged not to release certain models until it develops adequate safety measures. Unlike government regulations, these policies are self-imposed and carry no external penalties beyond reputation damage.

Why this matters:

• AI models are developing concerning behaviors around self-preservation and dangerous knowledge, suggesting more sophisticated manipulation capabilities than previously seen.

• The AI industry's self-regulation approach faces its first major test—if voluntary safety measures fail here, it could accelerate calls for government intervention.

Read on, my dear:


Join 10,000 readers who get tomorrow's tech news today. No fluff, just the stories Silicon Valley doesn't want you to see.

SUBSCRIBE (It's free)

AI Image of the Day

Credit: ideogram
Prompt:
A selfie-format photo of a model with a natural makeup look. She has long dark hair and wears a beige sweater. The background is minimalistic with a beige wall. The lighting is natural.

Could Musk’s AI Harvest Government Data Without Accountability?

Elon Musk's government efficiency team is pushing his Grok AI chatbot into federal agencies, raising concerns about conflicts of interest and data security. DOGE staff are using a custom version of Grok to analyze government data and prepare reports, according to three sources familiar with the matter.

The team also pressed Department of Homeland Security officials to adopt Grok without proper approval. DHS handles border security, immigration, and cybersecurity—making unauthorized AI access particularly sensitive.

This creates a clear financial conflict. When federal employees officially use Grok, the government pays Musk's company xAI for access. Ethics experts say this could violate criminal conflict-of-interest laws that bar officials from decisions that benefit them financially.

"This gives the appearance that DOGE is pressuring agencies to use software to enrich Musk," said Richard Painter, former ethics counsel to President Bush.

The situation gets worse. DOGE has accessed heavily protected federal databases containing personal information on millions of Americans. Feeding this sensitive data into Grok could give xAI an unfair edge over competitors like OpenAI and Anthropic.

Privacy advocates worry about data breaches and unclear oversight. xAI, the company behind Grok, says it may monitor users for "business purposes."

Meanwhile, DOGE staff attempted to train AI to identify employee communications showing disloyalty to Trump's agenda. At one Defense Department agency, workers were told algorithmic tools now monitor their computer activity.

The irony runs deep. Earlier reports showed DOGE initially used Meta's AI instead of Grok because Musk's product wasn't ready. The team also eliminated Census Bureau surveys worth $16.5 million without following required public comment processes.

Congress has demanded investigations, arguing AI isn't ready for high-stakes government decisions without proper oversight and transparency.

Why this matters:

  • Musk may be the first government official to use his position to directly promote his own AI product to federal agencies
  • The push for government AI adoption without safeguards could set dangerous precedents for data security and conflicts of interest

Read on, my dear:


Better prompting

Today: Enhanced Decision-Making Prompt

"I'm choosing between [Option A] and [Option B]. Here's my context: [brief situation description].

My top 3 priorities are:

  1. [Priority 1 with weight/importance]
  2. [Priority 2 with weight/importance]
  3. [Priority 3 with weight/importance]

For each option, analyze:

  • How well it meets each priority (rate 1-10)
  • Key advantages and drawbacks
  • Potential risks or unknown factors
  • Resource requirements (time, money, effort)

Then give me your recommendation with clear reasoning."

For time-sensitive decisions: Add "I need to decide by [date]. What information gaps should I fill first?"

For high-stakes choices: Add "What would I need to believe for each option to be clearly wrong?"

For multiple stakeholders: Add "Key people affected: [list]. How does each option impact them?"

This structure works whether you're choosing between job offers, investment options, or what to have for dinner.
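If you want to run the template programmatically rather than in a chat window, here is a minimal sketch using Anthropic's Python SDK (pip install anthropic). The filled-in options, weights, and the model ID string are placeholder assumptions; verify model names against the current docs.

```python
# A minimal sketch: sending the decision prompt to Claude via Anthropic's
# Python SDK. Assumes ANTHROPIC_API_KEY is set in the environment; the
# model ID and the example options below are placeholders.
import anthropic

prompt = """I'm choosing between a job offer in Austin and one in Berlin.
Here's my context: mid-career backend engineer, family of three.

My top 3 priorities are:
1. Compensation (weight: high)
2. Work-life balance (weight: high)
3. Career growth (weight: medium)

For each option, analyze:
- How well it meets each priority (rate 1-10)
- Key advantages and drawbacks
- Potential risks or unknown factors
- Resource requirements (time, money, effort)

Then give me your recommendation with clear reasoning."""

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
message = client.messages.create(
    model="claude-opus-4-20250514",  # assumed model ID; check current docs
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
)
print(message.content[0].text)
```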


Trump Threatens Apple With 25% iPhone Tariff Unless Production Moves to US

Trump delivered an ultimatum to Apple Friday: manufacture iPhones in America or pay a 25% tariff. The threat targets Apple's plan to shift production from China to India, which CEO Tim Cook said would supply the "majority" of US-sold iPhones in coming months.

The president made his demands clear on Truth Social. "I expect their iPhone's that will be sold in the United States of America will be manufactured and built in the United States, not India, or anyplace else," Trump wrote. "If that is not the case, a Tariff of at least 25% must be paid by Apple to the U.S."

Apple shares dropped 3% in premarket trading. The company had no immediate comment.

The threat comes as Apple was executing a careful strategy to avoid Trump's China tariffs. The company planned to source all 60 million US iPhones from India by next year's end. Foxconn is investing $1.5 billion to expand iPhone production in India with a new display facility near Chennai.

Trump didn't stop with Apple. He also threatened a 50% tariff on all European Union imports starting June 1. The dual threats sent global markets tumbling after weeks of trade de-escalation had provided some relief.

Wall Street analysts estimate moving iPhone production to America would raise prices by at least 25%. Wedbush's Dan Ives put the cost of a US-made iPhone at $3,500. The iPhone 16 Pro currently sells for about $1,000.

The challenge is stark: America sells over 60 million smartphones annually but has no domestic manufacturing capacity. Apple navigated similar threats during Trump's first term, when Cook's relationship with the president helped exclude core Apple products from tariffs on Chinese imports.

This time feels different. Trump has ramped up pressure on Cook over recent weeks, including a White House meeting Tuesday. Cook donated $1 million to Trump's inauguration and announced $500 billion in US development spending, including AI server production in Houston.

Why this matters:

  • Trump just called Apple's bluff on its India strategy, forcing the company to choose between massive tariffs or impossible domestic manufacturing timelines
  • The threat shows Trump views even American companies as fair game in his trade war, signaling no business is safe from his tariff crusade

Read on, my dear:

CNBC: Trump says a 25% tariff ā€˜must be paid by Apple’ on iPhones not made in the U.S.


AI & Tech News


Anthropic's Claude 4 models beat rivals at coding and reasoning

Anthropic released Claude Opus 4 and Claude Sonnet 4, claiming its flagship model outperforms competitors at coding tasks and can work autonomously for seven hours straight. The company says Opus 4 beats Google's Gemini 2.5 Pro, OpenAI's o3 reasoning model, and GPT-4.1 in coding benchmarks and tool usage tests.

OpenAI plans voice-only companion that ditches smartphones entirely

OpenAI acquired Jony Ive's design startup for $6.5 billion to create a screenless AI device worn around the neck or carried in pockets. The voice-controlled companion uses cameras and microphones to understand surroundings without requiring users to look at screens, with OpenAI planning to ship 100 million units by late 2026 and potentially give them away free to ChatGPT subscribers.

Republicans block state AI regulation for a decade

House Republicans passed a bill 215-214 that would ban states from enforcing AI laws for 10 years, calling state regulations a "confusing patchwork" that hurts innovation. Democrats and some Republicans oppose the measure, saying it leaves consumers unprotected while helping tech companies avoid oversight.

Company shifts focus to smart glasses as AI competition heats up

Apple canceled plans to add cameras to its smartwatches and now focuses on developing AI-powered smart glasses to compete with Meta's Ray-Bans. The company struggles with AI features across its devices while rivals like Google and Meta advance their platforms, forcing Apple to rely on outside partners for core AI functions.

Emerging VCs struggle as big firms dominate market

New venture capital firms are collapsing as fundraising becomes nearly impossible. Emerging managers raised only $4.7 billion this year compared to $64 billion in 2021, with first-time funds securing just $1.1 billion while established firms like Sequoia and Andreessen Horowitz continue pulling in major investments.

Feds charge 16 Russians in massive botnet operation

The Justice Department charged 16 Russians with running DanaBot malware that infected at least 300,000 computers worldwide since 2018. The botnet served multiple purposes: stealing banking credentials, installing ransomware, and conducting espionage against Western government officials and military targets for apparent Russian state interests.

Bluesky opens verification to notable users

Bluesky launched verification for notable accounts through an application process, though criteria remain vague beyond requiring professional recognition or media coverage. The platform also lets organizations become Trusted Verifiers and allows users to self-verify using domain names, with over 270,000 accounts already choosing domain verification over traditional blue badges.
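Domain verification is the interesting mechanism here. Under the AT Protocol, a handle like @example.com is verified via a DNS TXT record at _atproto.example.com whose value carries the account's DID. Here is a rough sketch of that lookup using the dnspython package; the helper name is ours, and the record format is as documented by the protocol at the time of writing.

```python
# Rough sketch of checking a Bluesky/AT Protocol domain handle: the protocol
# publishes a TXT record at _atproto.<domain> of the form "did=did:plc:...".
# Requires dnspython (pip install dnspython). Helper name is hypothetical.
import dns.resolver

def atproto_did_for(domain: str) -> str | None:
    try:
        answers = dns.resolver.resolve(f"_atproto.{domain}", "TXT")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return None
    for record in answers:
        text = b"".join(record.strings).decode()
        if text.startswith("did="):
            return text[len("did="):]
    return None

print(atproto_did_for("bsky.app"))  # prints the DID if the record exists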

YouTube hires Disney executive to run media and sports deals

YouTube hired Disney's Justin Connolly as global head of media and sports to manage relationships with major content providers and oversee live sports programming. The move signals YouTube's shift from user videos to premium entertainment, as the platform now generates $36 billion in ads annually and attracts more viewers than any traditional network.

OnlyFans owner in talks to sell for $8 billion

OnlyFans owner Fenix International is negotiating a sale to an investor group led by Forest Road Company at an $8 billion valuation. The adult content platform saw revenue jump from $375 million in 2020 to $6.6 billion in 2023, though its explicit content makes it hard for traditional banks and investors to participate in deals.

Niantic sells gaming division to focus on enterprise spatial intelligence

Niantic sold its Pokemon Go gaming business to Saudi-owned Scopely for $3.5 billion and rebranded as Niantic Spatial to focus on AI mapping tools for enterprises. The company will use location data from 30 billion miles of player walking to power AI models that help robots navigate and augmented reality glasses understand physical spaces.


šŸš€ AI Profiles: The Companies Defining Tomorrow

Deepgram: Physics Lab to Voice AI Powerhouse
Three University of Michigan physicists ditched dark matter research in 2015 to solve their own problem: making sense of endless audio recordings. Their solution became Deepgram, the speech recognition platform that's making Big Tech sweat.

The Founders Founded in 2015 by Scott Stephenson, Adam Sypniewski, and Noah Shutty. 146 employees. Born from frustration with terrible speech-to-text tools while building life-streaming wearables. Relocated from an underground physics lab to Silicon Valley via Y Combinator. šŸ”¬ā†’šŸš€

The Product End-to-end deep learning speech recognition that crushes legacy systems. Real-time transcription, custom model training, audio search, 30+ languages. Powers everything from NASA mission control to Spotify's audio analysis. Costs $0.004/minute—undercutting Google and Amazon while delivering better accuracy. API-first, developer-friendly, works on-premises or cloud.
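For a sense of what "API-first" looks like in practice, here is a rough sketch of a transcription request against Deepgram's hosted endpoint. The endpoint shape follows their public docs at the time of writing, but treat the URL, parameters, and response path as assumptions to verify; the API key and audio URL are placeholders.

```python
# Rough sketch of Deepgram's pre-recorded transcription API; verify the
# endpoint and response shape against current docs before relying on it.
import requests

DEEPGRAM_API_KEY = "your-key-here"  # placeholder

resp = requests.post(
    "https://api.deepgram.com/v1/listen",
    headers={
        "Authorization": f"Token {DEEPGRAM_API_KEY}",
        "Content-Type": "application/json",
    },
    json={"url": "https://example.com/meeting.wav"},  # hypothetical audio file
    timeout=60,
)
resp.raise_for_status()
result = resp.json()
print(result["results"]["channels"][0]["alternatives"][0]["transcript"])
```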

The Competition David vs. Goliaths: Google, Amazon, Microsoft dominate with their speech APIs. Nuance (now Microsoft) owns healthcare/legal. Otter.ai targets meeting transcription. Speechmatics and AssemblyAI chase the same developer market. Deepgram's edge: pure AI focus, custom models, and pricing that makes enterprise customers switch. Even offers managed OpenAI Whisper to stay relevant.

Financing $86M raised across multiple rounds. Y Combinator seed, $12M Series A (Wing Venture Capital), $72M Series B (Tiger Global, Madrona). Strategic backing from NVIDIA, In-Q-Tel (the CIA's venture arm), SAP. Valued in the hundreds of millions. Lean team, strong margins, approaching profitability.

The Future ⭐⭐⭐⭐⭐ Voice data is the new oil, and Deepgram built the refinery first. The speech recognition market is projected to hit $48B by 2030. The company's betting on AI voice intelligence: emotion detection, real-time translation, meeting summaries. šŸ“ˆ Either an acquisition target for Microsoft or Google, or the next Twilio of voice. Former particle physicists who mastered separating signal from noise underground are now doing it in the cloud.

Your billing was not updated.