Premium AI Search: Pay More, Get More Confident Wrong Answers
AI search tools are flunking basic fact-checking tests. The plot thickens: premium versions are even worse at admitting their mistakes.
A new study from the Tow Center for Digital Journalism tested eight AI chatbots on a simple task: find the original source of news articles. The results would make a journalism professor weep. The bots botched over 60% of queries, with some premium services charging users $40 monthly for the privilege of receiving more confidently incorrect answers.
The researchers didn't ask for rocket science. They fed the chatbots verbatim excerpts that a plain Google search could trace back to their sources. Yet the AI tools stumbled spectacularly, with Grok 3 getting it wrong 94% of the time. DeepSeek misattributed sources in 57.5% of cases, while even Perplexity, the best performer, still got 37% of its answers wrong.
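The task is simple enough to sketch in code. Below is a minimal Python version of the grading loop, assuming a hypothetical `query_chatbot` callable that wraps whatever chatbot API is under test and returns a cited URL (or None when the bot declines); the exact-URL grading is a simplification of the Tow Center's finer-grained rubric.

```python
from urllib.parse import urlparse

def grade_attribution(excerpt: str, true_url: str, query_chatbot) -> str:
    """Grade one source-attribution attempt, loosely mirroring the study.

    query_chatbot is a hypothetical stand-in for a chatbot API call:
    it takes a prompt and returns the URL the bot cites, or None if
    the bot declines to answer.
    """
    prompt = (
        "Identify the article this excerpt comes from. "
        f"Reply with the publisher and the article's URL.\n\n{excerpt}"
    )
    cited_url = query_chatbot(prompt)
    if cited_url is None:
        return "declined"  # the honest outcome the study rarely saw
    if cited_url == true_url:
        return "correct"
    # Partial credit when at least the publisher's domain matches.
    if urlparse(cited_url).netloc == urlparse(true_url).netloc:
        return "right publisher, wrong link"
    return "wrong"
```

Counting "declined" as its own bucket matters: as the results below show, refusing to answer was the one form of honesty most bots skipped.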
Premium models like Perplexity Pro ($20/month) and Grok 3 ($40/month) proved particularly entertaining. While they got more answers right than their free counterparts, they also cranked up the confidence on their wrong answers. It's like paying extra for a tour guide who leads you down the wrong street with absolute certainty.
Credit: Tow Center for Digital Journalism
Some chatbots displayed a rebellious streak, accessing content from publishers who explicitly blocked them. Perplexity, which claims to "respect robots.txt directives," somehow managed to find and cite paywalled National Geographic articles it shouldn't have seen. Asked about it, both Perplexity and National Geographic maintained a diplomatic silence.
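Worth remembering: robots.txt is an honor system. A well-behaved crawler fetches the file and checks itself before requesting a page, and Python's standard library does that check in a few lines. In the sketch below, "PerplexityBot" is the user agent name Perplexity documents for its crawler, and the article path is a placeholder.

```python
import urllib.robotparser

# Fetch and parse a publisher's robots.txt, then ask whether a given
# crawler user agent is allowed to request a specific page.
rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://www.nationalgeographic.com/robots.txt")
rp.read()

# Placeholder article path; swap in any real URL from the site.
allowed = rp.can_fetch("PerplexityBot", "https://www.nationalgeographic.com/article")
print("PerplexityBot allowed:", allowed)
```

If that prints False and the crawler fetches the page anyway, the directive was simply ignored; nothing in the protocol enforces it.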
The most striking failures came in the form of completely fabricated citations. ChatGPT confidently attributed a Wall Street Journal article about tech layoffs to The Verge, complete with a made-up author name and publication date. Gemini decided that a New York Times piece about climate change actually appeared in Scientific American – three years earlier than it was written.
The chatbots' URL generation proved equally creative. Grok 3 led users to error pages 154 times out of 200 attempts. Even when it correctly identified an article, it often fabricated a link that went nowhere – a digital version of "the dog ate my homework." Gemini wasn't far behind, with more than half of its responses featuring broken or non-existent URLs.
Credit: Tow Center for Digital Journalism
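Validating a cited link is about as basic as automation gets, which makes those numbers hard to excuse. Here's a minimal sketch using the requests library; treating HEAD responses with status 400 and up as error pages is my assumption, not the study's published method.

```python
import requests

def dead_link_rate(cited_urls: list[str]) -> float:
    """Return the fraction of cited URLs that fail to resolve."""
    if not cited_urls:
        return 0.0
    dead = 0
    for url in cited_urls:
        try:
            # HEAD keeps the probe cheap; some servers require GET instead.
            resp = requests.head(url, allow_redirects=True, timeout=10)
            if resp.status_code >= 400:
                dead += 1
        except requests.RequestException:
            dead += 1  # malformed or unreachable URLs count as dead
    return dead / len(cited_urls)
```

By this crude measure, Grok 3's 154 error pages out of 200 citations works out to a 77% dead-link rate.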
Even having a formal partnership with AI companies didn't guarantee accurate citations. Time magazine signed deals with both OpenAI and Perplexity, yet neither company's bots could consistently identify Time's content correctly. It's like hiring a librarian who can't find books on their own shelf.
The study revealed particular problems with premium services. While companies market these upgraded versions as more reliable, the data tells a different story. Premium chatbots were actually more likely to deliver wrong answers with unwavering confidence than to admit their limitations. Only Copilot showed some humility, declining more questions than it answered incorrectly.
The error-rate extremes in the Tow Center's test:
Grok 3 (94% wrong answers) - Charging $40/month to lead users down digital dead ends
Perplexity (37% incorrect answers) - The least bad of a problematic bunch
Why this matters:
AI search tools are serving as unreliable middlemen between readers and news, confidently presenting wrong information while cutting off traffic to legitimate sources.
The premium pricing model in AI search appears to be selling false confidence rather than improved accuracy – users are paying more for tools that are actually less likely to admit their limitations.