China's EUV prototype isn't a technological defeat for the West. It's a counterintelligence one. The vector isn't smuggled crates. It's people. Europe discovered, again, that openness without defense is vulnerability, not virtue.
Chinese scientists built a working EUV prototype using former ASML engineers and secondary-market parts. The machine generates light but hasn't produced chips. ASML took 18 years from prototype to production. Beijing wants 3-5. The math doesn't add up.
Zuckerberg paid $14 billion for Scale AI founder Alexandr Wang to lead Meta's AI push. But Wang built a data-labeling company, not a research lab. The Financial Times reports tensions mounting as Turing Award winner Yann LeCun heads for the exit.
Even the most advanced AI visual systems have a serious problem: they try to answer questions they can't actually solve. A new study from the University of Tokyo and its collaborators tested leading AI models on what seems like a simple task - knowing when to say "I can't answer that."
The results revealed a concerning gap between what these systems claim to understand and what they truly comprehend. The researchers created three types of impossible questions: they removed correct answers from multiple-choice options, showed images that had nothing to do with the questions being asked, or provided completely irrelevant answer choices. A reliable AI system should recognize these situations and decline to answer.
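To make the setup concrete, here is a minimal sketch of how such impossible variants could be built from an ordinary multiple-choice question. The field names and helper function are illustrative assumptions, not the researchers' actual benchmark code.

```python
import random

def make_unsolvable_variants(item):
    """Build three variants of a multiple-choice VQA item that have no valid
    answer, so a reliable model should decline to respond.
    `item` is assumed to hold a question, answer options, the correct answer,
    the matching image, and a pool of unrelated distractor images."""
    question = item["question"]          # e.g. "What color is the bus?"
    options = list(item["options"])      # e.g. ["red", "blue", "green", "yellow"]
    answer = item["answer"]              # e.g. "blue"
    image = item["image"]

    # 1. Remove the correct answer from the choices.
    missing_answer = {
        "question": question,
        "options": [o for o in options if o != answer],
        "image": image,
    }

    # 2. Pair the question with an image that has nothing to do with it.
    unrelated_image = {
        "question": question,
        "options": options,
        "image": random.choice(item["distractor_images"]),  # assumed pool of unrelated images
    }

    # 3. Replace the choices with completely irrelevant options.
    irrelevant_options = {
        "question": question,
        "options": ["Tuesday", "7 kg", "Beethoven", "north"],  # placeholder distractors
        "image": image,
    }

    return missing_answer, unrelated_image, irrelevant_options
```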
But that's not what happened. While these same AI models score impressively on standard tests, they perform dismally when faced with impossible questions. Many open-source models got scores below 6% when they should have said "I can't answer this."
The gap between closed-source models (like GPT-4 Vision) and open-source alternatives proved particularly stark. While GPT-4 Vision managed to identify unsolvable questions about 60% of the time, popular open-source models like CogVLM2 scored below 1% - despite both performing similarly well on standard tests.
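What counts as "identifying" an unsolvable question comes down to whether the model declines to pick an option. A rough sketch of that kind of scoring, assuming a simple keyword check for refusals, might look like the following; this is an illustration, not the paper's actual evaluation code.

```python
# Phrases assumed to signal that the model is declining to answer.
REFUSAL_MARKERS = [
    "i can't answer",
    "cannot be answered",
    "none of the options",
    "no correct answer",
    "not enough information",
]

def is_refusal(model_output: str) -> bool:
    """Return True if the model's reply signals that the question is unanswerable."""
    text = model_output.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def abstention_accuracy(outputs: list[str]) -> float:
    """Share of impossible questions on which the model correctly declined to answer."""
    if not outputs:
        return 0.0
    return sum(is_refusal(o) for o in outputs) / len(outputs)
```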
"This suggests that our community's efforts to improve performance on existing benchmarks do not directly contribute to enhancing model reliability," the researchers note. In other words, we've been teaching AI to guess even when it shouldn't.
The study uncovers different failure patterns among models. Some struggle specifically with visual tasks, while others have trouble with basic reasoning about whether questions are answerable. The researchers found that adding explicit instructions to consider whether questions were impossible helped some models but made others perform worse.
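As a rough illustration of that prompting experiment, the extra instruction can be appended as one more line in the prompt. The wording below is an assumption for the sketch, not the study's actual template.

```python
def build_prompt(question: str, options: list[str], warn_unanswerable: bool) -> str:
    """Format a multiple-choice prompt, optionally warning the model that
    the question may have no valid answer."""
    choices = "\n".join(f"({chr(65 + i)}) {opt}" for i, opt in enumerate(options))
    prompt = f"{question}\n{choices}\nAnswer with the letter of the best option."
    if warn_unanswerable:
        prompt += (
            "\nIf the question cannot be answered from the image, or none of the "
            "options is correct, reply with 'None of the above'."
        )
    return prompt
```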
Looking ahead, the team suggests that future AI development needs to focus not just on getting the right answers, but on recognizing when no valid answer exists.
Why this matters:
Current AI visual systems are overconfident - they'll try to answer questions even when they can't possibly know the answer
We need new ways to measure AI reliability beyond just accuracy scores on standard tests
Cloudflare's 2025 data shows Googlebot ingests more content than all other AI bots combined. Publishers who want to block AI training face an impossible choice: block Googlebot, and they lose search visibility entirely. The structural advantage runs deeper than most coverage acknowledges.
Stanford's AI hacker cost $18/hour and beat 9 of 10 human pentesters. The headlines celebrated a breakthrough. The research paper reveals an AI that couldn't click buttons, mistook login failures for success, and required constant human oversight.
Microsoft analyzed 37.5M Copilot conversations. Health queries dominated mobile usage every hour of every day. Programming's share collapsed. The data shows users want a confidant, not a productivity tool. The industry built for the boardroom anyway.
64% of teens use AI chatbots. But which ones? Higher-income teens cluster around ChatGPT for productivity. Lower-income teens are twice as likely to use Character.ai—the companion bot facing wrongful death lawsuits. The technology is sorting kids by class.