More than 100 scientists from Johns Hopkins, Oxford, Stanford, Columbia, and NYU have published a framework in Science that would impose tiered access restrictions on a narrow class of biological data used to train AI models. The proposed system, called Biosecurity Data Levels, creates five tiers of control, from fully open to government-screened, modeled on the World Health Organization's existing lab safety classification. Existing datasets stay untouched, the authors write. Only data collected after governments sign on would face restrictions.
The timing is loaded. The Trump administration's Genesis Mission, announced late last year, aims to build AI systems trained on massive scientific datasets to speed research breakthroughs. AI labs are releasing new biological models without the safety assessments that would be standard in other areas of life-science research. And no government-backed expert panel has told developers which specific data carries meaningful biosecurity risks. The inertia is striking. Frontier companies are left to guess on their own.
"Right now, there's no expert-backed guidance on which data poses meaningful risks, leaving some frontier developers to make their best guess and voluntarily exclude viral data from training," Jassi Pannu, assistant professor at the Johns Hopkins Center for Health Security and one of the paper's authors, told Axios.
Key Takeaways
- Over 100 researchers propose five-tier Biosecurity Data Levels for AI training data, published in Science
- Only new pathogen data would face restrictions; existing datasets stay fully open
- AI models trained without viral data lose dangerous capabilities, giving data governance empirical backing
- No government expert panel currently classifies which biological data carries biosecurity risk
Most biological data stays open under the framework
The framework sorts biological data into five Biosecurity Data Levels, or BDLs. At the bottom, BDL-0 covers the vast majority of biological data and imposes no controls. BDL-1 requires a registered account and government-issued ID for data that could help AI models learn general patterns of how viruses infect eukaryotes. BDL-2 adds institutional affiliation checks and bad-actor screening for functional data on pandemic-capable virus properties like host range and environmental stability.
Real restrictions kick in at the top two tiers. BDL-3 covers functional data on human-infecting viruses, the sort that maps transmissibility, virulence, and immune evasion to specific genetic sequences. Access demands a justified use case, work inside a mandatory trusted research environment where scientists never touch raw data directly, and a pre-publication risk assessment. BDL-4 goes further still. Data that could enable AI to design enhanced pandemic-potential viral variants would carry all lower-tier controls plus government pre-publication review. Officials would retain the right to determine who gets access to any model trained on BDL-4 data.
Lab containment gave the authors their structural template. Working with Ebola means airlocks, chemical showers, and a pressurized suit before you get near the freezer. That's BSL-4. At BSL-1, you walk in off the street. The principle transfers cleanly to data. Most biological datasets need no special handling. A narrow category at the top demands containment as rigorous as any pathogen storage facility, just digital instead of physical.
One detail matters more than it might appear. The framework classifies data by its predicted ability to give AI systems dangerous capabilities, not by the species of virus the data came from. A dataset about an otherwise harmless virus family could land at BDL-3 if it contained functional information, say transmissibility data linked to specific genetic mutations, that an AI model could generalize to more dangerous pathogens. The category tracks danger by capability, not by origin.
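For readers who think in code, a minimal sketch of the tier structure described above might look like the Python below. The tier names and the cumulative stacking of controls follow the paper's description; everything else, from the dictionary layout to the `required_controls` helper, is a hypothetical simplification, not the framework's actual specification.

```python
from enum import IntEnum

class BDL(IntEnum):
    """Biosecurity Data Levels, as sketched in the proposal (simplified)."""
    BDL_0 = 0  # vast majority of biological data, no controls
    BDL_1 = 1  # registered account plus government-issued ID
    BDL_2 = 2  # adds institutional affiliation checks and bad-actor screening
    BDL_3 = 3  # adds justified use case, trusted research environment,
               # and pre-publication risk assessment
    BDL_4 = 4  # adds government pre-publication review and control over
               # who may access models trained on the data

# Controls stack: each tier inherits everything required below it.
TIER_CONTROLS: dict[BDL, list[str]] = {
    BDL.BDL_1: ["registered account", "government-issued ID"],
    BDL.BDL_2: ["institutional affiliation check", "bad-actor screening"],
    BDL.BDL_3: ["justified use case", "trusted research environment",
                "pre-publication risk assessment"],
    BDL.BDL_4: ["government pre-publication review",
                "access control on models trained with the data"],
}

def required_controls(level: BDL) -> list[str]:
    """Every control that applies at a given tier (its own plus all lower tiers)."""
    return [c for tier, controls in TIER_CONTROLS.items()
            if tier <= level for c in controls]
```

Calling `required_controls(BDL.BDL_3)` returns the identity, screening, and containment requirements together, mirroring the paper's point that a BDL-3 dataset carries every lower-tier obligation as well. The classification decision itself, which the paper bases on what capabilities the data could confer rather than on which virus it came from, remains a human judgment no helper function can automate.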
Stripping data actually blunts AI capabilities
Critics of data governance often argue that restrictions amount to closing the barn door after the horse has bolted. The authors counter with recent experimental results. A version of the generative protein model ESM3, trained on a database that excluded virus-specific proteins, performed far worse at virus-related tasks than the same model trained with viral data included. Parallel results came from the Evo 2 DNA language model. When researchers withheld genetic sequences of viruses that infect eukaryotes, Evo 2 predicted mutation effects in human genes capably but stumbled on viral genes. Returning the withheld viral data boosted the model's ability to predict virulence and immune escape.
Pull the training data, and even sophisticated model architectures lose their dangerous edge. That finding hands data governance something it rarely gets. Empirical backing, not just theory.
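The design behind those results is simple to state, even if running it is not. Here is a hedged sketch of the comparison, with `train_model`, the corpus objects, and the benchmark interface all standing in as hypothetical placeholders rather than the actual ESM3 or Evo 2 pipelines.

```python
# Schematic of a data-ablation study: train the same architecture twice,
# once on the full corpus and once with viral sequences withheld, then
# score both models on virus-related and non-virus benchmarks.
# Every name here is a placeholder, not a real ESM3 or Evo 2 API.

def ablation_study(train_model, corpus, viral_ids, benchmarks):
    """Return per-benchmark scores for models trained with and without viral data."""
    ablated_corpus = [seq for seq in corpus if seq["id"] not in viral_ids]

    model_full = train_model(corpus)             # baseline: viral data included
    model_ablated = train_model(ablated_corpus)  # viral sequences removed

    return {
        name: {
            "with_viral_data": score_fn(model_full),
            "without_viral_data": score_fn(model_ablated),
        }
        for name, score_fn in benchmarks.items()
    }
```

The published pattern is what you would expect to see in the returned scores: little separation on non-viral benchmarks, a large gap on tasks like predicting viral virulence or immune escape.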
The window for controls is still open. Barely. Collecting high-throughput data that maps pathogen genotypes to real-world traits is expensive and requires deep expertise across multiple areas of biology. Few labs in the world do this work at scale. MaveDB, a leading repository for multiplexed assays of variant effect studies, contained only 45 virus-related datasets as of late 2024, roughly 2% of its total collection. Datasets that would make AI models truly dangerous at pathogen design have not been produced at scale yet. But researchers and biotech startups are racing to change that, pulled by the gravity of prestigious publications, intellectual property, and venture capital.
"We're constantly surprised," Pannu said. "And so I would argue that for these large-scale, consequential risks, we should try and prevent these worst-case scenarios and be prepared for them."
Voluntary restraint has a hole in it
Some AI developers have already taken action on their own, nervous enough to move without waiting for regulators. Several prominent biological model builders stripped virology data from their training sets because they worried about putting that capability into the world. Good instinct. But voluntary action creates a specific vulnerability: third parties who don't share those scruples can scrape publicly available data, fine-tune existing models, and rebuild the capabilities that responsible developers tried to remove.
"Legitimate researchers should have access," Pannu told Axios. "But we shouldn't be posting it anonymously on the internet where no one can track who downloads it."
Privacy protections for health data offer a precedent the researchers lean on heavily. Scientists already accept access limitations on genetic data. The NIH does this with its "All of Us" project, which holds health and genetic data for more than 630,000 people, split into public, registered, and controlled tiers. England's OpenSAFELY project goes further, locking health records for the entire population behind strict ethical controls. Researchers submit their code to the platform and never see raw data. Staff review aggregate outputs before anything leaves the building.
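That pattern is often described as moving the code to the data rather than the data to the researcher. A minimal sketch of the idea, not OpenSAFELY's actual interface, looks roughly like this:

```python
from typing import Any, Callable

class TrustedResearchEnvironment:
    """Code-to-data workflow: analysis runs inside; only reviewed aggregates leave."""

    def __init__(self, records: list[dict], output_reviewer: Callable[[Any], bool]):
        self._records = records                  # raw data never exposed directly
        self._output_reviewer = output_reviewer  # human disclosure check on outputs

    def run(self, analysis: Callable[[list[dict]], Any]) -> Any:
        """Execute submitted analysis code against the data held in the environment."""
        result = analysis(self._records)
        if not self._output_reviewer(result):
            raise PermissionError("Output withheld: failed disclosure review")
        return result  # only the approved aggregate ever leaves
```

A BDL-3 trusted research environment, as the paper describes it, would apply the same shape to pathogen data: scientists query it, but never hold the raw dataset themselves.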
These systems are imperfect: messy, slow, bureaucratic. Scientists gripe about paperwork and access delays. And they work. Recombinant DNA research went through the same cycle. Asilomar locked it down in 1975. Within a decade, the strictest rules were gone.
Build the system now, the authors argue. Better that than a panicked crackdown during a future crisis when fear drowns out careful thinking.
The threat is wider than the framework admits
The Science paper focuses on data governance. It sits inside a bigger, more anxious argument about AI-enabled biological threats. Hawks and doves, Silicon Valley and its regulators, have started arriving at the same conclusion.
Former UN weapons inspector Rocco Casagrande brought a sealed container into the Eisenhower Executive Office Building. That was three years ago. A dozen test tubes inside. Properly assembled, those ingredients could start a pandemic. An AI chatbot had provided the recipe, TIME reported. Casagrande was briefing government officials on AI-enabled pandemic risks. The props worked.
Policy responses since then have been scattered. America's AI Action Plan proposes defensive measures. Britain made chemical and biological defense a priority in its 2025 Strategic Defence Review. Frontier companies signed summit pledges in Seoul covering the full weapons spectrum, chemical through nuclear. The pledges read well. But TIME's analysis argues the AI safety apparatus remains fixated on one threat model, the lone actor engineering a pandemic virus, while broader risks go unexamined. Chemical weapons and improvised explosives require entirely different technical steps from transmissible biological agents, and AI safety tests don't yet distinguish between them.
AI labs have started walking back some safety commitments. Several companies publish tests on whether their models can help cause a pandemic but don't disclose whether those same models could assist with chemical or explosive attacks. Classified intelligence needed to calibrate those tests sits with governments. Proprietary data on suspicious user behavior sits with companies. Both sides hoard what the other needs.
Counter-terrorism experts at Munich's security conference last year ran a simulation. An AI-created pandemic. In the scenario, attackers used generative AI to engineer a new enterovirus strain from scratch. The tally: 850 million infections, 60 million dead. A US senator at the conference said biosecurity is where the risks of deploying AI run highest, Le Figaro reported.
China released its "AI Safety Governance Framework 2.0" last fall. Public hand-wringing is not Beijing's habit, but the document warned of "loss of control over knowledge and performance of nuclear, biological, chemical weapons and missiles." It stated that AI now accesses "a wide range of data" that includes "basic theoretical knowledge" related to weapons of mass destruction. If this data escapes proper control, the framework warned, extremist groups could acquire the capacity to design and manufacture such weapons.
When Beijing and Washington start worrying about the same thing from opposite directions, the thing tends to be real.
What happens if nobody acts
The Biosecurity Data Levels framework asks for something genuinely modest: tiered access controls on a narrow slice of new pathogen data, enforced through mechanisms that already exist for privacy protection, designed to get lighter as evidence accumulates. Most biological data stays open. Most scientists never encounter the restricted tiers. Researchers can appeal classifications they disagree with. Governments must guarantee fast review timelines.
Modest proposals can stall for years while the data they're designed to protect multiplies on public servers. No government-backed expert panel exists today to classify biological data by risk level. No international framework governs biological data in the context of AI. Congress told the Pentagon in the 2026 NDAA to develop biodata storage requirements. The Pentagon hasn't written them.
Release biological data and you cannot pull it back. Copied instantly, stored by third parties, scraped into training sets before anyone notices. You can lock down a pathogen behind an airlock. You can revoke a lab's BSL-4 clearance. You cannot un-release a dataset. That asymmetry is what makes the researchers' argument so pointed. The expensive, specialized data that would give AI models their most dangerous biological capabilities hasn't been produced at scale yet. But the labs producing it are funded and staffed, with papers on the way.
The test tubes Casagrande carried into the White House were props. The next ones might not be.
Frequently Asked Questions
What are Biosecurity Data Levels?
A five-tier system (BDL-0 through BDL-4) controlling access to biological data used in AI training. BDL-0 imposes no controls on most data. BDL-4 requires government pre-publication review for data that could help AI design enhanced pandemic-potential pathogens.
Which AI models does this target?
Not consumer chatbots like ChatGPT or Claude. The concern centers on specialized biological AI models such as the protein model ESM3 and the DNA language model Evo 2, which learn the language of biological sequences much as large language models learn human language.
Does removing viral data from AI training actually work?
Yes. ESM3 performed far worse at virus-related tasks when trained without virus-specific proteins. Evo 2 stumbled on viral gene predictions when researchers withheld sequences of viruses that infect eukaryotes. In both cases, the capabilities returned once viral data went back into training.
What is MaveDB and why does it matter?
MaveDB is a leading repository for multiplexed assays of variant effect studies. As of late 2024, it held only 45 virus-related datasets, roughly 2% of its collection. That small number shows the window for data controls remains open.
How does this relate to existing data governance?
The framework mirrors proven models like the NIH's All of Us project, which manages data for 630,000-plus participants across three access tiers, and England's OpenSAFELY, which locks population-wide health records behind strict ethical controls.