

Bio X AI: Policy Recommendations For A New Frontier

12.12.23 | 27 MIN READ | TEXT BY NAZISH JEFFERY & SARAH R. CARTER & TESSA ALEXANIAN & OLIVER CROOK & SAMUEL CURTIS & RICHARD MOULANGE & SHRESTHA RATH & SOPHIE ROSE
SCIENCE POLICY. DAY ONE PROJECT.

Artificial intelligence (AI) is likely to yield tremendous advances in our basic understanding of biological systems, as well as significant benefits for health, agriculture, and the broader bioeconomy. However, AI tools, if misused or developed irresponsibly, can also pose risks to biosecurity. The landscape of biosecurity risks related to AI is complex and rapidly changing, and understanding the range of issues requires diverse perspectives and expertise. To better understand and address these challenges, FAS initiated the Bio x AI Policy Development Sprint to solicit creative recommendations from subject matter experts in the life sciences, biosecurity, and governance of emerging technologies. Through a competitive selection process, FAS identified six promising ideas and, over the course of seven weeks, worked closely with the authors to develop them into the recommendations included here. These recommendations cover a diverse range of topics to match the diversity of challenges that AI poses in the life sciences. We believe that these recommendations will help inform policy development on these topics, including the work of the National Security Commission on Emerging Biotechnology.

AI tool developers and others have put significant effort into establishing frameworks to evaluate and reduce risks, including biological risks, that might arise from “foundation” models (i.e., large models designed to be used for many different purposes). These include voluntary commitments from major industry stakeholders and several efforts to develop methods for evaluating these models. The Biden Administration’s recent Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence (AI EO) furthers this work and establishes a framework for evaluating and reducing risks related to AI.

However, the U.S. government will need creative solutions to establish oversight for biodesign tools (i.e., more specialized AI models that are trained on biological data and provide insight into biological systems). Although experts, including those who participated in this Policy Sprint, hold differing views about the magnitude of the risks these tools pose, the tools are undoubtedly an important part of the landscape of biosecurity risks that may arise from AI. Three of the submissions to this Policy Sprint address the need for oversight of these tools. Oliver Crook, a postdoctoral researcher at the University of Oxford and a machine learning expert, calls on the U.S. government to ensure responsible development of biodesign tools by instituting a framework for checklist-based, institutional oversight of these tools. Richard Moulange, AI-Biosecurity Fellow at the Centre for Long-Term Resilience, and Sophie Rose, Senior Biosecurity Policy Advisor at the Centre for Long-Term Resilience, expand on the Executive Order on AI with recommendations for establishing standards for evaluating these tools’ risks. In his submission, Samuel Curtis, an AI Governance Associate at The Future Society, takes a more open-science approach, recommending that infrastructure for cloud-based computational resources be expanded internationally to promote critical advances in biodesign tools while establishing norms for responsible development.

Two of the submissions to this Policy Sprint work to improve biosecurity at the interface where digital designs might become biological reality. Shrestha Rath, a scientist and biosecurity researcher, focuses on biosecurity screening of synthetic DNA, which the Executive Order on AI highlights as a key safeguard, and offers recommendations for how to improve screening methods to better prepare for designs produced using AI. Tessa Alexanian, a biosecurity and bioweapons expert, calls for the U.S. government to issue guidance on biosecurity practices for automated laboratories, sometimes called “cloud labs,” that can generate organisms and other biological agents.

This Policy Sprint highlights the diversity of perspectives and expertise that will be needed to fully explore the intersections of AI with the life sciences, and the wide range of approaches that will be required to address their biosecurity risks. Each of these recommendations represents an opportunity for the U.S. government to reduce risks related to AI, solidify the U.S. as a global leader in AI governance, and ensure a safer and more secure future.

An Evidence-Based Approach To Identifying And Mitigating Biological Risks From AI-Enabled Biological Tools

RICHARD MOULANGE & SOPHIE ROSE

Both AI-enabled biological tools and large language models (LLMs) have advanced rapidly in a short time. While these tools have immense potential to drive innovation, they could also threaten the United States’ national security.

AI-enabled biological tools are AI tools trained on biological data using machine learning techniques, such as deep neural networks. They can already design novel proteins, viral vectors, and other biological agents, and may in the future be able to fully automate parts of the biomedical research and development process.
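
For readers less familiar with these systems, the sketch below shows the general shape of such a tool: a small neural network, written here in Python with PyTorch, trained on (synthetic, placeholder) protein sequence data to predict a numeric property. The architecture, data, and predicted property are all assumptions made for illustration and do not correspond to any real biodesign tool.

```python
# Illustrative only: a toy "AI-enabled biological tool" in the narrow sense used
# above -- a neural network trained on biological (here, synthetic) sequence data
# to predict a single numeric property. Real tools are far larger and trained on
# curated experimental datasets.
import random
import torch
import torch.nn as nn

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
MAX_LEN = 64

def encode(seq: str) -> torch.Tensor:
    """One-hot encode a protein sequence, padded/truncated to MAX_LEN."""
    x = torch.zeros(MAX_LEN, len(AMINO_ACIDS))
    for i, aa in enumerate(seq[:MAX_LEN]):
        x[i, AA_INDEX[aa]] = 1.0
    return x.flatten()

class PropertyPredictor(nn.Module):
    """A small multilayer perceptron mapping a sequence to one scalar property."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(MAX_LEN * len(AMINO_ACIDS), 128),
            nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

if __name__ == "__main__":
    # Placeholder training data: random sequences paired with random target values.
    seqs = ["".join(random.choices(AMINO_ACIDS, k=random.randint(10, MAX_LEN)))
            for _ in range(32)]
    X = torch.stack([encode(s) for s in seqs])
    y = torch.randn(len(seqs), 1)

    model = PropertyPredictor()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(10):  # a few illustrative gradient steps
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optimizer.step()
    print("final training loss:", loss.item())
```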

Sophisticated state and non-state actors could potentially use AI-enabled tools to more easily develop biological weapons (BW) or design them to evade existing countermeasures. As these tools become more accessible and easier to use, the pool of actors capable of misusing them broadens.

This threat was recognized by the recent Executive Order on AI, which calls for evaluating all AI models (not just LLMs) for capabilities that could enable chemical, biological, radiological, and nuclear (CBRN) threats, and for recommendations on how to mitigate identified risks.

Developing novel AI-enabled biological tool evaluation systems within 270 days, as directed by §4.1(b) of the Executive Order, will be incredibly challenging, because:

  • There appears to have been little progress on developing benchmarks or evaluations for AI-enabled biological tools in academia or industry, and government capacity (in the U.S. and the UK) has so far focused on model evaluations for LLMs, not AI-enabled biological tools.
  • Capabilities are entirely dual-use: for example, tools that can predict which viral mutations improve vaccine targeting can very likely identify mutations that increase vaccine evasion.

To meet this directive, it will be important to identify and prioritize the AI-enabled biological tools that pose the most urgent risks, and to weigh those risks against the tools’ potential benefits. However, government agencies and tool developers currently seem to struggle to:

  • Specify which AI–bio capabilities are the most concerning;
  • Determine the scope of AI–enabled tools that pose significant biosecurity risks; and
  • Anticipate how these risks might evolve as more tools are developed and integrated.

Some frontier AI labs have assessed the biological risks associated with LLMs, but there is no public evidence of AI-enabled biological tool evaluation or red-teaming, nor are there currently standards for developing—or requirements to implement—them. The White House Executive Order will build upon industry evaluation efforts for frontier models, addressing the risk posed by LLMs, but analogous efforts are needed for AI-enabled biological tools.
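
To make the evaluation gap concrete, a hypothetical harness for such evaluations might look like the sketch below: a library of benchmark tasks, each probing one capability of a tool and scored against a concern threshold. Every name, task, and threshold here is an assumption introduced for illustration; no such standard suite currently exists, which is precisely the gap identified above.

```python
# Hypothetical skeleton of a capability-evaluation harness for an AI-enabled
# biological tool. Task contents, thresholds, and the tool interface are all
# placeholders; a real evaluation suite would be developed (and secured) under
# a program like the one proposed below.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class EvalTask:
    name: str                          # which capability the task probes
    run: Callable[[Callable], float]   # returns a score in [0, 1] for a given tool
    concern_threshold: float           # score at or above which the capability is flagged

def evaluate_tool(tool: Callable, tasks: List[EvalTask]) -> dict:
    """Run every benchmark task against `tool` and flag concerning capabilities."""
    report = {}
    for task in tasks:
        score = task.run(tool)
        report[task.name] = {
            "score": score,
            "flagged": score >= task.concern_threshold,
        }
    return report

if __name__ == "__main__":
    # A stand-in "tool" and a benign placeholder task, for illustration only.
    def dummy_tool(query: str) -> str:
        return "no-op"

    tasks = [
        EvalTask(
            name="example-benchmark",
            run=lambda tool: 0.2,   # a real task would score the tool's actual outputs
            concern_threshold=0.8,
        )
    ]
    print(evaluate_tool(dummy_tool, tasks))
```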

Given the lack of research on AI-enabled biological tool evaluation, the U.S. Government must urgently stand up a specific program to address this gap and meet the Executive Order directives. Without evaluation capabilities, the United States will be unable to scope regulations around the deployment of these tools, and will be vulnerable to strategic surprise. Doing so now is essential to capitalize on the momentum generated by the Executive Order, and comprehensively address the relevant directives within 270 days.

Recommendations

The U.S. Government should urgently acquire the ability to evaluate biological capabilities of AI-enabled biological tools via a specific joint program at the Departments of Energy (DOE) and Homeland Security (DHS), in collaboration with other relevant agencies.

Strengthening the U.S. Government’s ability to evaluate models prior to their deployment follows the same logic as responsible drug or medical device development: we must ensure novel products do not cause significant harm before making them available for widespread public use.

The objectives of this program would be to:

  1. Develop state-of-the-art evaluations for dangerous biological capabilities
  2. Establish a DOE sandbox for testing evaluations on a variety of AI-enabled biological tools
  3. Produce standards for the performance, structure, and securitization of capability evaluations
  4. Use evaluations of the maturity and capabilities of AI-enabled biological tools to inform U.S. Intelligence Community assessments of potential adversaries’ current bio-weapon capabilities

Implementation 

  • Standing up DOE and DHS’s ‘Bio Capability Evaluations’ program will require an initial investment of $2 million, plus $2 million per year through 2030 to sustain it. Funding should draw on existing National Intelligence Program appropriations.
  • Supporting DOE to establish a sandbox for conducting ongoing evaluations of AI-enabled biological tools will require an investment of $10 million annually. This could be appropriated to DOE under the National Defense Authorization Act (Title II: Research, Development, Test and Evaluation), which establishes funding for AI defense programs.

Lead agencies and organizations

  • U.S. Department of Energy (DOE) can draw on expertise from National Labs, which often evaluate—and develop risk mitigation measures for—technologies with CBRN implications.
  • U.S. Department of Homeland Security (DHS) can contribute to threat assessments and inform biological risk mitigation strategy and policy.
  • National Institute of Standards and Technology (NIST) can develop the standards for the performance, structure, and securitization of dangerous capability evaluations.
  • U.S. Department of Health and Human Services (HHS) can leverage its AI Community of Practice (CoP) as an avenue for communicating with AI-enabled biological tool developers and researchers. The National Institutes of Health (NIH) funds relevant research and will therefore need to be involved in evaluations.

They should coordinate with other relevant agencies, including but not limited to the Department of Defense and the National Counterproliferation and Biosecurity Center.

The benefits of implementing this program include:

Leveraging public-private expertise. Public-private partnerships (involving both academia and industry) will produce comprehensive evaluations that incorporate technical nuance and national security considerations. This allows the U.S. Government to retain access to diverse expertise while safeguarding the sensitive contents and outputs of dangerous-capability evaluations—something that is harder to guarantee with third-party evaluators.

Enabling evidence-based regulatory decision-making. Evaluating AI tools allows the U.S. Government to identify the models and capabilities that pose the greatest biosecurity risks, enabling effective and appropriately scoped regulations. Avoiding blanket regulations better balances innovation and economic growth against risk mitigation and security.

Broad scope of evaluation application. AI-enabled biological tools vary widely in their application and current state of maturity. Consequently, what constitutes a concerning or dangerous capability may vary widely across tools, necessitating the development of tailored evaluations.

A Global Compute Cloud To Advance Safe Science And Innovation

SAMUEL CURTIS

Advancements in deep learning have ushered in significant progress in the predictive accuracy and design capabilities of biological design tools (BDTs), opening new frontiers in science and medicine through the design of novel functional molecules. However, these same technologies may be misused to create dangerous biological materials. Mitigating the risks of misuse of BDTs is complicated by the need to maintain openness and accessibility among globally-distributed research and development communities. One approach toward balancing both risks of misuse and the accessibility requirements of development communities would be to establish a federally-funded and globally-accessible compute cloud through which developers could provide secure access to their BDTs.

The term “biological design tools” (or “BDTs”) is a neologism referring to “systems trained on biological data that can help design new proteins or other biological agents.” Computational biological design is, in essence, a data-driven optimization problem. Consequently, over the past decade, breakthroughs in deep learning have propelled progress in computational biology. Today, many of the most advanced BDTs incorporate deep learning techniques and are used and developed by networks of academic researchers distributed across the globe. For example, the Rosetta Software Suite, one of the most popular BDT software packages, is used and developed by Rosetta Commons—an academic consortium of over 100 principal investigators spanning five continents.

Contributions of BDTs to science and medicine are difficult to overstate. BDTs are now used to identify new drug targets, design new therapeutics, and develop faster and less expensive drug synthesis techniques. There are already several AI-designed molecules in early-stage clinical trials.

Unfortunately, these same BDTs can be used for harm. They may be used to create pathogens that are more transmissible or virulent than known agents, target specific sub-populations, or evade existing DNA synthesis screening mechanisms. Moreover, developments in other classes of AI systems portend reduced barriers to BDT misuse. One group at RAND Corporation found that language models could provide guidance that could assist in planning and executing a biological attack, and another group from MIT demonstrated how language models could be used to elicit instructions for synthesizing a potentially pandemic pathogen. Similarly, language models could accelerate the acquisition or interpretation of information required to misuse BDTs. Technologies on the horizon, such as multimodal “action transformers,” could help individuals navigate BDT software, further lowering barriers to misuse.

Research points to several measures BDT developers could employ to reduce risks of misuse, such as securing machine learning model weights (the numerical values representing the patterns and information the model has learned during training), implementing structured access controls, and adopting Know Your Customer (KYC) processes. However, care would have to be taken not to unduly limit access to these tools, which could, in aggregate, impede scientific and medical advancement. For any given tool, access limitations risk diminishing its competitiveness (its available features and performance relative to other tools). These tradeoffs extend to developers’ interests: stifling a tool’s development may jeopardize research, funding, and even career stability. The difficulties of striking a balance in managing risk are compounded by the decentralized, globally-distributed nature of BDT development communities. To suit their needs, risk-mitigation measures should involve minimal, if any, geographic or political restrictions on access while simultaneously expanding the ability to monitor for and respond to indicators of risk or patterns of misuse.
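
As a rough illustration of what structured access and KYC-style controls could look like in practice, the sketch below gates each design query behind a verified-user registry, a per-user query limit, and an audit log, rather than releasing model weights directly. The registry, limits, and function names are assumptions made for this sketch, not an existing system or API.

```python
# Illustrative sketch of structured access control for a BDT served via an API
# instead of released as downloadable weights. The registry, limits, and audit
# log are placeholders; a real deployment would integrate institutional KYC
# verification and tamper-resistant logging.
import datetime
import logging

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("bdt-audit")

# Placeholder KYC registry: user id -> verified institutional affiliation.
VERIFIED_USERS = {"user-123": "Example University"}
DAILY_QUERY_LIMIT = 100
query_counts: dict[str, int] = {}

def run_design_tool(query: str) -> str:
    """Stand-in for the underlying biological design tool."""
    return "design-result-placeholder"

def handle_request(user_id: str, query: str) -> str:
    """Gate a single design query: verify the user, enforce limits, log the call."""
    if user_id not in VERIFIED_USERS:
        audit_log.warning("rejected unverified user %s", user_id)
        raise PermissionError("User has not completed identity verification.")

    query_counts[user_id] = query_counts.get(user_id, 0) + 1
    if query_counts[user_id] > DAILY_QUERY_LIMIT:
        audit_log.warning("rate limit exceeded for %s", user_id)
        raise PermissionError("Daily query limit exceeded.")

    timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
    audit_log.info("%s | user=%s | query=%r", timestamp, user_id, query)
    return run_design_tool(query)

if __name__ == "__main__":
    print(handle_request("user-123", "example benign design query"))
```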

One approach that would balance the simultaneous needs for accessibility and security would be for the federal government to establish a global compute cloud for academic research, bearing the costs of running servers and maintaining the security of the cloud infrastructure in the shared interests of advancing public safety and medicine. A compute cloud would enable developers to provide access to their tools through computing infrastructure managed—and held to specific security standards—by U.S. public servants. Such infrastructure could even expand access for researchers, including those from underserved communities, through fast-tracked grants in the form of computational resources.

However, if the computing infrastructure is not designed to reflect the needs of the development community—namely, its global research community—it is unlikely to be adopted in practice. Thus, to fully realize the potential of a compute cloud among BDT development communities, access to the infrastructure should extend beyond U.S. borders. At the same time, these efforts should ensure the cloud has the requisite monitoring capabilities to identify risk indicators or patterns of misuse and to impose access restrictions flexibly. By balancing oversight with accessibility, a thoughtfully designed compute cloud could enable transparency and collaboration while mitigating the risks of these emerging technologies.
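
One way to picture those monitoring capabilities is a periodic job that scans usage records for coarse risk indicators—such as unusual request volume or repeated access denials—and refers flagged accounts for human review. The log schema, indicators, and thresholds in the sketch below are invented for illustration; a real operating entity would define its own vetted signals and review processes.

```python
# Illustrative misuse-pattern monitor for a shared compute cloud. The record
# format, indicators, and thresholds are assumptions made for this sketch.
from collections import Counter
from dataclasses import dataclass
from typing import List

@dataclass
class UsageRecord:
    user_id: str
    tool: str
    denied: bool   # e.g., the request was rejected by access controls

VOLUME_THRESHOLD = 500   # requests per review window
DENIAL_THRESHOLD = 20    # rejected requests per review window

def flag_users(records: List[UsageRecord]) -> List[str]:
    """Return user ids whose usage in this window warrants human review."""
    volume = Counter(r.user_id for r in records)
    denials = Counter(r.user_id for r in records if r.denied)
    flagged = set()
    for user, n in volume.items():
        if n > VOLUME_THRESHOLD or denials[user] > DENIAL_THRESHOLD:
            flagged.add(user)
    return sorted(flagged)

if __name__ == "__main__":
    window = [UsageRecord("user-123", "example-bdt", denied=False)] * 10
    print(flag_users(window))  # [] -- nothing unusual in this toy window
```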

Recommendations

The U.S. government should establish a federally-funded, globally-accessible compute cloud through which developers could securely provide access to BDTs. In fact, the Biden Administration’s October 2023 “Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence” (the “AI EO”) lays groundwork by establishing a pilot program of a National AI Research Resource (NAIRR)—a shared research infrastructure providing AI researchers and students with expanded access to computational resources, high-quality data, educational tools, and user support. Moving forward, to increase the pilot program’s potential for adoption by BDT developers and users, relevant federal departments and agencies should take concerted action in the timelines circumscribed by the AI EO to address the practical requirements of BDT development communities: the simultaneous need to expand access outside U.S. borders while bolstering the capacity to monitor for misuse.

It is important to note that a federally-funded compute cloud has been years in the making. The National AI Initiative Act of 2020 directed the National Science Foundation (NSF), in consultation with the Office of Science and Technology Policy (OSTP), to establish a task force to create a roadmap for the NAIRR. In January 2023, the NAIRR Task Force released its final report, “Strengthening and Democratizing the U.S. Artificial Intelligence Innovation Ecosystem,” which presented a detailed implementation plan for establishing the NAIRR. The Biden Administration’s AI EO then directed the Director of NSF, in coordination with the heads of agencies deemed appropriate by the Director, to launch a pilot program “consistent with past recommendations of the NAIRR Task Force.”

However, the Task Force’s past recommendations are likely to fall short of the needs of BDT development communities (not to mention other AI development communities). In its report, the Task Force described NAIRR’s primary user groups as “U.S.-based AI researchers and students at U.S. academic institutions, non-profit organizations, Federal agencies or FFRDCs, or startups and small businesses awarded [Small Business Innovation Research] or [Small Business Technology Transfer] funding,” and its resource allocation process is oriented toward this user base. Separately, Stanford University’s Institute for Human-centered AI (HAI) and the National Security Commission on Artificial Intelligence (NSCAI) have proposed institutions, building upon or complementing NAIRR, that would support international research consortiums (a Multilateral AI Research Institute and an International Digital Democracy Initiative, respectively), but the NAIRR Task Force’s report—upon which the AI EO’s pilot program is based—does not substantively address this user base.

In launching the NAIRR pilot program under Sec. 5.2(a)(i), the NSF should put the access and security needs of international research consortiums front and center, conferring with heads of departments and agencies with relevant scope and expertise, such as the Department of State, the U.S. Agency for International Development (USAID), the Department of Education, the National Institutes of Health, and the Department of Energy. The NAIRR Operating Entity (as defined in the Task Force’s report) should investigate how funding, resource allocation, and cybersecurity could be adapted to accommodate researchers outside of U.S. borders. In implementing the NAIRR pilot program, the NSF should incorporate BDTs in their development of guidelines, standards, and best practices for AI safety and security, per Sec. 4.1, which could serve as standards with which NAIRR users should be required to comply. Furthermore, the NSF Regional Innovation Engine launched through Sec. 5.2(a)(ii) should consider focusing on international research collaborations, such as those in the realm of biological design.

Besides the NSF, which is charged with piloting NAIRR, relevant departments and agencies should take concerted action in implementing the AI EO to address issues of accessibility and security that are intertwined with international research collaborations. This includes but is not limited to:

  • In accordance with Sec. 5.2(a)(i), the departments and agencies listed above should be tasked with investigating the access and security needs of international research collaborations and include these in the reports they are required to submit to the NSF. This should be done in concert with the development of guidelines, standards, and best practices for AI safety and security required by Sec. 4.1.
  • In fulfilling the requirements of Sec. 5.2(c-d), the Under Secretary of Commerce for Intellectual Property, the Director of the United States Patent and Trademark Office, and the Secretary of Homeland Security should, in the reports and guidance on matters related to intellectual property that they are required to develop, clarify ambiguities and preemptively address challenges that might arise in cross-border data use agreements.
  • Under the terms of Sec. 5.2(h), the President’s Council of Advisors on Science and Technology should, in its development of “a report on the potential role of AI […] in research aimed at tackling major societal and global challenges,” focus on the nature of decentralized, international collaboration on AI systems used for biological design.
  • Pursuant to Sec. 11(a-d), the Secretary of State, the Assistant to the President for National Security Affairs, the Assistant to the President for Economic Policy, and the Director of OSTP should focus on AI used for biological design as a use case for expanding engagements with international allies and partners, and for establishing a robust international framework for managing the risks and harnessing the benefits of AI. Furthermore, the Secretary of Commerce should make this use case a key feature of the Department’s plan for global engagement in promoting and developing AI standards.

The AI EO provides a window of opportunity for the U.S. to take steps toward mitigating the risks posed by BDT misuse. In doing so, it will be necessary for regulatory agencies to proactively seek to understand and attend to the needs of BDT development communities, which will increase the likelihood that government-supported solutions, such as the NAIRR pilot program—and potentially future fully-fledged iterations enacted via Congress—are adopted by these communities. By making progress toward reducing BDT misuse risk while promoting safe, secure access to cutting-edge tools, the U.S. could affirm its role as a vanguard of responsible innovation in 21st-century science and medicine.

How Will AI Enable Automated Labs?

At present, few biological processes can be carried out using laboratory robotics, but AI will enable automated labs in several ways:

  1. More protocols will be automated, driven by the strong incentive created by the need for large datasets to train AI tools. Recent projects in this space include a collaboration between Ginkgo Bioworks and Google Cloud.
  2. More scientists will be able to use laboratory robotics, since language models will allow them to specify protocols without learning device-specific coding languages (a hypothetical example of such a pipeline is sketched below).
  3. New kinds of experiments will be enabled by advanced AI models that conduct independent experiments through autonomous science and self-driving labs.
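
To illustrate the second point, one hypothetical design is for the language model to emit a small, device-agnostic protocol representation that lab software validates before translating it into vendor-specific robot commands. The schema and field names in the sketch below are assumptions for illustration, not an existing cloud-lab API.

```python
# Hypothetical, device-agnostic protocol schema of the kind an LLM-assisted
# interface might emit before translation into vendor-specific robot code.
# The schema and example protocol are invented for illustration.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TransferStep:
    source_well: str
    dest_well: str
    volume_ul: float

@dataclass
class Protocol:
    name: str
    labware: List[str] = field(default_factory=list)
    steps: List[TransferStep] = field(default_factory=list)

    def validate(self) -> None:
        """Basic sanity checks a lab platform might run before execution."""
        for step in self.steps:
            if step.volume_ul <= 0:
                raise ValueError(f"Invalid volume in step {step}")

if __name__ == "__main__":
    # In a real pipeline, these fields would be filled from structured LLM output
    # describing a researcher's natural-language protocol, then reviewed by a human.
    protocol = Protocol(
        name="example-serial-dilution",
        labware=["96-well-plate", "reagent-reservoir"],
        steps=[TransferStep("reservoir-A1", "plate-A1", 100.0),
               TransferStep("plate-A1", "plate-A2", 50.0)],
    )
    protocol.validate()
    print(f"{protocol.name}: {len(protocol.steps)} steps validated")
```

In such a pipeline, human review and sign-off before execution would remain a key biosecurity control, consistent with the guidance on biosecurity practices for automated laboratories discussed earlier.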
