Los Alamos, OpenAI join to boost frontier model safety

2024-07-10 — Credit to: Image created in DALL-E by Nick Njegomir.

Researchers at Los Alamos National Laboratory are working with OpenAI on an evaluation study to bolster artificial intelligence safety. The upcoming evaluation will be the first of its kind and contribute to state-of-the-art research on AI biosecurity evaluations.

"The potential upside to growing AI capabilities is endless,” said Erick LeBrun, research scientist at Los Alamos. “However, measuring and understanding any potential dangers or misuse of advanced AI related to biological threats remain largely unexplored. This work with OpenAI is an important step towards establishing a framework for evaluating current and future models, ensuring the responsible development and deployment of AI technologies.”

AI-enabled biological threats could pose a significant risk, but existing work has not assessed how multimodal frontier models could lower the barrier to entry for non-experts to create a biological threat. The team’s study will build on previous work and follow OpenAI’s Preparedness Framework, which outlines an approach to tracking, evaluating, forecasting and protecting against emerging biological risks.

In previous evaluations, the research team found that GPT-4 provided a mild uplift in access to information that could lead to the creation of biological threats. However, these experiments focused on human performance in written tasks (rather than biological benchwork), and model inputs and outputs were limited to text, which excluded vision and voice data.

Using proxy tasks and materials, the upcoming evaluation will be the first experiment to test multimodal frontier models in a lab setting by assessing experts’ abilities to perform and troubleshoot a safe protocol consisting of standard laboratory experimental tasks.

By examining the uplift in task completion and accuracy enabled by GPT-4o (the model behind the latest version of ChatGPT), the team aims to quantify and assess how frontier models can assist in real-world biological tasks.

“As a private company dedicated to serving the public interest, we’re thrilled to announce a first-of-its-kind partnership with Los Alamos National Laboratory to study bioscience capabilities,” said Mira Murati, OpenAI’s Chief Technology Officer. “This partnership marks a natural progression in our mission, advancing scientific research, while also understanding and mitigating risks.”

These new evaluations will support tasks under the recent White House Executive Order on the Safe, Secure and Trustworthy Development and Use of Artificial Intelligence, in which the Department of Energy national laboratories have been entrusted to help evaluate the capabilities of frontier AI models. DOE has the unique data, leadership computing and workforce to tackle these challenges, and is well suited to partner with industry on these efforts.

Additionally, Los Alamos established the AI Risks and Threat Assessments Group (AIRTAG) to develop strategies for understanding the benefits and mitigating the risks of AI tools, and to help promote their secure deployment.

“This type of cooperation is a great example of the type of work that AIRTAG is trying to foster to help understand AI risk, and ultimately making AI technology safer and more secure,” said Nick Generous, deputy group leader for Information Systems and Modeling at Los Alamos.

LA-UR-24-26293