March 31, 2025

Translate Fortran to C++ with AI and RAG

Scientists are using artificial intelligence and large language models to rewrite old code in modern languages.

  • Kyle Dickman, Science Writer
Credit to: Jacob Hassett


Across the Lab, much of the work on AI focuses on developing new models to interpret scientific data. But Dan O’Malley, a coder in Earth and Environmental Sciences, is harnessing existing large language models (LLMs) to translate and modernize useful codes. Specifically, he and his 20-person team aim to demonstrate that AI can translate some of the tens of millions of lines of Lab code written in Fortran, an older programming language, into C++, a language that runs better on modern computers. “Being dependent on Fortran alone is risky these days,” says O’Malley. “Nobody wants to throw away the code their teams have spent years or decades developing, but translating the code into C++ is time-consuming and not the most exciting work.”

That’s where AI comes in. O’Malley is taking open-source LLMs, running them on Lab computers, and pairing the models with a technique called retrieval-augmented generation (RAG), in which generative models are enhanced with data from external sources. The idea is to teach the models to translate from Fortran to C++ using what’s known as few-shot learning. LLMs learn through exposure to patterns. O’Malley takes pieces of Fortran code that a human coder has carefully and skillfully translated into C++ and feeds both versions to the LLM. The model observes the logic underlying the translation: when, where, and possibly why a human translator opted for one approach over another, for example. Because the model is pre-trained on a huge dataset that teaches it many things, often just a single example is needed to dramatically improve the LLM’s ability to pick up on code translation patterns.
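To make the idea concrete, here is a minimal Python sketch of how retrieval and few-shot prompting can fit together. It is not O’Malley’s pipeline, which the article does not describe in detail. The example store, the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), and the prompt wording are all hypothetical, and a simple word-overlap score stands in for whatever retrieval method a real system would use. The call to the LLM itself is left out; the sketch only assembles the prompt.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical store of Fortran snippets paired with human-written C++ translations.
EXAMPLES = [
    {
        "fortran": "do i = 1, n\n  y(i) = a * x(i) + y(i)\nend do",
        "cpp": "for (int i = 0; i < n; ++i) {\n  y[i] = a * x[i] + y[i];\n}",
    },
    {
        "fortran": "if (x > 0.0) then\n  s = sqrt(x)\nelse\n  s = 0.0\nend if",
        "cpp": "if (x > 0.0) {\n  s = std::sqrt(x);\n} else {\n  s = 0.0;\n}",
    },
    {
        "fortran": "open(unit=10, file='out.txt')\nwrite(10, *) total\nclose(10)",
        "cpp": 'std::ofstream out("out.txt");\nout << total << "\\n";\nout.close();',
    },
]


def tokenize(code):
    """Count lowercase word tokens; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z_]+", code.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, examples, k=1):
    """Return the k stored pairs whose Fortran side best matches the query."""
    q = tokenize(query)
    ranked = sorted(
        examples,
        key=lambda ex: cosine(q, tokenize(ex["fortran"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, examples, k=1):
    """Assemble a few-shot prompt: retrieved translation pairs, then the new code."""
    parts = ["Translate Fortran to C++, following the style of the examples."]
    for ex in retrieve(query, examples, k):
        parts.append("Fortran:\n" + ex["fortran"] + "\nC++:\n" + ex["cpp"])
    parts.append("Fortran:\n" + query + "\nC++:")
    return "\n\n".join(parts)


# New Fortran code to translate (also hypothetical).
query = "do i = 1, n\n  z(i) = b * x(i) + z(i)\nend do"
prompt = build_prompt(query, EXAMPLES, k=1)
print(prompt)
```

With `k=1`, the prompt carries only the stored loop translation, which is the closest match to the new loop. That mirrors the article’s point that a single well-chosen example can be enough. Raising `k` would add more human translations to the prompt, at the cost of a longer input.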

O’Malley began the project six months ago by exposing the LLMs to small datasets of between 1,000 and 1,200 lines of code. “When I feed it good translations, it can pick up the style of different coders, and then replicate the style of each coder in its own translation,” O’Malley says. If the team can establish a reliable methodology, the time-saving technique could be shared with coders and scientists across the Lab within the next few years.

“There’s almost no field of science that isn’t being changed by AI, and all our missions at the Laboratory are using it in some capacity.”
Thom Mason, Director of Los Alamos National Laboratory (Santa Fe New Mexican, 2024)

People also ask

  • What is retrieval-augmented generation? Retrieval-augmented generation, or RAG, is a little like a customizable way to train (you can think of that as teaching) powerful AI models. At the most basic level, large language models, a fancy term for programs like ChatGPT or BERT, generate answers by drawing on enormous general datasets, essentially the internet. RAG, though, lets programmers focus the models. So instead of generating answers from their entire training dataset alone, the models key in on the datasets provided by programmers, allowing them to “learn” more deeply and provide better answers to specialized questions.
  • How good is AI translation? The rule of thumb is that AI translation is cheap and quick, but without additional human tinkering, the models can miss cultural references, idioms, or metaphors. For translations between computer programming languages, that rule of thumb holds, and in both applications, AI is a powerful tool in the hands of a skilled translator.
