March 31, 2025

Translate Fortran to C++ with AI and RAG

Scientists are using artificial intelligence and large language models to rewrite old code in modern languages.

  • Kyle Dickman, Science Writer
Credit: Jacob Hassett


Across the Lab, much of the work being done on AI is focused on developing new models to interpret scientific data. But Dan O’Malley, a coder in Earth and Environmental Sciences, is harnessing the power of existing large language models (LLMs) to translate and modernize useful codes. Specifically, he and his 20-person team aim to demonstrate that AI is capable of translating some of the tens of millions of lines of Lab code written in Fortran, an older programming language, into C++, a language better suited to modern computers. “Being dependent on Fortran alone is risky these days,” says O’Malley. “Nobody wants to throw away the code their teams have spent years or decades developing, but translating the code into C++ is time-consuming and not the most exciting work.”

That’s where AI comes in. O’Malley is taking open-source LLMs, running them on Lab computers, and plying the models with a technique called retrieval-augmented generation (RAG), in which generative models are enhanced with data from external sources. The idea is to train the models to translate from Fortran to C++ using what’s known as few-shot learning. LLMs learn through exposure to patterns. O’Malley takes pieces of Fortran code that have been carefully and skillfully translated by a human coder into C++ and feeds both to the LLM. The model observes the underlying logic of the translation: when, where, and possibly why a human translator opted for one approach over another, for example. Because the model is already pre-trained on a huge general dataset, often just a single example is needed to dramatically improve the LLM’s ability to pick up on code translation patterns.
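The few-shot setup described above can be sketched as a prompt that pairs one human-curated Fortran-to-C++ translation with the new code to be translated. The snippet below is an illustrative sketch, not the Lab’s actual pipeline; the example code pair and the prompt format are assumptions chosen for clarity.

```python
# Illustrative one-shot prompt for Fortran-to-C++ translation.
# The curated example pair and prompt wording are hypothetical; a real
# system would send this prompt to a locally hosted open-source LLM.

EXAMPLE_FORTRAN = """\
subroutine axpy(n, a, x, y)
  integer, intent(in) :: n
  real(8), intent(in) :: a, x(n)
  real(8), intent(inout) :: y(n)
  integer :: i
  do i = 1, n
     y(i) = a * x(i) + y(i)
  end do
end subroutine axpy
"""

EXAMPLE_CPP = """\
// Human translation: explicit array arguments become std::vector.
#include <vector>

void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}
"""

def build_one_shot_prompt(new_fortran: str) -> str:
    """Pair one curated translation with the Fortran code to translate."""
    return (
        "Translate the Fortran code to modern C++, "
        "following the style of the example.\n\n"
        "Fortran:\n" + EXAMPLE_FORTRAN + "\nC++:\n" + EXAMPLE_CPP +
        "\nFortran:\n" + new_fortran + "\nC++:\n"
    )

prompt = build_one_shot_prompt(
    "function square(x)\n  real(8) :: x, square\n  square = x * x\n"
    "end function\n"
)
print(prompt)
```

Even this single worked example gives the model a concrete pattern to imitate, which is the essence of few-shot learning.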

O’Malley began the project six months ago by exposing the LLMs to small datasets of between 1000 and 1200 lines of code. “When I feed it good translations, it can pick up the style of different coders, and then replicate the style of each coder in its own translation,” O’Malley says. If the team can establish a reliable methodology within the next few years, the timesaving technique could be shared with coders and scientists across the Lab.

“There’s almost no field of science that isn’t being changed by AI, and all our missions at the Laboratory are using it in some capacity.”
—Thom Mason, Director of Los Alamos National Laboratory, Santa Fe New Mexican, 2024

People also ask

  • What is retrieval-augmented generation? Retrieval-augmented generation, or RAG, is a customizable way to train (you can think of it as teaching) powerful AI models. At the most basic level, large language models, a fancy term for the programs behind tools like ChatGPT, generate answers by referencing enormous general datasets, essentially the internet. RAG, though, lets programmers focus the models. Instead of generating answers from its entire training dataset (e.g., the internet), the model keys in on the datasets provided by programmers, allowing it to “learn” more deeply and provide better answers to specialized questions.
  • How good is AI translation? The rule of thumb is that AI translation is cheap and quick, but without additional human tinkering, the models can miss cultural references, idioms, or metaphors. For translation between computer programming languages, that rule of thumb holds, but in both applications, AI is a powerful tool in the hands of a skilled translator.


