Information Technology

NASA-born Software Keeps Cloud Traffic Moving
Process scheduler software originally developed for NASA helps ensure optimal use of cloud computing resources

Originally published 01/24/2022

From your streaming TV queue to the cloud storage where you keep photos, servers now play an important role in our lives. As the world continues to need more from these specialized systems, the jobs we task them with have to be scheduled to prevent bottlenecks. A system originally developed for this purpose at NASA is now available everywhere in enterprise computing.

The problem arose in the 1990s, when engineers and researchers needing to run their programs on the powerful shared systems at NASA’s Ames Research Center in Silicon Valley, California, began to experience traffic jams. There was only so much space available in the memory and processors of their number-crunching machines, and overtaxing the system would lead to longer wait times for everyone.

The programmers, led by on-site software development contractor MRJ Technology Solutions (later acquired by Veridian), decided the best way to clear the jams was to build the computer equivalent of a crossing guard.

The Portable Batch System (PBS), first built in 1991, acts like a series of stoplights for supercomputers. Some computer tasks move forward, while others wait their turn. Tasks that take more resources are scheduled to run by themselves, while smaller tasks can run concurrently, keeping bottlenecks out of the system.
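
To make the stoplight idea concrete, here is a minimal sketch of that kind of scheduling decision. It is not PBS code or its actual algorithm; the `Job` class, the `plan_rounds` function, the job names, and the 8-core machine are all hypothetical, chosen only to illustrate how large tasks end up running alone while small ones share the machine.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cores: int  # hypothetical resource request, in processor cores


def plan_rounds(jobs, total_cores):
    """Group queued jobs into rounds, in arrival order.

    Jobs in the same round run concurrently. A new round starts
    whenever the next job would not fit in the cores still free,
    so a job that needs the whole machine runs by itself.
    """
    rounds = []
    current = []
    used = 0
    for job in jobs:
        if job.cores > total_cores:
            raise ValueError(f"{job.name} needs more cores than the machine has")
        if used + job.cores > total_cores:
            # Red light: close the current round and start a new one.
            rounds.append(current)
            current, used = [], 0
        current.append(job.name)
        used += job.cores
    if current:
        rounds.append(current)
    return rounds


# Illustrative queue for an assumed 8-core machine.
queue = [
    Job("sim", 8),
    Job("plot", 2),
    Job("parse", 3),
    Job("render", 8),
    Job("stats", 1),
]
print(plan_rounds(queue, total_cores=8))
```

In this toy queue, the two 8-core jobs each get a round to themselves, while the smaller "plot" and "parse" jobs share one. A production scheduler weighs far more than core counts, such as memory, priorities, and time limits, so treat this only as an illustration of the principle.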

In 1998, after MRJ finished the original software for NASA, the Ames Commercial Technology Office, which would later become the Technology Transfer Office, authorized the company to distribute and support the software commercially. The company split the system into multiple releases, including an open-source version and an actively developed version called PBS Pro, with several of the NASA project developers remaining on the team.

“The original version didn’t even have a graphical user interface,” said Bill Nitzberg, who led PBS development at Veridian. “We built that on top as part of PBS Pro.”

In 2003, Altair Engineering of Troy, Michigan, acquired active PBS development from Veridian. Since the early 2000s, the use cases of batch computing have continued to grow. In turn, PBS grew from a single piece of software into an entire suite, called PBS Works. Each tool in the suite performs a different function but still manages jobs in the queue. Hundreds of organizations use PBS Works, from large manufacturing firms like Ford and 3M, to scientific organizations like the Argonne National Laboratory.

With the advent of distributed computing providers like Amazon Web Services or Microsoft Azure, scheduling software has become a necessity, said Nitzberg, now the chief technology officer for PBS Works at Altair.

“Cloud services didn’t exist when we first developed PBS, and now everyone uses them,” he said. “People expect this software to be there now.”

Abstract
When NASA Ames needed to keep its computers free of traffic jams, it developed a tool that has seen decades of continued development from Altair Engineering of Troy, Michigan, helping companies and researchers make the most of cloud computing.

Screenshot from Altair Access, a newer part of Altair’s solution set for high-performance computing, which makes it easy for end users to manage resources and workloads through PBS Pro. Credit: Altair

A bank of servers representing cloud computing environments