NASA programmer recalls debugging Lisp in deep space

Debugging software that’s running 100 million miles away is something most of us will never have to do, thankfully. But former NASA software engineer Ron Garret shared his experience diagnosing faulty Lisp software during a Deep Space spacecraft mission in a recent episode of Adam Gordon Bell’s Corecursive podcast.

Garret shared a remarkable story about debugging in deep space – as well as some memories from the early days of programming. Along the way, Garret offered a refreshing perspective on what has changed – and what hasn’t changed – in the world of programming. Garret also explored the unique challenges of writing code for a spacecraft.

And he remembered his starring role in a truly glorious moment in Lisp history.

Powerful User

Garret worked as a research scientist at NASA’s Jet Propulsion Laboratory from 1988 to 2000, and again from 2001 to 2004. His specialty: autonomous mobile robots. He helped pioneer what is today the de facto standard autonomous mobile robot control architecture.

Garret’s team worked on prototypes for the Mars Sojourner robotic rover.

On the podcast, Garret described the very limited programming options in 1988 – a world before Java, Python, JavaScript, and even C++. “There’s Pascal and C and Basic and machine code. And that’s about it in terms of popular languages. Doing anything in one of these languages is really very difficult.” The code for most spacecraft, he said, ended up being written in assembly language.

But then there was Lisp – a language based on the clean abstraction of problems into lists and functions. And while C programmers worry about things like dangling pointers, Lisp has automatic memory management. “It’s so much faster and easier to get things done when the language you’re using provides you with some of these high-level abstractions,” Garret recalls on the podcast. “And in a world where the only language that has that is Lisp, knowing Lisp really is like a superpower.”

With Lisp, “every problem becomes a compiler problem”

“It just blew everything else out of the water at the time.”

At the time, Lisp wasn’t really used at NASA.

“There was quite a bit of prejudice against Lisp because it was weird and unfamiliar, and it had this weird garbage-collecting technology that you never knew when it would stop your process in its tracks,” Garret recalls.

Garret’s group found it useful for memory-limited hardware. Lisp could be used to fashion a custom language specifically for the problem at hand and then compile it for the robot hardware. Or, as Bell puts it, “every problem becomes a compiler problem”. Garret’s team painstakingly wrote and tested their code on a robot simulator (on a Macintosh computer) before installing it in the real rover and performing a time-consuming test in the Arroyo.
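To make the "compiler problem" idea concrete, here is a minimal sketch in Python (not the team's Lisp code, which the podcast does not show). It assumes a hypothetical mini-language in which nested tuples stand in for Lisp expressions, and it compiles them into a flat list of instructions for an equally hypothetical stack machine. The names `compile_expr`, `run`, and the opcodes are invented for illustration.

```python
# Hypothetical mini-language: nested tuples stand in for Lisp s-expressions.
# The opcode names below are invented for this sketch.
OPS = {"+": "ADD", "-": "SUB", "*": "MUL"}


def compile_expr(expr):
    """Compile a nested-tuple expression into a flat list of stack instructions."""
    if isinstance(expr, (int, float)):
        return [("PUSH", expr)]      # literal number
    if isinstance(expr, str):
        return [("LOAD", expr)]      # named variable, looked up at run time
    op, left, right = expr           # binary operation: (op, left, right)
    return compile_expr(left) + compile_expr(right) + [(OPS[op], None)]


def run(program, env):
    """Execute a compiled instruction list on a tiny stack machine."""
    stack = []
    for opcode, arg in program:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "LOAD":
            stack.append(env[arg])
        elif opcode in ("ADD", "SUB", "MUL"):
            b = stack.pop()
            a = stack.pop()
            if opcode == "ADD":
                stack.append(a + b)
            elif opcode == "SUB":
                stack.append(a - b)
            else:
                stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# (+ (* speed 2) 1) in Lisp-like notation
program = compile_expr(("+", ("*", "speed", 2), 1))
```

Calling `run(program, {"speed": 3})` evaluates the compiled program and returns 7. The point of the design is that the compiled form is small and simple enough to run on constrained hardware, while the source language can be shaped to fit the problem.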

Despite the Lisp code base the group had developed, the Sojourner rover that reached Mars ran on C code.

Yet in 1998, a new NASA director launched NASA’s New Millennium Program – a pilot program to demonstrate different (and cheaper) technologies through a number of deep space exploration missions.

This meant their Lisp code had a second life, Garret recalls on the podcast. The autonomy technology that the team had started developing for the rovers was repurposed. Its new mission? Flight control.

Garret’s team worked on innovative decision-making software – using a custom language written in Lisp specifically designed to avoid the possibility of a dreaded “race condition” (where the outcome depends on the timing of two concurrently executing threads competing for the same memory). “It was tested for days and days and days” – on exactly the same hardware that was going into space. “So we were very confident that it was going to work.

“And it didn’t work…”

Deep Space Failure

Garret explains that during their three days of flight control, “there was a time when it was supposed to do something and that time passed and it didn’t do what it was supposed to do. And the alarm bells rang…”

Code that had supposedly been proven safe now appeared to be frozen 150 million miles from home.

It was a tense situation. “We had no idea what was going on…. And everything we did when we decided to do something, we did it and then we sat and waited an hour for the result.” After a team in a conference room reached a consensus, their commands “went through a review process consisting of a number of levels of management, all of whom had to approve it.”

Once approvals were obtained, the commands were transmitted over a dedicated wired network to one of the Deep Space Network’s 70-meter-wide antennas, which sent them into space at the speed of light.

They first asked for a backtrace – a common debugging operation that, as Garret described it, produces a list of all currently active processes and “what they’re waiting for.”

“It was actually almost immediately obvious what was wrong because there was this process waiting for something that should have already happened…”

“The problem was that there was, in fact, a race condition. Which was supposed to be impossible.” Unfortunately, one of Garret’s coders had called a lower-level Lisp function, which had inadvertently created “an end run around the safety guarantees” of their carefully tailored language. (Garret blames himself for not explaining this more clearly to the coder.)

The team decided to “manually” trigger the event, which allowed the software to restart.
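The podcast does not spell out the exact bug, so the following is only a generic illustration in Python of how a process can end up "waiting for something that should have already happened," and how a manual trigger can release it. It assumes a hypothetical edge-triggered event (`EdgeEvent`, invented for this sketch) whose signal is lost if no one is waiting at the moment it fires. The bad ordering is forced deliberately here so that the demo is repeatable; in a real race it would depend on timing.

```python
import threading
import time


class EdgeEvent:
    """Hypothetical edge-triggered event: a signal wakes only current waiters."""

    def __init__(self):
        self._cond = threading.Condition()

    def signal(self):
        with self._cond:
            self._cond.notify_all()   # lost if nobody is waiting right now

    def wait(self, timeout):
        with self._cond:
            # True if woken by a signal, False if the timeout expired.
            return self._cond.wait(timeout)


def run_demo():
    event = EdgeEvent()

    # Bad ordering: the event fires before the waiter starts waiting,
    # so the waiter never sees it and would block forever without a timeout.
    event.signal()
    woke_first = event.wait(timeout=0.2)

    # Recovery: a waiter is stuck, and the event is re-sent "manually".
    result = []
    waiter = threading.Thread(target=lambda: result.append(event.wait(timeout=5.0)))
    waiter.start()
    while waiter.is_alive():          # keep re-sending until the waiter is released
        event.signal()
        time.sleep(0.01)
    waiter.join()
    return woke_first, result[0]
```

Here `run_demo()` returns `(False, True)`: the first wait misses the early signal and times out, while the second is released by the manual re-send. A level-triggered flag that stays set once signalled (such as Python's `threading.Event`) would avoid the lost signal in this toy case.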

“We didn’t lose the spacecraft and we achieved all mission objectives – so technically it was a success,” Garret said on the podcast. “But the development process was so painful and fraught with pitfalls – and again there was politics. So despite the fact that we managed to get it to work, the autonomy project was subsequently canceled and it never flew again.”

A 2002 essay on Garret’s personal website claims that “Lisp’s disappearance at JPL is a tragedy. The language is particularly well suited to the type of software development that is often done here: unique, highly dynamic applications that must be developed under extremely tight budgets and schedules.”

But Lisp was passed over for C++, then Java, with the rationale given as an attempt to follow “best practices.” Garret’s response? “We confuse best practices with standard practice. The two are not the same.” And even beyond that, what is ultimately best is not a fixed standard, but should depend on the particulars of the project at hand.

But in a discussion on Hacker News, one commenter identified himself as a NASA engineer who had been the payload software engineer for a 2009 mission exploring the moon’s south pole – and said he had used Lisp to write his own custom language for instrument control sequences (and to simulate the computer). “Lisp’s simple and flexible syntax and macros made it easy to express command and timing patterns for this.”

So he left Garret with a reassuring thought: “I think Lisp is still being used in various nooks and crannies at NASA.”


Feature image by NASA/JPL, public domain.
