Stephen Van Tran

On December 8, 2025, a robot the size of a small car sat on the rim of an ancient lake bed on Mars, waiting for instructions. For twenty-eight years, every drive command sent to every NASA rover had been authored by human planners at the Jet Propulsion Laboratory in Pasadena — engineers who spent hours poring over orbital imagery, calculating slopes, plotting waypoints by hand, and running simulations before transmitting a single instruction across the 140-million-mile void. On that December morning, for the first time in the history of planetary exploration, the route came from an AI. Anthropic’s Claude vision-language model analyzed high-resolution imagery from the HiRISE camera aboard NASA’s Mars Reconnaissance Orbiter, generated a continuous path with waypoints spaced at ten-meter intervals, wrote the commands in Rover Markup Language — the bespoke XML-based programming language that Mars rovers have used since the Spirit and Opportunity era — and submitted the plan for verification. The JPL team processed those commands through the rover’s digital twin, modeling over 500,000 telemetry variables to confirm the instructions were safe. Then they pressed send.

Perseverance drove 210 meters on December 8 and another 246 meters on December 10, covering a combined 456 meters of Martian terrain that had been planned entirely by a generative AI system. The drives were flawless. JPL engineers required only minor adjustments to Claude’s route plans, and the agency estimates that the AI cut route-planning time roughly in half. In absolute terms, the distances were modest — Perseverance’s AutoNav system has been driving the rover autonomously for years, completing 90 percent of its 30-plus kilometers of total travel without step-by-step human commands. But there is a categorical difference between a rover following obstacle-avoidance algorithms in real time and an AI system performing the complex cognitive task that human mission planners have monopolized since 1997: analyzing terrain from orbit, designing a multi-waypoint route, writing the rover’s commands, and critiquing its own work before submission. That is not automation. That is agency.

The implications ripple far beyond one crater on one planet. If a vision-language model can replace hours of expert human labor in one of the most safety-critical planning environments on Earth — or off it — the question is no longer whether AI agents will transform mission operations for NASA and every other space agency. The question is how fast, and what it means for the $469 billion global space economy that is about to discover it has a new kind of copilot.

Four hundred fifty-six meters that rewrote a twenty-eight-year playbook

The story of how Claude ended up planning drives on Mars begins not in the AI labs of San Francisco but in the windowless Rover Operations Center at JPL, where a team of roughly fifty engineers and scientists manages Perseverance’s daily activities. Every sol — every Martian day — the team faces the same constraint: a one-way signal delay between Earth and Mars that can exceed twenty minutes, depending on orbital geometry, means that real-time joystick control is impossible. Instead, the team plans each drive in advance, uploading a batch of commands that the rover executes autonomously while its operators sleep. The planning process is meticulous and slow. Engineers study overhead imagery from NASA’s Mars Reconnaissance Orbiter, cross-reference it with terrain-slope data from digital elevation models, identify hazards, calculate safe traversal paths, and hand-code waypoints in Rover Markup Language. A single drive plan can consume the better part of a working day.
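The signal-delay constraint is easy to make concrete with a light-time calculation. The sketch below is illustrative: the closest-approach and near-conjunction distances are representative round figures, and the actual Earth-Mars separation changes continuously.

```python
# One-way light-time between Earth and Mars at a few representative
# separations. The delay ranges from roughly 3 to over 22 minutes,
# which is why drives are planned in advance rather than joysticked.
SPEED_OF_LIGHT_KM_S = 299_792.458
KM_PER_MILE = 1.609344

def one_way_delay_minutes(distance_km: float) -> float:
    """Signal travel time in minutes for a given Earth-Mars distance."""
    return distance_km / SPEED_OF_LIGHT_KM_S / 60

# Closest approach, the article's quoted distance, and near conjunction.
for label, miles in [("closest approach", 34e6),
                     ("article's figure", 140e6),
                     ("near conjunction", 250e6)]:
    km = miles * KM_PER_MILE
    print(f"{label}: {one_way_delay_minutes(km):.1f} min one-way")
```

At 140 million miles the one-way delay works out to roughly 12.5 minutes; near conjunction it stretches past 22 minutes each way.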

The bottleneck is not incompetence — it is cognitive load. The engineers are among the best in the world at what they do. But the task of translating two-dimensional orbital imagery into three-dimensional route plans while simultaneously accounting for rock distributions, slope angles, wheel-soil interaction, power constraints, and communication windows is precisely the kind of multivariate spatial reasoning that large vision-language models have become remarkably good at. JPL’s leadership recognized this early. In collaboration with Anthropic, the team designed a demonstration in which Claude would receive the same imagery and data products that human planners use — HiRISE orbital photographs and digital elevation models of Jezero crater’s rim — and produce a complete drive plan, waypoints included, formatted in the rover’s native command language.

The results exceeded expectations on both speed and accuracy. Claude analyzed the orbital imagery, identified critical terrain features including rock fields and slope transitions, generated a continuous path with waypoints spaced at roughly ten-meter intervals, and then — in a step that distinguishes generative AI from traditional pathfinding algorithms — critiqued its own work, iteratively refining the waypoints for safety before converting the plan into Rover Markup Language. The entire process was then validated through JPL’s standard simulation pipeline, the same digital twin that checks every human-authored drive plan by modeling the rover’s physical response across more than half a million telemetry variables. The simulation confirmed that Claude’s commands were fully compatible with the rover’s flight software. On December 8, Sol 1707, Perseverance executed the first AI-planned drive on another world, covering 689 feet (210 meters). Two days later, on Sol 1709, it drove another 807 feet (246 meters).
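The generate-critique-refine loop described above can be sketched schematically. Everything here is a hypothetical illustration: the waypoint class, the function names, and the 15-degree slope limit are assumptions made for the example, not JPL flight rules or Anthropic's actual pipeline.

```python
# Schematic of a plan-critique-refine loop for drive waypoints.
# All names and thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Waypoint:
    x_m: float            # easting, meters from the rover's current position
    y_m: float            # northing, meters
    max_slope_deg: float  # steepest slope sampled near this waypoint

def critique(waypoints: list[Waypoint], slope_limit_deg: float = 15.0) -> list[int]:
    """Return indices of waypoints that violate the (assumed) slope limit."""
    return [i for i, wp in enumerate(waypoints) if wp.max_slope_deg > slope_limit_deg]

def refine(waypoints: list[Waypoint]) -> list[Waypoint]:
    """Stand-in for re-routing flagged waypoints: halve their slope exposure
    to represent picking a gentler line through the terrain."""
    flagged = set(critique(waypoints))
    return [Waypoint(wp.x_m, wp.y_m, wp.max_slope_deg / 2) if i in flagged else wp
            for i, wp in enumerate(waypoints)]

# Iterate until the plan passes its own critique; only then would it be
# serialized to Rover Markup Language and handed to the digital twin.
plan = [Waypoint(10.0 * i, 2.0 * i, s) for i, s in enumerate([4.0, 8.0, 22.0, 6.0])]
while critique(plan):
    plan = refine(plan)
print("waypoints pass self-critique:", critique(plan) == [])
```

The point of the sketch is the control flow: the planner checks its own output against explicit constraints before any downstream verification sees it, which is the step that distinguishes this workflow from a one-shot pathfinder.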

The 50-percent reduction in planning time matters operationally because every hour saved in the planning cycle is an hour that the rover can spend doing science. Perseverance is currently exploring the rim of Jezero crater, a region of extraordinary geological interest where, in January 2026, an international team led by Imperial College London published a landmark study confirming that the crater once had a wave-lapped shoreline — evidence that Mars harbored a warm, humid climate and surface water far longer than previously believed. Since reaching the rim in December 2024, the rover has cored five rocks and sealed three sample tubes, bringing its total collected samples to 24 of a targeted 30. Every additional meter of traverse efficiency matters because those final samples could contain the evidence that answers the question humanity has asked for centuries: was there ever life on Mars?

One back-of-the-envelope calculation illustrates the compounding value. If AI-assisted planning saves three hours per sol across the roughly 200 driving sols remaining in Perseverance’s extended mission, and each saved hour translates to an average of 15 additional meters of traverse based on the rover’s historical drive-rate data, then Claude-class planning tools could add approximately 9 additional kilometers to the mission’s total traverse distance — enough to reach geological targets that would otherwise fall outside the mission’s operational envelope. For a mission whose total cost exceeds $2.7 billion, squeezing an extra 30 percent of traverse capability out of the existing hardware through smarter planning is the highest-leverage efficiency gain available.
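For readers who want to check the arithmetic, the estimate reduces to a three-term product. The inputs below are the assumptions stated in this paragraph, not JPL figures:

```python
# Traverse-gain estimate using the article's stated assumptions.
hours_saved_per_sol = 3        # planning time saved per driving sol
driving_sols_remaining = 200   # driving sols left in the extended mission
meters_per_saved_hour = 15     # extra traverse per saved hour (historical rate)

extra_meters = hours_saved_per_sol * driving_sols_remaining * meters_per_saved_hour
extra_km = extra_meters / 1000
print(f"Estimated additional traverse: {extra_km:.0f} km")  # → 9 km

# Relative to the roughly 30 km driven so far:
print(f"Relative gain vs 30 km odometer: {extra_meters / 30_000:.0%}")  # → 30%
```

The 30 percent figure in the text is simply that 9-kilometer gain measured against the rover's 30-plus-kilometer odometer to date.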

From obstacle avoidance to orbital reasoning: the autonomy stack gets a brain

Perseverance already runs one of the most sophisticated autonomous navigation systems ever deployed on another world. AutoNav, the rover’s onboard self-driving software, has been responsible for roughly 90 percent of the rover’s total traverse distance — a stunning leap from Curiosity, its predecessor, which completed only about 6 percent of its drives autonomously. AutoNav processes stereo camera imagery in real time, builds three-dimensional terrain maps, identifies obstacles, and plots safe paths around them. It holds the planetary rover record for the greatest distance driven without human review: 699.9 meters in a single command cycle. The AEGIS system — Autonomous Exploration for Gathering Increased Science — adds another layer by autonomously selecting science targets from wide-angle imagery and directing the SuperCam instrument to analyze interesting rocks without waiting for ground-team instructions.

What Claude’s demonstration adds is not a replacement for these systems but a new layer on top of them — one that operates at a fundamentally different level of abstraction. AutoNav is reactive: it sees terrain in front of it and navigates around obstacles in real time. AEGIS is opportunistic: it identifies science targets within the rover’s immediate field of view. Claude’s contribution is strategic: it performs the high-level cognitive task of designing an entire multi-hundred-meter traverse plan from orbital data, the same task that has required a team of experienced human planners since the Mars Pathfinder mission in 1997. The analogy in terrestrial driving is the difference between adaptive cruise control, which reacts to the car ahead, and a human driver who studies a map, picks a highway, and plans stops along the route. AutoNav is the cruise control. Claude is the driver.

This layered autonomy architecture has profound implications for future missions. The Mars Sample Return mission, the most ambitious robotic retrieval ever attempted, requires a fetch rover to locate and collect the sample tubes that Perseverance has cached across Jezero crater. That fetch rover will need to traverse significant distances with minimal human oversight, and its planning team will face the same signal-delay constraints as Perseverance’s. If vision-language models can reliably plan drives from orbital imagery, the fetch rover could operate with dramatically higher throughput, reaching cached samples faster and reducing the mission’s total surface operations timeline. NASA is evaluating two architecture options for the return mission — a JPL-led sky-crane approach estimated at $6.6 to $7.7 billion and a commercial heavy-lift option using SpaceX Starship at $5.8 to $7.1 billion — with a decision expected in mid-2026. Under either architecture, AI-assisted traverse planning could shave months off surface operations and potentially billions off total mission cost.

The outer solar system amplifies the case further. Europa Clipper, NASA’s flagship mission to Jupiter’s ice-covered moon, will contend with one-way signal delays exceeding 45 minutes. Future surface missions to Europa or Enceladus — moons where liquid water exists beneath kilometers of ice and where the conditions for life may be present today — would face round-trip communication delays approaching two hours. At those timescales, waiting for human planners to author each drive command becomes operationally untenable. The autonomous planning capability that Claude demonstrated on Mars in December is not a convenience for distant-moon missions. It is a prerequisite.

The economic calculus also extends to the commercial space sector. Companies like Astrobotic, Intuitive Machines, and ispace are building lunar rovers for NASA’s Commercial Lunar Payload Services program and for private customers. These rovers will operate in permanently shadowed craters, navigate boulder fields, and explore lava tubes — environments where orbital planning tools could dramatically reduce the ground-team headcount and accelerate mission timelines. If a vision-language model can plan a safe traverse through Jezero crater using overhead imagery and a digital elevation model, there is no technical reason it cannot do the same for Shackleton crater on the Moon, where the terrain data is arguably better and the stakes for commercial operators are measured in revenue per kilogram delivered.

The ways this breakthrough could stall at the launchpad

The December demonstration was a triumph of engineering and a genuine first in planetary exploration. It was also, by design, a carefully controlled experiment with significant guardrails that the celebratory coverage has largely glossed over. The drives were planned by Claude but verified through JPL’s full simulation pipeline before any commands were transmitted to Mars. Human planners reviewed the output, made adjustments, and retained full authority to reject the plan. The demonstration proved that a vision-language model can produce route plans of sufficient quality to survive JPL’s verification process — not that it can replace the verification process itself.

The gap between “AI-assisted planning” and “AI-autonomous planning” is wide, and the history of autonomous systems in safety-critical domains suggests that closing it will take years. Aviation offers the most instructive parallel. Autopilot systems have handled the majority of commercial flight operations for decades, but the regulatory framework still requires two human pilots in the cockpit, and every major aviation incident in the past twenty years has reinforced rather than relaxed that requirement. NASA’s own institutional culture is deeply conservative about autonomous systems — the agency that triple-redundantly tests every component on a spacecraft before launch is unlikely to hand full planning authority to a language model that occasionally hallucinates, no matter how impressive a single demonstration.

The hallucination problem deserves specific attention in this context. Vision-language models can misidentify terrain features, hallucinate obstacles that do not exist, or — more dangerously — fail to identify real hazards in imagery. On Earth, these errors are training data; on Mars, they are mission-ending events. A single misclassified rock field or underestimated slope could strand a $2.7 billion rover in a position from which it cannot recover. JPL’s 500,000-variable simulation catches these errors today, but the simulation itself relies on models of terrain and rover physics that carry their own uncertainties. The question is not whether Claude will occasionally produce a flawed plan — it certainly will — but whether the combined AI-plus-simulation pipeline has a lower error rate than the combined human-plus-simulation pipeline it seeks to augment. That comparison has not yet been made with the statistical rigor that aerospace engineering demands.

There is also the question of institutional incentives. The fifty engineers who currently plan Perseverance’s drives represent decades of accumulated expertise in Mars surface operations. They have survived dust storms, worked through hardware anomalies, and developed the intuitive judgment that comes from thousands of hours staring at Martian terrain. If AI planning tools reduce the human headcount needed for future missions, the agency risks losing the institutional knowledge that created those tools in the first place — a dynamic that the defense sector has already encountered as AI automates military intelligence analysis. NASA’s challenge is to use AI to amplify its planners’ capabilities without creating a dependency that atrophies the human expertise needed to supervise the machines.

Finally, the geopolitical dimension matters. China’s Tianwen-3 sample return mission is targeting a 2028 launch with the goal of returning Mars samples to Earth before NASA’s architecture decision even reaches the hardware phase. If AI-assisted planning gives the Perseverance mission a meaningful efficiency advantage, the question becomes whether NASA will share that advantage with its sample return partners at ESA or keep it proprietary — and whether China’s space program is developing equivalent capabilities using its own large language models. The space race has always been a technology race, and AI planning tools are now part of the arsenal.

The next 140 million miles start with a single waypoint

The December drives covered 456 meters of Martian terrain. In the context of Perseverance’s 30-plus-kilometer odometer, that is a rounding error. In the context of what it represents, it is a Kitty Hawk moment for autonomous exploration — the first proof that a general-purpose AI system can perform one of the most cognitively demanding tasks in robotic space operations with sufficient quality to satisfy the most demanding verification standards in aerospace engineering.

The near-term implications are already materializing. JPL has signaled that AI-assisted planning will become part of Perseverance’s regular operational toolkit, not a one-off experiment. As the rover continues its traverse of Jezero’s rim — searching for more samples, more shoreline evidence, more of the geological record that could contain biosignatures — the ability to plan drives faster means the ability to cover more ground before the rover’s plutonium power source decays below operational thresholds. Every additional kilometer of traverse is a lottery ticket for one of the most consequential scientific discoveries in human history.

The medium-term implications center on the Mars Sample Return mission. Whichever architecture NASA selects in mid-2026 — the $6.6 billion sky-crane option or the $5.8 billion commercial option — the fetch rover that will retrieve Perseverance’s cached samples will benefit enormously from AI-planned traverses. A fetch rover operating with Claude-class planning tools could potentially complete its sample collection campaign in half the time projected under current operational assumptions, reducing surface operations costs and closing the schedule gap with China’s Tianwen-3. The irony would be rich: the same AI safety company that is fighting the Pentagon in court over the military use of its technology might simultaneously become NASA’s most valuable partner in winning the Mars sample return race.

The long-term implications are the most transformative. The vision-language model that planned two drives on Mars in December 2025 was Claude — a general-purpose AI system that also writes code, drafts legal briefs, and analyzes financial statements. It was not a purpose-built space operations system. It was not trained on Mars imagery. It used the same orbital photographs and elevation data that human planners use, processed them through the same analytical framework a human would apply, and produced output in the rover’s native command language. The generality of the tool is the breakthrough. If a foundation model can plan Mars drives, the question for every space agency and every commercial operator is not whether to integrate AI into mission operations but how quickly they can do so without compromising safety margins.

Here is what operators, mission planners, and space industry leaders should be tracking:

  • Watch the cadence. If JPL moves from demonstration to regular operational use within the next few monthly planning cycles, it signals that the technology has cleared internal confidence thresholds and will likely become standard for future missions.
  • Track the sample return architecture decision. The mid-2026 selection between sky-crane and commercial options will determine whether AI-assisted planning becomes a cost differentiator or merely a convenience for Mars Sample Return.
  • Monitor the commercial lunar sector. Astrobotic, Intuitive Machines, and ispace are the first companies likely to adopt AI traverse planning for their lunar missions. Whichever company moves first will set the benchmark for the industry.
  • Follow the autonomy stack integration. The real value emerges when AI route planning (strategic), AutoNav (tactical), and AEGIS (opportunistic science) are integrated into a single autonomous decision loop. That integration has not happened yet, but the December demonstration proved that the strategic layer works.
  • Size the labor market impact. If AI planning tools reduce the ground-team headcount needed per rover by 30 to 50 percent, the implications cascade through NASA’s workforce planning, JPL’s staffing models, and the university pipeline that trains the next generation of planetary scientists.

Four hundred fifty-six meters of Martian soil, two drives, one AI, and the entire future of autonomous space exploration compressed into a single proof of concept. The rover is still rolling. The AI is ready for its next assignment. And somewhere in Pasadena, fifty mission planners are learning to share the wheel.

In other news

Anthropic launches the Anthropic Institute with Jack Clark at the helm — Anthropic unveiled the Anthropic Institute on March 11, an interdisciplinary research body led by co-founder Jack Clark that merges the company’s Frontier Red Team, Societal Impacts, and Economic Research groups. The institute will study AI’s effects on employment, national security, and governance, with founding researchers including Matt Botvinick on AI and rule of law and Anton Korinek on economic transformation (eWeek).

Google ships Gemini Embedding 2, its first natively multimodal embedding model — Google released Gemini Embedding 2 on March 10, a model that maps text, images, video, audio, and documents into a single unified embedding space using Matryoshka Representation Learning for flexible output dimensions of 3,072, 1,536, or 768. Early adopters report up to 70 percent latency reduction for multimodal retrieval-augmented generation pipelines (VentureBeat).

IBM Granite 4.0 1B Speech tops the OpenASR leaderboard — IBM released Granite 4.0 1B Speech under the Apache 2.0 license, a compact multilingual speech model that hit number one on the Open ASR leaderboard with an average word error rate of 5.52 while supporting English, French, German, Spanish, Portuguese, and Japanese at half the parameter count of its predecessor (MarkTechPost).

Morgan Stanley warns a massive AI breakthrough is imminent — Morgan Stanley issued a note arguing that an unprecedented accumulation of compute at America’s top AI labs will produce a major capability leap in the first half of 2026, warning that most of the world isn’t ready for the economic and labor-market disruption that will follow.

South Korea opens AI cooperation talks with Anthropic — South Korea’s Deputy Prime Minister Bae Kyung-hoon met Anthropic CEO Dario Amodei at an AI summit in India to discuss policy cooperation, public-service AI applications, and safety research, as Seoul looks to diversify its AI partnerships beyond OpenAI. Korea ranks seventh globally in Claude usage intensity per working-age capita, and KOSPO has already begun distributing $10,000 Claude credit packages to Korean startups (Korea Times).