ENPIRE: AI Coding Agents Train an Eight-Robot Fleet

Nvidia, Carnegie Mellon and UC Berkeley released ENPIRE, a framework that lets AI coding agents autonomously train an eight-robot fleet to a reported 99 percent success rate on pin and GPU insertion.

ENPIRE was released Tuesday in a paper from Nvidia, Carnegie Mellon University and UC Berkeley. The framework lets AI coding agents run the full robot-training loop without human supervision. The researchers used it to teach an eight-robot fleet tasks including pin insertion, seating graphics cards and zip-tie cutting, reaching a reported 99 percent success rate.

The system splits work into a human setup phase and an autonomous phase. Humans build two reusable tools: a reset routine that returns the workspace to a known starting state and a camera-based reward function that scores success. After that setup, coding agents take over.
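
The split between human-built tools and agent-written policies can be pictured as a small evaluation loop. The sketch below is illustrative only and is not taken from the paper: `evaluate_policy`, `ToyStation` and the 0.5 success threshold are hypothetical names and values, and the toy station stands in for real robot hardware and a real camera-based reward.

```python
from typing import Callable


def evaluate_policy(
    reset: Callable[[], None],
    rollout: Callable[[], None],
    reward: Callable[[], float],
    trials: int,
    success_threshold: float = 0.5,  # assumed cutoff, not from the paper
) -> float:
    """Return the fraction of trials whose reward meets the threshold."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    successes = 0
    for _ in range(trials):
        reset()    # human-built: return the workspace to a known state
        rollout()  # agent-written: run the current policy once
        if reward() >= success_threshold:  # human-built: score the outcome
            successes += 1
    return successes / trials


class ToyStation:
    """Hypothetical stand-in for one robot station (no real hardware)."""

    def __init__(self) -> None:
        self.pin_inserted = False
        self.resets = 0

    def reset(self) -> None:
        self.pin_inserted = False
        self.resets += 1

    def rollout(self) -> None:
        # Pretend policy that fails on every fourth attempt.
        self.pin_inserted = self.resets % 4 != 0

    def reward(self) -> float:
        return 1.0 if self.pin_inserted else 0.0


station = ToyStation()
rate = evaluate_policy(station.reset, station.rollout, station.reward, trials=8)
print(rate)  # 0.75 for this toy policy (6 of 8 trials succeed)
```

Because the reset and the reward are passed in as plain callables, an agent can swap in a new policy between runs without touching the human-built pieces.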

Researchers ran agents based on OpenAI’s Codex, Anthropic’s Claude Code and Moonshot’s Kimi Code. The agents searched published research, selected training methods such as imitation learning or reinforcement learning, wrote and revised code, and executed experiments directly on physical robot hardware without a person supervising each trial.

Nvidia ran the experiments in its GEAR lab on eight bimanual robot stations. Each station had its own computer and agent. Stations shared code and successful strategies via Git so methods that worked on one station could propagate to the others within minutes.
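
The article says only that stations shared code through Git. One plausible way a station might publish a working strategy is sketched below. The function names, file path and commit message are hypothetical, and the commit-then-rebase-then-push order is an assumption rather than the workflow described in the paper. Only standard Git subcommands are used.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], None]


def run_git(args: Sequence[str]) -> None:
    """Run one git subcommand in the current directory; raise on failure."""
    subprocess.run(["git", *args], check=True)


def share_strategy(paths: List[str], message: str, run: Runner = run_git) -> None:
    """Commit local changes, replay them on top of peers' work, then publish."""
    run(["add", "--", *paths])
    run(["commit", "-m", message])
    run(["pull", "--rebase"])  # pick up what other stations pushed
    run(["push"])


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    share_strategy(
        ["policies/pin_insertion.py"],  # hypothetical file
        "pin insertion: strategy that passed evaluation",
        run=lambda args: print("git", *args),
    )
```

Injecting the `run` callable keeps the sketch testable without a repository. A real deployment would also need to handle merge conflicts and failed pushes.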

The team reported gains from scale. On the Push-T task, time to mastery fell from about five hours with one robot to roughly two hours with eight. For pin insertion, it dropped from more than 90 minutes to about 40 minutes. Across four real-world tasks, the agents drove their policies to a 99 percent success rate, according to the paper.
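
To put those figures in perspective, the short calculation below turns them into speedup and parallel efficiency (speedup divided by robot count). These are standard metrics and do not appear in the article. The inputs are the article's rounded numbers, and since the single-robot pin-insertion time is given as "more than 90 minutes", that result is a lower bound. The calculation also assumes the pin-insertion comparison is one robot against eight, as with Push-T.

```python
def speedup(single_robot_time: float, fleet_time: float) -> float:
    """How many times faster the fleet reached mastery than one robot."""
    return single_robot_time / fleet_time


def parallel_efficiency(single_robot_time: float, fleet_time: float, robots: int) -> float:
    """Speedup per robot; 1.0 would mean perfectly linear scaling."""
    return speedup(single_robot_time, fleet_time) / robots


ROBOTS = 8

# Push-T: about 5 hours with one robot, roughly 2 hours with eight.
push_t_speedup = speedup(5.0, 2.0)
push_t_efficiency = parallel_efficiency(5.0, 2.0, ROBOTS)

# Pin insertion: more than 90 minutes down to about 40 (lower bound).
pin_speedup = speedup(90.0, 40.0)

print(push_t_speedup)     # 2.5
print(push_t_efficiency)  # 0.3125
print(pin_speedup)        # 2.25
```

On these rounded numbers, eight robots bought roughly a 2.5x reduction in time to mastery rather than 8x.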

The researchers also tracked costs and limits. Running multiple agents in parallel reduced wall-clock training time, but the language-model token bill grew faster than the time savings. The paper highlights a gap between simulation and reality: all three agents solved Push-T in a simulator, while two failed on physical hardware, with real-world friction cited as a factor.

ENPIRE was also evaluated in RoboCasa, a simulated kitchen benchmark for chores such as opening cabinets and turning off stoves. In that benchmark, ENPIRE outperformed both Nvidia’s end-to-end model GR00T and CaP-X, an agent that lacks an autoresearch loop. The authors present ENPIRE as an extension of prior work, including a 2023 system that used a language model to write reward functions inside a simulator.

Jim Fan, co-lead of Nvidia’s GEAR Lab, posted on social media that the team gave eight coding agents a fleet of robots, a GPU allocation and a token budget, then let the agents work toward solving tasks quickly while keeping the robots active.

The release coincided with Alibaba unveiling the Qwen-Robot Suite, a set of foundation models for robot navigation, manipulation and physics simulation.

The paper states that human involvement after the initial setup was limited to writing the publication.
