NVIDIA has announced a preview release of Isaac Gym, its physics simulation environment for reinforcement learning (RL) research. RL training becomes more accessible because tasks that once required thousands of CPU cores can now be trained on a single GPU.
RL has become one of the most promising research areas in machine learning and has demonstrated great potential for solving complex problems. RL-based systems have achieved superhuman performance in very challenging tasks, ranging from classic strategy games such as Go and chess to real-time computer games like StarCraft and DOTA. RL-based approaches also hold promise for robotics applications, such as solving a Rubik’s Cube or learning locomotion by imitating animals.
Reinforcement learning supercomputer
Until now, most RL robotics researchers had to use clusters of CPU cores for the physically accurate simulations needed to train RL algorithms. In one of the better-known projects, the OpenAI team used almost 30,000 CPU cores (920 computers with 32 cores each) to train their robot on the Rubik’s Cube task.
In a similar project, Learning Dexterous In-Hand Manipulation, OpenAI used a cluster of 384 systems with 6,144 CPU cores plus 8 Volta V100 GPUs, and needed close to 30 hours of training to achieve its best results. Reorienting a cube in-hand is a challenging dexterous manipulation task, with complex physics and dynamics, many contacts, and a high-dimensional continuous control space.
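To put the cluster sizes quoted above in perspective, the short Python sketch below works through the arithmetic. The machine counts, core counts, and the roughly 30-hour figure come from the text; treating every core and GPU as busy for the full run is an assumption made only to get a rough core-hour estimate.

```python
# Figures quoted in the text. Full utilization for the whole run is an
# assumption used only to produce a rough estimate.
rubiks_machines, cores_per_machine = 920, 32
rubiks_cores = rubiks_machines * cores_per_machine

inhand_systems, inhand_cores, inhand_gpus = 384, 6144, 8
inhand_hours = 30  # "close to 30 hours"

cores_per_system = inhand_cores // inhand_systems
cpu_core_hours = inhand_cores * inhand_hours
gpu_hours = inhand_gpus * inhand_hours

print(f"Rubik's Cube cluster: {rubiks_cores:,} CPU cores")
print(f"In-hand cluster: {cores_per_system} cores per system")
print(f"In-hand training: ~{cpu_core_hours:,} CPU core-hours "
      f"plus ~{gpu_hours} GPU-hours")
```

The first figure, 29,440 cores, is the "almost 30,000" mentioned for the Rubik’s Cube project.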
Isaac Gym includes an example of this cube manipulation task so that researchers can recreate the OpenAI experiment. The example supports training both recurrent and feed-forward neural networks, as well as domain randomization of physics properties, which helps with sim-to-real transfer. With Isaac Gym, researchers can achieve the same level of success as OpenAI’s supercomputer on a single A100 GPU in about 10 hours!
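Domain randomization of physics properties can be pictured as giving each simulated environment its own slightly perturbed set of parameters, so that the trained policy does not overfit to one exact physics configuration. The Python sketch below illustrates the idea only. It is not Isaac Gym's API, and the parameter names, nominal values, and scaling ranges are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class PhysicsParams:
    friction: float
    object_mass_kg: float
    joint_damping: float


# Hypothetical nominal values and scaling ranges, for illustration only.
NOMINAL = PhysicsParams(friction=1.0, object_mass_kg=0.05, joint_damping=0.1)
SCALE_RANGES = {
    "friction": (0.7, 1.3),
    "object_mass_kg": (0.5, 1.5),
    "joint_damping": (0.3, 3.0),
}


def sample_params(rng: random.Random) -> PhysicsParams:
    """Scale each nominal value by a factor drawn uniformly from its range."""
    return PhysicsParams(**{
        name: getattr(NOMINAL, name) * rng.uniform(low, high)
        for name, (low, high) in SCALE_RANGES.items()
    })


rng = random.Random(0)  # fixed seed so the sketch is reproducible
envs = [sample_params(rng) for _ in range(4)]  # one parameter set per environment
for index, params in enumerate(envs):
    print(index, params)
```

In practice the ranges would be chosen to cover the uncertainty about the real robot and object, which is what makes the randomization useful for sim-to-real transfer.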
End-to-end GPU RL
Isaac Gym achieves these results by using NVIDIA’s PhysX GPU-accelerated simulation engine to gather the experience data required for robotics RL. In addition to fast physics simulation, Isaac Gym also enables observation and reward calculations to take place on the GPU, thereby avoiding significant performance bottlenecks. In particular, costly data transfers between the GPU and the CPU are eliminated. Implemented this way, Isaac Gym enables a complete end-to-end GPU RL pipeline.
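The idea of keeping observation and reward calculations on the GPU can be sketched with PyTorch tensors. The example below is a minimal illustration, not Isaac Gym's actual API: the random quaternions stand in for simulator state, and the function name, observation layout, and reward definition are assumptions. The point is that every tensor is created and processed on the same device, so no per-step copy to the CPU is needed.

```python
import torch

# Use the GPU when one is present; the same code falls back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
num_envs = 4096  # hypothetical number of parallel environments


def random_unit_quats(n: int) -> torch.Tensor:
    """Stand-in for simulator state: n random unit quaternions on `device`."""
    q = torch.randn(n, 4, device=device)
    return q / q.norm(dim=-1, keepdim=True)


object_rot = random_unit_quats(num_envs)  # current cube orientation per env
goal_rot = random_unit_quats(num_envs)    # target cube orientation per env


def compute_obs_and_reward(object_rot, goal_rot):
    """Batched observation and reward, computed entirely on `device`."""
    # For unit quaternions, |<q1, q2>| = cos(theta / 2), where theta is the
    # rotation angle between the two orientations.
    dot = (object_rot * goal_rot).sum(dim=-1).abs().clamp(max=1.0)
    rot_dist = 2.0 * torch.acos(dot)
    obs = torch.cat([object_rot, goal_rot], dim=-1)
    reward = -rot_dist  # hypothetical reward: closer to the goal is better
    return obs, reward


obs, reward = compute_obs_and_reward(object_rot, goal_rot)
print(obs.shape, reward.shape, reward.device)
```

A policy network living on the same device could consume obs directly, which is what makes the pipeline end-to-end on the GPU.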