At NVIDIA GTC, Intel RealSense and LimX Dynamics demonstrated a humanoid robot navigating autonomously using 3D vision and visual localization. The demonstration illustrates how sensors, robotics software and simulation platforms can be combined so that humanoid robots better understand their surroundings and move safely through complex environments.
According to RealSense, the project focuses on one of the central challenges in humanoid robotics: reliable perception. While many industrial robots operate in controlled factory environments, humanoid robots are expected to function in dynamic spaces where they constantly encounter people, obstacles and changes in elevation.
3D vision for humanoid navigation
The demonstration uses a humanoid robot developed by LimX Dynamics and equipped with RealSense depth cameras. These sensors provide real-time three-dimensional information about the surrounding environment, allowing the robot to measure distances, detect obstacles and analyse the geometry of its surroundings.
Depth sensing plays an important role in legged robotics. Unlike wheeled mobile robots that typically operate on flat surfaces, humanoid robots must continuously determine where to place their feet and how to maintain balance while walking. This requires a detailed three-dimensional understanding of the environment.
The RealSense cameras generate a continuous 3D point cloud of the surroundings. Using this information, the robot can detect objects, avoid obstacles and navigate through spaces without relying on predefined routes.
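How a depth image becomes a point cloud can be illustrated with the standard pinhole camera model. The sketch below is not the RealSense SDK or the code used in the demonstration; the function names, the tiny depth image and the intrinsics (fx, fy, cx, cy) are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_points(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth image (metres) into camera-frame 3D points.

    Uses the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):          # v: pixel row
        for u, z in enumerate(row):          # u: pixel column
            if z <= 0.0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def nearest_obstacle_distance(points: Sequence[Point]) -> float:
    """Euclidean distance from the camera origin to the closest point."""
    return min(math.sqrt(x * x + y * y + z * z) for x, y, z in points)


# Hypothetical 2x2 depth image; 0.0 marks pixels with no valid reading.
depth_image = [
    [0.0, 2.0],
    [1.0, 0.0],
]
cloud = depth_to_points(depth_image, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
closest = nearest_obstacle_distance(cloud)
```

A real depth camera delivers hundreds of thousands of such pixels per frame, so production systems vectorize this step or run it on a GPU, but the geometry is the same.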
Visual SLAM and robot perception
In addition to the sensors, the system relies on visual SLAM technology. SLAM stands for simultaneous localization and mapping, a method that allows a robot to build a map of its environment while determining its own position within that map.
For the demonstration, the perception stack is integrated with software from the NVIDIA ecosystem. The system uses GPU-accelerated robotics software for visual odometry and mapping. By combining depth sensing with visual SLAM, the robot can continuously update its position and interpret changes in the environment while moving.
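The idea of continuously updating a position from odometry can be shown with a simplified planar example. This is a minimal sketch under stated assumptions, not the software used in the demonstration: it assumes a 2D pose (x, y, heading) and that some visual odometry front end already supplies each relative motion in the robot's own frame. The names compose and integrate are hypothetical.

```python
import math
from typing import Iterable, List, Tuple

Pose = Tuple[float, float, float]  # x (m), y (m), heading (rad)


def compose(pose: Pose, delta: Pose) -> Pose:
    """Apply a relative motion, expressed in the robot frame, to a world pose."""
    x, y, theta = pose
    dx, dy, dtheta = delta
    new_x = x + math.cos(theta) * dx - math.sin(theta) * dy
    new_y = y + math.sin(theta) * dx + math.cos(theta) * dy
    # Wrap the heading into [-pi, pi) so it stays bounded over long runs.
    new_theta = (theta + dtheta + math.pi) % (2.0 * math.pi) - math.pi
    return (new_x, new_y, new_theta)


def integrate(start: Pose, deltas: Iterable[Pose]) -> List[Pose]:
    """Chain relative motions into a trajectory of world poses."""
    trajectory = [start]
    for delta in deltas:
        trajectory.append(compose(trajectory[-1], delta))
    return trajectory


# Assumed odometry output: 1 m forward while turning 90 degrees left,
# then 1 m forward along the new heading.
steps = [(1.0, 0.0, math.pi / 2.0), (1.0, 0.0, 0.0)]
path = integrate((0.0, 0.0, 0.0), steps)
final_x, final_y, final_theta = path[-1]
```

Chaining relative motions in this way accumulates drift over time. The mapping half of SLAM addresses this by recognizing previously seen places and correcting the pose estimate, a step this sketch leaves out.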
This capability is considered essential for humanoid robots that are expected to operate in buildings, warehouses and other human environments.
Training in simulation
An important component of the system is its integration with simulation environments. Parts of the navigation system were developed and tested in virtual robotics environments before being deployed on the physical robot.
This approach, often referred to as “sim-to-real,” allows locomotion and navigation algorithms to be trained in simulation first. Developers can test thousands of scenarios in virtual environments, including different terrain conditions and unexpected obstacles. The training process helps robots respond more reliably to real-world situations once deployed.
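One common way to generate many varied test scenarios is domain randomization, in which terrain and obstacle parameters are sampled at random for each simulated episode. The article does not say whether the demonstration used this technique, so the sketch below is purely illustrative: the Scenario fields, the value ranges and the function name sample_scenarios are all assumptions.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Scenario:
    step_height_m: float   # height of a stair or curb in the terrain
    friction: float        # ground friction coefficient
    obstacle_count: int    # number of unexpected obstacles placed


def sample_scenarios(n: int, seed: int = 0) -> List[Scenario]:
    """Draw n randomized scenarios; a fixed seed makes the set reproducible."""
    rng = random.Random(seed)
    return [
        Scenario(
            step_height_m=rng.uniform(0.0, 0.2),   # assumed range
            friction=rng.uniform(0.3, 1.0),        # assumed range
            obstacle_count=rng.randint(0, 5),      # assumed range, inclusive
        )
        for _ in range(n)
    ]


scenarios = sample_scenarios(1000, seed=42)
```

In a real pipeline each sampled scenario would configure one simulated episode. Seeding the generator lets a failing scenario be reproduced exactly.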
Perception as a key capability
The demonstration highlights a broader trend within humanoid robotics: perception is becoming a central element of robot design. Many autonomous mobile robots used in logistics rely on relatively simple navigation systems because they operate on flat floors. Humanoid robots, however, must deal with stairs, curbs, uneven surfaces and moving people.
This requires robots not only to detect objects but also to understand the structure of their environment. Depth cameras and visual mapping technologies are therefore increasingly integrated into humanoid platforms.
RealSense describes the vision system as functioning similarly to a “visual cortex,” enabling robots to interpret their surroundings and operate safely around humans.
LimX as a development platform
The humanoid robot used in the demonstration was developed by LimX Dynamics, a robotics company focused on legged robots and humanoid platforms for research and development. The platform is designed as a development environment for embodied AI applications and supports a range of sensors and software interfaces.
For research purposes, the robot can be equipped with depth cameras, inertial sensors and additional sensors such as LiDAR or RGB cameras. This flexibility allows developers to integrate and test new perception and navigation systems.
Although the demonstration represents a development-stage project, it provides an example of how humanoid robots could eventually navigate autonomously in complex environments. By combining depth sensing, visual SLAM and simulation-based training, the system enables robots to interpret their surroundings in real time and adapt their movements accordingly.
