Hybrid AI improves bipedal locomotion with adaptive control and reinforcement learning

Bipedal robots achieve a stable and resilient gait thanks to a hybrid system that combines physics-based trajectory planning with deep reinforcement learning control.
Advancing bipedal locomotion with AI

A new artificial intelligence-based control system has taken a significant step toward more natural and stable bipedal locomotion. By combining optimized trajectory planning with deep reinforcement learning (DRL), researchers developed a two-stage framework that enables bipedal robots to move with greater stability, energy efficiency, and tolerance to perturbations.

A dual control architecture

The solution integrates two complementary processes. In the first stage, a deep neural network learns to generate smooth and energetically feasible joint trajectories from parameters such as body mass, leg length and gait characteristics. The network is trained on data from physics-based optimization simulations that keep the zero moment point (ZMP) within the robot's support polygon.
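A minimal sketch of what such a planner stage could look like is shown below, assuming a small PyTorch network. The layer sizes, joint count, horizon and dataset name (ref_trajectories) are illustrative assumptions, not details taken from the published work.

```python
# Sketch (not the authors' code): a planner network that maps gait parameters
# to a joint-angle trajectory, trained on ZMP-feasible trajectories produced
# by an offline physics optimization.
import torch
import torch.nn as nn

N_JOINTS = 6   # assumed: hip, knee and ankle for each leg
HORIZON = 50   # assumed number of time steps per gait cycle

class TrajectoryPlanner(nn.Module):
    """Maps (body mass, leg length, target speed) to a joint-angle trajectory."""
    def __init__(self, n_params=3, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_params, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, HORIZON * N_JOINTS),
        )

    def forward(self, params):
        # Output a reference trajectory of shape (batch, HORIZON, N_JOINTS).
        return self.net(params).view(-1, HORIZON, N_JOINTS)

def train_step(model, optimizer, gait_params, ref_trajectories):
    # Supervised regression against the ZMP-feasible reference trajectories.
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(gait_params), ref_trajectories)
    loss.backward()
    optimizer.step()
    return loss.item()
```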

In the second stage, a DRL algorithm, specifically DDPG, takes over joint-torque control. This agent does not learn from scratch but uses the previously generated trajectories as a reference. Its training is guided by a reward function that prioritizes forward speed, sustained stability and minimal energy consumption while penalizing falls and imbalance.
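To make the idea concrete, the sketch below shows one possible shape for such a reward function. The terms and weights are assumptions chosen to reflect the priorities described above; they are not the reward published by the researchers.

```python
# Illustrative reward for the torque-control agent: rewards forward speed and
# tracking of the planned trajectory, penalizes energy use and torso tilt,
# and heavily penalizes falls.
import numpy as np

def locomotion_reward(forward_velocity, joint_angles, ref_angles,
                      joint_torques, torso_pitch, fallen,
                      w_vel=1.0, w_track=0.5, w_energy=1e-3, w_posture=0.2):
    if fallen:
        return -100.0                        # a fall ends the episode with a large penalty
    tracking_error = np.sum((joint_angles - ref_angles) ** 2)
    energy = np.sum(joint_torques ** 2)      # proxy for energy consumption
    return (w_vel * forward_velocity         # encourage forward progress
            - w_track * tracking_error       # stay close to the planned gait
            - w_energy * energy              # minimize actuation effort
            - w_posture * abs(torso_pitch))  # keep the torso upright
```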

A more human-like and resilient gait in the face of uncertainty

Simulation testing revealed superior performance over traditional inverse dynamics-based approaches. The robot maintained an upright posture, reduced torso oscillation and produced a smoother periodic gait over prolonged cycles. Even when faced with mass variations of up to 20% or significant external disturbances, the system regained balance and resumed its gait pattern without collapsing.

This adaptability comes from combining the physics knowledge embedded in the planning stage with the flexibility of reinforcement learning. The result is a hybrid system capable of generating robust control policies in uncertain environments.

Real-world applications of bipedal locomotion in robotics

The breakthrough represents a key step towards deploying bipedal robots in practical real-world tasks, from logistics to personal assistance. By integrating deep learning with torque-based adaptive control, this two-stage framework opens up new possibilities for bipedal locomotion in dynamic and unpredictable environments.

In a high-fidelity MATLAB simulation environment, the system maintained a 100% success rate across 100 trials with random perturbations, including abrupt angular-velocity changes ranging from -30% to +55%. This reliability makes it a promising candidate for future developments in stable and efficient autonomous robotics.
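As an illustration of that test protocol, a robustness evaluation of this kind could be scripted as in the Python sketch below; the article's experiments were run in MATLAB, and run_gait_episode here is a hypothetical placeholder for the actual simulation call.

```python
# Sketch of the perturbation protocol: 100 trials, each with a random abrupt
# change to the angular velocity in the range [-30%, +55%].
import random

def run_gait_episode(angular_velocity_scale):
    """Placeholder for one simulated walking episode; returns True if the
    robot finishes without falling. Replace with the real simulator call."""
    return True

def robustness_test(n_trials=100, min_change=-0.30, max_change=0.55):
    successes = 0
    for _ in range(n_trials):
        scale = 1.0 + random.uniform(min_change, max_change)
        successes += int(run_gait_episode(angular_velocity_scale=scale))
    return successes / n_trials   # the reported success rate was 100%

print(f"Success rate: {robustness_test():.0%}")
```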

Source and photo: AZO Robotics