Project Overview
The goal of this project was to create a quadruped platform for reinforcement learning tasks for under $600. First, I built a PyBullet environment using the open-source Spot Micro CAD models (and later my own, OpenQuadruped). I used this platform to validate my novel sim-to-real RL method, D$^2$-GMBC. After building this original version, I collaborated with a friend to mechanically redesign Spot for higher fidelity and better weight distribution. Check out the package on GitHub!
Dynamics and Domain Randomized Gait Modulation with Bezier Curves for Sim-to-Real Legged Locomotion
Using this platform, I propose a data-efficient novel reinforcement learning method that seeks to deliver a robust and universally controllable gait without contact or environment sensing. It builds on an existing gait scheme using 12-point Bezier curves which I modify to allow for any combination of forward, lateral, and yaw commands at user-defined step heights, lengths, and speeds. The method wraps a learning agent around this scheme to modulate gait parameters such as step and body height, and to add significant residuals to the resultant foot coordinates. The only sensor used here is an IMU. To read more about this approach, and to access the paper, please visit this website!
Inverse Kinematics
After deriving the Inverse Kinematics for each leg, the next step was to describe the IK for the body itself. The approach used here considers a world frame $w$, which is the robot centroid’s base position, and a body frame $b$, describing the robot’s pose relative to the world frame. In addition, we have $T_{ws}$, which is a transform from the world frame to the robot’s shoulder: this describes the base transform between the robot centroid and the shoulder. Finally, we have our inputs: $T_{wb}$, which describes the desired transform from world to body (RPY and Translation), and $T_{bf}$, the desired foot position relative to the transformed body - this is useful for gait generation. The output of our process is $T_{sf}$, the transform between each shoulder and its respective foot required to achieve this motion - this is fed into the leg IK solver to retrieve joint angles. The gallery below shows our inputs and outputs. Note that this diagram is facing the robot, so the example shown is for body roll.
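The chain of transforms above can be sketched in a few lines. This is a minimal illustration, not the project's solver: it assumes $T_{ab}$ expresses frame $b$ in frame $a$, so that subscripts cancel as $T_{bs} = T_{wb}^{-1} T_{ws}$ and $T_{sf} = T_{bs}^{-1} T_{bf}$. The helper names (`mat_mul`, `rigid_inverse`, `make_transform`, `shoulder_to_foot`) and the numeric offsets are hypothetical, and the real code may order or sign the transforms differently.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rigid_inverse(t):
    """Invert a rigid transform: rotation R -> R^T, translation p -> -R^T p."""
    rt = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(rt[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [rt[i] + [new_p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def make_transform(roll=0.0, pos=(0.0, 0.0, 0.0)):
    """Build a transform with a rotation about x (roll) and a translation."""
    c, s = math.cos(roll), math.sin(roll)
    return [
        [1.0, 0.0, 0.0, pos[0]],
        [0.0, c, -s, pos[1]],
        [0.0, s, c, pos[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]

def shoulder_to_foot(t_ws, t_wb, t_bf):
    """Assumed chaining: T_bs = T_wb^-1 T_ws, then T_sf = T_bs^-1 T_bf."""
    t_bs = mat_mul(rigid_inverse(t_wb), t_ws)
    return mat_mul(rigid_inverse(t_bs), t_bf)

# Hypothetical numbers (metres): one shoulder offset and one foot target.
t_ws = make_transform(pos=(0.1, 0.05, 0.0))
t_bf = make_transform(pos=(0.1, 0.05, -0.2))

# With an identity body pose the foot sits 0.2 m directly below the shoulder.
t_sf = shoulder_to_foot(t_ws, make_transform(), t_bf)
foot_in_shoulder = [t_sf[i][3] for i in range(3)]
```

The translation column of `t_sf` is what would be handed to a per-leg IK solver. Passing a non-zero `roll` to `make_transform` for $T_{wb}$ gives the body-roll case from the diagram.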
Here’s a gif of the body IK in action:
Bezier Gait
The Bezier Gait deployed in this project uses an open-loop trajectory generator, which resets when the desired stride period is completed. The basic adaptation of the Bezier curve generator gives 2D foot coordinates over time: horizontal and vertical. In Section V of the paper, I describe my method for extending the trajectories into 3D.
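As a rough illustration of the 2D generator, the sketch below evaluates a 12-point Bezier curve with the Bernstein basis and wraps time by the stride period so the trajectory resets each stride. The control points are made-up placeholders, not the values from the paper, and the function names are hypothetical. It only covers evaluating the curve; how the project divides the stride into phases is not shown here.

```python
from math import comb

# Hypothetical control points (horizontal, vertical) in metres; 12 in total.
CONTROL_POINTS = [
    (-0.05, 0.00), (-0.07, 0.00), (-0.08, 0.03), (-0.08, 0.03),
    (-0.08, 0.03), (0.00, 0.03), (0.00, 0.03), (0.00, 0.04),
    (0.08, 0.04), (0.08, 0.04), (0.07, 0.00), (0.05, 0.00),
]

def bezier_point(control_points, t):
    """Evaluate a Bezier curve at t in [0, 1] using Bernstein polynomials."""
    n = len(control_points) - 1
    x = y = 0.0
    for i, (px, py) in enumerate(control_points):
        weight = comb(n, i) * (t ** i) * ((1.0 - t) ** (n - i))
        x += weight * px
        y += weight * py
    return x, y

def foot_target(elapsed, stride_period, control_points=CONTROL_POINTS):
    """Open-loop generator: the phase wraps to 0 once a stride period elapses."""
    phase = (elapsed % stride_period) / stride_period
    return bezier_point(control_points, phase)
```

Calling `foot_target(t, 0.5)` for increasing `t` traces the curve from the first control point towards the last and then starts over, which is the open-loop reset behaviour described above.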
Here’s what the translational and yaw gaits look like:
Gym Environment and Terrain
The environment provided here is largely derived from PyBullet’s Minitaur example. In fact, it is nearly identical aside from accounting for the differences between the robots themselves. The other difference is the terrain: an optional, programmatically generated heightfield that is enabled from the command line. You should experiment with the meshscale argument as well, as this will change the characteristics of your terrain. This environment is great for locomotion reinforcement learning tasks!
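For readers unfamiliar with PyBullet heightfields, here is a minimal sketch of how such terrain can be generated. It is not the environment's own code: `make_heightfield` and `load_terrain` are hypothetical helpers, and the default `mesh_scale` is an arbitrary choice. The PyBullet calls used (`createCollisionShape` with `GEOM_HEIGHTFIELD`, and `createMultiBody`) are the standard ones and require an already connected physics client.

```python
import random

def make_heightfield(rows, cols, max_height, seed=None):
    """Return rows * cols random heights in [0, max_height] as a flat list."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, max_height) for _ in range(rows * cols)]

def load_terrain(heights, rows, cols, mesh_scale=(0.05, 0.05, 1.0)):
    """Create a static heightfield body; assumes p.connect() was already called."""
    import pybullet as p  # imported lazily so data generation works without PyBullet

    shape = p.createCollisionShape(
        shapeType=p.GEOM_HEIGHTFIELD,
        meshScale=list(mesh_scale),  # x/y cell size and z height multiplier
        heightfieldData=heights,
        numHeightfieldRows=rows,
        numHeightfieldColumns=cols,
    )
    # A mass of 0 makes the terrain a fixed, non-dynamic body.
    return p.createMultiBody(baseMass=0, baseCollisionShapeIndex=shape)

heights = make_heightfield(rows=64, cols=64, max_height=0.05, seed=0)
```

Scaling the first two entries of `mesh_scale` stretches each cell horizontally, while the third multiplies the heights, which is why the mesh scale has such a strong effect on how rough the terrain feels.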
Reinforcement Learning Task
To allow for stable terrain traversal, I trained an Augmented Random Search agent with a 12-dimensional observation space [IMU Inputs (8), Leg Phases (4)] and a 14-dimensional action space [Clearance Height (1), Body Height (1), and Foot XYZ Residual modulations (12)]. The actions are processed through an exponential filter with alpha = 0.7. With this setup, the agent was able to traverse the light terrain in as little as 150 epochs.
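The exponential filter on the actions can be sketched as follows. This is an illustration under one assumption that the text does not settle: here alpha weights the newest action and (1 - alpha) weights the previous filtered value. The project may use the opposite convention. The class name `ExponentialFilter` is hypothetical.

```python
class ExponentialFilter:
    """Smooth a stream of action vectors with an exponential moving average."""

    def __init__(self, alpha=0.7):
        self.alpha = alpha
        self.state = None  # previous filtered action, unset until the first call

    def __call__(self, action):
        if self.state is None:
            # The first action passes through unchanged.
            self.state = list(action)
        else:
            # Assumed convention: alpha weights the new sample.
            self.state = [
                self.alpha * new + (1.0 - self.alpha) * old
                for new, old in zip(action, self.state)
            ]
        return list(self.state)

# 14-dimensional action: clearance height, body height, 12 foot residuals.
smooth = ExponentialFilter(alpha=0.7)
first = smooth([0.0] * 14)
second = smooth([1.0] * 14)  # each entry moves 70% of the way to the new value
```

Filtering like this keeps the policy's step-to-step changes from reaching the servos as abrupt jumps, at the cost of a small lag in the response.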
Here’s a system diagram and algorithm for the D$^2$-GMBC process:
Real World Validation
Here are some additional results, where D$^2$-GMBC is shown on the right.
The best part is that even though the agent was only trained to walk forward, it responds to previously unseen commands such as yaw and lateral motion! This means that you can finally use RL on a real robot! Keep in mind that this is all done on a $600 platform!
Mechanical Redesign
Together with Adham Elarabawy, I have completed a total mechanical redesign of SpotMicro, the robot that inspired this project. We call it Open Quadruped!
Main improvements:
- Shortened the body by 40mm while making more room for our electronics with adapter plates.
- Moved all the servos to the hip to save 60g on the lower legs, which are now belt-drive actuated with tunable belt tightness.
- Added support bridge on hip joint for added longevity.
- Added flush slots for hall effect sensors on the feet.
I also created a new URDF with proper inertial values on each link, making the simulation much more reliable.
Power Distribution Board
We also designed this Power Distribution Board with a 1.5mm track width to support up to 6A at a 10°C temperature rise (conservative estimate). There are copper ground planes on both sides of the board to help with heat dissipation, and parallel tracks for the power lines are provided for the same reason. The PDB also includes shunt electrolytic capacitors for each servo motor to smooth out the power input. The board interfaces with a sensor array (optionally used for foot sensors) and contains two I2C terminals and a regulated 5V power rail. At the center of the board is a Teensy 4.0, which communicates with a Raspberry Pi over ROSSerial to control the 12 servo motors and read analogue sensors.