Below we show examples of camera-to-robot pose and robot joint state estimation from an uncalibrated camera, without any motor information (the robot is not even plugged in!).
TL;DR:
• Reformulation of robot state estimation and control to use vision instead of just motor encoders
• Link pose estimation made easy via per-link 3D-printed mounts with fiducial markers (sketched below)
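To make the per-link idea concrete, here is a minimal sketch of mapping detected fiducial markers to link poses. It assumes OpenCV's ArUco module (4.7+ API); the intrinsics, marker size, marker IDs, and marker-to-link transforms below are placeholders for illustration. In practice, the marker-to-link transform is fixed by the 3D-printed mount's geometry.

```python
import cv2
import numpy as np

# Placeholder intrinsics and marker size; in practice these come from
# calibration (or are refined jointly with the robot state).
K = np.array([[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]])
dist = np.zeros(5)
MARKER_SIZE = 0.03  # marker edge length in meters (assumed)

# Hypothetical mapping: marker id -> (link name, 4x4 marker-to-link transform).
# With a 3D-printed mount, this transform is known from the mount design.
MARKER_TO_LINK = {
    0: ("link_1", np.eye(4)),
    1: ("link_2", np.eye(4)),
}

def detect_link_poses(image):
    """Return {link_name: 4x4 camera-to-link pose} from a single RGB image."""
    detector = cv2.aruco.ArucoDetector(
        cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50),
        cv2.aruco.DetectorParameters(),
    )
    corners, ids, _ = detector.detectMarkers(image)
    link_poses = {}
    if ids is None:
        return link_poses

    # 3D marker corners in the marker frame (marker lies in the z = 0 plane).
    s = MARKER_SIZE / 2.0
    obj_pts = np.array(
        [[-s, s, 0], [s, s, 0], [s, -s, 0], [-s, -s, 0]], dtype=np.float32
    )

    for marker_corners, marker_id in zip(corners, ids.flatten()):
        if int(marker_id) not in MARKER_TO_LINK:
            continue
        ok, rvec, tvec = cv2.solvePnP(obj_pts, marker_corners.reshape(4, 2), K, dist)
        if not ok:
            continue
        # Camera-to-marker pose from the PnP solution.
        T_cam_marker = np.eye(4)
        T_cam_marker[:3, :3], _ = cv2.Rodrigues(rvec)
        T_cam_marker[:3, 3] = tvec.flatten()
        # Chain with the known mount geometry to get the link pose.
        link_name, T_marker_link = MARKER_TO_LINK[int(marker_id)]
        link_poses[link_name] = T_cam_marker @ T_marker_link
    return link_poses
```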
We present a simple yet effective approach to robot state estimation that eliminates the need for expensive, high-precision actuators. Conventional methods rely on forward kinematics (FK) and precise motor encoders, but these degrade in accuracy due to backlash, thermal drift, and mismatched simulation–sensor dynamics. To improve accuracy, especially on lower-cost hardware, we reformulate state estimation as a visual pose-estimation problem. Our method estimates the 6-DoF pose of each robot link from a single RGB image and performs a global joint optimization, optionally warm-started with encoder readings, to recover the full robot configuration. This formulation supports accurate 3D tasks such as joint-angle inference, camera pose estimation, and robot calibration directly from monocular images. To enable robust visual pose estimation of each link, we introduce the fiducial exoskeleton: a 3D-printed frame with fiducial markers, mounted on each link. The known marker-link relationship allows precise per-link pose estimation without sequential calibration or specialized equipment. Experiments on state-estimation and robotic control tasks demonstrate substantial accuracy gains over standard FK-based methods, even on low-cost hardware.
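As a rough illustration of the global joint optimization, the sketch below fits the joint angles of a toy planar arm to observed link positions with a least-squares solve, optionally warm-started from encoder readings. The link lengths, planar FK, and noise level are assumptions for illustration only; the actual method works with full 6-DoF link poses and the robot's real kinematic model.

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical 3-joint planar arm used purely for illustration.
LINK_LENGTHS = [0.12, 0.10, 0.08]

def fk_link_positions(q):
    """Forward kinematics: 2D position of each link's end for joint angles q."""
    positions, angle, point = [], 0.0, np.zeros(2)
    for theta, length in zip(q, LINK_LENGTHS):
        angle += theta
        point = point + length * np.array([np.cos(angle), np.sin(angle)])
        positions.append(point.copy())
    return np.array(positions)

def estimate_joint_state(observed_link_positions, q_encoder=None):
    """Recover joint angles by fitting FK to visually observed link poses.

    observed_link_positions: (n_links, 2) array from per-link pose estimates.
    q_encoder: optional encoder readings, used only as a warm start.
    """
    q0 = np.zeros(len(LINK_LENGTHS)) if q_encoder is None else np.asarray(q_encoder, float)

    def residuals(q):
        # Mismatch between the kinematic model and the visual observations.
        return (fk_link_positions(q) - observed_link_positions).ravel()

    return least_squares(residuals, q0).x

# Example: simulate noisy observations from a "true" state and recover it.
q_true = np.array([0.3, -0.5, 0.8])
observations = fk_link_positions(q_true) + 0.002 * np.random.randn(3, 2)
q_est = estimate_joint_state(observations, q_encoder=[0.2, -0.4, 0.7])
print("estimated joints:", q_est)
```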
We compare state estimation (the estimated robot state rendered on top of the original video) across three approaches: the traditional method of using forward kinematics alone, our method using the raw encoder readings as input, and our method without using any internal motor readings.
The gray rendered robot is the estimated robot state overlaid on top of the real robot.
We compare the accuracy of moving the robot to a target kinematic state using encoders only, our method without delta correction, and our method with delta correction.
The gray rendered robot is the target robot state overlaid on top of the robot moving to (and ideally reaching) that state, for a sequence of target states.
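One plausible reading of delta correction is sketched below: after commanding the target, visually estimate the achieved joint state and shift the command by the remaining error. The helper names (send_motor_command, estimate_joints_visually) and the iteration count are hypothetical; the exact correction scheme used here may differ.

```python
import numpy as np

def move_with_delta_correction(target_q, send_motor_command,
                               estimate_joints_visually, iterations=3):
    """Iteratively correct the motor command using the visually estimated state.

    send_motor_command(q): commands the motors (placeholder interface).
    estimate_joints_visually(): returns the current joint state from the camera,
        e.g. via the optimization sketched above.
    """
    target_q = np.asarray(target_q, dtype=float)
    command = target_q.copy()
    for _ in range(iterations):
        send_motor_command(command)
        q_visual = estimate_joints_visually()
        # The achieved state typically misses the target due to backlash and
        # imprecise encoders; shift the command by the remaining error.
        delta = target_q - np.asarray(q_visual, dtype=float)
        command = command + delta
    return command
```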
We demonstrate a simple pipeline in which a grasp for a known object is computed with IK and then executed on the robot.
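A minimal sketch of what such a pipeline could look like, assuming hypothetical helpers solve_ik and move_to_joint_state and a known object-to-grasp offset; the camera-to-base transform would come from the same visual calibration described above.

```python
import numpy as np

def grasp_known_object(T_base_cam, T_cam_object, T_object_grasp,
                       solve_ik, move_to_joint_state):
    """Sketch of the demo pipeline: known object pose -> grasp pose -> IK -> execute.

    T_base_cam: camera pose in the robot base frame (from visual calibration).
    T_cam_object: pose of the known object in the camera frame.
    T_object_grasp: gripper pose relative to the object, known for this object.
    solve_ik / move_to_joint_state: placeholders for the robot-specific IK
        solver and low-level controller (e.g. using delta correction as above).
    """
    # Chain the transforms to express the desired gripper pose in the base frame.
    T_base_grasp = T_base_cam @ T_cam_object @ T_object_grasp
    # Solve IK for joint angles that reach the grasp pose, then execute.
    q_grasp = solve_ik(T_base_grasp)
    move_to_joint_state(q_grasp)
    return q_grasp
```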