Visual-goal motion planning extends classical motion planning by replacing explicit goal configurations q_goal with goal images I_goal. While traditional RRT-based planners guide tree expansion toward known configuration targets, visual-goal planning navigates configuration space using only visual similarity to the goal image as feedback. This formulation is essential for vision-centric applications where goals are demonstrated visually but precise joint configurations are unavailable.
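The idea above can be sketched as a minimal RRT whose goal test uses image distance instead of configuration distance. This is an illustrative toy, not the paper's method: `render` here is a stand-in for the differentiable renderer, the L2 image metric and all thresholds are assumptions, and collision checking is omitted.

```python
import numpy as np

def render(q):
    # Stand-in for a renderer: maps a configuration to a toy 2-"pixel" image.
    # The real system would render the robot at configuration q.
    return np.array([np.sin(q).sum(), np.cos(q).sum()])

def visual_distance(q, I_goal):
    # Visual similarity to the goal image replaces the usual ||q - q_goal||.
    return np.linalg.norm(render(q) - I_goal)

def visual_rrt(q_init, I_goal, step=0.2, iters=5000, tol=0.2, seed=0):
    """Plain RRT; only the termination test looks at the goal *image*."""
    rng = np.random.default_rng(seed)
    nodes, parents = [q_init], {0: None}
    for _ in range(iters):
        q_rand = rng.uniform(-np.pi, np.pi, size=q_init.shape)
        i_near = min(range(len(nodes)),
                     key=lambda i: np.linalg.norm(nodes[i] - q_rand))
        d = q_rand - nodes[i_near]
        dist = np.linalg.norm(d)
        q_new = nodes[i_near] + min(step, dist) * d / (dist + 1e-9)
        # (A real planner would reject q_new here if it collides.)
        nodes.append(q_new)
        parents[len(nodes) - 1] = i_near
        if visual_distance(q_new, I_goal) < tol:
            path, i = [], len(nodes) - 1   # backtrack to recover the path
            while i is not None:
                path.append(nodes[i])
                i = parents[i]
            return path[::-1]
    return None
```

Note that the tree itself still grows in configuration space; only the success criterion is visual, which is what makes the problem harder than standard RRT with a known q_goal.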
We compare vRRT against Prof. Robot and reference RRT* solutions across three robot platforms. Each video shows the planned trajectory overlaid on the scene. vRRT discovers collision-free paths that closely match the RRT* solutions while operating purely from visual goals.
Franka Emika Panda
UR5e
Fetch
We deploy vRRT on a physical Fetch mobile manipulator to validate sim-to-real transfer. Each video compares the robot execution (left) with the Gaussian Splatting rendering (right) of the planned trajectory. vRRT successfully plans and executes paths in real-world environments, demonstrating effective transfer to physical deployment.
We use an image generation model to create goal images of a Franka robot from natural language prompts, and demonstrate that vRRT plans executable paths to these synthesized targets despite the domain gap.
In practical robotics applications, goals are often demonstrated through videos—for instance, a human operator recording a desired manipulation outcome from multiple angles. To simulate such demonstration scenarios, we validate vRRT on the Panda-3Cam-Azure dataset, which captures real Franka robot configurations from multiple cameras. Given only the video observations, without explicit joint angles, vRRT successfully recovers the demonstrated poses.
@inproceedings{lee2026visualrrt,
title = {Visual-RRT: Finding Paths toward Visual-Goals via Differentiable Rendering},
author = {Sebin Lee and Jumin Lee and Taeyeon Kim and Youngju Na and Woobin Im and Sungeui Yoon},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2026}
}