PRESTO: Fast Motion Planning Using Diffusion Models Based on Key-Configuration Environment Representation
Mingyo Seo¹*, Yoonyoung Cho²*, Yoonchang Sung¹, Peter Stone¹, Yuke Zhu¹†, Beomjoon Kim²†
¹UT Austin, ²KAIST. * Equal contribution. † Equal advising.
IEEE International Conference on Robotics and Automation (ICRA), 2025
We introduce a learning-guided motion planning framework that uses a diffusion model to generate seed trajectories for trajectory optimization. Given a workspace, our method approximates the configuration space (C-space) obstacles through an environment representation consisting of a sparse set of task-related key configurations, which is then used as a conditioning input to the diffusion model. The diffusion model integrates regularization terms that encourage smooth, collision-free trajectories during training, and trajectory optimization refines the generated seed trajectories to correct any colliding segments. Our experimental results demonstrate that high-quality trajectory priors, learned through our C-space-grounded diffusion model, enable the efficient generation of collision-free trajectories in narrow-passage environments, outperforming previous learning- and planning-based baselines.
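To make the idea of a key-configuration environment representation concrete, the sketch below encodes a scene as the collision status of a fixed set of key configurations. It is a toy illustration under stated assumptions, not the released implementation: the planar 2-link arm, the circular obstacles, the binary encoding, and the names `link_points`, `in_collision`, and `key_config_representation` are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Circle = Tuple[float, float, float]  # (center_x, center_y, radius); assumed obstacle model


def link_points(q: Sequence[float], link_len: float = 1.0,
                samples: int = 10) -> List[Tuple[float, float]]:
    """Sample points along a planar 2-link arm with joint angles q = (q1, q2)."""
    x1, y1 = link_len * math.cos(q[0]), link_len * math.sin(q[0])
    x2 = x1 + link_len * math.cos(q[0] + q[1])
    y2 = y1 + link_len * math.sin(q[0] + q[1])
    points = []
    for (ax, ay), (bx, by) in (((0.0, 0.0), (x1, y1)), ((x1, y1), (x2, y2))):
        for i in range(samples + 1):
            s = i / samples
            points.append((ax + s * (bx - ax), ay + s * (by - ay)))
    return points


def in_collision(q: Sequence[float], obstacles: Sequence[Circle]) -> bool:
    """True if any sampled point on the arm lies inside any obstacle."""
    return any(math.hypot(px - cx, py - cy) <= r
               for px, py in link_points(q)
               for cx, cy, r in obstacles)


def key_config_representation(key_configs: Sequence[Sequence[float]],
                              obstacles: Sequence[Circle]) -> List[float]:
    """One entry per key configuration: 1.0 if it collides in this scene, else 0.0."""
    return [1.0 if in_collision(q, obstacles) else 0.0 for q in key_configs]


# Three hand-picked key configurations (assumed, for illustration only).
key_configs = [(0.0, 0.0), (math.pi / 2, 0.0), (math.pi, 0.0)]
obstacles = [(1.5, 0.0, 0.2)]
env_vector = key_config_representation(key_configs, obstacles)
print(env_vector)  # [1.0, 0.0, 0.0]
```

The resulting vector has a fixed length regardless of how many obstacles the workspace contains, which is what makes it convenient as a conditioning input; only the first key configuration, with the arm stretched along the x-axis, intersects the obstacle here.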
Main Results
We evaluate our method in simulation on a motion planning task in which a Franka Emika Panda robot arm traverses a 3-tier shelf with various objects. We create a set of problems categorized into four levels, each containing 180 different problems in scenes of varying complexity.
[Figure; legend: Collision-free / Colliding]
Across all levels, PRESTO consistently outperforms the pure learning algorithms SceneDiffuser and Motion Planning Diffuser (MPD), which lack a key-configuration environment representation and a motion-planning objective, respectively. Compared to Bi-RRT, PRESTO uses diffusion-learned trajectory priors to generate collision-free trajectories more efficiently, especially in narrow passages. Additionally, compared to TrajOpt, an optimization-based method, PRESTO's high-quality initial trajectories lead to faster convergence in complex domains, despite the computational overhead of running the diffusion model.
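The effect of seed quality on convergence can be illustrated with a toy refinement loop. The sketch below is not TrajOpt and not the PRESTO pipeline; it assumes 2D waypoints, one circular obstacle, and a hypothetical `refine` routine that pushes colliding waypoints outward until none remain inside a safety margin. The arc-shaped seed is a hand-made stand-in for a diffusion-generated trajectory.

```python
import math


def refine(seed, obstacle, step=0.05, margin=0.05, max_iters=200):
    """Push interior waypoints out of a circular obstacle.

    Returns the refined trajectory and the number of iterations that moved a waypoint.
    """
    cx, cy, r = obstacle
    traj = list(seed)
    for it in range(max_iters):
        moved = False
        new_traj = [traj[0]]  # start configuration stays fixed
        for x, y in traj[1:-1]:
            dx, dy = x - cx, y - cy
            dist = math.hypot(dx, dy)
            if dist < r + margin:
                if dist == 0.0:
                    dx, dy, dist = 0.0, 1.0, 1.0  # arbitrary escape direction
                x, y = x + step * dx / dist, y + step * dy / dist
                moved = True
            new_traj.append((x, y))
        new_traj.append(traj[-1])  # goal configuration stays fixed
        traj = new_traj
        if not moved:
            return traj, it
    return traj, max_iters


obstacle = (0.5, 0.0, 0.2)
# Naive seed: straight line through the obstacle.
straight_seed = [(i / 10, 0.0) for i in range(11)]
# Stand-in for a learned seed: an arc that already clears the obstacle.
arc_seed = [(i / 10, 0.4 * math.sin(math.pi * i / 10)) for i in range(11)]

_, iters_straight = refine(straight_seed, obstacle)
_, iters_arc = refine(arc_seed, obstacle)
print(iters_straight > iters_arc)  # True
```

In this toy setting the arc seed needs no corrective iterations, while the straight-line seed needs several; the numbers say nothing about the benchmark results and only show why a nearly collision-free seed leaves the optimizer less work.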
Ablation Studies
Compared to PRESTO, Point-Cloud Conditioning shows performance degradation across problem levels and post-processing iterations, with higher collision rates and penetration depths that worsen with complexity. Similarly, Training Without TrajOpt exhibits consistent performance degradation across all levels, though less severe than Point-Cloud Conditioning. This highlights that incorporating motion-planning costs into the training of diffusion models enhances trajectory quality. Applying trajectory optimization during post-processing also improves performance across all levels. Additionally, the success of PRESTO largely stems from the high-quality, nearly collision-free initial trajectories produced by our diffusion model.
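As a rough illustration of the metrics discussed above, the following sketch computes a collision rate and a penetration depth for a single trajectory. The definitions are assumptions chosen for simplicity (collision rate as the fraction of waypoints in collision, penetration depth as the deepest overlap with a circular obstacle) and may differ from those used in the paper; `penetration_depth` and `trajectory_metrics` are hypothetical names.

```python
import math


def penetration_depth(point, obstacles):
    """Deepest penetration of a 2D point into any circular obstacle (0.0 if free)."""
    px, py = point
    return max([0.0] + [r - math.hypot(px - cx, py - cy) for cx, cy, r in obstacles])


def trajectory_metrics(traj, obstacles):
    """Per-trajectory metrics under the assumed definitions described above."""
    depths = [penetration_depth(p, obstacles) for p in traj]
    colliding = sum(d > 0.0 for d in depths)
    return {
        "collision_rate": colliding / len(traj),  # fraction of waypoints in collision
        "max_penetration": max(depths),           # deepest overlap along the trajectory
        "success": colliding == 0,                # collision-free trajectory
    }


obstacles = [(0.5, 0.0, 0.25)]
traj = [(0.0, 0.0), (0.25, 0.0), (0.5, 0.0), (0.75, 0.0), (1.0, 0.0)]
metrics = trajectory_metrics(traj, obstacles)
print(metrics)  # {'collision_rate': 0.2, 'max_penetration': 0.25, 'success': False}
```

Only the middle waypoint lies strictly inside the obstacle, so one of five waypoints collides; waypoints that merely touch the boundary count as free under this definition.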
In an unconditional diffusion model, test-time guidance constrains trajectories to specific environments and start/goal configurations. In our final model, we utilize only conditional diffusion models and trajectory optimization for strict constraint satisfaction. Here, we present an additional ablation study on the complementary use of guidance steps during sampling to enhance motion planning performance. Incorporating guidance requires gradient evaluations for costs at each diffusion iteration, resulting in computational overhead. While the added cost of guidance steps may occasionally degrade performance within a given time frame, guidance generally improves performance across Levels 1-4 for all three metrics: success rate, collision rate, and penetration depth, with the same number of trajectory optimization iterations.
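The overhead of guidance comes from evaluating a cost gradient inside every sampling iteration. The simplified loop below sketches that structure under heavy assumptions: `toy_denoise` is a hypothetical stand-in for a learned denoiser (it just pulls waypoints toward the straight line from start to goal), there is no noise schedule, and the cost is a hinge penalty on distance to a single circular obstacle.

```python
import math
import random


def collision_cost_grad(traj, obstacle, margin=0.1):
    """Gradient of sum(max(0, r + margin - dist)) with respect to each 2D waypoint."""
    cx, cy, r = obstacle
    grads = []
    for x, y in traj:
        dx, dy = x - cx, y - cy
        dist = math.hypot(dx, dy)
        if 0.0 < dist < r + margin:
            grads.append((-dx / dist, -dy / dist))  # cost falls when moving away from the center
        else:
            grads.append((0.0, 0.0))
    return grads


def toy_denoise(traj, start, goal, strength=0.5):
    """Hypothetical denoiser stand-in: move each waypoint partway toward the start-goal line."""
    n = len(traj) - 1
    out = []
    for i, (x, y) in enumerate(traj):
        s = i / n
        tx = start[0] + s * (goal[0] - start[0])
        ty = start[1] + s * (goal[1] - start[1])
        out.append((x + strength * (tx - x), y + strength * (ty - y)))
    return out


def guided_sample(start, goal, obstacle, steps=50, n_points=11,
                  guidance_scale=0.25, seed=0):
    """Alternate a denoising update with a gradient step on the collision cost."""
    rng = random.Random(seed)
    traj = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_points)]
    for _ in range(steps):
        traj = toy_denoise(traj, start, goal)
        grads = collision_cost_grad(traj, obstacle)  # extra gradient evaluation per iteration
        traj = [(x - guidance_scale * gx, y - guidance_scale * gy)
                for (x, y), (gx, gy) in zip(traj, grads)]
        traj[0], traj[-1] = start, goal  # keep the endpoints pinned
    return traj


start, goal, obstacle = (0.0, 0.0), (1.0, 0.0), (0.5, 0.05, 0.2)
traj = guided_sample(start, goal, obstacle)
min_dist = min(math.hypot(x - obstacle[0], y - obstacle[1]) for x, y in traj)
print(min_dist > obstacle[2])  # True
```

Dropping the two guidance lines inside the loop recovers an unguided sampler, which in this toy case converges to the straight line and passes through the obstacle; the guided variant pays for one gradient evaluation per iteration to avoid that.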