Potential Based Diffusion Motion Planning

¹Brown University   ²MIT
ICML 2024

Potential Based Diffusion Motion Planning and Generalization. Our method encodes motion planning constraints into latent vectors and uses the learned potential functions to generate motion plans. We show that our method can directly generalize to cluttered heterogeneous environments by composing potentials, even though the potential functions are trained only on simple homogeneous environments.

Abstract

Effective motion planning in high dimensional spaces is a long-standing open problem in robotics. One class of traditional motion planning algorithms corresponds to potential-based motion planning. An advantage of potential-based motion planning is composability -- different motion constraints can be easily combined by adding the corresponding potentials. However, constructing motion paths from potentials requires solving a global optimization across the configuration-space potential landscape, which is often prone to local minima. We propose a new approach to learning potential-based motion planning, in which we train a neural network to capture an easily optimizable potential over motion planning trajectories. We illustrate the effectiveness of this approach, significantly outperforming both classical and recent learned motion planning approaches while avoiding issues with local minima. We further illustrate its inherent composability, enabling us to generalize to a multitude of different motion constraints.

Illustrative Example of Composing Diffusion Energy Potentials. Our approach learns different potential functions over motion planning trajectories $q_{1:T}$ (orange dashed lines). Different potentials can be combined and optimized to construct new motion plans that avoid the obstacles encoded in both potential functions.


Method

Learning Potentials for Motion Planning. Given a motion plan $q_{1:T}$ from start state $q_{\text{st}}$ to end state $q_{\text{e}}$ and a characterization of the configuration space $C$ ($\textit{i.e.}$, the set of obstacles in the environment), we propose to learn a trajectory-level potential function $U_\theta$ such that
\begin{equation} q_{1:T}^* = \text{arg\,min}_{q_{1:T}} U_\theta(q_{1:T}, q_{\text{st}}, q_{\text{e}}, C), \label{eqn:potential_traj} \end{equation}
where $q_{1:T}^*$ is a successful motion plan from $q_{\text{st}}$ to $q_{\text{e}}$. To learn the potential function above, we propose to learn an EBM across a dataset of solved motion planning problems $D = \{q_{\text{st}}^i, q_{\text{e}}^i, q_{1:T}^i, C^i \}$, where $e^{-E_\theta(q_{1:T}|q_{\text{st}}, q_{\text{e}}, C)} \propto p(q_{1:T}|q_{\text{st}}, q_{\text{e}}, C)$. Since $D$ consists of solved motion planning problems, the learned energy function $E_\theta$ will have minimal energy at successful motion plans $q_{1:T}^*$ and can thus serve as the potential function $U_\theta$.
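
To make this concrete, below is a minimal PyTorch sketch of one possible trajectory-level energy network $E_\theta(q_{1:T}, s, q_{\text{st}}, q_{\text{e}}, C)$. The flattened-trajectory MLP, the fixed-size latent obstacle code, and all layer widths are illustrative assumptions, not the architecture used in the paper.

    import torch
    import torch.nn as nn

    class TrajectoryEnergy(nn.Module):
        # Maps (trajectory, diffusion step, start, goal, obstacle code) to a scalar energy.
        def __init__(self, horizon, q_dim, cond_dim, hidden=256):
            super().__init__()
            in_dim = horizon * q_dim + 2 * q_dim + cond_dim + 1
            self.net = nn.Sequential(
                nn.Linear(in_dim, hidden), nn.SiLU(),
                nn.Linear(hidden, hidden), nn.SiLU(),
                nn.Linear(hidden, 1),  # scalar energy per trajectory
            )

        def forward(self, q, s, q_st, q_e, c):
            # q: (B, T, q_dim); s: (B,); q_st, q_e: (B, q_dim); c: (B, cond_dim)
            x = torch.cat([q.flatten(1), q_st, q_e, c, s[:, None].float()], dim=-1)
            return self.net(x).squeeze(-1)  # (B,)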


Learning an EBM for Motion Plan Generation. To learn an EBM landscape that allows us to effectively optimize and generate motion plans $q_{1:T}^*$, we shape the energy landscape with an energy-based denoising diffusion training objective: the gradient of the energy function is trained to denoise noise-corrupted motion plans $q_{1:T}^i$ from the dataset $D$,
\begin{equation} \mathcal{L}_{\text{MSE}}=\|\epsilon - \nabla_{q_{1:T}} E_\theta(\sqrt{1-\beta_s} q_{1:T}^i + \sqrt{\beta_s} \epsilon, s, q_{\text{st}}^i, q_{\text{e}}^i, C^i)\|^2, \label{eqn:train_obj} \end{equation}
where $\epsilon$ is sampled from Gaussian noise $\mathcal{N}(0, 1)$, $s \in\{1,2,...,S\}$ is the denoising diffusion step (we set $S = 100$), and $\beta_s$ is the corresponding Gaussian noise corruption on a motion planning path $q_{1:T}^i$. We refer to $E_\theta$ as the diffusion potential function.
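
A short sketch of this objective, assuming the TrajectoryEnergy module above; the corruption follows the equation as written, with $\sqrt{1-\beta_s}$ and $\sqrt{\beta_s}$ at step $s$, and the batching details are our assumptions.

    def diffusion_potential_loss(E, q, q_st, q_e, c, betas):
        # q: (B, T, q_dim) clean expert trajectories; betas: (S + 1,) noise schedule.
        B = q.shape[0]
        s = torch.randint(1, len(betas), (B,), device=q.device)  # random diffusion step
        beta_s = betas[s].view(B, 1, 1)
        eps = torch.randn_like(q)  # Gaussian noise to be predicted
        q_noisy = ((1 - beta_s).sqrt() * q + beta_s.sqrt() * eps).requires_grad_(True)
        # The denoiser is the gradient of the scalar energy w.r.t. the noisy trajectory.
        grad = torch.autograd.grad(E(q_noisy, s, q_st, q_e, c).sum(),
                                   q_noisy, create_graph=True)[0]
        return ((eps - grad) ** 2).mean()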


Optimizing and Sampling from the Diffusion Potential Function. To optimize and sample from our diffusion potential function, we initialize a motion path $q_{1:T}^S$ at diffusion step $S$ from Gaussian noise $\mathcal{N}(0, 1)$ and optimize the motion path by following the gradient of the energy function $E_{\theta}$. We iteratively refine the motion path $q_{1:T}^s$ across each diffusion step following
\begin{align} & q_{1:T}^{s-1}=q_{1:T}^{s}-\gamma \epsilon + \xi, \quad \xi \sim \mathcal{N} \bigl(0, \sigma^2_s I \bigr), \notag \\ & \text{where} \: \: \epsilon = \nabla_{q_{1:T}} E_\theta(q_{1:T}, s, q_{\text{st}}, q_{\text{e}}, C). \label{eqn:diffusion_opt} \end{align}
To parameterize the energy function $E_\theta(q_{1:T}, s, q_{\text{st}}, q_{\text{e}}, C)$, we use classifier-free guidance to form a peakier composite energy function conditioned on $C$. $\gamma$ and $\sigma^2_s$ are diffusion-specific scaling constants (a rescaling term at each diffusion step is omitted above for clarity). The final predicted motion path $q_{1:T}^*$ corresponds to the output $q_{1:T}^0$ after running $S$ steps of optimization on the diffusion potential function.
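
The update above can be written as a short sampling loop. This is a minimal sketch under the same assumptions as the earlier snippets; the $\gamma$ and $\sigma_s$ schedules are placeholders, and the per-step rescaling term is omitted as in the text.

    @torch.no_grad()
    def sample_motion_plan(E, q_st, q_e, c, horizon, q_dim, S, gamma, sigmas):
        q = torch.randn(1, horizon, q_dim)  # q^S initialized from Gaussian noise
        for s in range(S, 0, -1):
            with torch.enable_grad():  # gradients w.r.t. the trajectory only
                q_in = q.detach().requires_grad_(True)
                step = torch.full((1,), s)
                eps = torch.autograd.grad(
                    E(q_in, step, q_st, q_e, c).sum(), q_in)[0]
            noise = sigmas[s] * torch.randn_like(q) if s > 1 else 0.0
            q = q - gamma * eps + noise  # the update from the equation above
        return q  # q^0, the final motion plan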


Composing Diffusion Potential Functions. Given two separate diffusion potential functions $E^1_\theta(\cdot)$ and $E^2_\theta(\cdot)$ that encode separate motion planning constraints, we can form a composite potential function $E^{\text{comb}}(\cdot) = E^1_\theta(\cdot) + E^2_\theta(\cdot)$ by directly summing the corresponding potentials. To sample from the new diffusion potential function $E^{\text{comb}}$, we follow
\begin{align} &q_{1:T}^{s-1}=q_{1:T}^{s}-\gamma \epsilon^{\text{comb}}+\xi, \quad \xi \sim \mathcal{N} \bigl(0, \sigma^2_s I \bigr), \label{eqn:diffusion_opt_comb} \\ & \text{where} \: \: \epsilon^{\text{comb}} = \nabla_{q_{1:T}} (E_\theta^{1}(q_{1:T}, s, q_{\text{st}}, q_{\text{e}}, C_1) + E_\theta^{2}(q_{1:T}, s, q_{\text{st}}, q_{\text{e}}, C_2)). \notag \end{align}
This potential function $E^{\text{comb}}$ has low energy precisely at motion planning paths $q_{1:T}$ that satisfy both constraints, and sampling from it corresponds to optimizing the composite potential. Hence, by iteratively sampling from the composite potential function, we obtain motion plans that satisfy both constraints. Please see our compositional results below.
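
In code, composition reduces to summing the two energy gradients inside the same sampling loop. Below is a sketch under the assumptions of the earlier snippets; it would replace the single-potential gradient in sample_motion_plan above.

    def composed_grad(E1, E2, q, s, q_st, q_e, c1, c2):
        # eps_comb: gradient of the summed potentials, each with its own obstacle code.
        with torch.enable_grad():
            q_in = q.detach().requires_grad_(True)
            total = (E1(q_in, s, q_st, q_e, c1) + E2(q_in, s, q_st, q_e, c2)).sum()
            return torch.autograd.grad(total, q_in)[0]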

Trajectory Denoising Process using Diffusion Potential Function

Trajectory Denoising Process on Maze2D. Motion trajectories are randomly initialized from a Gaussian distribution at timestep $s = S = 100$. Noise is iteratively removed via the gradient of the energy function, and a feasible collision-free trajectory is obtained at timestep $s = 0$. We use DDIM to accelerate sampling.
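
For reference, a generic DDIM-style deterministic update that such acceleration could use is sketched below; the cumulative schedule alpha_bar and its relation to the $\beta_s$ corruption above are assumptions, not the paper's exact settings.

    def ddim_step(q, eps, s, s_prev, alpha_bar):
        # Deterministic jump from step s to an earlier step s_prev.
        a_s, a_prev = alpha_bar[s], alpha_bar[s_prev]
        q0_hat = (q - (1 - a_s).sqrt() * eps) / a_s.sqrt()  # predicted clean plan
        return a_prev.sqrt() * q0_hat + (1 - a_prev).sqrt() * eps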


Motion Plans on Concave Obstacles

[Figure: panels (a)–(d), each comparing motion plans from RMP, RRT*, BIT*, MPNet, and Ours.]

Qualitative Comparison of Motion Plans on Maze2D with Concave Obstacles. The green star indicates the start pose and the red star indicates the goal pose. RMP tends to get stuck in local minima and fails to reach the goals; RRT* is slow and may hit the planning time limit on difficult problems; trajectories from MPNet occasionally pass through obstacles; our method generates smooth, low-cost, collision-free trajectories while being up to 10 times faster than BIT*.

Motion Plans on KUKA

[Figure: panels (a)–(d), each comparing motion plans from Ours and MπNet.]

Qualitative Comparison of Motion Plans on KUKA. We present motion plans from our diffusion potential planner and MπNet. The large green/pink ball indicates the start/goal state. Our method (left column) generates smooth and near-optimal trajectories, while the trajectories of MπNet may take a longer route, get stuck in local regions, fail to pass through narrow passages, or hover near the start state.

Diverse Motion Plan Morphology

[Videos: Trajectory 1 | Trajectory 2.]
Stochastic Denoising Process Enables Diverse Motion Path Morphology. Trajectory 1 and Trajectory 2 are generated under the same constraints (e.g., start state, goal state, obstacle locations) and planning horizon, but with different denoising random seeds.

Maze2D 6+6 Obstacles

[Figure: motion plans under different planning horizons.]
Different Planning Horizons Enable Diverse Motion Path Morphology. Left: motion planning problem setup; the green/pink star indicates the start/goal state. Right: motion paths selected under different planning horizons.

Generalization via Composing Potentials

Our method can generalize to more difficult out-of-distribution environments by sampling from the composite diffusion potential function. We demonstrate qualitative results where we compose potentials of the same obstacle type, of different obstacle types, and of static and dynamic obstacles. Please see our paper for more comprehensive results.

[Figure panels: 6, 6+1, 6+2, 6+3, 6+4, 6+5, 6+6 obstacles.]
Compositional Generalization over an Increasing Number of Obstacles. The green/pink star indicates the start/goal state. Our planner is trained on a dataset with only 6 obstacles and generalizes to scenarios with more obstacles by directly composing potentials at test time (without any re-training). Though trained on environments with fewer obstacles, the generated trajectories still reach the goals via near-optimal paths with varied morphologies.


[Figure panels: 3, 3+1, 3+2, 3+3, 3+4, 3+5, 3+6 obstacles.]
[Figure panels: 6, 6+1, 6+2, 6+3, 6+4, 6+5, 6+6 obstacles.]
Compositional Generalization over Different Obstacles. Two separately trained models are composed: one trained only on large blocks (blue) of size $1.4 \times 1.4$ and one trained only on smaller blocks (orange) of size $1 \times 1$. By composing the two models, our planner generalizes at test time (without any re-training) to more complex scenes containing both obstacle types. The generated trajectories exhibit varied morphologies and reach the goals via smooth, short paths.


[Videos: 6 static + 1 dynamic | 3 static + 1 dynamic.]
Compositional Generalization over Static and Dynamic Obstacles. The current position of the agent is shown as the pink asterisk. The gray line indicates the trajectory of the dynamic obstacle (the large blue block). Note that our method is trained only on environments consisting purely of static obstacles or purely of dynamic obstacles. For example, for the left two videos, we compose a model trained on 6 static obstacles with a model trained on 1 dynamic obstacle and evaluate on environments containing both. The composite planner generates feasible motion plans whose trajectories do not head straight to the targets but instead detour and wait to avoid the moving obstacle.

BibTeX

@inproceedings{
    luo2024potential,
    title={Potential Based Diffusion Motion Planning},
    author={Yunhao Luo and Chen Sun and Joshua B. Tenenbaum and Yilun Du},
    booktitle={Forty-first International Conference on Machine Learning},
    year={2024},
    url={https://openreview.net/forum?id=Qb68Rs0p9f},
}

Related Projects

Check out a list of our related papers on compositional generation and energy based models. A full list can be found here!

We propose new samplers, inspired by MCMC, to enable successful compositional generation. Further, we propose an energy-based parameterization of diffusion models which enables the use of new compositional operators and more sophisticated, Metropolis-corrected samplers.

We present a method to compose different diffusion models together, drawing on the close connection between diffusion models and EBMs. We illustrate how compositional operators enable composing multiple sets of objects as well as generating images subject to complex text prompts.

Diffuser is a denoising diffusion probabilistic model that plans by iteratively refining randomly sampled noise. The denoising process lends itself to flexible conditioning, by either using gradients of an objective function to bias plans toward high-reward regions or conditioning the plan to reach a specified goal.