Learning Iterative Reasoning through Energy Diffusion

MIT CSAIL



ICML 2024

Abstract

We introduce iterative reasoning through energy diffusion (IRED), a novel framework for learning to reason across a variety of tasks by formulating reasoning and decision-making problems as energy-based optimization. IRED learns energy functions to represent the constraints between input conditions and desired outputs. After training, IRED adapts the number of optimization steps during inference based on problem difficulty, enabling it to solve problems outside its training distribution, such as more complex Sudoku puzzles, matrix completion with large value magnitudes, and path finding in larger graphs. Key to our method's success are two novel techniques: learning a sequence of annealed energy landscapes for easier inference, and combining score function and energy landscape supervision for faster and more stable training. Our experiments show that IRED outperforms existing methods in continuous-space reasoning, discrete-space reasoning, and planning tasks, particularly in more challenging scenarios.
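
At inference time, IRED produces an answer by running gradient-based optimization over a sequence of learned energy landscapes, from the smoothest to the sharpest. The sketch below is a minimal illustration of that loop, not the released implementation: the names (ired_inference, energy_fn, steps_per_landscape) and the toy quadratic energy are placeholders we introduce for exposition. The number of steps run per landscape is the knob that lets IRED spend more compute on harder problems.

import torch

def ired_inference(energy_fn, x, y_init, num_landscapes=5,
                   steps_per_landscape=10, step_size=0.1):
    """Refine a candidate answer y by descending a sequence of annealed
    energy landscapes, from the smoothest (k large) to the sharpest (k = 0)."""
    y = y_init.clone()
    for k in reversed(range(num_landscapes)):
        for _ in range(steps_per_landscape):   # more steps = more test-time compute
            y = y.detach().requires_grad_(True)
            energy = energy_fn(x, y, k).sum()
            grad, = torch.autograd.grad(energy, y)
            y = y - step_size * grad           # gradient step toward lower energy
    return y.detach()

# Toy usage: a hand-written energy whose minimum is y = 2 * x stands in for
# the learned energy network; k controls how flat or sharp the landscape is.
toy_energy = lambda x, y, k: (0.5 ** k) * ((y - 2 * x) ** 2).sum(dim=-1)
x = torch.randn(4, 8)
y_hat = ired_inference(toy_energy, x, torch.zeros(4, 8))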

Reasoning as Optimizing a Sequence of Energy Landscapes. Our approach formulates reasoning as iteratively optimizing a sequence of learned energy functions. Energy functions are trained with a combination of score function supervision and contrastive energy landscape supervision.
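
The training signal can be pictured with the hedged sketch below. It is illustrative only, with made-up network sizes and variable names (EnergyNet, training_loss, sigmas): the score term asks the gradient of the energy to denoise a corrupted ground-truth answer at a sampled noise level, while the contrastive term asks the ground-truth answer to receive lower energy than a negative sample produced by the model itself.

import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    """Tiny stand-in for a learned energy E(x, y, k); not the paper's architecture."""
    def __init__(self, dim_x, dim_y, num_levels, hidden=128):
        super().__init__()
        self.embed_k = nn.Embedding(num_levels, 16)
        self.mlp = nn.Sequential(
            nn.Linear(dim_x + dim_y + 16, hidden), nn.SiLU(), nn.Linear(hidden, 1))

    def forward(self, x, y, k):
        return self.mlp(torch.cat([x, y, self.embed_k(k)], dim=-1)).squeeze(-1)

def training_loss(energy_net, x, y_true, sigmas):
    """Combine score-function supervision with contrastive energy-landscape supervision."""
    k = torch.randint(len(sigmas), (x.shape[0],))      # sample a noise level per example
    sigma = sigmas[k].unsqueeze(-1)
    y_noisy = (y_true + sigma * torch.randn_like(y_true)).requires_grad_(True)

    energy = energy_net(x, y_noisy, k).sum()
    grad, = torch.autograd.grad(energy, y_noisy, create_graph=True)

    # (i) Score supervision: the energy gradient should point from the noisy
    #     sample back toward the ground-truth answer.
    score_loss = ((grad - (y_noisy - y_true) / sigma ** 2) ** 2).mean()

    # (ii) Contrastive landscape supervision: a one-step denoised sample acts as
    #      the negative; the true answer should be assigned lower energy.
    y_neg = (y_noisy - sigma ** 2 * grad).detach()
    e_pos = energy_net(x, y_true, k)
    e_neg = energy_net(x, y_neg, k)
    contrast_loss = -torch.log(torch.sigmoid(e_neg - e_pos) + 1e-6).mean()
    return score_loss + contrast_loss

# Toy usage with random data standing in for a reasoning dataset.
net = EnergyNet(dim_x=8, dim_y=8, num_levels=5)
x, y_true = torch.randn(32, 8), torch.randn(32, 8)
loss = training_loss(net, x, y_true, sigmas=torch.linspace(0.1, 1.0, 5))
loss.backward()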

Continuous Algorithmic Reasoning

Our approach can solve a set of continuous algorithmic tasks such as matrix addition, matrix inversion, and matrix completion. Below, we illustrate the error map of our approach's predictions on the matrix inverse task as we iteratively optimize each energy landscape.



Error Map Across Landscape Optimization. Illustration of prediction error on the matrix inverse task as predictions are optimized. Optimization results at later energy landscapes have lower prediction error.
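
For intuition about why matrix inversion fits this framework, note that the inverse is itself the minimizer of an explicit energy ||A Y - I||^2. The snippet below descends that hand-written energy by gradient steps; it is only an analogy we add here (IRED learns its energy from input/output pairs rather than using this formula), but the descent loop mirrors the inference-time optimization whose intermediate errors are visualized above.

import torch

def inverse_by_energy_descent(A, steps=300, step_size=0.05):
    """Recover the inverse of A by minimizing the constraint violation ||A @ Y - I||^2."""
    I = torch.eye(A.shape[0])
    Y = torch.zeros_like(A)
    for _ in range(steps):
        Y = Y.detach().requires_grad_(True)
        energy = ((A @ Y - I) ** 2).sum()        # how badly Y violates A @ Y = I
        grad, = torch.autograd.grad(energy, Y)
        Y = Y - step_size * grad                 # gradient step toward lower energy
    return Y.detach()

A = 2 * torch.eye(5) + 0.1 * torch.randn(5, 5)   # well-conditioned example input
Y = inverse_by_energy_descent(A)
print((Y - torch.inverse(A)).abs().max())        # per-entry error is near zero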

Discrete-Space Reasoning

Our approach can also solve discrete reasoning tasks such as Sudoku. Below, we illustrate the predicted solutions to the Sudoku problem as we run optimization over additional energy landscapes.



Predicted Sudoku Solutions Across Energy Landscapes. Illustration of predicted Sudoku boards across optimized energy landscapes. Incorrect entries are highlighted in red. At later landscapes, the predicted boards become more accurate.
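
To apply continuous energy optimization to a discrete board, the solution can be relaxed into a per-cell distribution over the nine digits and decoded back with an argmax once optimization finishes. The snippet below is a hedged sketch of this encoding and decoding convention only; the tensor layout and helper names are our own, not the released code, and the energy network itself is omitted.

import torch

def decode_board(y):
    """Map a relaxed (9, 9, 9) solution tensor to a discrete board of digits 1-9."""
    return y.argmax(dim=-1) + 1

def encode_clues(board):
    """One-hot encode given clues (0 = empty cell) as the conditioning input x."""
    x = torch.zeros(9, 9, 9)
    given = board > 0
    x[given] = torch.nn.functional.one_hot(board[given] - 1, num_classes=9).float()
    return x

board = torch.zeros(9, 9, dtype=torch.long)
board[0, 0], board[4, 4] = 5, 3                  # a couple of example clues
x = encode_clues(board)                          # conditioning input for the energy net
y = torch.randn(9, 9, 9)                         # relaxed candidate solution to be optimized
print(decode_board(y).shape)                     # (9, 9) board of digits after decoding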

When generalizing to harder variants of Sudoku (with fewer given entries), we can leverage additional test-time optimization iterations to obtain a more accurate final answer.


Generalization Performance vs. Per-Landscape Optimization Steps. By increasing the number of optimization steps run at each energy landscape, IRED leverages additional test-time computation to improve final performance on harder variants of Sudoku.
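
The self-contained toy below (not the paper's model; the quadratic energy, step size, and step counts are ours) illustrates the same knob: with the energy fixed, running more gradient steps at inference time drives the remaining error down, which is how IRED converts extra test-time compute into accuracy on harder puzzles.

import torch

target = torch.randn(16)
energy = lambda y: ((y - target) ** 2).sum()     # toy energy with a known minimum

for steps in (5, 20, 80):                        # steps run on this single landscape
    y = torch.zeros(16)
    for _ in range(steps):
        y = y.detach().requires_grad_(True)
        grad, = torch.autograd.grad(energy(y), y)
        y = (y - 0.05 * grad).detach()
    print(steps, (y - target).norm().item())     # error shrinks as steps grow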

Planning

Finally, our approach can also solve planning problems. Below, we illustrate how IRED finds the correct next action to take in a graph when navigating from a start node (green) to a goal node (red).



Optimized Plans Across Landscapes. Plot of next-action predictions in plans across energy landscapes. In each visualization, the green and red nodes indicate the start and goal nodes, with edges between nodes indicated by arrows. The darkness of a node indicates the score for selecting it as the next node to move to in the predicted plan. As landscapes are sequentially optimized, the correct next action is selected.
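
The node scores in these plots can be thought of as energies turned into selection probabilities. The snippet below is a hedged sketch of that readout, not the actual planning model: the function names, the toy distance-to-goal energy, and the candidate list are all illustrative. Each candidate next node is scored by the energy, and lower energy corresponds to a darker, more preferred node.

import torch

def select_next_node(energy_fn, graph_state, candidate_nodes):
    """Score each candidate next node by its energy; lower energy = better action."""
    energies = torch.stack([energy_fn(graph_state, node) for node in candidate_nodes])
    scores = torch.softmax(-energies, dim=0)     # analogue of node darkness in the figure
    return candidate_nodes[int(scores.argmax())], scores

# Toy usage with a hand-written energy that prefers the node closest to the goal.
goal = torch.tensor([1.0, 1.0])
toy_energy = lambda state, node: ((node - goal) ** 2).sum()
candidates = [torch.tensor([0.0, 0.0]), torch.tensor([0.5, 1.0]), torch.tensor([1.0, 0.9])]
best, scores = select_next_node(toy_energy, None, candidates)
print(best, scores)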

Related Projects

Check out our related papers on compositional generation and energy-based models below. A full list can be found here!

We propose energy optimization as an approach to add iterative reasoning into neural networks. We illustrate how this procedure enables generalization to harder problem instances unseen at training time on continuous, discrete, and image processing tasks.

We propose new samplers, inspired by MCMC, to enable successful compositional generation. Further, we propose an energy-based parameterization of diffusion models which enables the use of new compositional operators and more sophisticated, Metropolis-corrected samplers.

BibTeX

@InProceedings{Du_2024_ICML,
    author    = {Du, Yilun and Mao, Jiayuan and Tenenbaum, Joshua B.},
    title     = {Learning Iterative Reasoning through Energy Diffusion},
    booktitle = {International Conference on Machine Learning (ICML)},
    year      = {2024}
}