Optimizing Functionals on the Space of Probabilities with Input Convex Neural Networks
Abstract: Gradient flows are a powerful tool for optimizing functionals in general metric spaces, including the space of probabilities endowed with the Wasserstein metric. A typical approach to solving this optimization problem relies on its connection to the dynamic formulation of optimal transport and the celebrated Jordan-Kinderlehrer-Otto (JKO) scheme. However, this formulation involves optimization over convex functions, which is challenging, especially in high dimensions. In this work, we propose an approach that uses the recently introduced input-convex neural networks (ICNN) to parametrize the space of convex functions, both to approximate the JKO scheme and to design functionals over measures that enjoy convergence guarantees. We derive a computationally efficient implementation of this JKO-ICNN framework and experimentally demonstrate its feasibility and validity in approximating solutions of low-dimensional partial differential equations with known solutions. We also demonstrate its viability in high-dimensional applications through an experiment in controlled generation for molecular discovery.
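For orientation, the JKO step mentioned in the abstract can be sketched in its standard textbook form. The notation is an assumption and may differ from the paper's: $F$ is the functional being minimized, $\tau > 0$ the step size, $\rho_k^\tau$ the current iterate, $W_2$ the 2-Wasserstein distance, and $\psi$ a convex potential.

$$
\rho_{k+1}^{\tau} \in \operatorname*{arg\,min}_{\rho \in \mathcal{P}_2(\mathbb{R}^d)} \; F(\rho) + \frac{1}{2\tau} W_2^2\left(\rho, \rho_k^{\tau}\right)
$$

When $\rho_k^\tau$ is absolutely continuous, Brenier's theorem lets the minimization be rewritten over convex potentials, with $\rho = (\nabla \psi)_{\#} \rho_k^{\tau}$:

$$
\psi_{k+1} \in \operatorname*{arg\,min}_{\psi \ \text{convex}} \; F\left((\nabla \psi)_{\#} \rho_k^{\tau}\right) + \frac{1}{2\tau} \int \left\| \nabla \psi(x) - x \right\|^2 \, d\rho_k^{\tau}(x), \qquad \rho_{k+1}^{\tau} = (\nabla \psi_{k+1})_{\#} \rho_k^{\tau}
$$

The second form is the one in which a parametrized family of convex functions, such as an ICNN, can stand in for $\psi$.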
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission:

**Minor revision request:**

Dear Action editor,

Thank you very much for overseeing our paper and for the constructive feedback. We have revised Appendix D according to your recommendation: We changed SGLD throughout to ULA.

Thank you again.

Best regards,
Authors

---------

Dear reviewers,

We wanted to thank you again for all the helpful comments and thoughtful reviews. We have just posted an updated version of our manuscript addressing the clarifications and changes requested in your reviews. Below is a summary of the changes we made:

1. **Added requested references to our related works section**
2. **Clarifications and additional comments**
   - Clarified that warmstart **was used** in solving PDEs with known solutions (Section 5) and was **not** used in the molecular discovery experiments (Section 6).
   - Clarified that JKO-ICNN is more general than other generative modeling approaches because, as reviewer 83vF suggests, “any loss can basically be plugged in and minimized.”
   - Commented that convex potential flows can also be used even “when we only have access to the target up to a constant (using the forward KL instead of the reverse KL).” (83vF)
   - Added discussion of Hutchinson estimator limitations (x751)
3. **Provided additional details to the Molecule experiment:**
   - Moved more of the text explaining the direct optimization baseline, specifically its formal definition, from the appendix to the main paper’s experimental results sections.
   - Moved details from Appendix E regarding the divergence D to Section 6.
   - Clarified how the Divergence term fits into the theoretical framework (x751)
   - Cleaned up Table 6 in Appendix to make it more readable
4. **Fixed Typos**
   - In Section 4, after quotation of Salim et al, "show that in enjoys" --> “in” should be “it”
   - Fix typo in Appendix D, in the recursion $X_K$ should be $X_k$
Assigned Action Editor: ~Arnaud_Doucet2
Submission Number: 54