
Convex Liftings for Neural Networks

Aaron Mishkin

One of the key themes in modern optimization research is the boundary between convex and non-convex optimization. While convex optimization problems can often be solved efficiently, most non-convex problems are provably hard. Unfortunately, many crucial problems are non-convex, including training the neural networks that have been key to the rapid advances in machine learning over the last decade. This project breaks the barrier between convex and non-convex optimization by "lifting" non-convex problems into high-dimensional spaces where they become convex. While liftings are known for simple neural networks, more complex architectures, like those powering ChatGPT and Midjourney, remain unexplored. Since lifted networks can be faster and more stable to train than their non-convex counterparts, developing new liftings has the potential to accelerate optimization for many problems. It can also shed new light on existing algorithms. For example, Adam is one of the most-cited ideas in computer science, but the reasons behind its success are not well understood. Lifting Adam into the space of convex neural networks provides a new and powerful approach to studying this extremely popular algorithm.
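To make the idea of a lifting concrete, the sketch below illustrates one known lifting for a simple architecture: the convex reformulation of a two-layer ReLU network (Pilanci and Ergen, 2020), in which the non-convex training problem is rewritten as a convex program over variables indexed by ReLU activation patterns. The toy data, the sampled activation patterns, and the regularization strength beta are illustrative assumptions, not part of this project; the cvxpy model is a minimal sketch rather than a definitive implementation.

import numpy as np
import cvxpy as cp

# Toy regression data (illustrative assumption, not from the project).
rng = np.random.default_rng(0)
n, d = 20, 3
X = rng.standard_normal((n, d))
y = np.sign(rng.standard_normal(n))

# Sample a subset of ReLU activation patterns D_i = diag(1[X g >= 0]).
# Exhaustive enumeration is possible for small problems; sampling is a common heuristic.
patterns = {tuple((X @ rng.standard_normal(d) >= 0).astype(int)) for _ in range(50)}
patterns = [np.array(p) for p in patterns]

beta = 1e-3  # weight-decay strength (assumed for illustration)
V = [cp.Variable(d) for _ in patterns]
U = [cp.Variable(d) for _ in patterns]

# Lifted prediction: sum_i D_i X (v_i - u_i), which is affine in the lifted variables.
pred = sum(cp.multiply(p, X @ (v - u)) for p, v, u in zip(patterns, V, U))

# Constraints force each pair (v_i, u_i) to respect its activation pattern.
cons = []
for p, v, u in zip(patterns, V, U):
    S = np.diag(2 * p - 1)
    cons += [S @ X @ v >= 0, S @ X @ u >= 0]

# Squared loss plus group-sparsity regularization: a convex problem.
obj = cp.Minimize(0.5 * cp.sum_squares(pred - y)
                  + beta * sum(cp.norm(v, 2) + cp.norm(u, 2) for v, u in zip(V, U)))
prob = cp.Problem(obj, cons)
prob.solve()
print("optimal value of the lifted convex problem:", prob.value)

Because the lifted problem is convex, it can be solved to global optimality with off-the-shelf solvers, in contrast to the original non-convex training problem; the project aims to extend this kind of reformulation to more complex architectures.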


 

Academic Year: 2024-2025