\section{Conclusions}
\label{sec:conclusions}

Understanding the inductive bias of neural networks for learning DNFs is an important and difficult theoretical challenge. In this work, we focused on learning read-once DNFs under the uniform distribution with a convex network. We empirically observed that GF converges to solutions that align with the DNF terms. We then proved that GF cannot converge to solutions that memorize the training points, even though such solutions minimize the training loss. We additionally proved that, under certain assumptions, the minimum-norm solutions are those that align with the terms of the DNF.
Together with recent results which show that GF is biased towards minimization of norms, this corroborates our empirical findings.

Our work has several limitations, which stem mainly from the fact that analyzing nonlinear networks with nonlinear data is a major challenge:
\begin{enumerate} 
    \item We consider only the uniform distribution and read-once DNFs. We also restrict our analysis to convex networks, although our empirical results suggest that convex networks may be preferable to standard ones in our setting.
    \item We do not show an end-to-end convergence of GF to DNF recovery solutions.
    \item We do not address the sample complexity of GF in our setting (however, intuitively, an inductive bias towards alignment will improve sample complexity, since alignment reduces the effective number of parameters the network uses).
\end{enumerate}

Our work opens up several directions for future work. For example, it would be interesting to understand whether DNF recovery is possible for other distributions and for DNFs that are not read-once. Another direction is to understand the sample complexity of neural networks for learning DNFs and how it relates to DNF recovery. Finally, it would be interesting to understand how the learning dynamics of neural networks relate to other algorithms for learning DNFs.

\section*{Acknowledgments} This project was funded by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant ERC HOLI 819080). AB is supported by a Google PhD fellowship.
\clearpage