Neural networks were originally introduced in 1943 by McCulloch and Pitts as an approach to developing learning algorithms by mimicking the human brain. The major goal at that time was to introduce a sound theory of artificial intelligence. However, the limited amount of available data and the lack of high-performance computers made the training of deep neural networks, i.e., networks with many layers, infeasible.
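For concreteness, the 1943 McCulloch-Pitts model is simply a binary threshold unit: it fires exactly when the weighted sum of its binary inputs reaches a threshold. The following minimal Python sketch (the function name and parameters are illustrative, not from any particular library) shows how one such unit realizes the logical AND and OR functions by choice of threshold alone:

```python
def mcculloch_pitts_neuron(inputs, weights, threshold):
    """Binary threshold unit: returns 1 (fires) iff the weighted
    sum of the binary inputs reaches the threshold."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= threshold else 0

# With unit weights, the threshold alone selects the logical function:
for x1 in (0, 1):
    for x2 in (0, 1):
        and_out = mcculloch_pitts_neuron((x1, x2), (1, 1), threshold=2)
        or_out = mcculloch_pitts_neuron((x1, x2), (1, 1), threshold=1)
        print(f"x = ({x1}, {x2}):  AND = {and_out},  OR = {or_out}")
```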

Nowadays, massive amounts of training data are available, complemented by tremendously increased computing power, allowing us for the first time to apply deep learning algorithms in practice. It is for this reason that deep neural networks have recently seen an impressive comeback. Spectacular applications of deep learning include AlphaGo, which for the first time enabled a computer to beat the top world players in the game of Go (which is far more complex than chess), and the speech recognition systems available on every smartphone these days.

Despite the outstanding success of deep learning in real-world applications, most of the related research is empirically driven, and a mathematical foundation is almost completely missing. At the same time, these methods have already shown their impressive potential in mathematical research areas such as imaging sciences, inverse problems, and the numerical analysis of partial differential equations. In many situations, deep-learning-based strategies have even outperformed traditional mathematical approaches that were previously considered the state of the art for particular problem classes.

Recently, various researchers from the mathematical community have decided to contribute to developing a thorough mathematical foundation of deep learning, approaching the problem from diverse angles and thereby joining the theoretical computer scientists who also focus on these questions. The beauty of this research area lies in its importance for both mathematics and society, as well as in the richness of the techniques involved, ranging from applied harmonic analysis and approximation theory through optimization methods to statistical learning theory.

Some key issues in this realm are the following:

  • Why are deep neural networks oftentimes much more effective than shallow ones?
  • How many training samples does one need to achieve a certain accuracy in a specific model situation?
  • How sparsely connected can the network be?
  • Can one give a meaning to filters in different layers?
  • Which classes of functions can be efficiently approximated?
  • Are there provably better algorithms than regular gradient descent (sketched after this list)?
  • What are problem classes for which neural networks are not well suited?
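To fix terminology for the gradient descent question above: "regular gradient descent" refers to the plain iteration that repeatedly moves the parameters a small step in the direction of the negative gradient of the loss. Here is a minimal, self-contained sketch; the quadratic loss, step size, and function names are illustrative choices, not taken from the text:

```python
import numpy as np

def gradient_descent(grad, theta0, step_size=0.1, num_steps=100):
    """Plain gradient descent: theta <- theta - step_size * grad(theta)."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(num_steps):
        theta = theta - step_size * grad(theta)
    return theta

# Illustrative example: minimize L(theta) = ||theta - b||^2 / 2,
# whose gradient is theta - b, so the iterates converge to b.
b = np.array([1.0, -2.0])
theta_star = gradient_descent(lambda t: t - b, theta0=np.zeros(2))
print(theta_star)  # approximately [1.0, -2.0]
```

The open question is whether, for the nonconvex losses arising in deep learning, one can prove that alternative training algorithms outperform this baseline.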

The intention of this website is to support research on the mathematics of deep learning and to provide a platform for dissemination and discussion of recent results.

- Gitta Kutyniok
