Felix Voigtlaender - Neural networks for classification problems with boundaries of Barron class

December 7, 2020
429 views

Presentation given by Felix Voigtlaender on December 2nd at the One World Seminar on the Mathematics of Machine Learning, on the topic "Neural network performance for classification problems with boundaries of Barron class".
Abstract: We study classification problems in which the distances between the different classes are not necessarily positive, but for which the boundaries between the classes are well-behaved. More precisely, we assume these boundaries to be locally described by graphs of functions of Barron class. ReLU neural networks can approximate and estimate classification functions of this type with rates independent of the ambient dimension. More formally, three-layer networks with $N$ neurons can approximate such functions with $L^1$-error bounded by $O(N^{-1/2})$. Furthermore, given $m$ training samples from such a function, and using ReLU networks of a suitable architecture as the hypothesis space, any empirical risk minimizer has generalization error bounded by $O(m^{-1/4})$. All implied constants depend only polynomially on the input dimension. We also discuss the optimality of these rates. Our results mostly rely on the "Fourier-analytic" Barron spaces that consist of functions with finite first Fourier moment. But since several different function spaces have been dubbed "Barron spaces" in the recent literature, we discuss how these spaces relate to each other. We will see that they differ more than the existing literature suggests.
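For context, the "Fourier-analytic" Barron space mentioned in the abstract is commonly defined via the first Fourier moment. A sketch of the standard definition (following Barron's classical formulation; the talk's precise normalization may differ):

```latex
% A function f : \mathbb{R}^d \to \mathbb{R} with Fourier transform \hat{f}
% is of (Fourier-analytic) Barron class if its first Fourier moment is finite:
\| f \|_{\mathcal{B}} \;=\; \int_{\mathbb{R}^d} |\xi| \, \bigl|\hat{f}(\xi)\bigr| \, d\xi \;<\; \infty .
```

The abstract's boundary assumption is that each class boundary is locally the graph of such a function, which is what allows approximation rates independent of the ambient dimension.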

Пікірлер
  • You can swap what is adjustable in deep neural networks. You can have fixed dot products (enacted with fast transforms) and adjustable (parametric) activation functions. The fast Walsh Hadamard transform is ideal. See Fast Transform fixed-filter-bank neural networks.
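The idea in the comment above can be sketched in a few lines: a minimal, illustrative NumPy implementation (not the commenter's actual code) of a fast Walsh–Hadamard transform used as a fixed "dot product" layer, followed by per-coordinate parametric activations whose slopes are the trainable parameters. The function names `fwht` and `parametric_layer` are my own for illustration.

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform: a fixed O(n log n) 'dot product'
    layer. Input length must be a power of two."""
    x = np.asarray(x, dtype=float).copy()
    h, n = 1, len(x)
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b  # butterfly step
        h *= 2
    return x

def parametric_layer(x, slopes_pos, slopes_neg):
    """Fixed transform followed by adjustable per-coordinate activations:
    the trainable parameters are the activation slopes, not weight matrices."""
    h = fwht(x)
    return np.where(h > 0, slopes_pos * h, slopes_neg * h)
```

Note that applying `fwht` twice returns `n` times the input (the Walsh–Hadamard matrix is self-inverse up to scaling), which is a quick correctness check.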

    @nguyenngocly1484 · 3 years ago
  • ReLU is a switch. f(x)=x is connect. f(x)=0 is disconnect. A ReLU net is a switched composition of dot products. If all the switch states become known the net collapses to a simple matrix, upon which you can apply various metrics if you are curious. Also never forget the variance equation for linear combinations of random variables applies to the dot product (to test noise sensitivity etc.)

    @nguyenngocly1484 · 3 years ago