$\vec{w}h\alpha\mathfrak{t}\;\; i\mathbb{S}\ldots$


Shpresim Sadiku (TU Berlin, Zuse Institut Berlin)
TU Berlin, EW 201 (Physics Building)
In addition, the talk will be live-streamed via Zoom; the link was sent out with the email announcements of this talk.
About what?

Deep Neural Networks (DNNs) are compositions of several vector-valued functions. To train a DNN, one must compute the gradient of the error function with respect to all parameters. Because the error function of a DNN is composed of several nonlinear functions, each with numerous parameters, this computation is not trivial. We revisit the Backpropagation (BP) algorithm, widely used by practitioners to train DNNs. By leveraging the composite structure of DNNs, we show that the BP algorithm computes the gradient efficiently and that the number of layers in the network does not significantly impact the complexity of the computation.
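For readers who want a concrete picture before the talk, here is a minimal sketch of backpropagation in plain Python. It is not taken from the talk: the fully connected architecture, the tanh activations, the squared-error function, and all names (`forward`, `loss`, `backprop`, `sizes`) are assumptions chosen for illustration. The sketch runs one forward pass that stores each layer's activations and one backward pass that reuses them, then checks a single gradient entry against a central finite difference.

```python
import math
import random


def forward(weights, biases, x):
    """Return the activations of every layer, starting with the input x."""
    activations = [x]
    for W, b in zip(weights, biases):
        prev = activations[-1]
        z = [sum(w_ij * p for w_ij, p in zip(row, prev)) + b_i
             for row, b_i in zip(W, b)]
        activations.append([math.tanh(z_i) for z_i in z])
    return activations


def loss(weights, biases, x, y):
    """Squared error 0.5 * ||f(x) - y||^2 (an assumed error function)."""
    out = forward(weights, biases, x)[-1]
    return 0.5 * sum((o - t) ** 2 for o, t in zip(out, y))


def backprop(weights, biases, x, y):
    """One forward and one backward pass; returns (grad_W, grad_b)."""
    activations = forward(weights, biases, x)
    out = activations[-1]
    # Error signal at the output layer: (a - y) * tanh'(z), with tanh'(z) = 1 - a^2.
    delta = [(o - t) * (1.0 - o * o) for o, t in zip(out, y)]
    grad_W = [None] * len(weights)
    grad_b = [None] * len(weights)
    for l in reversed(range(len(weights))):
        prev = activations[l]  # input to layer l
        grad_W[l] = [[d * p for p in prev] for d in delta]
        grad_b[l] = list(delta)
        if l > 0:
            W = weights[l]
            # Propagate one layer back: delta_prev = (W^T delta) * tanh'(z_prev).
            delta = [
                sum(W[i][j] * delta[i] for i in range(len(delta)))
                * (1.0 - prev[j] ** 2)
                for j in range(len(prev))
            ]
    return grad_W, grad_b


# Hypothetical toy network: 3 inputs, two hidden layers of 4 units, 2 outputs.
random.seed(0)
sizes = [3, 4, 4, 2]
weights = [[[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
           for n_in, n_out in zip(sizes, sizes[1:])]
biases = [[random.uniform(-1, 1) for _ in range(n_out)] for n_out in sizes[1:]]
x, y = [0.5, -0.2, 0.1], [0.3, -0.7]

grad_W, grad_b = backprop(weights, biases, x, y)

# Central finite-difference check of a single first-layer weight.
eps = 1e-6
weights[0][1][2] += eps
loss_plus = loss(weights, biases, x, y)
weights[0][1][2] -= 2 * eps
loss_minus = loss(weights, biases, x, y)
weights[0][1][2] += eps  # restore the original weight
numeric = (loss_plus - loss_minus) / (2 * eps)
print(abs(numeric - grad_W[0][1][2]) < 1e-6)
```

In this sketch each layer is visited once in the forward pass and once in the backward pass, so for fixed layer widths the work grows linearly with the number of layers. Estimating the same gradient by finite differences would instead require two loss evaluations per parameter.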