Depth is the hallmark of deep neural networks. But more depth means more sequential computation and higher latency. This begs the question – is it possible to build high-performing “non-deep” neural networks? We show that it is. To do so, we use parallel subnetworks instead of stacking one layer after another. This helps effectively reduce depth while maintaining high performance.
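To make the idea concrete, here is a minimal sketch (in PyTorch) of parallel subnetworks: a few shallow branches process the same input side by side and their outputs are fused, so sequential depth stays fixed while capacity grows with the branch count. The layer widths, branch count, and fusion-by-summation used here are illustrative assumptions, not the paper's ParNet design.

```python
import torch
import torch.nn as nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """A basic conv-BN-ReLU block used inside each shallow branch."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class ParallelSubnetworks(nn.Module):
    """Several shallow branches applied to the same input, fused by summation.

    Sequential depth stays at `depth_per_branch`; adding branches increases
    capacity without increasing latency-critical depth.
    """

    def __init__(self, in_ch: int = 3, width: int = 64,
                 num_branches: int = 3, depth_per_branch: int = 4):
        super().__init__()
        self.branches = nn.ModuleList()
        for _ in range(num_branches):
            layers = [conv_block(in_ch, width)]
            layers += [conv_block(width, width) for _ in range(depth_per_branch - 1)]
            self.branches.append(nn.Sequential(*layers))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each branch sees the full input; outputs are fused by element-wise sum.
        return torch.stack([branch(x) for branch in self.branches], dim=0).sum(dim=0)


if __name__ == "__main__":
    model = ParallelSubnetworks()
    out = model(torch.randn(1, 3, 32, 32))
    print(out.shape)  # torch.Size([1, 64, 32, 32])
```

In this sketch the branches are identical conv stacks and fusion is a plain sum; the paper's actual blocks and fusion scheme differ, but the structural point is the same: width-wise parallelism replaces depth-wise stacking.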
2021: Ankit Goyal, Alexey Bochkovskiy, Jia Deng, Vladlen Koltun
https://arxiv.org/pdf/2110.07641v1.pdf