In this episode we discuss NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction
by Yun Yi, Haokui Zhang, Wenze Hu, Nannan Wang, Xiaoyu Wang. The paper proposes a neural architecture representation model that can estimate attributes of different neural network architectures, such as accuracy and latency, without running actual training or inference. The model first uses a simple and effective tokenizer to encode a network's operation and topology information into a single sequence, then applies a multi-stage fusion transformer to build a compact vector representation. An information flow consistency augmentation is proposed for more efficient model training, and the approach achieves promising results in predicting attributes of both cell-structured architectures and whole deep neural networks. Code is available on GitHub.
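To give a rough feel for the tokenization idea, here is a minimal, hypothetical Python sketch (not the paper's actual NAR-Former tokenizer): it flattens a small DAG into one sequence of tokens that carries both each node's operation type and its incoming edges. The vocabulary, token layout, and function names are all illustrative assumptions.

```python
# Hypothetical sketch only: encode a DAG's operations and topology
# into a single flat token sequence. Not the paper's actual encoding.

from typing import List, Tuple

# Illustrative operation vocabulary (assumption, not from the paper).
OP_VOCAB = {"input": 0, "conv3x3": 1, "conv1x1": 2, "maxpool": 3, "output": 4}

def tokenize_architecture(ops: List[str],
                          edges: List[Tuple[int, int]]) -> List[Tuple[int, int, int]]:
    """Turn a DAG (node ops + directed edges) into one token sequence.

    Each token is (node_index, op_id, source_index). A node with several
    predecessors emits one token per incoming edge; a node with no
    predecessors uses -1 as a placeholder source.
    """
    preds = {i: [] for i in range(len(ops))}
    for src, dst in edges:
        preds[dst].append(src)

    tokens = []
    for i, op in enumerate(ops):
        for src in (preds[i] or [-1]):
            tokens.append((i, OP_VOCAB[op], src))
    return tokens

# Example: a tiny cell where the input feeds two branches that merge.
ops = ["input", "conv3x3", "conv1x1", "output"]
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
print(tokenize_architecture(ops, edges))
# [(0, 0, -1), (1, 1, 0), (2, 2, 0), (3, 4, 1), (3, 4, 2)]
```

In the paper, a sequence like this (built with a more refined positional encoding) is what the multi-stage fusion transformer consumes to produce the compact vector used for attribute prediction.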