This paper shows that masked autoencoders (MAE) are scalable self-supervised learners for computer vision. Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels. It is based on two core designs.
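As a rough illustration of the masking step described above, the following minimal sketch splits a toy image into non-overlapping patches and hides a random subset of them. It is not the authors' implementation: the helper names `patchify` and `random_mask`, the 8x8 toy image, the patch size of 2, and the mask ratio of 0.75 are all assumptions chosen for the example, and the reconstruction model itself is omitted.

```python
import random


def patchify(image, patch_size):
    """Split a 2D image (list of rows) into non-overlapping square patches.

    Patches are returned in row-major order. Hypothetical helper for
    illustration only.
    """
    height, width = len(image), len(image[0])
    assert height % patch_size == 0 and width % patch_size == 0
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


def random_mask(num_patches, mask_ratio, rng):
    """Return (visible_ids, masked_ids) as sorted lists of patch indices."""
    num_masked = int(num_patches * mask_ratio)
    shuffled = list(range(num_patches))
    rng.shuffle(shuffled)  # uniform random permutation of patch indices
    masked_ids = sorted(shuffled[:num_masked])
    visible_ids = sorted(shuffled[num_masked:])
    return visible_ids, masked_ids


# Toy 8x8 grayscale image whose pixel value encodes its position.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
patches = patchify(image, patch_size=2)  # 16 patches of size 2x2

rng = random.Random(0)  # fixed seed so the example is repeatable
visible_ids, masked_ids = random_mask(len(patches), mask_ratio=0.75, rng=rng)

# A model would see only the visible patches and be trained to
# reconstruct the pixels of the masked ones.
visible_patches = [patches[i] for i in visible_ids]
targets = [patches[i] for i in masked_ids]
```

In this sketch, `visible_patches` stands in for the model input and `targets` for the pixels to be reconstructed; how the two are encoded and decoded is left to the paper.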
2021: Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick
Ranked #1 on Image Classification on iNaturalist 2019
https://arxiv.org/pdf/2111.06377v2.pdf