In this episode we discuss "Hard Patches Mining for Masked Image Modeling" by Haochen Wang, Kaiyou Song, Junsong Fan, Yuxi Wang, Jin Xie, and Zhaoxiang Zhang.
The paper proposes Hard Patches Mining (HPM), a new pre-training framework for masked image modeling (MIM). The authors argue that an MIM model should not only predict the contents of masked patches but also learn to pose challenging masking problems for itself. HPM adds an auxiliary loss predictor that estimates the reconstruction loss of each patch and uses those estimates to decide where to mask next; the predictor is trained with a relative relationship learning strategy to prevent overfitting. Experiments show that HPM is effective at constructing masked images, and that the model's awareness of which patches are hard to reconstruct is itself beneficial.
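
For listeners who want to see the idea in code, here is a minimal sketch of the two pieces described above: picking the patches with the highest predicted loss to mask, and training the loss predictor on relative order rather than exact values. This is not the authors' implementation. The function names, the fixed `mask_ratio`, and the pairwise binary cross-entropy form of the ranking loss are all assumptions made for illustration.

```python
import math


def hardest_patch_mask(predicted_losses, mask_ratio):
    """Return a list of booleans (True = masked), masking the patches
    with the highest predicted reconstruction loss.

    Hypothetical helper: a pure top-k rule is an assumption of this sketch.
    """
    num_patches = len(predicted_losses)
    num_masked = int(num_patches * mask_ratio)
    # Rank patch indices from hardest (largest predicted loss) to easiest.
    ranked = sorted(range(num_patches),
                    key=lambda i: predicted_losses[i],
                    reverse=True)
    masked = set(ranked[:num_masked])
    return [i in masked for i in range(num_patches)]


def relative_ranking_loss(predicted_losses, true_losses):
    """Pairwise binary cross-entropy on the ORDER of patch losses.

    For each pair (i, j) the target is 1 if patch i truly has the larger
    loss, else 0; the prediction is sigmoid(pred_i - pred_j). Only the
    relative order matters, not the exact loss values. This specific form
    is an assumed reading of "relative relationship learning".
    """
    total, pairs = 0.0, 0
    n = len(true_losses)
    for i in range(n):
        for j in range(n):
            if i == j or true_losses[i] == true_losses[j]:
                continue  # skip self-pairs and ties
            target = 1.0 if true_losses[i] > true_losses[j] else 0.0
            prob = 1.0 / (1.0 + math.exp(-(predicted_losses[i] - predicted_losses[j])))
            prob = min(max(prob, 1e-12), 1.0 - 1e-12)  # guard against log(0)
            total -= target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob)
            pairs += 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    predicted = [0.1, 0.9, 0.5, 0.3]   # toy predicted per-patch losses
    actual = [1.0, 9.0, 5.0, 3.0]      # toy true losses, same ordering
    print(hardest_patch_mask(predicted, mask_ratio=0.5))
    print(relative_ranking_loss(predicted, actual))
```

Because the ranking loss only looks at which patch in each pair is harder, a predictor can score well even when its outputs are on a different scale from the true losses, as in the toy values above.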