The proposed framework contains two important ingredients: smoothness regularization and Bregman proximal point optimization. Our experiments show that the proposed framework achieves new state-of-the-art performance on a number of NLP tasks, including GLUE, SNLI, SciTail and ANLI.

2020: Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, T. Zhao
Keywords: Language model, Overfitting, Experiment, Manifold regularization, Matrix regularization
https://arxiv.org/pdf/1911.03437v4.pdf
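The two ingredients named in the abstract can be sketched on a toy linear-softmax model. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and the inner maximization over perturbations is approximated here by random sampling, whereas the paper uses projected gradient steps. The Bregman proximal term is shown as a symmetric KL between the current and previous model's outputs.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def symmetric_kl(p, q, eps=1e-12):
    # Symmetric KL divergence between two probability vectors.
    p, q = p + eps, q + eps
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def smoothness_penalty(W, x, epsilon=1e-3, n_samples=8, seed=0):
    # Smoothness regularizer: how much the model's output can change
    # under a small input perturbation within an epsilon-ball.
    # Sampling-based approximation of the inner max (illustrative only;
    # the paper solves it with a few projected gradient ascent steps).
    rng = np.random.default_rng(seed)
    p = softmax(W @ x)
    worst = 0.0
    for _ in range(n_samples):
        delta = rng.standard_normal(x.shape)
        delta *= epsilon / (np.linalg.norm(delta) + 1e-12)
        q = softmax(W @ (x + delta))
        worst = max(worst, symmetric_kl(p, q))
    return worst

def bregman_proximal_penalty(W, W_prev, x):
    # Bregman proximal point term: keep the updated model's predictions
    # close to the previous iterate's predictions on the same input.
    return symmetric_kl(softmax(W @ x), softmax(W_prev @ x))
```

During fine-tuning, both penalties would be added to the task loss: the first discourages sharp local changes in the prediction function, and the second acts as a trust region that prevents aggressive updates away from the previous iterate.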