Extremely large pre-trained language models (PTMs) such as GPT-3 are usually released as a service, allowing users to design task-specific prompts to query the PTMs through black-box APIs. In such a scenario, which we call Language-Model-as-a-Service (LMaaS), gradients of the PTMs are usually not available. Can we optimize the task prompts by only accessing the model inference APIs? Based on recent observations that large PTMs have a very low intrinsic dimensionality, this work proposes Black-Box Tuning to optimize PTMs through derivative-free algorithms.
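The core idea can be sketched in a few lines: search in a small intrinsic subspace, map candidates into prompt-embedding space with a fixed random projection, and score them with only black-box loss queries. This is a minimal illustration, not the paper's implementation; the dimensions, the quadratic stand-in loss, and the simple (1+1) evolution strategy (in place of the paper's derivative-free optimizer) are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

d_intrinsic = 8   # dimensionality of the search subspace (assumption)
d_prompt = 64     # flattened prompt-embedding size (assumption)

# Fixed random projection from the intrinsic subspace to prompt space.
A = rng.normal(size=(d_prompt, d_intrinsic)) / np.sqrt(d_intrinsic)
target = rng.normal(size=d_prompt)  # hypothetical "good" prompt embedding

def black_box_loss(z):
    # Stand-in for a PTM inference-API call: returns only a scalar loss
    # for the prompt induced by z; no gradients are exposed.
    prompt = A @ z
    return float(np.sum((prompt - target) ** 2))

# Derivative-free search with a simple (1+1) evolution strategy.
z = np.zeros(d_intrinsic)
best = black_box_loss(z)
sigma = 0.5
for _ in range(500):
    cand = z + sigma * rng.normal(size=d_intrinsic)
    loss = black_box_loss(cand)
    if loss < best:          # keep the candidate only if it improves
        z, best = cand, loss
```

Because the optimizer only ever calls `black_box_loss`, the same loop works when that function is replaced by a real API query, which is exactly the LMaaS constraint the abstract describes.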
2022: Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
https://arxiv.org/pdf/2201.03514v1.pdf