Dual PatchNorm
Reversible Vision Transformers
Offsite-Tuning: Transfer Learning without Full Model
A Length-Extrapolatable Transformer
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Multimodal Chain-of-Thought Reasoning in Language Models
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
InstructPix2Pix: Learning to Follow Image Editing Instructions
Towards Robust Blind Face Restoration with Codebook Lookup Transformer
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers
How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection
Why do Nearest Neighbor Language Models Work?
Text2Poster: Laying Out Stylized Texts on Retrieved Images
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling
Reversible Column Networks
The Forward-Forward Algorithm: Some Preliminary Investigations
Cramming: Training a Language Model on a Single GPU in One Day
TorchGeo: deep learning with geospatial data
Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
Editing Models with Task Arithmetic