A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech
Token Merging: Your ViT But Faster
BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
Dual PatchNorm
Reversible Vision Transformers
Offsite-Tuning: Transfer Learning without Full Model
A Length-Extrapolatable Transformer
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Multimodal Chain-of-Thought Reasoning in Language Models
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
InstructPix2Pix: Learning to Follow Image Editing Instructions
Towards Robust Blind Face Restoration with Codebook Lookup Transformer
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers
How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection
Why do Nearest Neighbor Language Models Work?
Text2Poster: Laying Out Stylized Texts on Retrieved Images
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling
Reversible Column Networks
The Forward-Forward Algorithm: Some Preliminary Investigations
Cramming: Training a Language Model on a Single GPU in One Day