YOLOX: Exceeding YOLO Series in 2021
Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup
Contrastive Fine-grained Class Clustering via Generative Adversarial Networks
A Transformer-Based Siamese Network for Change Detection
Lawin Transformer: Improving Semantic Segmentation Transformer with Multi-Scale Representations via Large Window Attention
R3LIVE: A Robust, Real-time, RGB-colored, LiDAR-Inertial-Visual tightly-coupled state Estimation and mapping package
Robust Self-Supervised Audio-Visual Speech Recognition
Rethinking Soft Labels for Knowledge Distillation: A Bias-Variance Tradeoff Perspective
Vision Transformer for Small-Size Datasets
JoJoGAN: One Shot Face Stylization
Diffusion Models Beat GANs on Image Synthesis
SLIP: Self-supervision meets Language-Image Pre-training
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Grounded Language-Image Pre-training
VocBench: A Neural Vocoder Benchmark for Speech Synthesis
Masked Autoencoders Are Scalable Vision Learners
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Towards Real-World Blind Face Restoration with Generative Facial Prior
MetaFormer is Actually What You Need for Vision
Large-Scale Intelligent Microservices