Language-driven Semantic Segmentation
R-Drop: Regularized Dropout for Neural Networks
Black-Box Tuning for Language-Model-as-a-Service
Vision Transformer with Deformable Attention
Transfer Learning for Pose Estimation of Illustrated Characters
Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets
A ConvNet for the 2020s
Detecting Twenty-thousand Classes using Image-level Supervision
YOLOX: Exceeding YOLO Series in 2021
Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup
Contrastive Fine-grained Class Clustering via Generative Adversarial Networks
A Transformer-Based Siamese Network for Change Detection
Lawin Transformer: Improving Semantic Segmentation Transformer with Multi-Scale Representations via Large Window Attention
R3LIVE: A Robust, Real-time, RGB-colored, LiDAR-Inertial-Visual tightly-coupled state Estimation and mapping package
Robust Self-Supervised Audio-Visual Speech Recognition
Rethinking Soft Labels for Knowledge Distillation: A Bias-Variance Tradeoff Perspective
Vision Transformer for Small-Size Datasets
JoJoGAN: One Shot Face Stylization
Diffusion Models Beat GANs on Image Synthesis
SLIP: Self-supervision meets Language-Image Pre-training