Machine Learning Street Talk (MLST)

Technology

SWaV: Unsupervised Learning of Visual Features by Contrasting Cluster Assignments (Mathilde Caron)

2020-09-14


This week Dr. Tim Scarfe, Yannic "Lightspeed" Kilcher, Sayak Paul and Ayush Thakur interview Mathilde Caron from Facebook AI Research (FAIR).

We discuss the SwAV paper, "Unsupervised Learning of Visual Features by Contrasting Cluster Assignments", which Mathilde wrote with her collaborators: https://arxiv.org/pdf/2006.09882.pdf

The paper presents SwAV, the latest in a line of unsupervised contrastive algorithms for learning visual representations. It introduces a new data augmentation strategy (multi-crop) and a new online clustering strategy.

Note: the other authors are Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski and Armand Joulin.

Sayak Paul - @RisingSayak / https://www.linkedin.com/in/sayak-paul/

Ayush Thakur - @ayushthakur0 / https://www.linkedin.com/in/ayush-thakur-731914149/

The article they wrote:

https://app.wandb.ai/authors/swav-tf/reports/Unsupervised-Visual-Representation-Learning-with-SwAV--VmlldzoyMjg3Mzg

00:00:00 Yannic probability challenge (CAN YOU SOLVE IT?)

00:01:29 Intro topic (Tim)

00:08:18 Yannic take

00:09:33 Intro show and guests

00:11:29 SWaV elevator pitch

00:17:31 Clustering approach in general

00:21:17 Sayak and Ayush's article on SWaV

00:23:49 Optimal transport problem / Sinkhorn-Knopp algorithm

00:31:43 Is clustering a natural approach for this?

00:44:19 Image augmentations

00:46:20 Priors vs experience (data)

00:48:32 Life at FAIR

00:52:33 Progress of image augmentation

00:56:10 When things do not go to plan with research

01:01:04 Question on architecture

01:01:43 SWaV Results

01:06:26 Reproducing Mathilde's code

01:14:51 Do we need the whole dataset to set clustering loss?

01:16:40 Self-supervised learning and transfer learning

01:23:25 Link to attention mechanism

01:24:41 Sayak's final thought: why unsupervised is better

01:25:56 Outro

Abstract:

"Unsupervised image representations have significantly reduced the gap with supervised pretraining, notably with the recent achievements of contrastive learning methods. These contrastive methods typically work online and rely on a large number of explicit pairwise feature comparisons, which is computationally challenging. In this paper, we propose an online algorithm, SwAV, that takes advantage of contrastive methods without requiring to compute pairwise comparisons. Specifically, our method simultaneously clusters the data while enforcing consistency between cluster assignments produced for different augmentations (or “views”) of the same image, instead of comparing features directly as in contrastive learning. Simply put, we use a “swapped” prediction mechanism where we predict the cluster assignment of a view from the representation of another view. Our method can be trained with large and small batches and can scale to unlimited amounts of data. Compared to previous contrastive methods, our method is more memory efficient since it does not require a large memory bank or a special momentum network. In addition, we also propose a new data augmentation strategy, multi-crop, that uses a mix of views with different resolutions in place of two full-resolution views, without increasing the memory or compute requirements much. We validate our findings by achieving 75.3% top-1 accuracy on ImageNet with ResNet-50, as well as surpassing supervised pretraining on all the considered transfer tasks."
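For listeners who want a concrete picture of the "swapped" prediction idea from the abstract and the Sinkhorn-Knopp step discussed in the episode, here is a rough sketch in plain Python. It is not the authors' code. The function names, the toy score matrices, and the values chosen for `epsilon`, `n_iters` and `temperature` are all assumptions made for illustration. Each score matrix is assumed to hold similarities between one view's features and a set of cluster prototypes, with values roughly in [-1, 1].

```python
import math


def sinkhorn(scores, epsilon=0.05, n_iters=3):
    """Turn a B x K score matrix into soft cluster assignments.

    Alternating column/row normalisation pushes the batch towards using
    all K clusters about equally. Assumes scores are roughly in [-1, 1].
    """
    B, K = len(scores), len(scores[0])
    top = max(max(row) for row in scores)  # subtract the max for stability
    Q = [[math.exp((s - top) / epsilon) for s in row] for row in scores]
    for _ in range(n_iters):
        # Columns: every cluster receives total mass 1/K.
        col_sums = [sum(Q[b][k] for b in range(B)) for k in range(K)]
        Q = [[Q[b][k] / (col_sums[k] * K) for k in range(K)] for b in range(B)]
        # Rows: every sample receives total mass 1/B.
        row_sums = [sum(row) for row in Q]
        Q = [[q / (row_sums[b] * B) for q in Q[b]] for b in range(B)]
    # Rescale so that each row is a probability distribution over clusters.
    return [[q * B for q in row] for row in Q]


def log_softmax(row, temperature=0.1):
    """Log of the softmax of row / temperature, computed stably."""
    scaled = [x / temperature for x in row]
    top = max(scaled)
    log_z = top + math.log(sum(math.exp(x - top) for x in scaled))
    return [x - log_z for x in scaled]


def swapped_loss(scores_a, scores_b, temperature=0.1):
    """Predict view B's assignment from view A's scores, and vice versa."""
    q_a = sinkhorn(scores_a)  # targets computed from view A
    q_b = sinkhorn(scores_b)  # targets computed from view B
    total = 0.0
    for s_a, s_b, t_a, t_b in zip(scores_a, scores_b, q_a, q_b):
        lp_a = log_softmax(s_a, temperature)
        lp_b = log_softmax(s_b, temperature)
        total -= sum(q * lp for q, lp in zip(t_b, lp_a))  # A predicts B's code
        total -= sum(q * lp for q, lp in zip(t_a, lp_b))  # B predicts A's code
    return total / (2 * len(scores_a))


# Toy batch: 4 images, 2 prototypes (made-up numbers).
view_a = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
view_b_agree = [[0.8, 0.2], [0.9, 0.0], [0.0, 0.8], [0.1, 0.9]]
view_b_clash = [[0.2, 0.8], [0.0, 0.9], [0.8, 0.0], [0.9, 0.1]]

print("loss when views agree:", swapped_loss(view_a, view_b_agree))
print("loss when views clash:", swapped_loss(view_a, view_b_clash))
```

The loss is small when both views of an image land in the same cluster and large when they do not, which is the consistency signal the abstract describes. No pairwise feature comparison between images is needed, only comparisons against the prototypes.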