Download - #32 Deep tensor factorization and a pitfall for machine learning methods with Jacob Schreiber

the bioinformatics chat

Science:Life Sciences

#32 Deep tensor factorization and a pitfall for machine learning methods with Jacob Schreiber

2019-04-29

In this episode we hear from Jacob Schreiber about his algorithm, Avocado.

Avocado uses a neural network to factorize a three-dimensional tensor of epigenomic data into three independent factors corresponding to cell types, assay types, and genomic loci. Avocado can extract a low-dimensional, information-rich summary from the wealth of experimental data produced by projects like the Roadmap Epigenomics Consortium and ENCODE. Avocado can also predict the data for experiments that have not been performed.
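To make the idea of neural tensor factorization concrete, here is a minimal NumPy sketch. It is not Avocado's actual code or architecture: the single-scale locus embedding, the one-hidden-layer network, the synthetic data, and every name and hyperparameter below (`init_params`, `train_step`, `dim`, `hidden`, `lr`) are assumptions made for illustration. Each cell type, assay, and locus gets a learned embedding; the three embeddings are concatenated and passed through a small network that predicts the signal value, and the trained model is then used to fill in entries that were held out as "unperformed experiments".

```python
import numpy as np

def init_params(n_cells, n_assays, n_loci, dim=4, hidden=8, seed=0):
    """One embedding table per tensor axis, plus a one-hidden-layer network."""
    rng = np.random.default_rng(seed)
    small = lambda *shape: rng.normal(0.0, 0.1, size=shape)
    return {
        "cell": small(n_cells, dim), "assay": small(n_assays, dim),
        "locus": small(n_loci, dim),
        "W1": small(3 * dim, hidden), "b1": np.zeros(hidden),
        "W2": small(hidden, 1), "b2": np.zeros(1),
    }

def forward(p, c, a, l):
    """Concatenate the three embeddings and predict one value per entry."""
    x = np.concatenate([p["cell"][c], p["assay"][a], p["locus"][l]], axis=1)
    h = np.tanh(x @ p["W1"] + p["b1"])
    return x, h, (h @ p["W2"] + p["b2"])[:, 0]

def train_step(p, c, a, l, y, lr=0.02):
    """One full-batch gradient step on mean squared error (manual backprop)."""
    dim = p["cell"].shape[1]
    x, h, pred = forward(p, c, a, l)
    err = pred - y
    d_pred = (2.0 / len(y)) * err[:, None]
    d_W2, d_b2 = h.T @ d_pred, d_pred.sum(axis=0)
    d_z = (d_pred @ p["W2"].T) * (1.0 - h ** 2)   # derivative of tanh
    d_W1, d_b1 = x.T @ d_z, d_z.sum(axis=0)
    d_x = d_z @ p["W1"].T                         # gradient w.r.t. embeddings
    for name, grad in (("W1", d_W1), ("b1", d_b1), ("W2", d_W2), ("b2", d_b2)):
        p[name] -= lr * grad
    # Unbuffered updates, so repeated indices accumulate their gradients.
    np.subtract.at(p["cell"], c, lr * d_x[:, :dim])
    np.subtract.at(p["assay"], a, lr * d_x[:, dim:2 * dim])
    np.subtract.at(p["locus"], l, lr * d_x[:, 2 * dim:])
    return float(np.mean(err ** 2))               # loss before this update

# Synthetic stand-in for a (cell type x assay x locus) data tensor.
rng = np.random.default_rng(1)
n_cells, n_assays, n_loci = 3, 4, 50
truth = (1.0
         + rng.normal(size=(n_cells, 1, 1)) * rng.normal(size=(1, n_assays, 1))
         + 0.5 * rng.normal(size=(1, 1, n_loci)))
observed = rng.random(truth.shape) < 0.8          # False = "experiment not performed"

c, a, l = np.nonzero(observed)
params = init_params(n_cells, n_assays, n_loci)
losses = [train_step(params, c, a, l, truth[observed]) for _ in range(300)]

# Impute the entries that were never observed during training.
mc, ma, ml = np.nonzero(~observed)
imputed = forward(params, mc, ma, ml)[2]
print(f"training loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

After training, the rows of `params["cell"]`, `params["assay"]`, and `params["locus"]` play the role of the low-dimensional summaries described above, and `imputed` holds predictions for the held-out entries.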

Jacob also talks about a pitfall he discovered when trying to predict gene expression from a mix of genomic and epigenomic data. As you increase the complexity of a machine learning model, its performance may improve for the wrong reason: instead of learning something biologically interesting, your model may simply be memorizing the average gene expression based on the nucleotide sequence.
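One way to see why this matters is to check how well a trivial baseline does. The sketch below uses synthetic data, not data from the paper, and assumes that each gene's expression is a shared per-gene level plus a smaller cell-type-specific deviation. The variable names and the noise scales are illustrative assumptions. The baseline predicts expression in a held-out cell type as each gene's average over the training cell types, using no information about the held-out cell type at all.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_cell_types = 1000, 6

# Assumed generative model: a per-gene level shared by all cell types,
# plus a smaller cell-type-specific deviation.
gene_level = rng.normal(0.0, 2.0, size=(n_genes, 1))
cell_specific = rng.normal(0.0, 0.5, size=(n_genes, n_cell_types))
expression = gene_level + cell_specific

# Hold out the last cell type; the genes are the same in train and test.
train, held_out = expression[:, :-1], expression[:, -1]

# Baseline: the average expression of each gene across training cell types.
baseline = train.mean(axis=1)

def pearson(x, y):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

r_baseline = pearson(baseline, held_out)
print(f"average-expression baseline, Pearson r = {r_baseline:.3f}")
```

Under these assumptions the baseline correlates strongly with the held-out values, so a flexible model that can recognize a gene from its sequence and recall its average expression could score similarly without learning anything cell-type-specific. Reporting a model's performance next to this kind of baseline helps separate the two cases.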

Links:

Avocado on GitHub
Multi-scale deep tensor factorization learns a latent representation of the human epigenome (Jacob Schreiber, Timothy Durham, Jeffrey Bilmes, William Stafford Noble)
Completing the ENCODE3 compendium yields accurate imputations across a variety of assays and human biosamples (Jacob Schreiber, Jeffrey Bilmes, William Noble)
A pitfall for machine learning methods aiming to predict across cell types (Jacob Schreiber, Ritambhara Singh, Jeffrey Bilmes, William Stafford Noble)