This week on The Changelog we’re taking you to the hallway track of The Linux Foundation’s Open Source Summit North America 2023 in Vancouver, Canada. Today’s anthology episode features: Beyang Liu (Co-founder and CTO at Sourcegraph), Denny Lee (Developer Advocate at Databricks), and Stella Biderman (Executive Director and Head of Research at EleutherAI).
Special thanks to our friends at GitHub for sponsoring us to attend this conference as part of Maintainer Month.
Leave us a comment
Changelog++ members get a bonus 3 minutes at the end of this episode and zero ads. Join today!
Sponsors: DevCycle, Sentry, Rocky Linux
Featuring: Beyang Liu, Denny Lee, Stella Biderman
Show Notes:
The common denominator for these conversations is open source AI.
Beyang Liu and his team at Sourcegraph are focused on enabling more developers to understand code, and their approach to Cody, a completely open source, model-agnostic coding assistant, is of significant interest to us.
Denny Lee and the team at Databricks recently released Dolly 2.0, the first open source, instruction-following LLM fine-tuned on a human-generated instruction dataset and licensed for research and commercial use. They want to be the platform of choice for the future of AI development.
Stella Biderman gave the keynote address on generative AI at the conference and works at the base layer, doing open source research, model training, and AI ethics. Stella trained EleutherAI’s Pythia model family, which Databricks used to create Dolly 2.0.
Something missing or broken? PRs welcome!
Timestamps:
(00:00) - This week on The Changelog
(01:44) - Sponsor: DevCycle
(04:38) - Start the show!
(05:46) - We met Beyang 10 years ago!
(06:08) - The mission of Sourcegraph
(07:22) - Adam still Googles, just less
(08:30) - Plugins make models interesting
(09:35) - When did you start thinking about this?
(12:16) - This is a "Eureka!" moment in time
(13:11) - The gospel of text based input
(15:44) - Is this the future interface of Sourcegraph?
(17:21) - Iterating the interface
(17:59) - How can you access Cody?
(18:27) - Cody is open source
(20:13) - How does it get code intelligence?
(21:58) - What about privacy?
(26:11) - GPT for X
(26:53) - Cody vs Copilot
(29:25) - Open source + model agnostic
(31:22) - What's next?
(33:19) - How high up the stack can AI tooling go?
(36:07) - Is this a step change to plateau?
(38:21) - The ultimate flattener
(42:56) - Will AI swallow all of programming?
(45:52) - Sponsor: Sentry
(50:08) - We're fine-tuned
(50:51) - JIT conference presenter
(52:32) - This time 4 weeks ago
(53:54) - Let's generate our own data
(55:05) - All 15,000 Q&A pairs are open
(56:12) - Verbose is not always desirable
(56:42) - I want my own Dolly 2.0
(58:14) - How did you collect the Q&A data?
(1:00:39) - We thought we'd need more data
(1:01:40) - Dolly proved it could be done
(1:03:24) - Google's leaked memo
(1:06:06) - Databricks' play in this chess game
(1:08:45) - Turning AI on our transcripts
(1:11:03) - Chain or foundational model?
(1:12:42) - Sponsor: Rocky Linux
(1:15:19) - The base layer
(1:16:27) - What should the world know?
(1:17:40) - Where does the money come from?
(1:18:13) - Training LLMs is NOT that expensive
(1:22:07) - Focused on open source AI research
(1:25:49) - Interpreting LLMs
(1:28:30) - Influencing the properties of the model
(1:31:40) - Do you have fear of where this is going?
(1:32:58) - Connecting with Stella and team
(1:34:07) - Stella's news source is their Discord server
(1:36:22) - Outro