The Artificial Intelligence Show
#61: Pirated Books Are Powering Generative AI, the 2023 State of Marketing AI Report, and GPT-3.5 Fine-Tuning Is Here
Pirated books are powering generative AI
The Atlantic just released a major investigative piece showing that popular large language models, like Meta’s LLaMA, were trained on pirated books, a fact previously alleged by multiple authors in lawsuits against AI companies.
The article states, “Upwards of 170,000 books, the majority published in the past 20 years, are in LLaMA’s training data. . . . These books are part of a dataset called ‘Books3,’ and its use has not been limited to LLaMA. Books3 was also used to train Bloomberg’s BloombergGPT, EleutherAI’s GPT-J—a popular open-source model—and likely other generative-AI programs now embedded in websites across the internet.”
According to an interview in the story, Books3 appears to have been created with altruistic intentions. The Atlantic’s Alex Reisner spoke with the dataset’s creator, independent developer Shawn Presser, who said he built Books3 to give independent developers “OpenAI-grade training data,” out of fear that large AI companies would otherwise have a monopoly on generative AI tools.
The 2023 State of Marketing AI Report findings
Marketing AI Institute, in partnership with Drift, just released our third annual State of Marketing AI Report. The 2023 State of Marketing AI Report contains responses from 900+ marketers on AI understanding, usage, and adoption. In it, you’ll find insights on how marketers understand, use, and buy AI technology, the top outcomes marketers want from AI, the top barriers they face when adopting AI, how the industry feels about AI’s impact on jobs and society, who owns AI within companies, and much more. Paul and Mike talk about some of the most interesting findings from the data.
You can now fine-tune GPT-3.5 Turbo
OpenAI just announced a big update: You can now fine-tune GPT-3.5 Turbo for your own use cases. This means you can customize the base GPT-3.5 Turbo model so it performs much better on tasks specific to your organization. For instance, you might fine-tune GPT-3.5 Turbo to better understand text that’s highly specific to your industry or business. You might also fine-tune models to sound more like your brand in their outputs, or to remember specific examples or preferences, so you don’t have to spend resources and bandwidth on highly complex prompts every time you use a model. Notably, OpenAI says: “Early tests have shown a fine-tuned version of GPT-3.5 Turbo can match, or even outperform, base GPT-4-level capabilities on certain narrow tasks.” They also note that fine-tuning for GPT-4 will be coming this fall.
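For the more technical listeners, here is a minimal sketch of what that fine-tuning workflow looks like with OpenAI’s Python SDK: upload a file of chat-formatted training examples, then start a fine-tuning job on the base model. The file name, example data, and model ID below are hypothetical placeholders, not part of the announcement.

```python
# Minimal sketch of the GPT-3.5 Turbo fine-tuning workflow (OpenAI Python SDK).
# "brand_examples.jsonl" is a hypothetical file of chat-formatted training
# examples, one JSON object per line, e.g.:
# {"messages": [{"role": "system", "content": "You are our brand copywriter."},
#               {"role": "user", "content": "Write a product tagline."},
#               {"role": "assistant", "content": "..."}]}
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Upload the training data for fine-tuning.
training_file = client.files.create(
    file=open("brand_examples.jsonl", "rb"),
    purpose="fine-tune",
)

# 2. Start a fine-tuning job on the base GPT-3.5 Turbo model.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print("Fine-tuning job started:", job.id)

# 3. Once the job completes, the resulting model can be called like any other
#    chat model, using the fine-tuned model ID returned on the finished job:
# response = client.chat.completions.create(
#     model="ft:gpt-3.5-turbo:your-org::abc123",  # hypothetical model ID
#     messages=[{"role": "user", "content": "Write a product tagline."}],
# )
```

The payoff is that brand voice, industry terminology, or preferred output formats live in the fine-tuned model itself, rather than in long prompts you have to repeat on every call.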
Plus…the rapid-fire topics this week are interesting, so stick around for the full episode.
Listen to the full episode of the podcast: https://www.marketingaiinstitute.com/podcast-showcase
Want to receive our videos faster? SUBSCRIBE to our channel!
Visit our website: https://www.marketingaiinstitute.com
Receive our weekly newsletter: https://www.marketingaiinstitute.com/newsletter-subscription
Looking for content and resources?
Register for a free webinar: https://www.marketingaiinstitute.com/resources#filter=.webinar
Come to our next Marketing AI Conference: www.MAICON.ai
Enroll in AI Academy for Marketers: https://www.marketingaiinstitute.com/academy/home
Join our community:
Slack: https://www.marketingaiinstitute.com/slack-group-form
LinkedIn: https://www.linkedin.com/company/mktgai
Twitter: https://twitter.com/MktgAi
Instagram: https://www.instagram.com/marketing.ai/
Facebook: https://www.facebook.com/marketingAIinstitute