In this episode, we discuss WildChat: 1M ChatGPT Interaction Logs in the Wild by Wenting Zhao, Xiang Ren, Jack Hessel, Claire Cardie, Yejin Choi, and Yuntian Deng. WildChat is a dataset of 1 million user-ChatGPT conversations comprising over 2.5 million interaction turns, built by collecting chat transcripts and request headers from users who consented to participate. Compared with other datasets, it offers more diverse prompts, covers more languages, and includes a broader set of toxic interaction cases, making it a comprehensive resource for studying chatbot interactions. It also includes detailed demographic data and timestamps, which makes it useful for analyzing how user behavior varies across regions and over time, and for training instruction-following models. The dataset is released under AI2 ImpACT Licenses.