Sean Falconer (Head of Marketing & Developer Relations @ Skyflow, Podcast Host for Partially Redacted and Software Engineering Daily) talks about the security and privacy of LLMs and how to prevent PII (personally identifiable information) from leaking out.
SHOW: 807
CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw
NEW TO CLOUD? CHECK OUT OUR OTHER PODCAST - "CLOUDCAST BASICS"
SHOW SPONSORS:
SHOW NOTES:
Topic 1 - Our topic for today is the security and privacy of LLMs. Sean, welcome to the show. Before diving into today’s discussion, give everyone a one-minute summary of your background. What’s Sean’s origin story?
Topic 2 - Let’s dig into LLM security and privacy. We see this concern a lot on the podcast and we’ve touched on it with various past shows, but we haven’t dug in deep. First, let’s frame the problem. What are we talking about when we talk about LLM security and privacy?
Topic 3 - From the folks I talk to, I hear two significant concerns. First, there is a fear that customer PII might leak out. Second, company IP or confidential info related to products or offerings might leak out. We’ve seen examples of both to date. This data could be exposed through integration into a model (query it for the answer) or in the fine-tuning or RAG stage. Either one could lead to compliance issues, lost revenue, etc. But that same at-risk data is the potential differentiation of the models. How do you both mask the data and take advantage of it?
Topic 4 - One thing I’ve noticed is that many orgs only think about privacy in relation to the fine-tuning stage, where they are taking a broad model and making it company-specific. It is about much more than that, though. Just like standard software development, we have different stages: how the data is collected and stored, how it is used for training and fine-tuning, how it is used after deployment and during the interaction stage, etc. How should security and privacy be handled across all phases?
Topic 5 - Let’s talk beyond LLMs for a bit. What about Data Lakes and Data Warehousing? I see this as a problem across all big data, correct?
Topic 6 - How does API security fit into this? Much of what we are talking about is at the storage and retrieval level. But increasingly, we see API issues exposing data. How does that fit in here?
Topic 7 - Let’s talk podcasts. We had Jeff, the previous host of Software Engineering Daily, on a few times. How are things over at Software Engineering Daily? Tell everyone a bit about the show.
FEEDBACK?