Wesley Reisz talks to Sid Anand, a data architect at cybersecurity company Agari, about building cloud-native data pipelines. The focus of their discussion is around a solution Agari uses that is built from Amazon Kinesis Streams, serverless functions, and auto scaling groups.
Sid Anand is an architect at Agari, and a former technical architect at eBay, Netflix, and LinkedIn. He has 15 years of data infrastructure experience at scale, is a PMC for Apache Airflow, and is also a program committee chair for QCon San Francisco and QCon London.
Why listen to this podcast
- Real-time data pipeline processing is very latency sensitive
- Micro-batching allows much smaller amounts of data to be processed
- Use the appropriate data store (or stores) to support the use of the dataIngesting data quickly into a clean database with minimal indexes can be fast
- Communicate using a messaging system that supports schema evolution
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2rJU9nB
You can also subscribe to the InfoQ newsletter to receive weekly updates on the hottest topics from professional software development. bit.ly/24x3IVq
Subscribe: www.youtube.com/infoq
Like InfoQ on Facebook: bit.ly/2jmlyG8
Follow on Twitter: twitter.com/InfoQ
Follow on LinkedIn: www.linkedin.com/company/infoq
Want to see extented shownotes? Check the landing page on InfoQ: http://bit.ly/2rJU9nB
view more