Staying agile in the face of the data deluge

A talk at Span conference, London, UK, 28 Oct 2014

As our applications need to process ever more data in ever shorter time, it’s difficult to stay sane. The architecture of our applications quickly becomes a monstrosity of different databases, queues and servers held together by string and sellotape. That may work at first, but soon gets ugly. If something goes wrong, it’s hard to recover. If features of the application need to change, it’s hard to adapt.

Stream processing gives us a route towards building data systems that are scalable, robust, and easy to adapt to changing requirements. In this talk, we will discuss how you can bring sanity to your own application architecture with Apache Samza, an open source framework for distributed stream processing applications.

Apache Samza is used in production at LinkedIn, and it builds upon years of hard-won experience with large-scale data systems. Even if you’re not processing millions of messages per second, as LinkedIn does, you can still pick up useful tips on how to structure your data processing for scale and agility.