This is the second part of our guide on streaming data and Apache Kafka. In part one I talked about the uses for real-time data streams and explained our idea of a stream data platform. The remainder of this guide will contain specific advice on how to go about building a stream data platform in your organization.
This advice is drawn from our experience building and implementing Kafka at LinkedIn and rolling it out across all the data types and systems there. It also comes from four years working with tech companies in Silicon Valley to build Kafka-based stream data platforms in their organizations.
This is meant to be a living document. As we learn new techniques, or new tools become available, I’ll update it.
Much of the advice in this guide covers techniques that will scale to hundreds or thousands of well formed data streams. No one starts with…
View original post 6 089 more words