Data Intensive Applications — Reliability, Scalability, Maintainability
Introduction
In this article we're going to learn how to build software systems that deal with large amounts of data, and how to build them reliably. The article is heavily inspired by one of the best data engineering books available: Designing Data-Intensive Applications by Martin Kleppmann.
Note: I have received no compensation for writing this piece.
What is a Data Intensive Application
First, let's define what a data intensive use case is. How can you tell whether the application you need to build is data intensive? A system or application is data intensive if it checks at least one of these boxes:
- The amount of data it generates or uses is huge and growing quickly
- The complexity of the data it generates or uses is increasing quickly
- The rate at which its data changes is increasing quickly
Typically, big websites such as LinkedIn, Facebook, and Google are data intensive because millions of users visit them every day. Building such a system requires a distinct set of skills.