Huge demand and technology shifts have dramatically changed the best practices for working with data, yet no new books have come forward to establish these practices. As a result, many people still follow old patterns and books whose best practices are now often bad practice.

This was the problem that authors Dave Fowler (CEO of Chartio, now acquired by Atlassian) and Matt David (Head of The Data School) faced over a decade of working with thousands of customers struggling to turn their huge datasets into a much-needed source of truth and insight.

They began organizing and writing about the stages, pitfalls, and best practices of setting up a modern data stack so that companies and their knowledge workers can make good use of their data and become a winning Informed Company. Initial versions were shared on dataschool.com, and they have now put considerably more work into formalizing these stages and practices for this print edition.

Praise for the book

This book provides the foundational knowledge you’ll need to navigate the modern data stack. From there, you’ll be able to dive in yourself, get your hands dirty, and start asking questions of your fellow practitioners.

~ Tristan Handy - CEO @ dbt Labs

The Informed Data Company is a lucid, pragmatic explanation of how to use data in a modern organization.

~ George Fraser - CEO @ Fivetran

This book is the missing map to the modern data landscape and is highly recommended for technology professionals seeking to improve their understanding and avoid pitfalls when implementing a modern data lake and associated capabilities.

~ George Barnett - Sr Platform Architect @ Atlassian

The 4 Stages of Agile Data Organization

Having worked for over a decade with thousands of modern companies at all stages, we've recognized 4 healthy stages that companies go through as they grow.

The book is organized as a progression through these 4 stages, each building on the last and bringing both extra overhead and extra benefits.
>> Read more in the Intro Chapter

 

Table of contents

Stage 1 Source (aka Siloed Data)

  1. Starting with Source Data
  2. The Need to Replicate Source Data
  3. Source Data Best Practices

Stage 2 Data Lake (aka Data Combined)

  1. Why Build a Data Lake?
  2. Choosing an Engine for the Data Lake
  3. Extract and Load (EL) Data
  4. Data Lake Security
  5. Data Lake Maintenance

Stage 3 Data Warehouse (aka the Single Source of Truth)

  1. The Power of Layers and Views
  2. Staging Schemas
  3. Model Data with dbt
  4. Deploy Modeling Code
  5. Implementing the Data Warehouse
  6. Managing Data Access
  7. Maintaining the Source of Truth

Stage 4 Data Marts (aka Data Democratized)

  1. Data Mart Implementation
  2. Data Mart Maintenance

What's changed in data

  1. Modern versus Traditional Data Stacks: What’s Changed?
  2. Row- versus Column-Oriented Database
  3. Style Guide Example
  4. Building an SST Example



Authors

Dave Fowler has worked in BI for years, and has always looked for ways to JOIN teams ON data. He wants to enable any working professional (not just data analysts) to explore and understand their data. As the founder and CEO of Chartio, Dave has spent the last decade leading the development of a self-service BI product that aims to do just that. Chartio’s suite of tools made it easy for anyone at a data-driven business to browse their schemas, merge various data sources, and produce beautiful dashboards. Chartio has since been acquired by Atlassian.

Follow Dave on Twitter or LinkedIn

 
Matt David has worked in product management and education for seven years. As data becomes a necessary skill for more and more jobs, he passionately advocates for data literacy among the workforce. As the current head of The Data School, he oversees the production of free, online resources focused on leveraging data within companies. Recent book topics include SQL optimization, data governance, and common analysis biases. Dave started The Data School, and together he and Matt have grown it into an important free resource for the data community.

Follow Matt on Twitter or LinkedIn

 

Foreword by Tristan Handy - CEO of dbt Labs

Editing and contributions
Mila Page (Developer Relations @ dbt Labs),
Emilie Schario (Data-Strategist-in-Residence at Amplify Partners),
Tracy Chow (Sr Data Support Engineer @ Chartio/Atlassian),
David Yerrington (Data Science Consultant and Educator)

 

Who this book is for

We wrote this book for anyone who values data and believes that informed companies are competitive. It’s a book for the working professional creating a practical, modern data stack. It’s for the lone analyst or the professional embedded in a team. It’s for anyone interested in the design practices that underlie robust data architecture, the kind that equips entire companies with business intelligence insights. At its heart, this book is written with collaboration in mind.
