Repository for the ActivitySchema spec and supporting materials
The Activity Schema data model is a standard designed for modern data warehouses. It is designed to make data modeling and analysis substantially simpler, faster, and more reliable than existing methodologies (e.g. a star schema)
The core approach is to separate modeling from querying. Models should be fully independent: asking new data questions should not require creating new models. Instead, queries can combine any number of existing models to build anything they need.
This is done by modeling all data in the warehouse as a single time series table.
For more info also see the home page or read the spec.
At its core an activity schema consists of transforming raw tables into a single, time series table called an activity stream. All downstream plots, tables, materialized views, etc used for BI are built directly from that single table, with no other dependencies.
Traditional Data Modeling - many custom-structured tables
Activity Schema - one standard table
Business concepts are represented as entity doing an activity ('a customer completed an order') instead of facts or nouns (orders, products). Activities are built directly from source tables, store only their own data, and are the single source of truth for each concept.
All queries run against an activity stream table to assemble data for analysis, BI, and reporting. Instead of traditional foreign key joins, queries combine activities using relationships in time (e.g. all customers who completed an order and submitted a support ticket before their next completed order).
Maintainable data models
Faster analysis and querying
Learn more by reading the full activity schema specification.
The activity schema home page has more info on the approach and its benefits
The implementation page covers how to build an activity schema in production.
A new activity schema dbt package has some helpful macros for building activity stream tables.
Narrator provides a full implementation of the activity schema as a service. The Narrator team directly supports the growth of the activity schema, and is always happy to provide suggestions on how to build one yourself.
The dbt Slack has a #modeling-activity-schema
channel for discussion about the activity schema.