How to use dbt Snapshots to track data history
Some data points naturally change over time, but it happens slowly.
Unsurprisingly, this is referred to as a "slowly changing dimension" or SCD.
This is a common data modeling scenario but can take a while to get right.
But if you use dbt, you can take advantage of its built-in Snapshots feature.
Snapshots require just a few configurations to set up, and the result is a new table that tracks historical changes just like an SCD.
So in this video, you'll learn more about what dbt Snapshots are and how to easily add them to your project.
Enjoy!
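As a rough illustration of the "few configurations" mentioned above, a snapshot is typically defined in a .sql file inside the project's snapshots directory. The sketch below uses the timestamp strategy. The snapshot name (orders_snapshot), the source (jaffle_shop.orders), the target schema, and the id and updated_at columns are all hypothetical placeholders, not taken from the video.

```sql
{# snapshots/orders_snapshot.sql (hypothetical file and snapshot name) #}
{% snapshot orders_snapshot %}

{# Assumed settings: adjust the schema, key, and timestamp column to your data #}
{{
    config(
      target_schema='snapshots',
      unique_key='id',
      strategy='timestamp',
      updated_at='updated_at'
    )
}}

-- Hypothetical source; the snapshot records how this query's rows change over time
select * from {{ source('jaffle_shop', 'orders') }}

{% endsnapshot %}
```

Running dbt snapshot builds or updates the table. dbt adds metadata columns such as dbt_valid_from and dbt_valid_to, and the current version of each row has a null dbt_valid_to. Downstream models can then select from the snapshot with {{ ref('orders_snapshot') }}.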
Full Course: The Playbook for dbt™ - bit.ly/3FueEdl
►► The Starter Guide for dbt (Free PDF)
Get clarity on key dbt concepts so you can build better projects & avoid common mistakes → bit.ly/starter-dbt
Timestamps:
0:00 - Intro
0:51 - What are Snapshots?
1:57 - Review Scenario
3:08 - Best Practices
3:55 - Create a Snapshot
7:45 - Add New Data
9:29 - Reference Snapshot in a Model
Tags:
#kahandatasolutions #dataengineering #dbt
Awesome. This helped me understand the concept of slowly changing dimensions (SCD)!
Glad it helped!
After going through the first 25% of the playbook and then stopping... here I am again, googling answers to a work problem and finding your videos. Appreciate all the info and content. (Our issue was an incorrect unique_key setting on a table where the snapshot was created, which produced HUGE tables as a result.)
Hi, please make a detailed video on the dbt analytics certification. I am preparing for it, so it would be very helpful. I appreciate your work on data engineering.
Thanks for the very good content. I have a question: how can we track rows deleted from the source?
Hi, thanks for explaining this concept. Is there a way to set the valid-to date of the ending record to d-1? As it stands, using BETWEEN in the WHERE condition returns 2 records instead of 1.
Hey, what did you put in the yml file? I tried doing it this way, but the staging model could not reference the snapshot for some reason.
Is it possible to partition snapshots (especially for BigQuery)?
I find Snowflake streams to be better than this concept.
You're a "Solutions Architect". Do you really need to nickel and dime your youtube audience to death with "courses" ?? You should feel shame.
why did you even bother writing this lmao, you're not even a content creator