Understanding Full-Refresh dbt Models
Introduction
Full-refresh models are an important building block when working with dbt. A full-refresh model is completely rebuilt every time dbt runs, so the records in it always reflect the current output of the model's query as of the latest run.
In this lab step, you will learn more about full-refresh dbt models.
Full-Refresh Models
In dbt, full-refresh models can be implemented by using the table materialization type. When a model uses this materialization type, dbt creates the table on the first run if it doesn't exist. On every later run, dbt rebuilds the table with all the records returned by the SQL query that defines the model, even if the table already exists and its structure has not changed. Technically speaking, dbt does not filter the query to pick up only new rows: it reloads the complete result set, including rows that were already in the table.
This approach is helpful when you always want the table to match the current source data, and when records that were already loaded into the model may change.
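A minimal sketch of a model that uses the table materialization is shown here. The file name customers_full.sql, the upstream model stg_customers, and the column names are hypothetical and are not part of this lab. Only the config and ref calls are standard dbt.

```sql
-- models/customers_full.sql (hypothetical file name)
-- Build this model as a table; dbt rebuilds it on every run.
{{ config(materialized='table') }}

select
    customer_id,
    first_name,
    last_name
from {{ ref('stg_customers') }}  -- hypothetical upstream model
```

Under these assumptions, each `dbt run --select customers_full` would rebuild the table from the full result of this query.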
For more details about materialization types, refer to the official dbt documentation.
Changes in the Model Structure
Full-refresh models are also essential when you already have a dbt model up and running and you need to change its structure by editing the SQL query it's based on (for example, adding a new column or removing an existing one). In this case, dbt drops the table and creates it again with the new structure.
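As an illustration, assume a hypothetical model file customers_full.sql that previously selected only customer_id, first_name, and last_name from a hypothetical upstream model stg_customers. Adding a column would then only require editing the query. All of these names, including the new email column, are assumptions made for this sketch.

```sql
-- models/customers_full.sql (hypothetical file name)
-- Same table materialization as before; only the query changes.
{{ config(materialized='table') }}

select
    customer_id,
    first_name,
    last_name,
    email  -- newly added column (hypothetical)
from {{ ref('stg_customers') }}  -- hypothetical upstream model
```

On the next run, the rebuilt table would be expected to include the email column for every row, with no manual change to the existing table. The exact statements dbt issues to replace the table can vary by database adapter.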