Skip to main content

SQL models

Getting started

Building your first models

If you're new to dbt, we recommend that you read a quickstart guide to build your first dbt project with models.

dbt's Python capabilities are an extension of its capabilities with SQL models. If you're new to dbt, we recommend that you read this page first, before reading: "Python Models"

A SQL model is a select statement. Models are defined in .sql files (typically in your models directory):

  • Each .sql file contains one model / select statement
  • The model name is inherited from the filename.
  • We strongly recommend using underscores for model names, not dots. For example, use models/my_model.sql instead of models/my.model.sql.
  • Models can be nested in subdirectories within the models directory.

Refer to How we style our dbt models for details on how we recommend you name your models.

When you execute the dbt run command, dbt will build this model data warehouseA data warehouse is a data management system used for data storage and computing that allows for analytics activities such as transforming and sharing data. by wrapping it in a create view as or create table as statement.

For example, consider this customers model:

models/customers.sql
with customer_orders as (
select
customer_id,
min(order_date) as first_order_date,
max(order_date) as most_recent_order_date,
count(order_id) as number_of_orders

from jaffle_shop.orders

group by 1
)

select
customers.customer_id,
customers.first_name,
customers.last_name,
customer_orders.first_order_date,
customer_orders.most_recent_order_date,
coalesce(customer_orders.number_of_orders, 0) as number_of_orders

from jaffle_shop.customers

left join customer_orders using (customer_id)

When you execute dbt run, dbt will build this as a view named customers in your target schema:

create view dbt_alice.customers as (
with customer_orders as (
select
customer_id,
min(order_date) as first_order_date,
max(order_date) as most_recent_order_date,
count(order_id) as number_of_orders

from jaffle_shop.orders

group by 1
)

select
customers.customer_id,
customers.first_name,
customers.last_name,
customer_orders.first_order_date,
customer_orders.most_recent_order_date,
coalesce(customer_orders.number_of_orders, 0) as number_of_orders

from jaffle_shop.customers

left join customer_orders using (customer_id)
)

Why a view named dbt_alice.customers? By default dbt will:

  • Create models as viewsA view (as opposed to a table) is a defined passthrough SQL query that can be run against a database (or data warehouse).
  • Build models in a target schema you define
  • Use your file name as the view or tableIn simplest terms, a table is the direct storage of data in rows and columns. Think excel sheet with raw values in each of the cells. name in the database

You can use configurations to change any of these behaviors — more on that later.

FAQs

How can I see the SQL that dbt is running?
Do I need to create my target schema before running dbt?
If I rerun dbt, will there be any downtime as models are rebuilt?
What happens if the SQL in my query is bad or I get a database error?
Which SQL dialect should I write my models in? Or which SQL dialect does dbt use?

Configuring models

Configurations are "model settings" that can be set in your dbt_project.yml file, and in your model file using a config block. Some example configurations include:

  • Changing the materializationThe exact Data Definition Language (DDL) that dbt will use when creating the model’s equivalent in a data warehouse. that a model uses — a materialization determines the SQL that dbt uses to create the model in your warehouse.
  • Build models into separate schemas.
  • Apply tags to a model.

Here's an example of model configuration:

dbt_project.yml
name: jaffle_shop
config-version: 2
...

models:
jaffle_shop: # this matches the `name:`` config
+materialized: view # this applies to all models in the current project
marts:
+materialized: table # this applies to all models in the `marts/` directory
marketing:
+schema: marketing # this applies to all models in the `marts/marketing/`` directory

models/customers.sql

{{ config(
materialized="view",
schema="marketing"
) }}

with customer_orders as ...

It is important to note that configurations are applied hierarchically — a configuration applied to a subdirectory will override any general configurations.

You can learn more about configurations in the reference docs.

FAQs

What materializations are available in dbt?
What model configurations exist?

Building dependencies between models

You can build dependencies between models by using the ref function in place of table names in a query. Use the name of another model as the argument for ref.

models/customers.sql
with customers as (

select * from {{ ref('stg_customers') }}

),

orders as (

select * from {{ ref('stg_orders') }}

),

...

dbt uses the ref function to:

  • Determine the order to run the models by creating a dependent acyclic graph (DAG).
The DAG for our dbt projectThe DAG for our dbt project
  • Manage separate environments — dbt will replace the model specified in the ref function with the database name for the tableIn simplest terms, a table is the direct storage of data in rows and columns. Think excel sheet with raw values in each of the cells. (or view). Importantly, this is environment-aware — if you're running dbt with a target schema named dbt_alice, it will select from an upstream table in the same schema. Check out the tabs above to see this in action.

Additionally, the ref function encourages you to write modular transformations, so that you can re-use models, and reduce repeated code.

Testing and documenting models

You can also document and test models — skip ahead to the section on testing and documentation for more information.

Additional FAQs

Are there any example dbt models?
Can I store my models in a directory other than the `models` directory in my project?
Can I build my models in a schema other than my target schema or split my models across multiple schemas?
Do ref-able resource names need to be unique?
How do I remove deleted models from my data warehouse?
As I create more models, how should I keep my project organized? What should I name my models?
If models can only be `select` statements, how do I insert records?
Why can't I just write DML in my transformations?
How do I specify column types?
0
Loading