dbt (data build tool) and Snowflake are arguably the most powerful combination in the modern data stack. This guide documents the production architecture we deploy on enterprise engagements — tested across 40+ Snowflake implementations.
Recommended Project Structure
We recommend a four-layer architecture: raw (source replicas as loaded by your ingestion tool, never transformed in dbt), staging (light cleaning and typing, one model per source table), intermediate (reusable business logic), and marts (consumption-ready tables for BI tools). This structure makes lineage clear and keeps transformation logic composable.
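As a sketch of the staging layer's "light cleaning and typing" contract, here is an illustrative staging model (source, table, and column names are assumptions, not part of the guidance above):

```sql
-- models/staging/stg_orders.sql (illustrative)
-- Staging models only rename, cast, and lightly clean -- no business logic.
with source as (
    select * from {{ source('raw_shop', 'orders') }}
)
select
    order_id::integer        as order_id,
    customer_id::integer     as customer_id,
    lower(status)            as order_status,
    amount::number(12, 2)    as order_amount,
    loaded_at::timestamp_ntz as loaded_at
from source
```

Keeping staging models this thin means every downstream intermediate and mart model reads from a consistently typed surface rather than from raw source quirks.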
Snowflake Warehouses for dbt
Use separate virtual warehouses for development, CI/CD runs, and production. Size development at XS (it's cheap), CI at S, and production at M or L depending on model complexity. Set auto-suspend to 60 seconds so idle warehouses stop accruing credits. Never share a warehouse between dbt and BI tools — the workload profiles are incompatible.
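The warehouse setup above can be expressed as Snowflake DDL along these lines (warehouse names are illustrative):

```sql
-- One warehouse per workload; auto-suspend after 60 seconds of idle time.
create warehouse if not exists dbt_dev
    warehouse_size = 'XSMALL'
    auto_suspend = 60
    auto_resume = true
    initially_suspended = true;

create warehouse if not exists dbt_ci
    warehouse_size = 'SMALL'
    auto_suspend = 60
    auto_resume = true
    initially_suspended = true;

create warehouse if not exists dbt_prod
    warehouse_size = 'MEDIUM'   -- or LARGE, depending on model complexity
    auto_suspend = 60
    auto_resume = true
    initially_suspended = true;
```

Separate warehouses also make cost attribution trivial, since Snowflake meters credits per warehouse.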
CI/CD with GitHub Actions
Our standard pipeline: on every PR, run dbt build --select state:modified+ against a CI schema on a dedicated Snowflake warehouse, pointing --state (and --defer) at the production run's manifest so dbt can diff against it. This builds only affected models and their downstream dependencies — fast and cost-efficient. On merge to main, deploy to production with dbt build and post Slack notifications on failure.
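A minimal sketch of the PR job as a GitHub Actions workflow — the secret names, the S3 location of the production manifest, and the workflow filename are assumptions about your setup:

```yaml
# .github/workflows/dbt_ci.yml (sketch)
name: dbt CI
on:
  pull_request:
jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install dbt-snowflake
      # Fetch the production manifest so state:modified+ has something
      # to diff against (assumed to live in S3 here).
      - run: aws s3 cp s3://your-artifact-bucket/prod/manifest.json state/manifest.json
      # Build only modified models and their downstream dependencies,
      # deferring unbuilt upstreams to production.
      - run: dbt build --select state:modified+ --defer --state state --target ci
        env:
          SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
          SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
```

The merge-to-main job is the same shape without the selector: a plain dbt build against the production target, plus a Slack notification step on failure.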
Cost Governance
Tag every query dbt issues with a query_tag (dbt's Snowflake adapter supports this in profile and model config) so spend can be attributed to jobs and models in query history. Note that Snowflake reports credits per warehouse, not per query, so alert on the ACCOUNT_USAGE WAREHOUSE_METERING_HISTORY view — or attach a resource monitor — if a dbt warehouse exceeds 1000 credits in a run window. Use dbt source freshness to avoid running expensive models on stale upstream data.
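As an illustration of the credit check, a query like the following could back an alert (the warehouse names and the one-day window are assumptions; adjust the threshold to your job cadence):

```sql
-- Credits consumed by the dbt warehouses over the last day.
-- ACCOUNT_USAGE views can lag by up to a few hours.
select
    warehouse_name,
    sum(credits_used) as credits
from snowflake.account_usage.warehouse_metering_history
where start_time >= dateadd('day', -1, current_timestamp())
  and warehouse_name in ('DBT_CI', 'DBT_PROD')  -- assumed names
group by warehouse_name
having sum(credits_used) > 1000;                -- alert threshold
```

Any row returned means a warehouse blew past the threshold and the on-call channel should hear about it.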
Testing Strategy
Every staging model should have not_null and unique tests on its primary key. Every mart model should have at least one relationships test back to its upstream models. Use dbt-expectations for business-logic validation. Aim for 100% primary-key coverage and 80%+ field coverage on marts.
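The testing baseline above can be sketched as a dbt schema file — model and column names here are illustrative, not prescriptive:

```yaml
# models/schema.yml (illustrative)
version: 2
models:
  - name: stg_orders
    columns:
      - name: order_id
        tests:
          - not_null        # primary-key coverage on every staging model
          - unique
  - name: fct_orders        # a mart model
    columns:
      - name: customer_id
        tests:
          - relationships:  # at least one relationships test per mart
              to: ref('stg_customers')
              field: customer_id
      - name: order_amount
        tests:
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0  # business-logic validation via dbt-expectations
```

The dbt_expectations test assumes the dbt-expectations package is declared in packages.yml; recent dbt versions also accept data_tests as the key in place of tests.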