dbt Project Overview
Our dbt (Data Build Tool) project serves as the core analytics ETL pipeline for Pair Team, transforming healthcare data from multiple sources into analysis-ready marts for business intelligence. The project lives in the analytics-etl repository and orchestrates the transformation of raw healthcare data into actionable insights that power our Looker dashboards and operational reporting.
Architecture
The project follows a three-tier medallion architecture that processes data from four primary sources:
Data Sources:
- Arc: Primary care platform data including patient charts, communications, and ECM enrollment profiles
- Elation: Electronic health record (EHR) data with clinical visits, procedures, medications, and lab results
- Zus: Healthcare data aggregation platform providing conditions, encounters, and clinical observations
- Regal: Contact center operations data tracking calls, tasks, and agent activity
Transformation Layers:
- Connectors (models/connectors/): Source-specific staging models that normalize and clean raw data from each platform, using the naming pattern stage_{source}__{table_name}
- Intermediate (models/[domain]/intermediate/): Business logic transformations that join and enrich data across sources, following the pattern int__{domain}_{description}
- Marts (models/[domain]/marts/): Final analytics-ready models materialized as tables for performance, consumed directly by Looker
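To illustrate the connector-layer naming convention, a staging model for a hypothetical Arc patients table might look like the sketch below (the source, table, and column names are illustrative, not taken from the actual project):

```sql
-- models/connectors/arc/stage_arc__patients.sql
-- Hypothetical staging model following the stage_{source}__{table_name} pattern.
with source as (

    select * from {{ source('arc', 'patients') }}

),

renamed as (

    select
        id as patient_id,
        first_name,
        last_name,
        created_at as enrolled_at
    from source

)

select * from renamed
```

The CTE-based select/rename structure shown here is the conventional dbt staging pattern: one model per raw table, limited to renaming, casting, and light cleanup.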
Key Business Domains
The project organizes models into domain-specific schemas that align with business functions:
- Operations: Patient enrollment tracking, churn analysis, engagement metrics, and social needs assessment
- Care Plan: Intervention management and care coordination workflows
- Care Gaps: HEDIS quality measures and care gap identification for preventive care
- Communications: Patient-provider interaction analytics and communication effectiveness
- Population Health: Clinical readings, vital signs, and health outcome tracking
- Risk Stratification: High-risk patient identification and prioritization
- Finance: Billing, authorization tracking, and revenue cycle management
Development Workflow
Engineers work in isolated development environments where models materialize under DBT_DEV_{username} schemas in Snowflake. The workflow emphasizes:
- Local testing with dbt build --select +model_name+ to validate model logic and dependencies
- SQL linting via SQLFluff with the Snowflake dialect and pre-commit hooks to enforce code quality
- Comprehensive documentation: all mart models require column descriptions and data quality tests
- Pull request reviews before merging to main, where models automatically deploy to production
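The documentation and data quality test requirement for mart models is expressed in dbt YAML properties files. A minimal sketch (file path, model, and column names are hypothetical):

```yaml
# models/operations/marts/_marts__models.yml  (illustrative path and names)
version: 2

models:
  - name: patient_enrollment
    description: One row per patient enrollment event.
    columns:
      - name: patient_id
        description: Unique identifier for the enrolled patient.
        tests:
          - not_null
          - unique
      - name: enrolled_at
        description: Timestamp when the patient enrolled.
        tests:
          - not_null
```

Generic tests like not_null and unique run as part of dbt build, so the local-testing step above exercises them alongside model compilation.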
Orchestration
Production models are managed via GitHub Actions: the production environment runs nightly at 3 AM Pacific, executing a full-refresh build of nearly all models. Models materialize under the DBT_PROD schema and connect to Looker for business intelligence consumption. Two models use specialized strategies: stage_zus__observation runs incrementally for performance, and readings_bp_snapshot maintains historical snapshots to filter outdated clinical readings. If the schema of either of these models changes, run dbt against the production target profile with the --full-refresh flag.
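An incremental model such as stage_zus__observation typically follows dbt's standard incremental pattern, sketched below (the unique key and column names are assumptions, not the project's actual definition):

```sql
-- models/connectors/zus/stage_zus__observation.sql  (illustrative sketch)
{{
    config(
        materialized='incremental',
        unique_key='observation_id'
    )
}}

select
    observation_id,
    patient_id,
    observed_at,
    value
from {{ source('zus', 'observation') }}

{% if is_incremental() %}
  -- On incremental runs, only process rows newer than what is already loaded.
  where observed_at > (select max(observed_at) from {{ this }})
{% endif %}
```

This is why a schema change requires --full-refresh: incremental runs only append or merge new rows, so the existing table must be rebuilt from scratch to pick up new or altered columns.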
Technical Stack
The project leverages several dbt packages to extend functionality:
- dbt_utils: Core utility macros and tests
- the_tuva_project: Open-source healthcare analytics models for standardized clinical and claims data transformation
- audit_helper: Data comparison tools for migration and refactoring validation
- dbt_artifacts: Run metadata logging for observability and monitoring
- dbt_project_evaluator: Automated best practice validation for project structure and quality
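These dependencies are declared in packages.yml. A sketch with the public package namespaces (the version ranges below are illustrative; pin to the versions the project actually uses):

```yaml
# packages.yml  (version ranges are illustrative)
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]
  - package: tuva-health/the_tuva_project
    version: [">=0.8.0", "<1.0.0"]
  - package: dbt-labs/audit_helper
    version: [">=0.9.0", "<1.0.0"]
  - package: brooklyn-data/dbt_artifacts
    version: [">=2.0.0", "<3.0.0"]
  - package: dbt-labs/dbt_project_evaluator
    version: [">=0.8.0", "<1.0.0"]
```

Running dbt deps installs the declared packages into the project before builds.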
Development requires Python 3.x, SSH-based Snowflake authentication, and the Power User for dbt VS Code extension for enhanced productivity with inline query execution and visual lineage exploration.
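Local Snowflake credentials live in a dbt profile. Assuming the key-based authentication mentioned above maps to Snowflake key-pair auth in dbt-snowflake, a development profile might look like this (profile name, account, role, database, warehouse, and paths are all placeholders):

```yaml
# ~/.dbt/profiles.yml  (sketch; all identifiers are placeholders)
analytics_etl:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: <account_identifier>
      user: <username>
      private_key_path: <path_to_private_key>
      role: <dev_role>
      database: <database>
      warehouse: <warehouse>
      schema: DBT_DEV_<username>
      threads: 4
```

The schema value mirrors the DBT_DEV_{username} convention described in the development workflow, keeping each engineer's models isolated.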