Coginiti Retail Project Complete Workflow Tutorial

Overview

This tutorial walks you through the complete coginiti-retail data pipeline — from raw source data to automated, scheduled reports — using Coginiti Actions. You will explore each layer of the pipeline, understand how data quality tests are embedded at every step, and orchestrate everything into an automated workflow.

Scenario: Publishing Retail Analytics Pipeline

You're part of a data team working on a retail analytics pipeline.

The challenge: The team needs a reliable, automated pipeline that transforms raw retail data into business reports, with quality checks at every layer and flexible reporting that stakeholders can customize.

Coginiti Actions solves this by allowing you to:

  • Define multi-step data pipelines as reusable actions
  • Embed automated testing directly inside each transformation step
  • Create parameterized reports that run on schedules
  • Orchestrate everything through a single workflow with job dependencies
note

This guide is based on a PostgreSQL database.


Step 1: Reviewing Pipeline Scripts

Context: You need to build a three-layer data pipeline (landing → base → mart) that processes retail transaction data. Each layer is a separate run script so they can be reviewed, tested, and orchestrated independently.

Prerequisites

Before running the pipeline, ensure you have:

  • A coginiti-retail-tutorial project
info

Available here: https://github.com/coginiti-dev/coginiti-retail-tutorial
Read the _instructions file first

  • A PostgreSQL database named retail_csl
  • Database schemas created: landing, stage, base, mart, reports
  • Write access to the project

Reviewing the Landing Layer Script

Start with the simplest run script, which populates the raw data tables.

  1. Navigate to the project's run_scripts/ folder in Coginiti
  2. Open the landing asset
  3. What this does: publication.Run() finds all entities in the landing package with publication metadata and creates or replaces their target tables in the database.
    After publishing, test.Run() automatically validates all source data entities.

script_landing.png
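Based on the description above, the landing run script boils down to two calls. The sketch below is pseudocode only — the exact CoginitiScript import and invocation syntax may differ from what is shown here, so treat the actual asset as the reference:

```
-- Sketch of the landing run script (assumed shape, not verbatim).
-- Publish every entity in the sources package that carries publication metadata...
publication.Run(packages=[sources])
-- ...then discover and execute all #+test sql definitions for those entities.
test.Run(packages=[sources])
```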

Reviewing the Base Layer Script

Next, review the base layer run script, which includes batch tracking for audit trails.

  1. In the run_scripts/ folder, open base
  2. What this does: publication.Run() finds all entities across the base sub-packages (location, touchpoint, customer, offering, datetime, sales) with publication metadata and creates or replaces their tables.
    The blocks=[base.Batch] parameter creates a batch record before processing, stamping every row with a batch number for data lineage tracking.
    After publishing, test.Run() validates all base layer entities automatically.

script_base.png
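The base script follows the same pattern, with one addition: the `blocks=[base.Batch]` parameter described above. Again a pseudocode sketch, assuming the same call shape as the landing script:

```
-- Sketch of the base run script (assumed shape, not verbatim).
-- blocks=[base.Batch] creates a batch record before processing, so every
-- published row is stamped with its batch number for lineage tracking.
publication.Run(packages=[location, touchpoint, customer, offering, datetime, sales],
                blocks=[base.Batch])
test.Run(packages=[location, touchpoint, customer, offering, datetime, sales])
```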

Reviewing the Mart Layer Script

Finally, review the mart layer run script, which builds the analytics-ready tables.

  1. Open mart in the run_scripts/ folder
  2. What this does: publication.Run() finds all entities in the mart package with publication metadata and creates or replaces their tables in the database.

script_mart.png

Running Scripts Manually

To test each layer independently:

  1. Run each run script in sequence:
    • Load landing layer: open landing asset → run
    • Load base layer: open base asset → run
    • Load mart layer: open mart asset → run
  2. Verify tables were created in each schema (landing, base, mart)

running_scripts_landing_base_mart.gif

tip

Best Practices

  • Keep scripts focused: one run script per layer makes debugging easier
  • Import only what you need: explicit imports make dependencies clear
  • Test incrementally: run each layer independently before orchestrating

Step 2: Reviewing Parameterized Reports

Context: Reports need to be flexible — stakeholders run the same report with different date ranges, metrics, and filters. Parameterized run scripts accept runtime values and overwrite a "current state" table on each run.

Prerequisites

  • Mart layer loaded with data
  • Parameter values defined for each report
info

More about parameters in actions: Parameters — Coginiti Actions Reference

Reviewing the Daily Snapshot Report

A rolling-window sales report with a configurable lookback period.

In the reports/ folder, open daily_snapshot:

Key parameters:

  • $snapshot_date — the date for the snapshot (e.g., 2023-01-01)
  • $lookback_days — number of days to look back (e.g., 30)
  • $snapshot_type — type of snapshot (daily, weekly, monthly)
  • Target table: reports.rpt_daily_snapshot_current (overwritten on each run)

daily_snapshot.png
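Inside the report entity, these parameters typically drive the date filter. A minimal SQL sketch of that idea — the table and column names below are illustrative, not taken from the project:

```sql
-- Hypothetical use of $snapshot_date and $lookback_days (values substituted at run time,
-- e.g. DATE '2023-01-01' - INTERVAL '30 days').
SELECT s.sale_date,
       SUM(s.sales_amount) AS total_sales
FROM mart.sales s
WHERE s.sale_date BETWEEN DATE '$snapshot_date' - INTERVAL '$lookback_days days'
                      AND DATE '$snapshot_date'
GROUP BY s.sale_date;
```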

Reviewing the Report Run Script

  1. In the run_scripts/ folder, open report_daily_snapshot
  2. What this does: blocks=[reports.DailySnapshot] runs a specific entity rather than an entire package.
    After publishing, targeted tests verify the snapshot contains data and has no null dates.

script_daily_snapshot.png

Reviewing the Store Rankings Report

A report that ranks stores by a configurable metric.

In the reports/ folder, open store_rankings:

Key parameters:

  • $start_date — start of the ranking period (e.g., 2022-01-01)
  • $end_date — end of the ranking period (e.g., 2022-12-31)
  • $ranking_metric — metric to rank by: sales_amount, transaction_count, or customer_count
  • $top_n — number of top stores to include
  • Target table: reports.rpt_store_rankings_current (overwritten on each run)

store_rankings.png
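A ranking report like this usually combines a window function with a top-N filter. The sketch below shows the shape for `$ranking_metric = 'sales_amount'`; the actual report presumably switches its aggregate depending on the chosen metric (e.g., a count for transaction_count). Table and column names are illustrative:

```sql
-- Hypothetical ranking query: rank stores by total sales in the period, keep the top N.
WITH ranked AS (
    SELECT store_id,
           SUM(sales_amount) AS metric_value,
           ROW_NUMBER() OVER (ORDER BY SUM(sales_amount) DESC) AS store_rank
    FROM mart.sales
    WHERE sale_date BETWEEN DATE '$start_date' AND DATE '$end_date'
    GROUP BY store_id
)
SELECT *
FROM ranked
WHERE store_rank <= $top_n;
```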

Reviewing the Report Run Script

  1. In run_scripts/, open report_store_rankings
  2. What this does: runs the StoreRankings entity specifically, then verifies the result has data and no duplicate ranks.

script_store_rankings.png

Reviewing the Customer Export Report

A report for flexible customer segmentation exports.

In the reports/ folder, open customer_export:

Key parameters:

  • $export_type — segment to export: all, high_value, at_risk, recent
  • $min_orders — minimum order count filter (integer)
  • $min_sales — minimum sales amount filter (integer)
  • $include_contact_info — include email and phone in export: true or false
  • Target table: reports.rpt_customer_export_staging (overwritten on each run)

customer_export.png
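The segmentation parameters translate naturally into a filtered SELECT. A sketch for `$export_type = 'high_value'`, with illustrative table and column names; `$include_contact_info` would additionally gate whether the email and phone columns are selected:

```sql
-- Hypothetical 'high_value' segment: customers meeting both minimum thresholds.
SELECT c.customer_id,
       c.order_count,
       c.total_sales
FROM mart.customers c
WHERE c.order_count >= $min_orders
  AND c.total_sales >= $min_sales;
```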

Reviewing the Report Run Script

  1. In run_scripts/, open report_customer_export
  2. What this does: runs the CustomerExport entity specifically, then verifies the result has data and no duplicate customers.

script_customer_export.png

Running Parameterized Reports Manually

  1. Run each report run script in sequence:
    • Open report_daily_snapshot → run → provide parameter values in the pop-up
    • Open report_store_rankings → run → provide parameter values in the pop-up
    • Open report_customer_export → run → provide parameter values in the pop-up
  2. Verify tables were created in the reports schema

running_scripts_parametrized_reports.gif


Step 3: Validating with Tests

Context: In this tutorial, data quality tests are integrated into the execution scripts and run automatically. However, depending on your use case, you may choose to create dedicated test scripts as well.
Tests in Coginiti return rows only when they fail, making it easy to spot issues.

warning

An action stops immediately upon any failure, tests included.
We are working on advanced error handling for future Coginiti releases — stay tuned!

How Tests Are Organized

Test definitions (#+test sql) live in _test files alongside the source entities they validate:

  • sources/customers_test, sources/stores_test, sources/products_test, sources/inbound_channels_test
  • base/touchpoint/email_address_test, base/touchpoint/email_domain_test, base/touchpoint/postal_address_test, base/touchpoint/postal_code_test
  • base/sales/sales_header_test
  • reports/daily_snapshot_test, reports/store_rankings_test, reports/customer_export_test

Reviewing Source Data Tests

  1. Navigate to the sources/ folder
  2. Open customers_test
  3. What this checks:
    Test_Customers_ValidEmail() — returns any customer IDs with invalid email format (should return no rows)
    Test_Customers_UniqueID() — returns any customer IDs that appear more than once (should return no rows)
  4. Other _test assets follow the same pattern

customers_test.png
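Since tests return rows only when they fail, the two checks above can be expressed as plain queries over the source table. A sketch — the real table, column names, and email pattern in the project may differ:

```sql
-- Test_Customers_ValidEmail (sketch): any row returned is a failure.
SELECT customer_id, email
FROM landing.customers
WHERE email NOT LIKE '%_@_%._%';

-- Test_Customers_UniqueID (sketch): any ID appearing more than once fails.
SELECT customer_id, COUNT(*) AS occurrences
FROM landing.customers
GROUP BY customer_id
HAVING COUNT(*) > 1;
```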

How Tests Run in the Pipeline

Tests are not run manually — they execute automatically inside each run script:

  • run_scripts/landing runs test.Run(packages=[sources]) — discovers and executes all #+test sql definitions in the sources package
  • run_scripts/base runs test.Run(packages=[location, touchpoint, customer, offering, datetime, sales]) — discovers and executes all tests across base sub-packages
  • run_scripts/report_* each run test.Run(tests=[...]) — explicitly lists the specific tests for that report

script_landing_tests.png

Reviewing Targeted Report Tests

Report run scripts list their tests explicitly rather than discovering an entire package:

  1. Review report_daily_snapshot in run_scripts/
  2. What's different: tests=[reports.Test_DailySnapshot_HasData, reports.Test_DailySnapshot_NoNullDates] explicitly names which tests to run, giving precise control over report validation.

script_daily_snapshot_test.png

Report tests for store_rankings and customer_export follow the same pattern.
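For example, the no-duplicate-ranks check for store rankings could be written against the target table named earlier. The rank column name here is an assumption:

```sql
-- Hypothetical Test_StoreRankings_NoDuplicateRanks: rows returned = failures.
SELECT store_rank, COUNT(*) AS stores_with_rank
FROM reports.rpt_store_rankings_current
GROUP BY store_rank
HAVING COUNT(*) > 1;
```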

Interpreting Test Results

When you run any run script, test results appear in the output:

  • No rows returned — all tests passed ✅
  • Rows returned — tests failed ❌ (the rows show the failing records)

running_tests_success.gif running_tests_failure.gif

tip

Best Practices

  • Use descriptive test names that reflect what is being validated, ending with the _test suffix
  • Check business rules: test for nulls, duplicates, and referential integrity
  • Use #+meta { :doc } to explain what each test validates

Step 4: Reviewing the Orchestration Action

Context: Now that all run scripts are understood, the team orchestrates them into an automated workflow with job dependencies and a schedule.

info

For a step-by-step guide to creating actions: Creating Coginiti Actions
For field reference: Coginiti Actions Reference

Reviewing the Action File

  1. In the project, navigate to the .actions/ folder
  2. Open retail_action.toml
  3. The file is organized into these sections:

[general] — Required. name and description of the action.
[environment] — Optional. Specifies which project environment to use for connection setup. The id must match an environment defined in project.toml.

info

More about environments: Coginiti Environments

[schedule] — Optional. Scheduling configuration. When present, cron and misfire_policy are required; timezone is optional but recommended.

note

If a non-IANA timezone is used, you will see a DST notification in the Scheduler:
action_schedule_dst_notification.png

[job.<job_name>] — Required. A job groups steps that execute together.
depends_on = [...] — Optional. Lists jobs that must complete before this job starts.
steps = [...] — Required. Step names executed in the listed order.
<step_name> = { command, asset } — Required. Step definition with run_asset command and asset path.
parameters = { ... } — Optional. Key-value pairs passed to the asset. If defined here, they override any external parameters from scheduled events.

retail_workflow.png

The Complete Action File

ETL Pipeline Jobs

[general]
name = "retail_workflow"
description = "This test workflow will be used for demo purposes, based on coginiti-retail-tutorial data that can be found here: https://github.com/coginiti-dev/coginiti-retail-tutorial"

[environment]
id = "prod"

[schedule]
cron = "0 15 20 * * ?" # sec min hour day month dayOfWeek — runs at 20:15 daily
misfire_policy = "RUN_IMMEDIATELY"
timezone = "Europe/Helsinki" # IANA timezone with DST rules applied

[job.landing]
steps = ["landing"]
landing = { command = "run_asset", asset = "/run_scripts/landing" }

[job.base]
depends_on = ["landing"]
steps = ["base"]
base = { command = "run_asset", asset = "/run_scripts/base" }

[job.mart]
depends_on = ["base"]
steps = ["mart"]
mart = { command = "run_asset", asset = "/run_scripts/mart" }

This creates the pipeline: landing (with source tests) → base (with base tests) → mart

Report Jobs

[job.reports]
depends_on = ["mart"]
steps = ["report_daily_snapshot", "report_store_rankings", "report_customer_export"]
report_daily_snapshot = { command = "run_asset", asset = "/run_scripts/report_daily_snapshot", parameters = { "$snapshot_date" = "2023-10-01", "$lookback_days" = "30", "$snapshot_type" = "monthly" } }
report_store_rankings = { command = "run_asset", asset = "/run_scripts/report_store_rankings", parameters = { "$start_date" = "2022-01-01", "$end_date" = "2022-12-31", "$ranking_metric" = "sales_amount", "$top_n" = "50" } }
report_customer_export = { command = "run_asset", asset = "/run_scripts/report_customer_export", parameters = { "$export_type" = "high_value", "$min_orders" = "10", "$min_sales" = "5000", "$include_contact_info" = "true" } }

This runs the three report scripts sequentially after mart completes, each with its own parameters.

Resulting dependency chain:

landing → base → mart → reports
├── report_daily_snapshot ($snapshot_date, $lookback_days, $snapshot_type)
├── report_store_rankings ($start_date, $end_date, $ranking_metric, $top_n)
└── report_customer_export ($export_type, $min_orders, $min_sales, $include_contact_info)

The bottom of the file contains an annotated template showing every supported field — the same template appears in every newly created action file:

retail_workflow_example.png

note

All parameter values must be strings in TOML (use quotes):
parameters = { "$start_date" = "2022-01-01", "$top_n" = "50", "$include_contact_info" = "true" } — correct
parameters = { "$start_date" = 2022-01-01, "$top_n" = 50, "$include_contact_info" = true } — unquoted values will error

Asset paths must start with a leading slash:
asset = "/run_scripts/report_daily_snapshot" — correct
asset = "run_scripts/report_daily_snapshot" — missing leading slash will error


Step 5: Scheduling and Automation

To execute the entire pipeline on a schedule:

  1. Update the [schedule] block in retail_action.toml to meet your needs and save the changes
  2. Go to Scheduling → Manage Scheduled Events
  3. Click Add
  4. Name — choose a name for this scheduled event
  5. Scheduled Object — select retail_action.toml from the retail project
  6. Connection, Database, Schema — filled automatically from the [environment] field.
    If not configured: choose your Connection and set Database = retail_csl
  7. Configure If Schedule Missed and Notifications as needed
  8. Parameters — enable Parse Parameters and select the $<name> option. Parameters are loaded automatically from the action file.
  9. Click Save, then Start

action_scheduling.gif

The scheduled action starts at the configured time and sends a notification:

notification_in_progress.png

The notification updates to reflect success or failure:

notification_success.png notification_failure.png

You can open each step to inspect its result:

notification_review_steps.png


tip

For more information about other Coginiti features, visit our documentation.

Thank you for choosing us!