Semantic Model Reference
Complete technical reference for the Semantic Model Specification language used to define semantic layers in Coginiti.
Overview
The Semantic Model Specification (SMDL) is a domain-specific language for defining semantic layers within Coginiti projects. SMDL uses HCL-like syntax, which provides a balance between flexibility, readability, and features.
Semantic layers are defined in files with the .smdl extension (Semantic Model Definition Language) and stored within a project directory. Each entity can be defined in a separate file or multiple entities can be defined in a single file.
File structure
Directory organization
Semantic model files should be organized in a dedicated directory within your project:
retail_foundation/ # Project folder
├── base/ # CoginitiScript packages
├── semantic/ # Semantic layer folder
│ ├── dim_persona.smdl
│ ├── dim_offering_item.smdl
│ ├── fact_sales_header.smdl
│ └── fact_sales_detail.smdl
├── project.toml
└── linter.toml
For large projects, organize by subject area:
retail_foundation/
└── semantic/
├── sales/ # Subject area
│ ├── fact_sales_header.smdl
│ └── fact_sales_detail.smdl
├── customer/ # Subject area
│ └── dim_persona.smdl
└── offering/ # Subject area
└── dim_offering_item.smdl
File extension
All semantic model files must use the .smdl extension.
Project configuration
Semantic layers are registered in the project.toml file using the [semantic] section.
Semantic section structure
[semantic]
default = "all"
[semantic.all]
name = "Full Data"
path = "semantic"
environment = "prod"
[semantic.sales]
name = "Sales Domain"
path = "semantic"
environment = "prod"
Configuration parameters
| Parameter | Required | Description |
|---|---|---|
default | Yes | The default semantic layer to use |
name | Yes | Display name for the semantic layer |
path | Yes | Path to the directory containing .smdl files |
environment | Yes | Project environment to use for this semantic layer |
Entity
An entity represents a dataset within the semantic layer, conceptually similar to a table or view in SQL.
Syntax
entity "entity_name" {
label = "Display Name"
description = "Entity description"
owner = "email@example.com"
table_name = "schema.table"
# OR
query = <<-EOF
SELECT ...
EOF
dimension "dim_name" { ... }
measure "measure_name" { ... }
relation "relation_name" { ... }
}
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
label | string | No | User-friendly name displayed in the UI |
description | string | No | Entity description. Supports Markdown formatting |
owner | string | No | Email address of the person responsible for this entity |
table_name | string | Conditional | Fully-qualified table name (e.g., schema.table). Required if query is not specified |
query | string | Conditional | SQL query defining the entity. Required if table_name is not specified. Use heredoc syntax (<<-EOF) for multi-line queries |
dimension | block | No | Dimension definitions (can have multiple) |
measure | block | No | Measure definitions (can have multiple) |
relation | block | No | Relationship definitions (can have multiple) |
Table-based entity example
entity "sales_header" {
label = "Sales Header Fact"
description = "Contains sales header data"
owner = "data-team@company.com"
table_name = "base.fct_sales_header"
dimension "sales_header_key" {
label = "Sales Header Key"
type = "number"
expr = "sales_header_key"
}
}
Query-based entity example
entity "sales_detail" {
label = "Sales Detail Fact"
description = "Contains sales detail data with transformations"
query = <<-EOF
SELECT
id AS sales_detail_key,
header_key AS sales_header_key,
product_key,
unit_quantity,
unit_price_amount
FROM
fact_sales_detail
WHERE
is_deleted = FALSE
EOF
}
Dimension
A dimension is an entity attribute that provides descriptive context for measures. Dimensions are used for filtering, grouping, and organizing data.
Syntax
dimension "dimension_name" {
label = "Display Name"
type = "text"
description = "Dimension description"
hidden = false
expr = "column_name"
}
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
label | string | No | User-friendly name displayed in the UI |
type | string | Yes | Data type. Valid values: text, number, date, datetime, bool |
description | string | No | Description of the dimension's semantic meaning |
hidden | boolean | No | When true, hides the dimension from the UI. Default: false |
expr | string | Yes | SQL expression that calculates the dimension value |
Simple dimension example
dimension "first_name" {
label = "First Name"
type = "text"
expr = "first_name"
}
Calculated dimension example
dimension "full_name" {
label = "Full Name"
type = "text"
description = "Customer's full name (first and last)"
expr = "CONCAT(first_name, ' ', last_name)"
}
Date dimension example
dimension "transaction_date" {
label = "Transaction Date"
type = "date"
expr = "DATE(transaction_datetime)"
}
Measure
A measure represents quantitative data that you want to analyze. Measures are aggregated when analyzed across dimensions.
Syntax
measure "measure_name" {
label = "Display Name"
type = "number"
description = "Measure description"
hidden = false
aggregation_type = "sum"
expr = "column_name"
}
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
label | string | No | User-friendly name displayed in the UI |
type | string | Yes | Data type. Valid values: number, date, datetime |
description | string | No | Description of the measure's semantic meaning |
hidden | boolean | No | When true, hides the measure from the UI. Default: false |
aggregation_type | string | No | Aggregation function. Valid values: count, count_distinct, sum, avg, max, min, median, stdev, stdevp, var, varp, custom |
expr | string | Yes | SQL expression that calculates the measure value |
Aggregation types
| Aggregation Type | SQL Function | Additive | Description |
|---|---|---|---|
count | COUNT(*) | Yes | Count of rows |
count_distinct | COUNT(DISTINCT ...) | No | Count of unique values |
sum | SUM(...) | Yes | Sum of values |
avg | AVG(...) | No | Average of values |
max | MAX(...) | No | Maximum value |
min | MIN(...) | No | Minimum value |
median | MEDIAN(...) | No | Median value |
stdev | STDDEV(...) | No | Standard deviation |
stdevp | STDDEV_POP(...) | No | Population standard deviation |
var | VAR(...) | No | Variance |
varp | VAR_POP(...) | No | Population variance |
custom | (expression) | Varies | Custom aggregation expression |
Simple measure example
measure "unit_quantity" {
label = "Unit Quantity"
type = "number"
expr = "unit_quantity"
aggregation_type = "sum"
}
Calculated measure example
measure "sales_amount" {
label = "Sales Amount"
type = "number"
description = "Total sales amount (quantity × price)"
expr = "unit_quantity * unit_price_amount"
aggregation_type = "sum"
}
Custom aggregation example
measure "average_discount_rate" {
label = "Average Discount Rate"
type = "number"
description = "Weighted average discount rate"
aggregation_type = "custom"
expr = "SUM(discount_amount) / NULLIF(SUM(original_amount), 0)"
}
Additive vs. non-additive measures
Additive measures can be aggregated multiple times and still produce correct results:
countsum
Non-additive measures cannot be rolled up multiple times as it produces incorrect results:
count_distinctavgmedian- Standard deviation and variance measures
This distinction is important for pre-aggregation optimization strategies.
Relationship
Relationships define joins between entities, allowing the system to access attributes from related entities. The system uses LEFT JOIN by default. For INNER JOIN behavior, add an IS NOT NULL condition in the join expression.
Syntax
relation "relation_name" {
label = "Display Name"
type = "many_to_one"
expr = "left_table.key = right_table.key"
}
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
label | string | No | User-friendly name displayed in the UI |
type | string | Yes | Relationship cardinality. Valid values: one_to_one, one_to_many, many_to_one |
expr | string | Yes | SQL join condition expression |
Relationship types
| Type | Description | Example |
|---|---|---|
one_to_one | Each record in the source entity corresponds to exactly one record in the target entity | Order to order details (for single-item orders) |
one_to_many | Each record in the source entity corresponds to multiple records in the target entity | Customer to orders |
many_to_one | Multiple records in the source entity correspond to one record in the target entity | Orders to customer |
Many-to-one relationship example
entity "sales_header" {
table_name = "fact_sales_header"
relation "persona_dimension" {
label = "Persona"
type = "many_to_one"
expr = "sales_header.persona_key = dim_persona.persona_key"
}
dimension "persona_key" {
label = "Persona Key"
type = "number"
expr = "persona_key"
}
}
One-to-many relationship example
entity "sales_header" {
table_name = "fact_sales_header"
relation "sales_details" {
label = "Sales Details"
type = "one_to_many"
expr = "sales_header.sales_header_key = sales_detail.sales_header_key"
}
}
Inner join example
relation "persona_dimension" {
label = "Persona"
type = "many_to_one"
expr = "sales_header.persona_key = dim_persona.persona_key AND dim_persona.persona_key IS NOT NULL"
}
Data types
The semantic model defines its own set of data types that map to SQL data types.
Semantic data types
| Semantic Type | Description | Example Values |
|---|---|---|
text | Text and character data | "John Doe", "Product A" |
number | Numeric data (integers, decimals, floats) | 42, 3.14159, 1000000 |
bool | Boolean values | TRUE, FALSE |
date | Calendar dates | 2024-01-15 |
datetime | Timestamps with date and time | 2024-01-15 14:30:00 |
SQL to semantic type mapping
| SQL Data Type | Semantic Type |
|---|---|
VARCHAR, CHAR, TEXT, STRING | text |
INTEGER, BIGINT, SMALLINT, TINYINT | number |
DECIMAL, NUMERIC, FLOAT, DOUBLE, REAL | number |
BOOLEAN, BOOL | bool |
DATE | date |
TIMESTAMP, DATETIME, TIMESTAMPTZ | datetime |
Complete example
Here's a complete example showing multiple entities with dimensions, measures, and relationships:
// Customer dimension entity
entity "dim_persona" {
table_name = "dim_persona"
label = "Persona Dimension"
description = "Customer dimension containing persona information"
owner = "data-team@company.com"
dimension "persona_key" {
label = "Persona Key"
type = "number"
expr = "persona_key"
}
dimension "first_name" {
label = "First Name"
type = "text"
expr = "first_name"
}
dimension "last_name" {
label = "Last Name"
type = "text"
expr = "last_name"
}
dimension "full_name" {
label = "Full Name"
type = "text"
expr = "CONCAT(first_name, ' ', last_name)"
}
}
// Product dimension entity
entity "dim_product" {
table_name = "dim_product"
label = "Product Dimension"
description = "Product dimension containing offering information"
dimension "product_key" {
label = "Product Key"
type = "number"
expr = "product_key"
}
dimension "product_name" {
label = "Product Name"
type = "text"
expr = "product_name"
}
}
// Sales fact entity
entity "sales_detail" {
table_name = "base.fct_sales_detail"
label = "Sales Detail Fact"
description = "Contains detailed sales transaction data"
owner = "analytics@company.com"
// Dimensions
dimension "sales_detail_key" {
label = "Sales Detail Key"
type = "number"
expr = "sales_detail_key"
}
dimension "sales_header_key" {
label = "Sales Header Key"
type = "number"
expr = "sales_header_key"
}
dimension "product_key" {
label = "Product Key"
type = "number"
expr = "product_key"
}
// Measures
measure "unit_quantity" {
label = "Unit Quantity"
type = "number"
description = "Number of units sold"
expr = "unit_quantity"
aggregation_type = "sum"
}
measure "unit_price_amount" {
label = "Unit Price Amount"
type = "number"
description = "Price per unit"
expr = "unit_price_amount"
aggregation_type = "avg"
}
measure "sales_amount" {
label = "Sales Amount"
type = "number"
description = "Total sales amount (quantity × price)"
expr = "unit_quantity * unit_price_amount"
aggregation_type = "sum"
}
// Relationships
relation "sales_header" {
label = "Sales Header"
type = "many_to_one"
expr = "sales_detail.sales_header_key = sales_header.sales_header_key"
}
relation "dim_products" {
label = "Product"
type = "many_to_one"
expr = "sales_detail.product_key = dim_products.product_key"
}
}
Best practices
File organization
- One entity per file: Makes navigation and maintenance easier
- Use subject area folders: For large projects, organize by business domain
- Consistent naming: Use lowercase with underscores for file names (e.g.,
dim_customer.smdl)
Naming conventions
- Entity names: Use descriptive business names (e.g.,
sales_detail, notfct_sls_dtl) - Dimension/measure names: Use clear, business-friendly labels
- Relationship names: Describe the target entity or the relationship purpose
Performance considerations
- Use table_name when possible: Query-based entities add overhead
- Mark calculated measures as non-additive: Helps with pre-aggregation optimization
- Keep expressions simple: Complex calculations can impact query performance
Documentation
- Add descriptions: Use the description field to explain business meaning
- Specify owners: Helps users know who to contact about data questions
- Document calculated fields: Explain the logic in calculated dimensions/measures
Related documentation
- Semantic Layer Overview - Understanding the concepts behind semantic layers
- CoginitiScript Reference - Transform data before applying semantic models