Semantic Layer Overview
The Semantic Layer is a business-oriented abstraction that sits between your physical data model and end users, making data more accessible and meaningful.
Background
As Large Language Models (LLMs) continue to advance, integrating them into data-driven systems is becoming increasingly important. Coginiti's AI assistance lets users generate SQL queries from natural language by leveraging database schemas. This approach is effective when the physical model is well designed and uses descriptive object names, particularly in data marts modeled as a star schema.
However, many organizations face a different reality: their physical data models have cryptic names and complex, non-intuitive structures, and they lack declared relationships such as foreign keys. In these scenarios, LLMs struggle to generate correct SQL from natural language requests because they have too little context about what the data represents from a business perspective.
The Problem
Consider these common challenges with physical data models:
- Cryptic naming: Tables like TBL_987_XREF or columns like FLD_23_AMT provide no business context
- Missing relationships: Lack of foreign keys means the system cannot understand how tables relate to each other
- Complex structures: Denormalized or over-normalized designs that don't reflect business concepts
- Technical focus: Models optimized for storage or performance rather than business understanding
When business users ask a question like "What were last quarter's sales by region?", the system has to translate it into a query against tables with names like FCT_SLS_DTL joined to DIM_GEO_RGN. That translation is difficult without business context.
The Solution
The Semantic Layer provides a business-oriented abstraction over your physical data model. Instead of forcing users and AI systems to understand technical table structures, you define a semantic model that:
- Assigns descriptive business names to entities and attributes
- Defines relationships between entities explicitly
- Specifies business metrics and calculations
- Organizes data around business concepts rather than technical structures
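The idea can be illustrated with a small, runnable analogy. The sketch below uses Python's built-in sqlite3 module and an ordinary database view as a stand-in for a semantic entity. This is only an analogy, not how Coginiti implements the Semantic Layer. The table names and FLD_23_AMT come from the examples above; the other column names (RGN_ID, RGN_NM, SLS_DT), the view name, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Physical model: cryptic names and no declared foreign keys.
    -- Column names other than FLD_23_AMT are hypothetical.
    CREATE TABLE DIM_GEO_RGN (RGN_ID INTEGER, RGN_NM TEXT);
    CREATE TABLE FCT_SLS_DTL (RGN_ID INTEGER, SLS_DT TEXT, FLD_23_AMT REAL);

    INSERT INTO DIM_GEO_RGN VALUES (1, 'East'), (2, 'West');
    INSERT INTO FCT_SLS_DTL VALUES
        (1, '2024-01-15', 100.0),
        (1, '2024-02-10', 50.0),
        (2, '2024-03-05', 70.0);

    -- Business-oriented abstraction: descriptive names and an explicit join.
    CREATE VIEW sales AS
    SELECT r.RGN_NM     AS region,
           f.SLS_DT     AS sale_date,
           f.FLD_23_AMT AS sales_amount
    FROM FCT_SLS_DTL AS f
    JOIN DIM_GEO_RGN AS r ON r.RGN_ID = f.RGN_ID;
""")

# The question "sales by region" can now be asked in business terms.
sales_by_region = conn.execute(
    "SELECT region, SUM(sales_amount) FROM sales "
    "GROUP BY region ORDER BY region"
).fetchall()
print(sales_by_region)  # [('East', 150.0), ('West', 70.0)]
```

The query at the end never mentions a physical table or column name, which is the property a semantic model is meant to give both users and AI systems.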
Key capabilities
- SQL API exposure: The semantic layer exposes a SQL API, allowing users and tools to query using familiar SQL syntax against business-friendly names.
- Driver support: JDBC/ODBC drivers enable connectivity from various analytics tools and applications.
- Query translation: Coginiti automatically translates queries written against the semantic layer into queries against the underlying physical model.
- Project integration: Semantic models are defined in .smdl files stored within your Coginiti project and referenced from the project.toml file.
- Universal platform support: Works across all database platforms supported by Coginiti.
- Namespace organization: Objects are organized within a logical hierarchy (Connection → Database → Schema).
- Database Explorer integration: Browse semantic models from published projects directly in the Database Explorer.
- Access control: Leverages database-level permissions, mapping visible entities to the semantic layer. A "Semantic User" access level allows sharing from the package hub while restricting direct access to project implementation details.
How It Works
Core concepts
The Semantic Layer is built on several key concepts:
- Entities: Represent business datasets, similar to tables or views. Each entity maps to either a physical table or a SQL query.
- Dimensions: Descriptive attributes that provide context (e.g., customer name, product category, transaction date). Dimensions can be organized into hierarchies for drill-down analysis.
- Measures: Quantitative data you want to analyze (e.g., sales amount, quantity sold). Measures are aggregated when analyzed across dimensions.
- Relationships: Explicit definitions of how entities relate to each other, enabling the system to automatically join data correctly.
- Domains: Subsets of the semantic layer that can be exposed to specific user groups, providing data access control and simplified views.
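To make these concepts concrete, here is a minimal Python sketch that models entities, dimensions, measures, and a relationship as plain data classes and derives a physical query from them. It is not SMDL syntax and not Coginiti's translation logic; the class names, the to_physical_sql helper, and the column names other than FLD_23_AMT are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Dimension:
    name: str    # business name, e.g. "region_name"
    column: str  # physical column it maps to


@dataclass
class Measure:
    name: str         # business name, e.g. "sales_amount"
    column: str       # physical column it maps to
    aggregation: str  # applied when analyzed across dimensions, e.g. "SUM"


@dataclass
class Entity:
    name: str   # business name of the dataset
    table: str  # physical table the entity maps to
    dimensions: Dict[str, Dimension] = field(default_factory=dict)
    measures: Dict[str, Measure] = field(default_factory=dict)


@dataclass
class Relationship:
    left: Entity
    right: Entity
    left_key: str   # join column on the left entity's table
    right_key: str  # join column on the right entity's table


def to_physical_sql(fact: Entity, measure: str,
                    dim_entity: Entity, dimension: str,
                    rel: Relationship) -> str:
    """Build one aggregated query from business names (hypothetical helper)."""
    m = fact.measures[measure]
    d = dim_entity.dimensions[dimension]
    return (
        f"SELECT {dim_entity.table}.{d.column} AS {d.name}, "
        f"{m.aggregation}({fact.table}.{m.column}) AS {m.name} "
        f"FROM {rel.left.table} JOIN {rel.right.table} "
        f"ON {rel.left.table}.{rel.left_key} = {rel.right.table}.{rel.right_key} "
        f"GROUP BY {dim_entity.table}.{d.column}"
    )


sales = Entity("sales", "FCT_SLS_DTL",
               measures={"sales_amount": Measure("sales_amount", "FLD_23_AMT", "SUM")})
region = Entity("region", "DIM_GEO_RGN",
                dimensions={"region_name": Dimension("region_name", "RGN_NM")})
sales_to_region = Relationship(sales, region, "RGN_ID", "RGN_ID")

sql = to_physical_sql(sales, "sales_amount", region, "region_name", sales_to_region)
print(sql)
```

Because the join condition lives in the relationship rather than in each query, a request for "sales_amount by region_name" carries enough information to produce the join without the caller knowing any physical names.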
Integration with AI
The Semantic Layer significantly enhances AI-powered query generation:
- Business context: LLMs receive semantic model definitions that include business names, descriptions, and relationships
- Accurate generation: With clear business context, AI can generate more accurate queries from natural language
- Semantic selection: Users can select specific semantic layers or domains as context for AI assistance
- Natural language to SQL: "Show me last quarter's sales by region" becomes a correct query, even when the underlying tables have cryptic names
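As a rough illustration of what "business context" could look like when handed to an LLM, the sketch below renders a small semantic model into plain text and prepends it to a user question. The dictionary layout, the render_context and build_prompt helpers, and the prompt wording are hypothetical; the source does not describe how Coginiti actually constructs its prompts, and no model is called here.

```python
def render_context(model: dict) -> str:
    """Turn a semantic model (hypothetical dict layout) into prompt text."""
    lines = []
    for entity in model["entities"]:
        lines.append(f"Entity {entity['name']}: {entity['description']}")
        for dim in entity.get("dimensions", []):
            lines.append(f"  dimension {dim['name']}: {dim['description']}")
        for measure in entity.get("measures", []):
            lines.append(f"  measure {measure['name']}: {measure['description']}")
    for rel in model.get("relationships", []):
        lines.append(f"Relationship: {rel['from']} -> {rel['to']}")
    return "\n".join(lines)


def build_prompt(model: dict, question: str) -> str:
    """Combine the rendered model with the user's natural language request."""
    return (
        "Write SQL against the following semantic model.\n\n"
        + render_context(model)
        + "\n\nQuestion: "
        + question
    )


model = {
    "entities": [
        {
            "name": "sales",
            "description": "One row per sales transaction",
            "dimensions": [
                {"name": "sale_date", "description": "Date of the transaction"},
            ],
            "measures": [
                {"name": "sales_amount", "description": "Total sales value"},
            ],
        },
        {
            "name": "region",
            "description": "Geographic sales region",
            "dimensions": [
                {"name": "region_name", "description": "Name of the region"},
            ],
        },
    ],
    "relationships": [{"from": "sales", "to": "region"}],
}

question = "Show me last quarter's sales by region"
prompt = build_prompt(model, question)
print(prompt)
```

The point of the sketch is that the model sees names, descriptions, and relationships expressed in business terms, rather than identifiers such as FCT_SLS_DTL.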
Design Decisions
File organization
Semantic models are stored as .smdl files within your project directory. With this approach, semantic models are:
- Version controlled: Semantic models live alongside your CoginitiScript code in version control
- Project scoped: Each project can have its own semantic layer definitions
- Modular: Entities can be defined in separate files for easier navigation
- Collaborative: Teams can work on different entities simultaneously
Relationship to CoginitiScript
The Semantic Layer and CoginitiScript complement each other:
- CoginitiScript: Transforms raw data into clean, structured data using SQL transformations. It's your data transformation and preparation layer.
- Semantic Layer: Adds business meaning and context to structured data, making it accessible for self-service analytics.
This separation allows:
- Data engineers to focus on building robust data pipelines with CoginitiScript
- Business analysts to define business metrics and concepts in the Semantic Layer
- Both layers to evolve independently while working together
Related documentation
- Semantic Model Reference - Complete SMDL language specification and syntax reference
- CoginitiScript Philosophy and Vision - Understanding the transformation layer that feeds the Semantic Layer
- Architecture - How Semantic Layer fits into Coginiti's overall architecture
Next steps
To start working with the Semantic Layer:
- Learn the specification: Review the Semantic Model Reference to understand the SMDL syntax and capabilities
- See examples: Explore example semantic models to understand best practices
- Build your model: Start defining entities, dimensions, and measures for your data
- Test with AI: Use the Semantic Layer as context for AI-powered query generation