TLDR: The paper argues that the “schema turn,” the long-standing practice in computer science conceptual modeling of separating the conceptual schema from its information base, is a temporary evolutionary detour caused by past technological limitations. Modern technology now enables an “inclusive schema-and-base” approach, exemplified by the bCLEARer framework, which uses automated data pipelines to integrate schema and data from the start. By allowing early validation against real data, this makes conceptual modeling practices more efficient, empirically testable, and cost-effective.
In the realm of computer science and information systems development, a prevalent practice known as the “schema turn” has dominated conceptual modeling for decades. This approach emphasizes a conceptual schema that is entirely separate from its underlying information base, often to the extent that the schema itself is mistakenly referred to as the conceptual model. This schema-centric bias is deeply ingrained in database textbooks and mainstream development methodologies.
The research paper, “Disentangling the schema turn: Restoring the information base to conceptual modelling”, delves into the origins and implications of this schema turn. It argues that this separation, while historically understandable due to technological limitations, is not fundamental to conceptual modeling and may represent a temporary evolutionary detour in the field.
Understanding the Schema Turn’s Emergence
The concept of a “conceptual model” first appeared in the 1970s and 80s, initially representing the “real world.” Soon after, this model was divided into the conceptual schema and the information base. However, development practices quickly gravitated towards focusing almost exclusively on the schema. The information base, containing the actual data instances, typically appeared only at the very end of the development cycle as physical data, long after the main conceptual modeling work was completed. This late integration meant that conceptual mistakes related to the data were discovered only at a costly, late stage.
The authors explain that this bias was largely a product of its time. Early computing technology in the 1970s and 80s struggled to scale and handle large volumes of instance-level data during the modeling phase. Greenfield projects, where new systems were built from scratch, also meant that machine-readable information bases were not readily available. Therefore, focusing on a smaller, more manageable schema became a practical necessity, making a virtue out of technical constraints.
A New Perspective: Hylomorphism and Modularity
To analyze this evolution, the paper introduces a framework based on “modularity architectural styles,” drawing inspiration from Aristotle’s hylomorphism. This framework helps to understand how different components of a system (like schema and base) are separated or integrated. It categorizes mixing into three types: separated (no mixing), aggregated (like a collection), and integrated (a true, unified mixture). The schema turn, in this context, represents a “separated” style for the conceptual schema and information base.
Modern Technology Enables an Inclusive Approach
The core argument of the paper is that modern technology has largely removed the historical constraints that necessitated the schema turn. Today, scalable computing power and advanced data engineering tools make it feasible to work with both the conceptual schema and the information base together from the outset. This enables an “inclusive schema-and-base conceptual modeling approach.”
The authors illustrate this with the bCLEARer framework, a data pipeline-based approach they have developed over decades. bCLEARer automates the process of conceptual modeling, allowing for the incremental transformation of sizable information bases alongside their schemas. This contrasts sharply with traditional CASE tools and Model-Driven Engineering (MDE) practices, which remained largely schema-biased and relied on manual, graphical editing.
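The summary does not show bCLEARer's internals, but the inclusive schema-and-base idea it describes can be sketched in a few lines: a pipeline stage carries the schema and its instances together, so a schema change is rolled out across the information base in the same automated step. All names below are hypothetical illustrations, not bCLEARer's actual API.

```python
# Minimal sketch of an inclusive schema-and-base pipeline stage.
# The Model/rename_attribute names are hypothetical, not bCLEARer's API.
from dataclasses import dataclass, field


@dataclass
class Model:
    """A conceptual model keeping schema and information base together."""
    schema: dict                               # entity -> attribute names
    base: list = field(default_factory=list)   # instance rows as dicts


def rename_attribute(model: Model, entity: str, old: str, new: str) -> Model:
    """One pipeline stage: a schema change applied to schema AND base."""
    schema = dict(model.schema)
    schema[entity] = [new if a == old else a for a in schema[entity]]
    base = [
        {(new if k == old else k): v for k, v in row.items()}
        if row.get("_entity") == entity else row
        for row in model.base
    ]
    return Model(schema=schema, base=base)


m = Model(
    schema={"Person": ["name", "dob"]},
    base=[{"_entity": "Person", "name": "Ada", "dob": "1815-12-10"}],
)
m2 = rename_attribute(m, "Person", "dob", "birth_date")
```

Because the stage is a pure function from model to model, stages can be chained into a pipeline and re-run incrementally as the schema evolves, which is the contrast the authors draw with manual, graphical schema editing.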
Benefits of Restoring the Information Base
Adopting an inclusive schema-and-base style, as demonstrated by bCLEARer, brings several significant advantages:
- Automation: The process of rolling out schema changes across the information base becomes automated, drastically reducing costs and human effort.
- Empirical Design: Instead of relying solely on rational analysis, modelers can empirically test conceptual schema changes against the actual information base data from the very beginning. This “shift-left” approach helps identify and correct mistakes earlier, preventing costly rework later in the project.
- Wider Possibilities: It opens up a broader space of conceptual modeling practices, moving beyond the limitations imposed by the schema turn.
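The “shift-left” empirical-design point above can be sketched concretely: a candidate schema change is tested against the actual instance data before being committed, so a mistaken assumption surfaces immediately rather than at rollout. The schema format and data here are hypothetical illustrations.

```python
# Minimal sketch of "shift-left" empirical schema validation:
# check a proposed schema against real instance data early.
# Schema format and sample data are hypothetical illustrations.

def violations(schema: dict, rows: list) -> list:
    """Return (row, missing_attributes) pairs that violate the schema."""
    bad = []
    for row in rows:
        required = schema[row["_entity"]]
        missing = [a for a in required if a not in row]
        if missing:
            bad.append((row, missing))
    return bad


candidate = {"Person": ["name", "email"]}   # proposed schema change
data = [
    {"_entity": "Person", "name": "Ada", "email": "ada@example.org"},
    {"_entity": "Person", "name": "Alan"},  # no email recorded
]
problems = violations(candidate, data)
# The second row exposes the mistaken assumption that every person
# has an email, while the change is still cheap to revise.
```

Running such checks automatically on every schema revision is what turns conceptual modeling from a purely rational exercise into an empirical one.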
The paper suggests that by embracing data pipeline strategies and automation, conceptual modeling can evolve to be more efficient, empirically grounded, and better suited to the complexities of modern information systems. The schema turn, therefore, might indeed be a temporary detour, and the future of conceptual modeling lies in restoring the information base to its central role.


