Designing the Mental Model: The Thinking Framework Behind Knowledge Graphs

Welcome to the thinking phase, the quiet step before any data pipeline starts, where we give our system a way to think clearly and purposefully.

Because a smart graph doesn't begin with code or data. It begins with understanding.

1. Quick Recap - Where We Left Off

Last time, we explored how Knowledge Graphs and Large Language Models learn differently.

👉 Read the previous article: Knowledge Acquisition in AI: How Knowledge Graphs and LLMs Learn Differently

One learns by connecting facts and mapping meaning.
The other learns by absorbing patterns and feeling context through words.

Together, they give AI both clarity and creativity.

But before we combine them, we need a thinking map, a framework that helps us move from ideas to implementation.

That's where this guide will help you.
We're not building the graph yet.
We're building the mental model behind it.

2. The Idea of a Mental Model - Thinking Before Building

2.1 What It Means

A mental model is simply the way we think about how something works.
For Knowledge Graphs, it's how we decide what matters, how things connect, and what "Knowledge" actually means for our system.

Without it, we collect data but lose direction. With it, we know what to look for, what to link, and why it matters.

2.2 Why It Matters

This step is easy to skip but impossible to replace.
It aligns business goals with data, turns raw inputs into structured meaning, and helps every technical decision make sense.

In short, the mental model is the bridge between understanding and building.
It defines how intelligence will form inside the system - from goals to data, from structure to reasoning.

To build this framework, we'll borrow a proven approach from the world of data science, the CRISP-DM model, and adapt it for Knowledge Graphs.

This model gives us a way to think before we build.
A structured path that moves from business goals to data understanding, data preparation, and finally graph modeling.

3. The Framework - From Business Goals to Graph Model

Before we start building nodes and relationships, we need a clear route.

That route is the framework - the thinking map that connects why, what, and how.

Every Knowledge Graph starts with a purpose.
Why are we building it? Who will use it? What decisions will it help make?

When we answer these questions first, everything that follows - the data we collect, the way we model it, and even the queries we run - begins to make sense.

3.1 The Four Phases at a Glance

Think of this as the "blueprint" for our graph's mind.
It has four simple, repeatable steps:

Business Understanding - Define the goal, the people, and the questions the graph must answer.
Data Understanding - Explore the data sources, formats, and meaning behind them.
Data Preparation - Clean, align, and annotate the data so it's ready to connect.
Knowledge Graph Model Creation or Update - Turn everything into a living graph that can reason and grow.

These steps form the mental bridge from Problem to Representation.
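One way to keep the four phases honest is to write them down as a small checklist. The sketch below is only an illustration: the `GraphPlan` class, its method names, and the sample notes are invented for this example and are not part of any library.

```python
from dataclasses import dataclass, field
from typing import Optional

# The four phases from the list above, in order.
PHASES = [
    "Business Understanding",
    "Data Understanding",
    "Data Preparation",
    "Knowledge Graph Model Creation or Update",
]


@dataclass
class GraphPlan:
    """Hypothetical planning record: one list of notes per phase."""

    notes: dict = field(default_factory=lambda: {phase: [] for phase in PHASES})

    def add_note(self, phase: str, note: str) -> None:
        if phase not in self.notes:
            raise ValueError(f"unknown phase: {phase}")
        self.notes[phase].append(note)

    def next_open_phase(self) -> Optional[str]:
        """Return the earliest phase that has no notes yet, or None if every phase has some."""
        for phase in PHASES:
            if not self.notes[phase]:
                return phase
        return None


plan = GraphPlan()
# Made-up notes, only to show the flow.
plan.add_note("Business Understanding", "Goal: answer questions about local businesses")
plan.add_note("Business Understanding", "Users: analysts")
print(plan.next_open_phase())  # Data Understanding
```

The point of the design is the ordering: the plan never points at a later phase while an earlier one is still empty.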

3.2 Why This Framework Works

This structure is adapted from a tried-and-true process in data science called CRISP-DM, short for Cross Industry Standard Process for Data Mining.
It was originally designed to bring order to messy analytics projects.
We're using it here because Knowledge Graphs face the same challenge: plenty of data, but unclear direction.

By following this flow, we make sure our graph doesn't just collect facts.
It learns with purpose.
It grows in context.

Let's break down the CRISP-DM model to understand what it really means and how we can make it work for Knowledge Graphs.

4. Understanding CRISP-DM - The Blueprint for Thinking

Before we bring data into our graph, we need a way to think through it.
That’s what CRISP-DM gives us.

It stands for Cross Industry Standard Process for Data Mining - a simple but powerful framework that has guided data science for over two decades. At its heart, CRISP-DM is about turning a vague idea into a working, data-driven system through a series of clear, logical steps.

4.1 The Six Phases of CRISP-DM

Here’s how it works in its original form:

Business Understanding - Define what we’re trying to achieve.
Data Understanding - Explore and familiarize ourselves with the data.
Data Preparation - Clean and shape the data so it’s ready for modeling.
Modeling - Build and test the model.
Evaluation - Check if the model meets the original goals.
Deployment - Put it into production.

4.2 Why It Fits Knowledge Graphs

When we apply CRISP-DM to Knowledge Graphs, we don’t use all six steps in one go.
Our focus is on the first four because that’s where thinking turns into structure.
The first four stages help us:

Connect business goals to data reality.
Prepare information for semantic modeling.
Choose the right graph representation - RDF or LPG - based on the use case.

It keeps us from jumping straight into the technical build without first understanding why and what we’re building.
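The relationship between the six original phases and the four we keep can be sketched in a few lines. Treating "Knowledge Graph Model Creation or Update" as the graph counterpart of the Modeling phase is this sketch's reading of the framework, and the variable names are made up for illustration.

```python
# The six CRISP-DM phases in their original order.
CRISP_DM_PHASES = [
    "Business Understanding",
    "Data Understanding",
    "Data Preparation",
    "Modeling",
    "Evaluation",
    "Deployment",
]

# Assumption: the fourth phase is renamed to make the graph-specific goal explicit.
KG_NAMES = {"Modeling": "Knowledge Graph Model Creation or Update"}

# Keep the first four phases and apply the rename where one exists.
kg_phases = [KG_NAMES.get(phase, phase) for phase in CRISP_DM_PHASES[:4]]

for step, phase in enumerate(kg_phases, start=1):
    print(f"{step}. {phase}")
```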

So now that we know how to think before building, it’s time to decide how our knowledge will live inside the graph.
That choice shapes everything: the structure, the reasoning power, and even how the graph grows over time.
Let’s look at the two main ways to represent knowledge and how to pick the one that fits your purpose.

5. Choosing the Right Model - RDF or LPG

Now that we’ve shaped how the system should think, the next question is simple: how should our knowledge actually live inside the graph?

The answer lies in the model we choose.
There are two main ways to represent knowledge: RDF and LPG (Property Graphs).

5.1 RDF - The Meaning Builder

RDF stands for Resource Description Framework. It’s a standard way of describing data so that machines can understand and share it easily. We can think of it as a language of facts.

RDF stores information as triples - simple statements that follow a pattern: Subject → Predicate → Object
or simply, thing → relationship → thing.

For example:

Emma owns a Coffee Shop.
Coffee Shop is located in London.

Each of these is a small fact. When you connect many of them, you get a web of meaning, a knowledge network that machines can reason about.
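To make the triple pattern concrete, here is a minimal sketch in plain Python. It is not a real RDF store: real RDF identifies things with IRIs, while this example uses short strings, and the predicate names `owns` and `locatedIn` are chosen only for illustration.

```python
# Each fact is a (subject, predicate, object) triple.
# A set is used because an RDF graph is a set of triples.
triples = {
    ("Emma", "owns", "CoffeeShop"),
    ("CoffeeShop", "locatedIn", "London"),
}


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


def cities_of_businesses_owned_by(graph, person):
    """Chain two facts: person -> owns -> shop -> locatedIn -> city."""
    cities = set()
    for shop in objects(graph, person, "owns"):
        cities |= objects(graph, shop, "locatedIn")
    return cities


print(cities_of_businesses_owned_by(triples, "Emma"))  # {'London'}
```

Neither triple says that Emma has a connection to London, yet following the two facts in sequence recovers it. That chaining is the simplest form of the "web of meaning" described above.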

That’s what makes RDF powerful for:

Systems that rely on shared vocabularies or ontologies.
Scenarios where reasoning and inference are important.
Data that needs to connect across organizations or platforms.

It's a good fit for domains like healthcare, research, or government data - where meaning and consistency matter more than speed.

But RDF can be hard to manage for everyday applications. It wasn’t originally built as a database model, so it can get complex when the same relationship occurs more than once between the same entities, or when a relationship needs details of its own.

For example:

If Emma visits London every year, RDF needs extra layers, such as a separate node for each visit, to track each visit separately, which quickly becomes heavy to maintain.

So while RDF gives deep meaning, it also adds extra structure and effort.
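The repeated-visit problem can be seen in a small plain-Python sketch. It models the graph as a set of triples, which is how RDF defines a graph, and then shows one common workaround: a separate node per visit. The names `visit2023`, `visit2024`, `visitor`, `place`, and `year` are invented for this example.

```python
# An RDF graph is a set of triples, so repeating a fact adds nothing.
graph = set()
graph.add(("Emma", "visits", "London"))
graph.add(("Emma", "visits", "London"))  # second visit: same triple, no change
print(len(graph))  # 1

# Workaround: give each visit its own node and attach the details to it.
for year in (2023, 2024):
    visit = f"visit{year}"  # hypothetical identifier for the visit node
    graph.add((visit, "visitor", "Emma"))
    graph.add((visit, "place", "London"))
    graph.add((visit, "year", year))

visit_years = sorted(o for (s, p, o) in graph if p == "year")
print(visit_years)  # [2023, 2024]
print(len(graph))   # 7
```

Two visits cost six extra triples here, which is the "extra structure and effort" in practice.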

5.2 LPG - The Structure Builder

LPG, or Labeled Property Graph, takes a simpler and more flexible approach.
It’s designed as a database model, built for real-world applications and analytics.

In a property graph, information is stored as:

Nodes (people, places, things)
Relationships (how they connect)
Properties (extra details on both)

For example:

Emma (Person) → [owns, since: 2020] → Coffee Shop (Business)
Coffee Shop (Business) → [located_in] → London (City)

Both nodes and relationships can store data, like the date Emma became the owner or the address of the shop. This makes it easy to model real scenarios with many-to-many connections.
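A minimal in-memory sketch of the property graph idea follows. The `Node` and `Relationship` classes are invented for illustration and do not reflect the API of any graph database; the `VISITED` relationships and their years are made-up data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    rel_type: str
    end: Node
    properties: dict = field(default_factory=dict)


emma = Node("Person", {"name": "Emma"})
shop = Node("Business", {"name": "Coffee Shop"})
london = Node("City", {"name": "London"})

relationships = [
    Relationship(emma, "OWNS", shop, {"since": 2020}),
    Relationship(shop, "LOCATED_IN", london),
    # The same pair of nodes can be connected more than once,
    # and each connection keeps its own properties.
    Relationship(emma, "VISITED", london, {"year": 2023}),
    Relationship(emma, "VISITED", london, {"year": 2024}),
]

owns = [r for r in relationships if r.rel_type == "OWNS"]
visits = [r for r in relationships if r.rel_type == "VISITED"]

print(owns[0].properties["since"])           # 2020
print([r.properties["year"] for r in visits])  # [2023, 2024]
```

Because details live directly on the relationship, repeated connections between the same two nodes need no intermediate nodes.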

Property graphs are perfect when:

We need quick answers and pattern matching.
The structure changes often.
Relationships carry extra information (like "started working in 2022").

They’re used in recommendation systems, customer analytics, fraud detection, and any case where data grows fast and changes frequently.

5.3 Picking the Right One

Both models are valuable - they just serve different goals.

Goal	Best Choice
Shared meaning, reasoning, standard vocabularies	RDF
Flexibility, speed, rich relationships	LPG

Choosing between RDF and LPG isn’t about which one is better. It’s about what our graph needs to do.

If our goal is understanding and reasoning, RDF gives our data meaning.
If our goal is agility and application, LPG gives our data life.

And in many modern systems, the best approach is to mix both, using RDF for shared meaning and LPG for flexibility.
That way, our Knowledge Graph can reason like a scientist and move like an engineer.
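The table and the mixed option can be written down as a toy decision helper. This is only a restatement of the guidance in this section, not a rule from any standard; the signal names and the `suggest_model` function are invented for illustration.

```python
# Needs that point toward each model, taken from the lists in sections 5.1 and 5.2.
RDF_SIGNALS = {"shared vocabularies", "reasoning", "cross-organization data"}
LPG_SIGNALS = {"fast pattern matching", "changing structure", "relationship properties"}


def suggest_model(needs):
    """Toy heuristic: map a set of needs to 'RDF', 'LPG', 'both', or 'undecided'."""
    wants_rdf = bool(needs & RDF_SIGNALS)
    wants_lpg = bool(needs & LPG_SIGNALS)
    if wants_rdf and wants_lpg:
        return "both"
    if wants_rdf:
        return "RDF"
    if wants_lpg:
        return "LPG"
    return "undecided"


print(suggest_model({"reasoning"}))                                      # RDF
print(suggest_model({"changing structure", "relationship properties"}))  # LPG
print(suggest_model({"reasoning", "fast pattern matching"}))             # both
```

A real decision would weigh more than keywords, but writing the needs down first is the habit the helper is meant to encourage.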

6. Wrapping It Up - From Thinking to Building

Every intelligent system begins with thought.
Before a single line of code or data import, we first decide how the system will think: what matters, how things connect, and what kind of knowledge it should hold.

That’s what this phase was about.
We built the mental framework that guides everything that follows, from understanding goals to choosing the right model for our data.

RDF gives us shared meaning.
LPG gives us flexible structure.
Together, they remind us that building a Knowledge Graph is not just about linking data; it’s about designing understanding.

In the next article, we’ll bring this thinking to life and walk through an example that turns this mental model into a real graph step by step, from concept to context.
