
Using Multi-Active Satellites the Correct Way (1/2)

By | Scalefree Newsletter | No Comments
With multi-active satellites, you’re able to store multiple active records for one business key. Depending on how the data arrives from your source, there are different ways to implement multi-activity in Data Vault 2.0. In this post, we’ll explain your options for modeling. 


What is a Multi-Active Satellite?

A multi-active satellite has a structure similar to that of a standard satellite. The difference is that it stores multiple active records per business key at a given point in time. The exact structure depends on the use case.
See the example Data Vault model in figure 1.
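
As a rough illustration of "multiple active records per key at a point in time", here is a minimal Python sketch. It assumes one possible design in which the whole set of records is reloaded per load date and a multi-active attribute (here a phone type) distinguishes the rows. All names (`PhoneSatRow`, `hub_hash_key`, `active_rows`) are hypothetical and not taken from the post.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PhoneSatRow:
    hub_hash_key: str    # key of the parent hub (hypothetical name)
    load_date: datetime  # load date timestamp of the delivered set
    phone_type: str      # multi-active attribute that distinguishes rows
    phone_number: str    # descriptive payload


def active_rows(rows, hub_hash_key, as_of):
    """Return every row of the latest set loaded at or before `as_of`."""
    candidates = [
        r for r in rows
        if r.hub_hash_key == hub_hash_key and r.load_date <= as_of
    ]
    if not candidates:
        return []
    latest = max(r.load_date for r in candidates)
    # Assumption: all rows sharing the latest load date form the active set.
    return sorted(
        (r for r in candidates if r.load_date == latest),
        key=lambda r: r.phone_type,
    )


rows = [
    PhoneSatRow("c1", datetime(2020, 1, 1), "home", "111"),
    PhoneSatRow("c1", datetime(2020, 1, 1), "mobile", "222"),
    PhoneSatRow("c1", datetime(2020, 2, 1), "mobile", "333"),
    PhoneSatRow("c2", datetime(2020, 1, 1), "home", "444"),
]
```

In this sample data, customer `c1` has two active phone records in mid-January and a single one after the February load, which a standard satellite with one active record per key could not represent.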


Effort estimation in Data Vault 2.0 projects

By | Scalefree Newsletter | No Comments

There are many options available when choosing a method to estimate the necessary effort within agile IT projects.
In Data Vault 2.0 projects, we recommend estimating the effort by applying a Function Point Analysis (FPA). In this article, you will learn why FPA is a good choice and why you should consider using this method in your own Data Vault 2.0 projects.


Probably the best-known method for estimating work in agile projects is Planning Poker. In this process, so-called story points, based on a modified Fibonacci sequence (0, 0.5, 1, 2, 3, 5, 8, 13, 20, 40 and 100), are used to estimate the effort of a given task.

To begin the process, the entire development team sits together and each member simultaneously assigns the number of story points they consider appropriate to each user story. If the story points match, the final estimate is made. If a consensus cannot be reached, the effort is discussed until a decision is made.
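
The voting round described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name `poker_round` and the vote dictionary are hypothetical, and "consensus" is taken to mean that every vote is identical.

```python
# Card values as listed in the text above.
CARD_DECK = (0, 0.5, 1, 2, 3, 5, 8, 13, 20, 40, 100)


def poker_round(votes):
    """Return the agreed estimate, or None if the team has to discuss and re-vote.

    `votes` maps a team member's name to the card that member revealed.
    """
    for member, card in votes.items():
        if card not in CARD_DECK:
            raise ValueError(f"{member} played a card that is not in the deck: {card}")
    distinct = set(votes.values())
    # Assumption: the estimate is final only when all revealed cards match.
    return distinct.pop() if len(distinct) == 1 else None
```

For example, `poker_round({"ann": 5, "bob": 5})` yields `5`, while `poker_round({"ann": 5, "bob": 8})` yields `None`, signalling that the story needs discussion before another round.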

Implementing Data Vault 2.0 ghost records

By | Scalefree Newsletter | No Comments


During the development of Data Vault, from the first iteration to the latest Data Vault 2.0, we’ve mentioned the terms “ghost records” and “zero keys” in our literature as well as in our Data Vault 2.0 Boot Camps. Since then, we’ve noticed that these concepts are often used interchangeably.

In this blog entry, we’ll discuss the implementation of ghost records in Data Vault 2.0. Please note that this article is part one of a multi-part blog series clarifying ghost records vs. zero keys.

Data Warehousing and why we need it

By | Scalefree Newsletter | No Comments

A data warehouse is a subject oriented, nonvolatile, integrated, time variant collection of data to support management’s decisions

  • Inmon, W. H. (2005). Building the Data Warehouse. Indianapolis, Ind.: Wiley.
A data warehouse (DWH) provides the technical infrastructure needed to run Business Intelligence effectively. Its purpose is to integrate data from different data sources and to provide a historicised database. Through a DWH, consistent and reliable reporting can be ensured. A standardised view of the data prevents interpretation errors, improves data quality and leads to better decision-making. Furthermore, the historisation of data offers additional analysis possibilities and leads to (complete) auditability.

Data Vault Games with Cindi


Cynthia Meyersohn

Cindi has worked in a variety of IT realms over the past 35 years and, as of 2018, had spent the last 17 years working in applications and data engineering development within the U.S. DoD. As a Data Vault 2.0 (DV2) Solution Architect and Certified Instructor, her responsibilities and expertise range from the design, development, implementation, and technical guidance of Enterprise Data Warehouse/Big Data builds to crafting processes surrounding data acquisition and ingest, data governance and Master Data Management policy and compliance, development and team leadership.

Cindi has spent the past seven years leading the architectural design, implementation, and development of Data Vault 2.0 solutions at the U.S. DoD and Department of State. She is a Certified Authorized Data Vault 2.0 Instructor.

Cindi holds an MS in Systems Engineering from George Washington University and a BS in Information Systems from Strayer University.

Christian Kurze


MongoDB: A general purpose, distributed, and highly scalable data platform for modern applications



The database for modern applications: MongoDB is a general purpose, document-based, distributed database built for modern application developers and for the cloud era. No database is more productive to use.
MongoDB has evolved into a general-purpose database that makes it easy to build globally distributed data platforms that are highly available and scale almost indefinitely. While NoSQL is still considered a “new” technology, many of the Fortune 1000 companies have already migrated mission-critical workloads and decided to use MongoDB as a strategic data platform.

Due to its flexibility, the JSON-based document model supports a broad range of use cases, like Single View, Internet of Things, Mobile, Real-Time Applications, Personalization, Content Management, Catalogs and Mainframe Offloading.

This presentation provides an overview of MongoDB, the document model, and how data can be accessed in many different ways: via native drivers in almost any programming language, but also via connectors like Spark or R and even SQL. A practical example shows how to use MongoDB for Data Vault creation in the insurance industry.


Christian has spent the last couple of years working on data management and data integration in order to generate value out of data. At MongoDB he works as a Principal Solutions Architect. Prior to joining MongoDB, he worked on data virtualization, data warehousing and active metadata management. He holds a PhD in data warehouse automation.

What you will learn

  • Comparison of the document model vs. the relational model
  • Native high availability, horizontal scalability, workload isolation and data locality
  • Deployment agnostic: on-prem, hybrid, cloud, Kubernetes
  • Additional features for rich data usage like S3-based data lake access, full-text search, access by analytical tools, etc.
  • Example how to build a Data Vault in MongoDB



Neil Strange


What’s so scary? Safely migrating to a Big Data, Data Vault Solution from a legacy Kimball data warehouse


A frequent question we get asked is “how can I migrate from my existing Kimball data warehouse to a big data Data Vault solution?”
But what do we mean by migration? And what are the implications of choosing a big data architecture? Can we use Snowflake or Azure SQL Data Warehouse to run our new system? Where do we start?
This presentation will explore the migration question and suggest some good practice for designing a big data Data Vault target architecture.


Neil is the founder and managing director of Datavault UK, a consultancy specialising in Data Vault 2.0 and Information Governance implementations and coaching. He has many years’ experience working with a diverse range of clients and industries, helping organisations make the best strategic use of their IT systems and data services. Neil has presented at the previous three WWDVC events in the USA.


  • How to define your migration project.
  • Architecting your big data Data Vault target solution.
  • Working on the migration process.
  • Migration good practice.

André Dörr


Data Vault in sports analytics


Everything started with Moneyball in 2002. It was the first well-known use case in which a sports team used a data-driven approach to measure player value. Since then, many sports clubs have tried to copy this method. And with more and more technology entering sports, more and more data is collected and analyzed to get an edge over the competition.


This presentation will take a look at different sports analytics use cases for football clubs.
– Technical challenges in football clubs
– Building a compact analytical architecture based on Data Vault & Exasol
– Data Science in sports analytics with Data Vault & Exasol

Matthias Wegner


Data Vault + GDPR at


Matthias Wegner is a senior technical consultant for Data Warehouse platforms. He has initiated and implemented Data Warehouse platforms for multiple projects and customers in Germany using Data Vault and Talend as the main toolsets. Providing a tailored set of standards and best practices for all aspects of a Data Warehouse project is one of his main missions.
Matthias has been Head of BI at cimt AG – IT consulting for 5 years.
Currently he works as the architect for the Data Warehouse of where the concept of encryption of data for GDPR was developed and implemented.



In this case study we will give you an overview of the data warehouse migration project at You will see how we address GDPR requirements and which role Talend plays in this project. We’ll also show how easy it is to virtualize the access layer through the database switch to Exasol.

· Data Warehouse state-of-the-art, overview and source landscape
· Full Data Vault architecture
· Team setting
· Toolset (Talend, Exasol, Confluence)
· Loading procedures with Talend / Exasol – ELT
· GDPR requirements
· Encryption architecture and decryption approach on the fly in Exasol
· Lessons learned

Matthias Reiß


A day at the data lake

Matthias Reiß is a Senior Client Technical Professional within the IBM Cloud and Cognitive Technical Sales Team in Germany. He has more than 15 years’ experience in analytics and data integration projects in heterogeneous environments.

A day at the Data Lake – Get your data working in your Data Lake and beyond
Catch the big fish faster. Get the most out of the data in your Data Lake and all the data stores connected to it. Imagine how easily you can combine the different data formats in your lake with other relational and non-relational data stores within one single query.

– IBM’s common and hybrid SQL Engine
– Data Virtualization
– Data Caching
– Polymorphic Table Functions (i. e. Apache Spark Integration)