In general, we have defined the following types of satellite splits:
- Splitting by source system
- Splitting by rate of change
Additionally, we have defined two more types of splits:
- Splitting by level of security and by the level of privacy
- Business-driven split
A satellite split by source system is strongly recommended to prevent two issues when loading data into the enterprise data warehouse. First, if two source systems with different relational structures are loaded into the same satellite entity, the structure of at least one of them might have to be transformed. However, structural transformation sooner or later requires business logic, which should be deferred to the information delivery stage to support fully auditable environments and the application of multiple business perspectives.
The second issue is that loading two sources into the same satellite entity leads to the so-called “flip-flop effect”: if both systems store contradicting data (e.g., because they are out of sync) about the same business key, every load from one source is detected as a change against the row last written by the other source. With daily loads, the satellite absorbs two deltas per day and alternates between both descriptions, which leads to high storage consumption and inconsistent data. Splitting the satellite by source system therefore reduces storage consumption drastically.
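The flip-flop effect can be illustrated with a minimal sketch. It assumes a simplified, in-memory satellite with hash-based delta detection; the names `hash_diff`, `load_satellite`, the `crm`/`erp` sources, and the sample customer are hypothetical and only serve the illustration.

```python
import hashlib

def hash_diff(attributes: dict) -> str:
    """Hash the descriptive attributes so a change can be detected by comparison."""
    payload = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def load_satellite(satellite: dict, business_key: str, attributes: dict, load_date: str) -> bool:
    """Append a row only if the attributes differ from the latest row for the key."""
    history = satellite.setdefault(business_key, [])
    new_hash = hash_diff(attributes)
    if history and history[-1]["hash_diff"] == new_hash:
        return False  # no delta, nothing is stored
    history.append({"load_date": load_date, "hash_diff": new_hash, **attributes})
    return True

# Two hypothetical sources that describe the same customer differently.
crm = {"city": "Hanover"}
erp = {"city": "Hannover"}

shared_sat = {}            # one satellite fed by both sources
sat_crm, sat_erp = {}, {}  # one satellite per source system

for day in ("2024-01-01", "2024-01-02", "2024-01-03"):
    for source_sat, record in ((sat_crm, crm), (sat_erp, erp)):
        load_satellite(shared_sat, "CUST-1", record, day)
        load_satellite(source_sat, "CUST-1", record, day)

shared_rows = len(shared_sat["CUST-1"])
split_rows = len(sat_crm["CUST-1"]) + len(sat_erp["CUST-1"])
print(shared_rows, split_rows)
```

In this sketch the shared satellite stores a new row on every load, although neither source ever changed its data, while the two split satellites store one row each.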
Splitting satellites by source system also improves parallelism, because the data from multiple source systems can be loaded in parallel. It also allows real-time data to be integrated without having to combine it with raw data from a batch load.
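Because each source system writes to its own satellite, the load processes do not share a target and can run side by side. The following is a rough sketch of that idea under simplifying assumptions: the satellites are plain in-memory lists, and the `load_source` function, the `staged` data, and the source names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def load_source(source_name: str, rows: list) -> tuple:
    """Load the staged rows of one source into that source's own satellite."""
    satellite = [{"record_source": source_name, **row} for row in rows]
    return source_name, satellite

# Hypothetical staged data, one entry per source system.
staged = {
    "crm": [{"customer": "CUST-1", "city": "Hanover"},
            {"customer": "CUST-2", "city": "Berlin"}],
    "erp": [{"customer": "CUST-1", "city": "Hannover"}],
}

# Each source is loaded by its own task; no task touches another task's satellite.
with ThreadPoolExecutor() as pool:
    satellites = dict(pool.map(load_source, staged.keys(), staged.values()))

print({name: len(rows) for name, rows in satellites.items()})
```

With a shared satellite, the two loads would have to be serialized or coordinated, since both would compare against and write to the same history.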
In addition to the split by source system, the storage consumption can be further reduced by splitting the satellite by rate of change: