Great set of articles over at Tod McKenna’s blog going in to a high level of detail on Kimball’s 34 subsystems of the ETL. So far, Todd has reached number 15 of 34. I liked his take on Surrogate Keys, it’s a topic worth investigating if you are one of the people who still try to forge ahead using natural keys only to find issues later on….
When integrating data into a Dimensional Model, you need a mechanism to assign new primary keys to each dimension. These primary keys will be used in your Fact table as foreign keys. You cannot use natural keys because they are likely to repeat — this is especially true if you are maintaining history using SCD Type 2 (more on this in a bit). Subsystem 10 addresses this important need, by specifying the need to generate surrogate keys for all dimensions.