07 April 2010

Handling Targets That Do Not Roll-up

Comparing actual figures against targets are fundamental to any Performance Management (PM) system. PM solutions are typically built upon cubes and correspondingly, cubes are built upon data marts. In a data mart, measurements are generally stored in fact tables at their lowest level e.g. Cost per Organisational Unit (OU) per Day. These measurements are then aggregated along dimension levels e.g. Cost per Area per Week is the sum of the daily site figures for that week and area. This solution works well with actual figures as they are almost always additive. Targets however can sometimes be independent e.g. Target Cost per OU per Week is not necessarily the sum of daily target costs for that week.

There may be legitimate business reasons for doing so e.g. cost is heavily market/EOS driven and needs to be targeted manually or, for whatever reason, targets cannot be decomposed into fully additive components. This means that the PM solution has to either store the data for all levels in the fact tables or not include targets in the data mart at all. This latter option creates its own issues though as many front-end BI/PM tools can only connect to a single data source at a time.

Design alternatives on how to store data for all levels in the fact tables are not well documented in BI/PM design literature. Options here are:

1) Single Fact Table. Storing all aggregates in the same fact table simplifies the data model but has the disadvantages of values being duplicated e.g. the year value is stored in every day record; the fact table will contain a large number of fields and the ETL process is more complex (either using updates or reloading every level for each load).
2) Multiple Measures. To reduce the number of fields in the fact table, the tables could be split along time levels or measure. This option has the advantages of a reduction in the number of measures per metric and no duplication along the Time dimension. Its disadvantages are that there is likely duplication along the OU dimension and there are multiple fact tables.
3) Fact Table per Dimension Level. Defining a table for each of the dimension levels has the advantages of no field duplication and the least number of measures overall. Its disadvantage is again that there are multiple fact tables.

Option 3 is generally recommended as this eliminates data duplication and simplifies the ETL process. It is somewhat unusual as typically within a star schema; a fact table is surrounded by multiple dimension tables. It is however, a practical solution and has been recently implemented for a large resources client.

The fact tables are connected to the dimension tables on differing granularities as shown below:


The CostDay fact table is linked to the Time dimension table through its TimeId Foreign Key (FK). Fact tables with different granularities need to have additional field linking them to the Time dimension at the desired granularity. All attributes within the fact table therefore have to share the same granularity.


This (admittedly complex) model can be later simplified using whatever tools you use to maintain your cubes. SQL Server for example provides Perspectives, Measure Groups and Calculated Members that can be used to hide the complexity of the underlying data model from the user. Perspectives can be used to hide objects e.g. fact tables from the user. Finally Measure Groups can be used to create logical groupings of fact members.

No comments:

Post a Comment