Understanding Partition(Fact) Processing in SSAS Multidimensional

In Analysis Services, when we first DEPLOY a database it contains only objects but not data. When we PROCESS a database, it directs the server to recursively process those objects that store data in the model, such as dimensions(see Understanding Dimension Processing in SSAS Multidimensional) and partitions. Partition processing refreshes or populates fact data from the source systems and relational data warehouse in the cube and store data into fact data stores, it also creates indexes and aggregates if any defined in the model. The storage engine perform these processes in phases, so we will first try to understand these different phases. 

Partition Processing Phases:

When we talk about processing a partition, there are two parts to it:
1) Processing Data which rebuilds fact/partition data store by Reading data from relational database and Writing compressed data in fact/partition data store.
2) Processing Indexes which creates bitmap indexes and aggregation defined in the model for that partition.

Partition Processing Options:

ProcessFull: We can execute a single command ProcessFull to perform above two operations together or execute separate commands ProcessData and ProcessIndexes for each of these operations, this way we can identify how much time each operation is taking. ProcessFull or ProcessData/ProcessIndexes both clears existing data in the partition and then reloads it with new data.

ProcessAdd: In case we want to incrementally update a partition with just new data while keeping old data as it is we can use ProcessAdd. SSAS actually creates new partition containing only new rows and merges this new partition with old partition. For using ProcessAdd we need to have SQL query created in a such a way that it returns new rows only. For handling Inserts, Updates and Deletes using ProcessAdd we need to have different type of of SQL query for operation as mentioned below

  • ProcessFull – caters for them all
  • Handling Inserts with ProcessAdd
    • SQL query that adds new rows
  • Handling Updates with ProcessAdd
    • SQL query that negates old rows and adds new rows
  • Handling Deletes with ProcessAdd
    • SQL query that negates old rows

You might choose to do the full process each time or you might do the full process followed by subsequent incremental processes. No matter what approach you use, SSAS uses job based architecture (creates a controller jobs and many other jobs depending on number of attributes, hierarchies, partitions etc.) for processing dimensions and facts.

NOTE: During processing, key/value pairs for all dimensions loaded into memory Large dimensions can cause SSAS to process in multiple passes.

Scalable Partition and Aggregation Strategy: ROLAP + MOLAP

We should almost always partition your measure groups by the DATE_KEY and match the underlying relational data warehouse (RDW) partitioning scheme. The basics of this is your ‘hot’ (the current period) partition should be optimized for query-execution time via setting a different Aggregation Design as opposed to the ‘colder’ (older) partitions. We can also set ‘hot’ (the current period, frequently updated) partition to use ROLAP as it reads Data directly from source database and ‘colder’ (older) partitions to use MOLAP as show in the diagram. However there is one issue with ROLAP as query cache can become ‘stale’.
Note we can also use HOLAP which builds indexes and aggregations in multidimensional store only, but HOLAP is very rarely used.