MDX Solutions: SORTING using ORDER and TOPCOUNT functions

As with most applications there are many different ways to complete a task and sorting in MDX is no exception. Furthermore, Analysis Services utilization of hierarchies further complicates the method and implementation of sorting routines. SSAS actually has several potential layers of sorting, we can set the sorting at the dimension level or force sorting during query run time. For dimension attributes, we can actually set the Order By property in SSDT to either a dimension attribute Name or a Key as displayed below for the Product attribute under the DimProduct dimension. Additionally, the sort order defined on the dimension attribute property screen can only be the name or the key and is always in ascending order.

SORTING in MDX

-- A MDX member order is defined in UDM
SELECT [Measures].[Order Quantity] on COLUMNS
    ,NON EMPTY [DimProduct].[Product].[Product].Members on ROWS
FROM [Cube];

-- B Use ORDER() function for sorting members in DESCENDING order
SELECT [Measures].[Order Quantity] on COLUMNS
,NON EMPTY
ORDER(
    [DimProduct].[Product].[Product].Members
    ,[DimProduct].[Product].CurrentMember.Member_Name
    ,DESC
    ) on ROWS
FROM [Cube];

-- C Use ORDER() function for sorting using measures 
SELECT [Measures].[Order Quantity] on COLUMNS
,NON EMPTY
ORDER(
    [DimProduct].[Product].[Product].Members
    ,[Measures].[Order Quantity]
    ,ASC
    ) on ROWS
FROM [Cube];

Getting TOP N rows sorted by dimension attributes

Sorting by attribute name does not requires sorting in ascending order as members are already stored in sorted order in cube so we can use HEAD function to return top members from ORDERED SET

SELECT [Order Quantity] on COLUMNS
, HEAD(
    [DimProduct].[Product].[Product].Members
    , 5
    ) on ROWS
FROM [Cube];

Now if there are members present at the start with no orders, output will contain nulls. To eliminate null we can use NON EMPTY keyword as shown below.

--Here NON EMPTY keyword removes null after tuples values have been calculated 
SELECT [Order Quantity] on COLUMNS
,NON EMPTY HEAD(
    [DimProduct].[Product].[Product].Members
    , 5
    ) on ROWS
FROM [Cube];

Now the less than 5 members are shown after eliminating nulls, we don’t want this, we want to remove null values before HEAD is applied so we have to use NONEMPTY function inside HEAD function.

SELECT [Order Quantity] on COLUMNS
, HEAD(
    NONEMPTY ([DimProduct].[Product].[Product].Members,[Order Quantity])
	, 5
	) on Rows
FROM [Cube];

Now if we want to sort in descending order of dimension attribute name we can use below query

SELECT [Order Quantity] on COLUMNS
, HEAD(
    ORDER(
       [DimProduct].[Product].[Product].Members
       ,[DimProduct].[Product].CurrentMember.Member_Name
       ,DESC
       )
       , 5
     ) on Rows
FROM [Cube];

Getting TOP N rows sorted by measures (using TOPCOUNT)

SELECT [Order Quantity] on COLUMNS
,HEAD(
   ORDER(
       [DimProduct].[Product].[Product].Members
       ,[Order Quantity]
       ,DESC
       )
       ,5
    )
 on ROWS
FROM [Cube];

Another way to do the same thing. Instead of Head + Order we can use TopCount function as shown below.

--TOPCOUNT( «Set», «Count», «Tuple» )
SELECT [Order Quantity] on COLUMNS
,TOPCOUNT(
    [DimProduct].[Product].[Product].Members
    ,5
    ,[Order Quantity]
  ) on ROWS
FROM [Cube];

Although Head-Order can be useful for getting the result, TopCount() is the preferred way of implementing the requirement of isolating the best N members. This is because the Order() function is a relatively slow MDX function because it materializes the set, and the query optimizer may not be successful in optimizing the query by recognizing the Head-Order construct as a TopCount() function.

Getting TOP N sorted rows with FILTERING

In above query you actually ask to the top 5 selling products for all times. All dimensions that are not mentioned in TopCount() function’s 3rd argument will point to their all level. Lets say we want to point to a specific year say year 2007.

WITH SET FiveBestProducts AS
TOPCOUNT(
    [DimProduct].[Product].[Product].Members
    ,5
    ,[Order Quantity]
  ) 
SELECT {[Order Quantity]}*{[DimDate].[Year].&[2007]} on COLUMNS
      ,FiveBestProducts on ROWS
FROM [Cube];

Above query results does not seems correct as the result contains the same top 5 products which were evaluated in the context of all years, with their order quantity in single year 2007 displayed on the columns. In other words, we got the best N members in all years but then displayed their sales amount for a single year, 2007.
This is happening because The top N members are being evaluated in the context of the slicer(All Periods), not in the context of the opposite query axis as axes are evaluated independent of each other, so only the members in slicer are implicitly included in the third argument of the TopCount() function. (which is a mechanism known as Deep Autoexists: http://tinyurl.com/AutoExists).
TopCount() function’s context can also be set by any outer MDX construct such as GENERATE() function (see example below), but here we didn’t have such a case, so we can focus on the slicer and axes relation only. To conclude, only when the year 2007 is found in the slicer, then the third argument will be expanded into a tuple, and the result will be evaluated as the top N member in the year 2007. To make it work we need to rewrite query to either include [DimDate].[Year].&[2007] in slicer or in the third argument of the TopCount() function by making it tuple by replacing [Order Quantity] with ([Order Quantity], [DimDate].[Year].&[2007]).

WITH SET FiveBestProducts AS
TOPCOUNT(
    [DimProduct].[Product].[Product].Members
    ,5
    ,[Order Quantity]
  ) 
SELECT [Order Quantity] on COLUMNS
      ,FiveBestProducts on ROWS
FROM [Cube]
WHERE ([DimDate].[Year].&[2007]);

For more indepth understanding follow links below.
http://sqlblog.com/blogs/mosha/archive/2007/08/25/mdx-in-katmai-dynamic-named-sets.aspx
https://docs.microsoft.com/en-us/sql/analysis-services/multidimensional-models/mdx/autoexists
https://www.packtpub.com/big-data-and-business-intelligence/mdx-ssas-2012-cookbook

Getting TOP N sorted rows with PAGING using SUBSET

Pagination is a common use case throughout client and web applications everywhere. Google shows you 10 results at a time, your online bank may show 20 bills per page.
Lets say you’re looking for the first 10 products(@PageNumber = 1) then the query is going to be pretty efficient as it only has to store that 10 cells in memory but the same query can get much more expensive as the @PageNumber gets higher since more cells needs to be stored in memory(for @PageNumber = 10 we need to store 100 cells in memory).

--SUBSET( «Set», «Start»[, «Count»] )
SELECT [Order Quantity] on COLUMNS
,SUBSET(
  TOPCOUNT(
         [DimProduct].[Product].[Product].Members
         ,100
         ,[Order Quantity]
          ) 
      ,90  --start from 90th position
      ,10  --include 10 members
      ) on ROWS
FROM [Cube];

Getting TOP N sorted rows with PAGING using RANK

It is also important to understand that the Rank() function with only first two arguments specified does not order the set and because of this fact, we tend to do the ordering and ranking at the same time. This way, the sorting is done only once with complexityO(n*Log(n)) and then followed by a linear scan resulting in O(n) complexity for rank() function. As a good practice, we recommend using the Order() function first to order the set and then Rank() function to rank the tuples that are already sorted.

--RANK( «Tuple», «Set»[, «Numeric Expression | String Expression»] )
WITH SET [HundredBestProducts] AS 
        TOPCOUNT( [DimProduct].[Product].[Product].Members 
		, 100 
		,[Order Quantity] 
		) 
      MEMBER [Measures].[ProductRank] AS 
	  RANK ( [DimProduct].[Product].CurrentMember
		, [HundredBestProducts] 
		)
SELECT {[Order Quantity], [Measures].[ProductRank]} on COLUMNS
	,[HundredBestProducts] on ROWS
FROM [Cube];

Like the other MDX sorting functions, the RANK() function can also operate on a numeric expression. If a numeric expression is specified (3rd argument), the Rank() function will scan all the members (n) for assigning rank for every single member which results in O(n2) complexity. So Rank() with 3rd argument should be used with caution as performance can easily be degraded
For more indepth understanding follow links below.
http://sqlblog.com/blogs/mosha/archive/2006/03/14/ranking-in-mdx.aspx
https://www.mssqltips.com/sqlservertip/3530/sql-server-analysis-services-rank-and-row-number-ordering/

Getting TOP N sorted rows for each YEAR

WITH 
  SET FiveBestProducts AS 
    TopCount
        (
          [DimProduct].[Product].[Product].MEMBERS
         ,5
         ,[Order Quantity]
        )
SELECT 
  [Order Quantity] ON COLUMNS
 ,[DimDate].[Year].MEMBERS * FiveBestProducts ON ROWS
FROM [Cube];

The reason above query doesn’t work and shows same top 5 products for every year is that calculated sets are evaluated once – after the slicer and before the iteration on cells. Therefore, the TopCount() function is evaluated in the context of the default year, the root member(All Member), not within the context of each year. That’s why the products repeat in each year.
While calculated sets are evaluated only once before each cell is evaluated, calculated members, on the other hand, are evaluated for each cell. We cannot use a calculated member in this case, because we need a set of 5 members. The only thing that’s left is to use the outer GENERATE() function to push each year into the set. By having everything in advance before the iteration on the cells begins, we can prepare the required multidimensional set of year and their best products.

--GENERATE( «Set1», «Set2»[, ALL] )
WITH 
  SET FiveBestProductsInEachYear AS 
    Generate
    (
      [DimDate].[Year].MEMBERS
     ,
        {[DimDate].[Year].CurrentMember}
      * 
        TopCount
        (
          [DimProduct].[Product].[Product].MEMBERS
         ,5
         ,[Order Quantity]
        )
    ) 
SELECT 
  [Order Quantity] ON COLUMNS
 ,FiveBestProductsInEachYear ON ROWS
FROM [Cube];