By the way, an atomic cell is not necessarily at the lowest level of the hierarchy; Mondrian would prefer to load cells at a coarse granularity, and leave the hard work of aggregating values to the database, or even better, an aggregate table. And Mondrian does its best to retrieve atomic cells in batches. It gathers together requests for lots of cells of the same granularity and generates a single SQL statement to retrieve them all at once.
Mondrian's 'keep it simple' scheme comes unstuck when a particular calculation is repeated many times over. Nick Goodman came up with a classic example of this in bug MONDRIAN-552. The query is as follows:
Note how [Measures].[Running Total] is recursive. The running total for week 3 is defined as the running total for week 2 plus the profit for week 3. To calculate the average running total for week 99, Mondrian computes profit for the first 99 weeks and to calculate the average running total for week 100, Mondrian computes profit 100 for the first 100 weeks. There's lots of wasted effort: Mondrian has computed profit 50,000 times when it could have done it just 100 times and cached the results.with member [Measures].[Profit Change] as
([Measures].[Profit], [Time].CurrentMember)
- ([Measures].[Profit], [Time].PrevMember)
member [Measures].[Running Total] as
([Measures].[Profit], [Time].CurrentMember)
+ ([Measures].[Running Total], [Time].PrevMember)
member [Measures].[Average Daily Running Total] as
Avg(
Descendants(
[Time].CurrentMember, [Time.Weekly].[Day])
[Measures].[Running Total])
select
{[Measures].[Profit Change],
[Measures].[Running Total],
[Measures].[Average Daily Running Total]} ON COLUMNS,
{[Time.Weekly].[Week].Members} ON ROWS
from [Sales]
The solution is simple: wrap the calculation for [Measures].[Running Total] in the Cache() function, and Mondrian will compute the value only once.
You will see that in the bug I come up with a couple of proposals for making Mondrian better. I don't think Mondrian should automatically cache every expression, because caching costs time and memory, and most expressions are only evaluated once or twice. And by the way, you should use the Cache function sparingly, for the same reason.
But it would be nice if Mondrian could automatically detect some cases where expression caching is desirable. The proposed 'cache' property of a calculated member would have three values: 0 (never cache), 1 (always cache) and null (Mondrian should use its best judgment). Most calculated members would leave the caching up to Mondrian, so we would need to come up with a simple, effective rule that governs caching before we implemented this feature. What do you think the rule should be?
1 comment:
Hi Julian,
i was curious to know how Microsoft analysis service OLAP engine evaluate MDX expressions!!!!
how that is different from way mondrian does the same?.
cheers,
Mayur
Post a Comment