Business intelligence is one of the fastest growing segments of the successful business IT portfolio. The ability to analyze and interpret the large volumes of data a business collects on a daily basis can be extremely valuable and can assist in making decisions and give any company an advantage over its competitors.
Building the multidimensional data models and related technology to be able to effectively analyze and interpret this data is no easy feat and companies must spend significant time and resources building these foundational components before the real analysis can begin. Unfortunately, often times either the excitement or the budget runs out before these components are fully completed. Fortunately, as long as the multidimensional data model was well designed and built, several useful things can be easily added to realize the full potential of the technology that was put into place.
Recently, I had the opportunity to work on a project tasked with rebuilding some executive level reports using a new OLAP Cube (online analytical processing with multidimensional data) as the data source. The cube was built on SQL Server Analysis Services technology and the reports were created using SQL Server Reporting Services. I was very surprised to find that many of the basic measures that would make these reports work quickly and accurately were missing from the cube. Cleaning it up and getting useful data was a chore. So in this article I will share what I believe are five important types of calculated measures that every cube should contain in order to make the most of your data.
PERIOD TO DATE
In certain types of businesses the ability to view sales data over different time periods is critical.
– Retail companies pay special attention to same store sales for time periods in one year versus the prior year. – Publicly traded companies are concerned with their quarterly progress toward revenue and profit goals. – An online retailer may want to be able to compare growth in number of online orders collected over a specific period of time, such as the holidays.
All of these types of calculations can be easily implemented using Calculated Measures.
In this example we have a date dimension with a date hierarchy that defines fiscal week, fiscal month, fiscal quarter, and fiscal year. Our goal is to create measures on the [Total Sales] amount to view it by week to date, month to date, quarter to date and year to date for any given day selected by the end user when generating reports.
The first step is to make sure your cube is temporally aware, meaning that your date dimension members are defined using known time periods. Next you take advantage of the “PeriodsToDate” function to return an aggregate for the desired time period. In the above example we are aggregating [Total Sales] for the year to date period. When a user adds this measure to their query and selects a member within the [Fiscal To Date Hierarchy] they will see the total sales for that year. The same principle applies for week, month and quarter. Just replace the [Year] component of the [Date] hierarchy with your desired period.
Another helpful measure that’s simple to create and adds significant value when aggregating data is to be able to rank certain facts based on a particular measure. In the next example I will create a Set and a Measure, which will allow me to rank my customers based on the Total Sales amount generated by that customer.
The Dynamic Set is going to allow you to order customers by total sales. The measure [Customer Sales Rank] is going to use that set to return a number indicating where any given customer ranks with regards to [Total Sales]. When the [Customer Sales Rank] measure is added as a field to a query those results can then be ordered by rank in the report to give insight into who your biggest customers are and which customers could use a little more attention. You could also take this one step further and make the ranking temporally aware by combining the concepts from the first example with the concepts of the second.
Always a favorite of hardcore SQL developers, the ability for a database to quickly count the number or records that meet a certain criteria is critical in good business data analysis. However, when it comes to multidimensional data structures, COUNT can get a little more difficult. By adding a few calculated measures for count to your cube you can provide valuable data which compliments many of the more tangible measures for a typical business. For example, when looking at aggregated data such as election results, news agencies always accompany that data with the “Number of Precincts Reporting” because the number counted greatly influences your interpretation of the aggregated results. The same concept applies with business data. If the sales numbers for yesterday look very low it might also be helpful to know that sales data has only been collected from 50% of your stores. That can make a big difference between having to call the sales department versus having to call the IT department.
In the example above we use the COUNT function to look at the customer hierarchy where [Total Sales] are greater than zero. When added to a query with the [Total Sales] measure and filtered by a certain region you can easily see how many customers are being counted in that [Total Sales] amount. Couple that measure with an almost identical measure which filters by “[Total Sales] = 0” and you have some very useful data analysis.
No matter how technically advanced we become we will forever have to be careful with the dreaded divide by zero principle. Being able to create calculated measures which calculate averages and percentages can be very useful, however if you don’t put in a little extra work to ensure that you are not dividing by zero, your end users could get a lot of unusual looking data. Typically the data comes back as “NaN,” which stands for “Not a Number,” but we would prefer the end-users just see “0.” In the next example we will combine our temporally aware sales calculation principles with laws of good mathematical programming to create a measure for average sales year to date.
Since you cannot create a calculated measure using another calculated measure you must put the PeriodsToDate aggregation directly inline with your division. First we check to make sure that [Order Count] is not null and is not empty, then we divide the year to date aggregation of [Total Sales] by the year to date aggregate of [Order Count]. The result is a temporally aware average with no risk of seeing “NaN” in your results.
Now You Can Slice & Dice the Data By adding a few simple calculated measures we have now given our business users a much more comprehensive portfolio of ways to slice and dice the data. That will give them better data and result in more informed decisions.
To stay up to date with what’s happening in the Microsoft world, follow @CrederaMSFT on Twitter and follow Credera on LinkedIn.