May 22 2015

Hi,

In today’s post I am venturing into the BI arena of Microsoft’s new addition to Analysis Services: the Tabular model. I can fairly see why it was brought in to counter a market being flooded with in-memory cubes and similar data-driven discovery techniques (QlikView being the prime example: you click on data and get the associated information driving it).

At first look, the armoury around this Tabular technology (which includes the Power suite for Excel: Power Query, Power Pivot, Power View & Power Map) is very impressive. It definitely opens a lot of doors for users to collate data across multiple sources and build some really great visualizations, all in Excel !!!

I won’t say much more about the technology itself (many other online resources cover its potential, pros and cons in depth), so let me get back to the point of this post, which is to help developers facing a similar challenge:

Problem:

I have a Tabular analysis model for sales, and I want to visualize it across multiple currencies for varying time periods.
The exchange rates are provided in the model for the calculations.

Solution:

The above problem is easily solved in SSAS (the multi-dimensional model), as covered in my previous post:

SSAS : Muti-Currency Conversions, missing intermediate measure group ??

Things are not so straightforward in Tabular.

Many-to-many (M2M) relationships don’t quite work the way they logically seem they should, so in this solution, instead of going with a conventional M2M design, I transform the data being loaded into the SSAS model into a one-to-many (O2M) form.

So to solve the above problem,

1. When loading the sales data into the model, convert all the multi-currency source sales data, in SQL, to a single pivot currency (the currency against which the exchange rate table is populated for conversion), in this case US Dollar. A minimal SQL sketch of this conversion appears after the note below.

If you also wish to report on the sales data in its originally reported currencies, you can keep the original measure in a separate column in Tabular, with its currency key linked to the currency table referenced as “FromCurrency”.

CropperCapture[14]

Note: The columns appear grayed out in the above image because I have hidden them from client tools; I don’t want them to be visible as dimension attributes.
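The conversion mentioned in step 1 can be done in the SQL that loads the sales table. The sketch below is only an illustration against AdventureWorksDW-style tables; the column names and the assumption that EndOfDayRate holds units of the source currency per 1 US Dollar are mine, so adjust it to your source.

-- Hedged sketch: pivot every source sales row to USD at load time.
-- Assumes AdventureWorksDW-style tables and that EndOfDayRate expresses
-- units of the source currency per 1 USD (adjust if your rates are inverted).
SELECT
    s.SalesOrderNumber,
    s.OrderDateKey,
    s.CurrencyKey                             AS FromCurrencyKey,     -- original reporting currency
    s.SalesAmount                             AS ReportedSalesAmount, -- kept for reported-currency analysis
    s.SalesAmount / NULLIF(r.EndOfDayRate, 0) AS SalesAmount          -- pivoted to USD
FROM dbo.FactInternetSales AS s
JOIN dbo.FactCurrencyRate  AS r
    ON  r.CurrencyKey = s.CurrencyKey
    AND r.DateKey     = s.OrderDateKey;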

2. In the model, establish the following relationships in the diagram view:

CropperCapture[15]

3. In your currency rate table, add the following DAX measure to get the end-of-day (EOD) rate for the currency:

Sum of EndOfDayRate:=SUM([EndOfDayRate])

4. Now, with the following series of DAX measures in the Sales table, you will be able to achieve the desired result.

-- This provides a plain sum of the SalesAmount column, but in Excel it will populate the table for all currencies, even those without exchange rates
Sum of SalesAmount:=SUM([SalesAmount])

-- This represents the sales amount (in USD) only where a valid conversion exchange rate is available
Filtered Sales Amount:=IF(ISBLANK([Exchange Rate Applied]),BLANK(),[Sum of SalesAmount])

-- This provides the applicable exchange rate conversion value for the current selection
Exchange Rate Applied:=IFERROR([Sum of EndOfDayRate],BLANK())

-- This provides the final converted sales value in the desired currency
Sales Amount Converted:=IFERROR([Sum of SalesAmount]/[Sum of EndOfDayRate],BLANK())

5. Result

CropperCapture[16]

In my experience, with a couple of transformations in the load data and the simple DAX formulas above, I was able to achieve a perfectly working multi-currency reporting model in SSAS Tabular.

This was later extended to incorporate the additional reported-currency sales column by adding the “FromCurrency” dimension.

Hope it helps !!!

Note: The above model is built on the AdventureWorksDW2014 database, downloadable from the CodePlex site.

Downloads

May 19 2015

Hi,

In this post I would like to highlight the techniques usually involved in SSAS (the multi-dimensional model) for planning purposes, the challenges faced, and some tricks to optimize the write-back model.

My motivation in this post is to apply the multiple features of the cube efficiently and derive an optimal planning model.

Most of the posts I have come across so far (though very helpful) mainly exhibit how to achieve write-back via a write-back partition; beyond that, guidance on using it efficiently to complete a planning model isn’t readily available.

Hence this is my attempt to present the complete picture.

(Throughout this post, whenever I mention SSAS I am referring to the multi-dimensional model.)

Basics:

SSAS offers write-back at both the dimension and cell level, so I won’t dig much deeper into the fundamentals, as they have been highlighted across multiple posts such as the following:

Dimensional Writeback:

http://ram-msbi.blogspot.com.au/2012/03/learn-msbi-dimension-writeback-and-cell_15.html

https://msdn.microsoft.com/en-us/library/ms174540.aspx

As there are not many articles on the web describing it completely, I will highlight the feature with a few screenshots to cover the topic better:

To enable write-back on any dimension, ensure that it is built on a plain vanilla table; dimensions based on views or named calculations (in the DSV) won’t allow write-back, and you will keep pulling your hair out over the errors.

So to enable write-back:

CropperCapture[2]

In the SSAS designer (BIDS) you will see the following icon, which enables you to add members to the dimension. But beware: dimension write-back is not supported from Excel, so bad luck if that was your plan.

You would have to take a different development approach there, as to my knowledge nothing out of the box supports it.

CropperCapture[3]

CropperCapture[4]

And there you go, we have established ourselves a write-back dimension !!!

Cell Writeback:

https://sstoltze.wordpress.com/2011/05/04/enabling-writeback-to-ssas-in-excel-2010/

Cell write-back in SSAS is more popular amongst bloggers, so you will easily find many articles on it. There are fewer articles on dimension write-back, but it isn’t that complex either.

Planning Model

Now, with the groundwork for write-back complete, we will try to build a planning model for the following challenge.

Objective:

Build a simple HR planning model where employees can plan salaries only at the monthly level (not daily), but for reporting purposes those planned monthly salaries should be spread equally across the days of each month.

Solution:

One of the challenges I had to address was a single dimension linking at different granularity levels to different fact tables (one containing the daily fact data for the employees, and one used for planning purposes).

This led me to adopt the following structure for my DSV:

CropperCapture[7]

The above model comprises only 4 tables, of which Employee and Date are dimension tables, while FactSalary and FactPlanSalary represent the fact data.

Coming back to the cube model, the Dimension Usage of the Date dimension is where the trick lies: relating the dimension to each fact at a different granularity level.

CropperCapture[8]

One of the most important settings to configure is on the measure group representing the FactPlanSalary data (also remember to remove the count measure from this group: it will become the write-back area for users, and SSAS will not create a write-back partition for a measure group containing anything other than SUM measures). Then create the write-back partition for the FactPlanSalary data.

CropperCapture[9]

CropperCapture[11]

This setting avoids the measure values being replicated down to the date level from the monthly level, a common problem across many cubes caused by incorrect granularity relationships.

Once all the above is done, it is time to get our hands dirty, fire up the MDX editor, and prepare the calculated measure for the cube which will accomplish all the magic.

CropperCapture[10]

The above calculation picks the monthly value from the planning measure group, divides it by the count of days in the month to get a daily rate, and spreads the data out across the reporting area. A hedged sketch of one way to write such a calculation follows.
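The screenshot holds the actual script; the version below is only a sketch of the same idea, and the measure and hierarchy names ([Plan Salary], [Date].[Calendar] with its [Month] and [Date] levels) are assumptions rather than the names in this cube.

-- Hedged MDX sketch: spread the monthly planned value equally across the days
-- of that month; at month level and above, just show the planned value itself.
CREATE MEMBER CURRENTCUBE.[Measures].[Plan Salary Daily]
AS
    CASE
        WHEN [Date].[Calendar].CurrentMember.Level IS [Date].[Calendar].[Date]
        THEN
            ( [Measures].[Plan Salary],
              Ancestor([Date].[Calendar].CurrentMember, [Date].[Calendar].[Month]) )
            /
            Descendants(
                Ancestor([Date].[Calendar].CurrentMember, [Date].[Calendar].[Month]),
                [Date].[Calendar].[Date]
            ).Count
        ELSE [Measures].[Plan Salary]
    END,
VISIBLE = 1;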

Results

Now to analyse the fruits of our efforts:

CropperCapture[12]

In the above screenshot we can see that the measure belonging to FactPlanSalary is editable under the what-if scenario in Excel.

The major power of the calculation is realized in the following screenshot, where the value in the plan measure is picked up and distributed across the dates via the calculated measure.

CropperCapture[13]

Advantages

The major advantage of building such a planning model is that SSAS achieves write-back by inserting delta rows into the associated write-back table (a query sketch for inspecting this table follows below).

The fewer rows that have to be inserted for each write-back, the faster and smoother the user experience with the model will be.

With write-back enabled only for monthly data, the number of rows pushed into the database for each write-back is substantially reduced, providing a great user experience while planning.
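If you want to see those deltas, they land in a relational table that SSAS creates when the write-back partition is defined. The query below is only a sketch: the table name follows the default “WriteTable_<partition>” convention and the audit column is an assumption, so check the actual names in your write-back database.

-- Hedged sketch: inspect the write-back delta rows for the plan partition.
-- Table and column names assume the default SSAS write-back naming convention.
SELECT TOP (100) *
FROM dbo.[WriteTable_FactPlanSalary]
ORDER BY MS_AUDIT_TIME DESC;  -- audit column name is an assumption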

So write-back can indeed be a little complex to achieve effectively (especially when the planning and reporting grains differ) across different businesses and scenarios, but when planned comprehensively, utilizing all the features SSAS has at play, one can build really powerful planning models with it.

Hope it helps !!!

Downloads

Note:
This article has been long and complex, and my write-up couldn’t cover every single detail of the implementation. So in case of any issues, please leave a comment or drop me a mail for clarification.

May 18 2015

Hi,

This post is more about a design approach being adopted in enterprise data warehousing recently.
So nothing much technical this time…

Recently I came across the term “Data Vault” and was keen to know more about it, so I ventured into that territory. The very limited resources available online on the topic prompted me to write this article and shed some more light on it.

So back to business:

What is Data Vault: “Data Vault Modeling is a database modeling method that is designed to provide long-term historical storage of data coming in from multiple operational systems. It is also a method of looking at historical data that, apart from the modeling aspect, deals with issues such as auditing, tracing of data, loading speed and resilience to change.”
from: Wikipedia

I faced the problem of understanding the concept myself; just reading the wiki wasn’t enough, so I will try to explain it in my own words.
(Excuse me if my wording deviates from the original groundwork laid for it !!!)

Description:
Before progressing further in this article, I assume you have read the wiki article about Data Vault and know the rules behind Hubs, Links and Satellites, as I will be building on top of them in this post.

From my past experience, building a data warehouse on a greenfield project is always challenging and prone to many cycles of re-development. As data from multiple sources starts integrating into one data warehouse, the key problems that emerge are duplication of similar business data across different tables from multiple sources, multiple sources providing different sets of attributes for similar business data, incrementally incorporating new data, and more.

Example: Let’s take the example of stock tickers.

In any finance database where you maintain a list of tickers, the sources can be Yahoo, Nasdaq, and many more stock exchanges across the world. Though there is a single entity termed “Ticker”, multiple sources will provide different sets of data associated with the tickers, some more, some less.

CropperCapture[1]

As you can see in the above image, Yahoo doesn’t provide the data for Annualized Dividend and Yield, whereas the Nasdaq source doesn’t provide Market-Cap.

Now suppose a new source later brings additional data, say Google Finance bringing me “Volume” for the tickers; with the above design I would have to add a new column to the table. That is not particularly difficult, nor would it disrupt or break the existing warehouse, but the table structure will change.

And once table structures change (you are lucky if you don’t have schema-bound views or stored procedures, else you won’t even be able to alter the table), you are left wondering whether other interacting systems that haven’t been using explicit column names in their operations are impacted.

Thus, to me, Data Vault is a data warehousing technique in which the following steps play a key role in situations like the above (a minimal SQL sketch for the ticker example follows the list):

1. Identification of Hubs: A key step in building a Data Vault is to first identify the hubs in your warehouse. These form the pillars of the warehouse upon which you establish the relationships representing the business.

2. Creation of Links: These tables, given a good naming convention, clearly establish the relationships across your hubs, thus providing the business logic.

3. Creation of Satellites: All the attributes further defining your hubs can be placed in satellites. One hub can have multiple satellites.
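To make the ticker example concrete, here is a minimal T-SQL sketch of a hub and one per-source satellite. All table and column names are hypothetical, and the audit columns are reduced to the bare minimum; a real Data Vault would follow the standard conventions for hash keys, load dates and record sources.

-- Hedged sketch of a hub and a per-source satellite for the ticker example.
CREATE TABLE HubTicker (
    TickerHashKey  CHAR(32)    NOT NULL PRIMARY KEY,  -- hash of the business key
    TickerSymbol   VARCHAR(20) NOT NULL,              -- business key
    LoadDate       DATETIME2   NOT NULL,
    RecordSource   VARCHAR(50) NOT NULL
);

-- Each source keeps its own attribute set in its own satellite, so a new source
-- (e.g. Google Finance with Volume) just adds a new satellite instead of new columns.
CREATE TABLE SatTickerNasdaq (
    TickerHashKey       CHAR(32)      NOT NULL REFERENCES HubTicker (TickerHashKey),
    LoadDate            DATETIME2     NOT NULL,
    RecordSource        VARCHAR(50)   NOT NULL,
    AnnualizedDividend  DECIMAL(18,4) NULL,
    Yield               DECIMAL(9,4)  NULL,
    CONSTRAINT PK_SatTickerNasdaq PRIMARY KEY (TickerHashKey, LoadDate)
);

-- A link table would relate HubTicker to another hub (say HubExchange) in the same style.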

From the above we achieve a relatively flexible architecture where new information is easy to incorporate, at the price of extra joins to retrieve information (though even those can be addressed by the smaller data marts emanating from your enterprise data warehouse, or by SuperNova models).
(It seems that, with the names in data warehouse modelling, we are going beyond stars…)

In summary, as you may have noticed, this article has only scratched the surface of a topic that spans much further. Re-writing the same concepts is not my aim, so instead I am providing some good reads that helped me understand the concept more thoroughly and leverage it at my work.

Good Reads

Good Read for Concept
Data Vault Wiki

A good article on applying the concept at work, with its advantages and disadvantages
What is Data Vault ?

A very good description of Data Vault and its extension with SuperNova models and cached views; it gives great insight into applying the concept and using it in the real world.
Data Vault and Data Virtualization: Double Agility

Generate Data Vault by Tool ??

And lastly, a note on the above technique: many people are trying to find algorithmic ways to generate a Data Vault from the referential constraints in their existing transactional systems or legacy warehouses.

I have reviewed some of the algorithmic pseudo-code behind these, but to me nothing can replace the human designer, as in many instances the referential integrity constraints are incomplete (or overdone) and the logic maps parts of the schema to the wrong table type in the Data Vault context.

So, in summary, my advice is to design the Data Vault manually, as it yields better results, and to leverage tools for automating the repetitive processes; the design and categorization of Hubs, Links and Satellites are better done by human designers than by algorithms based on keys and referential integrity.

Hope it helps.

May 10 2015

(MIST-Only Feature)

When I was getting started on the MIST platform, I got confused on multiple occasions trying out the concept of transformers in BIDS Helper.

So, to clear the air: this is a MIST-platform-specific feature only.

Now moving on to the concepts:

Transformer: A piece of code we write in MIST that runs as a post-build process, scans our entire BIML structure, and wherever the targeted node type is matched, appends, replaces or otherwise merges the specified script based on the “MergeMode” directive.

This is a very powerful feature; in this article I will only scratch the surface, exhibiting its utility.

Scenario:

I have built a package using MIST/BIML which generates a CREATE TABLE statement via an “ExecuteSqlTask” for each table, and just to increase the complexity, the count of my tables is >10. Now I am presented with a business requirement that the SSIS package must be capable of error handling, so that “OnError” we can log which component caused the error.

Now I have a situation where I can either open the package and manually create an “OnError” event handler for >10 components (sounds tedious), or copy and paste the following script >10 times in the BIML at the relevant places where error handling is required, or amend the actual package-generating script.

<Events>
		<Event EventType="OnError" Name="SayHello">
			<Tasks>
				<ExecuteSQL Name="Error_<#=TargetNode.Name #>" ConnectionName="AdventureWorksLT2012">
					<DirectInput>
						SELECT 1;
                    </DirectInput>
				</ExecuteSQL>	
	        </Tasks>
	    </Event>
	</Events>

(Please ignore the DirectInput in the example; it is for representational purposes only. In reality it would be an INSERT or stored-procedure execution for error logging.)

In all the options above, the error-handling code is embedded in my own coding logic, whereas it would be nice if something could detect, post build, wherever I have an Execute SQL task and attach a standard “OnError” handler to it.

So, enough talk; getting down to the code:

Transformer code: (PackageErrorLogging.bimlt)


<#@ target type="ExecuteSQLTask" mergemode="LocalMerge" #>

<ExecuteSQL>
	<Events>
		<Event EventType="OnError" Name="SayHello">
			<Tasks>
				<ExecuteSQL Name="Error_<#=TargetNode.Name #>" ConnectionName="AdventureWorksLT2012">
					<DirectInput>
						SELECT 1;
                    </DirectInput>
				</ExecuteSQL>	
	        </Tasks>
	    </Event>
	</Events>
</ExecuteSQL>

Now, how do I instruct MIST to execute this post build?

MIST now includes something called “Frameworks”: just add one of the MIST frameworks, specify the path to your transformer, and you are done setting up the build behaviour, which you can also control from the MIST ribbon interface.

Framework: (ErrorHandlerBuild.bimlfx)


<FrameworkSettings RootItemName="RootNode" TargetItemName="TargetNode" xmlns="http://schemas.varigence.com/FlowModel.xsd">
    <TransformerBimlScripts>
        <TransformerBimlScript TransformerBimlScriptFilePath="C:\Users\Kostya\Documents\Mist\Sales_DV\addedBiml\Transformers\PackageErrorLogging.bimlt" />
    </TransformerBimlScripts>
</FrameworkSettings>


CropperCapture[12]

From the ribbon you can instruct the build whether or not to include the transformer…

CropperCapture[13]

And the results can be inspected at any time in your generated SSIS packages:

CropperCapture[17]

More help on transformers is available at the following links:

BimlScript Transformers Primer
Transformer targeting multiple SSIS versions
Biml Transformers – Webinar (YouTube)

Hope it helps…

Downloads

May 10 2015

Hi,

Today, continuing the series of posts about BIML, I am covering a tool in the BIML world called MIST, which is far more powerful than BIDS Helper: it is primarily geared towards BIML scripts and builds on them with much more functionality, some of which I will try to highlight in this and forthcoming posts.

The main highlights of MIST can be read on its website, but the notable features where this tool goes beyond BIDS Helper are its in-memory representation of BIML objects such as Tables, Schemas, Databases and more, plus a very powerful transformer mechanism, code IntelliSense, live BIML preview and more…

In-Memory Object Representation:

CropperCapture[10]

Code IntelliSense:

CropperCapture[11]

Transformers: This is more of a coding feature, which I will exhibit in upcoming posts.

The problem previously faced with BIDS Helper is no longer an issue with MIST, because this tool offers a build feature which generates the entire Visual Studio project containing the SSIS packages as scripted out in the BIML.

Some of the resources I leveraged to gain insight into this tool and the language are listed below, highlighting the power of the combination of BIML and MIST.

MIST Overview
MIST User Guide
MIST Standard Sample

BIML Sample Scripts
BIML API

BIDS Helper Code Plex
BIDS Helper Sample & Tutorial

BIML Script Portal

BIML Import DB Documentation

MIST Overview (You Tube)

I think the above material will give anyone a good insight into the tool, how it works, and its utility in the BI space; as I stumble upon great ways of adapting it in my work, I will be here to post about it…

Hope it helps…