Microsoft’s business intelligence tools have arrived squarely on the doorstep of serious big data analytics.
By Scott Robinson
Out-of-the-box Excel vs. Pricey business intelligence
Since Microsoft Office SharePoint Server (MOSS), Microsoft has faced a huge quandary: how to get more business-side users to exploit the rich business intelligence (BI) tools in SQL Server? Excel turns out to be the weapon of choice, since everybody and their great aunt know how to use it. As analytics have begun to supplant strategic imperative on the front lines of competition, this became an even stronger priority (in SP 2010), making Excel an indispensable component in the Microsoft BI toolkit. If I can play out ad hoc analytics via Excel, without the inconvenient, expense, and limited scope of a third-party or home-grown BI app, why in the world wouldn’t I?
There are some good reasons, if you listen to the third-party app marketers. Excel is a disastrous BI client, they maintain, because it provides no means of detecting an error in the set-up of an analytical job; its ubiquity is a red herring, because it invites people into the analytics business who don’t know squat about analytics; and it simply isn’t designed to rally the processing resources that big data requires.
The first two are training issues, and not Microsoft’s problem. But that last objection is a valid one, and has been no small hurdle for getting in-memory analytics onto the SharePoint community’s radar, where it certainly should be. And now that hurdle has been cleared.
Big data, big memory
The In-Memory BI Engine (Vertipaq, or xVelocity) is an efficiency enhancement that optimizes memory for high-volume analytical processing. Columnar storage, in-memory caching, compression and algorithms for aggregating and scanning data in parallel are leveraged to empower the business user to potentially process hundreds of millions of rows of data in an ad hoc process (the implementation of OData and the expansion of PowerPivot now empower that same user to gather those hundreds of millions of rows from many different sources, but that’s a different blog entry).
The under-the-hood efficiency means that this data crunch happens in seconds, not minutes, and that is itself a game-changer for ad hoc analytics. But an even greater resource barrier has been the fact that Excel can’t hold that much data in one sitting, and SharePoint can only store 2GB, single-object-wise.
However, that’s no longer an issue. Excel Services can now make use of as much free memory as is physically available on the box, meaning that the ceiling in terms of in-memory processing is now virtually limitless: if you need more, just scale up. Therefore, the 2GB object size limit (which is actually a SQL Server limit) is unlikely to be a problem, because you don’t need to do temporary storage if you’ve got that much memory available, and analytical results tend to be relatively small, compared to the body of data that produces them.
Microsoft’s BI tools, then, are not only improved, but have arrived squarely on the doorstep of serious analytics. These new toys don’t help the learning curve, but they certainly place unprecedented processing power in the hands of the business user. Just don’t tell the BI app marketing guys.