It’s Tuesday, which means two things:
- I may or may not be out of Craisins…not sure.
- Time for a T-SQL Tuesday post.
I’ll check on the Craisins later. Let’s do the post.
Last I checked, most elementary school children can take a list of numbers and add them up as they go. That is, give them a list of 3 + 4 + 5 + 1 and they’ll probably tell you 3 + 4 = 7, 7 + 5 = 12, and 12 + 1 = 13. Simple, right? If you ask them to sort it from lowest to highest, omit the odd numbers, and make a second list based on numbers in the first, they’ll stare at you, then tell you they had an accident in their pants.
Reporting Services works much the same way. If you ask it to do things it can do as it processes data row by row, like RowCount, Sum (with or without an explicitly defined scope), or even Previous, it’s pretty efficient. However, if you ask it to start shuffling items around (sorting), cutting items out (filtering), or relating items to each other (subreports), Reporting Services can’t deal with it nearly as efficiently. Why is that?
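To make the contrast concrete, here’s a sketch of the streaming-friendly kind of expression I mean. (Field and scope names like SalesAmount and Region are made up for illustration, and the trailing comments are just annotations, not part of the expression. Strictly speaking, SSRS spells “row count” as CountRows or RowNumber.)

```
=CountRows()                                ' counts rows as they stream in
=Sum(Fields!SalesAmount.Value)              ' running aggregate, no look-back needed
=Sum(Fields!SalesAmount.Value, "Region")    ' explicit scope, still a streaming aggregate
=Previous(Fields!SalesAmount.Value)         ' only needs the single prior row
```

Each of these can be satisfied by looking at rows one at a time as they arrive. A tablix sort or filter, by contrast, forces Reporting Services to buffer the whole dataset and make another pass over it.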
Reporting Services has no indexing on its datasets. None. Zero. Zilch. Nada. The null set.
BingGoogle “Reporting Services Indexes” and you’ll get results less useful than those Ensure coupons the grocery store registers insist on printing for me. (I’m a 35-year-old man! Give me something I’ll actually use, like a dollar off Craisins! I’ll bet that Ensure nonsense comes from a bloated Access database in Branson.)
Once the dataset comes back from the data source, it’s nothing more than an ordered heap. Performance-wise, datasets in Reporting Services behave a lot like ASP.NET datasets. If you return thousands of records in either one, then sort and/or filter them in the dataset rather than getting them pre-sorted/filtered, you’re wasting precious seconds. Several seconds may not seem like a big deal, but to a user watching the spinning green circle, the difference between two seconds and ten is an eternity. That said, if your report is interactive and the typical use case involves a lot of manipulation, it may be worth letting Reporting Services shuffle the dataset, rather than pounding the data source again and again with variations of the same data. It depends.
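For example, if the report only ever shows this quarter’s orders sorted by date, it’s usually cheaper to ask the engine for exactly that than to haul everything back and bolt a dataset filter and tablix sort on top. (Table and column names below are hypothetical.)

```sql
-- Push the filter and sort to the database engine, where indexes can help,
-- instead of using a dataset filter + tablix sort in the report.
SELECT OrderID, OrderDate, SalesAmount
FROM dbo.Orders                  -- hypothetical table
WHERE OrderDate >= @StartDate    -- filter at the source
ORDER BY OrderDate;              -- arrives pre-sorted; the report just renders
```

The engine can use an index to satisfy the WHERE and ORDER BY; Reporting Services, with no indexes at all, can only scan and shuffle.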
If you’re trying to decide where the summation logic should go when designing a report, treat Reporting Services like a little 2nd grader. If you want it to count the items as it goes (RowCount), add them up as it goes (Sum), or tell you what item it just looked at (Previous), you’ll be just fine. If you want something more complicated than that and your dataset is not insignificant in size, ask its older sibling — the database engine — to do the work. Otherwise, you’ll likely see Report Server poop its performance pants when you check ExecutionLog2.
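One way to catch the pants-pooping is to compare the time columns in ExecutionLog2, which split each execution into data retrieval, processing, and rendering. (This assumes the default ReportServer catalog database name.)

```sql
-- All durations are in milliseconds.
SELECT TOP (20)
       ItemPath,
       TimeDataRetrieval,   -- waiting on the data source
       TimeProcessing,      -- SSRS sorting, filtering, grouping, aggregating
       TimeRendering,       -- turning the result into output
       [RowCount]
FROM ReportServer.dbo.ExecutionLog2
ORDER BY TimeProcessing DESC;
```

A report with a big TimeProcessing relative to TimeDataRetrieval is a prime candidate for moving its sort/filter/aggregate logic back into the query.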
But then, what do I know? I’m still not sure if I’m out of Craisins.
Jes Schultz Borland says
Thanks for posting Doug! I agree: the logic should be in the database.