Monday, April 12, 2010

T-SQL Tuesday #005: Reporting

This blog entry is participating in T-SQL Tuesday #005, hosted this month by Aaron Nelson. You are invited to visit his blog to join the party and read more blogs participating in this month’s theme: Reporting.

Speaking of “Reporting”, we now join Action News in progress…



Maria the Anchorwoman… fear that one airline charging $45 for carry-on bags will create a domino effect, with other airlines most likely following suit and charging as well.

In a related story, one airline is considering offering discounts for children as long as their parents are willing to store their children in overhead bins.

A spokesman for the airlines also confirmed that all meals will eventually be eliminated from flights and passengers will instead receive buckets of peanuts.

And now, Action News technical reporter Ella Vader has some exciting news from the SQL Server world. Ella?


Ella Vader, ReporterThank you, Maria. Since the introduction of SQL Server 2005, developers have had the ability to create recursive CTE’s that can process a self-referential table and produce a simple hierarchical report of manager/employee relationships or a bill of materials report.

For example, Books Online provides a crude example of displaying a hierarchical list using vertical bars to represent relationships in the AdventureWorks database. Here is an example of an organizational chart of EmployeeID #3, Roberto Tamburello:

with DirectReports(Name,EmployeeID,EmployeeLevel,Sort) as
(
select cast(c.LastName+', '+c.FirstName as varchar(255))
,e.EmployeeID
,1
,cast(c.LastName+', '+c.FirstName as varchar(255))
from AdventureWorks.HumanResources.Employee e
join AdventureWorks.Person.Contact c on e.ContactID=c.ContactID
where e.EmployeeID=3 /* Roberto Tamburello */
union all
select cast(replicate('| ',EmployeeLevel)+c.LastName+', '+c.FirstName as varchar(255))
,e.EmployeeID
,d.EmployeeLevel+1
,cast(rtrim(Sort)+'| '+c.LastName+', '+FirstName as varchar(255))
from AdventureWorks.HumanResources.Employee e
join AdventureWorks.Person.Contact c on e.ContactID=c.ContactID
join DirectReports d on e.ManagerID=d.EmployeeID
)
select Name
from DirectReports
order by Sort
/*
Tamburello, Roberto
| Cracium, Ovidiu
| | D'Hers, Thierry
| | Galvin, Janice
| Erickson, Gail
| Goldberg, Jossef
| Miller, Dylan
| | Margheim, Diane
| | Matthew, Gigi
| | Raheem, Michael
| Salavaria, Sharon
| Sullivan, Michael
| Walters, Rob
*/
This report is functional, but the final output is about as exciting as watching a CURSOR routine finish up.

However, all is not lost. SQL Blogger Brad Schulz has come up with a routine that will produce a hierarchical report in T-SQL with a little more pizazz. Simply create a temp table with specific columns called #TreeData and call his stored procedure called usp_DrawTree:

if object_id('tempdb..#TreeData','U') is not null drop table #TreeData
select ID=EmployeeID
,ParentID=ManagerID
,DataForBox=LastName+', '+FirstName
,ExtraInfo=''
,SortColumn=LastName+', '+FirstName
into #TreeData
from AdventureWorks.HumanResources.Employee e
join AdventureWorks.Person.Contact c on e.ContactID=c.ContactID

exec usp_DrawTree 3 /* Employee ID #3 = Roberto Tamburello */
/*
┌─────────────────┐
┌─┤ D'Hers, Thierry │
│ └─────────────────┘
┌─────────────────┐ │
┌─┤ Cracium, Ovidiu ├─┤
│ └─────────────────┘ │
│ │ ┌────────────────┐
│ └─┤ Galvin, Janice │
│ └────────────────┘
│ ┌────────────────┐
├─┤ Erickson, Gail │
│ └────────────────┘
│ ┌──────────────────┐
├─┤ Goldberg, Jossef │
│ └──────────────────┘
┌─────────────────────┐ │
│ Tamburello, Roberto ├─┤
└─────────────────────┘ │
│ ┌─────────────────┐
│ ┌─┤ Margheim, Diane │
│ │ └─────────────────┘
│ ┌───────────────┐ │
├─┤ Miller, Dylan ├─┤
│ └───────────────┘ │
│ │ ┌───────────────┐
│ ├─┤ Matthew, Gigi │
│ │ └───────────────┘
│ │ ┌─────────────────┐
│ └─┤ Raheem, Michael │
│ └─────────────────┘
│ ┌───────────────────┐
├─┤ Salavaria, Sharon │
│ └───────────────────┘
│ ┌───────────────────┐
├─┤ Sullivan, Michael │
│ └───────────────────┘
│ ┌──────────────┐
└─┤ Walters, Rob │
└──────────────┘
*/
This stored procedure accepts two optional parameters. The first parameter, @ID, is an integer indicating the identifying key of the person at the top of the hierarchy. If one does not pass a value for @ID, then the stored procedure will automatically use the ID in the #TreeData table that has a ParentID of NULL.

The second parameter, @HangStyle, is a parameter of datatype bit. A value of 0 (the default) indicates that one wants a balanced tree output, as shown previously, while a value of 1 indicates that a hanging style is desired, as illustrated here:

exec usp_DrawTree 3,1  /* or exec usp_DrawTree @ID=3, @HangStyle=1 */
/*
┌─────────────────────┐
│ Tamburello, Roberto ├─┐
└─────────────────────┘ │
│ ┌─────────────────┐
├─┤ Cracium, Ovidiu ├─┐
│ └─────────────────┘ │
│ │ ┌─────────────────┐
│ ├─┤ D'Hers, Thierry │
│ │ └─────────────────┘
│ │ ┌────────────────┐
│ └─┤ Galvin, Janice │
│ └────────────────┘
│ ┌────────────────┐
├─┤ Erickson, Gail │
│ └────────────────┘
│ ┌──────────────────┐
├─┤ Goldberg, Jossef │
│ └──────────────────┘
│ ┌───────────────┐
├─┤ Miller, Dylan ├─┐
│ └───────────────┘ │
│ │ ┌─────────────────┐
│ ├─┤ Margheim, Diane │
│ │ └─────────────────┘
│ │ ┌───────────────┐
│ ├─┤ Matthew, Gigi │
│ │ └───────────────┘
│ │ ┌─────────────────┐
│ └─┤ Raheem, Michael │
│ └─────────────────┘
│ ┌───────────────────┐
├─┤ Salavaria, Sharon │
│ └───────────────────┘
│ ┌───────────────────┐
├─┤ Sullivan, Michael │
│ └───────────────────┘
│ ┌──────────────┐
└─┤ Walters, Rob │
└──────────────┘
*/
The key to making this all work is that the procedure calls itself recursively. For each node in the hierarchy, the child nodes on the right (if you look at the chart by turning your head counter-clockwise) are recursively drawn first, then the node itself, and then the child nodes on the left are recursively drawn. In the case of the hanging style output, there are no nodes on the “right” side.

The #TreeData table that acts as the source data is required to consist of the following 5 columns:
  • ID: An integer column that contains a unique identifier of the node in the hierarchy.

  • ParentID: An integer column that contains an identifier indicating the node’s parent in the hierarchy.

  • DataForBox: A column (convertible to nvarchar) that contains the data that should appear in the node’s box in the report. You can introduce a multiple lines within the box by using a vertical bar character to indicate where the line breaks will occur.

  • ExtraInfo: A column (convertible to nvarchar) that contains an extra (single) line of information that you would like to include in the report for that node.

  • SortColumn: A column that the procedure will use to sort the child nodes under their parent.

Here is another example using the NorthWind database. The DataForBox column contains the Employee’s FirstName and LastName and Phone Extension, separated by vertical bars, indicating that they should be output as three separate lines within the box. The ExtraInfo column contains the Employee’s title. Note that no parameters are passed, indicating that the procedure should produce the report for the Employee whose ParentID is NULL:

if object_id('tempdb..#TreeData','U') is not null drop table #TreeData
select ID=EmployeeID
,ParentID=ReportsTo
,DataForBox=FirstName+'|'+LastName+'|'+'Ext'+cast(Extension as varchar(20))
,ExtraInfo=Title
,SortColumn=LastName+', '+FirstName
into #TreeData
from Northwind.dbo.Employees

exec usp_DrawTree
/*
┌───────────┐
│ Anne │
┌─┤ Dodsworth │ Sales Representative
│ │ Ext452 │
│ └───────────┘
┌──────────┐ │
│ Steven │ │
┌─┤ Buchanan ├─┤ Sales Manager
│ │ Ext3453 │ │
│ └──────────┘ │
│ │ ┌────────┐
│ │ │ Robert │
│ ├─┤ King │ Sales Representative
│ │ │ Ext465 │
│ │ └────────┘
│ │ ┌─────────┐
│ │ │ Michael │
│ └─┤ Suyama │ Sales Representative
│ │ Ext428 │
│ └─────────┘
│ ┌──────────┐
│ │ Laura │
├─┤ Callahan │ Inside Sales Coordinator
│ │ Ext2344 │
│ └──────────┘
┌─────────┐ │
│ Andrew │ │
│ Fuller ├─┤ Vice President, Sales
│ Ext3457 │ │
└─────────┘ │
│ ┌─────────┐
│ │ Nancy │
├─┤ Davolio │ Sales Representative
│ │ Ext5467 │
│ └─────────┘
│ ┌───────────┐
│ │ Janet │
├─┤ Leverling │ Sales Representative
│ │ Ext3355 │
│ └───────────┘
│ ┌──────────┐
│ │ Margaret │
└─┤ Peacock │ Sales Representative
│ Ext5176 │
└──────────┘
*/
And here is one more example, back in AdventureWorks, of a Bill Of Materials hierarchy for ProductID #826, the LL Road Rear Wheel. Note that the boxes contain a Product Number and a Color, if one is defined for the component, and the ExtraInfo column contains the Weight, if one is defined:

if object_id('tempdb..#TreeData','U') is not null drop table #TreeData
select ID=ComponentID
,ParentID=ProductAssemblyID
,DataForBox=Name+'|'+ProductNumber+coalesce('|('+Color+')','')
,ExtraInfo=coalesce('(Weight='
+convert(varchar(10),Weight)
+rtrim(WeightUnitMeasureCode)+') '
,'')
,SortColumn=Name
into #TreeData
from AdventureWorks.Production.BillOfMaterials b
join AdventureWorks.Production.Product p on b.ComponentID=p.ProductID
where b.StartDate<=getdate()
and (b.EndDate is null or b.EndDate>=getdate())

exec usp_DrawTree 826
/*
┌──────────┐
┌─┤ LL Shell │
│ │ SH-4562 │
│ └──────────┘
┌─────────┐ │
┌─┤ LL Hub ├─┤
│ │ HU-6280 │ │
│ └─────────┘ │
│ │ ┌─────────────────┐
│ └─┤ LL Spindle/Axle │
│ │ SD-2342 │
│ └─────────────────┘
│ ┌───────────┐
├─┤ LL Nipple │
│ │ NI-4127 │
│ └───────────┘
│ ┌─────────────┐
├─┤ LL Road Rim │ (Weight=445.00G)
│ │ RM-R436 │
│ └─────────────┘
┌────────────────────┐ │
│ LL Road Rear Wheel │ │
│ RW-R623 ├─┤ (Weight=1050.00G)
│ (Black) │ │
└────────────────────┘ │
│ ┌──────────────┐
├─┤ LL Road Tire │
│ │ TI-R092 │
│ └──────────────┘
│ ┌───────────┐
├─┤ Reflector │
│ │ RF-9198 │
│ └───────────┘
│ ┌────────────────┐
├─┤ Road Tire Tube │
│ │ TT-R982 │
│ └────────────────┘
│ ┌─────────┐
└─┤ Spokes │
│ SK-9283 │
└─────────┘
*/
We don’t have the room to show you the complete BOM for the entire bicycle, ProductID #770, the Black Road-650-52, but you can download the code from Mr Schulz’s SkyDrive and give it a try yourself.

So will this technique of hierarchy reporting revolutionize the SQL Server industry? Of course not. But it is our duty to report on these new developments for our viewers as they occur.

This is Ella Vader, reporting to you from Silicon Valley, California. Back to you, Maria.


Thank you for that very informative report, Ella. May I do the usual anchor-desk gimmick of asking you a seemingly insightful follow-up question that appears to be impromptu, but in reality, is something we rehearsed earlier?

You just did, Maria.

Well, thank you again, Ella.

In other news, Starbucks has responded to customer feedback by introducing two new drink sizes, the 128-ounce Plenta and the 2-ounce Micra. Starbucks CEO Howard Schultz, who has an extraneous letter “t” in his name and is therefore not related to Brad Schulz in any way, said that he is very excited about the introduction of…





Note: Go to my subsequent blog entry to see the source code for usp_DrawTree. It used to be right here, but it made my feed too large to show up in Google Reader.

9 comments:

  1. My favorite bits:
    "as exciting as watching a CURSOR routine finish up."
    and
    "... something we rehearsed earlier"

    ReplyDelete
  2. I loved putting that in there (the "rehearsed earlier" thing)... it's one of the reasons that local news drives me up the wall.

    Back to you, Michael. 8^)

    ReplyDelete
  3. Brad, should that be a 'temp table' not a "temporary file"?

    Love the bit about the airlines! :-)

    ReplyDelete
  4. @Aaron: You're right... I changed it to "temp table". Thanks. I figured airlines were an easy target that everyone could relate to... thank God I don't fly that much. 8^)

    ReplyDelete
  5. Brad,
    Excellent script I use for creating diagrams for our ETL process. However, some tabels at the end of the tree got corrupted. (width > 220 chars, 6 levels deep). text width already 1024 chars. Any ideas?

    ReplyDelete
  6. @Wilfred:

    Can you supply test data to dulicate the problem? Email it to brad at stockciphering dot com.

    --Brad

    ReplyDelete
  7. @Wilfred:

    The corruption may have occurred because of a bug in the procedure. The contents of the #TreeResult temp table were not being SELECTed in the correct order. I've since updated the code at my SkyDrive and in the blog post that comes after this one.

    --Brad

    ReplyDelete
  8. I have a question for you. First, thank you for posting what you have done. I think it great that you do that.

    I found a problem and I was hoping you could help me out. I have a hierarchy that has a duplicate ID value. It is used in multiple relationships where the Parent + ID are a unique instance. When generating the data for the box, only the FIRST instance of the box data is retrieved based on the ID value. I would think it would be a retrieval on the parent/child ID's rather than just the ID. The result is that I get the same Data For Box values where the ID is the same.

    What would you recommend that I do to accommodate that situation with the tree?

    Thanks so much for your time!

    ReplyDelete
  9. Hi James...

    Thanks for the feedback.

    It's been a while since I've looked at this... but from what I recall, the data is all retrieved via a created-on-the-fly sequence number.

    What are you populating into the #TreeData temp table?

    If you want, you can email me directly at brad dot e dot schulz at gmail dot com.

    ReplyDelete