Monday, September 21, 2009

T-SQL Fu: The Story Continues

Master Po and Kwai Chang Caine from the 1970's TV series Kung FuThis is a continuation of my last blog post, in which I mentioned that the online T-SQL Forums are like temples where young novices learn the mysteries of T-SQL under the patient guidance of a Master, reminiscent of the 1970’s TV series Kung Fu.

I wanted to provide a way for novices to get up to speed a little more quickly by revealing many of the mysteries of T-SQL all at once so they don't have to stumble at the beginning.

And so, continuing our story…

We left off with a fateful morning, when our young novice asked about his training in the mysterious art of T-SQL Fu…



Master, is my training almost over? When will I reach my full potential in T-SQL Fu?

Grasshopper, you still have much to learn. Quickly as you can, snatch this pebble from my hand. (The young novice tries and fails.) When you can take the pebble from my hand, it will be time for you to leave.



Time continues its irreversible march forward, and the training and questions continue…



Master, the ISNULL() function is cutting off my string. What’s going on?

declare @c char(1) 
set @c=null
select isnull(@c,'12345')
/* Returns '1' */
Grasshopper, the result of the ISNULL() function is always the exact same datatype as the first argument, which most people discover by accident because they don’t read the documentation. I would suggest using COALESCE() instead since it is an ANSI standard function, whereas ISNULL() is not. Also, COALESCE() will take multiple parameters:

declare @c char(1) 
set @c=null
select coalesce(@c,null,reverse(@c),null-null,'12345')
/* Returns '12345' */



Master, my data is getting cut off again. And why is it getting cut off at different points?

declare @s varchar 
set
@s='ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890'
select @s
/* Returns 'A' */

select cast('ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890' as varchar)
/* Returns 'ABCDEFGHIJKLMNOPQRSTUVWXYZ1234' */
Grasshopper, you did not stipulate the number of characters in your VARCHAR declarations. For whatever reason, the CONVERT() and CAST() functions assume 30 bytes, but DECLARE assumes only 1 byte. The following XML example also implies only 1 byte:

declare @x xml 
set
@x='<data>ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890</data>'
select @x.value('data[1]','varchar')
/* Returns 'A' */
The lesson here is that you should always explicitly specify the number of characters in a VARCHAR or NVARCHAR declaration.



Master, this is clearly a bug. Everyone knows that all years divisible by 4 are leap years!

declare @LeapDay2000 datetime 
set
@LeapDay2000='2000-02-29'
select LeapDay1900=dateadd(year,-100,@LeapDay2000)
,LeapDay2100=dateadd(year,+100,@LeapDay2000)
/*
LeapDay1900 LeapDay2100
----------------------- -----------------------
1900-02-28 00:00:00.000 2100-02-28 00:00:00.000
*/
Grasshopper, I am afraid you are mistaken. All years divisible by 4 are leap years EXCEPT for years divisible by 100; HOWEVER, years that are divisible by 400 (like 2000) ARE leap years. I am thankful that I will not be alive in 2100 to see the confusion that arises when people with “every 4 years” drilled into their heads insist that there should be a 29th of February.

So be assured that T-SQL knows what it is doing. However, you may be amused to know that Excel does NOT know what it’s doing. It thinks that 1900 was a leap year, as you can see below, where I subtract 1 from March 1, 1900:

Excel mistakenly thinks that 1900 was a leap year

Again, there was no such date as February 29, 1900. Interestingly, though, Excel does seem to be aware that 2100 will not be a leap year. As you young people say, “Go figure.”



Master, my system must be corrupt. It says that it cannot convert the date, but everyone knows that a date string in YYYY-MM-DD format is universally accepted.

select convert(datetime,'2009-09-15') 
/*
Msg 242, Level 16, State 3, Line 2
The conversion of a char data type to a datetime data type
resulted in an out-of-range datetime value.
*/
Grasshopper, I’m afraid you are incorrect. When you stepped away from your computer to go out for a Jamba Juice, I secretly typed SET LANGUAGE BRITISH in your SSMS session. The ANSI SQL format of YYYY-MM-DD is NOT language independent. It will not work in language settings that put day before month. The only sure-fire language-independent way of using date strings is to enter them as an 8-character string with no separators in YYYYMMDD format, or the more clumsy ODBC or ISO8601 formats:

select convert(datetime,'20090915') 
select convert(datetime,{d '2009-09-15'})
select convert(datetime,'2009-09-15T00:00:00')



Many weeks later, our young novice is pulling the soggy earbuds of his iPod out of his morning cup of Green Tea as his Master approaches…



Master, this query keeps bombing. It doesn’t make sense, because shouldn’t T-SQL evaluate the left-most ISNUMERIC() function first and, if false, not even bother to do the CONVERT()? All programming languages do that!

select CustomerID,CompanyName,PostalCode 
from Northwind.dbo.Customers
where isnumeric(PostalCode)=1 and convert(int,PostalCode)>99000
/*
Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the nvarchar value 'WA1 1DP' to data type int.
*/
Grasshopper, T-SQL works differently from other languages. It does not always evaluate WHERE predicates from left to right. It is free to evaluate them in any order that it sees as the most cost-effective. In this case, it considers the CONVERT() function to be more cost-effective than the ISNUMERIC() function and so it does the CONVERT() predicate first. But note the following example:

select CustomerID,CompanyName,PostalCode 
from Northwind.dbo.Customers
where Country='USA' and convert(int,PostalCode)>99000
/*
CustomerID CompanyName PostalCode
---------- ----------------------- ----------
LAZYK Lazy K Kountry Store 99362
OLDWO Old World Delicatessen 99508
*/
This query completed successfully. In this case it evaluated the Country=’USA’ predicate first, because it is more cost-effective than the CONVERT() predicate. Note that even though it scanned all the other customers, it ONLY evaluated the CONVERT() predicate AFTER it determined that the Country was ‘USA’.

If one of your predicates is dependent on the other, as was your case with ISNUMERIC() and CONVERT(), then the safest way to construct the query is with a CASE statement:

select CustomerID,CompanyName,PostalCode 
from Northwind.dbo.Customers
where case
when isnumeric(PostalCode)=1
then convert(int,PostalCode)
else -1
end>99000
/*
CustomerID CompanyName PostalCode
---------- ----------------------- ----------
LAZYK Lazy K Kountry Store 99362
OLDWO Old World Delicatessen 99508
*/



Master, I don’t understand why this query is behaving this way. I used a LEFT JOIN because I wanted ALL employees to come out regardless of whether they have Orders with Freight over $800 or not. But it’s not producing ALL employees. Doesn’t a LEFT JOIN guarantee that?

select e.EmployeeID,LastName,OrderID,Freight 
from Northwind.dbo.Employees e
left join Northwind.dbo.Orders o on e.EmployeeID=o.EmployeeID
where o.Freight>800
order by e.EmployeeID
/*
EmployeeID LastName OrderID Freight
----------- ---------- -------- --------
2 Fuller 10691 810.05
3 Leverling 10540 1007.64
5 Buchanan 10372 890.78
7 King 11030 830.75
*/
Grasshopper, the query IS doing a LEFT JOIN in the manner you expect, and it is creating an intermediate set of ALL employees with their orders (if any). However, it is your WHERE clause that is effectively filtering out your employees with no orders at all. Remember this important rule: If you’re doing a LEFT JOIN, do not refer to any of the right-hand-side’s fields in the WHERE clause. Instead, put the condition in the ON clause of the JOIN itself:

select e.EmployeeID,LastName,OrderID,Freight 
from Northwind.dbo.Employees e
left join Northwind.dbo.Orders o on e.EmployeeID=o.EmployeeID
and o.Freight>800
order by e.EmployeeID
/*
EmployeeID LastName OrderID Freight
----------- ---------- -------- --------
1 Davolio NULL NULL
2 Fuller 10691 810.05
3 Leverling 10540 1007.64
4 Peacock NULL NULL
5 Buchanan 10372 890.78
6 Suyama NULL NULL
7 King 11030 830.75
8 Callahan NULL NULL
9 Dodsworth NULL NULL
*/
The only exception to this rule is if you’re specifically checking for NULLs in the right-hand side, as in this example, which finds customers with no orders at all:

select c.CustomerID 
from Northwind.dbo.Customers c
left join Northwind.dbo.Orders o on c.CustomerID=o.CustomerID
where o.CustomerID is null
/*
CustomerID
----------
FISSA
PARIS
*/



The days melt into weeks, and the passing of many moons beget years… until one day…



The young novice comments on a strange man that wanders the temples and monasteriesMaster, there is a man who wanders about this temple and other monasteries, talking in a rude manner. He tells me that I have many design flaws in my data and that I’m performing operations that should be done in the application layer instead. He tells me I’m using “hillbilly” naming conventions instead of ISO 11179-5 metadata standards, and that IDENTITY keys are not valid because they are meaningless “Kabbalah” numbers with none of the relational properties of a proper key. He told me that I shouldn’t do attribute splitting and that Date and Time are an atomic component representing a temporal dimension as an interval. He also says the ROW_NUMBER() function is a disastrous sequential operation that… Master? MASTER?

Grasshopper, I am sorry. I didn’t mean to nod off. (He straightens his robes). I know of this man. What is your question?

Master, I don’t have one. I just wanted to talk about him. I think that he just goes out of his way to taunt me and others. If you filter out the inappropriate references to “hillbilly” and “Kabbalah”, perhaps small parts of some of the things he says have merit in a dry textbook-like theoretical sense, but he doesn’t seem to be living in the real world. I just ignore him and pretend he’s not there.

Grasshopper, I think you may have reached enlightenment.

Master, I am honored. Please look at this T-SQL Script:

select Pebble 
into Grasshoppers.Hand
from Masters.Hand
option (fast 1)
Grasshopper, I am most proud. You are ready to leave. Remember always that a wise man walks with his head bowed, humble like the dust.



And so, our young hero, once a novice, now a T-SQL Fu Master, walks off into the sunset with head bowed. He has mastered the art of T-SQL Fu, but his training is not yet finished. There is much work ahead in studying SSRS Fu and SSIS Fu and SSAS Fu and BI Fu and MDX Fu and…

A new T-SQL Fu Master wanders on to his next adventure

18 comments:

  1. "I know of this man. What is your question?"
    ... was awesome.

    (and the last select statement was the icing.)

    ReplyDelete
    Replies
    1. This comment has been removed by a blog administrator.

      Delete
  2. Great post. I like the Kung-fu narrative.

    ReplyDelete
  3. Master,

    select Pebble
    into Grasshoppers.Hand
    from Masters.Hand
    option (Fast 1)

    does not work, as Grasshoppers.Hand already exists.

    :)

    Good work, great article.

    ReplyDelete
  4. Thank you all for the comments.

    @Vinit: LOL! Excellent point! Touché!

    ReplyDelete
  5. Another great post!! I LOL'd at the part about the rude nomad that wonders the great halls of our monastery. Thanks for a pick me up after a day of dealing with my sick children.

    ReplyDelete
  6. Heh, even kudis to what is salvageable from Celko's posts.

    So silly:

    select Pebble
    into Grasshoppers.Hand
    from Masters.Hand
    option (fast 1)

    So, let's see, you are SELECT an entire TABLE's worth of pebbles INTO a new TABLE, from another database. That would be like Kane taking the master's pile from another room, building his own storehouse that stores exactly that pile.

    Sheesh! No wonder he kicked him out of the monastery!

    Why not be more direct:


    USE Guile;

    SELECT Hand
    FROM Master
    WHERE Item = 'Pebble'
    AND Hand_Status = 'Open';

    ReplyDelete
  7. @Brian: Boy, you and Vinit are so literal in interpreting the query I wrote! I have to admit, I find your query very clever, though.

    ReplyDelete
  8. great i also ran into ISNULL issue before and had a hard time to get out of it,,, and u described it here very well..kinda Master...grasshoper thats rill gr88 i told you i gonna be a fan of this..

    ReplyDelete
  9. @Brad

    Thanx for the compliment!

    You posted a nice article, but i think you went for the cheap laugh at the end. So i fixed it for you. :P

    ReplyDelete
  10. @Ashish: Thanks... ISNULL is bad news. ;)

    @Brian: Cheap laugh? C'mon! Yes, the ending was cheesy. It was meant to be cheesy. But it was not smelly, Limburger cheese... it was a fine, sophisticated cheese, like a Délice de Bourgogne.

    ReplyDelete
  11. Only of the best article in my life time. Fantastic ... No words to express my joyfullness on reading this blog. You are the master mind and I can see the maturity in your article.

    Keep going.

    Regards,
    Venkatesan Prabu .J
    http://venkattechnicalblog.blogspot.com/

    ReplyDelete
  12. @Venkatesan:

    Wow! Thank you so much for the kind words! I'm so glad you enjoyed it.

    ReplyDelete
  13. duconFenpuWilmington Michelle Gibson link
    link
    link
    click
    planerstordis

    ReplyDelete