This is a continuation of
my last blog post, in which I mentioned that the online T-SQL Forums are like temples where young novices learn the mysteries of T-SQL under the patient guidance of a Master, reminiscent of the 1970’s TV series
Kung Fu.
I wanted to provide a way for novices to get up to speed a little more quickly by revealing many of the mysteries of T-SQL all at once so they don't have to stumble at the beginning.
And so, continuing our story…
We left off with a fateful morning, when our young novice asked about his training in the mysterious art of T-SQL Fu…
Master, is my training almost over? When will I reach my full potential in T-SQL Fu?
Grasshopper, you still have much to learn. Quickly as you can, snatch this pebble from my hand. (
The young novice tries and fails.) When you can take the pebble from my hand, it will be time for you to leave.
Time continues its irreversible march forward, and the training and questions continue…
Master, the ISNULL() function is cutting off my string. What’s going on?
declare @c char(1)
set @c=null
select isnull(@c,'12345')
/* Returns '1' */
Grasshopper, the result of the ISNULL() function is always the
exact same datatype as the first argument, which most people discover by accident because they don’t read the documentation. I would suggest using COALESCE() instead since it is an ANSI standard function, whereas ISNULL() is not. Also, COALESCE() will take multiple parameters:
declare @c char(1)
set @c=null
select coalesce(@c,null,reverse(@c),null-null,'12345')
/* Returns '12345' */
Master, my data is getting cut off again. And why is it getting cut off at different points?
declare @s varchar
set @s='ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890'
select @s
/* Returns 'A' */
select cast('ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890' as varchar)
/* Returns 'ABCDEFGHIJKLMNOPQRSTUVWXYZ1234' */
Grasshopper, you did not stipulate the number of characters in your VARCHAR declarations. For whatever reason, the CONVERT() and CAST() functions assume 30 bytes, but DECLARE assumes only 1 byte. The following XML example also implies only 1 byte:
declare @x xml
set @x='<data>ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890</data>'
select @x.value('data[1]','varchar')
/* Returns 'A' */
The lesson here is that you should always explicitly specify the number of characters in a VARCHAR or NVARCHAR declaration.
Master, this is clearly a bug. Everyone knows that all years divisible by 4 are leap years!
declare @LeapDay2000 datetime
set @LeapDay2000='2000-02-29'
select LeapDay1900=dateadd(year,-100,@LeapDay2000)
,LeapDay2100=dateadd(year,+100,@LeapDay2000)
/*
LeapDay1900 LeapDay2100
----------------------- -----------------------
1900-02-28 00:00:00.000 2100-02-28 00:00:00.000
*/
Grasshopper, I am afraid you are mistaken. All years divisible by 4 are leap years EXCEPT for years divisible by 100; HOWEVER, years that are divisible by 400 (like 2000) ARE leap years. I am thankful that I will not be alive in 2100 to see the confusion that arises when people with “every 4 years” drilled into their heads insist that there should be a 29th of February.
So be assured that T-SQL knows what it is doing. However, you may be amused to know that Excel does NOT know what it’s doing. It thinks that 1900 was a leap year, as you can see below, where I subtract 1 from March 1, 1900:
Again, there was no such date as February 29, 1900. Interestingly, though, Excel does seem to be aware that 2100 will not be a leap year. As you young people say, “Go figure.”
Master, my system must be corrupt. It says that it cannot convert the date, but everyone knows that a date string in YYYY-MM-DD format is universally accepted.
select convert(datetime,'2009-09-15')
/*
Msg 242, Level 16, State 3, Line 2
The conversion of a char data type to a datetime data type
resulted in an out-of-range datetime value.
*/
Grasshopper, I’m afraid you are incorrect. When you stepped away from your computer to go out for a Jamba Juice, I secretly typed SET LANGUAGE BRITISH in your SSMS session. The ANSI SQL format of YYYY-MM-DD is NOT language independent. It will not work in language settings that put day before month. The only sure-fire language-independent way of using date strings is to enter them as an 8-character string with no separators in YYYYMMDD format, or the more clumsy ODBC or ISO8601 formats:
select convert(datetime,'20090915')
select convert(datetime,{d '2009-09-15'})
select convert(datetime,'2009-09-15T00:00:00')
Many weeks later, our young novice is pulling the soggy earbuds of his iPod out of his morning cup of Green Tea as his Master approaches…
Master, this query keeps bombing. It doesn’t make sense, because shouldn’t T-SQL evaluate the left-most ISNUMERIC() function first and, if false, not even bother to do the CONVERT()? All programming languages do that!
select CustomerID,CompanyName,PostalCode
from Northwind.dbo.Customers
where isnumeric(PostalCode)=1 and convert(int,PostalCode)>99000
/*
Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the nvarchar value 'WA1 1DP' to data type int.
*/
Grasshopper, T-SQL works differently from other languages. It does not always evaluate WHERE predicates from left to right. It is free to evaluate them in any order that it sees as the most cost-effective. In this case, it considers the CONVERT() function to be more cost-effective than the ISNUMERIC() function and so it does the CONVERT() predicate first. But note the following example:
select CustomerID,CompanyName,PostalCode
from Northwind.dbo.Customers
where Country='USA' and convert(int,PostalCode)>99000
/*
CustomerID CompanyName PostalCode
---------- ----------------------- ----------
LAZYK Lazy K Kountry Store 99362
OLDWO Old World Delicatessen 99508
*/
This query completed successfully. In this case it evaluated the Country=’USA’ predicate first, because it is more cost-effective than the CONVERT() predicate. Note that even though it scanned all the other customers, it ONLY evaluated the CONVERT() predicate AFTER it determined that the Country was ‘USA’.
If one of your predicates is dependent on the other, as was your case with ISNUMERIC() and CONVERT(), then the safest way to construct the query is with a CASE statement:
select CustomerID,CompanyName,PostalCode
from Northwind.dbo.Customers
where case
when isnumeric(PostalCode)=1
then convert(int,PostalCode)
else -1
end>99000
/*
CustomerID CompanyName PostalCode
---------- ----------------------- ----------
LAZYK Lazy K Kountry Store 99362
OLDWO Old World Delicatessen 99508
*/
Master, I don’t understand why this query is behaving this way. I used a LEFT JOIN because I wanted ALL employees to come out regardless of whether they have Orders with Freight over $800 or not. But it’s
not producing ALL employees. Doesn’t a LEFT JOIN guarantee that?
select e.EmployeeID,LastName,OrderID,Freight
from Northwind.dbo.Employees e
left join Northwind.dbo.Orders o on e.EmployeeID=o.EmployeeID
where o.Freight>800
order by e.EmployeeID
/*
EmployeeID LastName OrderID Freight
----------- ---------- -------- --------
2 Fuller 10691 810.05
3 Leverling 10540 1007.64
5 Buchanan 10372 890.78
7 King 11030 830.75
*/
Grasshopper, the query IS doing a LEFT JOIN in the manner you expect, and it is creating an intermediate set of ALL employees with their orders (if any). However, it is your WHERE clause that is effectively filtering out your employees with no orders at all. Remember this important rule: If you’re doing a LEFT JOIN, do not refer to any of the right-hand-side’s fields in the WHERE clause. Instead, put the condition in the ON clause of the JOIN itself:
select e.EmployeeID,LastName,OrderID,Freight
from Northwind.dbo.Employees e
left join Northwind.dbo.Orders o on e.EmployeeID=o.EmployeeID
and o.Freight>800
order by e.EmployeeID
/*
EmployeeID LastName OrderID Freight
----------- ---------- -------- --------
1 Davolio NULL NULL
2 Fuller 10691 810.05
3 Leverling 10540 1007.64
4 Peacock NULL NULL
5 Buchanan 10372 890.78
6 Suyama NULL NULL
7 King 11030 830.75
8 Callahan NULL NULL
9 Dodsworth NULL NULL
*/
The only exception to this rule is if you’re specifically checking for NULLs in the right-hand side, as in this example, which finds customers with no orders at all:
select c.CustomerID
from Northwind.dbo.Customers c
left join Northwind.dbo.Orders o on c.CustomerID=o.CustomerID
where o.CustomerID is null
/*
CustomerID
----------
FISSA
PARIS
*/
The days melt into weeks, and the passing of many moons beget years… until one day…
Master, there is a man who wanders about this temple and other monasteries, talking in a rude manner. He tells me that I have many design flaws in my data and that I’m performing operations that should be done in the application layer instead. He tells me I’m using “hillbilly” naming conventions instead of ISO 11179-5 metadata standards, and that IDENTITY keys are not valid because they are meaningless “Kabbalah” numbers with none of the relational properties of a proper key. He told me that I shouldn’t do attribute splitting and that Date and Time are an atomic component representing a temporal dimension as an interval. He also says the ROW_NUMBER() function is a disastrous sequential operation that… Master? MASTER?
Grasshopper, I am sorry. I didn’t mean to nod off.
(He straightens his robes). I know of this man. What is your question?
Master, I don’t have one. I just wanted to talk about him. I think that he just goes out of his way to taunt me and others. If you filter out the inappropriate references to “hillbilly” and “Kabbalah”, perhaps small parts of some of the things he says have merit in a dry textbook-like theoretical sense, but he doesn’t seem to be living in the real world. I just ignore him and pretend he’s not there.
Grasshopper, I think you may have reached enlightenment.
Master, I am honored. Please look at this T-SQL Script:
select Pebble
into Grasshoppers.Hand
from Masters.Hand
option (fast 1)
Grasshopper, I am most proud. You are ready to leave. Remember always that a wise man walks with his head bowed, humble like the dust.
And so, our young hero, once a novice, now a T-SQL Fu Master, walks off into the sunset with head bowed. He has mastered the art of T-SQL Fu, but his training is not yet finished. There is much work ahead in studying SSRS Fu and SSIS Fu and SSAS Fu and BI Fu and MDX Fu and…