Tuesday, January 12, 2010

The Troll's Puzzle: A SQL Fable

This blog entry is participating in the second T-SQL Tuesday, hosted once again by Adam Machanic. You are invited to visit his blog to join the party and read more blogs participating in this month’s theme: Puzzling Situations.

Now please sit back and enjoy our story…



The SQL Troll's CastleOnce upon a time, in a faraway land, there lived three T-SQL programmers.

Frederick Function just loved to call T-SQL functions in his queries. He thought it made his code look more complicated and that made him feel good. He especially liked working with dates because his code would be sprinkled with lots and lots of DATEADDs and DATEDIFFs.

Osric Ordinal liked to take shortcuts in his queries any time he had the opportunity.

And Colin Columnname carefully named his columns in his SELECT list and used those column names throughout the query where possible.

These three young lads were on a quest to save a fair maiden who was held prisoner in a castle in a deep dark forest. This castle was guarded by an evil ugly troll. Legend had it that the troll had three seemingly simple yet deceptive T-SQL puzzles… If any brave soul was able to answer all three T-SQL puzzles successfully, he could enter the castle and save the fair maiden. However, if unsuccessful in solving any of the T-SQL puzzles, he would be eaten by the troll.

Our three young heroes were brave and confident men, and so they approached the troll.

“Whaddya want?” asked the troll of the three men, spitting green slime as he spoke.

“We have come to solve your T-SQL challenges and save the fair maiden,” answered each of the men.

“You’re all fools!” snarled the troll. “You are delving into territory where hundreds have failed.”

None of the men said anything… they were steadfast in their intentions.

“Hah! We shall proceed then. You all arrived just in time… I was getting a bit hungry. Heh-heh-heh.” The troll's stomach growled and his eyes narrowed and earwax dripped out of his right ear. “Take out your laptops, and prepare for the first challenge!” he bellowed.

The men were ready at their keyboards.

“Challenge Number One: Write a query that pulls out the first and last initial of all the names in the Contacts table in AdventureWorks. Name those columns FirstInitial and LastInitial. Finally, sort the result in order of the first and last initial. You have 60 seconds. Go!”

For a few seconds, the three brave men were in shock, because they couldn’t believe how easy this first challenge was, but after a short while, they started typing furiously at their keyboards. They all looked up long before the 60 seconds were over.

“Let’s see what you all have,” said the troll.

Frederick Function demonstrated his query. He used the same LEFT() functions in the ORDER BY clause that he had used in the SELECT list… he thought it looked really cool to have the functions repeated like that:

select FirstInitial=left(FirstName,1)
,LastInitial=left(LastName,1)
from AdventureWorks.Person.Contact
order by left(FirstName,1)
,left(LastName,1)
/*
FirstInitial LastInitial
------------ -----------
A A
A A
A A
A A
...
Z W
Z W
Z Y
Z Z
(19972 rows)
*/
Osric Ordinal’s query took advantage of using ordinal numbers in the ORDER BY to represent column positions in the SELECT list. He loved taking this kind of shortcut:

select FirstInitial=left(FirstName,1)
,LastInitial=left(LastName,1)
from AdventureWorks.Person.Contact
order by 1,2
/*
FirstInitial LastInitial
------------ -----------
A A
A A
A A
A A
...
Z W
Z W
Z Y
Z Z
(19972 rows)
*/
Colin Columnname’s query used the names of the columns that were established in the SELECT list in his ORDER BY clause:

select FirstInitial=left(FirstName,1)
,LastInitial=left(LastName,1)
from AdventureWorks.Person.Contact
order by FirstInitial
,LastInitial
/*
FirstInitial LastInitial
------------ -----------
A A
A A
A A
A A
...
Z W
Z W
Z Y
Z Z
(19972 rows)
*/
“You have all successfully passed the first challenge,” growled the troll. “But that’s no great accomplishment… Most people do.”

The troll’s stomach rumbled again, this time so loud that the ground shook. A pimple on his left cheek popped spontaneously and violently, its putrid contents spewing forth and missing Colin by mere centimeters.

“Challenge Number Two: Wrap the query of your first challenge inside a stored procedure called #GetContactInitials which accepts a single string parameter called @SortOrder. If @SortOrder is equal to ‘First’ then sort by the first initial and last initial. If @SortOrder is equal to ‘Last’ then sort by last initial and first initial. No IF statements are allowed… you must have only a single query in the stored procedure. You have 60 seconds. Go!”

The men got to work.

Frederick Function finished very quickly, and came up with the following stored procedure, which compiled without error. Again, he was even more proud of his procedure because it had even more function calls in it:

if object_id('tempdb..[#GetContactInitials]','P') is not null 
drop procedure #GetContactInitials
go
create procedure #GetContactInitials
@SortOrder
varchar(20)
as
select
FirstInitial=left(FirstName,1)
,LastInitial=left(LastName,1)
from AdventureWorks.Person.Contact
order by case @SortOrder
when 'First' then left(FirstName,1)
when 'Last' then left(LastName,1)
end
,case @SortOrder
when 'First' then left(LastName,1)
when 'Last' then left(FirstName,1)
end
Osric Ordinal also produced his stored procedure very quickly,using his beloved ordinal number approach in the ORDER BY. His procedure also compiled without error:

if object_id('tempdb..[#GetContactInitials]','P') is not null 
drop procedure #GetContactInitials
go
create procedure #GetContactInitials
@SortOrder
varchar(20)
as
select
FirstInitial=left(FirstName,1)
,LastInitial=left(LastName,1)
from AdventureWorks.Person.Contact
order by case @SortOrder
when 'First' then 1
when 'Last' then 2
end
,case @SortOrder
when 'First' then 2
when 'Last' then 1
end
Colin Columnname at first had trouble. He had put his procedure together like so, but it produced compile errors:

if object_id('tempdb..[#GetContactInitials]','P') is not null 
drop procedure #GetContactInitials
go
create procedure #GetContactInitials
@SortOrder
varchar(20)
as
select
FirstInitial=left(FirstName,1)
,LastInitial=left(LastName,1)
from AdventureWorks.Person.Contact
order by case @SortOrder
when 'First' then FirstInitial
when 'Last' then LastInitial
end
,case @SortOrder
when 'First' then LastInitial
when 'Last' then FirstInitial
end
/*
Msg 207, Level 16, State 1, Procedure #GetContactInitials, Line 9
Invalid column name 'FirstInitial'.
Msg 207, Level 16, State 1, Procedure #GetContactInitials, Line 10
Invalid column name 'LastInitial'.
Msg 207, Level 16, State 1, Procedure #GetContactInitials, Line 13
Invalid column name 'LastInitial'.
Msg 207, Level 16, State 1, Procedure #GetContactInitials, Line 14
Invalid column name 'FirstInitial'.
*/
At first this seemed confusing. Why was he getting error messages saying that ‘FirstInitial’ and ‘LastInitial’ were invalid column names? Clearly they were valid in his first challenge. And the ORDER BY is one of the last operations taking place in a query, so the column names would have already been established by the SELECT list. It was puzzling indeed.

Then, in a flash of insight, Colin understood. Because the ORDER BY clause consisted of CASE expressions, T-SQL had to evaluate those expressions for each and every row in the FROM table. In other words, during the Clustered Index Scan of the Contacts table, T-SQL would have to compute the CASE expression values. And then, only after all rows were evaluated in this manner, finally the Sort required by the ORDER BY would take place on those already-evaluated CASE expressions. So, while it was true that the actual ORDER BY Sort logically took place at the end of a query operation, the expressions within the ORDER BY list had to be evaluated at the beginning of the query operation.

So Colin quickly re-wrote his query to instead use a derived table in his FROM clause. This way he could keep his column names in the ORDER BY. He managed to type the very last character of code just as the 60 seconds were up:

if object_id('tempdb..[#GetContactInitials]','P') is not null 
drop procedure #GetContactInitials
go
create procedure #GetContactInitials
@SortOrder
varchar(20)
as
select
FirstInitial
,LastInitial
from (select FirstInitial=left(FirstName,1)
,LastInitial=left(LastName,1)
from AdventureWorks.Person.Contact) InitialValues
order by case @SortOrder
when 'First' then FirstInitial
when 'Last' then LastInitial
end
,case @SortOrder
when 'First' then LastInitial
when 'Last' then FirstInitial
end
“Now we will test out your stored procedures,” snarled the troll.

Frederick Function demonstrated his stored procedure, and it produced the correct output:

exec #GetContactInitials 'First'
/*
FirstInitial LastInitial
------------ -----------
A A
A A
A A
A A
...
Z W
Z W
Z Y
Z Z
(19972 rows)
*/

exec #GetContactInitials 'Last'
/*
FirstInitial LastInitial
------------ -----------
A A
A A
A A
A A
...
W Z
W Z
W Z
Z Z
(19972 rows)
*/
Colin Columnname demonstrated his stored procedure, and it, too, produced the same correct output.

Finally, Osric Ordinal demonstrated his stored procedure. But it produced output that was unexpected:

exec #GetContactInitials 'First'
/*
FirstInitial LastInitial
------------ -----------
G A
C A
K A
H A
P A
...
C G
I R
C H
C Z
C H
(19972 rows)
*/

exec #GetContactInitials 'Last'
/*
FirstInitial LastInitial
------------ -----------
G A
C A
K A
H A
P A
...
C G
I R
C H
C Z
C H
(19972 rows)
*/
“I don’t understand,” said Osric. “I wrapped my original query in a stored procedure, and I used a CASE exp---“

But Osric didn’t get a chance to finish his sentence, because his head had suddenly disappeared. It was being pulverized by the troll’s sharp teeth. Before the rest of Osric’s headless body could crumple to the ground, the troll scooped it up and swallowed it too.

The troll let out a loud belch. Frederick and Colin tried not to faint from the foul stench of his breath.

Colin glanced over at Osric’s laptop and saw that Osric’s query had the same CASE expression evaluation problem that Colin’s original query had. Osric had (incorrectly) figured that his use of ordinal column positions would still be honored within the CASE expressions, but instead, they were just evaluated as simple constant integers of 1 and 2, not referring to column positions anymore at all. For example, when his procedure was called with the parameter ‘First’, each row had the same CASE expression evaluations… the first CASE would evaluate to the constant integer 1 and the second would evaluate to the constant integer 2. Since every single row had the same constant integer expressions to ORDER BY, the final result set simply came out in the order it was scanned… in clustered index order.

“Challenge Number Three: This is similar to Challenge Number Two. The only difference is that the stored procedure is to be called #GetDistinctContactInitials and the result set should be the DISTINCT set of first and last initials. The same rules apply regarding the @SortOrder parameter and having only a single query in the stored procedure. You have 60 seconds. Go!”

Colin took his procedure from Challenge Number Two and renamed the procedure as instructed and simply added a DISTINCT to his derived table (and also renamed the derived table’s alias name for clarity and completeness). It compiled without error:

if object_id('tempdb..[#GetDistinctContactInitials]','P') is not null 
drop procedure #GetDistinctContactInitials
go
create procedure #GetDistinctContactInitials
@SortOrder
varchar(20)
as
select
FirstInitial
,LastInitial
from (select
DISTINCT FirstInitial=left(FirstName,1)
,LastInitial=left(LastName,1)
from AdventureWorks.Person.Contact) DistinctValues
order by case @SortOrder
when 'First'
then FirstInitial
else LastInitial
end
,case @SortOrder
when 'First'
then LastInitial
else FirstInitial
end
Frederick did the same thing… he renamed the procedure and added a DISTINCT to his query, but he got the following error when trying to compile it:

if object_id('tempdb..[#GetDistinctContactInitials]','P') is not null 
drop procedure #GetDistinctContactInitials
go
create procedure #GetDistinctContactInitials
@SortOrder
varchar(20)
as
select

DISTINCT FirstInitial=left(FirstName,1)
,LastInitial=left(LastName,1)
from AdventureWorks.Person.Contact
order by case @SortOrder
when 'First' then left(FirstName,1)
when 'Last' then left(LastName,1)
end
,case @SortOrder
when 'First' then left(LastName,1)
when 'Last' then left(FirstName,1)
end
/*
Msg 145, Level 15, State 1, Procedure #GetDistinctContactInitials, Line 5
ORDER BY items must appear in the select list if SELECT DISTINCT is specified.
*/
What the heck? All Frederick really did was add a DISTINCT to his query. Why was it giving him problems? And what did the error message mean? The ORDER BY items were clearly in the SELECT list, he thought.

Frederick wasn’t as clever as Colin in figuring out the fact that having CASE expressions in his ORDER BY clause forced T-SQL to evaluate those expressions during the Clustered Index Scan phase of the query. It had worked fine for his procedure in Challenge Number Two. But by adding the DISTINCT in Challenge Number Three, everything changed. Since adding the DISTINCT keyword shrinks the result set, one must ORDER BY only the columns in that resulting shrunken set… One cannot ORDER BY any expressions based on columns in the original FROM table any more.

Frederick started to get nervous. As the precious seconds ticked by, he could see no way to correct his query.

“Time’s up!” growled the troll, “Let’s see what you have.” He looked at Frederick’s procedure and saw that it had compile errors.

Rivulets of sweat were gushing down Frederick’s face and body like waterfalls.

“Ahhhh…” said the troll, “Thank you for adding the fine salty marinade.” And without another word, he gobbled up Frederick, swallowing him whole.

The troll turned to Colin.

“Demonstrate your stored procedure!” commanded the troll.

Colin did just that, coming up with the correct result:

exec #GetDistinctContactInitials 'First'
/*
FirstInitial LastInitial
------------ -----------
A A
A B
A C
A D
A E
...
Z S
Z T
Z W
Z Y
Z Z
(544 Rows)
*/

exec #GetDistinctContactInitials 'Last'
/*
FirstInitial LastInitial
------------ -----------
A A
B A
C A
D A
E A
...
S Z
T Z
V Z
W Z
Z Z
(544 Rows)
*/
“Finally! It’s about time!” said the troll. “I was getting sick and tired of this whole SQL puzzle gig. Congratulations… The maiden is yours. I’ll see ya later… I’m off to terrorize a village.” And he left without another word, trudging over a nearby hill.

Colin raced into the castle and took the steps up to the tower three at a time.

And there was the fair maiden. She turned to look at him, the twin silks from her wimple rippling in the soft breeze from the window.

“My word!” he exclaimed, “You’re absolutely beautiful! In fact, in that white dress and wimple, you look just Yeoman Tonia Barrows in---”

The fair maiden interrupted, completing his thought: “…in Shore Leave, the 17th episode of the original series of Star Trek.”

Colin, a certified Star Trek geek, was speechless.

“Thank you,” continued the maiden, “That’s kind of you to say. And that was a fairly good episode, don’t you think? However, I have to say that it doesn’t hold up to other first-season episodes like Devil In The Dark or The City On The Edge of Forever, does it? Still, Theodore Sturgeon, who wrote the teleplay of Shore Leave, did go on to write Amok Time, which was a classic second-season episode.”

Colin couldn’t help himself. “Wow! What a woman!” he exclaimed. He took her into his arms.

And they lived long and prospered happily ever after.

The End.

8 comments:

  1. At the risk of sounding like a broken record. Your latest post is excellent (again).

    (Although, I thought a TROLLBACK TRANSACTION pun was in order)

    ReplyDelete
  2. @Michael: LOL... Touché! Thanks for your comments... I'm glad you enjoyed it!

    ReplyDelete
  3. Very cute. Though that last part makes the whole thing unbelievable.

    ReplyDelete
  4. @Brian:

    What? You thought it was believable up until the last part? ;-)

    ReplyDelete
  5. Trolls, head biting, hey, that happens on USENET everyday. But the rest, that never happens, not even in fantasy novels. :)

    ReplyDelete
  6. Quite apropos for a tsql tuesday about puzzling situations: I just revisited usenet after umpteen years. In particular to my old favorite, rec.puzzles: http://groups.google.com/group/rec.puzzles

    It's so sad. Spam as far as the eye can see.

    ReplyDelete
  7. Hi Brad,
    This is really good article...
    Thanks -- Vijaya Kadiyala
    www.DotNetVJ.com

    ReplyDelete

  8. It was so nice article. I was really satisfied by seeing this article. SQL server dba Online Training Bangalore

    ReplyDelete