Thursday, April 8, 2010

Playing For High Stakes

Playing For High StakesYes, we use T-SQL to manage and manipulate data for multi-billion dollar enterprises, but every now and then we need to take some time for some recreation.

And I don’t mean “lie around on a Hawaiian beach sipping a Mai-Tai” kind of recreation… I’m talking about something more cerebral.

We’re going to teach SQL Server how to play Poker!

First of all, we need a deck of cards. That just involves combining the 13 spots (2, 3, 4, …, Queen, King, Ace) with the 4 suits (Spades, Diamonds, Clubs, Hearts). Each spot has a value… either the value of the numeric spot, or a value of 11, 12, 13 for Jack, Queen, King, and a value of 14 for Ace.

(Note that I use SQL2008’s row constructors to build the lists of spots and suits… if you’re using SQL2005, you will have to re-factor the Spots and Suits derived tables using UNION ALL’s).

(By the way, in running any of the scripts in this article, if you don’t see the Suit Symbols in your results window, you will have to change its font. For example, by default, SSMS is installed with the font of the Grid Results Window to be MS Sans Serif, which does not support all Unicode characters. Change the Grid Results font via Tools -> Options -> Environment -> Fonts and Colors -> Grid Results. I would suggests Arial for Grid Results and Courier New for Text Results).

with DeckOfCards as
(
select SpotValue
,SpotSymbol
,SuitSymbol
from (values ('2',2),('3',3),('4',4),('5',5),('6',6)
,('7',7),('8',8),('9',9),('10',10)
,('J',11),('Q',12),('K',13),('A',14)) Spots(SpotSymbol,SpotValue)
cross join (values (N'♠'),(N'♦'),(N'♣'),(N'♥')) Suits(SuitSymbol)
)
select *
from DeckOfCards
/*
SpotValue SpotSymbol SuitSymbol
--------- ---------- ----------
2 2 ♠
3 3 ♠
4 4 ♠
...
12 Q ♠
13 K ♠
14 A ♠
2 2 ♦
3 3 ♦
...
11 J ♥
12 Q ♥
13 K ♥
14 A ♥
(52 Rows Total)
*/
Now that we have that, we need to shuffle the cards and deal them out. Since a poker hand is 5 cards, the 52 cards will be distributed to 10 players, with 2 cards going to a Player #11. Those 2 cards will be discarded later. We will ORDER BY NEWID() to shuffle the cards into random order as we pass them out. (Note that the DeckOfCards CTE is abbreviated in the code below, just so it doesn’t get too long… I’ll continue to do this as we go step-by-step).

with DeckOfCards as
(. . .)
,
ShuffleAndDeal as
(
select PlayerID=(row_number() over (order by newid())-1)/5+1
,CardName=SpotSymbol+SuitSymbol
,CardValue=SpotValue
,SuitSymbol
from DeckOfCards
)
select *
from ShuffleAndDeal
/*
PlayerID CardName CardValue SuitSymbol
-------- -------- --------- ----------
1 K♣ 13 ♣
1 9♠ 9 ♠
1 3♦ 3 ♦
1 4♦ 4 ♦
1 Q♣ 12 ♣
2 9♥ 9 ♥
2 A♣ 14 ♣
2 J♣ 11 ♣
2 8♦ 8 ♦
2 6♦ 6 ♦
...
10 7♥ 7 ♥
10 10♦ 10 ♦
10 4♣ 4 ♣
10 7♦ 7 ♦
10 Q♥ 12 ♥
11 J♥ 11 ♥
11 9♦ 9 ♦
*/
You may wonder why I didn’t use NTILE(10) in order to distribute the cards. Well, that’s because NTILE(10) would insist on splitting the 52 cards up with the numbers 1 through 10 and two of the players would get 6 cards instead of 5. Thus I had to use the ROW_NUMBER() approach instead to assign the PlayerID.

Now comes the first challenge. How do we evaluate what kind of hand each player has? I came up with the following table that outlines the characteristics of each poker hand:

/*                                                Qnty     Qnty     Diff In
Num Num In In 2nd Value Btwn
Hand Hand Distinct Distinct Num Largest Largest LowCard And
Description Points Values Suits Groups Group Group HighCard Comments
----------------------------------------------------------------------------------------
Royal Flush 9 5 1 0 N/A N/A 4 HiCard=Ace
Straight Flush 8 5 1 0 N/A N/A 4
Four Of A Kind 7 2 N/A 1 4 N/A N/A
Full House 6 2 N/A 2 3 2 N/A
Flush 5 N/A 1 0 N/A N/A N/A
Straight 4 5 N/A 0 N/A N/A 4
Three Of A Kind 3 3 N/A 1 3 N/A N/A
Two Pair 2 3 N/A 2 2 2 N/A
Two Of A Kind 1 4 N/A 1 2 N/A N/A
*/
Each poker hand is assigned an arbitrary number of points (I just used 1 through 9), just so we can see which hand wins over another hand. For each player’s hand, we’re going to have to figure out how many distinct spot values are in the hand and how many distinct suits there are in the hand. We also have to figure out how many groups are in the hand. By groups, I mean combinations of cards with the same value. For example, a Full House has two groups… a group of 3 cards with the same value and a group of 2 cards with the same value.

So in addition to finding the number of groups in a hand, we also have to calculate the quantity of cards in each group. Since a poker hand only consists of 5 cards, we will never have more than 2 groups in any given hand, so having that limit helps us a lot.

As you may recall, I wrote in a previous blog post that the OVER clause is not able to calculate DISTINCT aggregations. So we can’t directly do a COUNT(DISTINCT SuitSymbol) with the OVER clause. And we also can’t introduce a CTE that will calculate our COUNT(DISTINCT) values with a GROUPing BY PlayerID (in order to JOIN it with our set of hands), because that would involve two passes through the data, and we can’t do that because our source data is random.

So, as I outlined in that previous blog post, we can calculate COUNT(DISTINCT) by first doing a DENSE_RANK() in one CTE, and then calculate a MAX() on that DENSE_RANK() value in a second CTE.

So let’s introduce the first CTE, which I cleverly call HandEvaluation1. First of all, we will throw out the extra 2 cards of PlayerID 11 (WHERE PlayerID<=10). Next, we will calculate the DENSE_RANK() of both the Card Values and the Suits. We will also introduce a Card Sequence Number in order to sort the individual hands (PARTITION BY PlayerID) by the Card Value (ORDER BY CardValue). Finally, we are also going to introduce a Group Sequence Number, which is similar to the Card Sequence Number, except it will be PARTITIONed by both PlayerID and CardValue. This will become clearer when you look at some actual sample data below:

with DeckOfCards as
(. . .)
,
ShuffleAndDeal as
(. . .)
,
HandEvaluation1 as
(
select PlayerID
,CardSeq=row_number() over (partition by PlayerID
order by CardValue)
,CardName
,CardValue
,SuitSymbol
,ValDenseRank=dense_rank() over (partition by PlayerID
order by CardValue)
,SuitDenseRank=dense_rank() over (partition by PlayerID
order by SuitSymbol)
,GroupSeq=row_number() over (partition by PlayerID,CardValue
order by CardValue)
from ShuffleAndDeal
where PlayerID<=10
)
select *
from HandEvaluation1
/*
PlayerID CardSeq CardName CardValue SuitSymbol ValDenseRank SuitDenseRank GroupSeq
-------- ------- -------- --------- ---------- ------------ ------------- --------
1 1 3♣ 3 ♣ 1 2 1
1 2 5♥ 5 ♥ 2 3 1
1 3 5♠ 5 ♠ 2 1 2
1 4 9♥ 9 ♥ 3 3 1
1 5 9♣ 9 ♣ 3 2 2
2 1 7♣ 7 ♣ 1 2 1
2 2 8♦ 8 ♦ 2 3 1
2 3 A♣ 14 ♣ 3 2 1
2 4 A♦ 14 ♦ 3 3 2
2 5 A♠ 14 ♠ 3 1 3
...
10 1 3♠ 3 ♠ 1 1 1
10 2 6♠ 6 ♠ 2 1 1
10 3 7♠ 7 ♠ 3 1 1
10 4 10♦ 10 ♦ 4 2 1
10 5 J♠ 11 ♠ 5 1 1
(50 Rows Total)
*/
So take a look at PlayerID #1’s hand. He has a 3 and two 5’s and two 9’s. The CardSeq column shows the sequence numbers of those cards (1 through 5) in CardValue order. The ValDenseRank column is the DENSE_RANK() of the CardValue column… the 3 gets a ValDenseRank of 1, the two 5’s get a value of 2, and the 9’s get a value of 3. The SuitDenseRank does something similar with the Suits in Player #1’s hand.

Finally, take a look at the GroupSeq column. The 3 gets a GroupSeq value of 1, the two 5’s get a value of 1 and 2, and the two 9’s get a value of 1 and 2. By doing this, we can figure out that a hand has groups if there is a GroupSeq value of 2 or more in the hand.

Take a look at PlayerID #2’s hand. He has three Aces. Note the GroupSeq for those cards is 1,2,3. PlayerID #10’s hand has all GroupSeq values of 1, because there are no groups in his hand… he has 5 different cards.

If we look at any Player’s hand, and calculate the MAX() of the GroupSeq, we will know the quantity of cards in the largest group in the hand. This is something important in figuring out what kind of hand a Player has.

So let’s continue building the query. In the next CTE, HandEvaluation2, we will calculate the MAX(ValDenseRank) and MAX(SuitDenseRank) values, which, as I explained earlier, will give us the COUNT(DISTINCT) of CardValues and Suits respectively. We will calculate the MAX(GroupSeq) to figure out the quantity of cards in our largest group. And we will calculate the MAX() and MIN() of the CardValue column to figure out the high and low card in the hand. This will be important in figuring out whether a person has a Straight or not, because a Straight will have a difference of 4 between the HighCardValue and LowCardValue.

with DeckOfCards as
(. . .)
,
ShuffleAndDeal as
(. . .)
,
HandEvaluation1 as
(. . .)
,
HandEvaluation2 as
(
select PlayerID
,CardSeq
,CardName
,CardValue
,GroupSeq
,NumDistinctVals=max(ValDenseRank) over (partition by PlayerID)
,NumDistinctSuits=max(SuitDenseRank) over (partition by PlayerID)
,QtyInLargestGroup=max(GroupSeq) over (partition by PlayerID)
,HighCardValue=max(CardValue) over (partition by PlayerID)
,LowCardValue=min(CardValue) over (partition by PlayerID)
from HandEvaluation1
)
select PlayerID
,CardName
,NumDistinctVals
,NumDistinctSuits
,QtyInLargestGroup
from HandEvaluation2
/*
PlayerID CardName NumDistinctVals NumDistinctSuits QtyInLargestGroup
-------- -------- --------------- ---------------- -----------------
1 3♣ 3 3 2
1 5♥ 3 3 2
1 5♠ 3 3 2
1 9♥ 3 3 2
1 9♣ 3 3 2
2 7♣ 3 3 3
2 8♦ 3 3 3
2 A♣ 3 3 3
2 A♦ 3 3 3
2 A♠ 3 3 3
...
10 3♠ 5 2 1
10 6♠ 5 2 1
10 7♠ 5 2 1
10 10♦ 5 2 1
10 J♠ 5 2 1
(50 Rows Total)
*/
At this point we have almost everything we need in order to figure out what kind of hand a Player has. The only thing that is missing is the Quantity of cards in our second Group. This is only important in evaluating a Full House and Two Pair, since they are the only hands that involve two groups. But they are easy enough to tell apart simply because a Full House has 3 cards in its largest group and a Two Pair has 2 cards in its largest group (or 2 cards in both groups actually). All that is important, really, is this: Is there a second group or not? This is how we will differentiate a Full House from a Three of a Kind or a Two Pair from a Two of a Kind.

We also have, at this point, the difficult task of ranking the hands. We can easily rank two different hands (e.g. a Straight vs. a Flush), but things get complicated when you have to rank two of the same kind of hand. For example, two different players may both have a Two of a Kind. One player has a pair of Jacks and the other player has a pair of 8’s. We have to figure out the CardValue of each player’s group in order to compare them. In this example, the Jack has a higher value than the 8. But, to get even more complicated, what if two players both have a pair of Jacks? Then we have to compare the other 3 cards between the players.

The same is true of two players that have no kind of poker hand at all… we have to compare the actual individual cards to see which player wins out over another. For example, let’s say Player #1 has (J,9,8,7,2) and Player #2 has (J,9,8,7,5). They have 4 cards in common, but it’s that last card that puts Player #2 ahead.

So, we have three tasks… We have to figure out the CardValue of the cards in the largest group (if there are any groups) and the CardValue of the cards in the second largest group (if there is one). We also have to figure out a way to rank the individual cards that are not part of a group.

Let’s do the first two tasks first. The CTE HandEvaluation3 will figure out the CardValue of the largest group. It does this by looking for a card that has a GroupSeq>1 (indicating that it is part of a group) and it has a GroupSeq equal to the QtyInLargestGroup column, which we calculated in HandEvaluation2. Note that we apply a MAX() aggregate. This is because a person may have Two Pair… there are two groups, each with 2 cards. The QtyInLargestGroup is 2, but in reality, it’s the quantity in both groups, so by using the MAX() aggregate we get the CardValue of the highest pair.

The CTE HandEvaluation4 will figure out the CardValue of the second largest group. It does this in a similar way to HandEvaluation3, except it just makes sure that the CardValue is NOT the CardValue of the first largest group.

(Note, by the way, that the output of this is a different set of hands than we looked at previously).

with DeckOfCards as
(. . .)
,
ShuffleAndDeal as
(. . .)
,
HandEvaluation1 as
(. . .)
,
HandEvaluation2 as
(. . .)
,
HandEvaluation3 as
(
select PlayerID
,CardSeq
,CardName
,CardValue
,GroupSeq
,NumDistinctVals
,NumDistinctSuits
,QtyInLargestGroup
,HighCardValue
,LowCardValue
,CardValueInLargestGroup=max(case
when GroupSeq>1
and GroupSeq=QtyInLargestGroup
then CardValue
else 0
end)
over (partition by PlayerID)
from HandEvaluation2
)
,
HandEvaluation4 as
(
select PlayerID
,CardSeq
,CardName
,CardValue
,GroupSeq
,NumDistinctVals
,NumDistinctSuits
,QtyInLargestGroup
,HighCardValue
,LowCardValue
,CardValueInLargestGroup
,CardValueIn2ndLargestGroup=max(case
when GroupSeq>1
and CardValue<>CardValueInLargestGroup
then CardValue
else 0
end)
over (partition by PlayerID)
from HandEvaluation3
)
select PlayerID
,CardName
,QtyInLargestGroup
,CardValueInLargestGroup
,CardValueIn2ndLargestGroup
from HandEvaluation4
/*
PlayerID CardName QtyInLargestGroup CardValueInLargestGroup CardValueIn2ndLargestGroup
-------- -------- ----------------- ----------------------- --------------------------
1 2♥ 2 7 2
1 2♦ 2 7 2
1 4♥ 2 7 2
1 7♥ 2 7 2
1 7♠ 2 7 2
2 3♣ 2 10 0
2 6♣ 2 10 0
2 9♠ 2 10 0
2 10♥ 2 10 0
2 10♦ 2 10 0
...
10 3♥ 1 0 0
10 5♦ 1 0 0
10 8♥ 1 0 0
10 9♥ 1 0 0
10 A♣ 1 0 0
(50 Rows Total)
*/
So, you can see that PlayerID #1 has two pair… a pair of 7’s (CardValueInLargestGroup) and a pair of 2’s (CardValueIn2ndLargestGroup). PlayerID #2 has a single pair… a pair of 10’s… so his CardValueInLargestGroup is 10 and CardValueIn2ndLargestGroup is 0, because there is no second group. Finally PlayerID #10 has nothing… no groups at all… so those two columns are set to 0.

Now comes the part of evaluating the individual non-group loner cards. Instead of trying to evaluate them individually, I decided to put together a string containing all their values. This way, in order to evaluate whether one Player’s hand outranks another, I could just compare strings instead of trying to compare a bunch of individual cards. So the CTE HandEvaluation5 will concatenate together those lone cards into a column called LoneCardValString. Since we already have a CardSeq value, indicating the sequence of the cards in ascending order, we can use that to construct the LoneCardValString in descending order, looking at CardSeq=5, then CardSeq=4, etc, on down to CardSeq=1.

with DeckOfCards as
(. . .)
,
ShuffleAndDeal as
(. . .)
,
HandEvaluation1 as
(. . .)
,
HandEvaluation2 as
(. . .)
,
HandEvaluation3 as
(. . .)
,
HandEvaluation4 as
(. . .)
,
HandEvaluation5 as
(
select PlayerID
,CardSeq
,CardName
,CardValue
,GroupSeq
,NumDistinctVals
,NumDistinctSuits
,QtyInLargestGroup
,HighCardValue
,LowCardValue
,CardValueInLargestGroup
,CardValueIn2ndLargestGroup
,LoneCardValString=coalesce(
str(
max(case
when CardSeq=5
and CardValue<>CardValueInLargestGroup
and CardValue<>CardValueIn2ndLargestGroup
then CardValue
end) over (partition by PlayerID)
,2)
,'')
+coalesce(
str(
max(case
when CardSeq=4
and CardValue<>CardValueInLargestGroup
and CardValue<>CardValueIn2ndLargestGroup
then CardValue
end) over (partition by PlayerID)
,2)
,'')
+coalesce(
str(
max(case
when CardSeq=3
and CardValue<>CardValueInLargestGroup
and CardValue<>CardValueIn2ndLargestGroup
then CardValue
end) over (partition by PlayerID)
,2)
,'')
+coalesce(
str(
max(case
when CardSeq=2
and CardValue<>CardValueInLargestGroup
and CardValue<>CardValueIn2ndLargestGroup
then CardValue
end) over (partition by PlayerID)
,2)
,'')
+coalesce(
str(
max(case
when CardSeq=1
and CardValue<>CardValueInLargestGroup
and CardValue<>CardValueIn2ndLargestGroup
then CardValue
end) over (partition by PlayerID)
,2)
,'')
from HandEvaluation4
)
select PlayerID
,CardName
,CardValueInLargestGroup
,CardValueIn2ndLargestGroup
,LoneCardValString
from HandEvaluation5
/*
PlayerID CardName CardValueInLargestGroup CardValueIn2ndLargestGroup LoneCardValString
-------- -------- ----------------------- -------------------------- -----------------
1 2♥ 7 2 4
1 2♦ 7 2 4
1 4♥ 7 2 4
1 7♥ 7 2 4
1 7♠ 7 2 4
2 3♣ 10 0 9 6 3
2 6♣ 10 0 9 6 3
2 9♠ 10 0 9 6 3
2 10♥ 10 0 9 6 3
2 10♦ 10 0 9 6 3
...
10 3♥ 0 0 14 9 8 5 3
10 5♦ 0 0 14 9 8 5 3
10 8♥ 0 0 14 9 8 5 3
10 9♥ 0 0 14 9 8 5 3
10 A♣ 0 0 14 9 8 5 3
(50 Rows Total)
*/
So take a look at PlayerID #2’s hand, which is a pair of 10’s. His 3 individual non-group cards are a 9 and a 6 and a 3, which you can see in his LoneCardValString column. If another Player had a pair of 10’s, then I could compare their LoneCardValString’s to see which one outranks the other.

Now comes the time to figure out what hand each player has. The CTE HandEvaluation6 will assign a HandPoints column with a value of 0 through 9, where 0 represents a “Nothing” hand and 1 represents Two of a Kind, up to 9, which represents a Royal Flush. This is done based on the table of rules that I had presented towards the beginning of this article.

Then the CTE HandEvaluation7 will assign a HandDescript column with a string describing the type of hand. It also calculates a PlayerRanking column, which will rank the players on the following combination: HandPoints (based on the kind of poker hand the player has), CardValueInLargestGroup, CardValueIn2ndLargestGroup, and finally, the LoneCardValString.

(Note, by the way, that the output of this is a different set of hands than we looked at previously).

with DeckOfCards as
(. . .)
,
ShuffleAndDeal as
(. . .)
,
HandEvaluation1 as
(. . .)
,
HandEvaluation2 as
(. . .)
,
HandEvaluation3 as
(. . .)
,
HandEvaluation4 as
(. . .)
,
HandEvaluation5 as
(. . .)
,
HandEvaluation6 as
(
select PlayerID
,CardSeq
,CardName
,CardValueInLargestGroup
,CardValueIn2ndLargestGroup
,LoneCardValString
,HandPoints=case
/* Royal Flush: */
when NumDistinctSuits=1
and NumDistinctVals=5
and HighCardValue-LowCardValue=4
and HighCardValue=14
then 9
/* Straight Flush: */
when NumDistinctSuits=1
and NumDistinctVals=5
and HighCardValue-LowCardValue=4
then 8
/* Four of a Kind: */
when QtyInLargestGroup=4
then 7
/* Full House: */
when QtyInLargestGroup=3
and CardValueIn2ndLargestGroup<>0
then 6
/* Flush: */
when NumDistinctSuits=1
then 5
/* Straight: */
when NumDistinctVals=5
and HighCardValue-LowCardValue=4
then 4
/* Three of a Kind: */
when QtyInLargestGroup=3
then 3
/* Two Pair: */
when QtyInLargestGroup=2
and NumDistinctVals=3
then 2
/* Two of a Kind: */
when QtyInLargestGroup=2
then 1
/* Nothing: */
else 0
end
from HandEvaluation5
)
,
HandEvaluation7 as
(
select PlayerID
,CardSeqName='Card'+str(CardSeq,1)
,CardName
,HandDescript=case
when HandPoints=9 then 'Royal Flush'
when HandPoints=8 then 'Straight Flush'
when HandPoints=7 then 'Four of a Kind'
when HandPoints=6 then 'Full House'
when HandPoints=5 then 'Flush'
when HandPoints=4 then 'Straight'
when HandPoints=3 then 'Three of a Kind'
when HandPoints=2 then 'Two Pair'
when HandPoints=1 then 'Two of a Kind'
else 'Nothing'
end
,PlayerRanking=dense_rank() over (order by HandPoints desc
,CardValueInLargestGroup desc
,CardValueIn2ndLargestGroup desc
,LoneCardValString desc)
from HandEvaluation6
)
select PlayerID
,CardSeqName
,CardName
,HandDescript
,PlayerRanking
from HandEvaluation7
order by PlayerRanking
/*
PlayerID CardSeqName CardName HandDescript PlayerRanking
-------- ----------- ----------- --------------- -------------
3 Card1 5♦ Flush 1
3 Card2 7♦ Flush 1
3 Card3 8♦ Flush 1
3 Card4 10♦ Flush 1
3 Card5 Q♦ Flush 1

6 Card1 3♠ Two of a Kind 2
6 Card2 9♣ Two of a Kind 2
6 Card3 10♠ Two of a Kind 2
6 Card4 A♣ Two of a Kind 2
6 Card5 A♠ Two of a Kind 2

7 Card1 6♥ Two of a Kind 3
7 Card2 9♥ Two of a Kind 3
7 Card3 J♦ Two of a Kind 3
7 Card5 Q♠ Two of a Kind 3
7 Card4 Q♣ Two of a Kind 3

5 Card2 5♣ Two of a Kind 4
5 Card1 5♠ Two of a Kind 4
5 Card3 6♠ Two of a Kind 4
5 Card4 7♣ Two of a Kind 4
5 Card5 A♦ Two of a Kind 4

4 Card2 4♥ Two of a Kind 5
4 Card1 4♠ Two of a Kind 5
4 Card3 5♥ Two of a Kind 5
4 Card4 8♠ Two of a Kind 5
4 Card5 K♥ Two of a Kind 5

10 Card1 2♠ Two of a Kind 6
10 Card2 3♦ Two of a Kind 6
10 Card3 3♥ Two of a Kind 6
10 Card4 9♠ Two of a Kind 6
10 Card5 K♦ Two of a Kind 6

1 Card1 3♣ Nothing 7
1 Card2 6♦ Nothing 7
1 Card3 9♦ Nothing 7
1 Card4 J♥ Nothing 7
1 Card5 A♥ Nothing 7

9 Card1 2♥ Nothing 8
9 Card2 7♥ Nothing 8
9 Card3 10♣ Nothing 8
9 Card4 Q♥ Nothing 8
9 Card5 K♣ Nothing 8

2 Card1 2♦ Nothing 9
2 Card2 8♣ Nothing 9
2 Card3 10♥ Nothing 9
2 Card4 J♣ Nothing 9
2 Card5 K♠ Nothing 9

8 Card1 2♣ Nothing 10
8 Card2 4♣ Nothing 10
8 Card3 7♠ Nothing 10
8 Card4 8♥ Nothing 10
8 Card5 J♠ Nothing 10
*/
So it looks like PlayerID #3 is the winner, with a Flush. Many of the other players had Two of a Kind… note that the player with a pair of Aces outranked the player with a pair of Queens, who outranked the player with a pair of 5’s, etc. Also note the players who had no poker hand at all and how they were ranked based on all their individual cards. The player ranked last had a Jack high, which is pretty good, but the other players had an Ace high or King high, so they outranked him.

Now the final piece of the puzzle is to PIVOT all those 50 rows into just 10. Each row will show the PlayerID, his hand, and the description of that hand. The 10 players will be output in descending order of rank, so that the player on top is the winner of the round. Note that the CTE HandEvaluation7 above already created a new column called CardSeqName with the values 'Card1' through 'Card5', and we can use that column to PIVOT on in the final query.

(Once again, the output of this is a different set of hands than we looked at previously… Hey, it is random, isn’t it?).

with DeckOfCards as
(. . .)
,
ShuffleAndDeal as
(. . .)
,
HandEvaluation1 as
(. . .)
,
HandEvaluation2 as
(. . .)
,
HandEvaluation3 as
(. . .)
,
HandEvaluation4 as
(. . .)
,
HandEvaluation5 as
(. . .)
,
HandEvaluation6 as
(. . .)
,
HandEvaluation7 as
(. . .)
select PlayerID
,Hand=Card1+' '+Card2+' '+Card3+' '+Card4+' '+Card5
,HandDescript
from HandEvaluation7
pivot (max(CardName) for CardSeqName in (Card1,Card2,Card3,Card4,Card5)) P
order by PlayerRanking
/*
PlayerID Hand HandDescript
-------- --------------- ---------------
3 2♠ 7♠ 8♠ 9♠ 10♠ Flush
1 5♥ 5♣ 5♠ 6♣ K♣ Three of a Kind
5 2♥ 2♦ 6♥ K♠ K♦ Two Pair
4 6♠ 10♦ Q♠ Q♥ K♥ Two of a Kind
6 4♦ 7♣ 7♦ 8♥ J♦ Two of a Kind
2 2♣ 3♣ 3♠ 9♥ J♣ Two of a Kind
9 5♦ 9♦ J♥ Q♦ A♦ Nothing
7 3♥ 4♣ 6♦ 10♥ A♠ Nothing
8 3♦ 4♥ 8♣ 9♣ A♣ Nothing
10 4♠ 8♦ 10♣ J♠ Q♣ Nothing
*/
So we’ve done it! You can download the entire script from my Sky Drive. Execute it over and over and see how often you get a Flush or a Four of a Kind or even a Royal Flush.

Oh, yes, one more thing…

The great thing about a computer is that you can give it an incredibly boring and tedious task and it will gladly do it for you. Since I had a script to evaluate poker hands, I decided to see what the probabilities were for dealing out each kind of hand. With a fresh set of shuffled cards, I dealt out a single hand and evaluated it. I wrote a loop to do this over and over 2,598,960 times. And here is what I came up with:

/*
HandDescript Occurrences PercentOccurred
-------------------------------------------
Nothing 1,303,701 50.162411%
Two of a Kind 1,098,481 42.266176%
Two Pair 123,259 4.742628%
Three of a Kind 54,955 2.114500%
Straight 9,095 0.349948%
Flush 5,007 0.192654%
Full House 3,777 0.145327%
Four of a Kind 653 0.025125%
Straight Flush 24 0.000923%
Royal Flush 8 0.000308%
-------------------------------------------
GRAND TOTAL 2,598,960 100.000000%
*/
This very closely resembles the statistics in the Poker Probability article at Wikipedia, which mathematically calculates out the probabilities on 2,598,960 hands.

One glaring difference, though, is that I came up with a Royal Flush with twice the frequency than the Wikipedia article mathematically calculated.

Perhaps I’m just lucky.

16 comments:

  1. This is awesome!!! I loved the graphical effects and the concept of playing poker in SQL. Great job!!

    ReplyDelete
  2. You are insane. A good type of insane, but insane nonetheless.

    I like the idea of doing all these things in SQL. Sets are everywhere, and if we just look, we'll see them.

    A critique: I find your posts, such as this, confusing. It's like i'm coming along for the ride, but if i don't catch your train of thought, the rest is just a blur. Everything here is explained after the fact "i did this because"..."i didn't do that because", which are great comments inside an SP or VIEW that someone else might need to maintain, but is expected to be comfortable with the context. Here, it takes a moment to get me into your mode of (insane) thinking.

    I think i'm asking for a little more explanation of the approach, before diving into it. Maybe even showing what doesn't work. (I'm not so clear on the whole thing myself.) Then you can publish the SQL to Mother Goose and Grimm.

    ReplyDelete
  3. I think you need a statistician to see how significant the discrepancy between your royal flush experience and the expected royal flush experience is.
    Stats was not my favorite subject.

    ReplyDelete
  4. Nice! I started something similar several years ago but you've taken it to a much higher level.

    http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/texas-hold-em-hint-1.aspx

    ReplyDelete
  5. @Brian:

    Yes, "Insane" is my middle name.

    Regarding your critique... I appreciate your bringing it up. This kind of complicated stuff does get hard to communicate. But the steps that I outlined in the blog were the actual steps that I took in developing the solution (but I left out a couple things I tried and eliminated). My explanations mirrored what I was thinking when composing the code.

    I read over the post several times before posting it, trying to make sure it would be clear, and I thought it was. But perhaps I am "just too close" to it... it was already ingrained in my brain, so what seemed clear to me may be "clear as mud" to others.

    It was a challenge to write the code... my brain did go through several contortions... but you eventually get "into the zone" and it all becomes clear. But it is admittedly hard to get into that same zone if you haven't begun the entire process from scratch... It's hard for me to follow someone else's code sometimes... especially something as wild as this was.

    Wait till next Tuesday... Most likely I won't even bother trying to explain the code in my T-SQL Tuesday post because it is even more complex than the Poker stuff... I may just let the comments in the code speak for themselves... We'll see.

    ReplyDelete
  6. @Adam:

    Thanks for the comment... I was wondering if someone else was crazy enough to do something like this, and I'm glad to see I wasn't the first. 8^)

    I did consider doing a more mathematical approach at first, but didn't come up with anything that I liked. Admittedly the light bulb went on with regard to my eventual approach before I was able to pursue mathematics any further. I like your idea of the powers of two... I look forward to looking into your code in more detail... thanks for that.

    ReplyDelete
  7. @Brad

    Understood. Perhaps you can post your goofs as well. If you're documenting you're "process", why not?

    ReplyDelete
  8. Good idea & read Brad. Brilliant stuff here!

    ReplyDelete
  9. This was amazing! And perhaps to explain your "luck" : they [wikipedia] were computing probability directly, whereas your SQL server has the luxury of presenting reality. I'm willing to "bet" (pun intended!) that if you re-ran this program 10, 20, 100, 1000 times with averaged results, you'd achieve numbers closer to those that the math predicts. (And SQL Server would absolutely LOVE drawing 2,598,960,000 (2.6 BILLION) poker hands for you!)

    ReplyDelete
  10. @Kevin:

    I'm glad you enjoyed it! Thanks for the feedback... Now if I can just find the time to run the program a few thousand times... 8^)

    --Brad

    ReplyDelete
  11. This comment has been removed by the author.

    ReplyDelete
  12. This comment has been removed by the author.

    ReplyDelete
  13. I like the approach you've taken and have made use of the SQL too. I thought I found an issue with the straight Ace though five, when I should have been using one though five, took me a while to spot my mistake in the SQL I modified to compare a number of hands.

    ReplyDelete
  14. Thanks for the comments, Daniel... I think you're right... I only counted the Ace as a "high ace" and didn't account for it possibly being a "low ace".

    This probably explains why I didn't get as many straights as the Wikipedia probability indicates.

    --Brad

    ReplyDelete
  15. I wanted to keep that beautiful script bug-free and corrected the case with straights. Revised version is at below path.

    https://docs.google.com/open?id=0B9ceZntjqm9EMjlpcEVFckhPTk0

    New simulation results show that the difference at straights with Wikipedia values are eliminated.

    HandDescript Occurrences PercentOccurred
    -------------------------------------------
    Nothing 1,301,768 50.088035214 %
    Two of a Kind 1,099,008 42.286453043 %
    Two Pair 123,739 4.761096746 %
    Three of a Kind 54,699 2.104649552 %
    Straight 10,185 0.391887524 %
    Flush 5,181 0.199348970 %
    Full House 3,757 0.144557823 %
    Four of a Kind 592 0.022778342 %
    Straight Flush 27 0.001038877 %
    Royal Flush 4 0.000153908 %
    -------------------------------------------
    GRAND TOTAL 2,598,960 100.0000000 %

    ReplyDelete
  16. Hi, this is awesome! May I challenge you? 13 cards per player in 3 group (3-5-5).. this is Chinese Poker. No-one has the calculation yet... I'm working on it now... and losing half of my hair from my head... :)

    ReplyDelete