The PATH option was introduced in SQL2005 to provide a flexible and easier approach to constructing XML output. I thank my lucky stars that I started with T-SQL at the SQL2005 level, because the SQL2000 method of using the EXPLICIT option looks like a complete nightmare. (If you’re into torture, take a look at Books Online for documentation on how to use the EXPLICIT option. When you're done screaming, then come back and read on).
Let's take a quick look at the output that results with the FOR XML PATH clause. If you pass no specific path name, then it assumes a path of ‘row’:
select ID=ContactIDFor the query below, let's supply a specific path name of ‘Contact’. And it’s usually good practice to create XML with a root tag, and we can do that by adding the ROOT directive like so:
,FirstName
,LastName
,Phone
from Person.Contact
where ContactID between 90 and 94
for xml path
/*
<row>
<ID>90</ID>
<FirstName>Andreas</FirstName>
<LastName>Berglund</LastName>
<Phone>795-555-0116</Phone>
</row>
<row>
<ID>91</ID>
<FirstName>Robert</FirstName>
<LastName>Bernacchi</LastName>
<Phone>449-555-0176</Phone>
</row>
<row>
<ID>92</ID>
<FirstName>Matthias</FirstName>
<LastName>Berndt</LastName>
<Phone>384-555-0169</Phone>
</row>
<row>
<ID>93</ID>
<FirstName>John</FirstName>
<LastName>Berry</LastName>
<Phone>471-555-0181</Phone>
</row>
<row>
<ID>94</ID>
<FirstName>Steven</FirstName>
<LastName>Brown</LastName>
<Phone>280-555-0124</Phone>
</row>
*/
select ID=ContactIDYou’ll note that the column names were used as the tags for each element in the XML. For example, I renamed the first column to be ID rather than ContactID and therefore the element tag <ID></ID> was created.
,FirstName
,LastName
,Phone
from Person.Contact
where ContactID between 90 and 94
for xml path('Contact'),root('Contacts')
/*
<Contacts>
<Contact>
<ID>90</ID>
<FirstName>Andreas</FirstName>
<LastName>Berglund</LastName>
<Phone>795-555-0116</Phone>
</Contact>
<Contact>
<ID>91</ID>
<FirstName>Robert</FirstName>
<LastName>Bernacchi</LastName>
<Phone>449-555-0176</Phone>
</Contact>
<Contact>
<ID>92</ID>
<FirstName>Matthias</FirstName>
<LastName>Berndt</LastName>
<Phone>384-555-0169</Phone>
</Contact>
<Contact>
<ID>93</ID>
<FirstName>John</FirstName>
<LastName>Berry</LastName>
<Phone>471-555-0181</Phone>
</Contact>
<Contact>
<ID>94</ID>
<FirstName>Steven</FirstName>
<LastName>Brown</LastName>
<Phone>280-555-0124</Phone>
</Contact>
</Contacts>
*/
You have the ability to shape the XML in whatever ways you wish based on what names you give to your columns. For example, any column that starts with an at-sign (@) will create attributes rather than elements, as illustrated below:
select "@ID"=ContactIDYou can mix attributes and elements together like so:
,"@FirstName"=FirstName
,"@LastName"=LastName
,"@Phone"=Phone
from Person.Contact
where ContactID between 90 and 94
for xml path('Contact'),root('Contacts')
/*
<Contacts>
<Contact ID="90" FirstName="Andreas" LastName="Berglund" Phone="795-555-0116" />
<Contact ID="91" FirstName="Robert" LastName="Bernacchi" Phone="449-555-0176" />
<Contact ID="92" FirstName="Matthias" LastName="Berndt" Phone="384-555-0169" />
<Contact ID="93" FirstName="John" LastName="Berry" Phone="471-555-0181" />
<Contact ID="94" FirstName="Steven" LastName="Brown" Phone="280-555-0124" />
</Contacts>
*/
select "@ID"=ContactIDAnd you can create nested attributes and elements, as illustrated below:
,FirstName
,LastName
,Phone
from Person.Contact
where ContactID between 90 and 94
for xml path('Contact'),root('Contacts')
/*
<Contacts>
<Contact ID="90">
<FirstName>Andreas</FirstName>
<LastName>Berglund</LastName>
<Phone>795-555-0116</Phone>
</Contact>
<Contact ID="91">
<FirstName>Robert</FirstName>
<LastName>Bernacchi</LastName>
<Phone>449-555-0176</Phone>
</Contact>
<Contact ID="92">
<FirstName>Matthias</FirstName>
<LastName>Berndt</LastName>
<Phone>384-555-0169</Phone>
</Contact>
<Contact ID="93">
<FirstName>John</FirstName>
<LastName>Berry</LastName>
<Phone>471-555-0181</Phone>
</Contact>
<Contact ID="94">
<FirstName>Steven</FirstName>
<LastName>Brown</LastName>
<Phone>280-555-0124</Phone>
</Contact>
</Contacts>
*/
select "@ID"=ContactIDIn the above query, I introduced a Name element with two attributes (Title and Suffix) and two sub-elements (First and Last). You can also see that some of the contacts had NULL for the Title and Suffix and therefore those attributes were not created for those contacts.
,"Name/@Title"=Title
,"Name/@Suffix"=Suffix
,"Name/First"=FirstName
,"Name/Last"=LastName
from Person.Contact
where ContactID between 92 and 94
for xml path('Contact'),root('Contacts')
/*
<Contacts>
<Contact ID="92">
<Name Title="Mr.">
<First>Matthias</First>
<Last>Berndt</Last>
</Name>
<Phone>384-555-0169</Phone>
</Contact>
<Contact ID="93">
<Name>
<First>John</First>
<Last>Berry</Last>
</Name>
<Phone>471-555-0181</Phone>
</Contact>
<Contact ID="94">
<Name Title="Mr." Suffix="IV">
<First>Steven</First>
<Last>Brown</Last>
</Name>
<Phone>280-555-0124</Phone>
</Contact>
</Contacts>
*/
Note that attributes must be introduced first, before the elements. For example, if I tried to do the following, I would get an error:
select "@ID"=ContactIDWhen you have two adjacent columns with the same name, then their data will be concatenated together in one element, like so:
,"Name/First"=FirstName
,"Name/Last"=LastName
,"Name/@Title"=Title
,"Name/@Suffix"=Suffix
,Phone
from Person.Contact
where ContactID between 92 and 94
for xml path('Contact'),root('Contacts')
/*
Msg 6852, Level 16, State 1, Line 1
Attribute-centric column 'Name/@Title' must not come after a
non-attribute-centric sibling in XML hierarchy in FOR XML PATH.
*/
select Name=TitleNote again that NULL column values are ignored in the concatenation.
,Name=FirstName
,Name=MiddleName
,Name=LastName
,Name=Suffix
from Person.Contact
where ContactID between 90 and 94
for xml path('Contact'),root('Contacts')
/*
<Contacts>
<Contact><Name>AndreasBerglund</Name></Contact>
<Contact><Name>Mr.RobertM.Bernacchi</Name></Contact>
<Contact><Name>Mr.MatthiasBerndt</Name></Contact>
<Contact><Name>JohnBerry</Name></Contact>
<Contact><Name>Mr.StevenB.BrownIV</Name></Contact>
</Contacts>
*/
If you wanted to construct a nice readable single element consisting of the contact’s full name (Title, FirstName, MiddleName, LastName, and Suffix), you could approach it like this:
select Name=coalesce(Title+' ','')But look all the logic required to handle possible NULL values in the Title and MiddleName and Suffix columns. Well, good news! You can use the following trick. Incorporate data() into the column name as illustrated below, and it will take care of concatenating it all together with spaces between and eliminating all the NULL values automatically:
+FirstName+' '
+coalesce(MiddleName+' ','')
+LastName
+coalesce(' '+Suffix,'')
from Person.Contact
where ContactID between 90 and 94
for xml path('Contact'),root('Contacts')
/*
<Contacts>
<Contact><Name>Andreas Berglund</Name></Contact>
<Contact><Name>Mr. Robert M. Bernacchi</Name></Contact>
<Contact><Name>Mr. Matthias Berndt</Name></Contact>
<Contact><Name>John Berry</Name></Contact>
<Contact><Name>Mr. Steven B. Brown IV</Name></Contact>
</Contacts>
*/
select "Name/data()"=TitleHowever, this approach will not work if you were trying to construct a Name attribute as opposed to a Name element:
,"Name/data()"=FirstName
,"Name/data()"=MiddleName
,"Name/data()"=LastName
,"Name/data()"=Suffix
from Person.Contact
where ContactID between 90 and 94
for xml path('Contact'),root('Contacts')
/*
<Contacts>
<Contact><Name>Andreas Berglund</Name></Contact>
<Contact><Name>Mr. Robert M. Bernacchi</Name></Contact>
<Contact><Name>Mr. Matthias Berndt</Name></Contact>
<Contact><Name>John Berry</Name></Contact>
<Contact><Name>Mr. Steven B. Brown IV</Name></Contact>
</Contacts>
*/
select "@Name/data()"=TitleBut you can handle that through a sub-query like so:
,"@Name/data()"=FirstName
,"@Name/data()"=MiddleName
,"@Name/data()"=LastName
,"@Name/data()"=Suffix
from Person.Contact
where ContactID between 90 and 94
for xml path('Contact'),root('Contacts')
/*
Msg 6850, Level 16, State 1, Line 1
Column name '@Name/data()' contains an invalid XML identifier as required by FOR XML;
'@'(0x0040) is the first character at fault.
*/
select "@Name"=(select "data()"=TitleBesides data(), you can also incorporate text() or node() into a column name or give a column a wildcard name (*) and the data will be inserted directly as text. They are all interchangeable, as you can see in the following example:
,"data()"=FirstName
,"data()"=MiddleName
,"data()"=LastName
,"data()"=Suffix
from Person.Contact c2
where c2.ContactID=Contact.ContactID
for xml path(''))
from Person.Contact
where ContactID between 90 and 94
for xml path('Contact'),root('Contacts')
/*
<Contacts>
<Contact Name="Andreas Berglund" />
<Contact Name="Mr. Robert M. Bernacchi" />
<Contact Name="Mr. Matthias Berndt" />
<Contact Name="John Berry" />
<Contact Name="Mr. Steven B. Brown IV" />
</Contacts>
*/
select "text()"=TitleRemember, two adjacent columns with names that incorporate data() will be separated by a space, but, as you see above, those named with text() or node() or a wildcard are just concatenated directly with no intervening space.
,"node()"=FirstName
,"*"=MiddleName
,"node()"=LastName
,"text()"=Suffix
from Person.Contact
where ContactID between 90 and 94
for xml path('Contact'),root('Contacts')
/*
<Contacts>
<Contact>AndreasBerglund</Contact>
<Contact>Mr.RobertM.Bernacchi</Contact>
<Contact>Mr.MatthiasBerndt</Contact>
<Contact>JohnBerry</Contact>
<Contact>Mr.StevenB.BrownIV</Contact>
</Contacts>
*/
You only really need to specify text() or node() or wildcard names if you want to insert a text element directly subordinate to the main path element, as we saw in the previous query. If, on the other hand, you are inserting text in a sub-element like so…:
select "Name/text()"=Title…then you’ll see that they are really unnecessary, since the following query (which we looked at earlier) does the exact same thing:
,"Name/node()"=FirstName
,"Name/*"=MiddleName
,"Name/node()"=LastName
,"Name/text()"=Suffix
from Person.Contact
where ContactID between 90 and 94
for xml path('Contact'),root('Contacts')
/*
<Contacts>
<Contact><Name>AndreasBerglund</Name></Contact>
<Contact><Name>Mr.RobertM.Bernacchi</Name></Contact>
<Contact><Name>Mr.MatthiasBerndt</Name></Contact>
<Contact><Name>JohnBerry</Name></Contact>
<Contact><Name>Mr.StevenB.BrownIV</Name></Contact>
</Contacts>
*/
select Name=TitleYou can also incorporate comment() or processing-instruction() into the column names to create those kinds of elements, as illustrated below:
,Name=FirstName
,Name=MiddleName
,Name=LastName
,Name=Suffix
from Person.Contact
where ContactID between 90 and 94
for xml path('Contact'),root('Contacts')
/*
<Contacts>
<Contact><Name>AndreasBerglund</Name></Contact>
<Contact><Name>Mr.RobertM.Bernacchi</Name></Contact>
<Contact><Name>Mr.MatthiasBerndt</Name></Contact>
<Contact><Name>JohnBerry</Name></Contact>
<Contact><Name>Mr.StevenB.BrownIV</Name></Contact>
</Contacts>
*/
select "@ID"=ContactIDYou’ll note above that the two adjacent columns named comment() do NOT concatenate together like other adjacent columns with the same name. They are always separate elements. The same is true for processing-instruction() columns.
,"comment()"='Modified on '+convert(varchar(30),ModifiedDate,126)
,"comment()"=case when ContactID=92 then 'Here is Contact#92' end
,"processing-instruction(EmailPromo)"=EmailPromotion
,"Name/@First"=FirstName
,"Name/@Last"=LastName
,"Name"='This is inserted directly as text'
,"Name"='...And so is this'
,"*"='This is inserted as text in the main Contact path'
from Person.Contact
where ContactID between 90 and 92
for xml path('Contact'),root('Contacts')
/*
<Contacts>
<Contact ID="90">
<!--Modified on 2001-08-01T00:00:00-->
<?EmailPromo 0?>
<Name First="Andreas" Last="Berglund">
This is inserted directly as text...And so is this
</Name>
This is inserted as text in the main Contact path
</Contact>
<Contact ID="91">
<!--Modified on 2002-09-01T00:00:00-->
<?EmailPromo 1?>
<Name First="Robert" Last="Bernacchi">
This is inserted directly as text...And so is this
</Name>
This is inserted as text in the main Contact path
</Contact>
<Contact ID="92">
<!--Modified on 2002-08-01T00:00:00-->
<!--Here is Contact#92-->
<?EmailPromo 1?>
<Name First="Matthias" Last="Berndt">
This is inserted directly as text...And so is this
</Name>
This is inserted as text in the main Contact path
</Contact>
</Contacts>
*/
You can also concatenate whole individual XML documents together, as illustrated below, where we use two scalar subqueries to construct XML data from the Sales.SalesPerson and Sales.SalesReason tables. Since we did not give actual column names to the two subqueries, they are inserted directly as is. (Note that we could have named each of them node() or a wildcard and it would have worked the same. However, it’s important to note that you may NOT use text() or data() in naming true XML datatype columns):
select (select "@ID"=SalesPersonIDNote that the ,TYPE directive was used to make sure that the XML subqueries came through as true XML datatypes. This is very important. If we had left off the ,TYPE directive, they would be processed as strings and then when they were incorporated into the main query, the main FOR XML PATH(‘MyData’) would encode all of the less-than and greater-than signs into this ugly mess:
,"@Quota"=SalesQuota
from Sales.SalesPerson
for xml path('Person'),root('SalesPeople'),type)
,(select "@ID"=SalesReasonID
,"@Name"=Name
,"@Type"=ReasonType
from Sales.SalesReason
for xml path('Reason'),root('SalesReasons'),type)
for xml path('MyData')
/*
<MyData>
<SalesPeople>
<Person ID="268" />
<Person ID="275" Quota="300000.0000" />
<Person ID="276" Quota="250000.0000" />
<Person ID="277" Quota="250000.0000" />
<Person ID="278" Quota="250000.0000" />
<Person ID="279" Quota="300000.0000" />
<Person ID="280" Quota="250000.0000" />
<Person ID="281" Quota="250000.0000" />
<Person ID="282" Quota="250000.0000" />
<Person ID="283" Quota="250000.0000" />
<Person ID="284" />
<Person ID="285" Quota="250000.0000" />
<Person ID="286" Quota="250000.0000" />
<Person ID="287" Quota="300000.0000" />
<Person ID="288" />
<Person ID="289" Quota="250000.0000" />
<Person ID="290" Quota="250000.0000" />
</SalesPeople>
<SalesReasons>
<Reason ID="1" Name="Price" Type="Other" />
<Reason ID="2" Name="On Promotion" Type="Promotion" />
<Reason ID="3" Name="Magazine Advertisement" Type="Marketing" />
<Reason ID="4" Name="Television Advertisement" Type="Marketing" />
<Reason ID="5" Name="Manufacturer" Type="Other" />
<Reason ID="6" Name="Review" Type="Other" />
<Reason ID="7" Name="Demo Event" Type="Marketing" />
<Reason ID="8" Name="Sponsorship" Type="Marketing" />
<Reason ID="9" Name="Quality" Type="Other" />
<Reason ID="10" Name="Other" Type="Other" />
</SalesReasons>
</MyData>
*/
select (select "@ID"=SalesPersonIDNow that we’ve learned so much about FOR XML PATH, let’s put our knowledge to use. Let’s say that you want to construct a webpage or an e-mail that incorporates a table in HTML format. Using our knowledge of FOR XML PATH, we will construct all the HTML between the <table></table> tags. That can then be incorporated into the correct spot in the webpage or e-mail.
,"@Quota"=SalesQuota
from Sales.SalesPerson
for xml path('Person'),root('SalesPeople'))
,(select "@ID"=SalesReasonID
,"@Name"=Name
,"@Type"=ReasonType
from Sales.SalesReason
for xml path('Reason'),root('SalesReasons'))
for xml path('MyData')
/*
<MyData>
<SalesPeople>
<Person ID="268" />
<Person ID="275" Quota="300000.0000" />
<Person ID="276" Quota="250000.0000" />
<Person ID="277" Quota="250000.0000" />
<Person ID="278" Quota="250000.0000" />
<Person ID="279" Quota="300000.0000" />
<Person ID="280" Quota="250000.0000" />
<Person ID="281" Quota="250000.0000" />
<Person ID="282" Quota="250000.0000" />
<Person ID="283" Quota="250000.0000" />
<Person ID="284" />
<Person ID="285" Quota="250000.0000" />
<Person ID="286" Quota="250000.0000" />
<Person ID="287" Quota="300000.0000" />
<Person ID="288" />
<Person ID="289" Quota="250000.0000" />
<Person ID="290" Quota="250000.0000" />
</SalesPeople>
<SalesReasons>
<Reason ID="1" Name="Price" Type="Other" />
<Reason ID="2" Name="On Promotion" Type="Promotion" />
<Reason ID="3" Name="Magazine Advertisement" Type="Marketing" />
<Reason ID="4" Name="Television Advertisement" Type="Marketing" />
<Reason ID="5" Name="Manufacturer" Type="Other" />
<Reason ID="6" Name="Review" Type="Other" />
<Reason ID="7" Name="Demo Event" Type="Marketing" />
<Reason ID="8" Name="Sponsorship" Type="Marketing" />
<Reason ID="9" Name="Quality" Type="Other" />
<Reason ID="10" Name="Other" Type="Other" />
</SalesReasons>
</MyData>
*/
Note that this query below uses most of what we learned in this article. You’ll see the following:
- We create ALIGN and VALIGN attributes to align the table headers correctly.
- We put a <br /> tag into the Phone Number header to split it into two lines.
- We use data() to construct the Full Name of the contact.
- We create a hyperlink for the E-Mail Address
- We subtly color the E-Mail Address in a pale yellow color if EmailPromotion is equal to 1.
- We use the ,TYPE directive in our XML CTEs so that we can concatenate them in subsequent CTEs.
Here’s the query, which creates a single NVARCHAR(MAX) variable called @TableHTML:
declare @TableHTML nvarchar(max);Remember the rule that if two adjacent columns have the same name, their data will concatenated? I had to prevent that from happening with adjacent columns that I named th and td by inserting a blank column with a wildcard name between them to force them to come out as discrete elements.
with HTMLTableHeader(HTMLContent) as
(
select "th/@align"='right'
,"th/@valign"='bottom'
,"th"='ContactID'
,"*"=''
,"th/@valign"='bottom'
,"th"='Full Name'
,"*"=''
,"th"='Phone'
,"th/br"=''
,"th"='Number'
,"*"=''
,"th/@valign"='bottom'
,"th"='Email Address'
for xml path('tr'),type
)
,HTMLTableDetail(HTMLContent) as
(
select "td/@align"='right'
,"td"=ContactID
,"*"=''
,"td/data()"=Title
,"td/data()"=FirstName
,"td/data()"=MiddleName
,"td/data()"=LastName
,"td/data()"=Suffix
,"*"=''
,"td"=Phone
,"*"=''
,"td/@bgcolor"=case when EmailPromotion=1 then '#FFFF88' end
,"td/a/@href"='mailto:'+EmailAddress
,"td/a"=EmailAddress
from Person.Contact
where ContactID between 90 and 94
for xml path('tr'),type
)
,HTMLTable(HTMLContent) as
(
select "@border"=1
,(select HTMLContent from HTMLTableHeader)
,(select HTMLContent from HTMLTableDetail)
for xml path('table') /*No TYPE because we want a string */
)
select @TableHTML=(select HTMLContent from HTMLTable);
And here are the contents of that variable as a result of that query:
/*
<table border="1">
<tr>
<th align="right" valign="bottom">ContactID</th>
<th valign="bottom">Full Name</th>
<th>Phone<br />Number</th>
<th valign="bottom">Email Address</th>
</tr>
<tr>
<td align="right">90</td>
<td>Andreas Berglund</td>
<td>795-555-0116</td>
<td>
<a href="mailto:andreas1@adventure-works.com">andreas1@adventure-works.com</a>
</td>
</tr>
<tr>
<td align="right">91</td>
<td>Mr. Robert M. Bernacchi</td>
<td>449-555-0176</td>
<td bgcolor="#FFFF88">
<a href="mailto:robert4@adventure-works.com">robert4@adventure-works.com</a>
</td>
</tr>
<tr>
<td align="right">92</td>
<td>Mr. Matthias Berndt</td>
<td>384-555-0169</td>
<td bgcolor="#FFFF88">
<a href="mailto:matthias1@adventure-works.com">matthias1@adventure-works.com</a>
</td>
</tr>
<tr>
<td align="right">93</td>
<td>John Berry</td>
<td>471-555-0181</td>
<td>
<a href="mailto:john11@adventure-works.com">john11@adventure-works.com</a>
</td>
</tr>
<tr>
<td align="right">94</td>
<td>Mr. Steven B. Brown IV</td>
<td>280-555-0124</td>
<td>
<a href="mailto:steven1@adventure-works.com">steven1@adventure-works.com</a>
</td>
</tr>
</table>
*/
Our webpage template looks like this, with a placeholder where we want to insert our table:/*
<html>
<head>
<title>HTML Table constructed via FOR XML PATH</title>
</head>
<body style="font-family:Arial; font-size:small">
<span style="font-size:x-large">
<b>Selected Contacts:</b>
</span>
<!-- Insert Table Here -->
</body>
</html>
*/
And here is the final result, with the table data inserted in the placeholder position, when we look at the web page in Internet Explorer:I hope this article gave you a tantalizing look at the possibilities of things you can accomplish with the FOR XML PATH clause. In future blog entries, I’ll explore some other aspects of XML.
Very cool post. I never thought to use SQL Servers xml functionality to generate HTML
ReplyDeleteWow, this covers a lot of stuff. Too much to absorb in one session.
ReplyDeleteWhat i got here was an understanding of some of the elements, so next time i look at it, i won't be as confused.
Thanx Brad!
Excellent post, congrats! I learned a lot. Today I'll go to sleep less silly. Thanks Brad
ReplyDeletethis is the most well explained blog for xml path. did really learned stuffs here. thanks!
ReplyDeleteAnonymous and Anonymous:
ReplyDeleteThanks for the great feedback! I'm glad you found it useful!
I spent hours searching for something this good. Well done my good sir.
ReplyDeleteVery good information thanks.
ReplyDeleteGreat ! Thanks for sharing
ReplyDeleteThank you for taking the time to share this knowledge with everyone. I plan on using it to put together an audit document.
ReplyDelete