Sunday, April 16, 2006

Using a Subquery in a T-SQL Statement !

What is a Subquery?
A subquery is a SELECT statement that is nested within another T-SQL statement. A subquery SELECT statement if executed independently of the T-SQL statement, in which it is nested, will return a result set. Meaning a subquery SELECT statement can standalone and is not depended on the statement in which it is nested. A subquery SELECT statement can return any number of values, and can be found in, the column list of a SELECT statement, a FROM, GROUP BY, HAVING, and/or ORDER BY clauses of a T-SQL statement. A Subquery can also be used as a parameter to a function call. Basically a subquery can be used anywhere an expression can be used.
Use of a Subquery in the Column List of a SELECT Statement
Suppose you would like to see the last OrderID and the OrderDate for the last order that was shipped to Paris. Along with that information, say you would also like to see the OrderDate for the last order shipped regardless of the ShipCity. In addition to this, you would also like to calculate the difference in days between the two different OrderDates. Here is my T-SQL SELECT statement to accomplish this:


select top 1 OrderId,convert(char(10),
OrderDate,121) Last_Paris_Order,
(select convert(char(10),max(OrderDate),121) from
Northwind.dbo.Orders) Last_OrderDate,
datediff(dd,OrderDate,
(select Max(OrderDate)from Northwind.dbo.Orders)) Day_Diff
from Northwind.dbo.Orders
where ShipCity = 'Paris'
order by OrderDate desc

The above code contains two subqueries. The first subquery gets the OrderDate for the last order shipped regardless of ShipCity, and the second subquery calculates the number of days between the two different OrderDates. Here I used the first subquery to return a column value in the final result set. The second subquery was used as a parameter in a function call. This subquery passed the "max(OrderDate)" date to the DATEDIFF function.
Use of a Subquery in the WHERE clause
A subquery can be used to control the records returned from a SELECT by controlling which records pass the conditions of a WHERE clause. In this case the results of the subquery would be used on one side of a WHERE clause condition. Here is an example:select distinct country from Northwind.dbo.Customers
where country not in (select distinct country from Northwind.dbo.Suppliers)
Here I have returned a list of countries where customers live, but there is no supplier located in that country. I suppose if you where trying to provide better delivery time to customers, then you might target these countries to look for additional suppliers.
Suppose a company would like to do some targeted marketing. This targeted marketing would contact customers in the country with the fewest number of orders. It is hoped that this targeted marketing will increase the overall sales in the targeted country. Here is an example that uses a subquery to return the customer contact information for the country with the fewest number of orders:
select Country,CompanyName, ContactName, ContactTitle, Phone
from Northwind.dbo.Customers
where country =
(select top 1 country
from Northwind.dbo.Customers C
join
Northwind.dbo.Orders O
on C.CustomerId = O.CustomerID
group by country
order by count(*))

Here I have written a subquery that joins the Customer and Orders Tables to determine the total number of orders for each country. The subquery uses the "TOP 1" clause to return the country with the fewest number of orders. The country with the fewest number of orders is then used in the WHERE clause to determine which Customer Information will be displayed.
Use of a Subquery in the FROM clause
The FROM clause normally identifies the tables used in the T-SQL statement. You can think of each of the tables identified in the FROM clause as a set of records. Well, a subquery is just a set of records, and therefore can be used in the FROM clause just like a table. Here is an example where a subquery is used in the FROM clause of a SELECT statement:
select au_lname, au_fname, title from
(select au_lname, au_fname, au_id from pubs.dbo.authors
where state = 'CA') as a
join
pubs.dbo.titleauthor ta on a.au_id=ta.au_id
join
pubs.dbo.titles t on ta.title_id = t.title_id

Here I have used a subquery to select only the author record information, if the author's record has a state column equal to "CA." I have named the set returned from this subquery with a table alias of "a." I can then use this alias elsewhere in the T-SQL statement to refer to the columns from the subquery by prefixing them with an "a", as I did in the "ON" clause of the "JOIN" criteria. Sometimes using a subquery in the FROM clause reduces the size of the set that needs to be joined. Reducing the number of records that have to be joined enhances the performance of joining rows, and therefore speeds up the overall execution of a query.
Here is an example where I used a subquery in the FROM clause of an UPDATE statement:
set nocount on
create table x(i int identity,
a char(1))
insert into x values ('a')
insert into x values ('b')
insert into x values ('c')
insert into x values ('d')
select * from x
update x
set a = b.a
from (select max(a) as a from x) b
where i > 2
select * from x
drop table x

Here I created a table named "x," that has four rows. Then I proceeded to update the rows where "i" was greater than 2 with the max value in column "a". I used a subquery in the FROM clause of the UPDATE statement to identity the max value of column "a."
Use of a Subquery in the HAVING clause
In the following example, I used a subquery to find the number of books a publisher has published where the publisher is not located in the state of California. To accomplish this I used a subquery in a HAVING clause. Here is my code:
select pub_name, count(*) bookcnt
from pubs.dbo.titles t
join
pubs.dbo.publishers p
on t.pub_id = p.pub_id
group by pub_name
having p.pub_name in
(select pub_name from pubs.dbo.publishers where state <> 'CA')

Here my subquery returns the pub_name values for all publishers that have a state value not equal to "CA." The HAVING condition then checks to see if the pub_name is in the set returned by my subquery.

(Taken From Databasejournal article by Gregory A. Larsen)

No comments: