Beginning SQL - Paul Wilton

Published by Leander Lim, 2022-12-01 22:11:08

Chapter 7

9
10
11
12
13
14
15

1
10
14

6
5
8
7
11
15
4
13
12
1

You can also order the results by adding an ORDER BY clause. However, only one ORDER BY clause is
allowed, and it must be after the last SELECT statement. In addition, you can use only column names
from the first SELECT statement in your ORDER BY clause, although the clause orders all the rows in all
the SELECT statements. The following query obtains the union of three queries and orders them by the
FilmName column. Note that this query won’t work in Oracle because Oracle doesn’t support the
YEAR() function and has a different ORDER BY syntax:

SELECT FilmName, YearReleased FROM Films
UNION ALL
SELECT LastName, YEAR(DateOfBirth) FROM MemberDetails
UNION ALL
SELECT City, NULL FROM Location
ORDER BY FilmName;

There are a few things to note about this query. First, even though the YearReleased and DateOfBirth
columns have a different data type, the value returned by the YEAR() function is an integer data type,
so this will match the YearReleased column, which is also an integer data type. Remove the YEAR()
function and the query still works, except you might find that YearReleased is converted from an integer
into a date data type with some quite odd results. The YEAR() function is covered in Chapter 5, as is a dis-
cussion of data type conversion. Although normally CAST() can be used to convert data types, it’s not nec-
essary here because YEAR() returns the correct data type for the results.

Also notice that the ORDER BY clause comes at the end of the UNION statements and refers to the first
SELECT query. Again, this doesn’t work on IBM’s DB2, which allows the ORDER BY statement to come
last only if the name of the column being ordered appears in all the queries. Likewise, the ORDER BY
syntax when using UNION is different in Oracle. Rather than specifying the name of the column to be
ordered, you specify the position in which the column appears in the SELECT list. If you want to order
by FilmName, which is the first column, the ORDER BY clause would look like the following:

ORDER BY 1;

If you want to order by the second column, change the 1 to a 2. If you want to order by the third column,
then change the 1 to a 3, and so on.

Also note that the query includes only the City column from the Location table and that NULL is
substituted for the second column. This works on many database platforms but not IBM DB2. On DB2,
change the value from NULL to an integer value that makes it clear that the value is not a genuine piece
of data (-1, for example).

Executing the query provides the following results:

15th Late Afternoon 1989
Big City NULL
Botts 1956
Dales 1956
Doors 1994
Gee 1967
Gone with the Window Cleaner 1988
Hawthorn NULL
Hills 1992
Jackson 1974
Johnson 1945
Jones 1952
Jones 1953
Night 1997
Nightmare on Oak Street, Part 23 1997
On Golden Puddle 1967
One Flew over the Crow’s Nest 1975
Orange Town NULL
Planet of the Japes 1967
Raging Bullocks 1980
Sense and Insensitivity 2001
Simons 1937
Smith 1977
Soylent Yellow 1967
The Dirty Half Dozen 1987
The Good, the Bad, and the Facially Challenged 1989
The Life of Bob 1984
The Lion, the Witch, and the Chest of Drawers 1977
The Maltese Poodle 1947
The Wide-Brimmed Hat 2005
Windy Village NULL

The results are a combination of the three individual SELECT statements. The first statement is as follows:

SELECT FilmName, YearReleased FROM Films

It produces a list of film names and years of release. The second SELECT statement is shown here:

SELECT LastName, YEAR(DateOfBirth) FROM MemberDetails

This statement produces a list of members’ last names and years of birth. The third and final statement is
as follows:

SELECT City, NULL FROM Location

This particular statement produces a list of city names, and in the second column, it simply produces the
value NULL.

That completes the look at the UNION operator, and you should know how to produce one single set of
results based on the results of more than one query. While the UNION operator is in some ways similar to
joins, the biggest difference is that each query is totally separate; there’s no link between them.
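
As a rough way to try the ordering rules described in this section without a full database server, the following sketch runs a three-way UNION ALL in SQLite through Python's sqlite3 module. The handful of sample rows and the exact birth dates are made up for illustration. SQLite has no YEAR() function, so strftime('%Y', ...) wrapped in CAST stands in for it; that substitution is an assumption of this sketch, not syntax from the chapter.

```python
import sqlite3

# Throwaway in-memory database with a few illustrative rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Films (FilmName TEXT, YearReleased INTEGER);
CREATE TABLE MemberDetails (LastName TEXT, DateOfBirth TEXT);
CREATE TABLE Location (City TEXT);
INSERT INTO Films VALUES ('Soylent Yellow', 1967), ('The Life of Bob', 1984);
INSERT INTO MemberDetails VALUES ('Smith', '1977-01-01'), ('Gee', '1967-01-01');
INSERT INTO Location VALUES ('Big City');
""")

# One ORDER BY, placed after the last SELECT; it sorts the combined rows.
UNION_SQL = """
SELECT FilmName, YearReleased FROM Films
UNION ALL
SELECT LastName, CAST(strftime('%Y', DateOfBirth) AS INTEGER) FROM MemberDetails
UNION ALL
SELECT City, NULL FROM Location
ORDER BY {order_key};
"""

# Order by the first SELECT's column name, then by column position.
by_name = conn.execute(UNION_SQL.format(order_key="FilmName")).fetchall()
by_position = conn.execute(UNION_SQL.format(order_key="1")).fetchall()

for row in by_name:
    print(row)
```

In this sketch both forms of ORDER BY produce the same ordering, and the NULL placeholder comes back as Python's None.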

Selecting Data from Different Tables

Summary

This chapter advanced your knowledge of joins far beyond the simple inner joins covered in Chapter 3.
You can now answer questions that were impossible to answer with a basic inner join. The chapter
started off by revisiting inner joins. It looked at how to achieve multiple joins with multiple conditions.
Additionally, this chapter covered the difference between equijoins, which use the equals operator, and
non-equijoins, which use comparison operators such as greater than or less than.

The chapter then turned to outer joins. Whereas inner joins require the ON condition to return true,
outer joins don’t. The chapter covered the following types of outer joins:

❑ Left outer joins, which return rows from the table on the left of the join, whether the ON clause is
true or not

❑ Right outer joins, which return rows from the table on the right, regardless of the ON clause

❑ Full outer joins, which return rows from the tables on both sides, even if the ON clause is false

In the final part of the chapter, you learned about how you can use the UNION statement to combine the
results from two or more SELECT queries into just one set of results.

The next chapter introduces the topic of subqueries and shows you how you can include another query
inside a query.

Exercises

1. List pairs of people who share the same favorite film category. The results set must include the
category name and the first and last names of the people who like that category.

2. List all the categories of film that no one has chosen as their favorite.

8

Queries within Queries

This chapter examines how you can nest one query inside another. SQL allows queries within
queries, or subqueries, which are SELECT statements inside SELECT statements. This might sound
a bit odd, but subqueries can actually be very useful. The downside, however, is that they can
consume a lot of processing, disk, and memory resources.

A subquery’s syntax is just the same as a normal SELECT query’s syntax. As with a normal SELECT
statement, a subquery can contain joins, WHERE clauses, HAVING clauses, and GROUP BY clauses.
Specifically, this chapter shows you how to use subqueries in three ways: with SELECT statements,
either returning results inside a column list or helping filter results inside a WHERE or HAVING
clause; with UPDATE and DELETE statements, to update and delete data; and with operators such as
EXISTS, ANY, SOME, and ALL, which are introduced later in this chapter.

Subqueries are particularly powerful when coupled with SQL operators such as IN, ANY, SOME,
and ALL, which are covered shortly. However, the chapter begins with a few easy examples of
subqueries. Before starting, you should note that versions of MySQL prior to version 4.1 do not
fully support subqueries and many of the examples in this chapter won’t work on early versions
of MySQL.

Subquery Terminology

Throughout this chapter, you’ll notice references to outer and inner queries. The outer
query is the main SELECT statement, and you could say that so far all of your SELECT statements
have been outer queries. Shown below is a standard query:

SELECT MemberId FROM MemberDetails;

Using the standard query, you can nest — that is, place inside the outer query — a subquery, which
is termed the inner query:

SELECT MemberId FROM MemberDetails
WHERE MemberId = (SELECT MAX(FilmId) FROM Films);


Chapter 8

A WHERE clause is added to the outer query, and it specifies that MemberId must equal the value
returned by the nested inner query, which is contained within brackets in the preceding example. It’s
also possible to nest a subquery inside the inner query. Consider the following example:

SELECT MemberId FROM MemberDetails
WHERE MemberId = (SELECT MAX(FilmId) FROM Films
                  WHERE FilmId IN (SELECT LocationId FROM Location));

In the preceding example, a subquery is added to the WHERE clause of the inner query.

In this chapter, a subquery inside a subquery is referred to as the innermost query. Topics are explained
more clearly as the chapter progresses, but you will start by looking at subqueries inside SELECT column
lists.

Subqueries in a SELECT List

You can include a subquery as one of the expressions returning a value in a SELECT query, just as you
can include a single column. However, the subquery must return just one value, that is, one row
containing one column, in what is known as a scalar subquery. The subquery must also be enclosed in
brackets. An example makes things a bit clearer:

SELECT Category,
(SELECT MAX(DVDPrice) FROM Films WHERE Films.CategoryId = Category.CategoryId),
CategoryId
FROM Category;

The SELECT query starts off by selecting the Category column, much as you’ve already seen a hundred
times before in the book. However, the next item in the list is not another column but rather a subquery.
This query inside the main query returns the maximum price of a DVD. An aggregate function returns
only one value, complying with the need for a subquery in a SELECT statement to be a scalar subquery.
The subquery is also linked to the outer SELECT query using a WHERE clause. Because of this link,
MAX(DVDPrice) returns the maximum price for each category in the Category table.

If you execute the whole query, you get the following results:

Category    DVDPrice  Category.CategoryId

Thriller    12.99     1
Romance     12.99     2
Horror      9.99      3
War         12.99     4
Sci-fi      12.99     5
Historical  15.99     6
Comedy      NULL      7
Film Noir   NULL      9

There are no films with a value in the DVDPrice column for the Comedy or Film Noir categories, so
NULL is returned.

Conceptually speaking, it’s worth taking a little time to go over how the results were formed. Each
DBMS has its own way of creating the results, but the underlying concept is the same.

Starting with the first row in the results, the Category is Thriller and the CategoryId is 1. The sub-
query is joined to the outer query by the CategoryId column present in both the Films and Category
tables. For the first row, the CategoryId is 1, so the subquery finds the maximum DVDPrice for all films
in the Films table where the CategoryId is 1. Moving to the next row in the outer query, the Category is
Romance and the CategoryId is 2. This time, the subquery finds the maximum DVDPrice for all records
where the CategoryId is 2. The process continues for every row in the Category table.
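
The row-by-row description above can be sketched in Python with SQLite. This is only a conceptual model under stated assumptions: the three categories and three films are illustrative sample rows, and the explicit Python loop merely imitates what the text describes; a real DBMS is free to evaluate the correlated subquery in a different way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Category (CategoryId INTEGER PRIMARY KEY, Category TEXT);
CREATE TABLE Films (FilmName TEXT, DVDPrice REAL, CategoryId INTEGER);
INSERT INTO Category VALUES (1, 'Thriller'), (2, 'Romance'), (7, 'Comedy');
INSERT INTO Films VALUES
  ('The Maltese Poodle', 2.99, 1),
  ('The Life of Bob', 12.99, 1),
  ('On Golden Puddle', 12.99, 2);
""")

# The chapter's query: a correlated scalar subquery in the SELECT list.
correlated = conn.execute("""
SELECT Category,
       (SELECT MAX(DVDPrice) FROM Films WHERE Films.CategoryId = Category.CategoryId),
       CategoryId
FROM Category
ORDER BY CategoryId;
""").fetchall()

# Conceptual model: for each outer row, run the inner query with that row's CategoryId.
row_by_row = []
outer_rows = conn.execute(
    "SELECT CategoryId, Category FROM Category ORDER BY CategoryId"
).fetchall()
for category_id, name in outer_rows:
    (max_price,) = conn.execute(
        "SELECT MAX(DVDPrice) FROM Films WHERE CategoryId = ?", (category_id,)
    ).fetchone()
    row_by_row.append((name, max_price, category_id))

print(correlated)
print(row_by_row)
```

Both approaches give the same rows here, and the category with no films gets None (NULL) because MAX() over no rows returns NULL.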

Without the WHERE clause in the subquery linking the subquery to the outer query, the result would sim-
ply be the maximum value of all rows returned by the subquery. If you change the SQL and remove the
WHERE clause in the subquery, you get the following statement:

SELECT Category,
(SELECT MAX(DVDPrice) FROM Films),
CategoryId
FROM Category;

Executing this query provides the following results:

Category MAX(DVDPrice) Category.CategoryId

Thriller 15.99 1
Romance 15.99 2
Horror 15.99 3
War 15.99 4
Sci-fi 15.99 5
Historical 15.99 6
Comedy 15.99 7
Film Noir 15.99 9

MAX(DVDPrice) is now simply the maximum DVDPrice for all records in the Films table and is not
specifically related to any category.

Although aggregate functions such as MAX, MIN, AVG, and so on are ideal for subqueries because they
return just one value, any expression or column is suitable as long as the results set consists of just one
row. For example, the following subquery works because it returns only one row:

SELECT FilmName, PlotSummary, (SELECT Email FROM MemberDetails WHERE MemberId = 1)
FROM Films;

MemberId is unique in the MemberDetails table, and therefore WHERE MemberId = 1 returns only one
row, and the query works:

FilmName                                        PlotSummary  Email

The Dirty Half Dozen                            Six men go to war wearing unwashed uniforms. The horror!  [email protected]
On Golden Puddle                                A couple finds love while wading through a puddle.  [email protected]
The Lion, the Witch, and the Chest of Drawers   A fun film for all those interested in zoo/magic/furniture drama.  [email protected]
Nightmare on Oak Street, Part 23                The murderous Terry stalks Oak Street.  [email protected]
The Wide-Brimmed Hat                            Fascinating life story of a wide-brimmed hat  [email protected]
Sense and Insensitivity                         She longs for a new life with Mr. Arcy; he longs for a small cottage in the Hamptons.  [email protected]
Planet of the Japes                             Earth has been destroyed, to be taken over by a species of comedians.  [email protected]
The Maltese Poodle                              A mysterious bite mark, a guilty-looking poodle. First-class thriller.  [email protected]
One Flew over the Crow’s Nest                   Life and times of a scary crow.  [email protected]
Raging Bullocks                                 A pair of bulls get cross with each other.  [email protected]
The Life of Bob                                 A seven-hour drama about Bob’s life. What fun!  [email protected]
Gone with the Window Cleaner                    Historical documentary on window cleaners. Thrilling.  [email protected]
The Good, the Bad, and the Facially Challenged  Joe seeks plastic surgery in this spaghetti Western.  [email protected]
15th Late Afternoon                             One of Shakespeare’s lesser-known plays  [email protected]
Soylent Yellow                                  Detective Billy Brambles discovers that Soylent Yellow is made of soya bean. Ewwww!  [email protected]

The preceding results are a combination of FilmName and PlotSummary from the Films table and the
results of the subquery, which is the email address of the member with an ID of 1.

If, however, you change the MemberDetails table to the Attendance table, where MemberId appears
more than once, you get the following query:

SELECT FilmName, PlotSummary,
(SELECT MeetingDate FROM Attendance WHERE MemberId = 1)
FROM Films;

Executing this query results in an error message similar to the following:

“Subquery returned more than 1 value. This is not permitted when the subquery
follows =, !=, <, <= , >, >= or when the subquery is used as an expression.”

The error wording depends on the database system you’re using, but the overall message is the same:
More than one value — can’t do that with a subquery!

An alternative to using a literal value would be to link the inner query to the outer query, so long as the
result of the inner query produces only one row. For example, the following query shows the CategoryId
from the FavCategory table, and in the subquery, it shows FirstName from the MemberDetails table. The
MemberId in the subquery is linked to the FavCategory table’s MemberId for each row in the outer
query — thereby ensuring that the subquery returns only one row, as MemberIds are unique in the
MemberDetails table:

SELECT CategoryId,
(SELECT FirstName FROM MemberDetails WHERE MemberId = FavCategory.MemberId)
FROM FavCategory;

The results of executing this query are as follows:

CategoryId FirstName

1 John
1 Susie
1 Stuart
2 Katie
2 Doris
3 Susie
3 Jamie
3 William
4 Katie
4 Jenny
4 Jamie
4 Stuart
5 Steve
5 Jamie
5 William
6 Susie
6 Stuart
6 Doris
7 Steve
7 Jenny
7 Stuart

If you think that a join would work just as well as, and more efficiently than, a subquery, then you’re
right; it would make more sense to use a join, but unfortunately, doing so wouldn’t demonstrate how to
use a subquery!
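
For comparison, here is a minimal sketch of the join alternative next to the subquery version, run in SQLite through Python's sqlite3 module. The member names and favorite-category rows are hypothetical sample data chosen for this sketch, not the chapter's actual table contents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MemberDetails (MemberId INTEGER PRIMARY KEY, FirstName TEXT);
CREATE TABLE FavCategory (CategoryId INTEGER, MemberId INTEGER);
INSERT INTO MemberDetails VALUES (1, 'Katie'), (4, 'Steve'), (5, 'John');
INSERT INTO FavCategory VALUES (1, 5), (2, 1), (4, 1), (5, 4);
""")

# Subquery version: one scalar lookup per FavCategory row.
subquery_rows = conn.execute("""
SELECT CategoryId,
       (SELECT FirstName FROM MemberDetails WHERE MemberId = FavCategory.MemberId)
FROM FavCategory
ORDER BY CategoryId, MemberId;
""").fetchall()

# Join version: the same pairing expressed as an inner join.
join_rows = conn.execute("""
SELECT FavCategory.CategoryId, MemberDetails.FirstName
FROM FavCategory INNER JOIN MemberDetails
  ON MemberDetails.MemberId = FavCategory.MemberId
ORDER BY FavCategory.CategoryId, FavCategory.MemberId;
""").fetchall()

print(subquery_rows)
print(join_rows)
```

The two results match for this data. One difference to keep in mind: if a FavCategory row pointed at a MemberId that did not exist, the subquery version would still return the row with a NULL name, whereas the inner join would drop it.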

Hopefully you have a good feel for how subqueries work. So far, this chapter has examined only sub-
queries inside a SELECT statement’s column selection. In the next section, you look at subqueries used in
conjunction with a WHERE clause, which is actually the more common use for subqueries. After that, you
learn about some useful operators that you can use with subqueries.

Subqueries in the WHERE Clause

As far as subqueries are concerned, the WHERE clause is where it’s at! Terrible pun, but it is true that
subqueries are at their most useful in WHERE clauses. The syntax and form that subqueries take in WHERE
clauses are identical to how they are used in SELECT statements, with one difference: now subqueries are
used in comparisons. It’s a little clearer with an example. Imagine that you need to find the name of the
cheapest DVD in each category, or the names if two or more DVDs share the lowest price. You want the
results to display the category name, the name of the DVD, and its price.

The data you need comes from the Category and Films tables. Extracting film and category data is easy
enough; the problem is how to choose only the cheapest DVD from each category. You could use a
GROUP BY clause, as shown in the following query:

SELECT Category, MIN(DVDPrice)
FROM Category INNER JOIN Films
ON Category.CategoryId = Films.CategoryId
GROUP BY Category;

The query’s results, however, don’t supply the name of the film, just the category name and the price of
the cheapest DVD:

Category MIN(DVDPrice)

Historical 8.95
Horror 8.95
Romance 12.99
Sci-fi 12.99
Thriller 2.99
War 12.99

The results are correct, but they don’t contain a film name. Add FilmName to the list of columns in the
SELECT statement, as shown in the following query:

SELECT Category, FilmName, MIN(DVDPrice)
FROM Category INNER JOIN Films
ON Category.CategoryId = Films.CategoryId
GROUP BY Category;

Now execute the query. You receive an error message because both Category and FilmName must
appear in the GROUP BY clause. Adding FilmName to the GROUP BY clause gives the following query:

SELECT Category, FilmName, MIN(DVDPrice)
FROM Category INNER JOIN Films
ON Category.CategoryId = Films.CategoryId
GROUP BY Category, FilmName;

Execute the preceding query and you get the wrong results:

Category    FilmName                                        MIN(DVDPrice)

Historical  15th Late Afternoon                             NULL
Historical  Gone with the Window Cleaner                    9.99
Horror      Nightmare on Oak Street, Part 23                9.99
Romance     On Golden Puddle                                12.99
Horror      One Flew over the Crow’s Nest                   8.95
War         Planet of the Japes                             12.99
Thriller    Raging Bullocks                                 NULL
Historical  Sense and Insensitivity                         15.99
Sci-fi      Soylent Yellow                                  12.99
War         The Dirty Half Dozen                            NULL
Historical  The Good, the Bad, and the Facially Challenged  8.95
Thriller    The Life of Bob                                 12.99
Horror      The Lion, the Witch, and the Chest of Drawers   NULL
Thriller    The Maltese Poodle                              2.99
Sci-fi      The Wide-Brimmed Hat                            NULL

The results are wrong because they’re grouped by Category and FilmName, so the MIN(DVDPrice)
value is not the minimum price for a particular category but rather the minimum price for a particular
film in a particular category!

Clearly what you need is a list of all the lowest prices for a DVD per category, and this is where a sub-
query comes in. In the SQL query, you need to compare the price of a DVD with the minimum price for a
DVD in that category, and you want the query to return a record only if they match:

SELECT Category, FilmName, DVDPrice
FROM Category INNER JOIN Films
ON Category.CategoryId = Films.CategoryId
WHERE Films.DVDPrice =
(SELECT MIN(DVDPrice) FROM Films WHERE Films.CategoryId = Category.CategoryId);

In the query, the Category and Films tables are joined with an inner join based on CategoryId. The query
then restricts which films form the results by use of a subquery in the WHERE clause. The subquery returns
the lowest DVD price for a particular category. The CategoryId in the subquery is linked to the
CategoryId of the Category table in the outer query. Note that Films.CategoryId inside the subquery
refers to the Films table in that subquery, not to the Films table outside the subquery. Then in the WHERE
clause of the outer query, Films.DVDPrice is compared with the minimum price for that category, which
is returned by the subquery. If they match, the film is the cheapest (or joint cheapest) in its category,
and its row is included in the results.

The results are as follows:

Category    FilmName                                        DVDPrice

Thriller    The Maltese Poodle                              2.99
Romance     On Golden Puddle                                12.99
Horror      One Flew over the Crow’s Nest                   8.95
War         Planet of the Japes                             12.99
Sci-fi      Soylent Yellow                                  12.99
Historical  The Good, the Bad, and the Facially Challenged  8.95
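
To experiment with the cheapest-DVD-per-category query, the following sketch runs it in SQLite through Python's sqlite3 module. The two categories and five films are a small illustrative sample rather than the full Films table, and the ORDER BY clause is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Category (CategoryId INTEGER PRIMARY KEY, Category TEXT);
CREATE TABLE Films (FilmName TEXT, DVDPrice REAL, CategoryId INTEGER);
""")
conn.executemany("INSERT INTO Category VALUES (?, ?)", [(1, "Thriller"), (3, "Horror")])
conn.executemany(
    "INSERT INTO Films VALUES (?, ?, ?)",
    [
        ("The Maltese Poodle", 2.99, 1),
        ("The Life of Bob", 12.99, 1),
        ("Raging Bullocks", None, 1),  # no price recorded
        ("One Flew over the Crow's Nest", 8.95, 3),
        ("Nightmare on Oak Street, Part 23", 9.99, 3),
    ],
)

# Grouping by Category and FilmName yields one row per film, not per category.
per_film = conn.execute("""
SELECT Category, FilmName, MIN(DVDPrice)
FROM Category INNER JOIN Films
  ON Category.CategoryId = Films.CategoryId
GROUP BY Category, FilmName;
""").fetchall()

# The subquery compares each film's price with the minimum for its own category.
cheapest = conn.execute("""
SELECT Category, FilmName, DVDPrice
FROM Category INNER JOIN Films
  ON Category.CategoryId = Films.CategoryId
WHERE Films.DVDPrice =
      (SELECT MIN(DVDPrice) FROM Films WHERE Films.CategoryId = Category.CategoryId)
ORDER BY Category.CategoryId;
""").fetchall()

print(len(per_film))
print(cheapest)
```

The film with a NULL price never appears in the second result, because MIN() ignores NULLs and a NULL price never equals the minimum.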

Now that you have a good handle on how subqueries function and how you can include them in
SELECT lists and WHERE clauses, you can move on to more complex subqueries that employ various
operators, including IN, ANY, SOME, and ALL.

Operators in Subqueries

So far, all the subqueries you’ve seen have been scalar subqueries, that is, queries that return only
one row. If more than one row is returned, you end up with an error. In this and the following sections,
you learn about operators that allow you to make comparisons against a multirecord results set.

Revisiting the IN Operator

You first learned about the IN operator in Chapter 3. Just to recap, the IN operator allows you to specify
that you want to match one item from any of those in a list of items. For example, the following SQL
finds all the members who were born in 1967, 1992, or 1937:

SELECT FirstName, LastName, YEAR(DateOfBirth)
FROM MemberDetails
WHERE YEAR(DateOfBirth) IN (1967, 1992, 1937);

Note that this code, and any code that contains the YEAR() function, won’t work in Oracle because it
doesn’t support the YEAR() function.

The query provides the following results:

FirstName LastName YEAR(DateOfBirth)

Steve Gee 1967
Susie Simons 1937
Jamie Hills 1992

What you didn’t learn in Chapter 3, however, is that you can also use the IN operator with subqueries.
Instead of providing a list of literal values, a SELECT query provides the list of values. For example, if
you want to know which members were born in the same year that a film in the Films table was
released, you’d use the following SQL query. Again, these examples won’t work in Oracle because it
doesn’t support the YEAR() function:

SELECT FirstName, LastName, YEAR(DateOfBirth)
FROM MemberDetails
WHERE YEAR(DateOfBirth) IN (SELECT YearReleased FROM Films);

Executing this query gives the following results:

FirstName LastName YEAR(DateOfBirth)

Katie Smith 1977
Steve Gee 1967
Doris Night 1997

The subquery (SELECT YearReleased FROM Films) returns a list of years from the Films table. If a
member’s year of birth matches one of the items in that list, then the WHERE clause is true and the record
is included in the final results.

You may have spotted that this is not the only way to get the result. You could have used an INNER
JOIN coupled with a GROUP BY statement instead, as shown in the following SQL:

SELECT FirstName, LastName, YEAR(DateOfBirth)
FROM MemberDetails JOIN Films ON YEAR(DateOfBirth) = YearReleased
GROUP BY FirstName, LastName, YEAR(DateOfBirth);

Running this query gives the same results as the previous query. So which is best? Unfortunately, there’s
no definitive answer; very much depends on the circumstances, the data involved, and the database sys-
tem involved. A lot of SQL coders prefer a join to a subquery and believe that to be the most efficient.
However, if you compare the speed of the two using MS SQL Server 2000, in this case, on that system, the
subquery is faster by roughly 15 percent. Given how few rows there are in the database, the difference
was negligible in this example, but it might be significant with a lot of records. Which way should you go?
You should go with the way that you find easiest, and fine-tune your SQL code only if problems occur
during testing. If you find on a test system with a million records that your SQL runs like an arthritic snail
with heavy shopping, then you should go back and see whether you can improve your query.
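
If you want to check for yourself that the IN subquery and the join with GROUP BY return the same rows, a small sketch such as the following can help. It uses SQLite through Python's sqlite3 module with a few illustrative rows; the birth dates are made up, and strftime('%Y', ...) with CAST stands in for YEAR(), which SQLite does not provide. It says nothing about relative speed, which depends on the database system and the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MemberDetails (FirstName TEXT, LastName TEXT, DateOfBirth TEXT);
CREATE TABLE Films (FilmName TEXT, YearReleased INTEGER);
INSERT INTO MemberDetails VALUES
  ('Katie', 'Smith', '1977-01-01'),
  ('Steve', 'Gee', '1967-01-01'),
  ('John', 'Jones', '1952-01-01');
INSERT INTO Films VALUES
  ('On Golden Puddle', 1967),
  ('Planet of the Japes', 1967),
  ('The Lion, the Witch, and the Chest of Drawers', 1977);
""")

# SQLite stand-in for YEAR(DateOfBirth); an assumption of this sketch.
YEAR_OF_BIRTH = "CAST(strftime('%Y', DateOfBirth) AS INTEGER)"

in_rows = conn.execute(f"""
SELECT FirstName, LastName, {YEAR_OF_BIRTH}
FROM MemberDetails
WHERE {YEAR_OF_BIRTH} IN (SELECT YearReleased FROM Films)
ORDER BY LastName;
""").fetchall()

join_rows = conn.execute(f"""
SELECT FirstName, LastName, {YEAR_OF_BIRTH}
FROM MemberDetails JOIN Films ON {YEAR_OF_BIRTH} = YearReleased
GROUP BY FirstName, LastName, {YEAR_OF_BIRTH}
ORDER BY LastName;
""").fetchall()

# Without GROUP BY, a member matching several films appears once per film.
ungrouped = conn.execute(f"""
SELECT FirstName FROM MemberDetails JOIN Films ON {YEAR_OF_BIRTH} = YearReleased;
""").fetchall()

print(in_rows)
print(join_rows)
print(len(ungrouped))
```

The ungrouped join shows why the GROUP BY clause is needed: a member born in a year with two film releases would otherwise be listed twice.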

There is one area in which subqueries are pretty much essential: when you want to find something that is
not in a list, something very hard to do with joins. For example, if you want a list of all members who were
not born in the same year that any of the films in the Films table were released, you’d simply change
your previous subquery example from an IN operator to a NOT IN operator:

SELECT FirstName, LastName, YEAR(DateOfBirth)
FROM MemberDetails
WHERE YEAR(DateOfBirth) NOT IN (SELECT YearReleased FROM Films);

The new query gives the following results:

FirstName LastName YEAR(DateOfBirth)

John Jones 1952
Jenny Jones 1953
John Jackson 1974
Jack Johnson 1945
Seymour Botts 1956
Susie Simons 1937
Jamie Hills 1992
Stuart Dales 1956
William Doors 1994

Now a match occurs only if YEAR(DateOfBirth) is not found in the list produced by the subquery
SELECT YearReleased FROM Films. However, getting the same results using a join involves using an
OUTER JOIN, as shown in the following SQL:

SELECT FirstName, LastName, YEAR(DateOfBirth)
FROM MemberDetails
LEFT OUTER JOIN Films
ON Films.YearReleased = Year(MemberDetails.DateOfBirth)
WHERE YearReleased IS NULL;

The query produces almost identical results. The one exception is Catherine Hawthorn, whose date of birth
is NULL: the NOT IN version leaves her out because comparing NULL with the list of years is neither true
nor false, whereas the outer join version includes her. The way the query works is that the left outer join
returns all rows from MemberDetails, regardless of whether the member’s year of birth finds a match in
the YearReleased column of the Films table. From the discussion of outer joins in Chapter 7, you know
that NULL is returned when a match isn’t found. NULLs indicate that YEAR(DateOfBirth) isn’t found in the
Films table, so by adding a WHERE clause that returns rows only when there’s a NULL value in
YearReleased, you make sure that the query will find all the rows in MemberDetails where there’s no
matching year of birth in the Films table’s YearReleased column. The following query modifies the SQL
to remove the WHERE clause and show the YearReleased column:

SELECT FirstName, LastName, YearReleased
FROM MemberDetails
LEFT OUTER JOIN Films
ON Films.YearReleased = Year(MemberDetails.DateOfBirth)
ORDER BY YearReleased;

If you execute this query, you can see how the results came about:

FirstName LastName YearReleased

John Jones NULL
Jenny Jones NULL
John Jackson NULL
Jack Johnson NULL
Seymour Botts NULL
Susie Simons NULL
Jamie Hills NULL
Stuart Dales NULL
William Doors NULL
Catherine Hawthorn NULL
Steve Gee 1967
Steve Gee 1967
Steve Gee 1967
Katie Smith 1977
Doris Night 1997

The advantage of using an outer join in this situation is that it’s quite often more efficient, which can
make a big difference when there are a lot of records involved. The other advantage if you’re using
MySQL before version 4.1 is that subqueries are not supported, so an outer join is the only option. The
advantage of using subqueries is that they are easier to write and easier to read.
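
The difference between the NOT IN subquery and the outer join for a member with a NULL date of birth can be reproduced with a short sketch. It uses SQLite through Python's sqlite3 module; the rows are a small illustrative sample, the birth dates are made up, and strftime('%Y', ...) with CAST is this sketch's stand-in for YEAR().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MemberDetails (FirstName TEXT, LastName TEXT, DateOfBirth TEXT);
CREATE TABLE Films (FilmName TEXT, YearReleased INTEGER);
INSERT INTO MemberDetails VALUES
  ('Steve', 'Gee', '1967-01-01'),
  ('John', 'Jones', '1952-01-01'),
  ('Catherine', 'Hawthorn', NULL);
INSERT INTO Films VALUES ('On Golden Puddle', 1967), ('The Life of Bob', 1984);
""")

# SQLite stand-in for YEAR(DateOfBirth); it yields NULL for a NULL date.
YEAR_OF_BIRTH = "CAST(strftime('%Y', DateOfBirth) AS INTEGER)"

# NULL NOT IN (...) is unknown rather than true, so the NULL row is filtered out.
not_in_rows = conn.execute(f"""
SELECT FirstName, LastName
FROM MemberDetails
WHERE {YEAR_OF_BIRTH} NOT IN (SELECT YearReleased FROM Films)
ORDER BY LastName;
""").fetchall()

# The outer join keeps every member; unmatched rows have NULL in YearReleased.
outer_join_rows = conn.execute(f"""
SELECT FirstName, LastName
FROM MemberDetails
LEFT OUTER JOIN Films
  ON Films.YearReleased = {YEAR_OF_BIRTH}
WHERE YearReleased IS NULL
ORDER BY LastName;
""").fetchall()

print(not_in_rows)
print(outer_join_rows)
```

If the two forms need to agree, one option is to add a condition such as DateOfBirth IS NOT NULL to the outer join version; which behavior is wanted depends on how members with unknown birth dates should be treated.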

Using the ANY, SOME, and ALL Operators

The IN operator allows a simple comparison to see whether a value matches one in a list of values returned
by a subquery. The ANY, SOME, and ALL operators not only allow an equality match but also allow any com-
parison operator to be used. In this section, I’ll detail each of the ANY, SOME, and ALL operators, starting
with ANY and SOME.

ANY and SOME Operators

First off, ANY and SOME are identical; they do the same thing but have a different name. The text and
examples refer to the ANY operator, but you can use the SOME operator without it making one iota of
difference. For ANY to return true, the value being compared needs to satisfy the comparison with at
least one of the values returned by the subquery. You must place the comparison operator before the ANY
keyword. For example, the following SQL uses the equality (=) operator to find out whether any members
have the same birth year as the release year of a film in the Films table:

SELECT FirstName, LastName, YEAR(DateOfBirth)
FROM MemberDetails
WHERE YEAR(DateOfBirth) = ANY (SELECT YearReleased FROM Films);

The WHERE clause specifies that YEAR(DateOfBirth) must equal any one of the values returned by the
subquery (SELECT YearReleased FROM Films). This subquery returns the following values, which
you may remember from an earlier example:

YearReleased

1987
1967
1977
1997
2005
2001
1967
1947
1975
1980
1984
1988
1989
1989
1967

If YEAR(DateOfBirth) equals any one of these values, then the condition returns true and the WHERE
clause allows the record into the final results. The final results of the query are as follows:

FirstName LastName YEAR(DateOfBirth)

Katie Smith 1977
Steve Gee 1967
Doris Night 1997

Seem familiar? Yup, that’s right, the results from = ANY are the same as using the IN operator:

SELECT FirstName, LastName, YEAR(DateOfBirth)
FROM MemberDetails
WHERE YEAR(DateOfBirth) IN (SELECT YearReleased FROM Films);

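You might expect the not equal (<>) operator with ANY to give the same results as the NOT IN operator,
but it doesn’t. The condition in the following code is true whenever YEAR(DateOfBirth) differs from at
least one of the values returned by the subquery, which is the case for every member with a known date of
birth as soon as the Films table contains two different years:

SELECT FirstName, LastName, YEAR(DateOfBirth)
FROM MemberDetails
WHERE YEAR(DateOfBirth) <> ANY (SELECT YearReleased FROM Films);

The real equivalent of NOT IN is <> ALL, which uses the ALL operator covered shortly.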

Before you write off the ANY operator as just another way of using the IN operator, remember that you
can use ANY with operators other than equal (=) and not equal (<>). The following example query finds a
list of members who, while they were members, had the opportunity to attend a meeting:

SELECT FirstName, LastName
FROM MemberDetails
WHERE DateOfJoining < ANY (SELECT MeetingDate FROM Attendance);

The query checks to see whether the member joined before any one of the meeting dates. This query uses
the less than (<) operator with ANY, and the condition evaluates to true if a member’s DateOfJoining is
less than any of the values returned by the subquery (SELECT MeetingDate FROM Attendance).

The results are shown in the following table:

FirstName LastName

Katie Smith
Steve Gee

ALL Operator

The ALL operator requires that every item in the list (all the results of a subquery) comply with the con-
dition set by the comparison operator used with ALL. For example, if a subquery returns 3, 9, and 15,
then the following WHERE clause would evaluate to true because 2 is less than all the numbers in the
list:

WHERE 2 < ALL (3,9,15)

However, the following WHERE clause would evaluate to false because 7 is not less than all of the num-
bers in the list:

WHERE 7 < ALL (3,9,15)

This is just an example, though, and you can’t use ALL with literal numbers, only with a subquery.

Put the ALL operator into an example where you select MemberIds that are less than all the values
returned by the subquery (SELECT FilmId FROM Films WHERE FilmId > 5):

SELECT MemberId
FROM MemberDetails
WHERE MemberId < ALL (SELECT FilmId FROM Films WHERE FilmId > 5);

The subquery (SELECT FilmId FROM Films WHERE FilmId > 5) returns the following values:

FilmId
6
7
8
9
10
11
12
13
14
15

Essentially, this means that the WHERE clause is as follows:
MemberId < ALL (6,7,8,9,10,11,12,13,14,15);

MemberId must be less than all of the listed values if the condition is to evaluate to true. The full results
for the example query are displayed in the following table:

MemberId
1
4
5

There is something to be aware of when using ALL: the situation when the subquery returns no results at
all. In this case, ALL will be true, which may seem a little weird, but it’s based on fundamental principles
of logic. So, if you change the subquery so that it returns no values and update the previous example,
you end up with the following code:

SELECT MemberId
FROM MemberDetails
WHERE MemberId < ALL (SELECT FilmId FROM Films WHERE FilmId > 99);

You know for a fact that there are no FilmIds with a value higher than 99, so the subquery returns an
empty set. However, the results set for the example is shown in the following table:

MemberId

1
10
14
6
5
8
7
11
15
4
13
12
9

The table represents every single row in the MemberDetails table; indeed, MemberId < ALL (SELECT
FilmId FROM Films WHERE FilmId > 99) has evaluated to true every time due to an empty results
set returned by the subquery.
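
One way to see why an empty list satisfies ALL is to restate the condition as “no value in the list breaks
the comparison.” The sketch below rewrites the example with NOT EXISTS, an operator covered in the next
section. It is an illustration rather than a replacement, and it assumes that neither MemberId nor FilmId
contains NULL values (with NULLs, the two forms can give different answers):

```sql
-- MemberId < ALL (subquery) holds when the subquery contains no FilmId
-- that is less than or equal to the MemberId.
SELECT MemberId
FROM MemberDetails
WHERE NOT EXISTS (SELECT *
                  FROM Films
                  WHERE FilmId > 99
                    AND FilmId <= MemberDetails.MemberId);
```

Because no film has a FilmId above 99, the inner query can never return a row, so NOT EXISTS is true for
every member and every row of MemberDetails is returned.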

That completes the look at the related operators ANY, SOME, and ALL. The next section continues the
discussion of using operators with subqueries by examining the EXISTS operator.

Using the EXISTS Operator

The EXISTS operator is unusual in that it checks for rows and does not compare columns. So far you’ve
seen lots of clauses that compare one column to another. On the other hand, EXISTS simply checks to see
whether a subquery has returned one or more rows. If it has, then the clause returns true; if not, then it
returns false.

This is best demonstrated with three very simple examples:

SELECT City
FROM Location
WHERE EXISTS (SELECT * FROM MemberDetails WHERE MemberId < 5);

SELECT City
FROM Location
WHERE EXISTS (SELECT * FROM MemberDetails WHERE MemberId > 99);

SELECT City
FROM Location
WHERE EXISTS (SELECT * FROM MemberDetails WHERE MemberId = 15);

Notice the use of the asterisk (*), which returns all columns, in the inner subqueries. EXISTS is
row-based, not column-based, so it doesn’t matter which columns are returned.
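
Because only the presence of rows matters, a constant is often written in the subquery’s column list in
place of the asterisk. A minimal sketch, equivalent to the first example above:

```sql
-- The constant 1 is arbitrary: EXISTS ignores the selected columns
-- and checks only whether the subquery returns at least one row.
SELECT City
FROM Location
WHERE EXISTS (SELECT 1 FROM MemberDetails WHERE MemberId < 5);
```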

The first example uses the following SELECT query as its inner subquery:

(SELECT * FROM MemberDetails WHERE MemberId < 5);

This subquery provides a results set with two rows (the members with MemberIds 1 and 4). Therefore,
EXISTS evaluates to true, and you get the following results:

City
Orange Town
Windy Village
Big City

The second example uses the following inner subquery:

SELECT * FROM MemberDetails WHERE MemberId > 99

This returns no results because no records in MemberDetails have a MemberId greater than 99.
Therefore, EXISTS returns false, and the outer SELECT statement returns no results.

Finally, the third example uses the following query as its inner subquery:

SELECT * FROM MemberDetails WHERE MemberId = 15

This query returns just one row, and even though some of the columns contain NULL values, it’s still a
valid row, and EXISTS returns true. In fact, even if the whole row contained NULLs, EXISTS would still
return true. Remember, NULL doesn’t mean no value; it means an unknown value. So, the results for the
full query in the example are as follows:

City
Orange Town
Windy Village
Big City

You can reverse the logic of EXISTS by using the NOT operator, essentially checking to see whether no
results are returned by the subquery. Modify the second example described previously, where the
subquery returns no results, adding a NOT operator to the EXISTS keyword:

SELECT City
FROM Location
WHERE NOT EXISTS (SELECT * FROM MemberDetails WHERE MemberId > 99);

Now NOT EXISTS returns true, and as a result the WHERE clause is also true. The final results are
shown in the following table:

City

Orange Town
Windy Village
Big City

These examples are very simple and, to be honest, not that realistic in terms of how EXISTS is used. One
issue with the examples is that the subqueries don’t link in any way with the outer SELECT query,
whereas normally you’d expect them to do so. The following Try It Out example is a lot more realistic.
The task is to find out which categories have been chosen by three or more members as their favorite
category of film, but these favorite categories must contain at least one film with a rating higher than 3.

Try It Out Using the EXISTS Operator

Build the necessary query by following these steps:

1. First of all, formulate a SELECT statement that returns the values you want. In this case, you just
want the Category from the Category table:

SELECT Category
FROM Category

2. Now you need to add a WHERE clause to ensure that you get only categories that have films
rated 4 or higher and that have also been selected as a favorite by three or more members.
Begin with the rating condition:

SELECT Category
FROM Category
WHERE EXISTS (SELECT * FROM Films
              WHERE Category.CategoryId = Films.CategoryId
              AND Rating > 3
             )

3. Finally, you just want categories that three or more members have chosen as their favorite.

SELECT Category
FROM Category
WHERE EXISTS (SELECT * FROM Films
              WHERE Category.CategoryId = Films.CategoryId
              AND Rating > 3
              AND (SELECT COUNT(CategoryId)
                   FROM FavCategory
                   WHERE FavCategory.CategoryId = Category.CategoryId)
                  >= 3
             );

4. You should receive the following results:

Category

Thriller
War
Sci-fi
Historical

How It Works

In the first step, you simply selected all the category names from the Category table — the main table
from which the data is extracted. The subqueries come in Step 2 to filter the results.

In Step 2, you began with the rating. Information about films’ ratings is contained in the Films table, but
all you want to know is whether there are films in a particular category that have a rating higher than 3.
You’re not interested in knowing the names of the films, so you used an EXISTS operator with a
subquery that returns the films rated higher than 3.

Next, you linked the outer and inner queries using the CategoryId column in the Category and Films
tables. Now, if for a particular category there are films rated higher than 3, the subquery returns rows
and EXISTS evaluates to true.

Finally, you used another subquery, nested inside the other subquery. This innermost subquery counts
how many members like a particular category; then in the WHERE clause of the outer subquery, you check
to see whether that value is 3 or more.
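
The member count doesn’t depend on any column of the Films table, so it doesn’t have to sit inside the
EXISTS subquery. The following variation is a sketch rather than a replacement for the query above: it
moves the count into the outer WHERE clause and should return the same categories:

```sql
SELECT Category
FROM Category
WHERE EXISTS (SELECT *
              FROM Films
              WHERE Films.CategoryId = Category.CategoryId
                AND Rating > 3)          -- at least one film rated above 3
  AND (SELECT COUNT(*)
       FROM FavCategory
       WHERE FavCategory.CategoryId = Category.CategoryId) >= 3;  -- three or more fans
```

Written this way, each condition from the task description maps onto one line of the outer WHERE clause,
which some readers find easier to check.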

Using the HAVING Clause with Subqueries

Chapter 5 first introduced the HAVING clause, which is used to filter the groups displayed in a results set
when a GROUP BY clause has been used. For example, to get a list of cities where the average year of
birth of members in each city is greater than 1990, you could use the HAVING clause like this:

SELECT City
FROM MemberDetails
GROUP BY City
HAVING AVG(YEAR(DateOfBirth)) > 1990;

The preceding query compares against an actual value, 1990. A subquery is useful where you want to
compare against a value extracted from the database itself rather than a predetermined value. For
example, you might be asked to create a list of cities where the average year of birth is later than the
average for the membership as a whole. To do this, you could use a HAVING clause plus a subquery that
finds out the average year of birth of members:

SELECT City
FROM MemberDetails
GROUP BY City
HAVING AVG(YEAR(DateOfBirth)) >
       (SELECT AVG(YEAR(DateOfBirth)) FROM MemberDetails);

This is the same as the query just mentioned, except 1990 is replaced with a subquery that returns the
average year of birth of all members, which happens to be 1965. The final results table is as follows:

City
Big City
Dover
New Town
Orange Town
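
The YEAR() function isn’t available in every database system. Where it’s missing, the standard
EXTRACT() expression can usually stand in for it. The sketch below assumes a system that supports
EXTRACT(YEAR FROM ...), such as Oracle, and is otherwise the same query:

```sql
-- Assumption: the DBMS supports the standard EXTRACT(YEAR FROM date) expression.
SELECT City
FROM MemberDetails
GROUP BY City
HAVING AVG(EXTRACT(YEAR FROM DateOfBirth)) >
       (SELECT AVG(EXTRACT(YEAR FROM DateOfBirth)) FROM MemberDetails);
```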

Sometimes it’s necessary to refer to the outer query, or even a query nested inside a subquery. To do this,
you must give aliases to the tables involved, which is the topic of the next section on correlated subqueries.

Correlated Subquery

A correlated subquery is a subquery that references the outer query. A correlation variable is an alias given to
a table and used to reference that table in the subquery. In Chapter 3 you learned how to give tables a
correlation name, or alias.

The following example demonstrates a correlated subquery. This example isn’t necessarily the only way
to get the result, but it does demonstrate a nested subquery. The query obtains from each category the
cheapest possible DVD with the highest rating:

SELECT FilmName, Rating, DVDPrice, Category
FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
WHERE FM1.DVDPrice =
    (SELECT MIN(DVDPrice)
     FROM Films AS FM2
     WHERE FM2.DVDPrice IS NOT NULL
     AND FM1.CategoryId = FM2.CategoryId
     AND FM2.Rating =
         (SELECT MAX(FM3.Rating)
          FROM Films AS FM3
          WHERE FM3.DVDPrice IS NOT NULL
          AND FM2.CategoryId = FM3.CategoryId
          GROUP BY FM3.CategoryId)
     GROUP BY FM2.CategoryId)
ORDER BY FM1.CategoryId;

If you’re using Oracle, remove the AS keywords because Oracle doesn’t support AS when creating a
table alias; it needs just the alias name.

This is a big query, but when broken down it’s actually fairly simple. There are three queries involved:
the outer query and two inner subqueries. Begin with the most nested inner query:

(SELECT MAX(FM3.Rating)
FROM Films AS FM3
WHERE FM3.DVDPrice IS NOT NULL
AND FM2.CategoryId = FM3.CategoryId
GROUP BY FM3.CategoryId)

This inner query returns the highest rating found among the films in a category. Films that have no price
(where DVDPrice is NULL) are excluded, because you don’t want them forming part of the final
results. The Films table is given the alias FM3. The query also refers to FM2, which is the alias given to
the Films table in the subquery in which this query is nested. Matching the CategoryId columns of FM2
and FM3 ensures that the MAX rating and MIN price are both from the same category. If you expand the
SQL to include both the innermost query just described and the subquery in which it is nested, you
arrive at the following SQL:

(SELECT MIN(DVDPrice)
 FROM Films AS FM2
 WHERE FM2.DVDPrice IS NOT NULL
 AND FM1.CategoryId = FM2.CategoryId
 AND FM2.Rating =
     (SELECT MAX(FM3.Rating)
      FROM Films AS FM3
      WHERE FM3.DVDPrice IS NOT NULL
      AND FM2.CategoryId = FM3.CategoryId
      GROUP BY FM3.CategoryId)
 GROUP BY FM2.CategoryId)

The outer query here returns the minimum DVD price for the highest-rated film in each category.
Remember that the innermost subquery returns the rating of the highest-rated DVD. The outer query’s
WHERE clause is set to specify that the film should also have a rating equal to the one returned by the
subquery — thereby ensuring that it has a rating equal to the highest in the same category. Note that
although an inner query can refer to any alias of a table outside the query, an outer query can’t reference
any tables in a query nested inside it. For example, this SQL won’t work:

(SELECT MIN(DVDPrice)
FROM Films AS FM2
WHERE FM2.CategoryId = FM3.CategoryId

...
...
...

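The alias FM3 is referenced here, but it is defined in a subquery nested inside this query. An alias is
visible only in the query that defines it and in the queries nested within that query, so FM3 can’t be
referenced at this level.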

Finally, the entire query, with the outermost SELECT statement, is as follows:

SELECT FilmName, Rating, DVDPrice, Category
FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
WHERE FM1.DVDPrice =

(SELECT MIN(DVDPrice)
...
...
...
ORDER BY FM1.CategoryId;

The outermost query is a join between the Films table with an alias of FM1 and the Category table with
an alias of C1. A WHERE clause specifies that FM1.DVDPrice must be equal to the DVD price returned by
the subquery, the lowest price of a DVD in that category. This inner query, as you’ve seen, has a WHERE
clause specifying that the film’s rating must be equal to the highest-rated film for that category, as
returned by the innermost subquery. So, three queries later, you have the lowest-priced, highest-rated
film for each category. The results are displayed in the following table:

FilmName                                          Rating   DVDPrice   Category

The Maltese Poodle                                1        2.99       Thriller
On Golden Puddle                                  4        12.99      Romance
One Flew over the Crow’s Nest                     2        8.95       Horror
Planet of the Japes                               5        8.95       War
Soylent Yellow                                    5        12.99      Sci-fi
The Good, the Bad, and the Facially Challenged    5        12.99      Historical
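
Because each subquery is already restricted to a single category by its correlation with the query outside
it, the GROUP BY clauses inside the subqueries aren’t strictly needed: an aggregate function without
GROUP BY returns exactly one value for the rows that pass the WHERE clause. The following sketch drops
them and qualifies every column in the subqueries, but is otherwise unchanged. Treat it as an illustration
of how the correlation works rather than a tuned rewrite (and remove the AS keywords on Oracle):

```sql
SELECT FilmName, Rating, DVDPrice, Category
FROM Films AS FM1 INNER JOIN Category AS C1 ON C1.CategoryId = FM1.CategoryId
WHERE FM1.DVDPrice =
      (SELECT MIN(FM2.DVDPrice)
       FROM Films AS FM2
       WHERE FM2.DVDPrice IS NOT NULL
         AND FM2.CategoryId = FM1.CategoryId           -- correlates with the outer query
         AND FM2.Rating =
             (SELECT MAX(FM3.Rating)
              FROM Films AS FM3
              WHERE FM3.DVDPrice IS NOT NULL
                AND FM3.CategoryId = FM2.CategoryId))  -- correlates with FM2
ORDER BY FM1.CategoryId;
```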

That completes this chapter’s coverage of subqueries used with SELECT statements. The next section
looks at other SQL statements with which you can use subqueries.

Subqueries Used with Other Statements

So far, all the attention as far as subqueries go has been lavished on the SELECT statement. However, you
can also use subqueries with statements inserting new data, updating data, or deleting data. The
principles are the same, so what you’ve learned so far about subqueries applies equally to
INSERT, DELETE, and UPDATE statements.

Using Subqueries with the INSERT Statement

Chapter 5 covered using SELECT and INSERT INTO statements. You no longer have to insert literal
values, like this:

INSERT INTO FavCategory (CategoryId, MemberId) VALUES (7,15)

Instead, you can use data from the database:

INSERT INTO FavCategory (CategoryId, MemberId)
SELECT 7, MemberId FROM MemberDetails
WHERE LastName = 'Hawthorn' AND FirstName = 'Catherine';

You can take this one step further and use subqueries to provide the data to be inserted. The film club
chairperson has discovered that no one has selected the Film Noir category as their favorite. She’s
decided that people who like thrillers might also like film noir, so she wants all members who put the
Thriller category down as a favorite to also now have Film Noir added as one of their favorite categories.

Film Noir has a CategoryId of 9; Thriller has a CategoryId of 1. You could use SQL to extract these
values, but to keep the SQL slightly shorter and clearer for an example, use the CategoryId values without
looking them up.
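
If you did want to look the values up rather than hard-code them, a scalar subquery on the Category table
can supply each CategoryId. This sketch assumes that the category names are stored exactly as 'Thriller'
and 'Film Noir' and that each name appears only once in the Category table:

```sql
-- Members who chose Thriller, paired with the CategoryId of Film Noir.
-- Assumption: the names below match the stored Category values exactly.
SELECT (SELECT CategoryId FROM Category WHERE Category = 'Film Noir'),
       MemberId
FROM FavCategory
WHERE CategoryId = (SELECT CategoryId FROM Category WHERE Category = 'Thriller');
```

If either name matched more than one row, the scalar subquery would fail, which is one reason the
hard-coded values keep the main example simpler.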

It’s important not to make a mistake when using INSERT INTO with SELECT queries, or else a
whole load of incorrect data might get inserted. Unless your query is very simple, it’s preferable to
create the SELECT part of the SQL first, double-check that it gives the correct results, and then add the
INSERT INTO part, which is very simple anyway.

For the SELECT part of the query, you need to extract a MemberId from the MemberDetails table, where
that member has selected the Thriller category, that is, where CategoryId equals 1. To prevent errors, you
need to make sure that the member hasn’t already selected the Film Noir category, a CategoryId of 9. It’s
playing safe, as currently no one has selected the Film Noir category, but just in case someone updates
the database in the meantime, you should check that a member hasn’t selected CategoryId 9 as a favorite
before trying to make CategoryId 9 one of their favorites.

Try It Out Nesting a Subquery within an INSERT Statement

In order to create the necessary statement, follow these steps:

1. Begin by building the outer part of the SELECT query:

SELECT 9, MemberId FROM MemberDetails AS MD1;

This portion of the query returns all the rows from the MemberDetails table. Oracle doesn’t
support the AS keyword as a way of defining a table alias (it just needs the alias name), so remove
the AS after MemberDetails if you’re using Oracle.

2. Now add the first part of your WHERE clause that checks to see whether the member has selected
Thriller as one of their favorite categories:

SELECT 9, MemberId FROM MemberDetails AS MD1
WHERE EXISTS
    (SELECT * FROM FavCategory FC1
     WHERE FC1.CategoryId = 1 AND FC1.MemberId = MD1.MemberId);

3. Now modify the subquery so that it returns rows only if the member hasn’t already selected
CategoryId 9 as one of their favorites:

SELECT 9, MemberId FROM MemberDetails AS MD1
WHERE EXISTS
    (SELECT * FROM FavCategory FC1
     WHERE FC1.CategoryId = 1 AND FC1.MemberId = MD1.MemberId
     AND NOT EXISTS
         (SELECT * FROM FavCategory AS FC2
          WHERE FC2.MemberId = FC1.MemberId AND
          FC2.CategoryId = 9));

Notice the nested subquery inside the other subquery that checks that there are no rows
returned (that they do not exist), where the current MemberId has selected a favorite
CategoryId of 9. Execute this query and you get the following results:

Literal Value of 9    MemberId

9                     5
9                     10
9                     12

4. Next, quickly double-check a few of the results to see whether they really are correct. Having
confirmed that they are, you can add the INSERT INTO bit:

INSERT INTO FavCategory (CategoryId, MemberId)
SELECT 9, MemberId FROM MemberDetails AS MD1
WHERE EXISTS
    (SELECT * FROM FavCategory FC1
     WHERE FC1.CategoryId = 1 AND FC1.MemberId = MD1.MemberId
     AND NOT EXISTS
         (SELECT * FROM FavCategory AS FC2
          WHERE FC2.MemberId = FC1.MemberId AND
          FC2.CategoryId = 9));

The INSERT INTO inserts the literal value 9 (representing the Film Noir’s CategoryId) and
MemberId into the FavCategory table. When you execute the SQL, you should find that four
rows are added. Execute it a second, a third, or however many times, and no more rows will be
added as the query checks for duplication, a good safeguard.

How It Works

In the first step, you selected 9, the CategoryId of the Film Noir category, and also MemberId from the
MemberDetails table.

In the next step, you added a WHERE clause that uses a subquery. Using the EXISTS operator, it checks to
see whether there are any rows when selecting records from the FavCategory table where the MemberId
is the same as that in the MemberDetails table and where CategoryId is equal to 1 (the Thriller ID).
Using FC1.MemberId = MD1.MemberId joins MemberDetails and FavCategory, making sure that the
current rows in the MemberDetails and FavCategory tables belong to the same member; otherwise the
subquery would just return rows where any member had selected a CategoryId of 1. You also gave an alias to the
FavCategory table of the inner query. The alias is then used in the inner query’s WHERE clause and in the
third step when another query is nested inside the inner query from Step 2.

Step 3 sees a further query nested inside the previous subquery’s WHERE clause. Its role is to look for
members who already have category 9 as one of their favorites, so that they can be excluded. NOT EXISTS
does this by checking that the innermost query doesn’t return rows where the CategoryId is 9. The inner
subquery, with its FavCategory table with an alias of FC1, is linked to the innermost query where the
FavCategory table has been given the alias FC2. They are linked by MemberId, because you want to make
sure when checking for a member’s current favorites that it’s the same member. The second condition in
the innermost query’s WHERE clause checks to see whether the favorite category is one with an ID of 9.

In the final step, the outer query is used to provide the data for an INSERT INTO statement that adds
favorite category 9 as a member’s favorite.
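
The same rows can also be picked straight from the FavCategory table, without going through
MemberDetails. This is a sketch under the assumptions that FavCategory holds at most one row per member
and category and that every MemberId in it also exists in MemberDetails. As recommended above, it is
worth running the SELECT part on its own first:

```sql
INSERT INTO FavCategory (CategoryId, MemberId)
SELECT 9, FC1.MemberId
FROM FavCategory FC1
WHERE FC1.CategoryId = 1                      -- the member likes Thriller
  AND NOT EXISTS (SELECT *
                  FROM FavCategory FC2
                  WHERE FC2.MemberId = FC1.MemberId
                    AND FC2.CategoryId = 9);  -- but has not yet chosen Film Noir
```

The aliases are written without AS so that the statement also suits Oracle.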

Using Subqueries with the UPDATE Statement

As with INSERT INTO, you can use subqueries to supply values for updating or to determine the WHERE
clause condition in the same way that subqueries can be used with regular queries’ WHERE clauses. An
example makes this clear. Imagine that the film club chairperson has decided to boost profits by selling
DVDs. To maximize profits, she wants films that are rated higher than 3 and that appear in three or more
members’ favorite categories to have their prices hiked up to that of the highest price in the database.

To tackle this query, you need to break it down. First, the new DVDPrice is to be the maximum price of
any film in the Films table. For this, you need a subquery that returns the maximum DVDPrice using the
MAX() function. Note that neither MS Access nor MySQL supports updating a column in this way;
MySQL, for example, rejects an UPDATE whose subquery selects from the table being updated:

SELECT MAX(DVDPrice) FROM Films;

This query is used as a subquery in the UPDATE statement’s SET clause, but before executing the
statement, you need to add the WHERE clause:

UPDATE Films
SET DVDPrice = (SELECT MAX(DVDPrice) FROM Films);

You should create and test the WHERE clause inside a SELECT query before adding the WHERE clause to
the UPDATE statement. A mistake risks changing, and potentially losing, a lot of DVDPrice data that
shouldn’t be changed.

Now create the WHERE clause. It needs to limit rows to those where the film’s rating is higher than 3 and
where the film is in a category selected as a favorite by three or more members. You also need to check
whether the film is available on DVD — no point updating prices of films not for sale! The rating and
availability parts are easy enough. You simply need to specify that the AvailableOnDVD column be set
to Y and that the Films.Rating column be greater than 3:

SELECT CategoryId, FilmName FROM Films
WHERE AvailableOnDVD = 'Y' AND Films.Rating > 3;

Now for the harder part: selecting films that are in a category chosen as a favorite by three or more
members. For this, use a subquery that counts how many members have chosen a particular category as
their favorite:

SELECT CategoryId, FilmName FROM Films
WHERE (SELECT COUNT(*) FROM FavCategory
       WHERE FavCategory.CategoryId = Films.CategoryId) >= 3
AND AvailableOnDVD = 'Y' AND Films.Rating > 3;

The COUNT(*) function counts the number of rows returned by the subquery; it counts how many
members have chosen each category as their favorite. The WHERE clause of the subquery makes sure that the
Films table in the outer query is linked to the FavCategory table of the inner query.

Execute this query and you should get the following results:

CategoryId FilmName

4 Planet of the Japes
6 The Good, the Bad, and the Facially Challenged
5 Soylent Yellow
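
Before running an UPDATE, it can also help to see each film’s current price next to the price that is about
to be written. A small sketch of such a preview, reusing the WHERE clause just tested (CurrentPrice and
NewPrice are column aliases made up for this example):

```sql
-- Preview only: nothing is changed by this query.
SELECT FilmName,
       DVDPrice AS CurrentPrice,
       (SELECT MAX(DVDPrice) FROM Films) AS NewPrice
FROM Films
WHERE (SELECT COUNT(*) FROM FavCategory
       WHERE FavCategory.CategoryId = Films.CategoryId) >= 3
  AND AvailableOnDVD = 'Y' AND Films.Rating > 3;
```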

Now that you’ve confirmed that the statement works and double-checked the results, all that’s left to do
is to add the WHERE part of the query to the UPDATE statement you created earlier:

UPDATE Films SET DVDPrice = (SELECT MAX(DVDPrice) FROM Films)
WHERE (SELECT COUNT(*) FROM FavCategory
       WHERE FavCategory.CategoryId = Films.CategoryId) >= 3
AND AvailableOnDVD = 'Y' AND Films.Rating > 3;

Execute the final SQL and you should find that three rows are updated with DVDPrice being set to
15.99. This SQL won’t work on MySQL or MS Access, as they don’t support using subqueries to update
a column, so you’ll need to change the query to the following if you want the results displayed to match
those shown later in the book:

UPDATE Films SET DVDPrice = 15.99
WHERE (SELECT COUNT(*) FROM FavCategory
       WHERE FavCategory.CategoryId = Films.CategoryId) >= 3
AND AvailableOnDVD = 'Y' AND Films.Rating > 3;

Using Subqueries with the DELETE FROM Statement

The only place for a subquery to go in a DELETE statement is in the WHERE clause. Everything you’ve
learned so far about subqueries in a WHERE clause applies to its use with a DELETE statement, so you can
launch straight into an example. In this example, you want to delete all locations where one or fewer
members live and where a meeting has never been held.

As before, create the SELECT queries first to double-check your results, and then use the WHERE clauses
with a DELETE statement. First, you need a query that returns the number of members living in each city:

SELECT COUNT(*), City
FROM MemberDetails
GROUP BY City, State;

This query is used as a subquery to check whether one or fewer members live in a particular city. It
provides the following results:

COUNT(*) City

1 NULL
1 Dover
2 Windy Village
2 Big City
2 Townsville
1 New Town
4 Orange Town

Now you need to find a list of cities in which a meeting has never been held:

SELECT LocationId, City
FROM Location
WHERE LocationId NOT IN (SELECT LocationId FROM Attendance);

The subquery is used to find all LocationIds from the Location table that are not in the Attendance table.
The query provides the following results:

LocationId City
3 Big City

Now you need to combine the WHERE clauses and add them to a DELETE statement. Doing so combines
the WHERE clause of the preceding query, which included the following condition:

LocationId NOT IN (SELECT LocationId FROM Attendance);

Likewise, it combines the SELECT statement of the first example as a subquery:

SELECT COUNT(*), City
FROM MemberDetails
GROUP BY City, State;

The condition and the subquery are merged into the WHERE clause of the DELETE statement:

DELETE FROM Location
WHERE (SELECT COUNT(*) FROM MemberDetails
       WHERE Location.City = MemberDetails.City
       AND
       Location.State = MemberDetails.State
       GROUP BY City, State) <= 1
AND
LocationId NOT IN (SELECT LocationId FROM Attendance);

In the first subquery, you count how many members live in a particular location from the Location table.
If it’s one or less, then that part of the WHERE clause evaluates to true. Note that the Location table of the
DELETE statement and the MemberDetails table of the subquery have been joined on the City and State
fields:

Location.City = MemberDetails.City
AND
Location.State = MemberDetails.State

This is absolutely vital; otherwise, the subquery returns more than one result.

The second condition in the WHERE clause specifies that LocationId must not be in the Attendance table.
Because an AND clause joins the two conditions in the WHERE clause, both conditions must be true.

If you execute the query, you find that no records match both conditions, and therefore no records are
deleted from the Location table.
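
One subtlety is worth knowing. For a location in which no members live at all, the grouped subquery
returns no rows, so the comparison with 1 is unknown rather than true, and that location is never deleted.
If empty locations should also count as having “one or fewer members,” a possible variation drops the
GROUP BY so that COUNT(*) always returns a single number, including 0. The following is only a sketch,
written as a SELECT so that it can be checked before its WHERE clause is moved into a DELETE statement:

```sql
-- Preview of the locations that the variation would delete.
-- Without GROUP BY, COUNT(*) returns 0 (not an empty result) when no member matches.
SELECT LocationId, City
FROM Location
WHERE (SELECT COUNT(*)
       FROM MemberDetails
       WHERE MemberDetails.City = Location.City
         AND MemberDetails.State = Location.State) <= 1
  AND LocationId NOT IN (SELECT LocationId FROM Attendance);
```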

Summary

This chapter covered the nesting of a query inside another query, a powerful tool that you can use when
faced with tricky data extraction questions. Specifically, this chapter covered the following topics:

❑ A subquery is just like a normal query, with SELECT, WHERE, GROUP BY, and HAVING
clauses.

❑ Subqueries can be included in a SELECT statement’s column listing, or they can be used inside a
WHERE clause to enable results to be filtered.

❑ Aggregate functions are commonly used with subqueries because a subquery must normally
return just one record for each record in the main query.

❑ A subquery can return multiple rows when used with an IN, ANY, ALL, or EXISTS operator.
These operators function as follows:
❑ The IN operator is used to find out whether a value matches one of the values returned by a
subquery.
❑ The ANY operator allows a comparison between a value and any of the values returned
by a subquery. Comparisons are not limited to just equals (=); rather, you can use any
comparison operator, such as greater than (>), less than (<), and so on.
❑ The ALL operator requires that every item in the list, or all the results of a subquery,
comply with the condition set by the comparison operator used with ALL.
❑ Whereas the other operators work on a column basis, EXISTS simply checks that a
subquery returns one or more rows. It doesn’t make any comparison of column values.

❑ Although queries are the main place where subqueries are used, you can also use them with
INSERT INTO, UPDATE, and DELETE statements, either to filter what’s changed or deleted or to
provide the data to be inserted or changed.

That concludes this chapter’s coverage of writing queries. This book has covered many of the things
you’ll encounter in your everyday SQL programming. The next chapter takes all the theory and
practice learned so far and puts it to use with some tricky query writing. That chapter also shows you how to
tackle those mind-boggling queries and come out sane!

Exercises

1. The film club chairperson decides that she’d like a list of the total cost of all films for each
category and the number of members who have chosen each category as their favorite.

2. List all towns that have two or more members but are not currently listed in the Location table.

9

Advanced Queries

In database programming, you’ll find that roughly 95 percent of queries are fairly straightforward
and are just a matter of working out what columns are required and including a simple WHERE
clause to filter out the unwanted results. This chapter is all about how to tackle the other 5 percent,
which are difficult and complex queries. This chapter also presents a number of questions and
examines how to write the SQL to answer them. Specifically, this chapter covers the following:

❑ Tackling complex queries
❑ Formulating precise SELECT column lists and FROM clauses
❑ Writing ruthlessly efficient queries

Before getting into the specifics of the chapter, you need to begin by making some additions to the
Film Club database.

Updating the Database

In order to give more scope for tricky queries and avoid repeating examples from previous
chapters, this chapter extends the Film Club database and adds some new tables and data. Imagine that
the film club chairperson wants to sell DVDs to members; the film club will employ salespeople to
contact members and sell them DVDs. Therefore, you want to store details of the salespeople,
details of orders taken, and details of what each order contains. In order to do this, you need to
create three new tables (Orders, OrderItems, and SalesPerson), as shown in Figure 9-1:


Figure 9-1

The SQL needed to create the new tables follows. Note, however, that versions of MS SQL Server
before SQL Server 2008 don’t support the date data type but instead use the datetime data type. On
those versions, you need to change the data type for the OrderDate column in the Orders table from
date to datetime.

CREATE TABLE SalesPerson
(
    SalesPersonId integer NOT NULL PRIMARY KEY,
    FirstName varchar(50) NOT NULL,
    LastName varchar(50) NOT NULL
);

CREATE TABLE Orders
(
    OrderId integer NOT NULL PRIMARY KEY,
    MemberId integer,
    SalesPersonId integer,
    OrderDate date,
    CONSTRAINT SalesPerOrders_FK
        FOREIGN KEY (SalesPersonId)
        REFERENCES SalesPerson(SalesPersonId),
    CONSTRAINT MemberDetOrders_FK
        FOREIGN KEY (MemberId)
        REFERENCES MemberDetails(MemberId)
);

CREATE TABLE OrderItems
(
    OrderId INTEGER NOT NULL,
    FilmId INTEGER NOT NULL,
    CONSTRAINT Orders_FK
        FOREIGN KEY (OrderId)
        REFERENCES Orders(OrderId),
    CONSTRAINT Films_FK
        FOREIGN KEY (FilmId)
        REFERENCES Films(FilmId),
    CONSTRAINT OrderItems_PK PRIMARY KEY (OrderId, FilmId)
);

You need to execute this SQL against the database before continuing with the chapter. Notice that, as
well as creating the tables, the SQL specifies primary keys and foreign keys where appropriate. For
example, MemberId in the Orders table is actually taken from the MemberId in the MemberDetails table;
it’s a foreign key. Therefore, a foreign key relationship is specified between MemberId in the Orders and
MemberDetails tables to ensure that no invalid data is entered.
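
To see the constraint at work once the tables exist, you could try to insert an order for a member who
isn’t in MemberDetails. The sketch below assumes that there is no member with a MemberId of 999 and
that your database system enforces foreign keys; the OrderId of 999 and the date are likewise made up
for the test (on Oracle, adjust the date literal as described later in this section):

```sql
-- Expected to be rejected with a foreign key violation,
-- because MemberId 999 has no matching row in MemberDetails.
-- SalesPersonId is left NULL so that only the MemberId constraint is exercised.
INSERT INTO Orders (OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (999, 999, NULL, '2007-02-01');
```

If the statement succeeds instead, delete the test row again before continuing.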

You also need to add some salesperson data to the SalesPerson table:

INSERT INTO SalesPerson(SalesPersonId, FirstName, LastName)
VALUES (1,'Sandra','Hugson');

INSERT INTO SalesPerson(SalesPersonId, FirstName, LastName)
VALUES (2,'Frasier','Crane');

INSERT INTO SalesPerson(SalesPersonId, FirstName, LastName)
VALUES (3,'Daphne','Moon');

Likewise, you need to add some Orders and OrderItems. Again, you can save a whole lot of typing by
downloading the following code from www.wrox.com. Remember, if you’re using Oracle, you need to
change the date format for the OrderDate column to day-month_name-year:

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (10,7,1,'2006-07-30');

INSERT INTO OrderItems(OrderId,FilmId) Values (10,4);
INSERT INTO OrderItems(OrderId,FilmId) Values (10,8);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (11,4,1,'2006-08-06');

INSERT INTO OrderItems(OrderId,FilmId) Values (11,13);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (12,6,1,'2006-08-18');

INSERT INTO OrderItems(OrderId,FilmId) Values (12,9);
INSERT INTO OrderItems(OrderId,FilmId) Values (12,8);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (13,4,1,'2006-09-29');

INSERT INTO OrderItems(OrderId,FilmId) Values (13,12);

INSERT INTO OrderItems(OrderId,FilmId) Values (13,9);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (14,11,1,'2006-09-30');
INSERT INTO OrderItems(OrderId,FilmId) Values (14,8);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (15,15,1,'2006-10-01');
INSERT INTO OrderItems(OrderId,FilmId) Values (15,11);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (16,14,1,'2006-10-01');
INSERT INTO OrderItems(OrderId,FilmId) Values (16,2);
INSERT INTO OrderItems(OrderId,FilmId) Values (16,8);
INSERT INTO OrderItems(OrderId,FilmId) Values (16,15);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (17,9,1,'2006-11-23');
INSERT INTO OrderItems(OrderId,FilmId) Values (17,4);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (18,6,1,'2006-11-27');
INSERT INTO OrderItems(OrderId,FilmId) Values (18,9);
INSERT INTO OrderItems(OrderId,FilmId) Values (18,8);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (19,9,1,'2006-11-29');
INSERT INTO OrderItems(OrderId,FilmId) Values (19,13);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (20,4,1,'2006-12-02');
INSERT INTO OrderItems(OrderId,FilmId) Values (20,12);
INSERT INTO OrderItems(OrderId,FilmId) Values (20,2);
INSERT INTO OrderItems(OrderId,FilmId) Values (20,13);
INSERT INTO OrderItems(OrderId,FilmId) Values (20,15);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (21,9,1,'2006-12-12');
INSERT INTO OrderItems(OrderId,FilmId) Values (21,11);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (22,8,1,'2006-12-19');
INSERT INTO OrderItems(OrderId,FilmId) Values (22,9);
INSERT INTO OrderItems(OrderId,FilmId) Values (22,15);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (23,9,1,'2007-01-01');
INSERT INTO OrderItems(OrderId,FilmId) Values (23,2);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (24,6,1,'2007-01-09');
INSERT INTO OrderItems(OrderId,FilmId) Values (24,15);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (25,7,1,'2007-01-13');
INSERT INTO OrderItems(OrderId,FilmId) Values (25,8);
--

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (1,7,2,'2006-10-23');
INSERT INTO OrderItems(OrderId,FilmId) Values (1,12);
INSERT INTO OrderItems(OrderId,FilmId) Values (1,15);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (2,4,2,'2006-10-30');
INSERT INTO OrderItems(OrderId,FilmId) Values (2,4);
INSERT INTO OrderItems(OrderId,FilmId) Values (2,12);
INSERT INTO OrderItems(OrderId,FilmId) Values (2,13);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (3,9,2,'2006-10-11');
INSERT INTO OrderItems(OrderId,FilmId) Values (3,8);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (4,7,2,'2006-11-12');
INSERT INTO OrderItems(OrderId,FilmId) Values (4,11);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (5,7,2,'2006-12-02');

INSERT INTO OrderItems(OrderId,FilmId) Values (5,9);
INSERT INTO OrderItems(OrderId,FilmId) Values (5,8);
INSERT INTO OrderItems(OrderId,FilmId) Values (5,12);
INSERT INTO OrderItems(OrderId,FilmId) Values (5,11);
INSERT INTO OrderItems(OrderId,FilmId) Values (5,2);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (6,10,2,'2006-12-11');

INSERT INTO OrderItems(OrderId,FilmId) Values (6,8);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (7,4,2,'2006-10-23');

INSERT INTO OrderItems(OrderId,FilmId) Values (7,6);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (8,4,2,'2006-10-30');

INSERT INTO OrderItems(OrderId,FilmId) Values (8,13);
INSERT INTO OrderItems(OrderId,FilmId) Values (8,6);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (9,10,2,'2006-10-11');

INSERT INTO OrderItems(OrderId,FilmId) Values (9,13);
INSERT INTO OrderItems(OrderId,FilmId) Values (9,2);
INSERT INTO OrderItems(OrderId,FilmId) Values (9,4);

--

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (26,11,3,'2006-11-09');

INSERT INTO OrderItems(OrderId,FilmId) Values (26,6);
INSERT INTO OrderItems(OrderId,FilmId) Values (26,15);
INSERT INTO OrderItems(OrderId,FilmId) Values (26,11);
INSERT INTO OrderItems(OrderId,FilmId) Values (26,4);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (27,13,3,'2006-11-19');

INSERT INTO OrderItems(OrderId,FilmId) Values (27,6);
INSERT INTO OrderItems(OrderId,FilmId) Values (27,15);
INSERT INTO OrderItems(OrderId,FilmId) Values (27,11);
INSERT INTO OrderItems(OrderId,FilmId) Values (27,4);
INSERT INTO OrderItems(OrderId,FilmId) Values (27,9);

INSERT INTO OrderItems(OrderId,FilmId) Values (27,2);
INSERT INTO OrderItems(OrderId,FilmId) Values (27,12);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (28,4,3,'2006-12-12');

INSERT INTO OrderItems(OrderId,FilmId) Values (28,12);
INSERT INTO OrderItems(OrderId,FilmId) Values (28,4);
INSERT INTO OrderItems(OrderId,FilmId) Values (28,15);
INSERT INTO OrderItems(OrderId,FilmId) Values (28,2);
INSERT INTO OrderItems(OrderId,FilmId) Values (28,11);
INSERT INTO OrderItems(OrderId,FilmId) Values (28,13);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (29,9,3,'2006-12-20');

INSERT INTO OrderItems(OrderId,FilmId) Values (29,13);
INSERT INTO OrderItems(OrderId,FilmId) Values (29,9);
INSERT INTO OrderItems(OrderId,FilmId) Values (29,11);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (30,12,3,'2006-12-21');

INSERT INTO OrderItems(OrderId,FilmId) Values (30,2);
INSERT INTO OrderItems(OrderId,FilmId) Values (30,4);
INSERT INTO OrderItems(OrderId,FilmId) Values (30,6);
INSERT INTO OrderItems(OrderId,FilmId) Values (30,8);
INSERT INTO OrderItems(OrderId,FilmId) Values (30,9);
INSERT INTO OrderItems(OrderId,FilmId) Values (30,11);
INSERT INTO OrderItems(OrderId,FilmId) Values (30,12);
INSERT INTO OrderItems(OrderId,FilmId) Values (30,13);
INSERT INTO OrderItems(OrderId,FilmId) Values (30,15);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (31,1,3,'2007-01-14');

INSERT INTO OrderItems(OrderId,FilmId) Values (31,2);
INSERT INTO OrderItems(OrderId,FilmId) Values (31,15);
INSERT INTO OrderItems(OrderId,FilmId) Values (31,11);
INSERT INTO OrderItems(OrderId,FilmId) Values (31,4);

INSERT INTO Orders(OrderId, MemberId, SalesPersonId, OrderDate)
VALUES (32,5,3,'2007-01-21');

INSERT INTO OrderItems(OrderId,FilmId) Values (32,6);
INSERT INTO OrderItems(OrderId,FilmId) Values (32,2);

Now that you’ve added the necessary data to the Film Club database, you can start writing and executing
complex, difficult queries.

Tackling Difficult Queries

In this section, you learn some tips and techniques for writing difficult queries. Unfortunately, there’s no
quick fix, but hopefully by the time you’ve finished this section you’ll be ready and raring to go solve
some mind-bending queries.

The following list details the basic principles you should follow when creating a complex query:

❑ Decide what data you want. You might find that it helps to actually write a small portion of data
on paper. That way, you know what results you’re expecting and roughly how to get them.

❑ Create the SELECT clause and populate its column list. When you know what columns you’re
after, writing queries becomes much easier.

❑ Work out which tables to get your data from and create the FROM clause of the SELECT statement.

❑ Work through the FROM clause a bit at a time. Don’t try to create all the table selections and joins
all at once. Start simple and test the code at each stage to see if it works and gives the results
that you expect. Then work out the FROM clause using a database diagram, and determine where
you need to get your next column’s data from and where the next join links to.

❑ Query elements that don’t affect which rows are returned (such as the ORDER BY clause) can wait
until the end.

Before you apply the preceding list, though, there is one tip that might seem strange: ask yourself whether
you really need to write a complex query at all. While it’s quite satisfying to spend a day writing a query
so complex that it confounds even the most gifted database guru, doing so might not always suit your needs.
If you’re in a project under tight deadlines, it might be a whole lot easier to write a few simple queries and
do the data manipulation from a high-level language such as Visual Basic, C++, Java, or whatever language
you use for application development. Writing multiple simple queries is sometimes quicker, and the code is
easier to understand.
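To make that trade-off concrete, here is a minimal sketch of the "few simple queries plus application code" approach. It assumes Python with its built-in sqlite3 module and a cut-down, in-memory copy of the SalesPerson and Orders tables; the function name orders_per_salesperson is hypothetical and not part of the Film Club code.

```python
import sqlite3

def orders_per_salesperson(conn):
    """Count orders per salesperson using two simple queries and no join."""
    # Simple query 1: look up each salesperson's name by ID.
    names = {
        sp_id: f"{first} {last}"
        for sp_id, first, last in conn.execute(
            "SELECT SalesPersonId, FirstName, LastName FROM SalesPerson"
        )
    }
    # Simple query 2: fetch the salesperson ID stored on every order.
    counts = {sp_id: 0 for sp_id in names}
    for (sp_id,) in conn.execute("SELECT SalesPersonId FROM Orders"):
        counts[sp_id] += 1
    # The "join" happens here, in application code.
    return {names[sp_id]: total for sp_id, total in counts.items()}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesPerson (SalesPersonId INTEGER PRIMARY KEY,
                              FirstName TEXT, LastName TEXT);
    CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, MemberId INTEGER,
                         SalesPersonId INTEGER, OrderDate TEXT);
    INSERT INTO SalesPerson VALUES (1, 'Sandra', 'Hugson');
    INSERT INTO SalesPerson VALUES (2, 'Frasier', 'Crane');
    INSERT INTO Orders VALUES (1, 7, 2, '2006-10-23');
    INSERT INTO Orders VALUES (10, 7, 1, '2006-07-30');
    INSERT INTO Orders VALUES (11, 4, 1, '2006-08-06');
""")
result = orders_per_salesperson(conn)
print(result)
```

Each query is trivial to read and test on its own; the price is an extra round trip to the database and some bookkeeping code that a single joined query would not need.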

Work Out What You Want, What You Really, Really Want

Leaping right into the middle of writing a query can be tempting, but in fact the first step is working out
what the query’s requirements are. You need to sit down and analyze what’s been asked for and what
the end results should be. A simple example might be if you’re asked to delete a member from the
MemberDetails table. On the face of it, this is just a simple DELETE FROM MemberDetails with a WHERE
clause ensuring that only the member you want deleted is deleted. However, it has a ripple effect, because
references to the member are also contained in the FavCategory, Attendance, and Orders tables.
If the database is set up correctly, as the Film Club one is, then trying to delete a member without
deleting the references to them in other tables throws an error. That’s because constraints were set up
to stop this from happening. However, if constraints don’t exist, then it’s all too easy to end up with
orphaned data: for example, a favorite category recorded for a member who no longer exists!
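The ripple effect can be seen in a small experiment. The following is only a sketch, assuming Python's built-in sqlite3 module and heavily simplified MemberDetails and Orders tables (the real Film Club tables have more columns). Note that SQLite enforces foreign key constraints only after the PRAGMA shown in the code has been run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE MemberDetails (MemberId INTEGER PRIMARY KEY);
    CREATE TABLE Orders (
        OrderId INTEGER PRIMARY KEY,
        MemberId INTEGER REFERENCES MemberDetails(MemberId)
    );
    INSERT INTO MemberDetails VALUES (7);
    INSERT INTO Orders VALUES (10, 7);
""")

# Deleting the member while an order still refers to them is rejected.
try:
    conn.execute("DELETE FROM MemberDetails WHERE MemberId = 7")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

# Removing the referencing rows first lets the member be deleted cleanly.
conn.execute("DELETE FROM Orders WHERE MemberId = 7")
conn.execute("DELETE FROM MemberDetails WHERE MemberId = 7")
remaining = conn.execute("SELECT COUNT(*) FROM MemberDetails").fetchone()[0]
print(delete_blocked, remaining)
```

Without the REFERENCES constraint, the first DELETE would succeed and order 10 would be left pointing at a member who no longer exists.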

A more complex question could be, “Who’s the most successful salesperson?” That sounds easy enough:
surely the most successful is the person who obtained the most orders. Or is it the person who sold the
largest number of DVDs? Or is it the salesperson who sold the most DVDs per order? Perhaps the most
successful salesperson got the highest number of repeat customers?
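The choice of definition can change the answer. As a rough illustration, the sketch below loads three of the orders listed earlier (10, 11, and 26) into cut-down Orders and OrderItems tables and asks two of those questions. It assumes Python's built-in sqlite3 module; the column aliases NumOrders and NumDvds are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, SalesPersonId INTEGER);
    CREATE TABLE OrderItems (OrderId INTEGER, FilmId INTEGER);
    INSERT INTO Orders VALUES (10, 1);
    INSERT INTO Orders VALUES (11, 1);
    INSERT INTO Orders VALUES (26, 3);
    INSERT INTO OrderItems VALUES (10, 4);
    INSERT INTO OrderItems VALUES (10, 8);
    INSERT INTO OrderItems VALUES (11, 13);
    INSERT INTO OrderItems VALUES (26, 6);
    INSERT INTO OrderItems VALUES (26, 15);
    INSERT INTO OrderItems VALUES (26, 11);
    INSERT INTO OrderItems VALUES (26, 4);
""")

# Definition 1: the salesperson with the most orders.
most_orders = conn.execute("""
    SELECT SalesPersonId, COUNT(*) AS NumOrders
    FROM Orders
    GROUP BY SalesPersonId
    ORDER BY NumOrders DESC
""").fetchone()

# Definition 2: the salesperson who sold the most DVDs.
most_dvds = conn.execute("""
    SELECT Orders.SalesPersonId, COUNT(*) AS NumDvds
    FROM Orders INNER JOIN OrderItems ON OrderItems.OrderId = Orders.OrderId
    GROUP BY Orders.SalesPersonId
    ORDER BY NumDvds DESC
""").fetchone()
print(most_orders, most_dvds)
```

On this subset, salesperson 1 wins on order count while salesperson 3 wins on DVDs sold, which is why the question has to be pinned down before any SQL is written.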

The possibilities are endless, but hopefully you can see that deciding what data you’re actually obtaining
from the database is vital. The following example uses the total value of orders for each salesperson
for each month. Having decided what you want, you need to look at where you’re going to
get the data.

Choosing the SELECT Column List

Now that you know the question, it’s time to work out what data supplies the answer and where that
data comes from. In order to answer the question, “What is the total value of orders for each salesperson
for each month?” you first need to work out what data is to be returned and populate the SELECT column
list. The question asks for each salesperson’s total value of orders for each month, so you need to
know which salesperson, which month and year, and the total cost of orders placed. Your SELECT list
should include the following:

SELECT FirstName, LastName, MONTH(OrderDate), YEAR(OrderDate), SUM(DVDPrice)

FirstName and LastName are those of the salesperson. You use the MONTH() and YEAR() functions to
extract the month and year from the OrderDate column. Finally, you add together all the DVDPrice values
using the SUM() function. The next task is to work out where the data is going to come from and create
the FROM clause.

Creating the FROM Clause

You know what you want, so now you need to work out which tables supply the data. For the salesperson’s
first and last names, this is easy: the data comes from the SalesPerson table. Start by including
these details in the SELECT column list and then add the FROM clause:

SELECT FirstName, LastName
FROM SalesPerson

That’s easy enough. Now you need details of when the orders were placed. This data can be found in
the Orders table. However, you also need to link each order in the Orders table to the appropriate
salesperson, whose details are found in the SalesPerson table. Use an inner join to link the SalesPersonId
in the SalesPerson table with the SalesPersonId in the Orders table, as shown in the following statement:

SELECT FirstName, LastName, MONTH(OrderDate) AS Month, YEAR(OrderDate) AS Year
FROM Orders INNER JOIN SalesPerson
ON Orders.SalesPersonId = SalesPerson.SalesPersonId;

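Note that this example won’t work in Oracle because it does not support the MONTH() and YEAR() functions.
In Oracle, EXTRACT(MONTH FROM OrderDate) and EXTRACT(YEAR FROM OrderDate) return the same values.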
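Date functions are one of the least portable parts of SQL, so it is worth checking what your own database offers. As an illustration only, SQLite has no MONTH() or YEAR() function either; the sketch below, which assumes Python's built-in sqlite3 module and dates stored as ISO-format text, gets the same two values with SQLite's strftime() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, MemberId INTEGER,
                         SalesPersonId INTEGER, OrderDate TEXT);
    INSERT INTO Orders VALUES (10, 7, 1, '2006-07-30');
    INSERT INTO Orders VALUES (23, 9, 1, '2007-01-01');
""")

# strftime() returns text such as '07', so CAST turns it into the integer 7.
rows = conn.execute("""
    SELECT OrderId,
           CAST(strftime('%m', OrderDate) AS INTEGER) AS Month,
           CAST(strftime('%Y', OrderDate) AS INTEGER) AS Year
    FROM Orders
    ORDER BY OrderId
""").fetchall()
print(rows)
```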

If you run this SQL, you end up with details of every order placed:

FirstName LastName Month Year

Frasier Crane 10 2006
Frasier Crane 10 2006
Frasier Crane 10 2006
Frasier Crane 11 2006
Frasier Crane 12 2006
Frasier Crane 12 2006
Frasier Crane 10 2006
Frasier Crane 10 2006
Frasier Crane 10 2006
Sandra Hugson 7 2006
Sandra Hugson 8 2006
Sandra Hugson 8 2006
Sandra Hugson 9 2006
Sandra Hugson 9 2006
Sandra Hugson 10 2006
Sandra Hugson 10 2006
Sandra Hugson 11 2006
Sandra Hugson 11 2006
Sandra Hugson 11 2006
Sandra Hugson 12 2006
Sandra Hugson 12 2006
Sandra Hugson 12 2006
Sandra Hugson 1 2007
Sandra Hugson 1 2007
Sandra Hugson 1 2007
Daphne Moon 11 2006
Daphne Moon 11 2006
Daphne Moon 12 2006
Daphne Moon 12 2006
Daphne Moon 12 2006
Daphne Moon 1 2007
Daphne Moon 1 2007

What you really want, however, is to group the orders according to the month and year in which they
were placed, necessitating a GROUP BY clause:

SELECT FirstName, LastName, MONTH(OrderDate) AS Month, YEAR(OrderDate) AS Year
FROM Orders INNER JOIN SalesPerson
ON Orders.SalesPersonId = SalesPerson.SalesPersonId
GROUP BY FirstName, LastName, MONTH(OrderDate), YEAR(OrderDate);

Executing the preceding query groups the results by salesperson, month, and year, as shown in the
following table:

FirstName LastName Month Year

Daphne Moon 1 2007
Daphne Moon 11 2006
Daphne Moon 12 2006
Frasier Crane 10 2006
Frasier Crane 11 2006
Frasier Crane 12 2006
Sandra Hugson 1 2007
Sandra Hugson 7 2006
Sandra Hugson 8 2006
Sandra Hugson 9 2006
Sandra Hugson 10 2006
Sandra Hugson 11 2006
Sandra Hugson 12 2006
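The effect of the GROUP BY clause is easy to check on a handful of rows. The sketch below is an assumption-laden miniature: it uses Python's built-in sqlite3 module, loads only Frasier Crane's orders 1, 2, and 4 from the earlier listing, and swaps MONTH() and YEAR() for SQLite's strftime() because SQLite lacks those functions. The aliases OrderMonth and OrderYear are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesPerson (SalesPersonId INTEGER PRIMARY KEY,
                              FirstName TEXT, LastName TEXT);
    CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, MemberId INTEGER,
                         SalesPersonId INTEGER, OrderDate TEXT);
    INSERT INTO SalesPerson VALUES (2, 'Frasier', 'Crane');
    INSERT INTO Orders VALUES (1, 7, 2, '2006-10-23');
    INSERT INTO Orders VALUES (2, 4, 2, '2006-10-30');
    INSERT INTO Orders VALUES (4, 7, 2, '2006-11-12');
""")

# Without GROUP BY: one row per order.
ungrouped = conn.execute("""
    SELECT FirstName, LastName,
           CAST(strftime('%m', OrderDate) AS INTEGER) AS OrderMonth,
           CAST(strftime('%Y', OrderDate) AS INTEGER) AS OrderYear
    FROM Orders INNER JOIN SalesPerson
        ON Orders.SalesPersonId = SalesPerson.SalesPersonId
""").fetchall()

# With GROUP BY: one row per salesperson, month, and year.
grouped = conn.execute("""
    SELECT FirstName, LastName,
           CAST(strftime('%m', OrderDate) AS INTEGER) AS OrderMonth,
           CAST(strftime('%Y', OrderDate) AS INTEGER) AS OrderYear
    FROM Orders INNER JOIN SalesPerson
        ON Orders.SalesPersonId = SalesPerson.SalesPersonId
    GROUP BY FirstName, LastName, OrderMonth, OrderYear
    ORDER BY OrderYear, OrderMonth
""").fetchall()
print(len(ungrouped), grouped)
```

The two October orders collapse into a single October row, which is exactly what happens on a larger scale in the table above.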

So far, you have a list of salespeople as well as month and year values. There’s no link to the items
contained in each order. Your next task is to link the current tables, SalesPerson and Orders, to items in each
order. This data is contained in the OrderItems table, which is the next table to which you must create a
link. The value that links the current tables and OrderItems is the OrderId present in the Orders and
OrderItems tables. Join OrderItems to the current tables using an inner join, as illustrated in the
following statement:

SELECT FirstName, LastName, MONTH(OrderDate) AS Month, YEAR(OrderDate) AS Year
FROM (Orders INNER JOIN SalesPerson ON Orders.SalesPersonId =
SalesPerson.SalesPersonId)
INNER JOIN OrderItems ON OrderItems.OrderId = Orders.OrderId
GROUP BY FirstName, LastName, MONTH(OrderDate), YEAR(OrderDate);

If you run this query, you won’t see any difference between these results and the previous results, because
the extra rows that the join produces collapse into the same groups. What you’re really after is the total
cost of all orders placed in each month by each salesperson, which is represented by the SUM(DVDPrice)
data in the original SELECT list described previously. However, the DVDPrice column is in the Films table,
so you need to find a link between the Films table and the tables currently included in the FROM clause,
which are the SalesPerson, Orders, and OrderItems tables. That link is provided by the OrderItems table,
which has the FilmId of the film purchased. Begin by adding the Films table and linking it with the current
tables using an inner join:

SELECT FirstName, LastName, MONTH(OrderDate) AS Month, YEAR(OrderDate) AS Year
FROM ((Orders INNER JOIN SalesPerson ON Orders.SalesPersonId =
SalesPerson.SalesPersonId)
INNER JOIN OrderItems ON OrderItems.OrderId = Orders.OrderId)
INNER JOIN Films ON Films.FilmId = OrderItems.FilmId
GROUP BY FirstName, LastName, MONTH(OrderDate), YEAR(OrderDate);

Finally, you can add the total sales per month to the SELECT clause:

SELECT FirstName, LastName, MONTH(OrderDate) AS Month, YEAR(OrderDate) AS Year,
SUM(DVDPrice) AS Total
FROM ((Orders INNER JOIN SalesPerson ON Orders.SalesPersonId =
SalesPerson.SalesPersonId)
INNER JOIN OrderItems ON OrderItems.OrderId = Orders.OrderId)
INNER JOIN Films ON Films.FilmId = OrderItems.FilmId
GROUP BY FirstName, LastName, MONTH(OrderDate), YEAR(OrderDate);

Remember that the SQL groups orders placed by salesperson and by month, so the SUM(DVDPrice) is
the sum of all the DVDs sold by salesperson and by month. The results are as follows:

FirstName LastName Month Year Total

Daphne Moon 1 2007 80.94
Daphne Moon 11 2006 141.85
Daphne Moon 12 2006 221.74
Frasier Crane 10 2006 151.88
Frasier Crane 11 2006 12.99
Frasier Crane 12 2006 50.90
Sandra Hugson 1 2007 31.97
Sandra Hugson 7 2006 12.98
Sandra Hugson 8 2006 27.93
Sandra Hugson 9 2006 21.93
Sandra Hugson 10 2006 44.96
Sandra Hugson 11 2006 37.92
Sandra Hugson 12 2006 92.89
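If you want to check the four-table join by hand, it helps to run it against data small enough to add up in your head. The following sketch does that with orders 10 and 11 from the earlier listing. It assumes Python's built-in sqlite3 module, uses strftime() in place of MONTH() and YEAR(), and uses made-up DVD prices (10.00, 5.50, and 8.25) rather than the real Film Club prices; the aliases OrderMonth and OrderYear are also hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesPerson (SalesPersonId INTEGER PRIMARY KEY,
                              FirstName TEXT, LastName TEXT);
    CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, MemberId INTEGER,
                         SalesPersonId INTEGER, OrderDate TEXT);
    CREATE TABLE OrderItems (OrderId INTEGER, FilmId INTEGER);
    CREATE TABLE Films (FilmId INTEGER PRIMARY KEY, DVDPrice REAL);

    INSERT INTO SalesPerson VALUES (1, 'Sandra', 'Hugson');
    INSERT INTO Orders VALUES (10, 7, 1, '2006-07-30');
    INSERT INTO Orders VALUES (11, 4, 1, '2006-08-06');
    INSERT INTO OrderItems VALUES (10, 4);
    INSERT INTO OrderItems VALUES (10, 8);
    INSERT INTO OrderItems VALUES (11, 13);
    -- Hypothetical prices, chosen only to keep the sums easy to check.
    INSERT INTO Films VALUES (4, 10.00);
    INSERT INTO Films VALUES (8, 5.50);
    INSERT INTO Films VALUES (13, 8.25);
""")

# Each order item picks up its film's price; SUM() adds them per group.
totals = conn.execute("""
    SELECT FirstName, LastName,
           CAST(strftime('%m', OrderDate) AS INTEGER) AS OrderMonth,
           CAST(strftime('%Y', OrderDate) AS INTEGER) AS OrderYear,
           SUM(DVDPrice) AS Total
    FROM Orders
         INNER JOIN SalesPerson
             ON Orders.SalesPersonId = SalesPerson.SalesPersonId
         INNER JOIN OrderItems ON OrderItems.OrderId = Orders.OrderId
         INNER JOIN Films ON Films.FilmId = OrderItems.FilmId
    GROUP BY FirstName, LastName, OrderMonth, OrderYear
    ORDER BY OrderYear, OrderMonth
""").fetchall()
print(totals)
```

July contains one order with two items (10.00 + 5.50) and August one order with a single item, so the totals can be confirmed at a glance.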

The query is nearly complete, save two things. First, what if two or more salespeople have the same
name? This would mean that all their results would be lumped together because you’ve grouped by
FirstName and LastName. You need to add SalesPersonId, which is unique to each salesperson, to the
GROUP BY clause, as well as display it in the final results:

SELECT SalesPerson.SalesPersonId, FirstName, LastName, MONTH(OrderDate) AS Month,
YEAR(OrderDate) AS Year, SUM(DVDPrice) AS Total
FROM ((Orders INNER JOIN SalesPerson ON Orders.SalesPersonId =
SalesPerson.SalesPersonId)
INNER JOIN OrderItems ON OrderItems.OrderId = Orders.OrderId)
INNER JOIN Films ON Films.FilmId = OrderItems.FilmId
GROUP BY SalesPerson.SalesPersonId, FirstName, LastName, MONTH(OrderDate),
YEAR(OrderDate);

Executing the query provides the following results:

SalesPersonId FirstName LastName Month Year Total

1 Sandra Hugson 1 2007 31.97
1 Sandra Hugson 7 2006 12.98
1 Sandra Hugson 8 2006 27.93
1 Sandra Hugson 9 2006 21.93
1 Sandra Hugson 10 2006 44.96
1 Sandra Hugson 11 2006 37.92
1 Sandra Hugson 12 2006 92.89
2 Frasier Crane 10 2006 151.88
2 Frasier Crane 11 2006 12.99
2 Frasier Crane 12 2006 50.90
3 Daphne Moon 1 2007 80.94
3 Daphne Moon 11 2006 141.85
3 Daphne Moon 12 2006 221.74
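The Film Club data has no two salespeople with the same name, so the problem cannot be seen in the results above. The sketch below manufactures it: it assumes Python's built-in sqlite3 module and adds a hypothetical second Sandra Hugson (SalesPersonId 4) with a hypothetical order 33, neither of which exists in the real database. To keep it short, it counts orders instead of summing prices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesPerson (SalesPersonId INTEGER PRIMARY KEY,
                              FirstName TEXT, LastName TEXT);
    CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, MemberId INTEGER,
                         SalesPersonId INTEGER, OrderDate TEXT);
    INSERT INTO SalesPerson VALUES (1, 'Sandra', 'Hugson');
    -- Hypothetical second salesperson who happens to share the same name.
    INSERT INTO SalesPerson VALUES (4, 'Sandra', 'Hugson');
    INSERT INTO Orders VALUES (10, 7, 1, '2006-07-30');
    INSERT INTO Orders VALUES (11, 4, 1, '2006-08-06');
    -- Hypothetical order placed with the second Sandra Hugson.
    INSERT INTO Orders VALUES (33, 6, 4, '2006-08-09');
""")

# Grouping by name only: both people are lumped into one row.
by_name = conn.execute("""
    SELECT FirstName, LastName, COUNT(*) AS NumOrders
    FROM Orders INNER JOIN SalesPerson
        ON Orders.SalesPersonId = SalesPerson.SalesPersonId
    GROUP BY FirstName, LastName
""").fetchall()

# Adding the unique SalesPersonId keeps them apart.
by_id = conn.execute("""
    SELECT SalesPerson.SalesPersonId, FirstName, LastName, COUNT(*) AS NumOrders
    FROM Orders INNER JOIN SalesPerson
        ON Orders.SalesPersonId = SalesPerson.SalesPersonId
    GROUP BY SalesPerson.SalesPersonId, FirstName, LastName
    ORDER BY SalesPerson.SalesPersonId
""").fetchall()
print(by_name, by_id)
```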

The second slight issue is the order of the results. They would be easier to read if they were ordered by
salesperson and then by year and month, as shown in the ORDER BY clause of the following statement:

SELECT SalesPerson.SalesPersonId, FirstName, LastName, MONTH(OrderDate) AS Month,
YEAR(OrderDate) AS Year, SUM(DVDPrice) AS Total
FROM ((Orders INNER JOIN SalesPerson ON Orders.SalesPersonId =
SalesPerson.SalesPersonId)
INNER JOIN OrderItems ON OrderItems.OrderId = Orders.OrderId)
INNER JOIN Films ON Films.FilmId = OrderItems.FilmId
GROUP BY SalesPerson.SalesPersonId, FirstName, LastName, MONTH(OrderDate),
YEAR(OrderDate)
ORDER BY LastName, FirstName, YEAR(OrderDate), MONTH(OrderDate);

The final results are as follows:

SalesPersonId FirstName LastName Month Year Total

2 Frasier Crane 10 2006 151.88
2 Frasier Crane 11 2006 12.99
2 Frasier Crane 12 2006 50.90
1 Sandra Hugson 7 2006 12.98
1 Sandra Hugson 8 2006 27.93
1 Sandra Hugson 9 2006 21.93
1 Sandra Hugson 10 2006 44.96
1 Sandra Hugson 11 2006 37.92
1 Sandra Hugson 12 2006 92.89
1 Sandra Hugson 1 2007 31.97
3 Daphne Moon 11 2006 141.85
3 Daphne Moon 12 2006 221.74
3 Daphne Moon 1 2007 80.94

Hopefully this example has given you an idea of how to tackle slightly more complex queries. The
following Try It Out puts your knowledge to the test, asking you to write a complex query that identifies
how many repeat customers each salesperson has had.

Try It Out Identifying Repeat Customers

1. Although total sales is a good indicator of a salesperson’s success, the film club chairperson is
interested to know how many people have ordered again from the same salesperson. First, the
chairperson wants to know details of orders where the customer had ordered more than once
from the same salesperson. The SQL to do this is as follows:

SELECT SalesPerson.SalesPersonId,
SalesPerson.FirstName,
SalesPerson.LastName,
MemberDetails.MemberId,
MemberDetails.FirstName,
MemberDetails.LastName
FROM (Orders INNER JOIN MemberDetails
ON Orders.MemberId = MemberDetails.MemberId)
INNER JOIN SalesPerson
ON Orders.SalesPersonId = SalesPerson.SalesPersonId
GROUP BY SalesPerson.SalesPersonId,
SalesPerson.FirstName,
SalesPerson.LastName,
MemberDetails.MemberId,
MemberDetails.FirstName,
MemberDetails.LastName
HAVING COUNT(Orders.MemberId) > 1
ORDER BY SalesPerson.SalesPersonId;

2. The film club chairperson also wants to know how many repeat orders each customer placed
with the salesperson. The SQL to do this is shown below:

SELECT SalesPerson.SalesPersonId,
SalesPerson.FirstName,
SalesPerson.LastName,
MemberDetails.MemberId,
MemberDetails.FirstName,
MemberDetails.LastName,
COUNT(*) - 1
FROM (Orders INNER JOIN MemberDetails
ON Orders.MemberId = MemberDetails.MemberId)
INNER JOIN SalesPerson
ON Orders.SalesPersonId = SalesPerson.SalesPersonId
GROUP BY SalesPerson.SalesPersonId,
SalesPerson.FirstName,
SalesPerson.LastName,
MemberDetails.MemberId,
MemberDetails.FirstName,
MemberDetails.LastName
HAVING COUNT(Orders.MemberId) > 1
ORDER BY SalesPerson.SalesPersonId;
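The core of both queries is the GROUP BY, HAVING, and COUNT(*) - 1 combination, and it can be tried without the name columns. The sketch below is a stripped-down version that assumes Python's built-in sqlite3 module and loads only the SalesPersonId and MemberId values of orders 1 to 9 from the earlier listing; the alias RepeatOrders is a hypothetical name added for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, "
             "MemberId INTEGER, SalesPersonId INTEGER)")
# (OrderId, MemberId, SalesPersonId) for orders 1 to 9, all with salesperson 2.
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(1, 7, 2), (2, 4, 2), (3, 9, 2), (4, 7, 2), (5, 7, 2),
     (6, 10, 2), (7, 4, 2), (8, 4, 2), (9, 10, 2)],
)

# One group per salesperson/customer pair; HAVING keeps pairs with 2+ orders,
# and COUNT(*) - 1 discounts the first order so only the repeats are counted.
repeat_customers = conn.execute("""
    SELECT SalesPersonId, MemberId, COUNT(*) - 1 AS RepeatOrders
    FROM Orders
    GROUP BY SalesPersonId, MemberId
    HAVING COUNT(*) > 1
    ORDER BY SalesPersonId, MemberId
""").fetchall()
print(repeat_customers)
```

Member 9 placed only one order with salesperson 2, so that pair is filtered out by the HAVING clause.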

How It Works

First, you need to think about what the question involves. Basically, it involves counting how many
times a customer has ordered from the same salesperson. If the customer makes repeat orders from the
same salesperson, then display that record in the results.

Now you need to work out where the data to answer the question comes from. Details about orders are
found in the Orders table. Start simple and create a SELECT list that obtains the SalesPersonId and
MemberId — the customer who placed the order:

SELECT SalesPersonId, MemberId
FROM Orders;

This basic query gives the following results:

SalesPersonId MemberId

2 7
2 4
2 9
2 7
2 7
2 10
2 4
2 4
2 10
1 7
1 4
1 6
1 4
1 11
1 15
1 14
1 9
1 6
1 9
1 4
1 9
1 8
1 9
1 6
1 7
3 11
3 13
3 4
3 9
3 12
3 1
3 5

Using this table, how would you manually work out the answer? You would group the data into each
salesperson and customers who ordered from them and then count how many identical customers
each salesperson had. To do this in SQL, you need to add a GROUP BY clause grouping salespeople
and customers:

SELECT SalesPersonId, MemberId
FROM Orders
GROUP BY SalesPersonId, MemberId
ORDER BY SalesPersonId

Executing the preceding query provides these results:

SalesPersonId MemberId

1 4
1 6
1 7
1 8
1 9
1 11
1 14
1 15
2 4
2 7
2 9
2 10
3 1
3 4
3 5
3 9
3 11
3 12
3 13

Identical SalesPersonId and MemberId records are now grouped, but the results still include members
who placed only one order with a salesperson. You want only those customers who ordered more than
once, so you need to count the orders in each group, that is, the orders that one customer placed with
one specific salesperson. If the MemberId appears more than once, that particular customer has ordered
more than once from that salesperson:

SELECT SalesPersonId, MemberId
FROM Orders
GROUP BY SalesPersonId, MemberId
HAVING COUNT(MemberId) > 1
ORDER BY SalesPersonId

Each group contains all the rows with the same salesperson and customer. The COUNT(MemberId) function
counts the number of rows in each group, and the HAVING clause restricts the display of groups to
only those where the count is more than 1 (that is, more than one order by the same customer with the
same salesperson). This query provides the following results:
