![]() |
![]() |
ALLBASE/SQL Reference Manual: HP 3000 MPE/iX Computer Systems > Chapter 3 SQL Queries![]() Complex Queries |
|
In addition to the simple queries shown in the previous section, you can create complex queries, which may contain more than one SELECT statement. At the highest level, a query is a SELECT statement, which consists of a query expression followed by an optional ORDER BY clause. At the next lower level, you can combine different query blocks into a single query expression with the UNION operator. Lower still, inside each query block is an optional search condition, which can contain predicates that incorporate subqueries. A subquery is always a single query block (SELECT) that can contain other subqueries but cannot contain a UNION. A query expression can contain a maximum of 16 query blocks from all sources, including UNION, subqueries, and the outer query block. Figure 3-1 “Range of Complex Query Types” shows the range of possibilities for complex queries. You can create a complex query by using the following:
The next sections describe each type of complex query with examples. A SELECT statement can consist of several query blocks connected by UNION or UNION ALL statements. Each individual SELECT statement returns a query result which is a set of rows selected from a specified table or tables. The union of these query results is presented as a table that consists of all rows appearing in one or more of the original query results. If only the UNION statement is used, all duplicate rows are removed from the final set of rows. In this case, the maximum size of a tuple in the query result is given by the following formula:
where:
At compile time, SumKeyLengths is computed assuming columns of NULL and VARCHAR contain no data. At run time, the actual data lengths are assumed. If the UNION ALL operator is used, duplicates are not removed. Candidates for duplicate removal are evaluated by comparing entire tuples, not just a single field. Only if two or more rows are entirely alike are the duplicates removed. In the case of the UNION ALL operator, the maximum size of a tuple in the query result is 3996 bytes, as it is for a non-UNION query expression. You cannot use LONG columns in a UNION statement. Suppose you wanted to find out the part number for all parts that require 30 days or more for delivery, or are supplied by the vendor whose number is 9002. The following query delivers this information using the UNION form of the SELECT statement:
Note that no rows are duplicated. When the UNION statement is not qualified by the ALL statement, all duplicate rows are removed from the query result. Notice that the ORDER BY clause must be at the end of the SELECT statement. It cannot be included in the separate query expressions that make up the overall statement. Only the final query result can be ordered. If the UNION ALL statement is used in the previous query, the result can contain duplicate rows. The following example flags duplicate rows with two types of arrows that are described below:
In the above example, rows are duplicated for the following:
Note that you could get the same information in other ways. For example, you could use two separate queries. Alternatively, you could use two predicates in the search condition joined by the OR operator as follows:
This query still contains duplicate rows where more than one vendor supplies a given part; but no duplicates are caused by vendor 9002 supplying some parts, and that some of these take 30 or more days to deliver. The duplicates could be eliminated by using the SELECT DISTINCT instead of SELECT statement. If you want to see which SELECT statement in the UNION statement contributed each row to the query result, you can include character constants in your SELECT statements. A second column is then generated that shows the originating query block for each row, as in this example:
The indicated duplicate rows would have been removed if the example contained the UNION statement instead of UNION ALL. A subquery, also known as a nested query, is a query block that is completely embedded in a predicate. A subquery may appear within the search condition which is a part of the WHERE or HAVING clause of a SELECT, INSERT, UPDATE or DELETE statement. It is like any other query expression, except that it cannot contain a UNION operator. A subquery may be used only in the following types of predicates:
Subqueries can be used to arrive at a single value that lets you determine the selection criteria for the outer query block. In the following simple example, the subquery (in parentheses) is evaluated to determine a single value used in selecting the rows for the outer query:
Subqueries are most frequently found within special predicates, which are described fully in the next section. Additional examples of subqueries can be found there. The three types of special predicate are listed here:
With all these types, subqueries may be used; for ALL, ANY, SOME, and IN predicate, additional forms allow the use of a value list in place of a subquery. For each type of special predicate the examples in the next sections show both subquery and non-subquery forms of the predicate whenever both possibilities exist. A quantified predicate compares a value with a number of other values that are either contained in a value list or derived from a subquery. The quantified predicate has the following general form: Expression ComparisonOperator Quantifier { ValueList SubQuery } The comparison operators shown here are allowable:
The quantifier is one of these three keywords:
The value list is of this form:
With the ANY or SOME quantifier (ANY and SOME are synonymous), the predicate is true if any of the values in the value list or subquery relate to the expression as indicated by the comparison operator. Suppose you have a list of the part numbers for parts you have been buying from vendor 9011. You would like to start obtaining those parts from other vendors. The following example shows how you would find the part number and vendor number for all parts supplied by vendor 9011 that are also supplied by some other vendor:
The quantifier ANY is used to determine whether PurchDB.SupplyPrice contains any of the part numbers in the value list. If so, the query returns the part number and vendor number of vendors supplying that part. The final predicate eliminates all instances where the part is supplied by vendor 9011. Note that SOME could be used in place of ANY, because SOME and ANY are synonyms. You can also use the subquery form of the quantified predicate. If you wanted to distribute some of the business you have been giving vendor 9004, you might want to find vendor numbers for each vendor supplying at least one part supplied by vendor 9004. The following query returns this information:
The subquery obtains the part numbers for all parts supplied by vendor 9004. The quantifier ANY is then used to determine if PartNumber is the same as any of these parts. If so, the vendor number supplying that part is returned in the query result. Some queries may require you to use ANY and SOME constructs in a manner that is not intuitive. Consider the following query:
The inexperienced SQL user might think that this means, "Select the sales price of parts from table T1 whose numbers are not equal to any part numbers in table T2." However, the actual meaning is, "Select the sales price of parts from T1 such that the part number from T1 is not equal to at least one part number in T2." This query returns the sales price of all the parts in T1 if T2 has more than one part. A less ambiguous form using EXISTS is as follows:
With the ALL quantifier, the predicate is true only if all of the values in the value list or subquery relate to the expression as indicated by the comparison operator. Assume you have been buying parts from vendor 9010. To get a discount from this vendor, you have been required to purchase parts in larger quantities than you would like. To avoid large stockpiles of these parts, you want to find vendors whose discount is not dependent on the purchase of such large quantities. The following query uses two subqueries and an ALL quantifier to retrieve the information you want:
The first subquery obtains the number of parts needed to qualify for a discount for each part supplied by vendor 9010. Using the quantifier ALL, rows are selected only when the quantity needed for a discount is less than that needed for any part supplied by 9010. The second subquery limits the selection to only those part numbers supplied by vendor 9010. Thus, the query result shows every part supplied by vendor 9010 which can be obtained from another vendor in smaller quantities with a discount. An IN predicate compares a value with a list of values or a number of values derived by the use of a subquery. The IN predicate has the following general form: Expression [NOT] IN { ValueList SubQuery } The ValueList and SubQuery forms of the IN predicate are described separately in the following sections. Note that IN is the same as = ANY. If you wanted to obtain the numbers of all vendors who supplied a given list of parts, the following query could be used:
If you wanted a list of all the vendors who supply the same parts that vendor 9004 supplies, the following query could be used:
The subquery determines the part number of every part supplied by vendor 9004. The outer query selects every vendor who supplies one or more of those parts. DISTINCT removes duplicates from the final query result, as many vendors supply more than one such part. The EXISTS predicate, also known as the existential predicate, tests for the existence of a row satisfying some condition. It has the following general format:
EXISTS is true only if the query result of the subquery is not empty; that is, a row or rows are returned as a result of the subquery. If the query result is empty, the EXISTS predicate is false. In the following example, suppose you need to determine the names of all vendors who currently supply parts:
In this example, v and sp are correlation names, which enable ALLBASE/SQL to distinguish the two VendorNumber columns in the predicate without requiring you to repeat each table name in full. You can also use the NOT EXISTS form of the existential predicate. If you wanted to find those vendors who are not currently supplying you with parts you could use a query of the form shown here:
In many cases, it is possible to execute the subquery just once, and obtain a result which is passed to the outer query for its use. Here is an example:
This kind of subquery is a noncorrelated subquery. In other cases, however, it is necessary to evaluate a subquery once for every row in the outer query, as in the following:
The predicate in the subquery references the column value v.VendorNumber, which is defined by the outer query block. When this type of relationship exists between a column value in the subquery and a column value in an outer query block, the query is called a correlated subquery. Recognizing correlated subqueries is important when performance is a priority. Correlated subqueries require the optimizer to use an outer loop join algorithm rather than a sort-merge join. Because a sort-merge join is orders of magnitude faster than an outer loop join, correlated subqueries pay a performance penalty. In addition, when the ANY, SOME, ALL, or IN predicate makes use of subqueries, the queries are converted into correlated subqueries using the EXISTS predicate. Therefore, if at all possible, queries using ANY, SOME, ALL, IN, or the correlated form of the EXISTS predicate should be done as joins of two or more tables rather than by using subqueries if performance is an issue. In fact, it is possible to state a query as a join as well as in a form using subqueries; non-correlated subqueries are faster than sort-merge joins. Sort-merge joins are faster than correlated subqueries which use an outer loop join. An inner join returns only tuples for which matching values are found between the common columns in the joined tables. A natural inner join specifies that each pair of common columns is coalesced into a single column in the query result. The term join has become synonymous with the term natural inner join because that type of join is used so frequently. To include in the query result those tuples from one table for which there is no match in the common columns of the other table you use an outer join. The term natural, when applied to an outer join, has the same meaning as with an inner join. Common columns are coalesced into a single column in the query result. No duplicate columns are returned. Outer joins may be constructed using the explicit JOIN syntax of the SELECT statement (see the "SELECT" section of the "SQL Statements" chapter). In a two table outer join, the first table listed in the FROM clause of the SELECT statement is considered the left hand table and the second is considered the right hand table. The set of rows in the result may be viewed as the union of the set of rows returned by an inner join (the inner part of the join) and the set of rows from one table for which no match is found in the corresponding table (the outer part of the join). If the unmatched rows from both tables being joined are preserved, the join is a symmetric outer join. If the rows are preserved from only the left hand table, the join is a left asymmetric outer join. (The word asymmetric is usually omitted.) If the rows are preserved from only the right hand table, the join is a right outer join. The current syntax will allow you to specify either a left outer join or a right outer join, but not a symmetric outer join. A technique for creating a symmetric outer join using the UNION operator is described later in the section, "Symmetric Outer Joins Using the UNION Operator." A left outer join obtains the rows from both tables for which there is a matching value in the common column or columns (the inner part) and the rows from the left hand table for which there is no match in the right hand table (the outer part). Each unmatched row from the left hand table is extended with the columns coming from the right hand table. Each column in that extension has a null value. A right outer join obtains the rows from both tables for which there is a matching value in the common column or columns, and the rows from the right hand table for which there is no match in the left hand table. The unmatched rows from the right hand table are extended with the columns coming from the left hand table, with null column values returned in that extension for every result row which has no match in the left hand table. For example, the following right outer join is between the SupplyPrice and the Vendors tables. For all vendors who supply parts, it returns the Part Number, Vendor Name and Vendor City. For all vendors who do not supply parts, it returns just the Vendor Name and Vendor City.
When you use the ON clause of the JOIN syntax, it must contain, at a minimum, the predicate which specifies the join condition. Other predicates may be placed within the SELECT statement, but their location is critical as the following examples show. Additional predicates may be placed in the ON clause. These predicates limit the rows participating in the inner join associated with the ON clause. All rows excluded by such predicates participate in the outer part of the associated join. The following query returns (in the inner part of the join) Part Numbers for all vendors who supply parts and are located in California (italics). It also returns, without the Part Number (in the outer part of the join) all vendors who do not supply parts ({{highlighted}}), and all vendors who do supply parts, but are not located in California.
In the above example, the rows participating in the inner join are further restricted by adding to the ON clause, AND VendorState = 'CA'. All vendors that are not in California are placed in the outer part of the join. If you move the limiting predicate from the ON clause to the WHERE clause, the query returns a different result. In the following query, the inner part of the join still contains all vendors who supply parts and are located in California. However, in the outer part of the join, only those vendors who do not supply parts and are in California are included.
In the above example, the WHERE clause is applied to all the rows returned, regardless of whether they are in the inner or outer part of the join. Thus no rows are returned unless the vendor is located in California. If you want the inner part of the query to contain all vendors who do supply parts and are located in California while the outer part contains all vendors who do not supply parts, regardless of location, use the query shown below.
If all common columns between the tables being joined are to be used for the join, the keyword NATURAL may be used so long as the specification of the ON clause join predicate is omitted. This technique may be used when joining more than two tables, as in the query shown below:
An outer join can also be created by using the UNION operator. Suppose you want to create a list of vendors who either supply some part with a unit price less than $100 or else do not supply any parts at all. To do this, merge two separate queries with a UNION ALL statement, as in the following examples. The first query shown here selects the names of vendors who do not supply parts:
Notice that a second query block is embedded within the first query expression. It creates a temporary table containing the names of all vendors who do supply parts. Then note the special predicate EXISTS, which is negated in this case. The outer SELECT statement allows us to identify the name of each vendor in the Vendors table. Each VendorName is compared against the list of vendors who do supply parts. If the VendorName from the outer SELECT statement is not found in the temporary table created by the subquery, the outer VendorName is returned to the query result, providing us a list of all the Vendors who do not supply parts. The second query shown here defines the vendors who supply at least one part with a unit price under $100:
The next example shows this query joined to the previous one by the UNION ALL statement. It also shows the use of character constants to indicate which rows result from which query block.
Since the syntax does not support a symmetric outer join, you might try to simulate a symmetric outer join using the left outer join syntax in combination with the right outer join syntax. Intuitively, the following query might seem correct:
This three table outer join does a left outer join between the Parts and the SupplyPrice tables. The result of that join is then used as the left hand table in a right outer join with the Vendors table. It would seem as though the result first displays all parts supplied by a vendor, then all parts for which there is no supplier, followed by all vendors who do not supply parts. But, the action of the query is subtle. The natural left join preserves the parts from the Parts table that is not supplied by any vendor. This supplies the left hand component for the simulated symmetric outer join. However, although the natural right join preserves the three vendors from the vendors table who do not supply parts (the right hand component for the simulated symmetric outer join), it eliminates the unmatched parts from the Parts table. This happens because the natural right join only preserves unmatched rows from the right hand table, eliminating the row from the Parts table.
To preserve all the unmatched rows from both sides, thus generating a full symmetric outer join, you must use the following syntax:
The result from the natural left join...natural left join preserves the unmatched part from Parts. The natural right join...natural right join preserves the unmatched vendors from Vendors. The natural right join...natural left join would preserve all unmatched rows from SupplyPrice if there were any (in this example there are none). The union operation combines the three results, preserving the unmatched rows from all joins. There are three complete sets of rows that satisfy the inner join, but the union operation eliminates the duplicate rows unless UNION ALL is specified. The result of the above query follows:
|