ALLBASE/SQL FORTRAN Application Programming Guide: HP 9000 Computer Systems > Chapter 6 Overview of Data Manipulation
The Query
A query is a SELECT command that describes to ALLBASE/SQL the data you want retrieved. You can retrieve all or only certain data from a table. You can have ALLBASE/SQL group or order the rows you retrieve or perform certain calculations or comparisons before presenting data to your program. You can retrieve data from multiple tables. You can also retrieve data using views or combinations of tables and views.
The SELECT command identifies the columns and rows you want in your query result as well as the tables and views to use for data access. The columns are identified in the select list. The rows are identified in several clauses (WHERE, GROUP BY, HAVING, and ORDER BY). The tables and views to access are identified in the FROM clause. Data thus specified is returned into host variables named in the INTO clause, with the following syntax:
To retrieve all data from a table, the SELECT command need specify only the following:
Although the shorthand notation * can be used in the select list to indicate you want all columns from one or more tables or views, it is better programming practice to explicitly name columns. Then, if the tables or views referenced are altered, your program will still retrieve only the data its host variables are designed to accommodate:
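As a hedged illustration of why naming columns is safer than using *, the sketch below uses Python's sqlite3 module as a stand-in for ALLBASE/SQL. That substitution is an assumption: an embedded FORTRAN program would use an INTO clause and host variables instead. The PurchDB owner name is emulated with an attached in-memory database, and the sample row and the added Weight column are made up.

```python
import sqlite3

# Assumption: SQLite stands in for ALLBASE/SQL; the owner name PurchDB is
# emulated by attaching a second in-memory database under that name.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS PurchDB")
conn.execute(
    "CREATE TABLE PurchDB.Parts (PartNumber TEXT, PartName TEXT, SalesPrice REAL)"
)
# Made-up sample row.
conn.execute("INSERT INTO PurchDB.Parts VALUES ('P-001', 'Part A', 500.0)")

star_query = "SELECT * FROM PurchDB.Parts"
named_query = "SELECT PartNumber, PartName, SalesPrice FROM PurchDB.Parts"

star_width_before = len(conn.execute(star_query).fetchone())
named_width_before = len(conn.execute(named_query).fetchone())

# Altering the table changes what * returns, but not the named select list.
conn.execute("ALTER TABLE PurchDB.Parts ADD COLUMN Weight REAL")

star_width_after = len(conn.execute(star_query).fetchone())
named_width_after = len(conn.execute(named_query).fetchone())
```

After the table is altered, the * query returns one more column than before, while the query with the explicit select list still returns the same three columns.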
The SELECT command has several clauses you can use to format the data retrieved from any table:
The following SELECT command contains a WHERE clause that limits rows returned to those not containing a salesprice; the predicate used in the WHERE clause is known as the null predicate:
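A minimal sketch of a null predicate, assuming SQLite (through Python's sqlite3 module) as a stand-in for ALLBASE/SQL. The column names follow the names used in this section; the rows are made up, and the PurchDB owner name is emulated with an attached in-memory database.

```python
import sqlite3

# Assumption: SQLite stands in for ALLBASE/SQL; data values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS PurchDB")
conn.execute(
    "CREATE TABLE PurchDB.Parts (PartNumber TEXT, PartName TEXT, SalesPrice REAL)"
)
conn.executemany(
    "INSERT INTO PurchDB.Parts VALUES (?, ?, ?)",
    [("P-001", "Part A", 500.0), ("P-002", "Part B", None), ("P-003", "Part C", None)],
)

# The null predicate selects rows whose SalesPrice has no value.
unpriced = conn.execute(
    "SELECT PartNumber, PartName FROM PurchDB.Parts "
    "WHERE SalesPrice IS NULL ORDER BY PartNumber"
).fetchall()

# A comparison against NULL is never true, so it selects no rows.
equals_null = conn.execute(
    "SELECT PartNumber FROM PurchDB.Parts WHERE SalesPrice = NULL"
).fetchall()
```

The second query is included only to show why a dedicated null predicate is needed: an ordinary comparison cannot find rows that lack a value.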
In the UPDATE and DELETE commands, you may need a WHERE clause to limit the rows ALLBASE/SQL changes or deletes. In the following case, the sales price of parts priced lower than $1000 is increased 10 percent; the WHERE clause in this case illustrates the comparison predicate:
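The price increase described above can be sketched as follows, assuming SQLite (through Python's sqlite3 module) in place of ALLBASE/SQL. The three rows are made up so that one falls below, one equals, and one exceeds the $1000 limit.

```python
import sqlite3

# Assumption: SQLite stands in for ALLBASE/SQL; data values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS PurchDB")
conn.execute("CREATE TABLE PurchDB.Parts (PartNumber TEXT, SalesPrice REAL)")
conn.executemany(
    "INSERT INTO PurchDB.Parts VALUES (?, ?)",
    [("P-001", 500.0), ("P-002", 1000.0), ("P-003", 2000.0)],
)

# The comparison predicate limits the update to parts priced below 1000.
cursor = conn.execute(
    "UPDATE PurchDB.Parts SET SalesPrice = SalesPrice * 1.10 "
    "WHERE SalesPrice < 1000.00"
)
rows_changed = cursor.rowcount

# Map each part number to its price after the update.
prices = dict(conn.execute("SELECT PartNumber, SalesPrice FROM PurchDB.Parts"))
```

Only the part priced at 500.00 is changed; the part priced at exactly 1000.00 does not satisfy the predicate and keeps its price.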
The ALLBASE/SQL Reference Manual details the syntax and semantics for these and other predicates.
When you use an aggregate function in the select list, you can use the GROUP BY clause to indicate how ALLBASE/SQL should group rows before applying the function. You can also use the HAVING clause to limit the groups to only those satisfying certain criteria.
The following SELECT command will produce a query result containing two columns: a sales price and a number indicating how many parts have that price:
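A query of this kind might look like the following sketch, which assumes SQLite (through Python's sqlite3 module) in place of ALLBASE/SQL and uses made-up rows without null prices. The HAVING condition keeps only groups whose average sales price exceeds 1500.00.

```python
import sqlite3

# Assumption: SQLite stands in for ALLBASE/SQL; data values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS PurchDB")
conn.execute("CREATE TABLE PurchDB.Parts (PartNumber TEXT, SalesPrice REAL)")
conn.executemany(
    "INSERT INTO PurchDB.Parts VALUES (?, ?)",
    [("P-001", 2000.0), ("P-002", 2000.0), ("P-003", 1600.0),
     ("P-004", 1500.0), ("P-005", 1500.0), ("P-006", 900.0)],
)

# Group parts by price, drop groups whose average price is 1500.00 or less,
# then count the parts in each remaining group. The result is sorted in
# Python because the query itself does not specify an order.
price_counts = sorted(conn.execute(
    "SELECT SalesPrice, COUNT(PartNumber) FROM PurchDB.Parts "
    "GROUP BY SalesPrice HAVING AVG(SalesPrice) > 1500.00"
).fetchall())
```

With this data, the groups priced at 1500.00 and 900.00 are discarded, leaving one part at 1600.00 and two parts at 2000.00.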
The GROUP BY clause in this example causes ALLBASE/SQL to group all parts with the same sales price together. The HAVING clause causes ALLBASE/SQL to ignore any group having an average sales price less than or equal to $1500.00. Once the groups have been defined, ALLBASE/SQL applies the aggregate function COUNT to each group.
Each null value in a GROUP BY column constitutes a separate group. Therefore, a query result having a null value in the column(s) used to group rows would contain a separate row for each null value.
An aggregate function is one example of an ALLBASE/SQL expression. An expression specifies a value. An expression can be used in several places in the SELECT command as well as in the other data manipulation commands. Refer to the ALLBASE/SQL Reference Manual for the syntax and semantics of expressions, as well as the effect of null values on them.
The rows in the query result obtained with the preceding query could be returned in a specific order by using the ORDER BY clause. In the following case, the rows are returned in descending sales price order:
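A minimal sketch of such an ordered query, again assuming SQLite (through Python's sqlite3 module) in place of ALLBASE/SQL and made-up rows without null prices:

```python
import sqlite3

# Assumption: SQLite stands in for ALLBASE/SQL; data values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS PurchDB")
conn.execute("CREATE TABLE PurchDB.Parts (PartNumber TEXT, SalesPrice REAL)")
conn.executemany(
    "INSERT INTO PurchDB.Parts VALUES (?, ?)",
    [("P-001", 1600.0), ("P-002", 2000.0), ("P-003", 2000.0), ("P-004", 900.0)],
)

# ORDER BY ... DESC returns the highest sales price first.
ordered_counts = conn.execute(
    "SELECT SalesPrice, COUNT(PartNumber) FROM PurchDB.Parts "
    "GROUP BY SalesPrice HAVING AVG(SalesPrice) > 1500.00 "
    "ORDER BY SalesPrice DESC"
).fetchall()
```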
The examples shown so far have all included queries where results would most likely contain more than one row. The sequential table processing technique using cursors could also be used to handle multiple-row query results. Later in this chapter you'll find examples of this technique, as well as examples illustrating simple data manipulation, in which only one-row query results are expected.
To retrieve data from more than one table or view, the query describes to ALLBASE/SQL how to join the tables before deriving the query result:
To obtain a query result consisting of the name of each part and its quantity-on-hand, you need data from two tables in the sample database: PurchDB.Parts and PurchDB.Inventory. The join condition in this case is that you want ALLBASE/SQL to join rows in these tables that have the same part number:
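The join described above can be sketched as follows. The sketch assumes SQLite (through Python's sqlite3 module) in place of ALLBASE/SQL; the column name QtyOnHand and all rows are assumptions made for illustration. One inventory row has no matching part and one has a null part number, so you can see which rows the join condition keeps.

```python
import sqlite3

# Assumption: SQLite stands in for ALLBASE/SQL; QtyOnHand and the data
# values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS PurchDB")
conn.execute("CREATE TABLE PurchDB.Parts (PartNumber TEXT, PartName TEXT)")
conn.execute("CREATE TABLE PurchDB.Inventory (PartNumber TEXT, QtyOnHand INTEGER)")
conn.executemany(
    "INSERT INTO PurchDB.Parts VALUES (?, ?)",
    [("P-001", "Part A"), ("P-002", "Part B"), ("P-003", "Part C")],
)
conn.executemany(
    "INSERT INTO PurchDB.Inventory VALUES (?, ?)",
    [("P-001", 12), ("P-002", 0), ("P-009", 7), (None, 3)],
)

# The join condition pairs rows that carry the same part number. Both
# columns are qualified because they share the name PartNumber.
on_hand = conn.execute(
    "SELECT PartName, QtyOnHand "
    "FROM PurchDB.Parts, PurchDB.Inventory "
    "WHERE PurchDB.Parts.PartNumber = PurchDB.Inventory.PartNumber "
    "ORDER BY PartName"
).fetchall()
```

Part C (no inventory row), the inventory row for P-009 (no part row), and the inventory row with a null part number do not appear in the result.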
Whenever two or more columns in a query have the same name but belong to different tables, you avoid ambiguity by qualifying the column names with table and owner names. Because the columns specified in the join condition shown above have the same name (PartNumber) in both tables, they are fully qualified with table and owner names (PurchDB.Parts and PurchDB.Inventory). If one of the columns named PartNumber were named PartNum, the WHERE clause could be written without having the fully qualified column name as follows:
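A sketch of that variant, assuming SQLite (through Python's sqlite3 module) in place of ALLBASE/SQL. The Inventory column is hypothetically named PartNum here, and the rows are made up.

```python
import sqlite3

# Assumption: SQLite stands in for ALLBASE/SQL; PartNum is the hypothetical
# column name from the text, and the data values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS PurchDB")
conn.execute("CREATE TABLE PurchDB.Parts (PartNumber TEXT, PartName TEXT)")
conn.execute("CREATE TABLE PurchDB.Inventory (PartNum TEXT, QtyOnHand INTEGER)")
conn.execute("INSERT INTO PurchDB.Parts VALUES ('P-001', 'Part A')")
conn.execute("INSERT INTO PurchDB.Inventory VALUES ('P-001', 12)")

# Each column name occurs in only one table, so no qualification is needed.
unqualified_join = conn.execute(
    "SELECT PartName, QtyOnHand "
    "FROM PurchDB.Parts, PurchDB.Inventory "
    "WHERE PartNumber = PartNum"
).fetchall()
```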
ALLBASE/SQL creates a row for the query result whenever the PartNumber value in one table matches that in the second table. Any row containing a null PartNumber is excluded from the join, as are rows that have a PartNumber value in one table, but not the other.
You can also join a table to itself. This type of join is useful when you want to identify pairs of values within one table that have certain relationships.
The PurchDB.SupplyPrice table contains the unit price, delivery time, and other data for every vendor that supplies any part. Most parts are supplied by more than one vendor, and prices vary with vendor. You can join the PurchDB.SupplyPrice table to itself in order to identify for which parts the difference among vendor prices is greater than $50. The query and its result would appear as follows:
The query:
The result:
To obtain such a query result, ALLBASE/SQL joins one copy of the table with another copy of the table, using the join condition specified in the WHERE clause:
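A self-join of this kind can be sketched as follows, assuming SQLite (through Python's sqlite3 module) in place of ALLBASE/SQL. The join variables X and Y, the column names VendorNumber and UnitPrice, the exact form of the price comparison, and all rows are assumptions made for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for ALLBASE/SQL; column names, join variable
# names, and data values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS PurchDB")
conn.execute(
    "CREATE TABLE PurchDB.SupplyPrice "
    "(PartNumber TEXT, VendorNumber INTEGER, UnitPrice REAL)"
)
conn.executemany(
    "INSERT INTO PurchDB.SupplyPrice VALUES (?, ?, ?)",
    [("P-001", 9001, 500.0), ("P-001", 9002, 560.0), ("P-001", 9003, 520.0),
     ("P-002", 9001, 100.0), ("P-002", 9004, 120.0)],
)

# X and Y name two copies of the same table. A row is produced when both
# copies refer to the same part and X's price exceeds Y's by more than 50.
price_gaps = conn.execute(
    "SELECT X.PartNumber, X.VendorNumber, X.UnitPrice, "
    "       Y.VendorNumber, Y.UnitPrice "
    "FROM PurchDB.SupplyPrice X, PurchDB.SupplyPrice Y "
    "WHERE X.PartNumber = Y.PartNumber "
    "  AND X.UnitPrice > Y.UnitPrice + 50.00"
).fetchall()
```

With this data only part P-001 qualifies: vendor 9002 charges 560.00, which is more than 50.00 above the 500.00 charged by vendor 9001.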
Join variables can be used in any query as a shorthand way of referring to a table, but they must be used in queries that join a table to itself so that ALLBASE/SQL can distinguish between the two copies of the table.
Views are used to restrict data visibility as well as to simplify data access:
The sample database has a view called PurchDB.VendorStatistics, defined as follows:
This view combines information from three base tables to provide a summary of data on existing orders with each vendor. One of the columns in the view consists of a computed expression: the total cost of an item on order with the vendor.
Note that the select list of the SELECT command defining this view contains some qualified and some unqualified column names. Columns OrderDate, OrderQty, and PurchasePrice need not be qualified, because these names are unique among the column names in the three tables joined in this view. In the WHERE clause, however, both join conditions must contain fully qualified column names since the columns are named the same in each of the joined tables.
You can use a view in a query without restriction. In the FROM clause, you identify the view as you would identify a table. When you reference columns belonging to the view, you use the column names used in the view definition. In the view above, for example, the column containing quantity-on-order is called OrderQuantity, not OrderQty as it is in the base table (PurchDB.OrderItems).
The VendorStatistics view can be used to quickly determine the total dollar amount of orders existing for each vendor. Because the view definition contains all the details for deriving this information, the query based on this view is quite simple:
The query result appears as follows:
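The following sketch shows how a view of this shape keeps the query short. It assumes SQLite (through Python's sqlite3 module) in place of ALLBASE/SQL and omits the PurchDB owner name for simplicity. The table names Vendors and Orders, the placement of columns in each table, and all rows are assumptions; only the view column names and the OrderItems columns OrderQty and PurchasePrice come from the text.

```python
import sqlite3

# Assumption: SQLite stands in for ALLBASE/SQL; table layouts and data
# values are illustrative, and the owner name PurchDB is omitted.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Vendors (VendorNumber INTEGER, VendorName TEXT);
CREATE TABLE Orders (OrderNumber INTEGER, VendorNumber INTEGER, OrderDate TEXT);
CREATE TABLE OrderItems (OrderNumber INTEGER, OrderQty INTEGER, PurchasePrice REAL);

-- Three-table join with one computed column (TotalPrice).
CREATE VIEW VendorStatistics AS
SELECT Vendors.VendorNumber AS VendorNumber,
       Vendors.VendorName AS VendorName,
       OrderDate,
       OrderQty AS OrderQuantity,
       OrderQty * PurchasePrice AS TotalPrice
FROM Vendors, Orders, OrderItems
WHERE Vendors.VendorNumber = Orders.VendorNumber
  AND Orders.OrderNumber = OrderItems.OrderNumber;

INSERT INTO Vendors VALUES (9001, 'Vendor A'), (9002, 'Vendor B');
INSERT INTO Orders VALUES (1, 9001, '1990-01-15'), (2, 9001, '1990-02-01'),
                          (3, 9002, '1990-02-10');
INSERT INTO OrderItems VALUES (1, 5, 100.0), (1, 2, 50.0), (2, 1, 400.0),
                              (3, 10, 20.0);
""")

# The query names only the view; the joins are hidden in its definition.
vendor_totals = conn.execute(
    "SELECT VendorNumber, SUM(TotalPrice) FROM VendorStatistics "
    "GROUP BY VendorNumber ORDER BY VendorNumber"
).fetchall()
```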
Although you can use views in queries without restriction, you can use only some views to INSERT, UPDATE, or DELETE rows:
The PurchDB.VendorStatistics view cannot be used for any INSERT, UPDATE, or DELETE operation because it is based on a three-table join and contains a column (TotalPrice) derived from a multiplication operation.
Three clauses in the SELECT command have an effect on the execution speed of queries: the WHERE clause, the GROUP BY clause, and the ORDER BY clause.
As discussed earlier, the WHERE clause consists of one or more predicates. Predicates can be evaluated more quickly when they can be optimized by ALLBASE/SQL. The following predicates are optimizable when all the data types within them are the same (in the case of DECIMAL data, the precisions and scales of the different values must be the same). Note that after optimization, ALLBASE/SQL may perform an index scan to access data; an index scan improves data access speed by making use of an index on one or more of the columns in the predicate, as shown in the following syntax:
The lower the cluster count of an index, the greater the chance ALLBASE/SQL will use it when an appropriate index is available. Cluster count indicates the number of times ALLBASE/SQL has to access a different data page to retrieve the next row during an index scan. Refer to the ALLBASE/SQL Database Administration Guide for information on how to optimize the cluster count of an index.
The following predicate syntax is not optimizable, and an index is never used:
When a query does not contain a WHERE clause, an index is never used, because all rows from tables in the FROM clause containing columns in the select list qualify:
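The difference between an index scan and a scan of the whole table can be observed in other SQL systems as well. The sketch below uses SQLite's EXPLAIN QUERY PLAN statement through Python's sqlite3 module. It shows SQLite's optimizer, not ALLBASE/SQL's, so treat it only as an analogy; the table, the index name PartNumIndex, and the helper function plan are made up for the example.

```python
import sqlite3

# Assumption: SQLite's optimizer is used as an analogy for ALLBASE/SQL's;
# the index name and the helper function are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Parts (PartNumber TEXT, PartName TEXT, SalesPrice REAL)")
conn.execute("CREATE INDEX PartNumIndex ON Parts (PartNumber)")


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# An equality predicate on the indexed column lets SQLite use the index.
indexed_plan = plan("SELECT PartName FROM Parts WHERE PartNumber = 'P-001'")

# With no WHERE clause every row qualifies, so the whole table is scanned.
full_plan = plan("SELECT PartName FROM Parts")
```

The exact wording of the plan text varies between SQLite versions, so the sketch does not print or rely on a specific string.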
When an index is not used, ALLBASE/SQL performs what is known as a serial scan to locate rows. When a serial scan is performed instead of an index scan, the entire table is locked, regardless of the automatic locking mode of the table.
The optimization and locking ALLBASE/SQL performs for the WHERE clause in the SELECT command also applies to the WHERE clause in the UPDATE and DELETE commands.
When a query contains a GROUP BY and/or an ORDER BY clause, ALLBASE/SQL must sort rows. The time required for sorting increases as the number of qualifying rows increases. Sorting occurs in DBEFiles associated with the SYSTEM DBEFileSet. Therefore, enough file space must be available in this DBEFileSet when the query is executed to accommodate the sort operations. Guidelines on space requirements can be found in the ALLBASE/SQL Database Administration Guide.