Handling Large IMAGE Tables [ Information Access Server: Database Administration ] MPE/iX 5.0 Documentation
Information Access Server: Database Administration
Handling Large IMAGE Tables
Access Server performs slowly when a large IMAGE table is loaded. For
example, the serial load of a table from a 60,000-record dataset can take
from 10 minutes to an hour, depending on the model HP 3000 you're using
and the demand on system resources at the time of the load.
There are two major ways to alleviate this problem:
* Users, in accessing a large IMAGE table, can first define a query
so that the loaded table is considerably smaller.
* For situations where view tables make sense (when a user routinely
needs a well-defined subset of a large table), the DBA can
structure the Where Clause of the view table definition to
optimize performance.
These methods are discussed in detail below.
When Does a Load Occur?
There are three cases when a load of the complete IMAGE table takes
place:
* The user goes to the Display Table screen without specifying a
query.
* The user initiates a sort operation on the Sort Table screen
without specifying a query.
* The user initiates a summarize operation on the Summarize Table
screen without specifying a query. (Unlike local tables, there is
no need to sort columns in a remote table before summarizing
them.)
When quick performance is important to your users, they will want to
avoid these situations if at all possible.
Using the Where Clause to Reduce Table Size
Below is a discussion of the role of the Where Clause in reducing the
size of the table to be loaded, and how you can structure the Where
Clause to optimize your view table definitions.
Serial Reads Using the MATCH Operator.
If your users consistently access the same part of a large IMAGE table,
the performance will be precisely the same whether the user builds a
"view" by drawing the tables together in Access PC or in the Host Batch
Facility, or accesses a view table you have defined using the Add View
Table screen. But defining a view table saves users the unnecessary
effort of reconstructing the same view of the data every time they need
to reference it.
Suppose, for example, that in the database SAMPL2 (from the tutorial in
the Information Access: Learning the Administrator Utility manual), the
INVENTORY detail dataset contained 20,000 records. Suppose, further,
that you have a group of users interested only in paper products, another
group interested only in desk products, and so on.
You might first configure the entire dataset as one IMAGE table
TOTAL-INVENTORY.
Then you might configure four view tables, one for each product line,
using the MATCH operator and the wildcard character @. For paper
products, for example, the view table PAPER-INVENTORY could be configured
thus:
Item Clause: PRODUCT-NBR,WAREHOUSE,BACKORDERFLG,ON-HAND-QTY
Using Clause: TOTAL-INVENTORY
Where Clause: PRODUCT-NBR MATCH "PAP@"
The other three view tables would look identical except that you would
replace PAP (in the Where Clause) with DES, FIL, or PEN.
If a user now accesses the view table PAPER-INVENTORY, it is likely to be
a good deal smaller than the original table (perhaps 5,000 instead of
20,000 records). And the load need not occur until the user reduces it
further with a Define Query.
NOTE But suppose that, infrequently, the same user only wants to see
records for envelopes, for which PRODUCT-NBR is PAP006. In this
case, since a chained read can be set up on the Define Query
screen, it would be much quicker for the user to select the IMAGE
table TOTAL-INVENTORY and define a query against it (= PAP006),
than to select and pare down the view table PAPER-INVENTORY with
the same query. Chained reads are discussed below.
Chained Reads.
The disadvantage in using the MATCH operator is the time it takes to
perform serial reads on a large dataset. If you can take advantage of
chained reads on search items in an IMAGE database or keys in a KSAM
file, you can speed up the search considerably.
For example, the WAREHOUSE item is a search item in the INVENTORY
dataset. Dividing the IMAGE table TOTAL- INVENTORY into several view
tables, one for each warehouse, is easily done and highly efficient.
Suppose, as in the DBA tutorial, that one of your warehouses is in
Fresno. You could set up a view table called FRESNO- INVENTORY this way:
Item Clause: PRODUCT-NBR,BACKORDERFLG,ON-HAND-QTY
Using Clause: TOTAL-INVENTORY
Where Clause: WAREHOUSE = "Fresno"
View tables for the other warehouses would differ only in the Where
Clause.
The chained read on the value of WAREHOUSE happens very quickly.
Moreover, the size of the table to be loaded is considerably smaller than
the original IMAGE table. The result is a reduction in the time your
users need to wait for their information. (Note: But for long chains,
e.g. where 50% of the records are chained, chained reads may actually
take longer than serial reads. The shorter the chain, the better.)
Structuring for Chained or KSAM Keyed Reads.
To optimize performance when dealing with large IMAGE datasets or KSAM
files, construct Where Clauses to use chained or keyed reads rather than
serial reads wherever possible.
The following guidelines indicate how to do this:
* A Where Clause of the form
DATA-ITEM = constant
results in a chained or keyed read if the data item is a search
item of a detail dataset. If, however, the relational operator is
anything other than the equal sign (that is, <, >, <=, >=, or
MATCH), a serial read always occurs.
* A Where Clause of the form
DATA-ITEM RELOP constant
results in a serial read if the data item is not a search or key
item. (RELOP indicates any of the relational operators =, <, >,
<=, >=, or MATCH).
* A Where Clause of the form
DATA-ITEM1 RELOP DATA-ITEM2
results in a serial read if DATA-ITEM1 and DATA-ITEM2 are members
of the same dataset.
If the data items are members of different datasets, we are in a
PRODUCTION or JOIN situation. For a PRODUCTION, chained reads
will be used if RELOP is an equal sign and at least one of the
data items is a search item. For a JOIN, a chained read is
triggered by the existence of search items in the Using Clause. A
Where Clause that can use a chained read will also increase the
efficiency of JOIN processing.
* The use of AND and OR in a Where Clause can impact the ability of
Access Server to use a chained read. In the examples below,
assume that all data items are from the same IMAGE dataset.
DATA-ITEM1 = constant1 AND DATA-ITEM2 = constant2
results in a chained read if either or both of the data items is a
search item.
DATA-ITEM1 = constant1 OR DATA-ITEM1 = constant2
results in two chained reads on the same path if DATA-ITEM1 is a
search item.
DATA-ITEM1 = constant1 OR DATA-ITEM2 = constant2
always results in a serial read, even if both DATA-ITEM1 and
DATA-ITEM2 are search items. Using two chained reads on different
data items would result in duplicate records if the chains
intersect.
In clauses containing both AND and OR, you should use parentheses,
if possible, to group all clauses known to require serial reads,
and AND with a clause known to use a chained read. The result
will be that the required data is located using a chained read.
Set it up like this:
(chained-read clause) AND (serial-read clause)
When nested levels of view tables are used (that is, when TABLE1
specifies TABLE2 in its Using Clause, TABLE2 specifies TABLE3 in
its Using Clause, and so forth), then if ANY of the tables can use
a chained read, ALL processing to satisfy the nested table
specification will be chained. This occurs because nested tables
generate a logical AND of selection criteria.
NOTE Mixing ANDs and ORs in a Where Clause can cause complexity beyond
the software's capability to recognize chained read opportunities.
It is better to formulate a Where Clause so that a search item is
only ANDed with other items, or where only the same search item is
ORed on different equality conditions. The user can then query on
the result to arrive at the desired item selection.
For example,
(I1=C1 AND I2=C2) OR (I1=C3 AND I2=C4)
will result in a serial read even though I1 and I2 are both search
items. In this case, performance may be dramatically improved by
taking advantage of the chained read resulting from
I1=C1 OR I1=C3
and then let the user perform a Query On Result using the longer
search formulation. Note that the second Where Clause will result
in all of the records, and possibly more, than would result from
the first clause.
KSAM and IMSAM Keyed Reads
Both KSAM and IMSAM keys can be used for direct comparison in the same
way as IMAGE search items for JOINs and queries, such as:
"CUST-NAME = 'JOHN SMITH'"
However, because KSAM and IMSAM store key values in sorted order, other
means of keyed access are available.
Partial Match.
If a wildcard match is given for a KSAM or IMSAM key, and if the wildcard
appears at the end, then a keyed read can be used for the retrieval.
In Access PC, wildcards use the == operator and the wildcard characters
are * and ?. For the Host Batch Facility, the MATCH operator is used.
For ROMAN-8 installations, the wildcard characters are @ and ?. For
ISO-7 installations, the wildcard characters are * and ? because @ is
one of the ISO-7 mapped characters.
For example, if PRODUCT-NAME is a KSAM or IMSAM keyed read, then at the
interactive screen on the PC, the user could get all product names
starting with "HP26" by entering:
OP PRODUCT-NAME
== HP26*
In PC batch, the syntax would be:
SEARCH "PRODUCT-NAME == 'HP26*'"
In the Host Batch Facility, if ISO-7 is not configured, the syntax would
be:
SEARCH "PRODUCT-NAME MATCH 'HP26@'"
Ranges.
KSAM and IMSAM keys can be used to quickly retrieve a range of values.
In Access PC, the special operator >< is used to indicate a range. In PC
batch and in the Host Batch Facility, the <, <=, >, and >= operators are
used with an AND.
For example, if PRODUCT-NAME is a KSAM or IMSAM key field, then at the
Define Query screen in Access PC the user could get all product names
from 2400 to 2699 by entering:
OP PRODUCT-NAME
>< 2400...2699
In PC batch or the Host Batch Facility, the syntax would be:
SEARCH "(PRODUCT-NAME >= '2400')AND(PRODUCT-NAME <= '2699')"
Inequalities.
If a direct compare, partial match, or range keyed read is not available,
then a KSAM or IMSAM key can be used to gain some speed in retrievals
with a simple <, <=, >, or >= operator. Such a retrieval would be useful
for exception reporting. For example,
(SHIPMENT-VALUE > 10000)
could quickly find all very large orders if SHIPMENT-VALUE were a KSAM or
IMSAM key.
MPE/iX 5.0 Documentation