HP 3000 Manuals

DBA Guidelines [ ALLBASE/SQL Performance Guidelines ] MPE/iX 5.0 Documentation


ALLBASE/SQL Performance Guidelines

DBA Guidelines 

Several aspects of database administration affect performance.  To
improve performance in these areas, you need DBA authority.

Avoiding Runtime Revalidation of Sections 

When you run an application, ALLBASE/SQL automatically and transparently
checks each section for validity before executing it.  When a section is
found to be invalid, ALLBASE/SQL revalidates it (if possible), then
executes it.  You may notice a slight delay as the revalidation takes
place.  Revalidation also can reduce concurrency, since revalidating a
permanent section involves updates to certain system catalog tables, and
this requires obtaining exclusive page locks on the system catalog until
the transaction is committed.  To avoid problems of runtime revalidation,
you can re-preprocess the program or use the VALIDATE statement to
revalidate the affected modules prior to run time.  Complete syntax for
the VALIDATE statement is presented in the ALLBASE/SQL Reference Manual.

You may find it appropriate to use the VALIDATE statement after executing
an UPDATE STATISTICS statement or any DDL statement, since these
statements invalidate stored sections.  If you issue such statements
during a period of low activity for the DBEnvironment (at night, for
example), the DBEnvironment experiences minimal performance degradation.

   *   Use the UPDATE STATISTICS statement carefully, since it
       invalidates stored sections.  When sections are invalid, they
       must be revalidated at run time before execution.  For best
       performance, after the initial statistics for a table have been
       established, issue UPDATE STATISTICS during periods of low
       DBEnvironment activity, then use the VALIDATE statement to
       revalidate modules and procedures.

Developing Application Programs 

The following tips relate to preprocessing of application programs:

   *   During development, frequent re-preprocessing of applications
       locks system catalog pages for extended periods of time.
       Therefore it is a good idea to employ separate development and
       production DBEnvironments on your system.  Use the ISQL INSTALL
       command to move finished module files into your production
       DBEnvironments.  The INSTALL command locks the same system catalog
       pages as the preprocessor, but it only does so a single time. 
   *   Avoid using DDL (including the UPDATE STATISTICS and CREATE INDEX
       statements) when running in multi-user mode, since DDL places
       exclusive locks on system catalog resources and can reduce
       throughput significantly.  It is good practice to restrict DDL to
       single-user applications that run during off-peak hours.  You can
       set the DDL Enabled flag to NO to achieve significantly higher
       throughput in a multiuser environment if there is no data
       definition going on.
   *   Section caching can result in some performance improvement.  Note
       that the first connect to a DBEnvironment is slower when DDL is
       disabled, since extra caching is being done.  See "Section Caching
       and Directory Caching" later in this chapter for more information.
   *   Use the TERMINATE USER statement to abort a user program or ISQL
       session if necessary.  DBCore does a set critical while accessing
       the DBEnvironment, and it only allows a program to abort at
       certain strategic points.  If DBCore is filling the tuple buffer,
       it completes this operation before allowing a break.  TERMINATE
       USER allows the program to abort immediately.

Load Balancing 

Load balancing is an effort to spread out I/O in your system among the
available disk devices for the most efficient operation.  Large systems
can benefit significantly from an analysis of system load followed by
reallocating the DBEFiles for specific DBEFileSets or within a given
DBEFileSet.  In general, the DBEFiles on which the most I/O is performed
should be placed on the fastest devices available.

For best performance, separate data DBEFiles, index DBEFiles, and log
files across the available disks on your system:

   *   Log files should be put on different devices from data DBEFiles
       for improved performance and insurance against a possible head
       crash.  For the same reasons, dual log files should be put on
       different devices.
   *   On MPE/iX, an easy way to scatter files over several devices is to
       use different volumesets, which specify different devices.  Any
       files over 10 megabytes in size are scattered among devices by
       default on MPE/iX.
   *   To move DBEFiles to different devices, use the SQLUtil MOVEFILE
       command for DBEFiles and the MOVELOG command for log files.
   *   Indexes on large tables should be physically separated from data
       DBEFiles for performance reasons.  For best results, put the index
       DBEFile on a different device by using the SQLUtil MOVEFILE
       command.  Doing this reduces the movement of the disk heads.  For
       small tables, the performance gain from keeping indexes and data
       on separate spindles is smaller and may not be apparent.

Placing Concurrently Used Objects on Different Drives.   

Any two database objects that are used concurrently (that is, have I/O at
the same time) should be located on different disk drives to minimize
disk head contention.  This includes any two tables that show I/O at the
same time, any two B-tree indexes that show I/O at the same time, or a
table that shows I/O at the same time the B-tree index defined on it
shows I/O. When objects are accessed simultaneously, the disk heads may
move back and forth between the objects.  This movement slows access to
data.  To avoid this problem, use the SQLUtil MOVEFILE command to place
the competing objects on different drives.

Remember that the mapping of database objects to DBEFileSets is not
one-to-one, and that MOVEFILE moves DBEFiles, not DBEFileSets.  In order
to achieve the goal of placing different objects on different drives, you
must ensure that they are in different DBEFiles.  To this end, it is
useful to create separate INDEX and TABLE DBEFiles, and to define a
different DBEFileSet for each table or index that may need to be
separated from other objects.

Calculating Shared Memory Allocation 

As you adjust the various configurable parameters in ALLBASE/SQL, be sure
you do not exceed the amount of shared memory available on your system.
You can derive an approximate figure for the total shared memory used (in
kilobytes) from the following formula:

     SM = (4.2*DBP) + (0.5*LBP) + (0.16*NTXN) + (4.1*CBP)

where

     SM = Total Shared Memory in kilobytes
     DBP = DataBufferPages [ Limits vary with different systems ]
     LBP = LogBufferPages [ 24-1024; default 24 ]
     NTXN = Number of Concurrent Transactions [ 2-240; default 2 ]
     CBP = Control Block Pages [ 17-800 (512 on Series 300); default 37 ]

The result of this equation must be within the amount of shared memory
you have available.  This equation also gives an idea of the relative
effect of changing different parameters on the total shared memory size.

For the default parameters, you can use the following:

     SM = 4.2*DBP + 164

For the maximum parameters, you can use the following:

     SM = 4.2*DBP + 3830

Use these equations to arrive at a first approximation.
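As a quick check, the formula can be coded directly; the constants 164 and 3830 in the simplified equations fall out of the default and maximum parameter values.  (This is an illustrative sketch, not part of ALLBASE/SQL.)

```python
def shared_memory_kb(dbp, lbp=24, ntxn=2, cbp=37):
    """Approximate ALLBASE/SQL shared memory use in kilobytes.

    Coefficients come from the manual's formula:
    SM = (4.2*DBP) + (0.5*LBP) + (0.16*NTXN) + (4.1*CBP)
    Defaults are the default LogBufferPages, transaction, and
    control block values given above.
    """
    return 4.2 * dbp + 0.5 * lbp + 0.16 * ntxn + 4.1 * cbp

# Default parameters: 0.5*24 + 0.16*2 + 4.1*37 = 164.02, hence SM = 4.2*DBP + 164
default_overhead = shared_memory_kb(0)
# Maximum parameters: 0.5*1024 + 0.16*240 + 4.1*800 = 3830.4, hence SM = 4.2*DBP + 3830
max_overhead = shared_memory_kb(0, lbp=1024, ntxn=240, cbp=800)
```

For example, a DBEnvironment with 100 data buffer pages and default settings needs roughly 4.2*100 + 164 = 584 kilobytes of shared memory.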

Choosing a Number of Data Buffer Pages 

Data buffers are used to cache data pages touched by ALLBASE/SQL. Up to
the limit of buffer space, a data page is retained in memory until the
space it occupies is needed for another page that is on disk.  You can
set the number of data buffer pages your DBEnvironment uses.

The following chart shows the minimum, default, and maximum numbers of
data buffer pages for ALLBASE/SQL on different hardware configurations: 

          Minimum, Default and Maximum Data Buffer Pages 

--------------------------------------------------------------------------------------------------
|                                    |                   |                   |                   |
|               System               |      Minimum      |      Default      |      Maximum      |
|                                    |                   |                   |                   |
--------------------------------------------------------------------------------------------------
|                                    |                   |                   |                   |
| HP-UX Series 700/800               | 15                | 100               | None              |
|                                    |                   |                   |                   |
--------------------------------------------------------------------------------------------------
|                                    |                   |                   |                   |
| HP-UX Series 300/400               | 15                | 15                | 240               |
|                                    |                   |                   |                   |
--------------------------------------------------------------------------------------------------
|                                    |                   |                   |                   |
| MPE/iX                             | 15                | 100               | 16,383            |
|                                    |                   |                   |                   |
--------------------------------------------------------------------------------------------------

How many data buffer pages should you allocate?  There is no explicit
equation to find the ideal buffer pool size.  The following general
suggestions may help, though this is not an exhaustive list:

   *   Total shared memory should not exceed free real memory available
       on the system.  Available real memory can be determined only at
       run time since it depends on the total number of processes on the
       system and on the type of application programs running on the
       system.  The DBA may be able to make estimates.  Excessive shared
       memory causes page faults.

   *   You never get a performance benefit by defining more page space
       than will stay in real memory.  If the data buffers force paging
       of virtual space, too many buffers will degrade performance.

       On MPE/iX when not using pseudo-mapped files, use about 6 to 12
       times the number of active transactions.  Shared memory only holds
       buffers temporarily, since real buffering occurs in the operating
       system.  With pseudo-mapped files and on HP-UX, try to use a large
       buffer pool within the limits of the first two suggestions.
       However, a larger pool in general slows down checkpoints if the
       fraction of dirty pages remains the same.  There is a trade-off
       between checkpoint time and response time.

   *   When possible, with pseudo-mapped files on MPE/iX and on HP-UX,
       the buffer pool should accommodate all pages that are accessed
       frequently.  If a small table or index is frequently accessed,
       then the buffer pool should accommodate all its pages, if possible.

   *   If a table is being accessed without much locality of reference
       (i.e., with almost random access) and its total size is orders
       of magnitude larger than any practical buffer pool size, then
       increasing the size of the buffer pool does not help much for that
       particular table.  For example, if a table is 100 megabytes in
       size, then a 2000 page buffer pool does not work much better than
       a 1000 page pool.

   *   As a counterpoint to the previous suggestion, on HP-UX and with
       pseudo-mapped files on MPE/iX, if the table has a 2000 page index
       that is used frequently, then a 2000 page buffer pool performs
       better than a 1000 page pool.

When should you increase the number of buffer pages?  There is not much
advantage in increasing the buffer pool size on MPE/iX if you are not
using pseudo-mapped files.  However, if you are using pseudo-mapped
files on MPE/iX, or if you are running on HP-UX, increasing the number
can help performance when the hit ratio is high.  If the set of
applications does not have locality of data and index pages, it may not
matter how many data buffers there are.  Once you have enough data
buffer pages, performance is not very sensitive to adding more.
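The earlier point about randomly accessed large tables can be made concrete with a little arithmetic.  The sketch below assumes a 100 megabyte table and 4096-byte data pages, and models a uniformly random access pattern (the function name is illustrative, not an ALLBASE/SQL API):

```python
def random_hit_ratio(pool_pages, table_pages):
    """Chance that a uniformly random page access finds its page
    already cached, if the pool were devoted to this table alone."""
    return min(1.0, pool_pages / table_pages)

# A 100 MB table at 4096 bytes per page is 25,600 pages.
table_pages = (100 * 1024 * 1024) // 4096

# Doubling the pool from 1000 to 2000 pages only moves the hit
# ratio from about 3.9% to about 7.8%; either way, almost every
# access is still a physical I/O.
r1 = random_hit_ratio(1000, table_pages)
r2 = random_hit_ratio(2000, table_pages)
```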

How do you keep a small, frequently used group of pages in memory
consistently?  The trick is to cause other pages to be swapped out first.
ALLBASE/SQL employs a least recently used (LRU) algorithm to discard
older pages that have not been used recently to make room for new pages.
In order to retain a special group of pages in memory, there must be
other pages that are available for use by each read or write operation
that does not use the special group.  Furthermore, these other pages must
be allowed to age sufficiently that they become less recently used than
the special group.  If you use a large enough number of pages outside the
special group, each page will age to the point of being least recently
used before any of the special group is swapped out.  (This point is
further clarified in the example that follows.) 
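The aging behavior can be sketched with a toy LRU model.  This is illustrative Python, not ALLBASE/SQL internals: a "special" group of 10 hot pages stays cached as long as each cycle's 50 other pages age out first.

```python
from collections import OrderedDict

class LRUPool:
    """Toy model of an LRU buffer pool: accessing a page makes it
    most recently used; a miss evicts the least recently used page."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def access(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)    # hit: refresh recency
            return True
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # miss: evict the LRU page
        self.pages[page] = True
        return False

# Ten hot pages revisited every cycle, plus 50 distinct cold pages
# per cycle.  With capacity 100 (> 10 + 50), the cold pages always
# become least recently used before the hot pages do.
pool = LRUPool(100)
for cycle in range(5):
    for hot in range(10):
        pool.access(('hot', hot))
    for cold in range(50):
        pool.access(('cold', cycle, cold))

# After warm-up, every hot-page access is a buffer hit.
hits = all(pool.access(('hot', h)) for h in range(10))
```

With a smaller capacity (say, 40 pages), the same access pattern evicts the hot pages on every cycle, which is exactly the situation the text warns against.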

Use the following suggestions to determine how many data buffer pages
are enough:

   *   Examine the performance-crucial transactions.  (Usually there are
       only a few which are sensitive or dominate performance).  Look at
       the tables which are used in these transactions.
          *   INSERT actions tend to cluster at the end of a table, so if
              there is insert activity, add a buffer page for each active
              user (i.e., the instantaneously active count).
          *   If a table or index is small enough to fit in real memory
              and is heavily accessed, add the number of pages in the
              table or index.
          *   If an index is large, but the number of high level index
              pages is small and the table is heavily accessed, add a
              number of pages corresponding to the number of index levels
              above the leaf level.
          *   If a table is too big to fit in real memory and the access
              is random, add a buffer page for aging for each transaction
              before the small table is accessed.  This allows for LRU.
          *   Estimate the table access pattern in your transactions.  If
              small tables are accessed in 50% of them or more, you
              should gain by having a large enough buffer pool to hold
              the small tables.
   *   The right number of buffers is approximately the number needed to
       hold the small tables and the aged pages of the large tables.
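The suggestions above can be folded into a rough estimator.  The function below is only a rule-of-thumb sketch (the name and parameters are illustrative, not part of ALLBASE/SQL), applied to the figures of the basic example that follows:

```python
def estimate_buffer_pages(active_txns, small_hot_pages, upper_index_levels):
    """Rule-of-thumb floor for the data buffer pool:
    - one insert buffer page per active user,
    - one "aging" page per transaction so LRU can work,
    - all pages of small, heavily accessed tables and indexes,
    - one page per index level above the leaves of large, hot indexes.
    """
    insert_pages = active_txns
    aging_pages = active_txns
    return insert_pages + aging_pages + small_hot_pages + upper_index_levels

# Example figures: 10 active transactions, a 50-page small table
# (with its index), and a large index with 2 levels above the leaves.
estimate = estimate_buffer_pages(10, 50, 2)   # 72 pages as a floor
```

This gives only a floor; the worked example below arrives at larger totals once extra aging pages for the large table are counted.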

Basic Example.     

The dominant transaction in a sample application has three actions:

   *   Update a small table.
   *   Update a large table.
   *   Insert to a third table.

Assume that the tables for the sample occupy the following space in
DBEFiles:

   *   The small table and its index use 50 pages:  one root index page,
       10 leaf index pages, and 39 data pages.

   *   The large table uses 50,000 pages:  12,000 index pages (including
       one root page, 100 pages of second level index, and 11,899 leaf
       pages), and 38,000 data pages.

Also assume the following:

   *   Each insert into the third table requires 40 bytes, and the pages
       of the third table fill up about every 100 transactions.

   *   There are about 10 active transactions in the database at any
       instant.
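The fill rate assumed for the third table follows from the page size.  This quick check (plain arithmetic, not an ALLBASE/SQL API) shows why roughly one page write per hundred transactions is charged to it in the tables below:

```python
PAGE_BYTES = 4096     # data pages are 4096 bytes
INSERT_BYTES = 40     # each insert into the third table

# A data page holds about a hundred 40-byte rows, so the third
# table completes (and writes) one page per ~100 transactions.
inserts_per_page = PAGE_BYTES // INSERT_BYTES       # 102
write_ios_per_transaction = 1 / inserts_per_page    # about .01
```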

At a given instant, the buffer pool in this scenario will come to look
like the following:

-------------------------------------------------------------------------------------------------
|    Source    |    Oper-     |         Pages Already          |        Pages Requiring         |
|              |    ation     |         In Data Buffer         |          Physical I/O          |
-------------------------------------------------------------------------------------------------
| Small table  | Update       | 11 (index root and leaf)       |                                |
|              |              | 10 (data)                      | 40                             |
-------------------------------------------------------------------------------------------------
| Large table  | Update       | 1 (index root)                 |                                |
|              |              | 10 (2nd level index pages      | 100 (total index 2nd level     |
|              |              | active per transaction)        | pages)                         |
|              |              | 10 (leaf pages active per      | 12,000 (total index leaf       |
|              |              | transaction)                   | pages)                         |
|              |              | 10 (data pages active per      | 38,000 (total data pages)      |
|              |              | transaction)                   |                                |
-------------------------------------------------------------------------------------------------
| Third table  | Insert       | 10 (data pages active per      |                                |
|              |              | transaction)                   |                                |
-------------------------------------------------------------------------------------------------
| Total in     |              | 62                             |                                |
| Cache        |              |                                |                                |
-------------------------------------------------------------------------------------------------

Based on the numbers of pages given above, and assuming the default
number of data buffer pages (100), the application will require an
average of 7 I/O's per transaction, as shown in the following table:

----------------------------------------------------------------------------------------------
|            Source            |          Operation           |       Number of I/O's        |
----------------------------------------------------------------------------------------------
| Small table                  | Read Data                    | 1                            |
|                              | Write Data                   | 1                            |
----------------------------------------------------------------------------------------------
| Large table                  | Read 2nd level index         | 1                            |
|                              | Read leaf index              | 1                            |
|                              | Read data                    | 1                            |
|                              | Write data                   | 1                            |
----------------------------------------------------------------------------------------------
| Third table                  | Write data                   | .01                          |
----------------------------------------------------------------------------------------------
| Commit                       | Write Log                    | 1                            |
----------------------------------------------------------------------------------------------
| Total                        |                              | 7.01                         |
----------------------------------------------------------------------------------------------

There is a 25% chance that a random access to the small table will want a
page that is already in the buffer pool.  There is a 10% chance that the
large index 2nd level page is already in the buffer pool.  These reduce
the I/O count to about 6.4 instead of 7, as shown in the following table:

----------------------------------------------------------------------------------------------
|            Source            |          Operation           |       Number of I/O's        |
----------------------------------------------------------------------------------------------
| Small table                  | Read Data                    | .75                          |
|                              | Write Data                   | .75                          |
----------------------------------------------------------------------------------------------
| Large table                  | Read 2nd level index         | .9                           |
|                              | Read leaf index              | 1                            |
|                              | Read data                    | 1                            |
|                              | Write data                   | 1                            |
----------------------------------------------------------------------------------------------
| Third table                  | Write data                   | .01                          |
----------------------------------------------------------------------------------------------
| Commit                       | Write Log                    | 1                            |
----------------------------------------------------------------------------------------------
| Total                        |                              | 6.41                         |
----------------------------------------------------------------------------------------------
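The two totals can be reproduced with a few lines of arithmetic.  The figures are copied from the tables above; this is a bookkeeping sketch, not ALLBASE/SQL output:

```python
# Cold case: every access is a physical I/O.
cold = {
    'small read data': 1, 'small write data': 1,
    'large read 2nd level index': 1, 'large read leaf index': 1,
    'large read data': 1, 'large write data': 1,
    'third table write data': 0.01,   # one page fills per ~100 inserts
    'commit write log': 1,
}

# Warm case: 25% of small-table accesses and 10% of 2nd-level
# index reads find their page already in the buffer pool.
warm = dict(cold)
warm['small read data'] *= 1 - 0.25
warm['small write data'] *= 1 - 0.25
warm['large read 2nd level index'] *= 1 - 0.10

cold_total = sum(cold.values())   # 7.01 I/Os per transaction
warm_total = sum(warm.values())   # about 6.41
```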

First Threshold for Performance Gain.   

System performance will improve with the addition of more data buffers.
The first threshold of performance gain is when there are enough data
buffers to prevent the pages of the small table from being swapped out.
This will happen only if the small table's pages never become least
recently used.  Suppose we want 62 pages to stay in memory from the
previous calculation, and there will be about 10 insert pages.  In each
transaction, accessing the large table will require touching a root page,
two more index pages, and a data page.  If the two non-root index pages
and the data page are read each time, we touch three pages that are new.
We need enough pages in memory so that pages from the large table will
always be considered less recently used than the pages from the small
table.  A minimum estimate would be 120 pages for that; a more
comfortable estimate would allow 160 pages.  The buffer pool in this
scenario (using the 120 page minimum) will come to look like the
following:

-------------------------------------------------------------------------------------------------
|    Source    |    Oper-     |         Pages Already          |        Pages Requiring         |
|              |    ation     |         In Data Buffer         |          Physical I/O          |
-------------------------------------------------------------------------------------------------
| Small table  | Update       | 11 (index root & leaf)         |                                |
|              |              | 40 (all pages in buffer)       |                                |
-------------------------------------------------------------------------------------------------
| Large table  | Update       | 1 (index root)                 |                                |
|              |              | 40 (2nd level index            | 100 (total index 2nd level     |
|              |              | pages--active and "aging")     | pages)                         |
|              |              | 40 (leaf pages--active and     | 12,000 (total index leaf       |
|              |              | "aging")                       | pages)                         |
|              |              | 40 (data pages--active and     | 38,000 (total data pages)      |
|              |              | "aging")                       |                                |
-------------------------------------------------------------------------------------------------
| Third table  | Insert       | 10 (data pages active per      |                                |
|              |              | transaction)                   |                                |
-------------------------------------------------------------------------------------------------
| Total in     |              | 182                            |                                |
| Cache        |              |                                |                                |
-------------------------------------------------------------------------------------------------

Here is the resultant pattern of I/O's:

----------------------------------------------------------------------------------------------
|            Source            |          Operation           |       Number of I/O's        |
----------------------------------------------------------------------------------------------
| Small table                  | Read Data                    | 0                            |
|                              | Write Data                   | 0 (deferred until            |
|                              |                              | checkpoint)                  |
----------------------------------------------------------------------------------------------
| Large table                  | Read 2nd level index         | .9                           |
|                              | Read leaf index              | 1                            |
|                              | Read data                    | 1                            |
|                              | Write data                   | 1                            |
----------------------------------------------------------------------------------------------
| Third table                  | Write data                   | .01                          |
----------------------------------------------------------------------------------------------
| Commit                       | Write Log                    | 1                            |
----------------------------------------------------------------------------------------------
| Total                        |                              | 4.91                         |
----------------------------------------------------------------------------------------------

Second Threshold for Performance Gain.   

The next performance gain comes from fitting all the pages in the second
level of the large table's index into the data buffer cache.  There are
100 pages in the second index level, so the buffer pool in this scenario
will come to look like the following:

-------------------------------------------------------------------------------------------------
|    Source    |    Oper-     |         Pages Already          |        Pages Requiring         |
|              |    ation     |         In Data Buffer         |          Physical I/O          |
-------------------------------------------------------------------------------------------------
| Small table  | Update       | 11 (index root & leaf)         |                                |
|              |              | 40 (all pages in buffer)       |                                |
-------------------------------------------------------------------------------------------------
| Large table  | Update       | 1 (index root)                 |                                |
|              |              | 100 (2nd level index pages)    |                                |
|              |              | 100 (leaf pages--active and    | 12,000 (total index leaf       |
|              |              | "aging")                       | pages)                         |
|              |              | 100 (data pages--active and    | 38,000 (total data pages)      |
|              |              | "aging")                       |                                |
-------------------------------------------------------------------------------------------------
| Third table  | Insert       | 10 (data pages active per      |                                |
|              |              | transaction)                   |                                |
-------------------------------------------------------------------------------------------------
| Total in     |              | 362                            |                                |
| Cache        |              |                                |                                |
-------------------------------------------------------------------------------------------------

Here is the resultant pattern of I/O's:

----------------------------------------------------------------------------------------------
|            Source            |          Operation           |       Number of I/O's        |
----------------------------------------------------------------------------------------------
| Small table                  | Read Data                    | 0                            |
|                              | Write Data                   | 0 (deferred until            |
|                              |                              | checkpoint)                  |
----------------------------------------------------------------------------------------------
| Large table                  | Read 2nd level index         | 0                            |
|                              | Read leaf index              | 1                            |
|                              | Read data                    | 1                            |
|                              | Write data                   | 1                            |
----------------------------------------------------------------------------------------------
| Third table                  | Write data                   | .01                          |
----------------------------------------------------------------------------------------------
| Commit                       | Write Log                    | 1                            |
----------------------------------------------------------------------------------------------
| Total                        |                              | 4.01                         |
----------------------------------------------------------------------------------------------

This analysis predicts one plateau of performance at about 180-200 pages
in the buffer pool, and a second plateau at about 360-400 pages.  The next
plateau would require something like 25,000 pages, which is impractical
for most installations.

In practice, there is little penalty for extra pages, so use a generous
allocation (that is, 200 or 400 pages or more), if you can afford to.


NOTE      The previous calculations assume only the pages involved in
          one transaction type involving a small and a large table.  If
          you have other transactions contending for the database, be
          sure to allow adequate buffer space for them too.
Cautions. * The size of the page pool combined with other demands on memory should not exceed available real memory. * On a Series 300 system, where real memory is in the 2 MB - 8 MB range, the tradeoff deserves more attention. * Checkpoints may take longer if the buffer pool is larger. A checkpoint writes to disk all dirty pages in the buffer pool. If there is a consistent fraction of dirty pages, the checkpoint takes longer with a large buffer pool. However, if the large buffer pool is essentially filled with read-only data, the checkpoint may not take much longer. A rough estimate can be obtained by looking at the buffer pool size derived in the analysis above. Choosing a Number of Log Buffer Pages The number of log buffers is independent of the number of data buffers. The log buffer pages are only 512 bytes (in contrast with the 4096 byte pages for the data buffers). If the log buffer pool becomes full before any transaction completes, then at least two I/Os will be needed. You should provide enough log buffers to allow all active transactions to have log space. Additional buffers will not help performance, but adding pages probably doesn't hurt either. In the above transaction, we assume that the data in the log takes 2 pages. (Before and after images of the changed data cause about 600 bytes to be logged; the insert is 100 bytes; we allow 100 bytes for log overhead; and we round up to 1024 bytes or 2 pages.) This implies that 20 log buffer pages should be sufficient for 10 transactions. If there are occasional bursts of more than the average, 40 or 50 pages might be a good idea. Choosing the Number and Size of Log Files ALLBASE/SQL gives you great flexibility in choosing the number and size of log files. The following suggestions may be useful in making choices for your system. Nonarchive Log Guidelines. The size of nonarchive log files determines the frequency of automatic system checkpoints, and it therefore determines rollback recovery time. 
The total size of the log (all files taken together) determines the
largest transaction that can be accommodated in the DBEnvironment.

   *   Total log space should be large enough to accommodate the
       changes from the largest transaction.  Otherwise, large
       transactions may always abort.

   *   An automatic checkpoint is done every time a log file gets
       full.  By choosing a specific file size, you can force a
       checkpoint to take place at specific intervals, thereby
       controlling the amount of time required for rollback recovery.

   *   In general, a large log file is good for nonarchive logging.
       However, a checkpoint can put a significant load on the system
       when the log file is large.  When a checkpoint occurs, all data
       pages relating to the transactions currently held in the log
       file must be forced to disk if they have not already been
       swapped out.  If very large log files are used, the checkpoint
       process could flood the I/O subsystem and interfere with other,
       higher-priority processing.  If such flooding occurs, use a
       smaller log file, or use the CHECKPOINT statement to force a
       checkpoint at regular intervals.

   *   The transaction that runs out of log space triggers a
       checkpoint, and this causes all transactions to wait until the
       buffers are flushed.

   *   Use a large log buffer size to avoid frequent flushing of the
       buffer to disk.

   *   Large nonarchive log files can reduce the number of system
       checkpoints.  Use the SYSTEM.COUNTER view to display the number
       of system checkpoints.

   *   Checkpoints should be postponed as long as possible, provided
       rollback recovery time stays within limits.  Checkpoints flush
       dirty pages, which prevents the saving of writes in cases where
       the same page is updated frequently; checkpoints also increase
       disk contention.

Archive Log Guidelines. 

   *   The transaction that runs out of log space triggers a
       checkpoint, and this causes that transaction to wait until the
       buffers are flushed.
       Other transactions may continue if logging is in archive mode.

   *   In archive mode, use at least two log files, so that one can be
       backed up while the other is being used.

   *   In archive mode, use large log files to minimize the frequency
       of log backups and checkpoints.

Sorting Operations 

The sort method in ALLBASE/SQL is memory-intensive, using the
tournament sort algorithm for sorting and merging.  It sorts in two
phases:

   *   Run generation.  The sort reads the input rows, sorts them in
       the tournament tree, and generates runs (a run is a list of
       sorted tuples).  The number of runs generated depends on the
       input data and on the sort memory size.  For large sort
       operations, the sorted runs (intermediate results) may be
       written to scratch files, which are created dynamically in
       user-defined TempSpaces and purged after sorting is completed.
       If all input tuples fit in sort memory, the merge phase is
       avoided entirely and no scratch files are used.

   *   Merge phase.  In the second phase, the sorted runs are merged.
       The merge width depends on the size of the sort memory.  If the
       number of generated runs is less than the merge width, only one
       merge is done.  Otherwise, several merge passes may be
       required.

Using TempSpaces.  Create TempSpaces on your system to accommodate the
scratch files needed for sort operations.  You use the CREATE
TEMPSPACE statement to define a group (MPE/iX) or directory (HP-UX) in
which ALLBASE/SQL may open scratch files.  Scratch files are opened to
store intermediate results of operations such as sorts from queries
using ORDER BY, and from index creation.  It is recommended that the
DBA create at least one TempSpace in the DBEnvironment.  If you do not
create a TempSpace, ALLBASE/SQL opens scratch files in the current
group (MPE/iX) or in the /tmp directory (HP-UX).

CREATE TEMPSPACE does not itself use any disk space; scratch files are
not created at CREATE TEMPSPACE time.  Instead, they are created and
purged dynamically during sort operations.
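The two-phase sort described above can be summarized numerically: if
run generation produces more runs than the merge width, additional
merge passes are needed.  This sketch is illustrative only and is not
the ALLBASE/SQL implementation; the run counts and merge width are
hypothetical.

```python
from math import ceil

def merge_passes(runs, merge_width):
    """Each merge pass combines up to merge_width sorted runs into
    one, repeating until a single sorted run remains."""
    passes = 0
    while runs > 1:
        runs = ceil(runs / merge_width)
        passes += 1
    return passes

print(merge_passes(8, 16))    # 1: fewer runs than the merge width
print(merge_passes(100, 16))  # 2: 100 runs -> 7 runs -> 1 run
```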
Disk space usage under the defined TempSpaces grows and shrinks
dynamically as scratch files are created and purged by ALLBASE/SQL.

Tips for TempSpace Usage. 

   *   If large sorting operations or index creation are expected, use
       CREATE TEMPSPACE to define TempSpace locations.  Be sure to
       specify a MaxTempFileSize large enough for your needs.

   *   On MPE/iX, create one TempSpace per volume set to spread
       scratch files across volume sets and avoid filling up a single
       volume set.  On HP-UX, create one TempSpace for each disk
       partition that is available for scratch files.

   *   ALLBASE/SQL opens a new scratch file after reaching the limit
       specified in the CREATE TEMPSPACE statement.  If the sort is
       large and the default is small, ALLBASE/SQL may have to open
       many small scratch files.  To avoid this, specify a large
       MaxTempFileSize value.  For small sorts, ALLBASE/SQL does not
       create large scratch files even if MaxTempFileSize is large;
       hence, it is safe to specify MaxTempFileSize generously.

   *   If multiple TempSpaces are defined in the DBEnvironment,
       ALLBASE/SQL opens scratch files in each TempSpace in rotation.

   *   When doing large sorts on HP-UX systems, choose a
       MaxTempFileSize value large enough to prevent the creation of
       more scratch files than your HP-UX system permits.  The default
       limit on open files for one process (configurable in the kernel
       file) is 60.  The total number of files opened by
       ALLBASE/SQL--including scratch files and other files opened by
       the system--cannot exceed this limit.  If the limit is reached,
       ALLBASE/SQL returns DBERR 3061 together with an HP-UX file
       system error number.  If this error appears, drop and recreate
       your TempSpaces with larger MaxTempFileSize values.

Disk Space for Sorting.  All intermediate sort results are written to
scratch files, not to any DBEFileSet in the DBEnvironment.  Sorting
may require 100% to 200% of the input data size in temporary disk
files for intermediate results.
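The open-file limit discussed above can be checked against a planned
MaxTempFileSize with a quick estimate.  This is a sketch under the
assumption that scratch file usage roughly equals the sort's
intermediate data size; the sizes used are hypothetical.

```python
from math import ceil

def scratch_files_needed(intermediate_bytes, max_temp_file_bytes):
    """Approximate scratch file count for a large sort when each
    scratch file is capped at MaxTempFileSize bytes."""
    return ceil(intermediate_bytes / max_temp_file_bytes)

# Hypothetical 200 MB of intermediate sort results:
print(scratch_files_needed(200 * 2**20, 10 * 2**20))   # 20 files
print(scratch_files_needed(200 * 2**20, 100 * 2**20))  # 2 files
```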
If a join and an ORDER BY are both involved, the amount could be as
much as 300%.  However, scratch files are purged as soon as the sort
operation is finished.

Final sort output for queries with GROUP BY, ORDER BY, or DISTINCT is
written to the SYSTEM DBEFileSet, requiring space equal to 100% of the
sorted output.  For CREATE INDEX, the final output is written directly
to the DBEFileSet containing the index, requiring space equal to the
size of the index.  In summary:

   *   CREATE INDEX:  100-200% of the size of the input key data in
       TempSpace files, and 100% in the index's DBEFileSet.

   *   DML statements using a sort:  100-300% of the size of the input
       data in TempSpace files, and 100% in the SYSTEM DBEFileSet.

Controlling the Use of Temporary Space.  The order in which you carry
out database operations may determine the amount of temporary space
required.  For example, if you create a large table, load it, and then
create an index, you will probably use scratch file space for sorting;
but if you create the table, then create the index, and load the data
after index creation, you will not use scratch files.  In the latter
case, of course, the load will be 3 to 10 times slower because of the
need to create index entries during loading.  The index will also be
about 50% larger if you are loading unsorted data.

Memory Utilization in Sorting.  Sorting is memory-intensive, and the
amount of memory required is estimated from the number of input
tuples.  This in turn is estimated from table statistics, so it is
important to have up-to-date statistics (especially for large tables)
when executing an ORDER BY, GROUP BY, or CREATE INDEX operation.

Performance Hints for Large Sorts. 

   *   Issue the UPDATE STATISTICS statement after loading a large
       table and before issuing the CREATE INDEX statement.  Issue
       another UPDATE STATISTICS after CREATE INDEX to get correct
       statistics for the index.

   *   Make sure you have enough disk space in the TempSpaces to hold
       the scratch files.
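The space figures summarized above can be turned into a quick
worst-case planning estimate.  The helper below simply applies the
maximum percentages from the summary; it is a sketch for capacity
planning, not an ALLBASE/SQL utility.

```python
def worst_case_sort_space(input_bytes, operation):
    """Return (tempspace_bytes, dbefileset_bytes) using the maximum
    percentages given in the summary above."""
    if operation == "create_index":
        return (2 * input_bytes, input_bytes)  # up to 200% + 100%
    if operation == "dml_sort":
        return (3 * input_bytes, input_bytes)  # up to 300% + 100%
    raise ValueError("unknown operation: " + operation)

# A DML sort over 50 MB of input data, worst case:
temp, system = worst_case_sort_space(50 * 2**20, "dml_sort")
print(temp // 2**20, system // 2**20)  # 150 50
```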
Use of the Sort/Merge Join Algorithm.  ALLBASE/SQL uses two join
methods:  nested-loop join and sort/merge join.  You cannot directly
select a join method; the optimizer chooses one depending on the query
and on the statistics in the system catalog for the tables involved.
However, it may still be useful to understand the differences between
the algorithms when designing your databases.

The nested-loop method scans the second table once for each qualifying
row in the first table.  This can require considerable I/O and CPU
time when the table is larger than the buffer cache, but it may be the
optimal approach for joins involving small tables.

The sort/merge join proceeds in two phases:

   *   The sort phase.  For the sort/merge join to work, the tuples
       from the two or more tables to be joined are sorted in
       ascending order.  If the scan on a table is an index scan on
       the joined column (that is, there is a join on col1 and the
       index being picked is also on col1, or on col1, col2), then the
       sort can be skipped, since the tuples returned from the scan
       are already sorted.

   *   The merge/join phase.  Multiple tables are merged or joined
       together without scanning any table over and over, as happens
       in a nested-loop join.  This saves a significant amount of I/O
       and CPU time, especially for large tables.

Sort/merge join can help with result tables that are larger than the
buffer cache, since the nested-loop method's multiple scan passes
would result in more I/Os.  Sort/merge join also helps for tables
whose indexes are larger than the buffer cache.

Temporary Space in the SYSTEM DBEFileSet.  Query results from queries
using ORDER BY and GROUP BY clauses are stored in temporary pages in
the SYSTEM DBEFileSet.  Be sure to include enough TABLE or MIXED
DBEFile space in SYSTEM to accommodate your largest sorted query.
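The merge/join phase described in this section can be illustrated with
a toy example: two inputs already sorted on the join key are combined
in a single pass over each, rather than rescanning the inner table for
every outer row as a nested-loop join would.  The tables and column
layout here are hypothetical, and the sketch assumes unique join keys
for brevity.

```python
def merge_join(left, right, key=lambda row: row[0]):
    """Single pass over two inputs sorted on the join key.  Assumes
    the left input has unique keys (enough for this illustration)."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        kl, kr = key(left[i]), key(right[j])
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            out.append(left[i] + right[j])  # concatenate matching rows
            j += 1
    return out

customers = [(1, "Ann"), (2, "Bob"), (3, "Cal")]   # sorted on id
orders    = [(1, "pen"), (2, "ink"), (4, "pad")]   # sorted on id
print(merge_join(customers, orders))
# [(1, 'Ann', 1, 'pen'), (2, 'Bob', 2, 'ink')]
```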
As your system catalog grows, you should monitor the DBEFile space in
SYSTEM occasionally (after doing an UPDATE STATISTICS on at least one
system view) to make sure enough temporary space is still available.
As needed, add DBEFiles to SYSTEM to supply temporary pages.

Section Caching and Directory Caching.  Section caching retains a
number of sections in memory between transactions so that they do not
need to be read in again for the next execution.  Up to 12 sections (4
on Series 300 and 400 computers) are retained.  Directory caching
stores in memory the DBCore directory, which contains the locations of
tables and indexes.  Directory caching is turned on when you set the
DDL Enabled flag to NO by using ALTDBE in SQLUtil.  Section caching is
more efficient when DDL Enabled is set to NO.

Section caching is subject to the following rules:

   *   When there are 12 or fewer sections in memory (4 for Series 300
       and 400 computers), the system does not try to delete any
       permanent sections at the end of a transaction.

   *   When there are more than 12 sections in memory (4 for Series
       300 and 400 computers), the system deletes only those sections
       that are not open.  Sections are considered open if they are
       cursor sections and have been opened by the OPEN CursorName
       statement.

   *   Dynamic sections are not deleted even if the cache limit is
       exceeded.

   *   More than 12 (or 4) sections remain in memory if all of them
       are open cursors.  The only limit is the amount of user heap
       space available.

This feature is helpful for an application that executes the same
sections again and again, especially if it has a small number of
sections.

Setting Limits for Section Caching 

By default, ALLBASE/SQL allocates shared memory for up to 4 cached
sections on Series 300 and 400 systems, and up to 12 on Series 700,
800, and 900 (MPE/iX) systems.  You can increase this number by
setting an environment variable.  On HP-UX, the variable is named
HPSQLsectcache.  On MPE/iX, the JCW is called HPSQLSECTCACHE.
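For example, the limit can be raised before starting the application.
The value 64 below is an arbitrary illustration, and the MPE/iX line
is shown only as a comment; verify both forms against your system's
documentation.

```shell
# HP-UX: export the environment variable before starting the application.
HPSQLsectcache=64
export HPSQLsectcache

# MPE/iX: set the JCW from the CI instead, for example:
#   SETJCW HPSQLSECTCACHE 64
```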
HPSQLSECTCACHE can be set to any value from 4 to 128.

Using Multi-connect Functionality 

You can establish a maximum of 32 simultaneous database connections.
When your application must access more than one DBEnvironment, there
is no need to release one before connecting to another.  Performance
is greatly improved by using this method rather than connecting to and
releasing each DBEnvironment sequentially.

For more detailed information, refer to the chapter "Transaction
Management with Multiple DBEnvironment Connections" in the ALLBASE/SQL
Release F.0 Application Programming Bulletin and the section "Using
Multiple Connections and Transactions with Timeouts" in the
ALLBASE/SQL Reference Manual chapter "Using ALLBASE/SQL."

Using Timeouts to Tune Performance 

When an application requests a database resource that is unavailable,
it is placed on a wait queue.  If the application waits longer than
the timeout value specified for a given DBEnvironment connection, an
error occurs and the transaction is rolled back.  By default, the
timeout value is infinite.

Your strategy for specifying timeout values and handling timeout
errors depends on the specific needs of your application and on your
business procedures.  For example, a transaction may require resources
that are locked by another transaction that takes a great deal of time
to execute.  In this case, you could specify a reasonable amount of
time for the transaction to wait before a timeout occurs.  Such a
strategy might be essential in a distributed environment.

For more detailed information, refer to the chapter "Transaction
Management with Multiple DBEnvironment Connections" in the ALLBASE/SQL
Release F.0 Application Programming Bulletin and the section "Setting
Timeout Values" in the ALLBASE/SQL Reference Manual chapter "Using
ALLBASE/SQL."
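One common strategy for handling timeout errors can be sketched in
Python.  Everything here is hypothetical: the txn callable and the
DBTimeout class stand in for your application's transaction routine
and the timeout error it receives.  Because ALLBASE/SQL has already
rolled the transaction back at that point, it is safe to retry it from
the beginning.

```python
class DBTimeout(Exception):
    """Stands in for the error an application receives when its wait
    exceeds the connection's timeout value (the transaction has
    already been rolled back)."""

def run_with_retry(txn, attempts=3):
    """Run a transaction callable, retrying a bounded number of times
    on timeout; surface the error if every attempt fails."""
    for attempt in range(1, attempts + 1):
        try:
            return txn()
        except DBTimeout:
            if attempt == attempts:
                raise  # give up; report the error to the user

# Hypothetical transaction that times out twice and then succeeds:
calls = {"n": 0}
def flaky_txn():
    calls["n"] += 1
    if calls["n"] < 3:
        raise DBTimeout()
    return "committed"

print(run_with_retry(flaky_txn))  # committed
```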

