Internal Techniques [ TurboIMAGE/XL Database Management System Reference Manual ] MPE/iX 5.5 Documentation
TurboIMAGE/XL Database Management System Reference Manual
Internal Techniques
Although it is not necessary to know the following techniques to use
TurboIMAGE/XL, an understanding of them can help you design a more
efficient database.
Primary Address Calculation
TurboIMAGE/XL employs two distinct methods of calculating primary
addresses in master data sets. The intent of the first method is to
spread master entries as uniformly as possible throughout the record
space of the data file. This method is referred to as "hashing" and
applies only to master data sets with key items of type U, X, Z, or P. In
this case, the entire key item value regardless of its length is folded
into a positive 32-bit value.[REV BEG] This value is reduced modulo the
hashing capacity and then incremented by one to form a primary address.
The hashing capacity is the maximum capacity for a set not enabled for
dynamic expansion, or initial capacity for a set enabled for dynamic
expansion of a new database, or the original capacity (excluding
expansions) for a set enabled for dynamic expansion of an existing
database.
[REV END]
NOTE Because values are folded from right to left, applications should
place values in a type U, X, Z, or P field as follows:
* Values that change should be placed first in the field.
* Values that are static should be placed last in the field.
The intent of the second method is to permit you to control the primary
address assignment. This method applies only to master data sets with
key items of type I, J, K, R, or E. In this case, the right-most block of
32 bits of the key value is treated as a double integer. (If its sign
bit is not zero, it is set to zero. If the key item is only 16 bits
long, a 32-bit value is created by prefacing these 16 bits with 16 zero
bits.)
This 32-bit integer is decremented by 1 and reduced modulo the data set
hashing capacity after which 1 is added to the result to form the primary
address. If the application provides key values whose right-most 32 bits
take on values between 1 and N (where N is no greater than the data set
hashing capacity), the corresponding entries will be assigned primary
addresses 1 through N which is identical to the Direct Access Method
(DAM). In this event, there are no secondaries and performance is
outstanding. However, if the application has no control of the key value
assignment and/or if N exceeds the hashing capacity, secondaries will
occur; this, along with the clustering typical of this method, may result
in poor DBPUT performance. This method should be used only if you have
determined that the potential poor performance consequences cannot occur.
The intent of the two primary address algorithms is to spread master
entries as uniformly as possible throughout the record space of the data
file. This uniform spread reduces the number of synonyms occurring in
the master data set.
NOTE In general, a master data set with a capacity equal to a prime
number or to the product of two or three primes can yield fewer
synonyms than master data sets with capacities consisting of many
factors. Refer to appendix C for a list of prime numbers. More
information on dynamic data set expansion is given earlier in this
chapter.
Migrating Secondaries
In some cases, secondary entries of master data sets are automatically
moved to storage locations other than the one originally assigned. This
most often occurs when a new master data entry is assigned a primary
address occupied by a secondary entry. By definition, the secondary
entry is a synonym to some other primary entry resident at their common
primary address. Thus, the new entry represents the beginning of a new
synonym chain. To accommodate this new chain, the secondary entry is
moved to an alternate secondary address and the new entry is added to the
data set as a new primary entry. This move and the necessary linkage and
chain head maintenance is done automatically.
A move can also occur when the primary entry of a synonym chain that has
one or more secondary entries is deleted. Because retrieval of each
entry occurs through a synonym chain, each synonym chain must have a
primary entry. To maintain the integrity of a synonym chain,
TurboIMAGE/XL always moves the first secondary entry to the primary
address of the deleted primary entry.
Space Allocation for Master Data Sets
[REV BEG]
Space allocation for each master data set is controlled by a free space
counter resident in the user label of the data set, by each bit map that
monitors each block of the data set, and by enabling of the data set for
dynamic expansion.
When a data set is enabled for dynamic expansion, the expansion is
triggered at run-time during DBPUT (implied for automatic master) when
the data set has reached its almost-full capacity. Once an expansion is
done, the master data set is partitioned into two areas: the original
area corresponding to the original initial capacity (hashing) and the
expansion area corresponding to all extents allocated as a result of
run-time expansion. The original area has a mixture of primary and
secondary entries. The expansion area has only secondary entries which
are added using a delete chain head and end-of-file pointer (high water
mark), similar to detail data sets.
The master data sets which are not enabled for dynamic expansion have
only one area which contains both primary and secondary entries.
When a new entry is added, TurboIMAGE/XL decrements the free space
counter and sets the bit corresponding to the newly assigned record
address to a one. If the bit is a zero before the record is added, the
assigned record address is the primary address. If the bit is a one
before the record is added, it indicates that an entry already exists.
If this existing entry is a primary entry, a search is done to find a
free location, secondary address, for the new entry. However, if the
existing entry is a secondary entry, this secondary entry is relocated to
another free secondary address, and the new entry is added at this
location. If the data set is enabled for dynamic expansion, the search
for a free secondary address is done in the original as well as expansion
area. For the original (primary, hashing, or initial) area, a secondary
address is identified by a serial search of the bit maps of blocks for a
zero indicating an unused record. For the expanded area, a secondary
address is identified using the pointer to a delete chain and end-of-file
pointer, as in detail data sets.
[REV END]
Space Allocation for Detail Data Sets
Space allocation for each detail data set is controlled by a free space
counter, an end-of-file pointer and a pointer to a delete chain. The
end-of-file pointer contains the record address of the highest-numbered
entry which has existed so far in the data set. The delete chain pointer
contains the record address of the entry which was most recently deleted.
When each detail data set is first created, the end-of-file pointer and
delete chain pointer are both zero.
When a new entry is added to a detail data set, TurboIMAGE/XL assigns to
it the record address referenced by the delete chain pointer, unless the
pointer is zero or HWMPUT has been enabled either using DBCONTROL or
using DBUTIL.[REV BEG] If the delete chain pointer is zero or if the
HWMPUT flag is enabled, the end-of-file pointer is incremented and then
used as the assigned record address. The free space counter is
decremented in either case. When a new entry is to be added to a detail
data set and the free space counter is zero, at run-time, the data set is
expanded according to the capacity expansion parameter specified for this
data set (if any). The capacity expansion parameters can be set using
DBSCHEMA for new databases or by using DBCHANGE Plus or other third-party
utilities for existing databases.
[REV END]
When an existing entry is deleted, its media record is zeroed, the first
word is replaced with the current delete chain pointer, and the block is
written to disk. The delete chain pointer is set to the address of the
newly deleted entry and the free space counter is incremented.
The delete chain is, in effect, a "last-in-first-out" linked list of
reusable media record space. Reusable space is allocated in preference
to the unused space represented by record addresses beyond the
end-of-file pointer except when HWMPUT is enabled.
Addition and deletion of data entries also requires data chain
maintenance and turning on or turning off the corresponding bit of the
appropriate bit map. Both of these are necessary for retrieval integrity
but neither play a role in space allocation for detail data sets.
Buffer Management
TurboIMAGE/XL maintains a set of buffer partitions in the DBB for all
users of an open database. DBFIND, DBGET, DBUPDATE, DBPUT, and DBDELETE
locate a buffer header from one of these partitions.
Each partition is allocated its own buffer header pool, hash table, and
free list. The buffer header pool is a set of buffer headers allocated
for the accessors of its corresponding partition. The hash table
consists of linked lists of buffer header addresses either in use or
ready to be released. The free list is a linked list of available buffer
headers. Initially, when the DBB is created, all of the buffer headers
belonging to a partition are linked to a free list and all the hash table
chains are empty. TurboIMAGE/XL uses a two-level hashing algorithm based
on the block number of the data set to determine the partition number as
well as the hash table entry to be used.
When an intrinsic issues a request for a data set block, the buffer
manager starts the search from its hash table entry. If the hash table
chain is empty, it acquires a buffer header from the free list. The
buffer header is first allocated from the free list to build the buffer
header for the data set block and link it to its appropriate hash table
chain. When the hash chain, as well as the free list search, is
exhausted, the process pauses to wait for other processes to release
buffers then retries the buffer header pool scan.
Locking Internals
Within the DBG is a large lock area that provides space for the entries
described below.
Accessor Entries.
One of these is created for each successful call to DBOPEN (each access
path). Although located in the lock area, each accessor entry is the
link with which TurboIMAGE/XL controls access to the database. An
accessor entry is deleted when DBCLOSE is called for the access path, and
the space is reused.
Set Entries.
One of these is created for every data set that is specified in a lock
request. Therefore, the maximum number of set entries is equal to the
number of data sets in the database. These entries are never deleted.
Descriptor Entries.
These entries contain the internal form of the lock descriptors specified
in locking mode 5 or 6. They disappear when the locks are released (when
DBUNLOCK is called) and the space is reused.
[REV BEG]
In addition to DBG, the global database lock table, TURBOLKT, is used to
avoid deadlocks by IMAGE/SQL users, and from TurboIMAGE/XL users (if
deadlock detection is activated by DBCONTROL mode 7). The TURBOLKT
contains information pertaining to every lock by every user on the
system. In the event of a potential deadlock, an error is returned
instead of causing a system hang.[REV END]
MPE/iX 5.5 Documentation