HP 3000 Manuals

High Availability [ General Information Manual ] MPE/iX 5.0 Documentation


General Information Manual

High Availability 

The HP 3000 is targeted at operationally critical applications in which
the operation of the business depends on system availability and data
integrity: if the system stops, the business stops.  The HP 3000 900
Series offers significant enhancements that deliver very high data
availability and integrity for critical business applications.  High
availability begins with high-quality PA-RISC hardware.  Based on
warranty data for the HP 3000 Series 950, the HP 3000 900 Series has
achieved an increase in reliability of more than 50 percent over the HP
3000 Series 70, which the industry already acknowledges as highly
reliable.

The high-availability strategy for the HP 3000 lets you configure your
system with the options that meet the availability needs of your
environment.  Specifically, the HP 3000 provides products that minimize
downtime from both unplanned and
planned events.  Unplanned downtime results from component failures,
while planned downtime results from normal system operation events such
as data backup.  The goal of the HP 3000 high-availability products is to
provide system availability of 99.9 percent with respect to unplanned
downtime, to limit downtime from any given failure to 30 minutes or less,
and to allow 24-hour-a-day operation.
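
As a rough illustration of what these goals imply, the arithmetic can be
sketched in Python.  The sketch assumes a 365-day year and that every
failure uses the full 30 minutes; the function names are illustrative and
do not come from any HP product.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def downtime_budget_hours(availability_percent: float) -> float:
    """Return the unplanned downtime allowed per year, in hours."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return HOURS_PER_YEAR * unavailable_fraction


def max_failures_per_year(availability_percent: float,
                          minutes_per_failure: float) -> int:
    """Return how many worst-case failures fit inside the yearly budget."""
    budget_minutes = downtime_budget_hours(availability_percent) * 60.0
    return int(budget_minutes // minutes_per_failure)
```

Under these assumptions, 99.9 percent availability allows about 8.76
hours of unplanned downtime per year, or at most 17 failures that each
take the full 30 minutes.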

Some high-availability features, such as transaction management,
automatic powerfail recovery, user volumes, and online diagnostics, are
standard on all HP 3000 900 Series systems.  Other solutions, such as HP
Mirrored Disk/XL, HP AutoRestart/XL, HP SPU Switchover/XL, and HP
TurboSTORE/XL, are separate products that can be purchased to tailor your
system to the specific requirements of your environment.

Transaction management 

Over the last 20 years, computers have evolved from batch-oriented
systems to online systems that are characterized by a large number of
users performing simultaneous updates to common data.  These online
transaction processing (OLTP) systems have very
stringent requirements for response times, accuracy of data, and the
ability to recover from system and hardware failures with complete data
integrity.

For example, an airline reservation system must maintain data integrity
even if several users are performing updates to the same file.  It is
important that the application ensures that the same seat is not assigned
to more than one passenger.  This is accomplished by locking common data
so that only one user at a time may change it.
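
The locking described above can be sketched as follows.  This is a
minimal Python illustration, not the locking mechanism of the HP 3000 or
of any reservation product; the SeatMap class and its methods are
hypothetical.

```python
import threading


class SeatMap:
    """Hypothetical in-memory seat table guarded by a single lock."""

    def __init__(self, seats):
        self._assigned = {seat: None for seat in seats}
        self._lock = threading.Lock()

    def assign(self, seat, passenger):
        """Assign the seat if it is free; return True on success."""
        with self._lock:  # only one caller at a time may check and update
            if self._assigned[seat] is not None:
                return False
            self._assigned[seat] = passenger
            return True

    def holder(self, seat):
        """Return the passenger holding the seat, or None if it is free."""
        with self._lock:
            return self._assigned[seat]
```

Because the check and the update happen while the lock is held, two
callers that request the same seat at the same moment cannot both
succeed.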

In addition, the application must not lose any data in the case of a
system failure.  Users must be able to recover from both soft failures,
which do not cause any data to be altered on disk, and hard failures,
such as a disk head crash, in which data is destroyed.  Checkpointing and
logging provide the ability to recover from both soft and hard failures.
Checkpointing refers to saving a snapshot of data in a known consistent
state.  Logging is the saving of the actual checkpoint data in a file.

If a transaction is aborted or a soft failure occurs before the
transaction is committed to disk, the file can be restored to its
original state by copying the before image of the data from the log file
back into the data file.  This is the same as rollback recovery in HP
TurboIMAGE or HP ALLBASE/SQL.
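
The before-image technique can be sketched in a few lines of Python.
This illustrates the general idea only, not the transaction manager's
implementation; LoggedFile and its methods are hypothetical, and the data
file and log are held in memory for brevity.

```python
class LoggedFile:
    """Hypothetical record store that logs before images for rollback."""

    _MISSING = object()  # marks a key that did not exist before the update

    def __init__(self, records=None):
        self.records = dict(records or {})
        self._log = []  # before images for the transaction in progress

    def update(self, key, value):
        """Save the before image, then change the record."""
        before = self.records.get(key, self._MISSING)
        self._log.append((key, before))
        self.records[key] = value

    def commit(self):
        """Make the changes permanent; before images are no longer needed."""
        self._log.clear()

    def rollback(self):
        """Undo uncommitted changes by restoring before images, newest first."""
        while self._log:
            key, before = self._log.pop()
            if before is self._MISSING:
                self.records.pop(key, None)
            else:
                self.records[key] = before
```

Restoring the before images in reverse order matters when one
transaction updates the same record more than once: the oldest image is
applied last, so the record ends in its original state.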

In the case of a hard failure, transactions from the log file can be
applied to a backup and a rollforward recovery of the transactions can be
performed.  This method of recovery simply reapplies all logged
transactions to a checkpoint version of the file.
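
A rollforward recovery can be sketched in the same spirit.  The function
below is a hypothetical Python illustration; it assumes that the log
holds after images in commit order and that the backup reflects a
consistent checkpoint.

```python
def roll_forward(backup, log):
    """Rebuild current data by reapplying logged transactions to a backup.

    backup: dict of records as of the checkpoint.
    log: list of transactions in commit order; each transaction is a list
         of (key, after_image) pairs, where None means "record deleted".
    """
    recovered = dict(backup)  # leave the backup copy itself untouched
    for transaction in log:
        for key, after_image in transaction:
            if after_image is None:
                recovered.pop(key, None)
            else:
                recovered[key] = after_image
    return recovered
```

The backup is copied before the log is applied, so the recovery can be
repeated from the same starting point if it is interrupted.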

Transaction manager.   

The transaction manager has been integrated into the operating system and
is standard on every 900 Series system.  The transaction manager performs
automatic checkpointing and logging activities for critical system data
structures such as the file system and HP ALLBASE/SQL and HP TurboIMAGE
databases.  In the event of a system software failure, the transaction
manager provides automatic data integrity recovery of these critical data
structures.

In most commercial computing environments, the file system, databases,
and applications each manage transactions and recovery differently.  The
result calls for a complex solution that requires duplication of effort,
which incurs high administrative and support overhead costs and
compromises overall performance.  The HP 3000's transaction manager
consolidates all of these functions into a single, efficient, and
consistent module that is common across all disk access methods.
Performance and efficiency gains are also realized over implementations
at higher levels of the system by tight coupling with memory management,
I/O, and PA-RISC protection hardware.

Transaction logging.   

The HP 3000 provides comprehensive logging facilities that are integrated
into the operating system and databases.

   *   System logging records the details of system resource requests.
   *   Database logging ensures the logical and physical data integrity
       of HP ALLBASE/SQL and HP TurboIMAGE data.
   *   User logging allows applications to log events and data to disk or
       tape files.

The system manager can enable and disable system logging, as well as
select which system events to record.  System log records are provided
for job and session initiation and termination, program completion, file
closing, spooling completion, system shutdown, and I/O device failures.

The database administrator relies on built-in database logging and
recovery.  The HP ALLBASE/SQL relational database management system logs
before and after images for every write transaction.  In the event of a
system failure or a program abort, the log file is used to automatically
roll back any partially completed transactions.  In the event of a
hardware or software failure, the transactions from the log file can be
reapplied to a backup copy of the database to bring it up to the current
state.

The HP TurboIMAGE database management system also includes an intrinsic
logging facility, along with Intrinsic Level Recovery (ILR),
must-recover, and dynamic rollback features.  These features ensure the
logical and
physical integrity of information maintained in TurboIMAGE databases.
TurboIMAGE databases can also be rolled forward or rolled back in case of
data loss.

The HP 3000 provides user logging through intrinsics (system procedures)
that log application transactions to disk or tape.  The application
developer has the flexibility to choose whether to wait until the
transaction is physically posted to the logging device before continuing
to the next transaction, or to continue immediately.
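
The trade-off between waiting and continuing can be illustrated with
ordinary file I/O.  The Python sketch below is an analogy, not the MPE/iX
user logging intrinsics; log_transaction and its wait_for_post parameter
are hypothetical names.

```python
import os


def log_transaction(log_file, record: bytes, wait_for_post: bool) -> None:
    """Append one record to an open binary log file.

    wait_for_post=True blocks until the operating system reports the data
    written to the device; False returns as soon as the record is buffered.
    """
    log_file.write(record + b"\n")
    if wait_for_post:
        log_file.flush()             # push Python's buffer to the OS
        os.fsync(log_file.fileno())  # ask the OS to write it to the device
```

Waiting gives the stronger guarantee that a record survives a failure, at
the cost of one device write per transaction.  Continuing immediately is
faster but leaves the most recent records at risk until they are posted.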

Automatic power failure recovery 

Automatic power failure recovery is provided by the operating system in
conjunction with the HP 3000 hardware.  Should a power failure occur, the
system initiates a power failure procedure that preserves the operating
environment prior to a complete loss of power.  A battery pack, supplied
standard with each HP 3000, ensures the validity of main memory for at
least 15 minutes.  If power is restored within this 15-minute period, the
system automatically resumes processing from the point at which the power
failure occurred.  Jobs and sessions in progress continue where they were
interrupted, unaware of the interruption and without loss of data.

HP AutoRestart/XL.   

Software failures can contribute significantly to extended downtime.  HP
AutoRestart/XL reduces this downtime by automatically and immediately
saving the system state and initiating system restart.  No operator
intervention or action is necessary, so system recovery time after a
software failure is minimized.  AutoRestart/XL transfers the system
state directly to disk rather than tape, which reduces the time required
to save the information by at least 50 percent on high-end HP 3000
systems with larger memory configurations.  AutoRestart/XL also performs
data compression as it is saving the system state, thus minimizing the
amount of disk space required.  Once the system error information has
been transferred to disk, problem analysis can begin immediately either
locally or remotely.

HP Mirrored Disk/XL.   

Disk failure is one of the major causes of lengthy unplanned downtime.
Although HP disk reliability makes these events rare, their occasional
occurrence can result in several hours of downtime.  To prevent downtime
from disk failure, the HP 3000 offers HP Mirrored Disk/XL.

Figure 3-3. HP Mirrored Disk/XL

HP Mirrored Disk/XL provides duplicate disk drives for critical
application data.  In the event of failure of a disk drive that is
mirrored, HP Mirrored Disk/XL automatically and transparently switches
all I/O activity for the mirrored pair to the mirrored partner without
disruption to the users.  Neither existing HP 3000 applications nor new
applications require any special coding to take advantage of HP Mirrored
Disk/XL.  Repair and resynchronization of the failed mirrored disk are
also performed transparently to users and applications without loss of
data integrity.  The figure above illustrates the features and benefits
of HP Mirrored Disk/XL.

HP SPU Switchover/XL.

HP SPU Switchover/XL automatically detects system failures and allows for
switchover between a primary and a secondary 900 Series HP 3000
processor.  A single 900 Series can back up multiple 900 Series systems.
An important feature of switchover is the full recovery of user data,
including flat files, HP TurboIMAGE, HP ALLBASE/SQL, and third-party
databases.  Switchover can typically be completed in less than 30
minutes, which dramatically increases system uptime.  HP SPU
Switchover/XL requires that all systems and DTCs be connected using LAN
and all disks be connected using HP-FL.

Figure 3-4. SPU Switchover/XL

SPU Switchover/XL imposes no additional system software performance
overhead on either processor.  Also, the secondary processor does not
need to be rebooted, thereby minimizing the switchover impact on the
users of the secondary processor.  Finally, after the primary processor
has been repaired, applications can be returned to the primary processor.
This switchback procedure takes about five minutes, as there is no need
to perform data recovery.

User volumes

The HP 3000 provides a user disk volume facility that allows the creation
and access of files on removable disk volumes.  User volumes are
removable disk packs that can be accessed through the file system.  Disk
packs mounted on the drives during a system load are dynamically
allocated to the system domain for normal use or to the nonsystem domain
for private use.  Nonsystem-domain packs can be both physically and
logically mounted and dismounted during normal system operation.  Thus,
system security is improved, since sensitive information can be
maintained on a separate disk.  Failure of a user volume or nonsystem
volume does not disrupt users and applications on other disks, which
results in higher system availability.

Online diagnostics

A comprehensive set of online diagnostics can be used by HP customer
engineers (CEs) to diagnose system hardware and peripheral problems while
the system is in operation.  Hewlett-Packard also provides a system
self-test that takes 30 seconds to execute and is highly effective in
isolating hardware failures.  The self-test is designed for ease of use
so that the customer can run it prior to requesting service from HP.

All of the diagnostic functions are available remotely.  A remote support
modem is included with the system when you purchase a support contract.
By connecting a remote terminal to the system console by way of a modem,
a remote console can operate in parallel with the system console.  This
allows HP CEs to diagnose hardware and run software troubleshooting tools
from a remote site.  Online diagnostics and remote support result in less
system downtime and reduced maintenance costs.

