HP Precision Architecture: Extending RISC [ General Information Manual ] MPE/iX 5.0 Documentation
General Information Manual
HP Precision Architecture: Extending RISC
RISC principles are key to providing high-performance processors.
However, providing a long-lasting architecture that can deliver
high-performance, cost-effective solutions in commercial processing
environments requires additional architectural features. PA-RISC goes
beyond RISC with the important extensions discussed below.
Expanded addressability
PA-RISC systems can be implemented with either 48- or 64-bit virtual
addresses, thus expanding addressability far beyond that of typical
32-bit systems. For example, 64-bit addressability provides over 4
billion times the virtual addressability typically available on
conventional 32-bit systems! This flexibility for supporting large
virtual address spaces ensures that 900 Series systems will be able to
meet expandability requirements as next-generation software evolves and
as commercial processing needs continue to grow.
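The "over 4 billion times" figure follows directly from the address widths. A minimal sketch of the arithmetic, in which the function name address_space_ratio is purely illustrative:

```python
def address_space_ratio(wide_bits, narrow_bits=32):
    """How many times larger a wide_bits address space is than a narrow_bits one."""
    # An n-bit address can name 2**n distinct locations, so the ratio
    # between two spaces is 2**(wide_bits - narrow_bits).
    return 2 ** (wide_bits - narrow_bits)


ratio_64 = address_space_ratio(64)   # 64-bit versus 32-bit
ratio_48 = address_space_ratio(48)   # 48-bit versus 32-bit
```

Here ratio_64 is 4,294,967,296 (2 to the 32nd power, a little over 4 billion) and ratio_48 is 65,536.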
Multiprocessors
PA-RISC allows for systems that use tightly coupled symmetric
multiprocessors. Multiprocessors share the same memory, I/O buses, and
I/O devices. They can be used to enhance system performance through
distribution of the system workload, and they provide higher availability
using CPU redundancy.
Floating-point coprocessors
The modular design of PA-RISC allows for the addition of special-function
coprocessors for accelerating execution of those complex functions that
may be important in some application mixes. For example, some
scientific, engineering, and statistical applications run on
general-purpose systems may require high-performance floating-point
calculations. For such applications, a floating-point coprocessor is
available to enhance performance.
Figure A-3. Coprocessor and Multiprocessor
Decimal arithmetic support
Decimal data types are commonly used in commercial
applications, and PA-RISC provides simple, powerful instruction
primitives to ensure high-speed decimal calculations. For example, the
Decimal Correct and Unit Add Complement instructions allow packed and
unpacked decimal addition to be performed with the binary add
instruction. Decimal calculations actually require fewer CPU cycles to
execute on 900 Series systems than on conventional systems.
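The idea of doing decimal addition with a binary add can be illustrated in software. The sketch below emulates the general bias-and-correct technique for packed decimal (one digit per 4 bits). It is not a description of the exact semantics of the PA-RISC instructions, and the function name packed_bcd_add and the eight-digit width are assumptions made for illustration. Both inputs are assumed to be valid packed decimal.

```python
SIXES = 0x66666666        # a bias of 6 in each of eight 4-bit digits
CARRY_MASK = 0x111111110  # the bit just above each digit (carry-out positions)


def packed_bcd_add(a, b):
    """Add two 8-digit packed-decimal values using one ordinary binary add."""
    biased = a + SIXES                 # pre-bias: a decimal carry becomes a binary carry
    binary_sum = biased + b            # the plain binary add
    carries = binary_sum ^ biased ^ b  # bit i is the carry into bit i
    no_carry = ~carries & CARRY_MASK   # digits that did not carry out
    # Build a 6 in every digit that did not carry, then remove that leftover bias.
    correction = (no_carry >> 2) | (no_carry >> 3)
    return binary_sum - correction
```

For example, packed_bcd_add(0x15, 0x27) returns 0x42, the packed form of 15 + 27 = 42. A carry out of the top digit appears in bit 32 of the result.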
High-performance input/output
Providing effective support of database management systems is one of the
key strengths of the HP 3000 family. Thus, a key design objective of
PA-RISC was to ensure a high level of data security and high throughput
in I/O-intensive database applications. The first step was to provide a
large virtual address space, which can be used very effectively by
MPE/XL's file mapping schemes. Furthermore, PA-RISC incorporates a
memory-mapped I/O scheme, whereby I/O operations are initiated and
controlled using a series of load/store instructions to reserved virtual
or real memory locations. A key advantage of this scheme is that I/O
accesses use the same access protection mechanisms as code and data.
Coupled with other I/O subsystem features such as DMA chaining, which
allows multiple transactions to be processed without CPU intervention,
I/O operations on PA-RISC systems carry less overhead and deliver
increased I/O performance.
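The memory-mapped scheme can be pictured with a small simulation in which ordinary loads and stores to a reserved address range reach a device instead of memory, and both kinds of access pass through the same protection check. Everything named here (the IO_BASE address, the Device class, and the rule that only privilege level 0 may touch the I/O range) is a hypothetical simplification, not the actual PA-RISC or MPE/XL layout.

```python
IO_BASE = 0xF0000000  # hypothetical start of the reserved I/O address range


class Device:
    """Hypothetical device that records the register writes it receives."""

    def __init__(self):
        self.commands = []

    def write_register(self, offset, value):
        self.commands.append((offset, value))

    def read_register(self, offset):
        return len(self.commands)  # toy status: number of commands so far


class AddressSpace:
    def __init__(self, device):
        self.ram = {}
        self.device = device

    def _check(self, address, privilege):
        # One protection check serves memory and I/O accesses alike.
        # Assumption: level 0 is the most privileged of four levels.
        if address >= IO_BASE and privilege != 0:
            raise PermissionError("I/O range requires privilege level 0")

    def store(self, address, value, privilege):
        self._check(address, privilege)
        if address >= IO_BASE:
            self.device.write_register(address - IO_BASE, value)
        else:
            self.ram[address] = value

    def load(self, address, privilege):
        self._check(address, privilege)
        if address >= IO_BASE:
            return self.device.read_register(address - IO_BASE)
        return self.ram.get(address, 0)
```

A store to IO_BASE + 4 at privilege level 0 reaches the device as a register write at offset 4, while the same store at level 3 raises PermissionError before the device is touched.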
Instruction pipelining
Instruction pipelining refers to the overlapped execution of multiple
instructions, with each instruction occupying a different stage of the
pipeline at any given time. For example, in a five-stage pipeline an
instruction is fetched from cache during the first stage and is decoded
during the second stage. The CPU internal calculation or function is then
performed during the third stage, and the fourth stage is used to generate
the condition code for the corresponding result. Finally, in the fifth
stage, a general-purpose register is set with the corresponding cache or
internal result.
Fixed-length, fixed-format instructions help streamline instruction
pipelining. Additionally, load/store RISC-based machines are ideal for
minimizing the number of pipeline stages required for high performance
and for ensuring that the time required to perform each stage is as short
as possible.
Figure A-4. Instruction Pipeline
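The overlap in a five-stage pipeline can be tabulated with a short sketch. It models an ideal pipeline with no stalls, which is a simplifying assumption. The stage labels and the function name pipeline_schedule are illustrative and are not taken from the manual.

```python
STAGES = ["fetch", "decode", "execute", "condition code", "register write"]


def pipeline_schedule(n_instructions, stages=STAGES):
    """Return one dict per cycle mapping stage name to instruction index."""
    # With no stalls, the last instruction enters the pipeline in cycle
    # n_instructions - 1 and needs len(stages) cycles to drain.
    total_cycles = n_instructions + len(stages) - 1
    schedule = []
    for cycle in range(total_cycles):
        active = {}
        for depth, name in enumerate(stages):
            instruction = cycle - depth  # instruction occupying this stage
            if 0 <= instruction < n_instructions:
                active[name] = instruction
        schedule.append(active)
    return schedule


schedule = pipeline_schedule(6)
```

Under these assumptions, six instructions finish in 10 cycles rather than the 30 that strictly sequential execution of five stages each would take. From the fifth cycle onward the pipeline is full and one instruction completes per cycle.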
Delayed-branch capability
On conventional computers, the instruction sequentially following a taken
branch instruction is loaded into the pipeline but is not executed. The
result is a dead cycle that is not used for processing. On PA-RISC
systems, a branch instruction can specify that the instruction
sequentially following the branch is to be executed, so that this cycle
can be used for processing. Because branches constitute roughly
one-sixth of typical instruction mixes, using the available cycle after a
branch results in increased performance with PA-RISC systems.
Figure A-5. Delayed Branch Capability
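A rough estimate of the benefit can be sketched as follows. The model charges one dead cycle for each taken branch whose following slot does no useful work. The one-sixth branch frequency comes from the text above; the fraction of branches taken and the fraction of slots filled are assumed parameters, and the function name total_cycles is illustrative.

```python
from fractions import Fraction


def total_cycles(n_instructions, branch_fraction,
                 taken_fraction=Fraction(1), slots_filled=Fraction(0)):
    """Cycles for n_instructions at one instruction per cycle plus dead cycles."""
    # Each taken branch whose following slot is not used costs one dead cycle.
    dead = n_instructions * branch_fraction * taken_fraction * (1 - slots_filled)
    return n_instructions + dead


one_sixth = Fraction(1, 6)
without_delay_slot = total_cycles(600, one_sixth)
with_delay_slot = total_cycles(600, one_sixth, slots_filled=Fraction(1))
```

With every branch taken and every slot filled, 600 instructions take 600 cycles instead of 700. Real instruction mixes fall between these two extremes.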
Optimizing compilers
Optimizing compilers ensure the best possible coupling between high-level
languages and PA-RISC machine instructions. Reduced-complexity systems
are ideal targets for optimizing compilers and, conversely, the best
performance on such systems depends on effective optimization. The
optimizing compilers on the 900 Series systems analyze program behavior
at a global level and ensure that instructions are executed in the most
efficient order. Frequently accessed operands are allocated to CPU
registers, so that the number of accesses to cache and main memory is
minimized. Instructions are scheduled such that the efficiency of the
instruction pipeline is maximized. For example, compilers schedule
instructions so that the available cycle after a taken branch is used for
useful processing, and they overlap other instructions with load
instructions to keep execution rates close to one instruction per cycle.
Millicode
The 900 Series systems use millicode routines to perform some of the more
frequently executed complex tasks. Millicode routines, quite simply, are
sequences of PA-RISC instructions that can be accessed and executed very
efficiently by the operating system and that provide complex functions
such as moving characters. These performance-tuned millicode
routines ensure effective support of complex functions sometimes required
by high-level languages.
Extensive data and code protection mechanisms
PA-RISC specifies a four-level privilege scheme for all code, data, and
I/O accesses. This is supplemented by a 15-bit protection identifier
that is assigned to each virtual page and checked each time the page is
accessed. The flexibility of this scheme allows for efficient data and
code sharing and ensures a high level of data and code security.
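The two checks described above can be combined in a small sketch. The manual states only that there are four privilege levels and a 15-bit protection identifier per page. The rest is assumed for illustration: that lower level numbers are more privileged, that a process holds a set of protection identifiers, and the names Page and access_allowed.

```python
from dataclasses import dataclass

PRIVILEGE_LEVELS = 4
PROTECTION_ID_BITS = 15


@dataclass(frozen=True)
class Page:
    protection_id: int       # identifier assigned to the virtual page
    required_privilege: int  # least-privileged level still allowed (0..3)


def access_allowed(page, current_privilege, process_protection_ids):
    """Apply both checks: privilege level and protection identifier."""
    if not 0 <= current_privilege < PRIVILEGE_LEVELS:
        raise ValueError("privilege level must be 0..3")
    if not 0 <= page.protection_id < 2 ** PROTECTION_ID_BITS:
        raise ValueError("protection identifier must fit in 15 bits")
    # Assumption: a lower number means a more privileged level.
    privileged_enough = current_privilege <= page.required_privilege
    # Assumption: the process holds a set of identifiers it may use.
    id_matches = page.protection_id in process_protection_ids
    return privileged_enough and id_matches


shared_page = Page(protection_id=0x1234, required_privilege=3)
system_page = Page(protection_id=0x0042, required_privilege=0)
```

In this model two processes share a page simply by holding the same protection identifier, while a page marked for level 0 stays inaccessible to less privileged code even when the identifier matches.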