COMMUNICATOR 3000 MPE/iX General Release 5.0 (Core Software Release C.50.00)
Chapter 10 Technical Articles
Workload Manager Technical Overview
by Susan Campbell
Commercial Systems Division
Introduction
The Workload Manager product (product number B3879AA) becomes available
with this release of MPE/iX. This article provides a technical overview
of the Workload Manager. The article "Introducing New Workgroups for the
HP 3000" provides a discussion of the new workgroups and what is
available to all users (whether they have purchased the product or not).
The articles "CI Commands for the Workload Manager" and "AIF
Enhancements" cover changes to the CI and AIFs, which include support for
the Workload Manager.
In addition, the Online Help Facility contains the detailed syntax of all
the new commands. Significantly more information regarding the Workload
Manager is available in the product manual, Using the HP 3000 Workload
Manager (B3879-90001).
Article Contents
In an effort to include a variety of valuable information, this article
is rather lengthy; the following list of contents can help the reader
navigate to the areas of interest.
* Features and Benefits: Unlimited Partitioning, System Manager in
  Control, Controlling User Service Levels
* Workgroups: Membership Criteria, Scheduling Characteristics
* Commands
* Sample Uses: Consolidations, Reactive Changes, Proactive Changes,
More Consistent Response Times
Features and Benefits
The Workload Manager gives the system manager additional control over
the performance of the system. When used in conjunction with a
performance monitoring tool, it provides considerable ability to both
monitor and manage the system. This control includes
the ability to partition the workload as necessary, to have strict
control over this partitioning, and to manage CPU access. These controls
are at a process level, controlling the access of processes to the system
CPU(s).
Unlimited Partitioning.
The Workload Manager allows the system manager to partition their system
workload into workgroups. While MPE/iX traditionally provided only five
scheduling subqueues (AS, BS, CS, DS, and ES), the Workload Manager
allows a system manager to create an essentially unlimited number of
user-defined workgroups.
Thus, the partitioning can contain as many workgroups as the system
manager feels are necessary to control their system workload. The
workload might be divided into workgroups based on the particular users
doing the work or the programs that are being run. A workgroup could
represent individuals with a similar task to perform, users in the same
department, or individuals accessing a shared database.
Workload Manager allows the system manager to define workgroups that
truly reflect their view of the system workload, allowing the controlling
and reporting of workgroups to correspond more naturally to the way in
which their system is used.
System Manager in Control.
Workload Manager intentionally gives full control over the system
workload partitioning to the system manager. It is the system manager
who determines the user-defined workgroups that are used to control
system performance. The system manager dictates which processes belong
to a given workgroup by specifying logon information (optional
job/session name and user.account), the program being run (in MPE or HFS
syntax), and/or the scheduling queue attribute of the process.
There are many things that influence the scheduling queue attribute of a
process. The queue can be specified via the ;PRI= parameter on the
HELLO, JOB, and ALTPROC commands. MAXPRI can be customized for
particular users and accounts. The Link Editor allows a maximum and
default scheduling queue to be set for program files. The routines
GETPRIORITY and AIFPROCPUT can change this attribute. Thus, the system
manager cannot precisely control the scheduling queue of a process. A
user might require a MAXPRI that allows CS queue access, but they can
then perform online compiles or other CPU-intensive tasks in the CS
queue.
Workgroup membership is determined by the system manager and cannot be
changed by the user. The scheduling queue attribute (which the user can
influence) is one attribute that determines workgroup membership, but the
system manager can also use the logon and program information to
determine membership. Thus, the system manager has complete control over
workgroup membership.
Controlling User Service Levels.
The Workload Manager allows the system manager to control the level of
service they provide to their users by controlling the CPU access of the
workgroups that have been defined. The precise control depends on the
goals of the system manager, but could result in maintaining a certain
average response time, or providing a certain degree of throughput. The
"Sample Uses" section of this article provides some specific examples.
The system manager controls user service by adjusting the scheduling
characteristics of each workgroup to control the CPU access of those
processes in that workgroup. The "Workgroups" section of this article
contains a discussion of the available scheduling characteristics.
In addition to the traditional scheduling characteristics that have been
available for the scheduling subqueues, user-defined workgroups have CPU
percentage bounds. The system manager can guarantee a minimum amount of
CPU to a workgroup, or restrict the workgroup to a maximum amount of CPU.
Workgroups
The article, "Introducing New Workgroups for the HP 3000", in Chapter 3
provided details regarding workgroups. This section provides a brief
review, serving as a foundation for the later sections of the article.
In addition to the user-specified name, a workgroup has a set of
membership criteria that determines which processes are members of the
workgroup, and scheduling characteristics that are used to determine the
CPU access of the workgroup member processes.
Membership Criteria.
As noted earlier, workgroup membership criteria are determined by the
system manager and include logon, program, and traditional scheduling
queue.
Logon.
Logon includes the user.account, and may include job/session name if
desired.
Program.
The program can be specified in MPE or HFS syntax, and must be fully
qualified or an absolute pathname (wildcards are allowed).
Queue.
The queue criterion is the traditional scheduling queue attribute of the
process (AS, BS, CS, DS, or ES). Recall that this attribute can be
influenced in many ways. This
is just one of the attributes that can determine workgroup membership,
and it is workgroup membership that determines the scheduling
characteristics of the process.
Multiple values can be provided for each category; a process is a member
of a workgroup if it matches one value for each of the specified
membership categories.
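To illustrate how the criteria combine, consider a hypothetical
workgroup defined with the following membership criteria (the workgroup
name, account, and program are invented, and the parameter keywords are
placeholders; the exact NEWWG syntax is in the Online Help Facility):

     :COMMENT Illustrative syntax only; keywords may differ.
     :NEWWG ORDERS;USER=@.SALES;PROGRAM=OE.PUB.APPS;QUEUE=CS

A CS-queue process running OE.PUB.APPS under any user of the SALES
account would be a member of ORDERS; the same program run from another
account, or running in the DS queue, would not.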
Scheduling Characteristics.
The MPE/iX Dispatcher remains priority-driven, dispatching processes to
the CPU(s) based on their priority. The MPE/iX Scheduler controls
process priorities in accordance with the scheduling parameters
established by the user. The scheduling characteristics include those
characteristics associated with traditional scheduling queues, as well as
new CPU percentage bounds that are available for user-defined workgroups.
Base and Limit Priorities.
The base and limit priorities determine the range of priorities available
to processes within the workgroup. If no user-defined workgroups have
CPU percentage minimums, the CPU is allocated to processes based on their
priorities. Processes in a workgroup with base=152 and limit=160 run
before processes in a workgroup with base=170 and limit=180.
Quantum Bounds.
The minimum and maximum quantum values bound the calculation of the
workgroup quantum, which determines the rate of priority decay of
processes within the workgroup. The quantum represents the average
transaction time of processes within that workgroup. Process CPU
consumption is compared against the quantum to determine the amount of
priority decay. Small quantum values mean most transactions are short,
and process priorities decay quickly. Larger quantum values indicate
longer transaction times, and process priorities decay more slowly.
Boost Property.
While the quantum controls the rate of priority decay, the boost property
determines the behavior of the process once its priority has decayed to
the limit of the workgroup. The default value of DECAY indicates that
the process decays to the limit and remains there until it completes its
transaction. The value OSCILLATE indicates that if the process priority
decays to the limit of the workgroup, the priority is reset to the base
priority (the process oscillates between the base and limit priorities).
Timeslice.
The timeslice is used to ensure that one process does not monopolize the
CPU for long periods of time. When a process is launched, the Dispatcher
guarantees that it does not run for more than its timeslice value (even
if it is CPU-bound). The Dispatcher actually takes the CPU away from the
process if it is still running after the timeslice interval has passed
(provided the process can be interrupted).
CPU Percentage Bounds.
User-defined workgroups also allow the specification of minimum and
maximum CPU percentage bounds. The minimum percentage serves as a
guarantee. Processes in a workgroup with a minimum CPU percentage of 20
percent are guaranteed 20 percent of the CPU(s), provided they have
enough demand to use the 20 percent. Note that the guarantee is for the
collection of processes in the workgroup, not for each process in the
workgroup. If the processes demand more than 20 percent, they can
receive more, providing they do not violate the minimum values for other
workgroups. Thus, the minimum is a true minimum and can be exceeded; it
is not a strict target value.
The maximum percentage serves to restrict the CPU consumption of a
workgroup. Processes in a workgroup with a maximum CPU percentage of 50
percent never receive more than 50 percent of the CPU. If no other
workgroups require CPU, the system idles rather than allows the workgroup
to exceed its maximum.
The "Sample Uses" section of this article discussed how these scheduling
characteristics might be used.
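As a rough sketch of how these characteristics fit together, a
hypothetical workgroup carrying all of them might be defined along the
following lines (the values and parameter keywords are illustrative
only; the exact syntax is in the Online Help Facility):

     :COMMENT Illustrative syntax only; keywords and units may differ.
     :NEWWG REPORTS;BASE=170;LIMIT=180;MINQUANT=500;MAXQUANT=2000;&
            BOOST=DECAY;TIMESLICE=200;MINPCT=10;MAXPCT=40

Processes in such a workgroup would run between priorities 170 and 180,
decay toward 180 at a rate bounded by the quantum values, remain at 180
until their transactions complete (DECAY), and collectively be
guaranteed roughly 10 percent of the CPU while never consuming more
than 40 percent.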
Commands
The Workload Manager introduces four new commands (NEWWG, ALTWG, SHOWWG,
PURGEWG) and modifies two existing commands (SHOWPROC and ALTPROC). The
article "CI Commands for the Workload Manager" in Chapter 3 contains more
information on the commands, and the Online Help Facility contains the
detailed syntax of all the commands.
NEWWG.
The NEWWG command allows the use of command-line specifications to add a
single workgroup to the existing workgroup configuration, or the use of
an indirect file to replace the existing workgroup configuration with
that in the file.
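For example, a single workgroup might be added to the current
configuration, or the entire configuration might be replaced from an
indirect file (the keywords, file name, and indirect-file notation are
illustrative; see the Online Help Facility for the exact syntax):

     :COMMENT Illustrative syntax only.
     :NEWWG NIGHTBATCH;BASE=200;LIMIT=220;QUEUE=DS
     :NEWWG ^WGCONFIG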
ALTWG.
The ALTWG command supports changing the scheduling characteristics of the
specified workgroup. All scheduling characteristics (base and limit
priorities, quantum bounds, boost property, timeslice, CPU percentages)
can be changed, and the processes belonging to that workgroup are
scheduled in accordance with the new parameters.
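For example, assuming a previously defined workgroup named NIGHTBATCH,
its priority range and CPU maximum might be adjusted with something like
the following (the parameter keywords are illustrative):

     :ALTWG NIGHTBATCH;BASE=180;LIMIT=200;MAXPCT=30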
SHOWWG.
The SHOWWG command supports a variety of formats that display information
regarding the workgroups (both user-defined and system-default). This
includes summary information on the workgroups, member processes,
detailed information, or an output format suitable for CI I/O redirection
and input to the NEWWG command.
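For example, a summary of all workgroups might be displayed, or the
current configuration captured to a file for later use with NEWWG (the
format-option spelling and file name are illustrative):

     :SHOWWG
     :SHOWWG ;FORMAT=WGFILE >WGCONFIG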
PURGEWG.
The PURGEWG command supports purging any of the user-defined workgroups;
the system-defined default workgroups cannot be purged. Wildcarding is
supported, as are a variety of prompting and display options.
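For example, assuming a set of experimental workgroups named with the
prefix TEST, they could be removed with a single wildcarded command:

     :PURGEWG TEST@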
SHOWPROC.
The SHOWPROC command continues to display process attributes. An
additional format (DETAIL) provides information regarding those
attributes that can determine workgroup membership (logon, program,
queue) and the resulting workgroup. Processes can be made artificial
members of workgroups (see ALTPROC) and the DETAIL display distinguishes
such processes from natural members of the workgroup.
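For example, the membership attributes and resulting workgroup of a
particular process might be examined with something like the following
(the pin is hypothetical and the option spelling is illustrative; see
the Online Help Facility for the exact syntax):

     :SHOWPROC 123;FORMAT=DETAIL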
ALTPROC.
The ALTPROC command continues to allow process attributes to be set. The
;PRI= option should not be used on a system that has user-defined
workgroups. Rather, the ;WG= option can be used to make a process an
artificial member of the target workgroup. Processes can be placed at
fixed priority by placing them in a workgroup with its base and limit set
to the desired priority value. Processes moved to a workgroup via
the ;WG= option can be restored to their natural workgroup via
;WG=Natural_Wg.
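For example, a process (pin 123 here, a hypothetical value) could be
moved into a previously defined Nightly_Batch workgroup and later
returned to its natural workgroup:

     :ALTPROC 123;WG=Nightly_Batch
     :ALTPROC 123;WG=Natural_Wg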
Sample Uses
The following examples of Workload Manager usage demonstrate how the
capabilities can be used in particular situations. These are by no means
comprehensive, but rather provide a few examples.
Consolidations.
The Workload Manager can provide value when consolidating multiple source
systems onto a target system. The Workload Manager features can address
concerns when planning for the consolidation, during the consolidation
itself, and when managing the final consolidated system.
Partitioning.
One typical concern regarding consolidation is the limited amount of
control available on the target system. Five scheduling subqueues were
available on each of the source systems, and only five scheduling
subqueues are available on the target system. The Workload Manager can
be used to define multiple workgroups that represent the users of the
various systems. If desired, workgroups can be created to represent the
CS, DS, and ES processes from each of the source systems. This would
serve to preserve the partitioning that had been available with the
physical separation of the source systems.
Alternatively, the Workload Manager can be used to define workgroups that
more naturally reflect the needs of the combined user population.
Perhaps data entry clerks had been in the CS subqueue of several source
systems and can be combined into a single workgroup on the target system.
Similar batch jobs might be collected into a common workgroup. Users
from one system who were forced to share the CS subqueue can now be
broken into individual workgroups. The scheduling characteristics of the
workgroups on the target system can be adjusted to result in the CPU
access that the system manager requires to achieve desired performance.
User Expectations.
Another area of concern relates to the consolidation itself. Consider a
consolidation situation where Systems A, B, and C are being consolidated
onto System D. Often the consolidations are spread over time, perhaps
bringing over System A on weekend 1, System B on weekend 2, and System C
on the following weekend. One problem that can result from this deals
with the performance expectations of the users of System A. While they
are running alone on System D, the performance (response time and
throughput) of the users from System A is excellent. When they are
joined by the users from System B, their performance may degrade. Once
all three systems are combined on System D, the System A users may
actually complain about their performance.
How can the Workload Manager be used to help this situation? Consider
the root cause of the dissatisfaction of the users from System A. They
had grown accustomed to the better performance when alone on System D,
and felt the degradation when the system settled to steady state after
the consolidations
were complete. The CPU maximum feature of the Workload Manager can be
used to restrict the amount of CPU available to users. The users from
System A could be constrained to use only 30 percent of System D. Thus,
they would experience from the onset the performance that results when
all consolidations are complete. Their expectations would not be set
artificially high, and thus they would not be disappointed by the results
after the consolidations are complete.
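As a sketch, if the users from System A were gathered into a workgroup
(called SYS_A here for illustration), such a cap might be applied along
the following lines, where MAXPCT stands in for the actual
maximum-percentage keyword:

     :ALTWG SYS_A;MAXPCT=30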
This example is obviously simplified. The system manager may not wish to
divide the target system up evenly among the users from the three source
systems. Perhaps one set of users is more important and requires more of
the CPU. Alternatively, the issue with the consolidation may be a concern
of how to ensure that competing workloads from the various source systems
are able to co-exist on the target system. The partitioning into
workgroups addresses this concern.
The benefit in this case is to provide consistent performance throughout
the consolidation; the Workload Manager can also add value when a system
is in its steady state (as shown below).
Reactive Changes.
While the ideal is to proactively manage the system, it is often the case
that problems arise unexpectedly and reactive changes are necessary. The
Workload Manager can be used to make such reactive changes.
Increase CPU Access.
Once the Workload Manager has been used to group processes into
workgroups, the entire workgroup can be given increased access to the
CPU. Perhaps it is a busy season for orders and the telephone sales
representatives need faster response time, so their CPU access should be
increased. Perhaps there was a problem with a batch run the night
before, and the affected batch jobs need some CPU access during the day
so they can complete.
The ALTWG command can be used to adjust the scheduling characteristics of
the workgroup and address this need to give a workgroup increased CPU
access. The base and limits of the workgroup might be increased to
higher priorities. Since the MPE/iX Dispatcher is priority-driven, this
gives the members of this workgroup preference over lower-priority
processes. If the workgroup has been given a minimum CPU percentage, and
is at that percentage, the ALTWG command can be used to increase the
percentage of this workgroup (and decrease the minimums of other
workgroups). Note that this is only effective if the workgroup is using
its minimum; if there is not enough CPU demand within the workgroup to
consume 20 percent of the CPU, raising the minimum to 25 percent does not
improve performance. Also note that if the workgroup is constrained by a
maximum and is reaching that maximum, raising the maximum gives the
workgroup greater access to the CPU.
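As a sketch, assuming the telephone sales representatives already form a
workgroup named ORDERS, either of the following adjustments (with
illustrative parameter keywords) would increase their CPU access:

     :ALTWG ORDERS;BASE=152;LIMIT=160
     :ALTWG ORDERS;MINPCT=25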
The above scenario assumed that the entire workgroup (all member
processes) was to be given improved CPU access. If there is a specific
process that requires increased access to the CPU (a particular user who
has called with a problem, a batch job that must complete quickly, etc),
the ALTPROC command can be used to adjust that process. The ;WG= option
can be used to move the process to a workgroup with better CPU access
(higher priority or a larger CPU minimum value). For example, the system
manager may define a High_Priority workgroup that runs with base and
limit set to 152. Processes requiring fast CPU access could be placed in
this workgroup via :ALTPROC pin;WG=High_Priority.
Decrease CPU Access.
The opposite case involves a need to decrease the priority of a set of
users. Perhaps a batch run did not complete the night before and is now
impacting the response time of interactive users. Perhaps a group of
users is receiving 0.25 second response time when 0.5 second would be
sufficient. The ALTWG command can be used to alter the scheduling
characteristics of the appropriate workgroup. The base and limit
priorities might be lowered (thereby reducing the priorities of all
member processes). If the workgroup has been assigned a minimum amount
of CPU and is using that minimum, the value might be lowered. As before,
this only has an effect if the workgroup is using its minimum. Lowering
the CPU minimum from 25 percent to 20 percent has no effect if the
workgroup is only consuming 15 percent. Alternatively, a maximum CPU
percentage might be imposed on the workgroup, constraining it to use no
more than, say, 20 percent of the CPU.
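As a sketch, assuming the late-running jobs belong to a workgroup named
NIGHTBATCH, either of the following adjustments (again with illustrative
parameter keywords) would reduce their CPU access:

     :ALTWG NIGHTBATCH;BASE=220;LIMIT=240
     :ALTWG NIGHTBATCH;MAXPCT=20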
As before, it might be the case that a single process requires
adjustment, rather than an entire workgroup. A critical process may be
stuck in an infinite loop and cannot be killed, or a CPU-intensive
process may be performing a non-time-critical task. The
ALTPROC command can be used to move the process to a workgroup running at
lower priority.
Proactive Changes.
In addition to supporting the reactive changes that are often required,
the Workload Manager allows the system manager to make proactive changes.
Often a system manager can predict the behavior of their system workload.
They know the performance requirements at 9am differ from those at 6pm
and on the weekends. The NEWWG command supports an indirect file format
that can be used to make proactive changes.
The system manager can create configuration files that represent the
desired workgroup configurations for different system performance needs.
The files need not be created from scratch; the output of the SHOWWG
command with the WGFILE format option can be routed to a file. This file
can then be modified as appropriate. The NEWWG command supports a
;VALIDATE option that merely checks the validity of the indirect file;
it does not invoke the changes. In this way, the system manager can be
assured that subsequent NEWWG commands using the file will not fail. The
NEWWG command with the
appropriate configuration file can be invoked via timed job streams or at
the completion of certain jobs.
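Putting these pieces together, a rough sequence might look like the
following, where the file name, job name, format-option spelling, and
indirect-file notation are all illustrative (the exact syntax is in the
Online Help Facility):

     :SHOWWG ;FORMAT=WGFILE >EVECONF
     :COMMENT ...edit EVECONF to reflect the evening configuration...
     :NEWWG ^EVECONF;VALIDATE
     :STREAM EVEJOB;AT=18:00

Here EVEJOB is a job file that invokes NEWWG with the EVECONF
configuration file, streamed to run at 6pm.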
More Consistent Response Times.
The Workload Manager features can be used to ensure more consistent
response times for users. The need might be to meet a specific Service
Level Agreement (SLA) with users, to minimize performance complaints, or
to facilitate capacity planning. The grouping of processes into
workgroups gives the system manager a partitioning into groups with
similar needs. Altering the scheduling characteristics of those
workgroups provides the control over CPU access, which in turn helps
determine response time.
CAUTION It is critical to understand that CPU access is just one
component of response time. The Workload Manager can help
handle this aspect, but cannot handle problems with disk access
speeds, memory constraints, network latency and availability, or
the other components of response time.
In controlling CPU access, the system manager can either control the
workgroup needing the consistent response times, or identify the other
workgroup(s) whose behavior leads to inconsistent response times and
control those workgroups.
Single Workgroup.
In handling a single workgroup, the system manager can manipulate
priorities (changing the base and limit priorities), the rate of priority
decay (changing the quantum), or the CPU minimum percentages. Which
control is most effective depends on the characteristics of the processes
being controlled. It may be the case that placing the workgroup at
priorities 160-170 gives consistent 1-second response time to those
users. Alternatively, it may require that a minimum CPU percentage of 20
percent be given to the workgroup in order to ensure 1-second response
time.
Other Workgroup(s).
If processes in Workgroup_1 are having inconsistent response times, it
may be due to the influences of other workgroups. Perhaps Workgroup_2 is
at higher priority and contains processes that perform long
transactions,
leading to increased response times for the processes in Workgroup_1.
Those processes might be removed from Workgroup_2 and placed in a
workgroup at lower priority. Workgroup_2 might be moved to a lower
priority. Alternatively, a CPU maximum might be set for Workgroup_2,
restricting the amount of CPU it can consume. If it is a single process
that is disrupting the response times of others, it might be moved to a
lower-priority workgroup.
An obvious question in all of these situations is how the system manager
can determine which action is most appropriate. An analysis of the
specific situation is required. While no general discussion can provide
the level of detail necessary for this activity, the product manual
provides guidelines and troubleshooting tips.