The usual method of identifying problems is to characterize
the situation in which the problem occurs and then investigate which
of the possible causes are actually responsible for the problem.
Finding the cause is often sufficient to suggest the resolution
of the problem. For example, assume that the problem is characterized
as "the user is unable to open a line with the DSLINE
command." A possible cause is that the user entered a command
using incorrect syntax. You would resolve the problem by correcting
the command and reissuing it. However, if the syntax was correct,
you would have to look for another possible cause, such as an inactive
link or a failure of the remote node.

Thus, in most cases you start with the characterization of
the problem and investigate the possible causes. The difficult part
of troubleshooting is to identify the actual cause of the problem.
Once you know the actual cause, you can take the appropriate action
to resolve the problem.

To Characterize the Problem
It is important to ask questions when you are trying to characterize
a problem. Start with global questions and gradually get more specific.
Depending on the response, you ask another series of questions, until
you have enough information to understand exactly what happened.

Key questions to ask are as follows:

1. Was an error message generated? Use the NS 3000/iX
   Error Messages Reference Manual to look up the cause
   of the error and take the action suggested. If this does not resolve
   the problem, continue with the next question.

2. Is the problem isolated to one user or program?
   If so, continue to the next question. If more than one user is involved,
   proceed to question 6.

3. Did the user perform the operation correctly? Was
   syntax correct? Does the user have the correct logon and authority
   to use the command or service? Correct any problems found. If the
   operation was correct, continue with the next question.

4. Did the problem occur while the user was running
   a program? Were there program errors? If so, investigate and correct
   the program errors. Otherwise, continue with the next question.

5. Did the problem occur while attempting to open a
   line or transmit data? If so, investigate the connection between
   this system and the remote system.

6. If more than one user is involved, does the problem
   affect all users? The entire node? If so, has anything changed recently?
   Some possibilities are:

   - New software and hardware installation.
   - Same hardware but changes to the software. Has the
     configuration file been modified? Has the MPE/iX configuration been
     changed?
   - Same software but changes to the hardware.

7. Do you suspect hardware or software? It is often difficult to
   determine whether the problem is hardware or software related.

   Symptoms that mean you should suspect the hardware are:

   - Bad LAN card or PSI dumps.
   - Link level errors, either returned to the user or
     logged to the console. This includes CI errors, NMERR errors, power
     fails, and link shutdowns.
   - Lost data: data is sent but not received
     at the link destination. (This could also be caused by a software
     problem.)

   Symptoms that mean you should suspect the software are:

   - Logging messages at the console.
   - Network Services errors returned to users or programs.
   - MPE/iX file system (FSERR) or command interface
     (CIERR) errors (except "Remote Not Responding"
     errors).
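
The order of the questions above can be read as a simple decision flow. The following Python sketch is only an illustration of that flow and is not part of NS 3000/iX. The `Report` class, its field names, and the `next_step` function are hypothetical names introduced here, and the returned strings paraphrase the actions given in the list.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Answers collected while characterizing a problem (hypothetical fields)."""
    error_message: bool = False        # question 1: was an error message generated?
    manual_action_tried: bool = False  # question 1: documented action already taken?
    single_user: bool = True           # question 2: isolated to one user or program?
    operation_correct: bool = True     # question 3: syntax, logon, authority all correct?
    program_errors: bool = False       # question 4: errors in a running program?
    line_or_transfer: bool = False     # question 5: failed opening a line or sending data?
    node_wide: bool = False            # question 6: all users or the entire node affected?


def next_step(r: Report) -> str:
    """Return the next action suggested by questions 1 through 7."""
    # Question 1: look up the message first, unless that was already tried.
    if r.error_message and not r.manual_action_tried:
        return "Look up the error message and take the suggested action."
    if r.single_user:
        # Questions 3 to 5 apply to a problem isolated to one user or program.
        if not r.operation_correct:
            return "Correct the syntax, logon, or authority problem."
        if r.program_errors:
            return "Investigate and correct the program errors."
        if r.line_or_transfer:
            return "Investigate the connection to the remote system."
    elif r.node_wide:
        # Question 6: more than one user is involved and the whole node is affected.
        return "Review recent software, hardware, and configuration changes."
    # Question 7: nothing above applied, so weigh the symptom lists.
    return "Decide whether hardware or software is suspect."
```

For example, `next_step(Report(single_user=False, node_wide=True))` selects the question 6 branch and suggests reviewing recent changes, while a report with all default answers falls through to question 7.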

To Identify the Cause of Problems
The type of investigation that you use to identify the possible
causes of a problem depends on whether the problem affects one user
or an individual situation, or whether it is node-wide. Once
you have the answers to the questions listed previously, use the
flowchart in Figure 4-1 “Characterizing the Problem” as a guide
and see Chapter 5 “Common Network Problems” for
a problem resolution strategy.

Figure 4-1 Characterizing the Problem