General Portability Considerations [ HP C Programmer's Guide ] MPE/iX 5.0 Documentation
General Portability Considerations
This section summarizes some of the general considerations to take into
account when writing portable HP C programs. Some of the features listed
here may be different on other implementations of C. Differences between
Series 300/400 versus 700/800 implementations are also noted in this
section.
Data Type Sizes and Alignments
Table 2-1 in Chapter 2 shows the sizes and alignments of the C data types
on the different architectures.
Differences in data alignment can cause problems when porting code or
data between systems that have different alignment schemes. For example,
if you write a C program on Series 300/400 that writes records to a file,
then read the file using the same program on Series 700/800, it may not
work properly because the data may fall on different byte boundaries
within the file due to alignment differences. To help alleviate this
problem, HP C provides the HP_ALIGN pragma, which forces a particular
alignment scheme, regardless of the architecture on which it is used.
The HP_ALIGN pragma is described in Chapter 2.
Accessing Unaligned Data
The Series 700/800, like all PA-RISC processors, requires data to be
accessed from locations that are aligned on multiples of the data size.
The C compiler provides an option to access data from misaligned
addresses using code sequences that load and store data in smaller
pieces, but this option will increase code size and reduce performance.
A bus error handling routine is also available to handle misaligned
accesses but can reduce performance severely if used heavily.
Here are your specific alternatives for avoiding bus errors:
1. Change your code to eliminate misaligned data, if possible. This
is the only way to get maximum performance, but it may be
difficult or impossible to do. The more of this you can do, the
less you'll need the next two alternatives.
2. Use the +ubytes compiler option available at 9.0 to allow 2-byte
alignment. However, the +ubytes option, as noted above, creates
big, slow code compared to the default code generation which is
able to load a double precision number with one 8-byte load
operation. Refer to the HP C/HP-UX Reference Manual (Series
700/800) for more information.
3. Finally, you can use allow_unaligned_data_access() to avoid
alignment errors. allow_unaligned_data_access() sets up a signal
handler for the SIGBUS signal. When the SIGBUS signal occurs, the
signal handler extracts the unaligned data from memory byte by
byte.
To implement, just add a call to allow_unaligned_data_access()
within your main program before the first access to unaligned data
occurs. Then link with -lhppa. Any alignment bus errors that
occur are trapped and emulated by a routine in the libhppa.a
library in a manner that will be transparent to you. The
performance degradation will be significant, but if it only occurs
in a few places in your program it shouldn't be a big concern.
Whether you use alternative 2 or 3 above depends on your specific code.
The +ubytes option costs significantly less per access than the handler,
but it costs you on every access, whether your data is aligned or not,
and it can make your code quite a bit bigger. You should use it
selectively if you can isolate the routines in your program that may be
exposed to misaligned pointers.
There is a performance degradation associated with alternative 3 because
each unaligned access has to trap to a library routine. You can use the
unaligned_access_count variable to check the number of unaligned accesses
in your program. If the number is fairly large, you should probably use
alternative 2. If you only occasionally use a misaligned pointer, it is
probably better just to use the allow_unaligned_data_access() handler.
There is a stiff penalty per bus error, but it doesn't cause your program
to fail and it won't cost you anything when you operate on aligned data.
The following is an example of the use of allow_unaligned_data_access()
within a C program:
#include <stdio.h>
extern int unaligned_access_count;  /* This variable keeps a count
                                       of unaligned accesses. */
char arr[]="abcdefgh";
char *cp, *cp2;
int i=99, j=88, k;
int *ip;
main()
{
    allow_unaligned_data_access();
    cp = (char *)&i;
    cp2 = &arr[1];
    for (k=0; k<4; k++)
        cp2[k] = * (cp+k);
    ip = (int *)&arr[1];
    j = *ip;      /* This line would normally result in a
                     bus error on Series 700 or 800 */
    printf("%d\n", j);
    printf("unaligned_access_count is : %d\n", unaligned_access_count);
}
To compile and link this program, enter
cc filename.c -lhppa
This enables you to link the program with allow_unaligned_data_access()
and the int unaligned_access_count that reside in /usr/lib/libhppa.a.
As noted above, every unaligned access handled through this library traps
to a library routine, so if unaligned_access_count reports a fairly large
number, you should probably use the +ubytes compiler option instead.
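One common way to apply alternative 1 is to copy the bytes into a properly
aligned object with memcpy() instead of dereferencing a misaligned pointer.
The following is a minimal sketch of that idea and is not taken from the HP
libraries; the function name load_int_unaligned is hypothetical:

```c
#include <string.h>

/* Hypothetical helper: reads an int stored at any byte address.
   memcpy() copies the object byte by byte, so p does not need to
   be aligned for int. */
int load_int_unaligned(const void *p)
{
    int value;

    memcpy(&value, p, sizeof value);
    return value;
}
```

In the example program above, a call such as load_int_unaligned(&arr[1])
could take the place of the cast to int * and the dereference of ip.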
Checking for Alignment Problems with lint
If invoked with the -s option, the lint command generates warnings for C
constructs that may cause portability and alignment problems between
Series 300/400 and Series 700/800, and vice versa. Specifically, lint
checks for these cases:
* Internal padding of structures. lint checks for instances where a
structure member may be aligned on a boundary that is
inappropriate according to the most-restrictive alignment rules.
For example, given the code
struct s1 { char c; long l; };
lint issues the warning:
warning: alignment of struct 's1' may not be portable
* Alignment of structures and simple types. For example, in the
following code, the nested struct would align on a 2-byte boundary
on Series 300/400 and an 8-byte boundary on Series 700/800:
struct s3 { int i; struct { double d; } s; };
In this case, lint issues this warning about alignment:
warning: alignment of struct 's3' may not be portable
* End padding of structures. Structures are padded to the alignment
of the most-restrictive member. For example, the following code
would pad to a 2-byte boundary on Series 300/400 and a 4-byte
boundary for Series 700/800:
struct s2 { int i; short s; };
In this case, lint issues the warning:
warning: trailing padding of struct/union 's2' may not be portable
Note that these are only potential alignment problems. They would cause
problems only when a program writes raw files which are read by another
system. This is why the capability is accessible only through a command
line option; it can be switched on and off.
lint does not check the layout of bit-fields.
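The padding that lint warns about can also be measured directly with the
standard offsetof macro and sizeof. The following is an illustrative sketch
that uses struct s1 from the first case; the function names are
hypothetical, and the values returned depend on the system on which the
code is compiled:

```c
#include <stddef.h>

struct s1 { char c; long l; };

/* Bytes of padding the compiler inserted between c and l. */
size_t s1_internal_padding(void)
{
    return offsetof(struct s1, l) - sizeof(char);
}

/* Bytes of padding the compiler added after the last member. */
size_t s1_trailing_padding(void)
{
    return sizeof(struct s1) - (offsetof(struct s1, l) + sizeof(long));
}
```

Printing these values on each target system shows whether a structure
written as raw bytes on one system has the same layout on the other.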
Ensuring Alignment without Pragmas
Another solution to alignment differences between systems would be to
define structures in such a way that they are forced into the same layout
on different systems. To do this, use padding bytes--that is, dummy
variables that are inserted solely for the purpose of forcing struct
layout to be uniform across implementations. For example, suppose you
need a structure with the following definition:
struct S {
char c1;
int i;
char c2;
double d;
};
An alternate definition of this structure that uses filler bytes to
ensure the same layout on Series 300/400 and Series 700/800 would look
like this:
struct S {
char c1; /* byte 0 */
char pad1,pad2,pad3; /* bytes 1 through 3 */
int i; /* bytes 4 through 7 */
char c2; /* byte 8 */
char pad9,pad10,pad11, /* bytes 9 */
pad12,pad13,pad14, /* through */
pad15; /* 15 */
double d; /* bytes 16 through 23 */
};
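Whether the padded definition really produces the intended layout can be
checked with offsetof. The sketch below repeats the padded struct S and adds
a hypothetical function, layout_matches, that compares each offset with the
byte numbers given in the comments. It assumes a 4-byte int and an 8-byte
double:

```c
#include <stddef.h>

struct S {
    char   c1;                    /* byte 0 */
    char   pad1, pad2, pad3;      /* bytes 1 through 3 */
    int    i;                     /* bytes 4 through 7 */
    char   c2;                    /* byte 8 */
    char   pad9, pad10, pad11,    /* bytes 9 */
           pad12, pad13, pad14,   /* through */
           pad15;                 /* 15 */
    double d;                     /* bytes 16 through 23 */
};

/* Hypothetical check: returns 1 if the compiler laid out struct S
   exactly as the comments describe, 0 otherwise. */
int layout_matches(void)
{
    return offsetof(struct S, c1) == 0
        && offsetof(struct S, i)  == 4
        && offsetof(struct S, c2) == 8
        && offsetof(struct S, d)  == 16
        && sizeof(struct S)       == 24;
}
```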
Casting Pointer Types
Before understanding how casting pointer types can cause portability
problems, you must understand how Series 700/800 aligns data types. In
general, a data type is aligned on a byte boundary equivalent to its
size. For example, the char data type can fall on any byte boundary, the
int data type must fall on a 4-byte boundary, and the double data type
must fall on an 8-byte boundary. A valid location for a data type would
then satisfy the following equation:
location mod sizeof(data_type) == 0
Consider the following program:
#include <string.h>
#include <stdio.h>
main()
{
    struct chStruct {
        char ch1;          /* aligned on an even boundary */
        char chArray[9];   /* aligned on an odd byte boundary */
    } foo;
    int *bar;              /* must be aligned on a word boundary */
    strcpy(foo.chArray, "1234");   /* place a value in the ch array */
    bar = (int *) foo.chArray;     /* type cast */
    printf("*bar = %d\n",*bar);    /* display the value */
}
Casting a smaller type (such as char) to a larger type (such as int) will
not cause a problem. However, casting a char* to an int* and then
dereferencing the int* may cause an alignment fault. Thus, the above
program crashes on the call to printf() when bar is dereferenced.
Such programming practices are inherently non-portable because there is
no standard for how different architectures reference memory. You should
try to avoid such programming practices.
As another example, if a program passes a cast pointer to a function
that expects a parameter with stricter alignment, an alignment fault may
occur. For example, the following program causes an alignment fault on
Series 700/800:
#include <stdio.h>
void intfunc (int *iptr);     /* declared before its first use */
int main (int argc, char *argv[])
{
    char pad;
    char name[8];
    intfunc((int *)&name[1]); /* pointer may not be int-aligned */
    return 0;
}
void intfunc (int *iptr)
{
    printf("intfunc got passed %d\n", *iptr);
}
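The alignment equation given earlier in this section can be turned into a
run-time check that is made before a pointer is cast to a stricter type.
The following is a sketch only: the name is_aligned is hypothetical, and it
assumes a compiler that supplies <stdint.h> and uintptr_t (C99), which the
HP C release described here may not provide:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p satisfies  location mod size == 0,  else 0.
   Assumes uintptr_t (C99) is available on the target compiler. */
int is_aligned(const void *p, size_t size)
{
    return ((uintptr_t)p % size) == 0;
}
```

A caller could test is_aligned(ptr, sizeof(int)) and fall back to a
byte-by-byte copy when the result is 0.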
Type Incompatibilities and typedef
The C typedef keyword provides an easy way to write a program to be used
on systems with different data type sizes. Simply define your own type
equivalent to a provided type that has the size you wish to use.
For example, suppose system A implements int as 16 bits and long as 32
bits. System B implements int as 32 bits and long as 64 bits. You want
to use 32-bit integers. Simply declare all your integers as type INT32,
and insert the appropriate typedef on system A:
typedef long INT32;
The code on system B would be:
typedef int INT32;
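Rather than editing the typedef by hand on each system, the choice can
sometimes be made by the preprocessor from the limits in <limits.h>. The
following is one possible sketch of that approach; int32_bits is a
hypothetical helper that is included only to show the result:

```c
#include <limits.h>

/* Pick whichever standard type is exactly 32 bits on this system. */
#if INT_MAX == 2147483647
typedef int INT32;
#elif LONG_MAX == 2147483647
typedef long INT32;
#else
#error "no 32-bit integer type found"
#endif

/* Hypothetical helper: number of bits in an INT32. */
int int32_bits(void)
{
    return (int)(sizeof(INT32) * CHAR_BIT);
}
```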
Conditional Compilation
Using the #ifdef C preprocessor directive and the predefined symbols
__hp9000s300, __hp9000s700, and __hp9000s800, you can group blocks of
system-dependent code for conditional compilation, as shown below:
#ifdef __hp9000s300
:
Series 300/400-specific code goes here...
:
#endif
#ifdef __hp9000s700
:
Series 700-specific code goes here...
:
#endif
#ifdef __hp9000s800
:
Series 700/800-specific code goes here...
:
#endif
If this code is compiled on a Series 300/400 system, only the first block
is compiled; on a Series 700, both the second and third blocks are
compiled; on a Series 800, only the third block is compiled. You can use
this feature to ensure that a program will compile properly on either
Series 300/400 or 700/800.
If you want your code to compile only on the Series 800 but not on the
700, surround your code as follows:
#if (defined(__hp9000s800) && !defined(__hp9000s700))
...
Series 800-specific code goes here...
...
#endif
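The same symbols can be combined in a single #if/#elif chain when exactly
one branch should be selected. The following sketch uses a hypothetical
function, series_name; the order of the tests matters because __hp9000s800
is defined on both the Series 700 and the Series 800:

```c
/* Returns a name for the system the code was compiled on.
   __hp9000s700 is tested before __hp9000s800 because the latter
   is defined on both the Series 700 and the Series 800. */
const char *series_name(void)
{
#if defined(__hp9000s300)
    return "Series 300/400";
#elif defined(__hp9000s700)
    return "Series 700";
#elif defined(__hp9000s800)
    return "Series 800";
#else
    return "unknown";
#endif
}
```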
Isolating System-Dependent Code with #include Files
#include files are useful for isolating the system-dependent code like
the type definitions in the previous section. For instance, if your type
definitions were in a file mytypes.h, to account for all the data size
differences when porting from system A to system B, you would only have
to change the contents of file mytypes.h. A useful set of type
definitions is in /usr/include/model.h.
NOTE If you use the symbolic debugger, xdb, include files used within
union, struct, or array initialization will generate correct code.
However, such use is discouraged because xdb may show incorrect
debugging information about line numbers and source file numbers.
Parameter Lists
On the Series 300/400, parameter lists grow towards higher addresses. On
the Series 700/800, parameter lists are usually stacked towards
decreasing addresses (though the stack itself grows towards higher
addresses). The compiler may choose to pass some arguments through
registers for efficiency; such parameters will have no stack location at
all.
ANSI C function prototypes provide a way of having the compiler check
parameter lists for consistency between a function declaration and a
function call within a compilation unit. lint provides an option (-Aa)
that flags cases where a function call is made in the absence of a
prototype.
The ANSI C <stdarg.h> header file provides a portable method of writing
functions that accept a variable number of arguments. You should note
that <stdarg.h> supersedes the use of the varargs macros. varargs is
retained for compatibility with the pre-ANSI compilers and earlier
releases of HP C/HP-UX. See varargs(5) and vprintf(3S) for details and
examples of the use of varargs.
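As an illustration of the <stdarg.h> interface, the following sketch
defines a function that accepts a variable number of int arguments. The
function name sum_ints and its convention of passing the argument count
first are assumptions made for this example:

```c
#include <stdarg.h>

/* Hypothetical example: adds up `count` int arguments. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int n;

    va_start(ap, count);           /* count is the last named parameter */
    for (n = 0; n < count; n++)
        total += va_arg(ap, int);  /* fetch the next int argument */
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) adds the three values that follow the
count.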
The char Data Type
The char data type defaults to signed. If a char is assigned to an int,
sign extension takes place. A char may be declared unsigned to override
this default. The line:
unsigned char ch;
declares one byte of unsigned storage named ch. On some non-HP-UX
systems, char variables are unsigned by default.
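Code that must behave the same way whether char is signed or unsigned can
convert through unsigned char before widening to int. The following is a
minimal sketch; the function name byte_value is hypothetical:

```c
/* Returns the value of c as a number in the range 0..UCHAR_MAX,
   whether plain char is signed or unsigned on this system. */
int byte_value(char c)
{
    return (int)(unsigned char)c;
}
```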
Register Storage Class
The register storage class is supported on Series 300/400 and 700/800
HP-UX and, if properly used, can reduce execution time. Using this
storage class should not hinder portability. However, its usefulness
will vary between systems, since some compilers ignore it. Refer to the
HP-UX Assembler and Supporting Tools for Series 300/400 for a more
complete description of the use of the register storage class on Series
300/400.
Also, the register storage class declarations are ignored when optimizing
at level 2 or greater on all Series.
Identifiers
To guarantee that code is portable to non-HP-UX systems, note that the
ANSI C standard requires identifier names without external linkage to be
significant to only 31 case-sensitive characters. Names with external
linkage (identifiers that are defined in another source file) need be
significant to only six case-insensitive characters. Typical C
programming practice is to name variables with all lower-case letters,
and #define constants with all upper case.
Predefined Symbols
The symbol __hp9000s300 is predefined on Series 300/400; the symbols
__hp9000s800 and __hppa are predefined on Series 700/800; and
__hp9000s700 is predefined on Series 700 only. The symbols __hpux and
__unix are predefined on all HP-UX implementations.
This is only an issue if you port code to or from systems that have also
predefined these symbols.
Shift Operators
On left shifts, vacated positions are filled with 0. On right shifts of
signed operands, vacated positions are filled with the sign bit
(arithmetic shift). Right shifts of unsigned operands fill vacated bit
positions with 0 (logical shift). Integer constants are treated as
signed unless cast to unsigned. Circular shifts are not supported in any
version of C. Shifts of 32 bits or more give an undefined result.
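Because C has no circular shift operator, a rotate has to be built from two
logical shifts on an unsigned operand. The following is a sketch of a 32-bit
left rotate; the name rotl32 is hypothetical, and the masks keep the result
to 32 bits even where unsigned long is wider:

```c
/* Rotates the low 32 bits of x left by n positions (n taken mod 32).
   The n == 0 case is handled separately so that no shift by 32,
   which would be undefined for a 32-bit operand, is ever performed. */
unsigned long rotl32(unsigned long x, unsigned int n)
{
    x &= 0xFFFFFFFFUL;
    n &= 31U;
    if (n == 0)
        return x;
    return ((x << n) | (x >> (32U - n))) & 0xFFFFFFFFUL;
}
```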
The sizeof Operator
The sizeof operator yields an unsigned int result, as specified in
section 3.3.3.4 of the ANSI C standard (X3.159-1989). Therefore,
expressions involving this operator are inherently unsigned. Do not
expect any expression involving the sizeof operator to have a negative
value (as may occur on some other systems). In particular, logical
comparisons of such an expression against zero may not produce the object
code you expect, as the following example illustrates.
#include <stdio.h>
main()
{
    int i;
    i = 2;
    if ((i-sizeof(i)) < 0)    /* sizeof(i) is 4, but unsigned! */
        printf("test less than 0\n");
    else
        printf("an unsigned expression cannot be less than 0\n");
}
When run, this program will print
an unsigned expression cannot be less than 0
because the expression (i-sizeof(i)) is unsigned since one of its
operands is unsigned (sizeof(i)). By definition, an unsigned number
cannot be less than 0 so the compiler will generate an unconditional
branch to the else clause rather than a test and branch.
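If the intent of such a test is to ask whether a signed value is smaller
than a size, one way to write it is as a comparison instead of a
subtraction, with the negative case handled explicitly. The following is a
sketch; the function name less_than_size is hypothetical:

```c
#include <stddef.h>

/* Returns 1 if i is less than n, treating a negative i as smaller
   than any size; returns 0 otherwise. */
int less_than_size(int i, size_t n)
{
    if (i < 0)
        return 1;
    return (size_t)i < n;   /* i is non-negative, so the cast is safe */
}
```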
Bit-Fields
The ANSI C definition does not prescribe bit-field implementation;
therefore each vendor can implement bit-fields somewhat differently.
This section describes how bit-fields are implemented in HP C.
Bit-fields are assigned from most-significant to least-significant bit on
all HP-UX and Domain systems.
On all HP-UX implementations, bit-fields can be signed or unsigned,
depending on how they are declared.
On the Series 300/400, a bit-field declared without the signed or
unsigned keywords will be signed in ANSI mode and unsigned in
compatibility mode by default.
On the Series 700/800, plain int, char, or short bit-fields declared
without the signed or unsigned keywords will be signed in both
compatibility mode and ANSI mode by default.
On the Series 700/800, and for the most part on the Series 300/400,
bit-fields are aligned so that they cannot cross a boundary of the
declared type. Consequently, some padding within the structure may be
required. As an example,
struct foo
{
unsigned int a:3, b:3, c:3, d:3;
unsigned int remainder:20;
};
For the above struct, sizeof(struct foo) would return 4 (bytes) because
none of the bit-fields straddle a 4 byte boundary. On the other hand,
the following struct declaration will have a larger size:
struct foo2
{
unsigned char a:3, b:3, c:3, d:3;
unsigned int remainder:20;
};
In this struct declaration, the assignment of data space for c must be
aligned so it doesn't violate a byte boundary, which is the normal
alignment of unsigned char. Consequently, two undeclared bits of padding
are added by the compiler so that c is aligned on a byte boundary.
sizeof(struct foo2) returns 6 (bytes) on Series 300/400, and 8 on Series
700/800. (Note, however, that on Domain systems or when using #pragma
HP_ALIGN NATURAL, which uses Domain bit-field mapping, 4 is returned
because the char bit-fields are considered to be ints.)
Bit-fields on HP-UX systems cannot exceed the size of the declared type
in length. The largest possible bit-field is 32 bits. All scalar types
are permissible to declare bit-fields, including enum.
Enum bit-fields are accepted on all HP-UX systems. On Series 300/400 in
compatibility mode they are implemented internally as unsigned integers.
On Series 700/800, however, they are implemented internally as signed
integers so care should be taken to allow enough bits to store the sign
as well as the magnitude of the enumerated type. Otherwise your results
may be unexpected. In ANSI mode, the type of enum bit-fields is signed
int on all HP-UX systems.
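When the bit layout must be identical on every system, one alternative to
bit-fields is to pack the values into an unsigned integer with shifts and
masks. The sketch below stores the fields of struct foo in one 32-bit word,
most-significant field first. The function names are hypothetical, and this
is not how the HP C compiler itself allocates bit-fields:

```c
/* Hypothetical helpers that keep the fields of struct foo in one
   32-bit word using shifts and masks, so that the layout does not
   depend on how a compiler allocates bit-fields.
   Layout, most-significant bit first: a:3 b:3 c:3 d:3 remainder:20 */
unsigned long foo_pack(unsigned a, unsigned b, unsigned c, unsigned d,
                       unsigned long remainder)
{
    return ((unsigned long)(a & 0x7U) << 29)
         | ((unsigned long)(b & 0x7U) << 26)
         | ((unsigned long)(c & 0x7U) << 23)
         | ((unsigned long)(d & 0x7U) << 20)
         | (remainder & 0xFFFFFUL);
}

/* Extracts field c (bits 23 through 25) from a packed word. */
unsigned foo_get_c(unsigned long word)
{
    return (unsigned)((word >> 23) & 0x7UL);
}
```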
Floating-Point Exceptions
HP C on Series 700/800, in accordance with the IEEE standard, does not
trap on floating point exceptions such as division by zero. By contrast,
when using HP C on Series 300/400, floating-point exceptions will result
in the run-time error message Floating exception (core dumped). One way
to handle this error on Series 700/800 is by setting up a signal handler
using the signal system call, and trapping the signal SIGFPE. For
details, see signal(2), signal(5), and "Advanced HP-UX Programming" in
Programming on HP-UX.
For full treatment of floating-point exceptions and how to handle them,
see HP-UX Floating-Point Guide.
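A signal handler for SIGFPE can be installed with the ANSI C signal()
function. The following is a minimal sketch; the names fpe_seen, on_fpe, and
install_fpe_handler are hypothetical. The handler only records that the
signal arrived. A real handler for a hardware-generated exception would
normally not simply return, so treat this as an outline and not as a
complete recovery strategy:

```c
#include <signal.h>

/* Set to 1 by the handler when SIGFPE has been delivered. */
volatile sig_atomic_t fpe_seen = 0;

static void on_fpe(int sig)
{
    (void)sig;          /* signal number is not needed here */
    fpe_seen = 1;
}

/* Hypothetical helper: installs the handler; returns 0 on success
   and -1 if signal() failed. */
int install_fpe_handler(void)
{
    return (signal(SIGFPE, on_fpe) == SIG_ERR) ? -1 : 0;
}
```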
Integer Overflow
In HP C, as in nearly every other implementation of C, integer overflow
does not generate an error. The overflowed number is "rolled over" into
whatever bit pattern the operation happens to produce.
Overflow During Conversion from Floating Point to Integral Type
HP-UX systems will report a floating exception - core dumped at run time
if a floating point number is converted to an integral type and the value
is outside the range of that integral type. As with the error described
previously under "Floating-Point Exceptions," a signal handler that traps
the floating-point exception signal (SIGFPE) can be used. See signal(2)
and signal(5) for details.
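The exception can also be avoided by checking the range before the
conversion is performed. The following is a sketch of such a check for int;
the function name double_to_int_checked is hypothetical, and the test is
deliberately conservative (a value slightly above INT_MAX that would
truncate into range is rejected):

```c
#include <limits.h>

/* Hypothetical helper: converts d to int only when the value is
   within the range of int. Returns 1 and stores the result in *out
   on success; returns 0 and leaves *out untouched otherwise.
   A NaN fails both comparisons and is therefore rejected. */
int double_to_int_checked(double d, int *out)
{
    if (d >= (double)INT_MIN && d <= (double)INT_MAX) {
        *out = (int)d;
        return 1;
    }
    return 0;
}
```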
Structure Assignment
The HP-UX C compilers support structure assignment, structure-valued
functions, and structure parameters. The structs in a struct assignment
s1=s2 must be declared to be the same struct type as in:
struct s s1,s2;
Structure assignment is in the ANSI standard. Prior to the ANSI
standard, it was a BSD extension that some other vendors may not have
implemented.
Structure-Valued Functions
Structure-valued functions support storing the result in a structure:
s = fs();
All HP-UX implementations allow direct field dereferences of a structure-
valued function. For example:
x = fs().a;
Structure-valued functions are ANSI standard. Prior to the ANSI
standard, they were a BSD extension that some vendors may not have
implemented.
Dereferencing Null Pointers
Dereferencing a null pointer has never been defined in any C standard.
Kernighan and Ritchie's The C Programming Language and the ANSI C
standard both warn against such programming practice. Nevertheless, some
versions of C permit dereferencing null pointers.
Dereferencing a null pointer returns a zero value on all HP-UX systems.
The Series 700/800 C compiler provides the -z compile line option, which
causes the signal SIGSEGV to be generated if the program attempts to read
location zero. Using this option, a program can "trap" such reads.
Since some programs written on other implementations of UNIX rely on
being able to dereference null pointers, you may have to change code to
check for a null pointer. For example, change:
if (*ch_ptr != '\0')
to:
if ((ch_ptr != NULL) && *ch_ptr != '\0')
Writes of location zero may be detected as errors even if reads are not.
If the hardware cannot assure that location zero acts as if it were
initialized to zero or is locked at zero, the hardware acts as if the -z
flag is always set.
Expression Evaluation
The order of evaluation for some expressions will differ between HP-UX
implementations. This does not mean that operator precedence is
different. For instance, in the expression:
x1 = f(x) + g(x) * 5;
f may be evaluated before or after g, but g(x) will always be multiplied
by 5 before it is added to f(x). Since there is no C standard for order
of evaluation of expressions, you should avoid relying on the order of
evaluation when using functions with side effects or using function calls
as actual parameters. You should use temporary variables if your program
relies upon a certain order of evaluation.
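The use of temporary variables can be sketched as follows. The functions f
and g here are hypothetical stand-ins that share a counter, so their results
depend on which one is called first. Assigning each result to a temporary
fixes that order:

```c
static int calls = 0;   /* shared state changed by both functions */

static int f(int x) { calls++; return x + calls; }
static int g(int x) { calls++; return x * calls; }

/* Forces f to be called before g by using temporaries. */
int ordered_eval(int x)
{
    int a, b;

    calls = 0;
    a = f(x);        /* always the first call  */
    b = g(x);        /* always the second call */
    return a + b * 5;
}
```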
Variable Initialization
On some C implementations, auto (non-static) variables are implicitly
initialized to 0. This is not the case on HP-UX and it is most likely
not the case on other implementations of UNIX. Don't depend on the system
initializing your local variables; it is not good programming practice in
general and it makes for nonportable code.
Conversions between unsigned char or unsigned short and int
All HP-UX C implementations, when used in compatibility mode, are
unsigned preserving. That is, in conversions of unsigned char or
unsigned short to int, the conversion process first converts the number
to an unsigned int. This contrasts with C implementations that are
value preserving (that is, unsigned char and unsigned short terms are
converted to int whenever int can represent all of their values).
Consider the following program:
#include <stdio.h>
main()
{
    int i = -1;
    unsigned char uc = 2;
    unsigned int ui = 2;
    if (uc > i)
        printf("Value preserving\n");
    else
        printf("Unsigned preserving\n");
    if (ui < i)
        printf("Unsigned comparisons performed\n");
}
On HP-UX systems in compatibility mode, the program will print:
Unsigned preserving
Unsigned comparisons performed
In contrast, ANSI C specifies value preserving; so in ANSI mode, all
HP-UX C compilers are value preserving. The same program, when compiled
in ANSI mode, will print:
Value preserving
Unsigned comparisons performed
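Code that must give the same answer in both modes can make the intended
conversion explicit with a cast, so the result no longer depends on the
promotion rule in effect. The following is a sketch; the function name
uc_greater_than is hypothetical:

```c
/* Compares uc with i as signed values. The explicit cast to int
   makes the comparison signed under either promotion rule. */
int uc_greater_than(unsigned char uc, int i)
{
    return (int)uc > i;
}
```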
Temporary Files ($TMPDIR)
All HP-UX C compilers produce a number of intermediate temporary files
for their private use during the compilation process. These files are
normally invisible to you since they are created and removed
automatically. If, however, your system is tightly constrained for file
space, these files, which are usually generated on /tmp or /usr/tmp, may
exceed the available space. By assigning another directory to the TMPDIR
environment variable you can redirect these temporary files. See the cc
manual page for details.
Compile Line Options
There are some minor differences in HP C Series 700/800 and Series
300/400 compiler options. You may have to modify makefiles if they use
any of the options listed in the following table. Be aware that the
purpose of the table below is only to point out differences between
implementations.
Table 5-1. Differences in C Compile Line Options
--------------------------------------------------------------------------------------------
| | |
| Option | Difference |
| | |
--------------------------------------------------------------------------------------------
| | |
| +a | Series 700/800 only. |
| | |
| +bfpa | Series 300/400 only. |
| | |
| +DA | Series 700/800 only. |
| | |
| +dfname | Series 700/800 only. |
| | |
| +DS | Series 700/800 only. |
| | |
| +e | System-dependent options. |
| | |
| +f | Series 700/800 only. |
| | |
| +ESlit | Series 700/800 only. |
| | |
| +ESsfc | Series 700/800 only. |
| | |
| +ffpa | Series 300/400 only. |
| | |
| +FPflags | Series 700/800 only. |
| | |
| +I | Series 700/800 only. |
| | |
| +L | Series 700/800 only. |
| | |
| +Lp | Series 700/800 only. |
| | |
| +M | Series 300/400 only. |
| | |
| +m | Series 700/800 only. |
| | |
| +N | Series 300/400 only. |
| | |
| -N | Such executables cannot be executed by exec on Series 700/800. |
| | |
| +Oopt | Semantics differ. |
| | |
| +o | Series 700/800 only. |
| | |
| +P | Series 700/800 only. |
| | |
| +pgmname | Series 700/800 only. |
| | |
| +Rn | Series 700/800 only. |
| | |
| +r | Series 700/800 only. |
| | |
| +ubytes | Series 700/800 only. |
| | |
| -W | System-dependent options. |
| | |
| +wn | Series 700/800 only. |
| | |
| +opt | System-dependent options. |
| | |
| -Z | Not supported on Series 300/400. Is the default on Series 700/800. |
| | |
| -z | Series 700/800 only. |
| | |
--------------------------------------------------------------------------------------------
Input/Output
Since the C language definition provides no I/O capability, it depends on
library routines supplied by the host system. Data files produced by
using the HP-UX calls write(2) or fwrite(3) should not be expected to be
portable between different system implementations. Byte ordering and
structure packing rules will make the bits in the file system-dependent,
even though identical routines are used. When in doubt, move data files
using ASCII representations (as from printf(3)), or write translation
utilities that deal with the byte ordering and alignment differences.
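A translation routine of the kind mentioned above can be as simple as
storing each integer one byte at a time in a fixed order. The following
sketch writes and reads a 32-bit value most-significant byte first; the
function names put_u32_be and get_u32_be are hypothetical:

```c
/* Stores the low 32 bits of v in buf[0..3], most-significant byte
   first, regardless of the byte order of the host. */
void put_u32_be(unsigned char *buf, unsigned long v)
{
    buf[0] = (unsigned char)((v >> 24) & 0xFFUL);
    buf[1] = (unsigned char)((v >> 16) & 0xFFUL);
    buf[2] = (unsigned char)((v >> 8) & 0xFFUL);
    buf[3] = (unsigned char)(v & 0xFFUL);
}

/* Rebuilds the value written by put_u32_be(). */
unsigned long get_u32_be(const unsigned char *buf)
{
    return ((unsigned long)buf[0] << 24)
         | ((unsigned long)buf[1] << 16)
         | ((unsigned long)buf[2] << 8)
         |  (unsigned long)buf[3];
}
```

The four-byte buffer, and not the in-memory integer or structure, would
then be passed to fwrite(3) or write(2).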
Checking for Standards Compliance
To check for compliance with a particular standard, you can use the lint
program with one of the following -D options:
* -D_XOPEN_SOURCE
* -D_POSIX_SOURCE
For example, the command
lint -D_POSIX_SOURCE file.c
checks the source file file.c for compliance with the POSIX standard.
If you have the HP Advise product, you can also check for C standard
compliance using the apex command.