COMMUNICATOR 3000 MPE/iX Release 5.0 (Core Software Release X.50.20)

Byte-Stream Emulation 

by Steve Elmer 
Commercial Systems Division 

To facilitate transparent file sharing, the file system supports several
emulators.  The traditional MPE fixed and variable record formats can be
emulated as the byte-stream record format.  In addition, the byte-stream
record format can be emulated as the variable record format.  File
sharing on MPE/iX is transparent since the file system automatically
binds in the appropriate emulator when necessary.

What is Emulation? 

According to Webster, to emulate is "to try to equal or surpass." In this
context, we are attempting to make native file formats equally accessible
to non-native accessors.  In other words, we would like applications
that understand only the byte-stream format to be able to operate on MPE
record format files, and vice versa.

To understand the emulators, it is essential to understand the underlying
internal layout for each record format.  To do so, one must view each
file as a simple array of bytes.  The fixed record format imposes a
simple structure on its array of bytes--every n bytes comprises a single
record and no extra bytes are evident.  The byte-stream record format
imposes no structure whatsoever on its array of bytes; however,
convention dictates that the linefeed character delimits records and that
records can be of any length.  The variable record format is the most
complex because each record contains from 2 to 4 bytes of descriptive
information and/or dead space.  Furthermore, the variable record format
imposes a second level structure on blocks which causes more bytes to be
used for description and/or dead space.
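
As a rough illustration of the first two layouts, the sketch below treats
a file as a plain array of bytes and recovers records from it.  It is
written in Python rather than against any MPE intrinsic, and the function
names are hypothetical.  The variable format is left out because its
descriptors and block structure are not spelled out here.

```python
def fixed_records(data: bytes, n: int) -> list[bytes]:
    """Fixed format: every n bytes is one record; nothing else is stored."""
    if n <= 0:
        raise ValueError("record size must be positive")
    return [data[i:i + n] for i in range(0, len(data), n)]


def stream_records(data: bytes) -> list[bytes]:
    """Byte-stream format: by convention a linefeed ends each record."""
    records = data.split(b"\n")
    # A trailing linefeed leaves an empty final piece that is not a record.
    if records and records[-1] == b"":
        records.pop()
    return records


print(fixed_records(b"abc def ghi ", 4))
print(stream_records(b"abc\nlonger record\n"))
```

Note that the fixed records all have the same length, while the
byte-stream records are as long as the data between linefeeds.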

When emulating one record format as another, the idiosyncrasies of the
native format must be transformed into the idiosyncrasies of the emulated
format.  To understand this better, the concept of a virtual file is
useful.  The virtual file is the file that would result if the source
file were actually transformed into the target format.  For example, to
translate from byte-stream to variable, all the bytes between linefeed
characters must be fit into records inside blocks in the variable record
file.  The resulting file would have record length descriptors, pad
bytes, and block terminators appropriate to variable record files.  If
there were an editor which could handle both record formats, the files
would look identical when viewed through that editor.

Unfortunately, we can't simply translate the source file into a target
file and claim to have done our job.  To do so would imply that there
were actually multiple copies of the file on the system, each with a
different record format.  Clearly, this is too unwieldy and doesn't
really solve the problem.  For this reason, the virtual file is simply a
view imposed upon the source file, without changing its internal layout
in any way.  Thus, when emulating a fixed record format file to
byte-stream, a read() request returns a record delimited with a linefeed
even though the source file contains no linefeed characters.
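
The idea of a view over an unchanged source file can be sketched as
follows.  This is a toy model in Python, not the file system's
implementation; the class name, the blank fill character, and the
stripping of trailing fill (discussed later in this article) are
assumptions made for the illustration.

```python
class FixedAsByteStream:
    """Read-only byte-stream view of fixed-record data (illustrative model)."""

    def __init__(self, source: bytes, record_size: int, fill: bytes = b" "):
        self.source = source            # the native data; never rewritten
        self.record_size = record_size  # n bytes per fixed record
        self.fill = fill                # assumed fill character (blank)
        self.next_record = 0            # index of the next record to return

    def read(self) -> bytes:
        """Return the next record as byte-stream data, or b"" at EOF."""
        start = self.next_record * self.record_size
        if start >= len(self.source):
            return b""
        raw = self.source[start:start + self.record_size]
        self.next_record += 1
        # Drop trailing fill and supply the linefeed the source never stored.
        return raw.rstrip(self.fill) + b"\n"


source = b"abc  de   "                  # two 5-byte records, blank filled
view = FixedAsByteStream(source, record_size=5)
print(view.read(), view.read(), view.read())
```

Each read() hands back a linefeed-terminated record, yet the source bytes
are never modified and contain no linefeed.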

The concept of the virtual file encompasses more than just the data
layouts within files.  Also included are such attributes as record size,
blocking factor, EOF offset, and file limit.  Thus, a fixed record file
with an EOF of 3 records could appear to have 213 bytes when emulated as
byte-stream (for instance, 70 data bytes plus one linefeed per record).
FFILEINFO must report the EOF value as 213 for emulated
byte-stream accessors, just as it would for a byte-stream file with 3
records and 213 characters.  In a similar vein, the FPOINT and FSPACE
intrinsics must position the data pointer within the virtual file by
finding the corresponding position in the source file.
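
The arithmetic behind a virtual EOF and a virtual data pointer can be
sketched like this.  The helper names are hypothetical, and the 80-byte
records holding 70 bytes of data are an assumption chosen so that the
total comes to 213; the record size behind that figure is not stated
above.

```python
def record_lengths(source: bytes, record_size: int, fill: bytes = b" ") -> list[int]:
    """Virtual length of each record: data without trailing fill, plus one LF."""
    return [len(source[s:s + record_size].rstrip(fill)) + 1
            for s in range(0, len(source), record_size)]


def virtual_eof(source: bytes, record_size: int, fill: bytes = b" ") -> int:
    """EOF, in bytes, as a byte-stream accessor would see it."""
    return sum(record_lengths(source, record_size, fill))


def locate(source: bytes, record_size: int, virtual_offset: int,
           fill: bytes = b" ") -> tuple[int, int]:
    """Map a virtual byte offset to (record index, column in that record).

    A column equal to the record's data length denotes the virtual
    linefeed, which has no byte of its own in the source file.
    """
    if virtual_offset < 0:
        raise ValueError("offset must not be negative")
    base = 0
    for index, length in enumerate(record_lengths(source, record_size, fill)):
        if virtual_offset < base + length:
            return index, virtual_offset - base
        base += length
    raise ValueError("offset is at or past the virtual EOF")


source = (b"x" * 70 + b" " * 10) * 3    # three assumed 80-byte records
print(virtual_eof(source, 80))          # 3 * (70 + 1)
print(locate(source, 80, 71))           # first byte of the second record
```

The source offset of a located data byte is then the record index times
the record size, plus the column.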


Important Details: Please Read

Did you infer that FLABELINFO only returns information from the view of
the native record format of the file?  The reason FLABELINFO can't
pretend to see the file in non-native formats is that it doesn't have any
way to know that the named file is being emulated somewhere.  Emulation
is bound in at open time for a particular file descriptor.

Caveats

At this point, one might be thinking "this sounds great, any file can be
perfectly viewed as any record format!"  Alas, it is not so.  Both the
fixed and the variable record formats have distinctive features which
prevent perfect emulation:

*  The maximum record size limit interferes with attempts to write large
   byte-stream records into files with fixed and variable record
   formats.

*  Appending data to the file alternately from native and emulated views
   causes spurious linefeeds from the emulator's point of view.

*  Writing to the middle of the virtual file can cause unpredictable
   results, for reasons detailed below (see "Fixed to Byte-Stream" and
   "Variable to Byte-Stream").

When the byte-stream view is used to write records larger than the
maximum, the record is broken down into multiple smaller records.  Later,
when the file is viewed again, spurious linefeed characters will have
appeared at the sub-record boundaries.

One feature of the byte-stream view is that records aren't terminated
until the linefeed character is explicitly written to the file.  Thus,
the following sequence of writes results in the file shown:

     write('abc');
     write('def');
     write('ghi\n');

     "abcdefghi\n"

When the underlying file is either fixed record or variable record, the
behavior is not so simple.  If a native write were inserted between the
second and third writes shown above, the file contents are handled as
though the linefeed occurred after the "f":

     write('abc');
     write('def');
     FWRITE('MPE TEXT');
     write('ghi\n');

     "abcdef\nMPE TEXT\nghi\n"

This handling is consistent from the MPE view since MPE applications only
deal in complete records.  The emulators implicitly cause the preceding
partial record to be considered a complete record.
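
Both effects described above can be reproduced with a small toy model.
The sketch below is Python, not MPE code; the class and method names are
hypothetical, the underlying file is reduced to a list of complete
records, and buffering the partial record in memory is a simplification
made for the illustration.

```python
class RecordFile:
    """Toy record-oriented file written through two views (illustrative)."""

    def __init__(self, max_record: int):
        self.max_record = max_record
        self.records: list[bytes] = []  # complete records, in file order
        self.pending = b""              # stream data not yet ended by a linefeed

    def _store(self, record: bytes) -> None:
        # An oversized record is broken down into several smaller records.
        for i in range(0, max(len(record), 1), self.max_record):
            self.records.append(record[i:i + self.max_record])

    def write(self, data: bytes) -> None:
        """Emulated byte-stream write: a record ends only at a linefeed."""
        *complete, self.pending = (self.pending + data).split(b"\n")
        for record in complete:
            self._store(record)

    def fwrite(self, record: bytes) -> None:
        """Native write: a whole record, so any partial record is completed."""
        if self.pending:
            self._store(self.pending)
            self.pending = b""
        self.records.append(record)

    def stream_view(self) -> bytes:
        """The file as a byte-stream accessor would later read it."""
        return b"".join(r + b"\n" for r in self.records)


f = RecordFile(max_record=80)
f.write(b"abc")
f.write(b"def")
f.fwrite(b"MPE TEXT")
f.write(b"ghi\n")
print(f.stream_view())                  # linefeed appears after the "f"

g = RecordFile(max_record=4)
g.write(b"abcdefghij\n")
print(g.stream_view())                  # linefeeds at sub-record boundaries
```

In the second case, one 10-byte record written through the byte-stream
view comes back as three records once it has passed through a file whose
maximum record size is 4.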

Transparent Binding

The file system has some built-in rules about when to use an emulator
view versus when to use the native view:

*  The byte-stream view is the default for POSIX applications since the
   POSIX C library open() function always requests this view on the
   application's behalf.

*  The variable record view for native byte-stream files is the default
   for all applications other than POSIX applications.  This is the
   first instance of MPE/iX binding an emulator by default rather than
   the native view.  This was done to allow maximum co-existence with
   traditional MPE applications.

*  The byte-stream view may be specifically requested using option 77 of
   HPFOPEN.

*  All other access methods use the native view by default.

Idiosyncrasies

Fixed to Byte-Stream.  The native MPE access methods add fill characters
to each record written to round it out to the record size.  These
characters are required to reside in the file's data since a byte's file
offset determines which record it is in.  The fill characters in the
fixed format file are not a part of the virtual byte-stream file.
Therefore, the fill characters must be stripped from each record and a
linefeed added before presenting the data back to the byte-stream view.

Unfortunately, for records preceding the EOF record, we cannot tell the
difference between fill characters added by the native view and the same
character written by the emulated view.  Therefore, when the emulator
writes a record with trailing fill characters, they do not appear in the
resulting virtual file.

Consider also that record boundaries in the native file determine the
placement of linefeed characters in the virtual file.  Therefore,
emulated writes must also insert fill characters into the file data so
that the virtual file has linefeed characters in the appropriate places.

Now for the zinger!  Fixed record files allow writes to the middle of the
file without changing the EOF.  This can cause a record to have a
different number of fill characters after the write than it had before.
The net result is that all subsequent characters have different offsets
in the virtual file!  This is a rather nasty consequence of the emulation
which is impossible to predict from within the byte-stream view.  Sorry,
we haven't found any way to nullify this effect.

Variable to Byte-Stream.  Writing to the middle of a variable record file
causes the EOF to be cut back to the end of that record.  This is a
feature imposed by the variable record format because preserving data
down-stream from this write requires prohibitive overhead.  Overlaying
the data in the middle of the file can have down-stream effects all the
way to the end of the file.

Byte-Stream to Variable.  Since a true byte-stream file is just an array
of bytes, each record can be as large as desired.  The entire file can be
just one record!  When emulating such a file back as a variable record
file, a maximum record size must be chosen.  The problem with this is
that the maximum record size is not known.  Our solution was to return a
maximum record size of 8192, which we hope is larger than most files'
largest records.  Our objective was to optimize access while causing the
greatest number of traditional MPE applications to perform correctly with
no modifications.  A file with records larger than approximately 8192
appears to be truncated in this view.

Parting Words

Although the emulators have some difficult corner cases to deal with, in
practice none of the drawbacks occur frequently.  Most applications
either don't write huge records, or the effects of added linefeeds are
negligible.  Many applications have no need to write to the middle of a
file; typically the entire file is rewritten.  In fact, it was quite a
thrill to bring CATALOG.PUB.SYS up into the shell's vi editor and be able
to do paging, searches and gotos without any mis-steps.

The existence of the emulators enables traditional MPE applications to
smoothly interoperate with POSIX applications with little or no recoding
required.  You will see many benefits of our effort to integrate POSIX
applications smoothly into the MPE world.

