tar: format of tar archives
File Format
This document describes the format of archives read and written by the tar
utility, and by pax using the -xtar option. MKS Toolkit's tar utility supports
both the older UNIX-compatible TAR formats and the new USTAR format defined in
the POSIX (IEEE P1003.1) standard. The new USTAR format allows more information
to be stored and supports longer file path names.
A tar archive, in either format, consists of one or more blocks, which are used
to represent member files. Each block is 512 bytes long; the -b option to tar
can indicate how many of these blocks are read and/or written at once.
Each member file consists of a header block (as described later in this page)
followed by 0 or more blocks containing the file contents. The end of the
archive is indicated by two blocks filled with binary zeros. Unused space in
the header is left as binary zeros.
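The block layout described above can be sketched in Python as a loop that reads one header block, skips the content blocks that follow it, and stops at a zero-filled block. This is a minimal sketch, not the tar implementation: the helper names blocks_for and iter_members are hypothetical, the byte offsets of the name and size fields are taken from Table 1, and the loop stops at the first zero block instead of checking for two.

```python
import io

BLOCK = 512  # every tar block is 512 bytes long


def blocks_for(size):
    """Number of 512-byte content blocks needed to hold `size` bytes."""
    return (size + BLOCK - 1) // BLOCK


def iter_members(stream):
    """Yield (name, size) for each member of a seekable tar stream."""
    while True:
        header = stream.read(BLOCK)
        # Simplification: treat the first zero-filled block (or a short
        # read) as the end of the archive; a real archive ends with two.
        if len(header) < BLOCK or header == bytes(BLOCK):
            return
        # name occupies bytes 0..99, size occupies bytes 124..135.
        name = header[0:100].split(b"\0", 1)[0].decode("ascii")
        digits = header[124:136].strip(b" \0").decode("ascii")
        size = int(digits or "0", 8)
        yield name, size
        # Skip the content blocks, including padding in the last one.
        stream.seek(blocks_for(size) * BLOCK, io.SEEK_CUR)
```

A 5-byte file therefore occupies two blocks in the archive: one header block and one content block padded with zeros.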
The header information in a block is stored in a printable ASCII form, so that
tar archives are easily ported to different environments. If the contents of
the files on the archive are all ASCII, the entire archive is ASCII.
Table 1 shows the format of the header block for a file, in the older
UNIX-compatible TAR format.
Field Width | Field Name | Meaning |
100 | name | name of file |
8 | mode | file mode |
8 | uid | owner user ID |
8 | gid | owner group ID |
12 | size | length of file in bytes |
12 | mtime | modify time of file |
8 | chksum | checksum for header |
1 | link | indicator for links |
100 | linkname | name of linked file |
Table 1: tar Header Block (TAR Format)
The link field is 1 for a linked file, 2 for a symbolic link, and 0 otherwise.
A directory is indicated by a trailing slash (/) in its name.
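As an illustration of Table 1 and the link rules above, the following Python sketch unpacks the first 257 bytes of a header block with the standard struct module and classifies the member. The names OldHeader, parse_old_header, and member_kind are hypothetical, and treating any link value other than 1 or 2 as "otherwise" is an assumption.

```python
import struct
from collections import namedtuple

# Field widths from Table 1, in order; together they cover 257 bytes.
OLD_TAR = struct.Struct("=100s8s8s8s12s12s8s1s100s")
OldHeader = namedtuple(
    "OldHeader", "name mode uid gid size mtime chksum link linkname"
)


def parse_old_header(block):
    """Split a 512-byte header block into the raw Table 1 fields."""
    if len(block) != 512:
        raise ValueError("a tar header block is exactly 512 bytes")
    return OldHeader._make(OLD_TAR.unpack_from(block))


def member_kind(hdr):
    """Classify a member using the link field and the trailing slash."""
    name = hdr.name.split(b"\0", 1)[0]
    if hdr.link == b"1":
        return "link"
    if hdr.link == b"2":
        return "symlink"
    # Assumption: any other link value (0 or null) means "otherwise".
    if name.endswith(b"/"):
        return "directory"
    return "file"
```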
For the new USTAR format, headers take on the format shown in Table 2. Note that
tar can determine that the USTAR format is being used by the presence of the
null-terminated string "ustar" in the magic field. All fields before the magic
field correspond to those of the older format described earlier, except that
the typeflag replaces the link field.
Field Width | Field Name | Meaning |
100 | name | name of file |
8 | mode | file mode |
8 | uid | owner user ID |
8 | gid | owner group ID |
12 | size | length of file in bytes |
12 | mtime | modify time of file |
8 | chksum | checksum for header |
1 | typeflag | type of file |
100 | linkname | name of linked file |
6 | magic | USTAR indicator |
2 | version | USTAR version |
32 | uname | owner user name |
32 | gname | owner group name |
8 | devmajor | device major number |
8 | devminor | device minor number |
155 | prefix | prefix for file name |
Table 2: tar Header Block (USTAR Format)
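The widths in Table 2 determine where each field starts, so the offset of the magic field can be derived instead of hard-coded. The Python sketch below does that and then checks for the null-terminated string "ustar". The names USTAR_FIELDS, field_offsets, split_header, and is_ustar are hypothetical, and the strict comparison is an assumption: archives written by other tools may fill the magic field differently.

```python
# (field name, width in bytes), in the order given in Table 2.
USTAR_FIELDS = [
    ("name", 100), ("mode", 8), ("uid", 8), ("gid", 8),
    ("size", 12), ("mtime", 12), ("chksum", 8), ("typeflag", 1),
    ("linkname", 100), ("magic", 6), ("version", 2),
    ("uname", 32), ("gname", 32), ("devmajor", 8), ("devminor", 8),
    ("prefix", 155),
]


def field_offsets(fields):
    """Map each field name to its (start, end) byte range in the header."""
    offsets = {}
    pos = 0
    for name, width in fields:
        offsets[name] = (pos, pos + width)
        pos += width
    return offsets


OFFSETS = field_offsets(USTAR_FIELDS)


def split_header(block):
    """Return the raw bytes of every Table 2 field in a header block."""
    return {name: block[a:b] for name, (a, b) in OFFSETS.items()}


def is_ustar(block):
    """True if the magic field holds the null-terminated string "ustar"."""
    start, end = OFFSETS["magic"]
    return block[start:end] == b"ustar\0"
```

The fields add up to 500 bytes, which leaves the last 12 bytes of the 512-byte header block unused.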
This information is compatible with that returned by the UNIX stat() function;
see also stat. The magic, uname, and gname fields are null-terminated character
strings. The fields name, linkname, and prefix are null-terminated unless the
full field is used to store a name (that is, the last character is not null).
All other fields are zero-filled octal numbers, in ASCII. Trailing nulls are
present for these numbers, except for the size, mtime, and version fields.
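The two field encodings described above, null-terminated strings and zero-filled octal numbers, could be decoded with helpers along these lines. This is a sketch under stated assumptions: parse_string and parse_octal are hypothetical names, numeric fields are assumed to be padded only with spaces or nulls, and an empty numeric field is read as 0.

```python
def parse_string(field):
    """Decode a string field, stopping at the first null if there is one.

    A name that fills the whole field has no terminator, so the full
    field is returned in that case.
    """
    return field.split(b"\0", 1)[0].decode("ascii", "replace")


def parse_octal(field):
    """Decode a zero-filled ASCII octal field such as mode or size."""
    digits = field.strip(b" \0").decode("ascii")
    # Assumption: an all-null or all-blank field means zero.
    return int(digits, 8) if digits else 0
```

For example, a mode field containing the bytes "0000644" followed by a null decodes to 420, which is 644 in octal.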
The name field contains the name of the archived file. On USTAR format
archives, the value of the prefix field, if non-null, is prefixed to the name
field to allow names longer than 100 characters. For compatibility with older
tar commands, the MKS Toolkit version of tar leaves prefix null unless the file
name exceeds 100 characters.
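Rebuilding the full path from the prefix and name fields might look like the Python sketch below. The function name full_name is hypothetical. The slash inserted between the two parts follows the POSIX USTAR convention and is not stated in the text above, so treat it as an assumption about how this tar joins them.

```python
def full_name(name_field, prefix_field=b""):
    """Combine the raw prefix and name fields into one path."""
    # Both fields are null-terminated unless the name fills the field.
    name = name_field.split(b"\0", 1)[0]
    prefix = prefix_field.split(b"\0", 1)[0]
    if prefix:
        # Assumption (POSIX USTAR convention): prefix + "/" + name.
        return (prefix + b"/" + name).decode("ascii")
    return name.decode("ascii")
```

With a null prefix the result is just the name field, which is why archives whose names fit in 100 characters remain readable by older tar commands.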
The size field is 0 if the header describes a link.
The chksum field is a checksum of all the bytes in the header, assuming that
the chksum field itself is all blanks.
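The checksum rule can be sketched in Python as follows. The function names are hypothetical; the sketch assumes the checksum is a simple sum of unsigned byte values, that "blanks" means ASCII spaces, and that chksum occupies bytes 148 to 155 of the header as implied by the field widths in the tables.

```python
CHKSUM_START, CHKSUM_END = 148, 156  # byte range of the chksum field


def header_checksum(block):
    """Sum all header bytes, counting the chksum field as 8 spaces."""
    if len(block) != 512:
        raise ValueError("a tar header block is exactly 512 bytes")
    return (
        sum(block[:CHKSUM_START])
        + (CHKSUM_END - CHKSUM_START) * ord(" ")
        + sum(block[CHKSUM_END:])
    )


def checksum_matches(block):
    """Compare the stored octal chksum with the recomputed value."""
    digits = block[CHKSUM_START:CHKSUM_END].strip(b" \0").decode("ascii")
    if not digits:
        return False
    return int(digits, 8) == header_checksum(block)
```

Because the stored value is replaced by blanks before summing, the checksum can be verified without first clearing the field.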
For USTAR, the typeflag field is a compatible extension of the link field of
the older TAR format. Table 3 shows the values that are recognized.
Type Flag | File Type |
0 or null | Regular file |
1 | Link to another file already archived |
2 | Symbolic link |
3 | Character special device |
4 | Block special device |
5 | Directory |
6 | FIFO special file |
7 | Reserved |
A-Z | Available for custom usage |
Table 3: Type Flag Values for USTAR Format Files
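Table 3 translates naturally into a lookup table. The Python sketch below is illustrative only: the names TYPEFLAGS and file_type are hypothetical, and returning "unknown" for values outside the table is an assumption.

```python
# Type flag byte -> file type, as listed in Table 3.
TYPEFLAGS = {
    b"0": "regular file",
    b"\0": "regular file",
    b"1": "link to another file already archived",
    b"2": "symbolic link",
    b"3": "character special device",
    b"4": "block special device",
    b"5": "directory",
    b"6": "FIFO special file",
    b"7": "reserved",
}


def file_type(typeflag):
    """Describe a one-byte typeflag value."""
    if typeflag in TYPEFLAGS:
        return TYPEFLAGS[typeflag]
    if len(typeflag) == 1 and b"A" <= typeflag <= b"Z":
        return "custom usage"
    # Assumption: anything else is not recognized.
    return "unknown"
```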
In USTAR format, the uname and gname fields contain the name of the owner and
group of the file respectively.
tar archives are fully compatible between UNIX and Windows systems because all
header information is represented in ASCII.
The ASCII digit 7 is commonly used in the typeflag field to indicate contiguous
files. The use of 2 to indicate a symbolic link is particular to some UNIX
versions. These common extensions are mentioned in the POSIX (IEEE P1003.1)
standard.
- Commands: cpio, pax, tar
- File Formats: cpio