diff --git a/docs/manual/format.md b/docs/manual/format.md
index 0325c4123e..553ce90b6f 100644
--- a/docs/manual/format.md
+++ b/docs/manual/format.md
@@ -2,13 +2,13 @@
 layout: default
 title: rpm.org - RPM Package format
 ---
+
 # Package format
 
-This document describes the RPM file format version 3.0, which is used
-by RPM versions 2.1 and greater. The format is subject to change, and
-you should not assume that this document is kept up to date with the
-latest RPM code. That said, the 3.0 format should not change for
-quite a while, and when it does, it will not be 3.0 anymore :-).
+This document describes the RPM file format version 4.0. The format is subject
+to change, and you should not assume that this document is kept up to date with
+the latest RPM code. That said, the basic principles have not changed and are
+not likely to change significantly over time.
 
 \warning In any case, THE PROPER WAY TO ACCESS THESE STRUCTURES IS THROUGH
 THE RPM LIBRARY!!
@@ -23,17 +23,20 @@ package file is divided in 4 logical sections:
 . Payload -- compressed archive of the file(s) in the package (aka "payload")
 ```
 
-All 2 and 4 byte "integer" quantities (int16 and int32) are stored in
-network byte order. When data is presented, the first number is the
-byte number, or address, in hex, followed by the byte values in hex,
-followed by character "translations" (where appropriate).
+All 2 and 4 byte "integer" quantities (int16 and int32) are stored in network
+byte order (big-endian). When data is presented, the first number is the byte
+number, or address, in hex, followed by the byte values in hex, followed by
+character "translations" (where appropriate).
 
 ## Lead
 
-The Lead is basically for file(1). All the information contained in
-the Lead is duplicated or superceded by information in the Header.
-Much of the info in the Lead was used in old versions of RPM but is
-now ignored. The Lead is stored as a C structure:
+The Lead is basically for file(1). All the information contained in the Lead
+is duplicated or superseded by information in the Header. Much of the info in
+the Lead was used in old versions of RPM but is now ignored. The details here
+are left for historical reasons, but current and future development should
+use the Header structure instead.
+
+The Lead is stored as a C structure:
 
 \code
 struct rpmlead {
@@ -48,31 +51,31 @@ struct rpmlead {
 };
 \endcode
 
-and is illustrated with one pulled from the rpm-2.1.2-1.i386.rpm
-package:
+and is illustrated with one pulled from the rpm-2.1.2-1.i386.rpm package:
 
 ```
 00000000: ed ab ee db 03 00 00 00
 ```
 
-The first 4 bytes (0-3) are "magic" used to uniquely identify an RPM
-package. It is used by RPM and file(1). The next two bytes (4, 5)
-are int8 quantities denoting the "major" and "minor" RPM file format
-version. This package is in 3.0 format. The following 2 bytes (6-7)
-form an int16 which indicates the package type. As of this writing
-there are only two types: 0 == binary, 1 == source.
+The first 4 bytes (0-3) are the "magic" number used to uniquely identify a file
+as an RPM package. It is used by RPM and file(1). The next two bytes (4, 5)
+are int8 quantities denoting the "major" and "minor" RPM file format version.
+For legacy reasons, this version is always "3.0" (major version "3", minor
+version "0"), even with packages built by RPM 4.0+ (referred to as RPM v4
+packages). The following 2 bytes (6-7) form an int16 which indicates the
+package type. As of this writing there are only two types: 0 == binary,
+1 == source.
 
 ```
 00000008: 00 01 72 70 6d 2d 32 2e  ..rpm-2.
 ```
 
-The next two bytes (8-9) form an int16 that indicates the architecture
-the package was built for. While this is used by file(1), the true
-architecture is stored as a string in the Header. See, lib/misc.c for
-a list of architecture->int16 translations. In this case, 1 == i386.
-Starting with byte 10 and extending to byte 75, are 65 characters and
-a null byte which contain the familiar "name-version-release" of the
-package, padded with null (0) bytes.
+The next two bytes (8-9) form an int16 that indicates the architecture that the
+package was built for. While this is used by file(1), the true architecture
+is stored as a string in the Header. In this case, 1 == i386. Starting with
+byte 10 and extending to byte 75, are 65 characters and a null byte which
+contain the familiar "name-version-release" of the package, padded with null
+(0) bytes.
 
 ```
 00000010: 31 2e 32 2d 31 00 00 00  1.2-1...
@@ -85,88 +88,72 @@ package, padded with null (0) bytes.
 00000048: 00 00 00 00 00 01 00 05  ........
 ```
 
-Bytes 76-77 ("00 01" above) form an int16 that indicates the OS the
-package was built for. In this case, 1 == Linux. The next 2 bytes
-(78-79) form an int16 that indicates the signature type. This tells
-RPM what to expect in the Signature. For version 3.0 packages, this
-is 5, which indicates the new "Header-style" signatures.
+Bytes 76-77 ("00 01" above) form an int16 that indicates the OS the package was
+built for. In this case, 1 == Linux. The next 2 bytes (78-79) form an int16
+that indicates the signature type. This tells RPM what to expect in the
+Signature. This is generally expected to be 5, which indicates the use of
+"Header-style" signatures.
 
 ```
 00000050: 04 00 00 00 68 e6 ff bf  ........
 00000058: ab ad 00 08 3c eb ff bf  ........
 ```
 
-The remaining 16 bytes (80-95) are currently unused and are reserved
-for future expansion.
+The remaining 16 bytes (80-95) are unused.
 
 ## Signature
 
-A 3.0 format signature (denoted by signature type 5 in the Lead), uses
-the same structure as the Header. For historical reasons, this
-structure is called a "header structure", which can be confusing since
-it is used for both the Header and the Signature. The details of the
-header structure are given below, and you'll want to read them so the
-rest of this makes sense. The tags for the Signature are defined in
-lib/signature.h.
-
-The Signature can contain multiple signatures, of different types.
-There are currently only three types, each with its own tag in the
-header structure:
-
-```
-        Name    Tag     Header Type
-        ----    ----    -----------
-        SIZE    1000    INT_32
-        MD5     1001    BIN
-        PGP     1002    BIN
-```
-
-The MD5 signature is 16 bytes, and the PGP signature varies with
-the size of the PGP key used to sign the package.
-
-As of RPM 2.1, all packages carry at least SIZE and MD5 signatures,
-and the Signature section is padded to a multiple of 8 bytes.
+"Header-style" signatures (denoted by signature type 5 in the Lead) use the
+same structure as the Header. For historical reasons, this structure is called
+a "header structure", which can be confusing since it is used for both the
+Header and the Signature. The details of the header structure are given below,
+and you'll want to read them so the rest of this makes sense. The tags for the
+Signature are defined in include/rpm/rpmtag.h.
+
+The Signature can contain multiple different types of signatures, stored under
+unique tags (just like the Header). Details about these tags and the information
+they store can be found in [Signatures and Digests](signatures_digests.md).
+
+RPM v4 packages are expected to contain at least one of the SHA1HEADER or SHA256HEADER
+tags, providing a cryptographic digest of the main header, and may contain one
+or both of the PAYLOADDIGEST and PAYLOADDIGESTALT tags, providing a cryptographic
+digest of the package payload in the compressed and uncompressed forms, respectively.
+
+If the package has been cryptographically signed using OpenPGP, an RSAHEADER or
+DSAHEADER tag ought to be present, which contains an OpenPGP signature of the
+package header. Which tag is present depends on which of the two (supported)
+OpenPGP algorithms was used at signing time. Using a key based upon the RSA
+algorithm to sign the package will result in the signature being stored in the
+RSAHEADER tag, whereas the use of the EdDSA (ed25519) algorithm will use the
+DSAHEADER tag instead. The name of the DSAHEADER tag is a historical artifact:
+it originally referred to the long-obsolete DSA algorithm but was later reused
+for EdDSA (ed25519) signatures.
+
+As the package header itself contains a checksum of the payload (as of RPM 4.14),
+the header signature is sufficient to establish cryptographic provenance of the
+package.
+
+Other signature tags which may be present are considered legacy and their use is
+discouraged if a more modern option is available.
 
 ## Header
 
-The Header contains all the information about a package: name,
-version, file list, etc. It uses the same "header structure" as the
-Signature, which is described in detail below. A complete list of the
-tags for the Header would take too much space to list here, and the
-list grows fairly frequently. For the complete list see lib/rpmlib.h
-in the RPM sources.
-
-## Payload
-
-The Payload is currently a cpio archive, gzipped by default. The cpio archive
-type used is SVR4 with a CRC checksum.
-
-As cpio is limited to 4 GB (32 bit unsigned) file sizes RPM since
-version 4.12 uses a stripped down version of cpio for packages with
-files > 4 GB. This format uses `07070X` as magic bytes and the file
-header otherwise only contains the index number of the file in the RPM
-header as 8 byte hex string. The file metadata that is normally found
-in a cpio file header - including the file name - is completely
-omitted as it is stored in the RPM header already.
-
-To use a different compression method when building new packages with
-`rpmbuild(8)`, define the `%_binary_payload` or `%_source_payload` macros for
-the binary or source packages, respectively. These macros accept an
-[RPM IO mode string](https://ftp.osuosl.org/pub/rpm/api/4.17.0/group__rpmio.html#example-mode-strings)
-(only `w` mode).
+The Header contains all the information about a package: name, version, file
+list, etc. It uses the same "header structure" as the Signature, which is
+described in further detail below. A complete list of the tags for the Header
+would take too much space to list here, and the list grows fairly frequently.
+For the complete list see include/rpm/rpmtag.h in the RPM sources.
 
 ## The Header Structure
 
-The header structure is a little complicated, but actually performs a
-very simple function. It acts almost like a small database in that it
-allows you to store and retrieve arbitrary data with a key called a
-"tag". When a header structure is written to disk, the data is
-written in network byte order, and when it is read from disk, is is
-converted to host byte order.
+The header structure is a little complicated, but actually performs a very
+simple function. It acts almost like a small database in that it allows you
+to store and retrieve arbitrary data with a key called a "tag". When a header
+structure is written to disk, the data is written in network byte order
+(big-endian), and when it is read from disk, it is converted to host byte order.
 
 Along with the tag and the data, a data "type" is stored, which indicates,
-obviously, the type of the data associated with the tag. There are
-currently 9 types:
+obviously, the type of the data associated with the tag. There are currently 9 types:
 
 ```
 Type             Number
@@ -178,7 +165,7 @@ currently 9 types:
 INT32            4
 INT64            5
 STRING           6
-BIN              7
+BIN              7
 STRING_ARRAY     8
 I18NSTRING_TYPE  9
 ```
@@ -229,7 +216,7 @@ In our example there would be 32 such 16-byte index entries, followed by
 the data section:
 
 ```
-00000210: 72 70 6d 00 32 2e 31 2e 32 00 31 00 52 65 64 20  rpm.2.1.2.1.Red
+00000210: 72 70 6d 00 32 2e 31 2e 32 00 31 00 52 65 64 20  rpm.2.1.2.1.Red
 00000220: 48 61 74 20 50 61 63 6b 61 67 65 20 4d 61 6e 61  Hat Package Mana
 00000230: 67 65 72 00 31 e7 cb b4 73 63 68 72 6f 65 64 65  ger.1...schroede
 00000240: 72 2e 72 65 64 68 61 74 2e 63 6f 6d 00 00 00 00  r.redhat.com....
@@ -264,3 +251,19 @@ could start at byte 589, byte that is an improper boundary for an INT32.
 As a result, 3 null bytes are inserted and the date for the SIZE actually
 starts at byte 592: "00 09 9b 31", which is 629553).
 
+## Payload
+
+The Payload is currently a cpio archive, typically compressed using the gzip,
+zstandard, or LZMA algorithms. The cpio archive type used is SVR4 with a CRC checksum.
+
+As cpio is limited to 4 GB (32 bit unsigned) file sizes, RPM (since version 4.12)
+uses a stripped down variant of cpio for packages with files > 4 GB. This format
+uses `07070X` as magic bytes and the file header otherwise only contains the
+index number of the file in the RPM header as an 8 byte hex string. The file
+metadata that is normally found in a cpio file header - including the file name -
+is completely omitted as it is stored in the RPM header already.
+
+To use a different compression method when building new packages with `rpmbuild(8)`,
+define the `%_binary_payload` or `%_source_payload` macros for the binary or source
+packages, respectively. These macros accept an [RPM IO mode string](https://ftp.osuosl.org/pub/rpm/api/4.17.0/group__rpmio.html#example-mode-strings)
+(only `w` mode).
diff --git a/docs/manual/signatures_digests.md b/docs/manual/signatures_digests.md
index a1c4a68344..bbc7b5bfcf 100644
--- a/docs/manual/signatures_digests.md
+++ b/docs/manual/signatures_digests.md
@@ -2,24 +2,25 @@
 layout: default
 title: rpm.org - Signatures and Digests
 ---
+
 # Signatures and Digests
 
 Table describing signatures and digests which RPM uses to verify package contents:
 
-| RPMSIGTAG_ | RPMTAG_ | Version | Algorithm | Location | Range |
-| :---: | :-------: | :---: | :-----: | :--: | :-----: |
-| MD5 | SIGMD5 | 3.0 | MD5 | S | HP |
-| PGP | SIGPGP | 3.0 | OpenPGP/RSA | S | HP |
-| GPG | SIGGPG | 3.0 | OpenPGP/DSA | S | HP |
-| SHA1 | SHA1HEADER | 4.0 | SHA1 | S | H |
-| RSA | RSAHEADER | 4.0 | OpenPGP/RSA | S | H |
-| DSA | DSAHEADER | 4.0 | OpenPGP/DSA | S | H |
-| SHA256 | SHA256HEADER | 4.14 | SHA256 | S | H |
-| - | PAYLOADDIGEST | 4.14 | SHA256 (*) | H | Pc |
-| - | PAYLOADDIGESTALT | 4.16 | SHA256 (*) | H | P |
-| - | FILEMD5 | 3.0 | MD5 | H | F |
-| - | FILEDIGESTS | 4.6 | SHA256 (**) | H | F |
+| RPMSIGTAG_ | RPMTAG_ | Version | Deprecated | Algorithm | Location | Range |
+| :---: | :-------: | :---: | :---: | :-----: | :--: | :-----: |
+| MD5 | SIGMD5 | 3.0 | Y | MD5 | S | HP |
+| PGP | SIGPGP | 3.0 | Y | OpenPGP/RSA | S | HP |
+| GPG | SIGGPG | 3.0 | Y | OpenPGP/EdDSA | S | HP |
+| SHA1 | SHA1HEADER | 4.0 | Y | SHA1 | S | H |
+| RSA | RSAHEADER | 4.0 | | OpenPGP/RSA | S | H |
+| DSA | DSAHEADER | 4.0 | | OpenPGP/EdDSA | S | H |
+| SHA256 | SHA256HEADER | 4.14 | | SHA256 | S | H |
+| - | FILEMD5 | 3.0 | Y | MD5 | H | F |
+| - | FILEDIGESTS | 4.6 | | SHA256 (**) | H | F |
+| - | PAYLOADDIGEST | 4.14 | | SHA256 (*) | H | Pc |
+| - | PAYLOADDIGESTALT | 4.16 | | SHA256 (*) | H | P |
 
 * S = Signature header
 * H = Main header
diff --git a/docs/manual/tags.md b/docs/manual/tags.md
index 86ddbf8764..19d2bbfde8 100644
--- a/docs/manual/tags.md
+++ b/docs/manual/tags.md
@@ -298,22 +298,22 @@ Transfiletriggerversion | 5081 | string array
 
 ## Signatures and digests
 
-[Signatures](signatures.md) allow to verify the origin of a package.
+[Signatures](signatures_digests.md) allow verifying the origin of a package.
 
 Tag Name          | Value| Type         | Description
 ------------------|------|--------------|------------
-Dsaheader         | 267  | bin          | OpenPGP DSA signature of the header (if thus signed)
-Longsigsize       | 270  | int64        | Header+payload size if > 4GB.
+Dsaheader         | 267  | bin          | OpenPGP EdDSA signature of the header (if thus signed).
+Longsigsize       | 270  | int64        | Deprecated: Header+payload size if > 4GB.
 Payloaddigest     | 5092 | string array | Cryptographic digest of the compressed payload.
 Payloaddigestalgo | 5093 | int32        | ID of the payload digest algorithm.
 Payloaddigestalt  | 5097 | string array | Cryptographic digest of the uncompressed payload.
 Rsaheader         | 268  | bin          | OpenPGP RSA signature of the header (if thus signed).
-Sha1header        | 269  | string       | SHA1 digest of the header.
+Sha1header        | 269  | string       | Deprecated: SHA1 digest of the header.
 Sha256header      | 273  | string       | SHA256 digest of the header.
-Siggpg            | 262  | bin          | OpenPGP DSA signature of the header+payload (if thus signed).
-Sigmd5            | 261  | bin          | MD5 digest of the header+payload.
-Sigpgp            | 259  | bin          | OpenPGP RSA signature of the header+payload (if thus signed).
-Sigsize           | 257  | int32        | Header+payload size.
+Siggpg            | 262  | bin          | Deprecated: OpenPGP DSA signature of the header+payload (if thus signed).
+Sigmd5            | 261  | bin          | Deprecated: MD5 digest of the header+payload.
+Sigpgp            | 259  | bin          | Deprecated: OpenPGP RSA signature of the header+payload (if thus signed).
+Sigsize           | 257  | int32        | Deprecated: Header+payload size.
 
 ## Installed package headers only
 
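Review note on the tags.md hunk: to make the effect of the new "Deprecated:" markers concrete, here is a small Python sketch that maps the tag values from the "Signatures and digests" table to their names and reports which of a package's tags are deprecated after this change. The tag numbers and names are copied from the table; `SIGNATURE_TAGS` and `deprecated_tags` are hypothetical names chosen for illustration and do not exist in the RPM sources.

```python
# Tag value -> (tag name, deprecated?) as listed in the table after this patch.
SIGNATURE_TAGS = {
    257: ("Sigsize", True),
    259: ("Sigpgp", True),
    261: ("Sigmd5", True),
    262: ("Siggpg", True),
    267: ("Dsaheader", False),
    268: ("Rsaheader", False),
    269: ("Sha1header", True),
    270: ("Longsigsize", True),
    273: ("Sha256header", False),
    5092: ("Payloaddigest", False),
    5093: ("Payloaddigestalgo", False),
    5097: ("Payloaddigestalt", False),
}


def deprecated_tags(present):
    """Return the names of deprecated tags among the given tag values.

    Unknown tag values are ignored; results are ordered by tag value.
    """
    return [
        SIGNATURE_TAGS[tag][0]
        for tag in sorted(present)
        if tag in SIGNATURE_TAGS and SIGNATURE_TAGS[tag][1]
    ]


# A package carrying MD5, RSA, SHA1 and SHA256 header tags.
found = deprecated_tags([269, 273, 268, 261])
print(found)
```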