Bijective padding


  1. The basic idea
  2. Implementation
  3. Paddable files
  4. Recipe
  5. Use in BIAES
  6. Optimality
  1. The basic idea

    The basic idea of bijective padding is to create a bijective map between two sets of files with differing granularity.

    Often one wants to create a map between the set of 8-bit granular files and the set of 128-bit granular files in order to feed variable-length files through a block cypher.

    If the padding is deterministic, it appears desirable for the map to be bijective, in order to best prevent the attacker rejecting keys if decrypt(x) != pad(unpad(decrypt(x))).

  2. Implementation

    One obvious way in which this could be done is by numbering the set of 8-bit granular files (smallest first), and numbering the set of 128-bit granular files in a similar manner.

    Then, a bijective map between the two sets could be created by connecting objects with the same number.

    The problem with this naive approach is that numbering files in order is a non-trivial operation.

    Fortunately there exists a simple and effective method of creating such a bijection, by mapping via an intermediate set of paddable files.

  3. Paddable files

    A paddable file is hereby defined to be a file that does not end in a particular symbol.

    It is called paddable because it may be unambiguously and reversibly padded to any given size by adding any number of occurrences of the symbol the file does not end with.

    This is essentially the same idea of what Matt Timmermans christened a "finitely odd file" - a term that seems most appropriate when the padding symbol is the zero bit.

    In order to make a paddable file, two symbols are needed - a MARKER symbol, and a PADDING symbol.

    We will take the PADDING symbol to consist of all zero bits, and the MARKER symbol to be a 1 bit followed by as many zero bits as necessary. Using this convention PADDING and MARKER symbols of any size can be constructed.

    Making a paddable file involves appending a MARKER symbol, if the file ends in the PADDING symbol, or if the file ends in the PADDING symbol followed by a finite number of MARKER symbols.

    If neither of these conditions holds, the file is left untouched.

    "Unmaking" a paddable file involves stripping off any trailing MARKER symbols if that MARKER follows a PADDING symbol, or it follows a PADDING symbol which is followed by a a finite number of MARKER symbols.

    Observe that a paddable file never ends in the PADDING symbol.

    Also observe that the padding operation can produce any file (that does not end in the PADDING symbol) - and that the unpadding operation can produce every possible file.

    From here we can see that unmakePaddable(makePaddable(f)) = f for all f.

    Similarly makePaddable(unmakePaddable(pf)) = pf for all paddable files, pf.

    Once we have a paddable file it can be reversibly padded to any desired length by appending copies of the PADDING symbol.

  4. Recipe

    To illustrate the idea we will consider the example of mapping between the set of 8-bit granular files and the set of 16-bit granular files:

    Simply put the idea is to:

    • Make an 8-bit paddable file;
    • Pad it with padding bytes until it is a multiple of 16-bits in length;
    • "Unmake" a 16-bit paddable file;

    This procedure can be reversed by:

    • Make a 16-bit paddable file;
    • Strip off any padding bytes;
    • "Unmake" a 8-bit paddable file;

    Java source code demonstrating this is available here. In particular, see the methods "makePaddableByteFile" and "unmakePaddableByteFile".

  5. Use in BIAES

    The implementation of bijective padding in BIAES has a couple of additional wrinkles:

    Firstly it bijectively maps from the set of all 8-bit granular files, to the set of all 128-bit granular files excluding the null file. This is done to ensure that the null file does not always map to itself when encrypted.

    Secondly if a 192 or 256-bit key is used, it maps to the set of all 128-bit granular files greater than 128 bits in length. This is done to avoid constricting the keyspace when encrypting short files.

  6. Optimality

    Bijective padding is optimal for certain file distributions. In particular if the probability of transmission of a file of length N is 256 times the probability of a file of length N - 1, it would be hard to improve on this padding scheme.

    In practice files rarely follow this distribution. Usually the probability of file transmission does not depend very strongly on file length.

    Under such conditions, no deterministic padding algorithm will work terribly well. If a genuine random source is freely available, then what is needed under those circumstances is a form of random padding.

BIAES | Bijective padding | Random padding |