Core API: Tunneling

construct.RawCopy(subcon)

Used to obtain the byte representation of a field, alongside its parsed value.

Parsing returns a dict containing the parsed subcon value, the raw bytes that were consumed by subcon, the starting and ending offsets in the stream, and the length in bytes. Builds either from the raw bytes representation or from a value used by subcon. Size is the same as subcon.

Object is a dictionary with either “data” or “value” keys, or both.

When building, if both the “value” and “data” keys are present, then the “data” key is used and the “value” key is ignored. This is undesirable when you parse some data in order to modify it and write it back; in that case, delete the “data” key after modifying the “value” key, so that the bytes are rebuilt from the new value.

Parameters:

subcon – Construct instance

Raises:
  • StreamError – stream is not seekable and tellable

  • RawCopyError – building and neither data nor value was given

  • StringError – building from non-bytes value, perhaps unicode

Example:

>>> d = RawCopy(Byte)
>>> d.parse(b"\xff")
Container(data=b'\xff', value=255, offset1=0, offset2=1, length=1)
>>> d.build(dict(data=b"\xff"))
b'\xff'
>>> d.build(dict(value=255))
b'\xff'
construct.ByteSwapped(subcon)

Swaps the byte order within boundaries of given subcon. Requires a fixed sized subcon.

Parameters:

subcon – Construct instance, subcon on top of byte swapped bytes

Raises:

SizeofError – ctor or compiler could not compute subcon size

See Transformed and Restreamed for raisable exceptions.

Example:

Int24ul <--> ByteSwapped(Int24ub) <--> BytesInteger(3, swapped=True) <--> ByteSwapped(BytesInteger(3))
construct.BitsSwapped(subcon)

Swaps the bit order within each byte within boundaries of given subcon. Does NOT require a fixed sized subcon.

Parameters:

subcon – Construct instance, subcon on top of bit swapped bytes

Raises:

SizeofError – compiler could not compute subcon size

See Transformed and Restreamed for raisable exceptions.

Example:

>>> d = Bitwise(Bytes(8))
>>> d.parse(b"\x01")
b'\x00\x00\x00\x00\x00\x00\x00\x01'
>>> BitsSwapped(d).parse(b"\x01")
b'\x01\x00\x00\x00\x00\x00\x00\x00'
construct.Prefixed(lengthfield, subcon, includelength=False)

Prefixes a field with byte count.

Parses the length field. Then reads that amount of bytes, and parses subcon using only those bytes. Constructs that consume the entire remaining stream are constrained to consuming only the specified amount of bytes (a substream). When building, data gets prefixed by its length. Optionally, the length field can include its own size. Size is the sum of both fields' sizes, unless either raises SizeofError.

Analogous to PrefixedArray, which prefixes with an element count instead of a byte count. The semantics are similar but the implementation is different.

VarInt is recommended for new protocols, as it is more compact and never overflows.

Parameters:
  • lengthfield – Construct instance, field used for storing the length

  • subcon – Construct instance, subcon used for storing the value

  • includelength – optional, bool, whether length field should include its own size, default is False

Raises:

StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

Example:

>>> d = Prefixed(VarInt, GreedyRange(Int32ul))
>>> d.parse(b"\x08abcdefgh")
[1684234849, 1751606885]

>>> d = PrefixedArray(VarInt, Int32ul)
>>> d.parse(b"\x02abcdefgh")
[1684234849, 1751606885]
construct.PrefixedArray(countfield, subcon)

Prefixes an array with item count (as opposed to prefixed by byte count, see Prefixed).

VarInt is recommended for new protocols, as it is more compact and never overflows.

Parameters:
  • countfield – Construct instance, field used for storing the element count

  • subcon – Construct instance, subcon used for storing each element

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • RangeError – consumed or produced too few elements

Example:

>>> d = Prefixed(VarInt, GreedyRange(Int32ul))
>>> d.parse(b"\x08abcdefgh")
[1684234849, 1751606885]

>>> d = PrefixedArray(VarInt, Int32ul)
>>> d.parse(b"\x02abcdefgh")
[1684234849, 1751606885]
construct.FixedSized(length, subcon)

Restricts parsing to specified amount of bytes.

Parsing reads length bytes, then defers to subcon using new BytesIO with said bytes. Building builds the subcon using new BytesIO, then writes said data and additional null bytes accordingly. Size is same as length, although negative amount raises an error.

Parameters:
  • length – integer or context lambda, total amount of bytes (both data and padding)

  • subcon – Construct instance

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • PaddingError – length is negative

  • PaddingError – subcon wrote more bytes than the entire length (negative padding)

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = FixedSized(10, Byte)
>>> d.parse(b'\xff\x00\x00\x00\x00\x00\x00\x00\x00\x00')
255
>>> d.build(255)
b'\xff\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> d.sizeof()
10
construct.NullTerminated(subcon, term=b'\x00', include=False, consume=True, require=True)

Restricts parsing to bytes preceding a null byte.

Parsing reads one terminator-sized unit at a time (one byte by default) and accumulates it with the previous bytes. When the terminator is found, it is (by default) consumed but discarded. If EOF is reached first, a StreamError is raised (by default). The subcon is then parsed using a new BytesIO made from the accumulated data. Building builds the subcon and then writes the terminator. Size is undefined.

The term can be multiple bytes, to support string classes with UTF16/32 encodings for example. Be warned however: as reported in Issue 1046, the data read must be a multiple of the term length and the term must start at a unit boundary, otherwise strange things happen when parsing.

Parameters:
  • subcon – Construct instance

  • term – optional, bytes, terminator byte-string, default is b'\x00', a single null byte

  • include – optional, bool, if to include terminator in resulting data, default is False

  • consume – optional, bool, if to consume terminator or leave it in the stream, default is True

  • require – optional, bool, if EOF results in failure or not, default is True

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • StreamError – encountered EOF but require is not disabled

  • PaddingError – terminator is less than 1 byte in length

Example:

>>> d = NullTerminated(Byte)
>>> d.parse(b'\xff\x00')
255
>>> d.build(255)
b'\xff\x00'
construct.NullStripped(subcon, pad=b'\x00')

Restricts parsing to the bytes that precede any trailing padding before EOF.

Parsing reads the entire stream, strips trailing pad bytes (working from right to left), then parses subcon using a new BytesIO made of the remaining data. Building defers to subcon as-is. Size is undefined, because it reads till EOF.

The pad can be multiple bytes, to support string classes with UTF16/32 encodings.

Parameters:
  • subcon – Construct instance

  • pad – optional, bytes, padding byte-string, default is b'\x00', a single null byte

Raises:

PaddingError – pad is less than 1 byte in length

Example:

>>> d = NullStripped(Byte)
>>> d.parse(b'\xff\x00\x00')
255
>>> d.build(255)
b'\xff'
construct.RestreamData(datafunc, subcon)

Parses a field on external data (but does not build).

Parsing defers to subcon, but provides it with a separate BytesIO stream based on data provided by datafunc (a bytes literal, another BytesIO stream, a Construct instance that returns bytes, or a context lambda). Building does nothing. Size is 0 because, as far as other fields see it, this field does not produce or consume any bytes from the stream.

Parameters:
  • datafunc – bytes or BytesIO or Construct instance (that parses into bytes) or context lambda, provides data for subcon to parse from

  • subcon – Construct instance

Can propagate any exception from the lambdas, possibly non-ConstructError.

Example:

>>> d = RestreamData(b"\x01", Int8ub)
>>> d.parse(b"")
1
>>> d.build(0)
b''

>>> d = RestreamData(NullTerminated(GreedyBytes), Int16ub)
>>> d.parse(b"\x01\x02\x00")
0x0102
>>> d = RestreamData(FixedSized(2, GreedyBytes), Int16ub)
>>> d.parse(b"\x01\x02\x00")
0x0102
construct.Transformed(subcon, decodefunc, decodeamount, encodefunc, encodeamount)

Transforms bytes between the underlying stream and the (fixed-sized) subcon.

Parsing reads a specified amount (or till EOF), processes the data using a bytes-to-bytes decoding function, then parses subcon using that data. Building builds subcon into separate bytes, processes them using a bytes-to-bytes encoding function, then writes the result into the main stream. Size is reported as decodeamount or encodeamount if those are equal, otherwise it raises SizeofError.

Used internally to implement Bitwise, Bytewise, ByteSwapped and BitsSwapped.

Possible use-cases include encryption, obfuscation, byte-level encoding.

Warning

Remember that subcon must consume (or produce) an amount of bytes that is the same as decodeamount (or encodeamount).

Warning

Do NOT use seeking/telling classes inside Transformed context.

Parameters:
  • subcon – Construct instance

  • decodefunc – bytes-to-bytes function, applied before parsing subcon

  • decodeamount – integer, amount of bytes to read

  • encodefunc – bytes-to-bytes function, applied after building subcon

  • encodeamount – integer, amount of bytes to write

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • StreamError – subcon build and encoder transformed more or less than encodeamount bytes, if amount is specified

  • StringError – building from non-bytes value, perhaps unicode

Can propagate any exception from the lambdas, possibly non-ConstructError.

Example:

>>> d = Transformed(Bytes(16), bytes2bits, 2, bits2bytes, 2)
>>> d.parse(b"\x00\x00")
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

>>> d = Transformed(GreedyBytes, bytes2bits, None, bits2bytes, None)
>>> d.parse(b"\x00\x00")
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
construct.Restreamed(subcon, decoder, decoderunit, encoder, encoderunit, sizecomputer)

Transforms bytes between the underlying stream and the (variable-sized) subcon.

Used internally to implement Bitwise, Bytewise, ByteSwapped and BitsSwapped.

Warning

Remember that subcon must consume or produce an amount of bytes that is a multiple of encoding or decoding units. For example, in a Bitwise context you should process a multiple of 8 bits or the stream will fail during parsing/building.

Warning

Do NOT use seeking/telling classes inside Restreamed context.

Parameters:
  • subcon – Construct instance

  • decoder – bytes-to-bytes function, used on data chunks when parsing

  • decoderunit – integer, decoder takes chunks of this size

  • encoder – bytes-to-bytes function, used on data chunks when building

  • encoderunit – integer, encoder takes chunks of this size

  • sizecomputer – function that computes the amount of bytes output

Can propagate any exception from the lambda, possibly non-ConstructError. Can also raise arbitrary exceptions in RestreamedBytesIO implementation.

Example:

Bitwise  <--> Restreamed(subcon, bytes2bits, 1, bits2bytes, 8, lambda n: n//8)
Bytewise <--> Restreamed(subcon, bits2bytes, 8, bytes2bits, 1, lambda n: n*8)
construct.ProcessXor(padfunc, subcon)

Transforms bytes between the underlying stream and the subcon.

Used internally by KaitaiStruct compiler, when translating process: xor tags.

Parsing reads till EOF, xors data with the pad, then feeds that data into subcon. Building first builds the subcon into separate BytesIO stream, xors data with the pad, then writes that data into the main stream. Size is the same as subcon, unless it raises SizeofError.

Parameters:
  • padfunc – integer or bytes or context lambda, single or multiple bytes to xor data with

  • subcon – Construct instance

Raises:

StringError – pad is not integer or bytes

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = ProcessXor(0xf0 or b'\xf0', Int16ub)
>>> d.parse(b"\x00\xff")
0xf00f
>>> d.sizeof()
2
construct.ProcessRotateLeft(amount, group, subcon)

Transforms bytes between the underlying stream and the subcon.

Used internally by KaitaiStruct compiler, when translating process: rol/ror tags.

Parsing reads till EOF, rotates (shifts) the data left by amount in bits, then feeds that data into subcon. Building first builds the subcon into separate BytesIO stream, rotates right by negating amount, then writes that data into the main stream. Size is the same as subcon, unless it raises SizeofError.

Parameters:
  • amount – integer or context lambda, shift by this amount in bits, treated modulo (group x 8)

  • group – integer or context lambda, shifting is applied to chunks of this size in bytes

  • subcon – Construct instance

Raises:
  • RotationError – group is less than 1

  • RotationError – data length is not a multiple of group size

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = ProcessRotateLeft(4, 1, Int16ub)
>>> d.parse(b'\x0f\xf0')
0xf00f
>>> d = ProcessRotateLeft(4, 2, Int16ub)
>>> d.parse(b'\x0f\xf0')
0xff00
>>> d.sizeof()
2
construct.Checksum(checksumfield, hashfunc, bytesfunc)

Field that is built or validated by a hash of a given byte range. Usually used with RawCopy.

Parsing compares the parsed checksumfield value with a context entry provided by bytesfunc and transformed by hashfunc. Building fetches the context entry, transforms it, then writes it using checksumfield. Size is the same as checksumfield.

Parameters:
  • checksumfield – a subcon field that reads the checksum, usually Bytes(int)

  • hashfunc – function that takes bytes and returns whatever checksumfield takes when building, usually from hashlib module

  • bytesfunc – context lambda that returns bytes (or object) to be hashed, usually like this.rawcopy1.data

Raises:

ChecksumError – parsing, and the stored checksum does not match the one computed from the actual data

Can propagate any exception from the lambdas, possibly non-ConstructError.

Example:

import hashlib
d = Struct(
    "fields" / RawCopy(Struct(
        Padding(1000),
    )),
    "checksum" / Checksum(Bytes(64),
        lambda data: hashlib.sha512(data).digest(),
        this.fields.data),
)
d.build(dict(fields=dict(value={})))

import hashlib
d = Struct(
    "offset" / Tell,
    "checksum" / Padding(64),
    "fields" / RawCopy(Struct(
        Padding(1000),
    )),
    "checksum" / Pointer(this.offset, Checksum(Bytes(64),
        lambda data: hashlib.sha512(data).digest(),
        this.fields.data)),
)
d.build(dict(fields=dict(value={})))
construct.Compressed(subcon, encoding, level=None)

Compresses and decompresses underlying stream when processing subcon. When parsing, entire stream is consumed. When building, it puts compressed bytes without marking the end. This construct should be used with Prefixed.

Parsing and building transforms all bytes using a specified codec. Since data is processed until EOF, it behaves similar to GreedyBytes. Size is undefined.

Parameters:
  • subcon – Construct instance, subcon used for storing the value

  • encoding – string, any of module names like zlib/gzip/bzip2/lzma, otherwise any of codecs module bytes<->bytes encodings, each codec usually requires some Python version

  • level – optional, integer between 0 and 9; some encoders allow different compression levels, although lzma ignores it

Raises:
  • ImportError – needed module could not be imported by ctor

  • StreamError – stream failed when reading until EOF

Example:

>>> d = Prefixed(VarInt, Compressed(GreedyBytes, "zlib"))
>>> d.build(bytes(100))
b'\x0cx\x9cc`\xa0=\x00\x00\x00d\x00\x01'
>>> len(_)
13
construct.CompressedLZ4(subcon)

Compresses and decompresses underlying stream before processing subcon. When parsing, entire stream is consumed. When building, it puts compressed bytes without marking the end. This construct should be used with Prefixed.

Parsing and building transforms all bytes using LZ4 library. Since data is processed until EOF, it behaves similar to GreedyBytes. Size is undefined.

Parameters:

subcon – Construct instance, subcon used for storing the value

Raises:
  • ImportError – needed module could not be imported by ctor

  • StreamError – stream failed when reading until EOF

Can propagate lz4.frame exceptions.

Example:

>>> d = Prefixed(VarInt, CompressedLZ4(GreedyBytes))
>>> d.build(bytes(100))
b'"\x04"M\x18h@d\x00\x00\x00\x00\x00\x00\x00#\x0b\x00\x00\x00\x1f\x00\x01\x00KP\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> len(_)
35
construct.EncryptedSym(subcon, cipher)

Perform symmetrical encryption and decryption of the underlying stream before processing subcon. When parsing, entire stream is consumed. When building, it puts encrypted bytes without marking the end.

Parsing and building transforms all bytes using the selected cipher. Since data is processed until EOF, it behaves similar to GreedyBytes. Size is undefined.

The key for encryption and decryption should be passed via contextkw to build and parse methods.

This construct is heavily based on the cryptography library, which supports the following algorithms and modes. For more details please see the documentation of that library.

Algorithms:
  • AES

  • Camellia

  • ChaCha20

  • TripleDES

  • CAST5

  • SEED

  • SM4

  • Blowfish (weak cipher)

  • ARC4 (weak cipher)

  • IDEA (weak cipher)

Modes:
  • CBC

  • CTR

  • OFB

  • CFB

  • CFB8

  • XTS

  • ECB (insecure)

Note

Keep in mind that some of the algorithms require padding of the data. This can be done e.g. with Aligned.

Note

For GCM mode use EncryptedSymAead.

Parameters:
  • subcon – Construct instance, subcon used for storing the value

  • cipher – Cipher object or context lambda from cryptography.hazmat.primitives.ciphers

Raises:
  • ImportError – needed module could not be imported

  • StreamError – stream failed when reading until EOF

  • CipherError – no cipher object is provided

  • CipherError – an AEAD cipher is used

Can propagate cryptography.exceptions exceptions.

Example:

>>> import os
>>> from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
>>> d = Struct(
...     "iv" / Default(Bytes(16), os.urandom(16)),
...     "enc_data" / EncryptedSym(
...         Aligned(16,
...             Struct(
...                 "width" / Int16ul,
...                 "height" / Int16ul,
...             )
...         ),
...         lambda ctx: Cipher(algorithms.AES(ctx._.key), modes.CBC(ctx.iv))
...     )
... )
>>> key128 = b"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f"
>>> d.build({"enc_data": {"width": 5, "height": 4}}, key=key128)
b"o\x11i\x98~H\xc9\x1c\x17\x83\xf6|U:\x1a\x86+\x00\x89\xf7\x8e\xc3L\x04\t\xca\x8a\xc8\xc2\xfb'\xc8"
>>> d.parse(b"o\x11i\x98~H\xc9\x1c\x17\x83\xf6|U:\x1a\x86+\x00\x89\xf7\x8e\xc3L\x04\t\xca\x8a\xc8\xc2\xfb'\xc8", key=key128)
Container: 
    iv = b'o\x11i\x98~H\xc9\x1c\x17\x83\xf6|U:\x1a\x86' (total 16)
    enc_data = Container: 
        width = 5
        height = 4
construct.EncryptedSymAead(subcon, cipher, nonce, associated_data=b'')

Perform symmetrical AEAD encryption and decryption of the underlying stream before processing subcon. When parsing, entire stream is consumed. When building, it puts encrypted bytes and tag without marking the end.

Parsing and building transforms all bytes using the selected cipher and also authenticates the associated_data. Since data is processed until EOF, it behaves similar to GreedyBytes. Size is undefined.

The key for encryption and decryption should be passed via contextkw to build and parse methods.

This construct is heavily based on the cryptography library, which supports the following AEAD ciphers. For more details please see the documentation of that library.

AEAD ciphers:
  • AESGCM

  • AESCCM

  • ChaCha20Poly1305

Parameters:
  • subcon – Construct instance, subcon used for storing the value

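  • cipher – AEAD cipher object or context lambda, from cryptography.hazmat.primitives.ciphers.aead

  • nonce – bytes or context lambda, nonce passed to the cipher

  • associated_data – optional, bytes or context lambda, additional data that is authenticated but not encrypted, default is b''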

Raises:
  • ImportError – needed module could not be imported

  • StreamError – stream failed when reading until EOF

  • CipherError – unsupported cipher object is provided

Can propagate cryptography.exceptions exceptions.

Example:

>>> import os
>>> from cryptography.hazmat.primitives.ciphers import aead
>>> d = Struct(
...     "nonce" / Default(Bytes(16), os.urandom(16)),
...     "associated_data" / Bytes(21),
...     "enc_data" / EncryptedSymAead(
...         GreedyBytes,
...         lambda ctx: aead.AESGCM(ctx._.key),
...         this.nonce,
...         this.associated_data
...     )
... )
>>> key128 = b"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f"
>>> d.build({"associated_data": b"This is authenticated", "enc_data": b"The secret message"}, key=key128)
b'\xe3\xb0"\xbaQ\x18\xd3|\x14\xb0q\x11\xb5XZ\xeeThis is authenticated\x88~\xe5Vh\x00\x01m\xacn\xad k\x02\x13\xf4\xb4[\xbe\x12$\xa0\x7f\xfb\xbf\x82Ar\xb0\x97C\x0b\xe3\x85'
>>> d.parse(b'\xe3\xb0"\xbaQ\x18\xd3|\x14\xb0q\x11\xb5XZ\xeeThis is authenticated\x88~\xe5Vh\x00\x01m\xacn\xad k\x02\x13\xf4\xb4[\xbe\x12$\xa0\x7f\xfb\xbf\x82Ar\xb0\x97C\x0b\xe3\x85', key=key128)
Container: 
    nonce = b'\xe3\xb0"\xbaQ\x18\xd3|\x14\xb0q\x11\xb5XZ\xee' (total 16)
    associated_data = b'This is authenti'... (truncated, total 21)
    enc_data = b'The secret messa'... (truncated, total 18)
construct.Rebuffered(subcon, tailcutoff=None)

Caches bytes from underlying stream, so it becomes seekable and tellable, and also becomes blocking on reading. Useful for processing non-file streams like pipes, sockets, etc.

Warning

Experimental implementation. May not be mature enough.

Parameters:
  • subcon – Construct instance, subcon which will operate on the buffered stream

  • tailcutoff – optional, integer, amount of bytes kept in buffer, by default buffers everything

Can also raise arbitrary exceptions in its implementation.

Example:

Rebuffered(..., tailcutoff=1024).parse_stream(nonseekable_stream)