construct.core – entire module

class construct.core.Adapter(subcon)

Abstract adapter parent class.

Needs to implement _decode() for parsing and _encode() for building.

Parameters:subcon – the construct to wrap
class construct.core.Aligned(modulus, subcon, pattern='x00')

Appends additional null bytes to achieve a length that is shortest multiple of a modulus.

Parameters:
  • modulus – the modulus to final length, an integer or a context function returning such an integer
  • subcon – the subcon to align
  • pattern – optional, the padding pattern, a bytes character (default is x00)

Example:

>>> d = Aligned(4, Int16ub)
>>> d.parse(b'\x00\x01\x00\x00')
1
>>> d.sizeof()
4
construct.core.AlignedStruct(modulus, *subcons, **kw)

Makes a structure where each field is aligned to the same modulus (it is a struct of aligned fields, not an aligned struct).

See also

Uses Aligned() and Struct().

Parameters:
  • modulus – passed to each member
  • *subcons – subcons that make up the Struct
  • **kw – named subcons, extend the Struct

Example:

>>> d = AlignedStruct(4, "a"/Int8ub, "b"/Int16ub)
>>> d.build(dict(a=1,b=5))
b'\x01\x00\x00\x00\x00\x05\x00\x00'
>>> d.parse(_)
Container(a=1)(b=5)
>>> d.sizeof()
8
construct.core.Array(count, subcon)

Homogenous array of elements. The array will iterate through exactly count elements. This is just a macro around Range.

Operator [] can be used to make instances.

Parameters:
  • count – an integer or a context function that returns such an integer
  • subcon – the subcon to process individual elements

Example:

>>> d = Array(5,5,Byte) or Byte[5]
>>> d.build(range(5))
b'\x00\x01\x02\x03\x04'
>>> d.parse(_)
[0, 1, 2, 3, 4]

Alternative syntax (recommended):
>>> Byte[3:5], Byte[3:], Byte[:5], Byte[:]
construct.core.BitStruct(*subcons)

Makes a structure inside a Bitwise.

See also

Uses Bitwise() and Struct().

Parameters:*subcons – the subcons that make up this structure

Example:

>>> d = BitStruct(
...     "a" / Flag,
...     "b" / Nibble,
...     "c" / BitsInteger(10),
...     "d" / Padding(1),
... )
>>> d.parse(b"\xbe\xef")
Container(a=True)(b=7)(c=887)(d=None)
>>> d.sizeof()
2
class construct.core.BitsInteger(length, signed=False, swapped=False)

Field that builds from integers as opposed to bytes. Similar to Bit/Nibble/Octet fields but can have arbitrary sizes. Must be enclosed in Bitwise.

Parameters:
  • length – number of bits in the field, an integer or a context function that returns such an integer
  • signed – whether the value is signed (two’s complement), default is False (unsigned)
  • swapped – whether to swap byte order (little endian), default is False (big endian)

Example:

>>> d = Bitwise(BitsInteger(8))
>>> d.parse(b"\x10")
16
>>> d.build(255)
b'\xff'
>>> d.sizeof()
1
construct.core.BitsSwapped(subcon)

Swap the bit order within each byte within boundaries of the given subcon. Does NOT require a fixed sized subcon.

Parameters:subcon – the subcon on top of bit swapped bytes

Example:

>>> d = Bitwise(Bytes(8))
>>> d.parse(b"\x01")
'\x00\x00\x00\x00\x00\x00\x00\x01'
>>>> BitsSwapped(d).parse(b"\x01")
'\x01\x00\x00\x00\x00\x00\x00\x00'
construct.core.Bitwise(subcon)

Converts the stream from bytes to bits, and passes the bitstream to underlying subcon.

See also

Analog Bytewise() that transforms subset of bits back to bytes.

Warning

Do not use pointers inside this or other restreamed contexts.

Parameters:subcon – any field that works with bits like BitStruct or Bit Nibble Octet BitsInteger

Example:

>>> d = Bitwise(Octet)
>>> d.parse(b"\xff")
255
>>> d.build(1)
b'\x01'
>>> d.sizeof()
1
construct.core.ByteSwapped(subcon)

Swap the byte order within boundaries of the given subcon. Requires a fixed sized subcon.

Parameters:subcon – the subcon on top of byte swapped bytes

Example:

Int24ul <--> ByteSwapped(Int24ub) <--> BytesInteger(3, swapped=True)
class construct.core.Bytes(length)

Field consisting of a specified number of bytes. Builds from a bytes, or an integer (although deprecated and BytesInteger should be used instead).

See also

Analog BytesInteger() that parses and builds from integers, as opposed to bytes.

Parameters:length – an integer or a context lambda that returns such an integer

Example:

>>> d = Bytes(4)
>>> d.parse(b'beef')
b'beef'
>>> d.build(b'beef')
b'beef'
>>> d.build(0)
b'\x00\x00\x00\x00'
>>> d.sizeof()
4
class construct.core.BytesInteger(length, signed=False, swapped=False)

Field that builds from integers as opposed to bytes. Similar to Int* fields but can have arbitrary size.

See also

Analog BitsInteger() that operates on bits.

Parameters:
  • length – number of bytes in the field, and integer or a context function that returns such an integer
  • signed – whether the value is signed (two’s complement), default is False (unsigned)
  • swapped – whether to swap byte order (little endian), default is False (big endian)

Example:

>>> d = BytesInteger(4) or Int32ub
>>> d.parse(b"abcd")
1633837924
>>> d.build(1)
b'\x00\x00\x00\x01'
>>> d.sizeof()
4
construct.core.Bytewise(subcon)

Converts the stream from bits back to bytes. Must be used within Bitwise.

Parameters:subcon – any field that works with bytes

Example:

>>> d = Bitwise(Bytewise(Byte))
>>> d.parse(b"\xff")
255
>>> d.build(255)
b'\xff'
>>> d.sizeof()
1
construct.core.CString(terminators='\x00', encoding=None)

String ending in a terminator byte.

By default, the terminator is the x00 byte character. Terminators field can be a longer bytes, and any one of the characters breaks parsing. First terminator byte is used when building.

Parameters:
  • terminators – sequence of valid terminators, first is used when building, all are used when parsing
  • encoding – encoding (eg. “utf8”), or None for bytes

Warning

Do not use >1 byte encodings like UTF16 or UTF32 with CStrings, they are not safe.

Example:

>>> d = CString(encoding="utf8")
>>> d.build(u"Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd\x00'
>>> d.parse(_)
u'Афон'
class construct.core.Check(func)

Checks for a condition, and raises ValidationError if the check fails.

Parameters:func – a context function returning a bool (or truthy value)
Raises:ValidationError – when condition fails

Example:

Check(lambda ctx: len(ctx.payload.data) == ctx.payload_len)
Check(len_(this.payload.data) == this.payload_len)
class construct.core.Checksum(checksumfield, hashfunc, bytesfunc)

Field that is build or validated by a hash of a given byte range.

Parameters:
  • checksumfield – a subcon field that reads the checksum, usually Bytes(int)
  • hashfunc – a function taking bytes and returning whatever checksumfield takes when building
  • bytesfunc – a function taking context and returning the bytes or object to be hashed, usually like this.rawcopy1.data

Example:

import hashlib
d = Struct(
    "fields" / RawCopy(Struct(
        "a" / Byte,
        "b" / Byte,
    )),
    "checksum" / Checksum(Bytes(64), lambda data: hashlib.sha512(data).digest(), this.fields.data),
)
d.build(dict(fields=dict(value=dict(a=1,b=2))))
-> b'\x01\x02\xbd\xd8\x1a\xb23\xbc\xebj\xd23\xcd'...
class construct.core.Compressed(subcon, encoding, level=None)

Compresses and decompresses underlying stream when processing the subcon. When parsing, entire stream is consumed. When building, puts compressed bytes without marking the end. This construct should be used with Prefixed() or entire stream.

Parameters:
  • subcon – the subcon used for storing the value
  • encoding – any of the module names like zlib/gzip/bzip2/lzma, otherwise any of codecs module bytes<->bytes encodings, usually requires some Python version
  • level – optional, an integer between 0..9, lzma discards it

Example:

Prefixed(VarInt, Compressed(GreedyBytes, "zlib"))
class construct.core.Computed(func)

Field computing a value. Underlying byte stream is unaffected. When parsing, the context function provides the value. Constant literal value can also be provided.

Building does not require a value, the value gets computed.

Size is defined as 0 because parsing and building does not consume or produce bytes.

Parameters:func – a context function or a constant value
Example::
>>> d = Struct(
...     "width" / Byte,
...     "height" / Byte,
...     "total" / Computed(this.width * this.height),
... )
>>> d.build(dict(width=4,height=5))
b'\x04\x05'
>>> d.parse(b"12")
Container(width=49)(height=50)(total=2450)
>>> import os
>>> d = Computed(lambda ctx: os.urandom(10))
>>> d.parse(b"")
b'\x98\xc2\xec\x10\x07\xf5\x8e\x98\xc2\xec'
class construct.core.Const(subcon, value=None)

Field enforcing a constant value. It is used for file signatures, to validate that the given pattern exists. When parsed, the value must match.

Usually a member of a Struct, where it can be anonymous (so it does not appear in parsed dictionary for simplicity).

Note that a variable length subcon may still provide positive verification. Const does not consume a precomputed amount of bytes (and hence does NOT require a fixed sized lenghtfield), but depends on the subcon to read the appropriate amount (eg. VarInt is acceptable).

Parameters:
  • subcon – the subcon used to build value from, or a bytes value
  • value – optional, the expected value
Raises:

ConstError – when parsed data does not match specified value, or building from wrong value

Example:

>>> d = Const(b"IHDR")
>>> d.build(None)
b'IHDR'
>>> d.parse(b"JPEG")
construct.core.ConstError: expected b'IHDR' but parsed b'JPEG'

>>> d = Const(Int32ul, 16)
>>> d.build(None)
b'\x10\x00\x00\x00'
class construct.core.Construct

The mother of all constructs.

This object is generally not directly instantiated, and it does not directly implement parsing and building, so it is largely only of interest to subclass implementors. There are also other abstract classes sitting on top of this one.

The external user API:

  • parse()
  • parse_stream()
  • build()
  • build_stream()
  • sizeof()

Subclass authors should not override the external methods. Instead, another API is available:

  • _parse()
  • _build()
  • _sizeof()

And stateful copying:

  • __getstate__()
  • __setstate__()

All constructs have a name and flags. The name is used for naming struct members and context dictionaries. Note that the name can be a string, or None by default. A single underscore “_” is a reserved name, used as up-level in nested containers. The name should be descriptive, short, and valid as a Python identifier, although these rules are not enforced. The flags specify additional behavioral information about this construct. Flags are used by enclosing constructs to determine a proper course of action. Flags are often inherited from inner subconstructs but that depends on each class behavior.

build(obj, context=None, **kw)

Build an object in memory.

Returns:bytes
build_stream(obj, stream, context=None, **kw)

Build an object directly into a stream.

Returns:None
parse(data, context=None, **kw)

Parse an in-memory buffer (often bytes object).

Strings, buffers, memoryviews, and other complete buffers can be parsed with this method.

parse_stream(stream, context=None, **kw)

Parse a stream.

Files, pipes, sockets, and other streaming sources of data are handled by this method.

sizeof(context=None, **kw)

Calculate the size of this object, optionally using a context.

Some constructs have no fixed size and can only know their size for a given hunk of data. These constructs will raise an error if they are not passed a context.

Parameters:context – a container
Returns:an integer for a fixed size field
Raises:SizeofError – the size could not be determined, ever or just with actual context
class construct.core.Default(subcon, value)

Allows to make a field have a default value, which comes handly when building a Struct from a dict with missing keys.

Building does not require a value, but can accept one.

Size is the same as subcon size.

Example:

>>> d = Struct(
...     "a" / Default(Byte, 0),
... )
>>> d.build(dict(a=1))
b'\x01'
>>> d.build(dict())
b'\x00'
class construct.core.Embedded(subcon)

Embeds a struct into the enclosing struct, merging fields. Can also embed sequences into sequences, merging items. Name is inherited from subcon.

Warning

You can use Embedded(Switch(...)) but not Switch(Embedded(...)). Sames applies to If and IfThenElse macros.

Parameters:subcon – the inner struct to embed inside outer struct or sequence

Example:

>>> d = Struct("a"/Byte, Embedded(Struct("b"/Byte)), "c"/Byte)
>>> d.parse(b"abc")
Container(a=97)(b=98)(c=99)
construct.core.EmbeddedBitStruct(*subcons)

Makes an embedded BitStruct.

See also

Uses Bitwise() and Embedded() and Struct().

Parameters:*subcons – the subcons that make up this structure

Example:

???
construct.core.Enum(subcon, default=NotImplemented, **mapping)

Set of named values mapping. Can build both from names and values.

Parameters:
  • subcon – the subcon to map
  • **mapping – keyword arguments which serve as the encoding mapping
  • default – an optional, keyword-only argument that specifies the default value to use when the mapping is undefined. if not given, and exception is raised when the mapping is undefined. use Pass topass the unmapped value as-is

Example:

>>> d = Enum(Byte, a=1, b=2)
>>> d.parse(b"\x01")
'a'
>>> d.parse(b"\x08")
construct.core.MappingError: no decoding mapping for 8
>>> d.build("a")
b'\x01'
>>> d.build(1)
b'\x01'
class construct.core.ExprAdapter(subcon, encoder, decoder)

A generic adapter that takes encoder and decoder as parameters. You can use ExprAdapter instead of writing a full-blown class when only a simple lambda is needed.

Parameters:
  • subcon – the subcon to adapt
  • encoder – a function that takes (obj, context) and returns an encoded version of obj, or None for identity
  • decoder – a function that takes (obj, context) and returns an decoded version of obj, or None for identity

Example:

Ident = ExprAdapter(Byte,
    encoder = lambda obj,ctx: obj+1,
    decoder = lambda obj,ctx: obj-1, )
class construct.core.ExprValidator(subcon, validator)

A generic adapter that takes validator as parameter. You can use ExprValidator instead of writing a full-blown class when only a simple expression is needed.

Parameters:
  • subcon – the subcon to adapt
  • encoder – a function that takes (obj, context) and returns a bool

Example:

OneOf = ExprValidator(Byte,
    validator = lambda obj,ctx: obj in [1,3,5])
construct.core.Filter(predicate, subcon)

Filters a list leaving only the elements that passed through the validator.

Parameters:
  • subcon – a construct to validate, usually a Range Array Sequence
  • predicate – a function taking (obj, context) and returning a bool

Example:

>>> d = Filter(obj_ != 0, Byte[:])
>>> d.parse(b"\x00\x02\x00")
[2]
>>> d.build([0,1,0,2,0])
b'\x01\x02'
class construct.core.FlagsEnum(subcon, **flags)

Set of flag values mapping. Each flag is extracted from the number, resulting in a FlagsContainer dict that has each key assigned True or False.

Parameters:
  • subcon – the subcon to extract
  • **flags – a dictionary mapping flag-names to their value

Example:

>>> d = FlagsEnum(Byte, a=1, b=2, c=4, d=8)
>>> d.parse(b"\x03")
Container(c=False)(b=True)(a=True)(d=False)
class construct.core.FocusedSeq(parsebuildfrom, *subcons, **kw)

Parses and builds a sequence where only one subcon value is returned from parsing or taken into building, other fields are parsed and discarded or built from nothing.

Parameters:
  • parsebuildfrom – which subcon to use, an integer index or string name, or a context lambda returning either
  • *subcons – a list of members
  • **kw – a list of members (works ONLY on python 3.6)

Excample:

>>> d = FocusedSeq("num", Const(b"MZ"), "num"/Byte, Terminated)
>>> d = FocusedSeq(1,     Const(b"MZ"), "num"/Byte, Terminated)

>>> d.parse(b"MZ\xff")
255
>>> d.build(255)
b'MZ\xff'
class construct.core.FormatField(endianity, format)

Field that uses struct module to pack and unpack data. This is used to implement basic Int* and Float* fields alongside BytesInteger.

See struct documentation for instructions on crafting format strings.

Parameters:
  • endianity – endianness character like < > =
  • format – format character like f d B H L Q b h l q

Example:

>>> d = FormatField(">", "H")
>>> d.parse(b"\x01\x00")
256
>>> d.build(256)
b"\x01\x00"
>>> d.sizeof()
2
construct.core.GreedyRange(subcon)

Homogenous array of elements that parses until end of stream and builds from all elements.

Operator [] can be used to make instances.

Parameters:subcon – the subcon to process individual elements

Example:

>>> d = GreedyRange(Byte) or Byte[:]
>>> d.build(range(8))
b'\x00\x01\x02\x03\x04\x05\x06\x07'
>>> d.parse(_)
[0, 1, 2, 3, 4, 5, 6, 7]

Alternative syntax (recommended):
>>> Byte[3:5], Byte[3:], Byte[:5], Byte[:]
construct.core.GreedyString(encoding=None)

String that reads the rest of the stream until EOF, and writes a given string as is. If no encoding is specified, this is essentially GreedyBytes.

Parameters:encoding – encoding (eg. “utf8”), or None for bytes

See also

Analog to GreedyBytes and the same when no enoding is used.

Example:

>>> d = GreedyString(encoding="utf8")
>>> d.build(u"Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd'
>>> d.parse(_)
u'Афон'
construct.core.Hex(subcon)

Adapter for hex-dumping bytes. It returns a hex dump when parsing, and un-dumps when building.

Example:

>>> d = Hex(GreedyBytes)
>>> d.parse(b"abcd")
b'61626364'
>>> d.build("01020304")
b'\x01\x02\x03\x04'
construct.core.HexDump(subcon, linesize=16)

Adapter for hex-dumping bytes. It returns a hex dump when parsing, and un-dumps when building.

Parameters:
  • linesize – default 16 bytes per line
  • buildraw – by default build takes the same format that parse returns, set to build from a bytes directly

Example:

>>> d = HexDump(Bytes(10))
>>> d.parse(b"12345abc;/")
'0000   31 32 33 34 35 61 62 63 3b 2f                     12345abc;/       \n'
construct.core.If(predicate, subcon)

An if-then conditional construct. If the context predicate indicates True, the subcon will be used for parsing and building, otherwise parsing returns None and building is no-op. Note that the predicate has no access to parsed value, it computes only on context.

Parameters:
  • predicate – a function taking context and returning a bool
  • subcon – the subcon that will be used if the predicate returns True

Example:

>>> d = If(this.x > 0, Byte)
>>> d.build(255, dict(x=1))
b'\xff'
>>> d.build(255, dict(x=0))
b''
construct.core.IfThenElse(predicate, thensubcon, elsesubcon)

An if-then-else conditional construct. One of the two subcons is used for parsing or building, depending whether the predicate returns a truthy or falsey value for given context. Constant truthy value can also be used.

Parameters:
  • predicate – a context function that returns a bool (or truthy value)
  • thensubcon – the subcon that will be used if the predicate indicates True
  • elsesubcon – the subcon that will be used if the predicate indicates False

Example:

>>> d = IfThenElse(this.x > 0, VarInt, Byte)
>>> d.build(255, dict(x=1))
b'\xff\x01'
>>> d.build(255, dict(x=0))
b'\xff'
class construct.core.Indexing(subcon, count, index, empty=None)

Adapter for indexing a list (getting a single item from that list). Works with Range and Sequence and their lazy equivalents.

Parameters:
  • subcon – the subcon to index
  • count – expected number of elements, needed during building
  • index – the index of the list to get
  • empty – value to fill the list with during building

Example:

???
class construct.core.LazyBound(subconfunc)

Lazy-bound construct that binds to the construct only at runtime. Useful for recursive data structures (like linked lists or trees), where a construct needs to refer to itself (while it does not exist yet).

Parameters:subconfunc – a context function returning a Construct (derived) instance, can also return Pass or itself

Example:

>>> d = Struct(
...     "value"/Byte,
...     "next"/If(this.value > 0, LazyBound(lambda ctx: d)),
... )
...
>>> d.parse(b"\x05\x09\x00")
Container(value=5)(next=Container(value=9)(next=Container(value=0)(next=None)))
...
>>> print(d.parse(b"\x05\x09\x00"))
Container:
    value = 5
    next = Container:
        value = 9
        next = Container:
            value = 0
            next = None
class construct.core.LazyRange(min, max, subcon)

Equivalent to Range construct, but members are parsed on demand. Works only with fixed size subcon. Entire parse is essentially one stream seek.

See also

Equivalent to Range().

class construct.core.LazySequence(*subcons, **kw)

Equivalent to Sequence construct, however fixed size members are parsed on demand, others are parsed immediately. If entire sequence is fixed size then entire parse is essentially one seek.

See also

Equivalent to Sequence().

class construct.core.LazyStruct(*subcons, **kw)

Equivalent to Struct construct, however fixed size members are parsed on demand, others are parsed immediately. If entire struct is fixed size then entire parse is essentially one stream seek.

See also

Equivalent to Struct().

class construct.core.Mapping(subcon, decoding, encoding, decdefault=NotImplemented, encdefault=NotImplemented)

Adapter that maps objects to other objects. Translates objects before parsing and before building.

Parameters:
  • subcon – the subcon to map
  • decoding – the decoding (parsing) mapping as a dict
  • encoding – the encoding (building) mapping as a dict
  • decdefault – the default return value when object is not found in the mapping, if no object is given an exception is raised, if Pass is used, the unmapped object will be passed as-is
  • encdefault – the default return value when object is not found in the mapping, if no object is given an exception is raised, if Pass is used, the unmapped object will be passed as-is

Example:

???
class construct.core.NamedTuple(tuplename, tuplefields, subcon)

Both arrays, structs, and sequences can be mapped to a namedtuple from collections module. To create a named tuple, you need to provide a name and a sequence of fields, either a string with space-separated names or a list of string names. Just like the standard namedtuple.

Raises:AdaptationError – when subcon is not either Struct Sequence Range

Example:

>>> d = NamedTuple("coord", "x y z", Byte[3])
>>> d = NamedTuple("coord", "x y z", Byte >> Byte >> Byte)
>>> d = NamedTuple("coord", "x y z", "x"/Byte + "y"/Byte + "z"/Byte)
>>> d.parse(b"123")
coord(x=49, y=50, z=51)
construct.core.NoneOf(subcon, invalids)

Validates that the object is none of the listed values, both during parsing and building.

Parameters:
  • subcon – a construct to validate
  • valids – a collection implementing __contains__
Raises:

ValidationError – when actual value is among invalids

See also

Analog of OneOf().

class construct.core.OnDemand(subcon)

Allows for on demand (lazy) parsing. When parsing, it will return a parameterless function that when called, will return the parsed value. Object is cached after first parsing, so non-deterministic subcons will be affected. Works only with fixed size subcon.

Parameters:subcon – the subcon to read/write on demand, must be fixed size

Example:

>>> d = OnDemand(Byte)
>>> d.parse(b"\xff")
<function OnDemand._parse.<locals>.<lambda> at 0x7fdc241cfc80>
>>> _()
255
>>> d.build(255)
b'\xff'

Can also re-build from the lambda returned at parsing.

>>> d.parse(b"\xff")
<function OnDemand._parse.<locals>.<lambda> at 0x7fcbd9855f28>
>>> d.build(_)
b'\xff'
construct.core.OnDemandPointer(offset, subcon)

On demand pointer. Is both lazy and jumps to a position before reading.

Warning

CURRENTLY BROKEN.

See also

Base OnDemand() and Pointer() construct.

Parameters:
  • offset – an integer or a context function that returns such an integer
  • subcon – subcon that will be parsed or build at the offset stream position

Example:

>>> d = OnDemandPointer(lambda ctx: 2, Byte)
>>> d.parse(b"\x01\x02\x03\x04\x05")
<function OnDemand._parse.<locals>.effectuate at 0x7f6f011ad510>
>>> _()
3
construct.core.OneOf(subcon, valids)

Validates that the object is one of the listed values, both during parsing and building. Note that providing a set instead of a list may increase performance.

Parameters:
  • subcon – a construct to validate
  • valids – a collection implementing __contains__
Raises:

ValidationError – when actual value is not among valids

Example:

>>> d = OneOf(Byte, [1,2,3])
>>> d.parse(b"\x01")
1
>>> d.parse(b"\xff")
construct.core.ValidationError: ('object failed validation', 255)

>>> d = OneOf(Bytes(2), b"1234567890")
>>> d.parse(b"78")
b'78'
>>> d.parse(b"19")
construct.core.ValidationError: ('invalid object', b'19')
construct.core.Optional(subcon)

Makes an optional construct, that tries to parse the subcon. If parsing fails, returns None. If building fails, writes nothing.

Size cannot be computed, because whether bytes are consumed or produced depends on actual data and context.

Parameters:subcon – the subcon to optionally parse or build

Example:

>>> d = Optional(Int64ul)
>>> d.parse(b"12345678")
4050765991979987505
>>> d.parse(b"")
None
>>> d.build(1)
b'\x01\x00\x00\x00\x00\x00\x00\x00'
>>> d.build(None)
b''
class construct.core.Padded(length, subcon, pattern='x00', strict=False)

Appends additional null bytes to achieve a fixed length. Fails if actual data is longer than specified length.

Raises:PaddingError – when parsed or build data is longer than the length

Example:

>>> d = Padded(4, Byte)
>>> d.build(255)
b'\xff\x00\x00\x00'
>>> d.parse(_)
255
>>> d.sizeof()
4

>>> d = Padded(4, VarInt)
>>> d.build(1)
b'\x01\x00\x00\x00'
>>> d.build(70000)
b'\xf0\xa2\x04\x00'
construct.core.Padding(length, pattern='\x00', strict=False)

Padding field that adds bytes when building, discards bytes when parsing.

Parameters:
  • length – length of the padding, an integer or a context function returning such an integer
  • pattern – padding pattern as bytes character, default is x00
  • strict – whether to verify during parsing that the stream contains the exact pattern, raises PaddingError if actual padding differs from the pattern, default is False
Raises:

PaddingError – when strict is set and actual parsed pattern differs from specified

Example:

>>> d = Padding(4, strict=True)
>>> d.build(None)
b'\x00\x00\x00\x00'
>>> d.parse(b"****")
construct.core.PaddingError: expected b'\x00\x00\x00\x00', found b'****'
>>> d.sizeof()
4
construct.core.PascalString(lengthfield, encoding=None)

Length-prefixed string. The length field can be variable length (such as VarInt) or fixed length (such as Int64ul). VarInt is recommended for new designs. Stored length is in bytes, not characters.

Parameters:
  • lengthfield – a field used to parse and build the length (eg. VarInt Int64ul)
  • encoding – encoding (eg. “utf8”), or None for bytes

Example:

>>> d = PascalString(VarInt, encoding="utf8")
>>> d.build(u"Афон")
b'\x08\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd'
>>> d.parse(_)
u'Афон'
class construct.core.Peek(subcon)

Peeks at the stream. Parses without changing the stream position. If the end of the stream is reached when reading, returns None. Building is no-op.

Size is defined as 0 because build does not put anything into the stream.

See also

The Union() class.

Parameters:subcon – the subcon to peek at

Example:

>>> d = Sequence(Peek(Int8ub), Peek(Int16ub))
>>> d.parse(b"\x01\x02")
[1, 258]
>>> d.sizeof()
0
class construct.core.Pointer(offset, subcon)

Changes the stream position to a given offset, where the construction should take place, and restores the stream position when finished.

Offset can also be nagative, indicating a position from EOF backwards.

Size is defined as unspecified, instead of previous 0.

See also

Analog to OnDemandPointer(), which also seeks to a given offset but returns a lazy lambda instead of constructing the value immediately.

Parameters:
  • offset – an integer or a context function that returns a stream position, where the construction would take place
  • subcon – the subcon to use at the offset

Example:

>>> d = Pointer(8, Bytes(1))
>>> d.parse(b"abcdefghijkl")
b'i'
>>> d.build(b"Z")
b'\x00\x00\x00\x00\x00\x00\x00\x00Z'
class construct.core.Prefixed(lengthfield, subcon, includelength=False)

Parses the length field. Then reads that amount of bytes and parses the subcon using only those bytes. Constructs that consume entire remaining stream are constrained to consuming only the specified amount of bytes. When building, data is prefixed by its length. Optionally, length field can include its own size.

See also

The VarInt encoding should be preferred over Int* fixed sized fields. VarInt is more compact and never overflows.

Parameters:
  • lengthfield – a subcon used for storing the length
  • subcon – the subcon used for storing the value
  • includelength – optional, whether length field should include own size

Example:

>>> Prefixed(VarInt, GreedyRange(Int32ul)).parse(b"\x08abcdefgh")
[1684234849, 1751606885]

>>> PrefixedArray(VarInt, Int32ul).parse(b"\x02abcdefgh")
[1684234849, 1751606885]
construct.core.PrefixedArray(lengthfield, subcon)

Homogenous array prefixed by item count (as opposed to prefixed by byte count, see Prefixed()).

Parameters:
  • lengthfield – field parsing and building an integer
  • subcon – subcon to process individual elements

Example:

>>> Prefixed(VarInt, GreedyRange(Int32ul)).parse(b"\x08abcdefgh")
[1684234849, 1751606885]

>>> PrefixedArray(VarInt, Int32ul).parse(b"\x02abcdefgh")
[1684234849, 1751606885]
class construct.core.Range(min, max, subcon)

Homogenous array of elements. The array will iterate through between min to max times. If an exception occurs (EOF, validation error) then repeater exits cleanly. If less than min or more than max elements have been successfully processed, error is raised.

Operator [] can be used to make instances.

See also

Analog to GreedyRange() that parses until end of stream.

Parameters:
  • min – the minimal count, an integer or a context lambda
  • max – the maximal count, an integer or a context lambda
  • subcon – the subcon to process individual elements
Raises:

RangeError – when consumed or produced too little or too many elements

Example:

>>> d = Range(3,5,Byte) or Byte[3:5]
>>> d.parse(b'\x01\x02\x03\x04')
[1,2,3,4]
>>> d.build([1,2,3,4])
b'\x01\x02\x03\x04'
>>> d.build([1,2])
construct.core.RangeError: expected from 3 to 5 elements, found 2
>>> d.build([1,2,3,4,5,6])
construct.core.RangeError: expected from 3 to 5 elements, found 6

Alternative syntax (recommended):
>>> Byte[3:5], Byte[3:], Byte[:5], Byte[:]
class construct.core.RawCopy(subcon)

Returns a dict containing both parsed subcon, the raw bytes that were consumed by it, starting and ending offset in the stream, and the amount of bytes. Builds either from raw bytes or a value used by subcon.

Context does contain a dict with data (if built from raw bytes) or with both (if built from value or parsed).

Raises:ConstructError – when building and neither data or value is given

Example:

>>> d = RawCopy(Byte)
>>> d.parse(b"\xff")
Container(data=b'\xff')(value=255)(offset1=0)(offset2=1)(length=1)
>>> d.build(dict(data=b"\xff"))
'\xff'
>>> d.build(dict(value=255))
'\xff'
class construct.core.Rebuffered(subcon, tailcutoff=None)

Caches bytes from the underlying stream, so it becomes seekable and tellable. Also makes the stream blocking, in case it came from a socket or a pipe. Optionally, stream can forget bytes that went a certain amount of bytes beyond the current offset, allowing only a limited seeking capability while allowing to process an endless stream.

Warning

Experimental implementation. May not be mature enough.

Parameters:
  • subcon – the subcon which will operate on the buffered stream
  • tailcutoff – optional, amount of bytes kept in buffer, by default buffers everything

Example:

Rebuffered(..., tailcutoff=1024).parse_stream(endless_nonblocking_stream)
class construct.core.Rebuild(subcon, func)

Parses the field like normal, but computes the value used for building from a context function. Constant value can also be used instead.

Building does not require a value, because the value gets recomputed anyway.

Size is the same as subcon size.

See also

Useful for length and count fields when Prefixed and PrefixedArray cannot be used.

Example:

>>> d = Struct(
...     "count" / Rebuild(Byte, len_(this.items)),
...     "items" / Byte[this.count],
... )
>>> d.build(dict(items=[1,2,3]))
b'\x03\x01\x02\x03'
class construct.core.Renamed(newname, subcon)

Renames an existing construct. This creates a wrapper so underlying subcon retains it’s original name, which by default is just None. Can be used to give same construct few different names. Used internally by / operator.

Also this wrapper is responsible for building a path (a chain of names) that gets attached to error message when parsing, building, or sizeof fails. Fields that are not named do not appear in the path string.

Parameters:
  • newname – the new name, as string
  • subcon – the subcon to rename

Example:

>>> "name" / Int32ul
<Renamed: name>
class construct.core.RepeatUntil(predicate, subcon)

Homogenous array that repeats until the predicate indicates it to stop. Note that the last element (which caused the repeat to exit) is included in the return list.

Parameters:
  • predicate – a predicate function that takes (obj, list, context) and returns True to break or False to continue (or a truthy value)
  • subcon – the subcon used to parse and build each element

Example:

>>> d = RepeatUntil(lambda x,lst,ctx: x>7, Byte)
>>> d.build(range(20))
b'\x00\x01\x02\x03\x04\x05\x06\x07\x08'
>>> d.parse(b"\x01\xff\x02")
[1, 255]

>>> d = RepeatUntil(lambda x,lst,ctx: lst[-2:]==[0,0], Byte)
>>> d.parse(b"\x01\x00\x00\xff")
[1, 0, 0]
class construct.core.Restreamed(subcon, encoder, encoderunit, decoder, decoderunit, sizecomputer)

Transforms bytes between the underlying stream and the subcon.

When the parsing or building is done, the wrapper stream is closed. If read buffer or write buffer is not empty, error is raised.

See also

Both Bitwise() and Bytewise() are implemented using Restreamed.

Warning

Remember that subcon must consume or produce an amount of bytes that is a multiple of encoding or decoding units. For example, in a Bitwise context you should process a multiple of 8 bits or the stream will fail after parsing/building. Also do NOT use pointers inside.

Parameters:
  • subcon – the subcon which will operate on the buffer
  • encoder – a function that takes bytes and returns bytes (used when building)
  • encoderunit – ratio as integer, encoder takes that many bytes at once
  • decoder – a function that takes bytes and returns bytes (used when parsing)
  • decoderunit – ratio as integer, decoder takes that many bytes at once
  • sizecomputer – a function that computes amount of bytes outputed by some bytes

Example:

Bitwise  <--> Restreamed(subcon, bits2bytes, 8, bytes2bits, 1, lambda n: n//8)
Bytewise <--> Restreamed(subcon, bytes2bits, 1, bits2bytes, 8, lambda n: n*8)
class construct.core.Seek(at, whence=0)

Sets a new stream position when parsing or building. Seeks are useful when many other fields follow the jump. Pointer works when there is only one field to look at, but when there is more to be done, Seek may come useful.

See also

Analog Pointer() wrapper that has same side effect but also processed a subcon.

Parameters:
  • at – where to jump to, can be an integer or a context lambda returning such an integer
  • whence – is the offset from beginning (0) or from current position (1) or from ending (2), can be an integer or a context lambda returning such an integer, default is 0

Example:

>>> d = (Seek(5) >> Byte)
>>> d.parse(b"01234x")
[5, 120]

>>> d = (Bytes(10) >> Seek(5) >> Byte)
>>> d.build([b"0123456789", None, 255])
b'01234\xff6789'
class construct.core.Select(*subcons, **kw)

Selects the first matching subconstruct. It will literally try each of the subconstructs, until one matches.

Parameters:
  • subcons – the subcons to try (order sensitive)
  • includename – indicates whether to include the name of the selected subcon in the return value of parsing, default is False

Example:

>>> d = Select(Int32ub, CString(encoding="utf8"))
>>> d.build(1)
b'\x00\x00\x00\x01'
>>> d.build(u"Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd\x00'

Alternative syntax, note this works ONLY on python 3.6+:
>>> Select(num=Int32ub, text=CString(encoding="utf8"))
class construct.core.Sequence(*subcons, **kw)

Sequence of unnamed constructs. The elements are parsed and built in the order they are defined.

Size is the sum of all subcon sizes.

Operator >> can also be used to make Sequences.

See also

Can be nested easily, and embedded using Embedded() wrapper that merges entries into parent’s entries.

Parameters:subcons – subcons that make up this sequence

Example:

>>> d = (Byte >> Byte)
>>> d.parse(b'\x01\x02')
[1, 2]
>>> d.build([1, 2])
b'\x01\x02'
>>> d.sizeof()
2

>>> d = Sequence(Byte, CString(), Float32b)
>>> d.build([255, b"hello", 123.0])
b'\xffhello\x00B\xf6\x00\x00'
>>> d.parse(_)
[255, b'hello', 123.0]

Alternative syntax (not recommended):
>>> (Byte >> "Byte >> "c"/Byte >> "d"/Byte)

Alternative syntax, note this works ONLY on python 3.6+:
>>> Sequence(a=Byte, b=Byte, c=Byte, d=Byte)
class construct.core.Slicing(subcon, count, start, stop, step=1, empty=None)

Adapter for slicing a list (getting a slice from that list). Works with Range and Sequence and their lazy equivalents.

Parameters:
  • subcon – the subcon to slice
  • count – expected number of elements, needed during building
  • start – start index (or None for entire list)
  • stop – stop index (or None for up-to-end)
  • step – step (or 1 for every element)
  • empty – value to fill the list with during building

Example:

???
class construct.core.StopIf(condfunc)

Checks for a condition, and stops a Struct/Sequence/Range from parsing or building further.

Parameters:condfunc – a context function returning a bool (or truthy value)

Example:

Struct('x'/Byte, StopIf(this.x == 0), 'y'/Byte)
Sequence('x'/Byte, StopIf(this.x == 0), 'y'/Byte)
GreedyRange(FocusedSeq(0, 'x'/Byte, StopIf(this.x == 0)))
construct.core.String(length, encoding=None, padchar='\x00', paddir='right', trimdir='right')

Configurable, fixed-length or variable-length string field.

When parsing, the byte string is stripped of pad character (as specified) from the direction (as specified) then decoded (as specified). Length is a constant integer or a context function. When building, the string is encoded (as specified) then padded (as specified) from the direction (as specified) or trimmed (as specified).

The padding character and direction must be specified for padding to work. The trim direction must be specified for trimming to work.

If encoding is not specified, it works with bytes (not unicode strings).

Parameters:
  • length – length in bytes (not unicode characters), as integer or context function
  • encoding – encoding (eg. “utf8”) or None for bytes
  • padchar – bytes character to pad out strings (by default b”x00”)
  • paddir – direction to pad out strings (one of: right left both)
  • trimdir – direction to trim strings (one of: right left)

Example:

>>> d = String(10)
>>> d.build(b"hello")
b'hello\x00\x00\x00\x00\x00'
>>> d.parse(_)
b'hello'
>>> d.sizeof()
10

>>> d = String(10, encoding="utf8")
>>> d.build(u"Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd\x00\x00'
>>> d.parse(_)
u'Афон'

>>> d = String(10, padchar=b"XYZ", paddir="center")
>>> d.build(b"abc")
b'XXXabcXXXX'
>>> d.parse(b"XYZabcXYZY")
b'abc'

>>> d = String(10, trimdir="right")
>>> d.build(b"12345678901234567890")
b'1234567890'
class construct.core.StringEncoded(subcon, encoding)

Used internally.

class construct.core.StringPaddedTrimmed(length, subcon, padchar='x00', paddir='right', trimdir='right')

Used internally.

class construct.core.Struct(*subcons, **kw)

Sequence of usually named constructs, similar to structs in C. The elements are parsed and built in the order they are defined.

Some fields do not need to be named, since they are built without value anyway. See Const Padding Pass Terminated for examples of such fields.

Size is the sum of all subcon sizes.

Operator + can also be used to make Structs.

See also

Can be nested easily, and embedded using Embedded() wrapper that merges members into parent’s members.

Parameters:subcons – subcons that make up this structure

Example:

>>> d = Struct("a"/Int8ul, "data"/Bytes(2), "data2"/Bytes(this.a))
>>> d.parse(b"\x01abc")
Container(a=1)(data=b'ab')(data2=b'c')
>>> d.build(_)
b'\x01abc'
>>> d.build(dict(a=5, data=b"??", data2=b"hello"))
b'\x05??hello'

>>> d = Struct(Const(b"MZ"), Padding(2), Pass, Terminated)
>>> d.build({})
b'MZ\x00\x00'
>>> d.parse(_)
Container()
>>> d.sizeof()
4

Alternative syntax (not recommended):
>>> ("a"/Byte + "b"/Byte + "c"/Byte + "d"/Byte)

Alternative syntax, note this works ONLY on python 3.6+:
>>> Struct(a=Byte, b=Byte, c=Byte, d=Byte)
class construct.core.Subconstruct(subcon)

Abstract subconstruct (wraps an inner construct, inheriting its name and flags). Parsing and building is by default deferred to subcon, same as sizeof.

Parameters:subcon – the construct to wrap
class construct.core.Switch(keyfunc, cases, default=<NoDefault: None>, includekey=False)

A conditional branch. Switch will choose the case to follow based on the return value of keyfunc. If no case is matched and no default value is given, SwitchError will be raised.

Warning

You can use Embedded(Switch(...)) but not Switch(Embedded(...)). Same applies to If and IfThenElse macros.

Parameters:
  • keyfunc – a context function that returns a key which will choose a case, or a constant
  • cases – a dictionary mapping keys to subcons
  • default – a default field to use when the key is not found in the cases. if not supplied, an exception will be raised when the key is not found. Pass can be used for do-nothing
  • includekey – whether to include the key in the return value of parsing, default is False
Raises:

SwitchError – when actual value is not in the dict nor a default is given

Example:

>>> d = Switch(this.n, { 1:Int8ub, 2:Int16ub, 4:Int32ub })
>>> d.build(5, dict(n=1))
b'\x05'
>>> d.build(5, dict(n=4))
b'\x00\x00\x00\x05'
class construct.core.SymmetricAdapter(subcon)

Abstract adapter parent class.

Needs to implement _decode() only, for both parsing and building.

Parameters:subcon – the construct to wrap
construct.core.SymmetricMapping(subcon, mapping, default=NotImplemented)

Defines a symmetric mapping, same mapping is used on parsing and building.

See also

Based on Mapping().

Parameters:
  • subcon – the subcon to map
  • encoding – the mapping as a dict
  • decdefault – the default return value when object is not found in the mapping, if no object is given an exception is raised, if Pass is used, the unmapped object will be passed as-is

Example:

???
class construct.core.Tunnel(subcon)

Abstract class that reads entire stream when parsing, and writes all data when building, but serves as an adapter as well.

Needs to implement _decode() for parsing and _encode() for building.

class construct.core.Union(parsefrom, *subcons, **kw)

Treats the same data as multiple constructs (similar to C union statement) so you can look at the data in multiple views.

When parsing, all fields read the same data bytes, but stream remains at initial offset, unless parsefrom selects a subcon by index or name. When building, the first subcon that can find an entry in the dict (or builds from None, so it does not require an entry) is automatically selected.

Warning

If you skip the parsefrom parameter then stream will be left back at the starting offset. Many users fail to use this properly.

Parameters:
  • parsefrom – how to leave stream after parsing, can be integer index or string name selecting a subcon, None (leaves stream at initial offset, the default), a context lambda returning either of previously mentioned
  • subcons – subconstructs (order and name sensitive)

Example:

>>> d = Union(0, "raw"/Bytes(8), "ints"/Int32ub[2], "shorts"/Int16ub[4], "chars"/Byte[8])
>>> d.parse(b"12345678")
Container(raw=b'12345678')(ints=[825373492, 892745528])(shorts=[12594, 13108, 13622, 14136])(chars=[49, 50, 51, 52, 53, 54, 55, 56])
>>> d.build(dict(chars=range(8)))
b'\x00\x01\x02\x03\x04\x05\x06\x07'

Alternative syntax, note this works ONLY on python 3.6+:
>>> Union(0, raw=Bytes(8), ints=Int32ub[2], shorts=Int16ub[4], chars=Byte[8])
class construct.core.Validator(subcon)

Abstract class that validates a condition on the encoded/decoded object.

Needs to implement _validate() that returns a bool (or a truthy value)

Parameters:subcon – the subcon to validate
construct.core.setglobalstringencoding(encoding)

Sets the encoding globally for all String/PascalString/CString/GreedyString instances.

Parameters:encoding – a string like “utf8”, or None which means working with bytes (not unicode)