Core API: Miscellaneous

construct.Const(value, subcon=None)

Field enforcing a constant. It is used for file signatures, to validate that the given pattern exists. Data in the stream must strictly match the specified value.

Note that a variable-sized subcon may still provide positive verification. Const does not consume a precomputed number of bytes, but depends on the subcon to read the appropriate amount (e.g. VarInt is acceptable). Whatever the subcon parses into gets compared against the specified value.

Parses using subcon and returns its value (after checking). Builds using subcon from nothing (or from the given object, if not None). Size is the same as subcon, unless it raises SizeofError.

Parameters:
  • value – expected value, usually a bytes literal
  • subcon – optional, Construct instance, subcon used to build value from, assumed to be Bytes if value parameter was a bytes literal
Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes
  • ConstError – parsed data does not match specified value, or building from wrong value
  • StringError – building from non-bytes value, perhaps unicode

Example:

>>> d = Const(b"IHDR")
>>> d.build(None)
b'IHDR'
>>> d.parse(b"JPEG")
construct.core.ConstError: expected b'IHDR' but parsed b'JPEG'

>>> d = Const(255, Int32ul)
>>> d.build(None)
b'\xff\x00\x00\x00'
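
The check that Const performs with the default Bytes subcon can be approximated in plain Python. This is only a sketch of the idea, not construct's implementation: `read_const` is a hypothetical helper, it always reads `len(expected)` bytes (a variable-sized subcon would decide the amount itself), and it raises built-in exceptions where construct would raise StreamError or ConstError.

```python
import io

def read_const(stream, expected):
    """Read len(expected) bytes and require that they equal expected."""
    data = stream.read(len(expected))
    if len(data) != len(expected):
        # construct reports this case as StreamError
        raise EOFError("could not read enough bytes")
    if data != expected:
        # construct reports this case as ConstError
        raise ValueError("expected %r but parsed %r" % (expected, data))
    return data

stream = io.BytesIO(b"IHDR\x00\x00\x00\r")
signature = read_const(stream, b"IHDR")  # consumes exactly four bytes
remaining = stream.read()                # the rest of the stream is untouched
```

A mismatching signature such as b"JPEG" makes the helper raise instead of returning.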

construct.Computed(func)

Field computing a value from the context dictionary or from some outer source such as os.urandom or the random module. The underlying byte stream is unaffected. The source can be non-deterministic.

Parsing and building return the value returned by the context lambda (although a constant value can also be used). Size is defined as 0 because parsing and building do not consume bytes from the stream or write bytes to it.

Parameters:
  • func – context lambda or constant value

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Struct(
...     "width" / Byte,
...     "height" / Byte,
...     "total" / Computed(this.width * this.height),
... )
>>> d.build(dict(width=4,height=5))
b'\x04\x05'
>>> d.parse(b"12")
Container(width=49, height=50, total=2450)
>>> d = Computed(7)
>>> d.parse(b"")
7
>>> d = Computed(lambda ctx: 7)
>>> d.parse(b"")
7
>>> import os
>>> d = Computed(lambda ctx: os.urandom(10))
>>> d.parse(b"")
b'\x98\xc2\xec\x10\x07\xf5\x8e\x98\xc2\xec'

construct.Index()

Indexes a field inside an outer Array, GreedyRange, or RepeatUntil context.

Note that you can use this class, or use the this._index expression instead, depending on how it is used. See the examples.

Parsing and building pull the _index key from the context. Size is 0 because the stream is unaffected.

Raises:
  • IndexFieldError – did not find either key in context

Example:

>>> d = Array(3, Index)
>>> d.parse(b"")
[0, 1, 2]
>>> d = Array(3, Struct("i" / Index))
>>> d.parse(b"")
[Container(i=0), Container(i=1), Container(i=2)]

>>> d = Array(3, Computed(this._index+1))
>>> d.parse(b"")
[1, 2, 3]
>>> d = Array(3, Struct("i" / Computed(this._._index+1)))
>>> d.parse(b"")
[Container(i=1), Container(i=2), Container(i=3)]
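
The relationship between a repeating construct and the _index key can be pictured with a small plain-Python loop. This is a sketch under the assumption that the repeater stores the current position in the context before each item is processed; `parse_repeated` is a hypothetical helper, not part of construct.

```python
def parse_repeated(count, parse_item):
    """Call parse_item(ctx) count times, exposing the position as ctx['_index']."""
    ctx = {}
    items = []
    for i in range(count):
        ctx["_index"] = i          # what an Index field would pull from the context
        items.append(parse_item(ctx))
    return items

plain = parse_repeated(3, lambda ctx: ctx["_index"])        # like Array(3, Index)
shifted = parse_repeated(3, lambda ctx: ctx["_index"] + 1)  # like Computed(this._index+1)
```

No bytes are read in either case, which is why the size of such a field is 0.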

construct.Rebuild(subcon, func)

Field where building does not require a value, because the value gets recomputed when needed. Comes in handy when building a Struct from a dict with missing keys. Useful for length and count fields when Prefixed and PrefixedArray cannot be used.

Parsing defers to subcon. Building is deferred to subcon, but it builds from a value provided by the context lambda (or constant). Size is the same as subcon, unless it raises SizeofError.

The difference between Default and Rebuild is that in the former the build value is optional, while in the latter the build value is ignored.

Parameters:
  • subcon – Construct instance
  • func – context lambda or constant value
Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Struct(
...     "count" / Rebuild(Byte, len_(this.items)),
...     "items" / Byte[this.count],
... )
>>> d.build(dict(items=[1,2,3]))
b'\x03\x01\x02\x03'
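
The same count-prefixed layout can be written by hand with the standard struct module, which shows what Rebuild saves the caller from doing. This is a plain-Python sketch, not construct code; `build_counted` and `parse_counted` are hypothetical names.

```python
import struct

def build_counted(items):
    """Build a one-byte count followed by the items, recomputing the count."""
    count = len(items)                 # never supplied by the caller
    return struct.pack("B", count) + bytes(items)

def parse_counted(data):
    """Read the one-byte count, then that many single-byte items."""
    count = data[0]
    return list(data[1:1 + count])

encoded = build_counted([1, 2, 3])
decoded = parse_counted(encoded)
```

The caller passes only the items; the count field is derived from them at build time.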

construct.Default(subcon, value)

Field where building does not require a value, because the value gets taken from the default. Comes in handy when building a Struct from a dict with missing keys.

Parsing defers to subcon. Building is deferred to subcon, but it builds from the default (if the given object is None) or from the given object. Building does not require a value, but can accept one. Size is the same as subcon, unless it raises SizeofError.

The difference between Default and Rebuild is that in the former the build value is optional, while in the latter the build value is ignored.

Parameters:
  • subcon – Construct instance
  • value – context lambda or constant value
Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Struct(
...     "a" / Default(Byte, 0),
... )
>>> d.build(dict(a=1))
b'\x01'
>>> d.build(dict())
b'\x00'
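
The "optional versus ignored" distinction between Default and Rebuild can be stated as two tiny functions. These are hypothetical helpers written for illustration only; they model how the value to build is chosen, not how construct is implemented.

```python
def resolve_default(obj, default):
    """Default: use the caller's value unless it is None."""
    return default if obj is None else obj

def resolve_rebuild(obj, func, ctx):
    """Rebuild: ignore the caller's value and recompute it from the context."""
    return func(ctx) if callable(func) else func

ctx = {"items": [1, 2, 3]}
missing = resolve_default(None, 0)                                   # falls back to 0
given = resolve_default(1, 0)                                        # caller's 1 wins
recomputed = resolve_rebuild(99, lambda c: len(c["items"]), ctx)     # 99 is discarded
```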

construct.Check(func)

Checks for a condition, and raises CheckError if the check fails.

Parsing and building return nothing (but check the condition). Size is 0 because stream is unaffected.

Parameters:
  • func – bool or context lambda, that gets run on parsing and building
Raises:
  • CheckError – lambda returned false

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

Check(lambda ctx: len(ctx.payload.data) == ctx.payload_len)
Check(len_(this.payload.data) == this.payload_len)
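
The behaviour described above (evaluate a bool or a context lambda, return nothing, raise on failure) can be sketched in plain Python. `check` is a hypothetical stand-in that uses ValueError where construct raises CheckError, and the context is modelled as nested dicts.

```python
def check(func, ctx):
    """Evaluate func (a bool or a callable taking ctx); raise if it is false."""
    ok = func(ctx) if callable(func) else func
    if not ok:
        raise ValueError("check failed")  # construct raises CheckError here
    return None                           # nothing is returned on success

good_ctx = {"payload_len": 4, "payload": {"data": b"abcd"}}
bad_ctx = {"payload_len": 5, "payload": {"data": b"abcd"}}

def length_matches(ctx):
    return len(ctx["payload"]["data"]) == ctx["payload_len"]

result = check(length_matches, good_ctx)  # passes silently
```

Calling `check(length_matches, bad_ctx)` raises, because the declared length does not match the payload.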

construct.Error()

Raises ExplicitError, unconditionally.

Parsing and building always raise ExplicitError. Size is undefined.

Raises:
  • ExplicitError – unconditionally, on parsing and building

Example:

>>> d = Struct("num"/Byte, Error)
>>> d.parse(b"data...")
construct.core.ExplicitError: Error field was activated during parsing

construct.FocusedSeq(parsebuildfrom, *subcons, **subconskw)

Allows constructing more elaborate “adapters” than the Adapter class.

Parsing parses all subcons in sequence, but returns only the element that was selected (the other values are discarded). Building builds all subcons in sequence, where each gets built from nothing (except the selected subcon, which is given the object). Size is the sum of all subcon sizes, unless any subcon raises SizeofError.

This class does context nesting, meaning its members are given access to a new dictionary where the “_” entry points to the outer context. When parsing, each member gets parsed and the subcon's parse return value is inserted into the context under the matching key, only if the member was named. When building, the matching entry gets inserted into the context before the subcon gets built, and if the subcon's build returns a new value (not None), that value replaces the entry in the context.

This class supports embedding. Embedded semantics dictate that during instance creation (in the ctor), each field is checked for the embedded flag, and its subcon members are merged. This changes the behavior of some code examples. Only a few classes are supported: Struct, Sequence, FocusedSeq, Union, and LazyStruct, although those can be used interchangeably (a Struct can embed a Sequence, or rather its members).

This class exposes subcons as attributes. You can refer to subcons that were inlined (and therefore do not exist as a variable in the namespace) by accessing the struct attributes, under the same name. Also note that the compiler does not support this feature. See examples.

This class exposes subcons in the context. You can refer to subcons that were inlined (and therefore do not exist as a variable in the namespace) within other inlined fields using the context. Note that you need to use a lambda (the this expression is not supported). Also note that the compiler does not support this feature. See examples.

This class is used internally to implement PrefixedArray.

Parameters:
  • parsebuildfrom – string name or context lambda, selects a subcon
  • *subcons – Construct instances, list of members, some can be named
  • **subconskw – Construct instances, list of members (requires Python 3.6)
Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes
  • UnboundLocalError – selector does not match any subcon

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = FocusedSeq("num", Const(b"SIG"), "num"/Byte, Terminated)
>>> d.parse(b"SIG\xff")
255
>>> d.build(255)
b'SIG\xff'

>>> d = FocusedSeq("animal",
...     "animal" / Enum(Byte, giraffe=1),
... )
>>> d.animal.giraffe
'giraffe'

>>> d = FocusedSeq("count",
...     "count" / Byte,
...     "data" / Padding(lambda this: this.count - this._subcons.count.sizeof()),
... )
>>> d.build(4)
b'\x04\x00\x00\x00'

PrefixedArray <--> FocusedSeq("items",
    "count" / Rebuild(lengthfield, len_(this.items)),
    "items" / subcon[this.count],
)
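
The first example above (a signature, a focused byte, and an end-of-stream check) can be mirrored by hand to show what "parse everything, return one element" means. This is a plain-Python sketch with hypothetical function names and built-in exceptions, not construct's implementation.

```python
import io

def parse_focused(data):
    """Parse signature, number and terminator; return only the number."""
    stream = io.BytesIO(data)
    if stream.read(3) != b"SIG":          # parsed and checked, then discarded
        raise ValueError("signature mismatch")
    num = stream.read(1)
    if len(num) != 1:
        raise EOFError("could not read enough bytes")
    if stream.read(1):                    # stands in for Terminated
        raise ValueError("expected end of stream")
    return num[0]                         # only the selected element is returned

def build_focused(num):
    """Build every member; only the number comes from the caller."""
    return b"SIG" + bytes([num])

parsed = parse_focused(b"SIG\xff")
built = build_focused(255)
```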

construct.Pickled()

Preserves arbitrary Python objects.

Parses using the pickle.load() function and builds using the pickle.dump() function, with the default pickle binary protocol. Size is undefined.

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

Can propagate pickle.load() and pickle.dump() exceptions.

Example:

>>> x = [1, 2.3, {}]
>>> Pickled.build(x)
b'\x80\x03]q\x00(K\x01G@\x02ffffff}q\x01e.'
>>> Pickled.parse(_)
[1, 2.3, {}]
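
The underlying round trip can be reproduced with the standard pickle module alone. This sketch uses hypothetical wrapper names and an in-memory stream; the exact bytes produced depend on the default pickle protocol of the Python version in use, so they may differ from the output shown above.

```python
import io
import pickle

def build_pickled(obj):
    """Serialize obj into bytes with pickle.dump() and the default protocol."""
    stream = io.BytesIO()
    pickle.dump(obj, stream)
    return stream.getvalue()

def parse_pickled(data):
    """Deserialize one object from bytes with pickle.load()."""
    return pickle.load(io.BytesIO(data))

original = [1, 2.3, {}]
restored = parse_pickled(build_pickled(original))

# The encoded length depends on the object, which is why no fixed size exists.
sizes_differ = len(build_pickled(original)) != len(build_pickled(list(range(100))))
```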

construct.Numpy()

Preserves numpy arrays (shape, dtype, and values).

Parses using the numpy.load() function and builds using the numpy.save() function, with the Numpy binary protocol. Size is undefined.

Raises:
  • ImportError – numpy could not be imported during parsing or building
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

Can propagate numpy.load() and numpy.save() exceptions.

Example:

>>> import numpy
>>> a = numpy.asarray([1,2,3])
>>> Numpy.build(a)
b"\x93NUMPY\x01\x00F\x00{'descr': '<i8', 'fortran_order': False, 'shape': (3,), }            \n\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00"
>>> Numpy.parse(_)
array([1, 2, 3])

construct.NamedTuple(tuplename, tuplefields, subcon)

Arrays, structs, and sequences can all be mapped to a namedtuple from the collections module. To create a named tuple, you need to provide a name and a sequence of fields, either a string with space-separated names or a list of string names, like the standard namedtuple.

Parses into a collections.namedtuple instance, and builds from such an instance (although it also builds from lists and dicts). Size is undefined.

Parameters:
  • tuplename – string
  • tuplefields – string or list of strings
  • subcon – Construct instance, either Struct, Sequence, Array, or GreedyRange
Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes
  • NamedTupleError – subcon is neither Struct, Sequence, Array, nor GreedyRange

Can propagate collections exceptions.

Example:

>>> d = NamedTuple("coord", "x y z", Byte[3])
>>> d = NamedTuple("coord", "x y z", Byte >> Byte >> Byte)
>>> d = NamedTuple("coord", "x y z", "x"/Byte + "y"/Byte + "z"/Byte)
>>> d.parse(b"123")
coord(x=49, y=50, z=51)
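
The mapping itself relies on the standard collections.namedtuple, which accepts both positional values (as an array or sequence would provide) and keyword values (as a struct would provide). The sketch below shows only that standard-library behaviour; the variable names are illustrative.

```python
from collections import namedtuple

# Same name and space-separated field string as in the example above.
Coord = namedtuple("coord", "x y z")

# List-like parse results map by position; iterating bytes yields integers.
from_array = Coord(*b"123")

# Dict-like parse results map by field name.
from_struct = Coord(**{"x": 49, "y": 50, "z": 51})
```

Both forms yield equal tuples, and fields are then available by name, for example `from_array.y`.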

construct.Timestamp(subcon, unit, epoch)

Datetime, represented as an Arrow object.

Note that accuracy is not guaranteed: building rounds the value to an integer (even when a Float subcon is used), floating-point errors apply in general, and the MSDOS scheme has only a 5-bit (32 values) seconds field, so seconds are rounded to a multiple of 2.

Unit is a fraction of a second. 1 is second resolution, 10**-3 is millisecond resolution, 10**-6 is microsecond resolution, etc. Usually it is 1 on Unix and MacOSX, 10**-7 on Windows. Epoch is a year (if integer) or a specific day (if Arrow object). Usually it is 1970 on Unix, 1904 on MacOSX, 1601 on Windows. The MSDOS format does not support a custom unit or epoch; it uses 2-second resolution and the 1980 epoch.

Parameters:
  • subcon – Construct instance like Int* Float*, or Int32ub with msdos format
  • unit – integer or float, or msdos string
  • epoch – integer, or Arrow instance, or msdos string
Raises:
  • ImportError – arrow could not be imported during ctor
  • TimestampError – subcon is not a Construct instance
  • TimestampError – unit or epoch is a wrong type

Example:

>>> d = Timestamp(Int64ub, 1., 1970)
>>> d.parse(b'\x00\x00\x00\x00ZIz\x00')
<Arrow [2018-01-01T00:00:00+00:00]>
>>> d = Timestamp(Int32ub, "msdos", "msdos")
>>> d.parse(b'H9\x8c"')
<Arrow [2016-01-25T17:33:04+00:00]>
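
The arithmetic behind both examples can be checked with the standard datetime module. This is a sketch that returns datetime objects rather than Arrow objects; the function names are hypothetical, and the MSDOS bit layout used here (date in the high 16 bits, time in the low 16 bits of a big-endian 32-bit integer) is an assumption that happens to reproduce the output shown above.

```python
from datetime import datetime, timedelta, timezone

def decode_epoch_timestamp(raw, unit, epoch_year):
    """Interpret raw as a big-endian count of `unit`-second ticks since Jan 1 of epoch_year."""
    count = int.from_bytes(raw, "big")
    epoch = datetime(epoch_year, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=count * unit)

def decode_msdos_timestamp(raw):
    """Interpret raw as an MSDOS date (high 16 bits) and time (low 16 bits)."""
    value = int.from_bytes(raw, "big")
    date, time = value >> 16, value & 0xFFFF
    return datetime(
        1980 + (date >> 9),      # 7-bit year offset from 1980
        (date >> 5) & 0x0F,      # 4-bit month
        date & 0x1F,             # 5-bit day
        time >> 11,              # 5-bit hour
        (time >> 5) & 0x3F,      # 6-bit minute
        (time & 0x1F) * 2,       # 5-bit field counting 2-second steps
        tzinfo=timezone.utc,
    )

unix_time = decode_epoch_timestamp(b"\x00\x00\x00\x00ZIz\x00", 1, 1970)
msdos_time = decode_msdos_timestamp(b'H9\x8c"')
```

The last field of the MSDOS time shows why odd seconds cannot be represented: the stored value is multiplied by 2.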

construct.Hex(subcon)

Adapter for displaying a hexadecimal/hexlified representation of integers, bytes, and RawCopy dictionaries.

Parsing results in an int-like, bytes-like, or dict-like object, whose only difference from the original is pretty-printing. If you look at the result, you will be presented with its repr, which remains as-is. If you print it, you will see its str, which is a hexlified representation. Building and sizeof defer to subcon.

To obtain a hexlified string (as before Hex and HexDump changed semantics), use binascii.(un)hexlify on the parsed results.

Example:

>>> d = Hex(Int32ub)
>>> obj = d.parse(b"\x00\x00\x01\x02")
>>> obj
258
>>> print(obj)
0x00000102

>>> d = Hex(GreedyBytes)
>>> obj = d.parse(b"\x00\x00\x01\x02")
>>> obj
b'\x00\x00\x01\x02'
>>> print(obj)
unhexlify('00000102')

>>> d = Hex(RawCopy(Int32ub))
>>> obj = d.parse(b"\x00\x00\x01\x02")
>>> obj
{'data': b'\x00\x00\x01\x02',
 'length': 4,
 'offset1': 0,
 'offset2': 4,
 'value': 258}
>>> print(obj)
unhexlify('00000102')
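
The "same value, different str" idea can be illustrated with an int subclass, alongside the binascii call mentioned above. `HexInt` is a hypothetical class written for this sketch; it is not the type construct returns.

```python
import binascii

class HexInt(int):
    """Integer whose str() is zero-padded hex while repr() stays decimal."""

    def __new__(cls, value, size):
        obj = super().__new__(cls, value)
        obj.size = size                     # field width in bytes
        return obj

    def __str__(self):
        # Two hex digits per byte, e.g. 4 bytes -> 8 digits.
        return "0x" + format(int(self), "0%dx" % (2 * self.size))

raw = b"\x00\x00\x01\x02"
obj = HexInt(int.from_bytes(raw, "big"), len(raw))
hexlified = binascii.hexlify(raw)           # plain hexlified bytes
```

The object still behaves as an ordinary integer in arithmetic and comparisons; only printing differs.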

construct.HexDump(subcon)

Adapter for displaying a hexlified representation of bytes and RawCopy dictionaries.

Parsing results in a bytes-like or dict-like object, whose only difference from the original is pretty-printing. If you look at the result, you will be presented with its repr, which remains as-is. If you print it, you will see its str, which is a hexlified representation. Building and sizeof defer to subcon.

To obtain a hexlified string (as before Hex and HexDump changed semantics), use construct.lib.hexdump on the parsed results.

Example:

>>> d = HexDump(GreedyBytes)
>>> obj = d.parse(b"\x00\x00\x01\x02")
>>> obj
b'\x00\x00\x01\x02'
>>> print(obj)
hexundump('''
0000   00 00 01 02                                       ....
''')

>>> d = HexDump(RawCopy(Int32ub))
>>> obj = d.parse(b"\x00\x00\x01\x02")
>>> obj
{'data': b'\x00\x00\x01\x02',
 'length': 4,
 'offset1': 0,
 'offset2': 4,
 'value': 258}
>>> print(obj)
hexundump('''
0000   00 00 01 02                                       ....
''')
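
A dump of this general shape (offset, hex bytes, printable characters) can be produced with a few lines of plain Python. This is a sketch with a hypothetical `hexdump_lines` helper; its column widths are chosen for illustration and are not guaranteed to match the output of construct.lib.hexdump.

```python
def hexdump_lines(data, width=16):
    """Return one text line per `width` bytes: offset, hex bytes, printable view."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hexpart = " ".join("%02x" % b for b in chunk)
        # Non-printable bytes are shown as dots.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the text column lines up on short final rows.
        lines.append("%04x   %-*s  %s" % (offset, width * 3 - 1, hexpart, text))
    return lines

short = hexdump_lines(b"\x00\x00\x01\x02")
longer = hexdump_lines(bytes(range(32)))
```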