Adapters and Validators

Adapting

Adapting is the process of converting one representation of an object to another. One representation is usually “lower” (closer to the byte level), and the other “higher” (closer to the Python object model). Converting the lower representation to the higher one is called decoding, and converting the higher representation to the lower one is called encoding. Encoding and decoding are expected to be symmetrical, so that they counteract each other: encode(decode(x)) == x and decode(encode(x)) == x.
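
The symmetry requirement can be checked without any library support. The sketch below is only an illustration: encode_ip and decode_ip are hypothetical helper functions (not part of Construct) that convert between a dotted string and a list of four integers, and the assertions verify both round trips.

```python
def encode_ip(text):
    """Higher-level to lower-level: dotted string -> list of four ints."""
    return [int(part) for part in text.split(".")]


def decode_ip(octets):
    """Lower-level to higher-level: list of four ints -> dotted string."""
    return ".".join(str(octet) for octet in octets)


# Both round trips must return the value they started from.
assert decode_ip(encode_ip("192.168.2.3")) == "192.168.2.3"
assert encode_ip(decode_ip([1, 2, 3, 4])) == [1, 2, 3, 4]
```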

Custom adapter classes derive from the abstract Adapter class, and implement their own versions of _encode and _decode, as shown below:

>>> class IpAddressAdapter(Adapter):
...     def _encode(self, obj, context):
...         return list(map(int, obj.split(".")))
...     def _decode(self, obj, context):
...         return "{0}.{1}.{2}.{3}".format(*obj)
...
>>> IpAddress = IpAddressAdapter(Byte[4])

As you can see, the IpAddressAdapter encodes strings of the format “XXX.XXX.XXX.XXX” into a list of 4 integers, which the underlying Byte[4] builds into a binary string of 4 bytes, and decodes the parsed list back into the more readable “XXX.XXX.XXX.XXX” format. Also note that the adapter does not perform any manipulation of the stream; it only converts between the objects!

This is called separation of concerns, and is a key feature of component-oriented programming. It allows us to keep each component very simple and unaware of its consumers. Whenever we need a different representation of the data, we don’t need to write a new Construct; we only write a suitable adapter.

So, let’s see our adapter in action:

>>> IpAddress.parse(b"\x01\x02\x03\x04")
'1.2.3.4'
>>> IpAddress.build("192.168.2.3")
b'\xc0\xa8\x02\x03'

Having the representation separated from the actual parsing or building means an adapter is loosely coupled with its underlying construct. As we’ll see with enums in a moment, we can use the same enum for Byte or Int32sl or Float64l, as long as the underlying construct returns an object we can map. Moreover, we can stack several adapters on top of one another, to create a nested adapter.

Using expressions instead of classes

Adapters can be created declaratively using ExprAdapter:

>>> IpAddress = ExprAdapter(Byte[4],
...     encoder = lambda obj,ctx: list(map(int, obj.split("."))),
...     decoder = lambda obj,ctx: "{0}.{1}.{2}.{3}".format(*obj), )

Enums

Enums provide symmetrical name-to-value mapping. The name may be misleading, as it’s not an enumeration as you would expect in C. But since enums in C are often just used as a collection of named values, we’ll stick with the name. Enums used to be implemented by the Mapping adapter, which provides mapping of values to other values (not necessarily strings to numbers).

>>> c = Enum(Byte, TCP=6, UDP=17)
>>> c.parse(b"\x06")
'TCP'
>>> c.build("UDP")
b'\x11'
>>> c.build(17)
b'\x11'

We can also supply a default value to use when no mapping exists for a given value. We do this by supplying a keyword argument named default. If we don’t supply a default value, an exception is raised for unmapped values.

>>> c = Enum(Byte, TCP=6, UDP=17)
>>> c.parse(b"\xff")
construct.core.MappingError: no decoding mapping for 255
>>> c.build("unknown")
construct.core.MappingError: no encoding mapping for 'unknown'
>>> c = Enum(Byte, TCP=6, UDP=17, default=0)
>>> c.parse(b"\xff")
0
>>> c.build(99)
b'\x00'

We can also just “pass through” unmapped values. We do this by supplying default = Pass. If you are curious, Pass is a special construct that “does nothing”. In this context, we use it to tell the Enum to “pass through” the unmapped value as-is.

>>> c = Enum(Byte, TCP=6, UDP=17, default=Pass)
>>> c.parse(b"\xff")
255
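
The three decoding behaviours described above (raise, substitute a default, pass through) can be summarised in a library-independent sketch. Everything here is hypothetical and for illustration only: decode_value, the _MISSING sentinel and the PASS sentinel are stand-ins, not Construct's implementation.

```python
_MISSING = object()  # sentinel: the caller supplied no default
PASS = object()      # sentinel: hypothetical stand-in for "pass through"


def decode_value(value, names, default=_MISSING):
    """Map a number back to its name, handling unmapped values."""
    by_value = {number: name for name, number in names.items()}
    if value in by_value:
        return by_value[value]
    if default is _MISSING:
        # No default given: unmapped values are an error.
        raise KeyError("no decoding mapping for %r" % (value,))
    if default is PASS:
        # Pass-through: return the unmapped value unchanged.
        return value
    return default


PROTOCOLS = {"TCP": 6, "UDP": 17}
assert decode_value(6, PROTOCOLS) == "TCP"
assert decode_value(255, PROTOCOLS, default=0) == 0
assert decode_value(255, PROTOCOLS, default=PASS) == 255
```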

FlagsEnum

>>> FlagsEnum(Byte, a=1, b=2, c=4, d=8).parse(b"\x03")
Container(c=False)(b=True)(a=True)(d=False)
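
The example above reports each named flag as a boolean. A minimal, library-independent sketch of that decomposition follows, assuming each flag is a bit mask tested against the parsed integer; decode_flags and encode_flags are hypothetical helpers, not Construct functions.

```python
def decode_flags(value, **flags):
    """Return {name: bool} telling which flag masks are set in value."""
    return {name: bool(value & mask) for name, mask in flags.items()}


def encode_flags(states, **flags):
    """Inverse operation: OR together the masks of the flags that are set."""
    value = 0
    for name, mask in flags.items():
        if states.get(name):
            value |= mask
    return value


parsed = decode_flags(0x03, a=1, b=2, c=4, d=8)
assert parsed == {"a": True, "b": True, "c": False, "d": False}
assert encode_flags(parsed, a=1, b=2, c=4, d=8) == 0x03
```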

Validating and filtering

Validating means making sure the parsed/built object meets a given condition. Validators simply raise ValidationError if the object is invalid. They are usually used to make sure that a “magic number” is found, that the protocol version is correct, or that a file signature matches. You can write custom validators by deriving from the Validator class and implementing the _validate method. This allows you to write validators for more complex things, such as making sure a CRC field (or even a cryptographic hash) is correct.
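
As an illustration of the kind of condition a custom validator could compute, here is a library-independent sketch of a CRC check. The frame layout (payload followed by a 4-byte big-endian CRC-32) and the helper name crc_is_valid are assumptions made for this example only.

```python
import struct
import zlib


def crc_is_valid(frame):
    """Return True if the trailing 4 bytes equal the CRC-32 of the payload."""
    payload, stored = frame[:-4], frame[-4:]
    (expected,) = struct.unpack(">I", stored)
    return (zlib.crc32(payload) & 0xFFFFFFFF) == expected


payload = b"hello"
frame = payload + struct.pack(">I", zlib.crc32(payload) & 0xFFFFFFFF)
assert crc_is_valid(frame)
# Corrupting one payload byte makes the check fail.
assert not crc_is_valid(b"jello" + frame[5:])
```

A function like this returns a bool, which is the shape of result a _validate method is expected to produce.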

The two most common cases already exist as builtins.

class construct.OneOf

Validates that the object is one of the listed values, both during parsing and building. Note that providing a set instead of a list may increase performance.

Notice that OneOf(dtype, [value]) is essentially equivalent to Const(dtype, value).

Parameters:
  • subcon – a construct to validate
  • valids – a collection implementing __contains__
Raises:

ValidationError – when actual value is not among valids

Example:

>>> d = OneOf(Byte, [1,2,3])
>>> d.parse(b"\x01")
1
>>> d.parse(b"\xff")
construct.core.ValidationError: ('object failed validation', 255)

>>> d = OneOf(Bytes(2), b"1234567890")
>>> d.parse(b"78")
b'78'
>>> d.parse(b"19")
construct.core.ValidationError: ('invalid object', b'19')

class construct.NoneOf

Validates that the object is none of the listed values, both during parsing and building.

Parameters:
  • subcon – a construct to validate
  • invalids – a collection implementing __contains__
Raises:

ValidationError – when actual value is among invalids

class construct.Filter

Filters a list leaving only the elements that passed through the validator.

Parameters:
  • subcon – a construct to validate, usually a Range, Array or Sequence
  • predicate – a function taking (obj, context) and returning a bool

Example:

>>> d = Filter(obj_ != 0, Byte[:])
>>> d.parse(b"\x00\x02\x00")
[2]
>>> d.build([0,1,0,2,0])
b'\x01\x02'
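
The filtering step itself can be sketched independently of the library. In the sketch below, keep_matching is a hypothetical helper that applies a predicate of the form (obj, context) to every element, mirroring the inputs used in the example above.

```python
def keep_matching(items, predicate, context=None):
    """Keep only the items for which predicate(obj, context) is true."""
    return [obj for obj in items if predicate(obj, context)]


def nonzero(obj, ctx):
    """Predicate equivalent to the expression obj_ != 0."""
    return obj != 0


# Parsing direction: the parsed list is filtered.
assert keep_matching([0, 2, 0], nonzero) == [2]
# Building direction: the list is filtered before being turned into bytes.
assert bytes(keep_matching([0, 1, 0, 2, 0], nonzero)) == b"\x01\x02"
```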

class construct.Slicing(subcon, count, start, stop, step=1, empty=None)

Adapter for slicing a list (getting a slice from that list). Works with Range and Sequence and their lazy equivalents.

Parameters:
  • subcon – the subcon to slice
  • count – expected number of elements, needed during building
  • start – start index (or None for entire list)
  • stop – stop index (or None for up-to-end)
  • step – step (or 1 for every element)
  • empty – value to fill the list with during building

Example:

???

class construct.Indexing(subcon, count, index, empty=None)

Adapter for indexing a list (getting a single item from that list). Works with Range and Sequence and their lazy equivalents.

Parameters:
  • subcon – the subcon to index
  • count – expected number of elements, needed during building
  • index – the index of the list to get
  • empty – value to fill the list with during building

Example:

???
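
Based only on the parameter descriptions above, the indexing behaviour can be sketched as follows. This is an assumption about the semantics, not the library's confirmed implementation: index_decode and index_encode are hypothetical helpers, with count and empty used only in the building direction.

```python
def index_decode(items, index):
    """Parsing direction: pick a single element out of the parsed list."""
    return items[index]


def index_encode(obj, count, index, empty=None):
    """Building direction: a list of `count` fillers with obj at `index`."""
    output = [empty] * count
    output[index] = obj
    return output


assert index_decode([10, 20, 30, 40], 2) == 30
assert index_encode(30, count=4, index=2, empty=0) == [0, 0, 30, 0]
```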

Presenting data in different representation

class construct.Hex

Adapter for converting bytes to their hexadecimal form. It returns a hex string when parsing, and converts the hex string back to raw bytes when building.

Example:

>>> d = Hex(GreedyBytes)
>>> d.parse(b"abcd")
b'61626364'
>>> d.build("01020304")
b'\x01\x02\x03\x04'
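
The same conversion is available in the Python standard library, which makes it easy to see what the adapter's two directions amount to. This sketch uses binascii directly and is not a statement about how Hex is implemented.

```python
import binascii

raw = b"abcd"

# Parsing direction: raw bytes -> hexadecimal digits.
hexed = binascii.hexlify(raw)
assert hexed == b"61626364"

# Building direction: hexadecimal digits -> raw bytes.
assert binascii.unhexlify(hexed) == raw
assert binascii.unhexlify("01020304") == b"\x01\x02\x03\x04"
```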

class construct.HexDump

Adapter for hex-dumping bytes. It returns a hex dump when parsing, and un-dumps when building.

Parameters:
  • linesize – default 16 bytes per line
  • buildraw – by default, build takes the same format that parse returns; set this to build from raw bytes directly

Example:

>>> d = HexDump(Bytes(10))
>>> d.parse(b"12345abc;/")
'0000   31 32 33 34 35 61 62 63 3b 2f                     12345abc;/       \n'

Using expressions instead of classes

Validators can be created declaratively using ExprValidator:

>>> OneOf = ExprValidator(Byte,
...     validator = lambda obj,ctx: obj in [1,3,5])

Checking

Checks can also be made against the context, in the middle of parsing or building, rather than on a particular object. The Check class takes a function of the context, which is a dict populated with the results of previously parsed Struct or Sequence members, and uses it to obtain the value (or values) that need to be validated. It is NOT a wrapper around a subcon.

class construct.Check(func)

Checks for a condition, and raises ValidationError if the check fails.

Parameters:
  • func – a context function returning a bool (or truthy value)
Raises:

ValidationError – when condition fails

Example:

Check(lambda ctx: len(ctx.payload.data) == ctx.payload_len)
Check(len_(this.payload.data) == this.payload_len)
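
To make the role of the context concrete, here is a library-independent sketch in which the context is an ordinary nested dict. The check helper and the use of ValueError are hypothetical stand-ins chosen for this illustration.

```python
def check(predicate, context):
    """Raise if the predicate, evaluated on the context, is falsy."""
    if not predicate(context):
        raise ValueError("check failed")


# The context holds the results of previously parsed members.
context = {"payload_len": 3, "payload": {"data": b"abc"}}

# Passes silently: the declared length matches the actual payload length.
check(lambda ctx: len(ctx["payload"]["data"]) == ctx["payload_len"], context)
```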

You can also raise an error explicitly, in a declarative way, using a construct.

construct.Error()

Raises an exception when triggered by parse or build. Can be used as a sentinel that blows a whistle when a conditional branch goes the wrong way, or to raise an error explicitly the declarative way.

Raises:

ExplicitError – when parsed or built

Example:

>>> d = ("x"/Byte >> IfThenElse(this.x > 0, Byte, Error))
>>> d.parse(b"\x00\x05")
construct.core.ExplicitError: Error field was activated during parsing