Core API: Strings

construct.core.possiblestringencodings = {'ascii': 1, 'u16': 2, 'u32': 4, 'u8': 1, 'utf16': 2, 'utf32': 4, 'utf8': 1, 'utf_16': 2, 'utf_16_be': 2, 'utf_16_le': 2, 'utf_32': 4, 'utf_32_be': 4, 'utf_32_le': 4, 'utf_8': 1}

Encodings explicitly supported by the PaddedString and CString classes.
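
The values in this table appear to be the size in bytes of one encoding unit, which is what determines how wide a null terminator or padding unit must be. The following is a minimal plain-Python sketch of that idea, not construct's own code; `UNIT_SIZES` and `null_unit` are hypothetical names, and the table is a hand-copied subset of the one above.

```python
# Hypothetical subset of the table above: encoding name -> bytes per unit.
UNIT_SIZES = {"ascii": 1, "utf8": 1, "utf_16_le": 2, "utf_32_le": 4}


def null_unit(encoding):
    """Return the null unit (terminator or padding) for a listed encoding."""
    if encoding not in UNIT_SIZES:
        # The real library raises StringError here; ValueError is a stand-in.
        raise ValueError("unsupported encoding: %r" % (encoding,))
    return b"\x00" * UNIT_SIZES[encoding]


# Sanity check: one ASCII-range character occupies exactly one unit.
for name, size in UNIT_SIZES.items():
    assert len("A".encode(name)) == size
```

An encoding missing from the table has no known unit size, which is presumably why PaddedString and CString reject it.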

construct.PaddedString(length, encoding)

Configurable, fixed-length or variable-length string field.

When parsing, the byte string is stripped of null bytes (per encoding unit) and then decoded. When building, the string is encoded and then padded to the specified length; if the encoded string is longer than the specified length, building fails with PaddingError. Length is an integer or a context lambda. Size is the same as the length parameter.

Warning

PaddedString and CString only support encodings explicitly listed in possiblestringencodings.

Parameters:
  • length – integer or context lambda, length in bytes (not Unicode characters)

  • encoding – string such as utf8, utf16, utf32 or ascii

Raises:
  • StringError – building a non-unicode string

  • StringError – selected encoding is not on supported list

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = PaddedString(10, "utf8")
>>> d.build(u"Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd\x00\x00'
>>> d.parse(_)
u'Афон'
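
The build and parse behavior described above can be sketched in plain Python. This is an illustration only, not construct's implementation: `build_padded` and `parse_padded` are hypothetical helpers, `ValueError` stands in for PaddingError, and the sketch assumes the nulls stripped on parsing are the trailing padding.

```python
def build_padded(text, length, encoding="utf8"):
    """Encode text, then pad with null bytes up to `length` bytes."""
    data = text.encode(encoding)
    if len(data) > length:
        # Stand-in for PaddingError in the real library.
        raise ValueError("encoded string is %d bytes, field holds %d"
                         % (len(data), length))
    return data + b"\x00" * (length - len(data))


def parse_padded(data, encoding="utf8", unit=1):
    """Strip trailing null units (unit = bytes per encoding unit), then decode."""
    null = b"\x00" * unit
    while len(data) >= unit and data[-unit:] == null:
        data = data[:-unit]
    return data.decode(encoding)


raw = build_padded("Афон", 10)   # 8 encoded bytes plus 2 bytes of padding
assert len(raw) == 10
assert parse_padded(raw) == "Афон"
```

Because the length counts bytes, a 10-byte field holds only five two-byte characters such as the Cyrillic letters above.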

construct.PascalString(lengthfield, encoding)

Length-prefixed string. The length field can be variable length (such as VarInt) or fixed length (such as Int64ub). VarInt is recommended when designing new protocols. Stored length is in bytes, not characters. Size is not defined.

Parameters:
  • lengthfield – Construct instance, field used to parse and build the length (such as VarInt or Int64ub)

  • encoding – string such as utf8, utf16, utf32 or ascii

Raises:

StringError – building a non-unicode string

Example:

>>> d = PascalString(VarInt, "utf8")
>>> d.build(u"Афон")
b'\x08\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd'
>>> d.parse(_)
u'Афон'
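
To make the wire format concrete, here is a plain-Python sketch of a length-prefixed string. It assumes the length prefix is an unsigned base-128 varint (7 data bits per byte, high bit set on every byte except the last); all function names are hypothetical and this is not construct's implementation.

```python
def encode_varint(n):
    """Encode a non-negative integer as an unsigned base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)          # last byte, high bit clear
            return bytes(out)


def decode_varint(data):
    """Return (value, number of bytes consumed)."""
    value = 0
    shift = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")


def build_pascal(text, encoding="utf8"):
    data = text.encode(encoding)
    return encode_varint(len(data)) + data   # prefix counts bytes, not characters


def parse_pascal(data, encoding="utf8"):
    length, used = decode_varint(data)
    payload = data[used:used + length]
    if len(payload) < length:
        raise ValueError("stream ended before the declared length")
    return payload.decode(encoding)


assert build_pascal("Афон")[0] == 8   # 4 characters, 8 bytes in UTF-8
```

A variable-length prefix keeps short strings compact (one prefix byte up to 127 bytes of payload) without capping the maximum length, which may be why VarInt is the recommended choice for new protocols.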

construct.CString(encoding)

String ending in a terminating null byte (or null bytes in the case of UTF-16 or UTF-32).

Warning

PaddedString and CString only support encodings explicitly listed in possiblestringencodings.

Parameters:

encoding – string such as utf8, utf16, utf32 or ascii

Raises:
  • StringError – building a non-unicode string

  • StringError – selected encoding is not on supported list

Example:

>>> d = CString("utf8")
>>> d.build(u"Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd\x00'
>>> d.parse(_)
u'Афон'
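
A plain-Python sketch of null-terminated strings follows. It is illustrative only (the helper names are hypothetical, and `ValueError` stands in for the library's own errors), and it assumes the terminator is searched for one encoding unit at a time, which matters for multi-byte encodings.

```python
def build_cstring(text, encoding="utf8", unit=1):
    """Encode text and append one null unit (unit = bytes per encoding unit)."""
    return text.encode(encoding) + b"\x00" * unit


def parse_cstring(data, encoding="utf8", unit=1):
    """Read unit-sized chunks until a null unit, then decode what came before."""
    terminator = b"\x00" * unit
    for offset in range(0, len(data) - unit + 1, unit):
        if data[offset:offset + unit] == terminator:
            return data[:offset].decode(encoding)
    raise ValueError("no terminator found")


# U+0100 encodes as b"\x00\x01" in UTF-16-LE, so "A" followed by U+0100
# contains two adjacent zero bytes that are NOT an aligned terminator.
raw = build_cstring("A\u0100", "utf_16_le", unit=2)
assert raw.find(b"\x00\x00") == 1                      # misaligned byte match
assert parse_cstring(raw, "utf_16_le", unit=2) == "A\u0100"
```

A naive byte search would cut the string at the misaligned match, so stepping by the unit size is what keeps UTF-16 and UTF-32 strings intact in this sketch.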

construct.GreedyString(encoding)

String that reads the entire stream until EOF, and writes a given string as-is. Analogous to GreedyBytes, but also applies unicode-to-bytes encoding.

Parameters:

encoding – string such as utf8, utf16, utf32 or ascii

Raises:
  • StringError – building a non-unicode string

  • StreamError – stream failed when reading until EOF

Example:

>>> d = GreedyString("utf8")
>>> d.build(u"Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd'
>>> d.parse(_)
u'Афон'
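
The read-until-EOF behavior can be sketched with a standard `io.BytesIO` stream. The helpers below are hypothetical and only mirror the description above; they are not construct's implementation.

```python
import io


def build_greedy(text, encoding="utf8"):
    """Write the string as-is: just the encoded bytes, no prefix or terminator."""
    return text.encode(encoding)


def parse_greedy(stream, encoding="utf8"):
    """Consume everything left in the stream and decode it."""
    return stream.read().decode(encoding)


# A made-up one-byte header followed by a greedy string.
stream = io.BytesIO(b"\x01" + build_greedy("Афон"))
header = stream.read(1)
text = parse_greedy(stream)
assert text == "Афон"
assert stream.read() == b""   # nothing is left for any later field
```

Since nothing marks where the string ends, a field like this only makes sense as the last item in a stream.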

construct.setGlobalPrintFullStrings(enabled=False)

When enabled, Container __str__ prints the full content of bytes and unicode strings. Otherwise (the default), it prints truncated output (16 bytes and 32 characters).

Parameters:

enabled – bool