ProtocolDataUnits.jl
Encoders and decoders for Protocol Data Units (PDUs)
PDUs encode information as byte streams that can be transmitted across a network or stored. ProtocolDataUnits.jl
simplifies the process of encoding and decoding information as PDUs in a declarative way.
Getting started
The usage of the package is best illustrated with a simple example:
using ProtocolDataUnits
# define PDU format
Base.@kwdef struct EthernetFrame <: AbstractPDU
dstaddr::NTuple{6,UInt8} # fixed length
srcaddr::NTuple{6,UInt8} # fixed length
ethtype::UInt16 # fixed length
payload::Vector{UInt8} # variable length
crc::UInt32 = 0 # fixed length
end
# declare that the variable length of the payload can be computed
Base.length(::Type{EthernetFrame}, ::Val{:payload}, info) = info.length - 18
# create an Ethernet frame
frame = EthernetFrame(
dstaddr = (0x01, 0x02, 0x03, 0x04, 0x05, 0x06),
srcaddr = (0x11, 0x12, 0x13, 0x14, 0x15, 0x16),
ethtype = 0x0800,
payload = [0x01, 0x02, 0x03, 0x04, 0x11, 0x12, 0x13, 0x14]
)
# convert to a byte array
bytes = PDU.encode(frame)
# convert back to Ethernet frame
decoded = PDU.decode(bytes, EthernetFrame)
# check that they are the same
@assert frame == decoded
The package can do much more, including nested PDUs, wire-encoding, CRC computation, etc.
Basic usage
A PDU is declared as a struct subtyped from AbstractPDU. It may contain fields of the following types:
- Number types (various sized integers and floats)
- NTuple of Number types
- AbstractVector of Number types
- AbstractString
- Other PDUs
- Nothing
- Any other data type T that supports read(::IO, ::Type{T}) and write(::IO, ::T)
- Union of any of the above types
The size (in bytes) of numeric types, tuples of numeric types and nothing is known. However, vectors, strings and other data types may have variable sizes. If the size is unknown, a wire-encoded size/length field is implicitly added to the PDU representation when encoding it, and is used during decoding to infer size/length. Alternatively, the size/length of specific fields may be declared by defining a length() method for those fields.
By default, network byte order (big endian) is used for multi-byte numeric values. That may be overridden for the PDU or for specific fields by declaring a byteorder() method.
When a field is of a union type, a fieldtype() definition must be available to resolve which concrete type to expect when decoding a PDU from bytes.
PDUs are encoded into bytes in one of two ways:
bytes = PDU.encode(pdu) # returns a vector of bytes
write(io, pdu) # writes bytes to an IOStream
PDUs are decoded from bytes in one of two ways:
pdu = PDU.decode(bytes, MyPDU) # creates a MyPDU from bytes
pdu = read(io, MyPDU) # creates a MyPDU by reading bytes from an IOStream
Usage is best illustrated through a series of examples.
PDUs with fixed length fields
Let's define a simple PDU where all field sizes are known:
struct MySimplePDU <: AbstractPDU
a::Int16
b::UInt8
c::UInt8
d::NTuple{2,Int32}
e::Float32
f::Float64
end
pdu = MySimplePDU(1, 2, 3, (4,5), 6f0, 7.0)
and then encode it into bytes:
bytes = PDU.encode(pdu)
This yields bytes = [0x00, 0x01, 0x02, 0x03, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x05, 0x40, 0xc0, 0x00, 0x00, 0x40, 0x1c, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00].
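The byte layout above can be reproduced by hand, which helps when checking a PDU definition against a protocol specification. The following is a minimal sketch in plain Julia that does not use ProtocolDataUnits.jl at all: it writes each field in network byte order with Base's hton, and reinterprets the floats as unsigned integers so that their IEEE 754 bit patterns can be byte-swapped.

```julia
# Illustration only: plain Julia, independent of ProtocolDataUnits.jl.
io = IOBuffer()
write(io, hton(Int16(1)))                   # a::Int16    -> 2 bytes
write(io, UInt8(2))                         # b::UInt8    -> 1 byte
write(io, UInt8(3))                         # c::UInt8    -> 1 byte
write(io, hton(Int32(4)))                   # d[1]::Int32 -> 4 bytes
write(io, hton(Int32(5)))                   # d[2]::Int32 -> 4 bytes
write(io, hton(reinterpret(UInt32, 6f0)))   # e::Float32  -> 4 bytes (IEEE 754 bits)
write(io, hton(reinterpret(UInt64, 7.0)))   # f::Float64  -> 8 bytes (IEEE 754 bits)
manual = take!(io)

@assert length(manual) == 2 + 1 + 1 + 2*4 + 4 + 8   # 24 bytes in total
@assert manual[1:2] == [0x00, 0x01]                 # Int16(1), most significant byte first
@assert manual[13:16] == [0x40, 0xc0, 0x00, 0x00]   # bit pattern of 6f0
```

Each field occupies exactly sizeof(field type) bytes, so the total of 24 bytes matches the length of the encoded vector shown above.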
We can change the byte ordering for the PDU to little-endian:
PDU.byteorder(::Type{MySimplePDU}) = LITTLE_ENDIAN
Now:
bytes = PDU.encode(pdu)
yields [0x01, 0x00, 0x02, 0x03, 0x04, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x00, 0x00, 0xc0, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x1c, 0x40].
The bytes can be converted back to a PDU:
pdu2 = PDU.decode(bytes, MySimplePDU)
and we can verify that the recovered PDU has the same contents as the original:
@assert pdu == pdu2
PDUs with variable length fields
We can define a slightly more complex PDU containing strings of potentially unknown length:
struct MyLessSimplePDU <: AbstractPDU
a::Int16
b::String
end
pdu = MyLessSimplePDU(1, "hello world!")
We can convert the PDU to bytes and back:
bytes = PDU.encode(pdu)
pdu2 = PDU.decode(bytes, MyLessSimplePDU)
@assert pdu == pdu2
The PDU will have a size of 15 bytes (2 bytes for a, 12 bytes for b = "hello world!", and 1 byte to store the length of b). The length of the string is encoded as a variable length number using wire-encoding.
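To see why a 12-byte string needs only one extra byte, here is a minimal sketch of a variable length integer encoding. It assumes a base-128 varint scheme similar to the one used by Protocol Buffers (7 data bits per byte, with the top bit marking continuation). The exact wire format used by ProtocolDataUnits.jl is not stated here and may differ, and the helper name varint_bytes is hypothetical.

```julia
# Illustration only: a generic base-128 varint, least significant group first.
# `varint_bytes` is a hypothetical helper, not part of ProtocolDataUnits.jl.
function varint_bytes(n::Integer)
    n >= 0 || throw(ArgumentError("n must be non-negative"))
    out = UInt8[]
    while true
        group = UInt8(n & 0x7f)    # low 7 bits of the remaining value
        n >>= 7
        if n == 0
            push!(out, group)      # last byte: continuation bit clear
            return out
        end
        push!(out, group | 0x80)   # more bytes follow: continuation bit set
    end
end

@assert varint_bytes(12) == [0x0c]          # a 12-byte string needs 1 length byte
@assert varint_bytes(127) == [0x7f]         # largest value that fits in 1 byte
@assert varint_bytes(128) == [0x80, 0x01]   # larger lengths need 2 bytes
```

Under this assumption, short strings pay one byte of overhead, and the overhead only grows for lengths of 128 or more.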
If we knew the maximum length of the string beforehand (say 14 bytes), and wanted a fixed length PDU (14+2=16 bytes), we could declare the length:
Base.length(::Type{MyLessSimplePDU}, ::Val{:b}, info) = PadTo(14)
bytes = PDU.encode(pdu)
@assert length(bytes) == 16
pdu2 = PDU.decode(bytes, MyLessSimplePDU)
@assert pdu == pdu2
Since the string b = "hello world!" occupies only 12 bytes, it is padded with two null ('\0') bytes. If the length of b were larger than the allocated length, encoding would throw an exception.
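The padding behaviour described above can be mimicked in a few lines of plain Julia. This is only a sketch of the idea (null bytes appended on encoding, trailing null bytes stripped on decoding), not the package's implementation. The helper names pad_to and unpad are hypothetical.

```julia
# Illustration only: plain Julia, independent of ProtocolDataUnits.jl.
# `pad_to` and `unpad` are hypothetical helpers.
function pad_to(s::AbstractString, n::Integer)
    raw = Vector{UInt8}(codeunits(s))
    length(raw) <= n ||
        throw(ArgumentError("string needs $(length(raw)) bytes, only $n allocated"))
    return vcat(raw, zeros(UInt8, n - length(raw)))   # append null bytes
end

# Strip trailing null bytes and rebuild the string.
function unpad(bytes::AbstractVector{UInt8})
    stop = findlast(!=(0x00), bytes)
    return stop === nothing ? "" : String(bytes[1:stop])
end

padded = pad_to("hello world!", 14)
@assert length(padded) == 14
@assert padded[13:14] == [0x00, 0x00]      # two padding bytes
@assert unpad(padded) == "hello world!"    # padding removed again
```

A consequence of this approach is that a string which itself ends in null bytes cannot be distinguished from padding, so it suits text fields rather than arbitrary binary data.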
We could also support variable length strings without having to store the length in the PDU if we knew the size of the PDU while decoding. To do so, we need to declare that the length of the string must be 2 bytes less than the length of the whole PDU:
Base.length(::Type{MyLessSimplePDU}, ::Val{:b}, info) = info.length - 2
The info object provides information on the PDU being encoded or decoded. info.length tells us the size of the PDU in bytes, if known (otherwise it is missing). Now, we can encode arbitrary length strings in our PDU without the overhead of storing the length of the string:
pdu = MyLessSimplePDU(1, "hello world! how are you?")
bytes = PDU.encode(pdu)
@assert length(bytes) == 2 + length("hello world! how are you?")
pdu2 = PDU.decode(bytes, MyLessSimplePDU)
@assert pdu2.b == "hello world! how are you?"
@assert pdu == pdu2
We can also define field lengths that depend on the value of preceding fields. For example, if we happened to know that the length of string b is always 2a, we can declare:
Base.length(::Type{MyLessSimplePDU}, ::Val{:b}, info) = 2 * info.get(:a)
Here info.get() provides us access to fields that are decoded earlier in the byte stream.
pdu = MyLessSimplePDU(6, "hello world!")
bytes = PDU.encode(pdu)
@assert length(bytes) == 2 + 2*6
pdu2 = PDU.decode(bytes, MyLessSimplePDU)
@assert pdu2.b == "hello world!"
Had we set an a that is too small or too big, we would get an exception complaining about the length. If we wanted the string to be null padded automatically, we could specify that:
Base.length(::Type{MyLessSimplePDU}, ::Val{:b}, info) = PadTo(2 * info.get(:a))
# string is null padded to 16 bytes
pdu = MyLessSimplePDU(8, "hello world!")
bytes = PDU.encode(pdu)
@assert length(bytes) == 2 + 2*8
pdu2 = PDU.decode(bytes, MyLessSimplePDU)
@assert pdu2.b == "hello world!"
Variable length vector fields work exactly in the same way, with length being defined as the number of elements in the vector (not number of bytes). However, for vectors, the default length is nothing (unknown), and so we need an explicit declaration to change it to wire-encoding if we want the vector length to be stored in the PDU:
struct MyVectorPDU <: AbstractPDU
a::Int16
b::Vector{Float64}
end
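# option 1: store the vector length in the PDU using wire-encoding
Base.length(::Type{MyVectorPDU}, ::Val{:b}, info) = WireEncoded()
# option 2 (replaces option 1, and is used below): infer the vector length from the PDU size
# vector length is in number of Float64, but info.length is in number of bytes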
Base.length(::Type{MyVectorPDU}, ::Val{:b}, info) = (info.length - 2) ÷ sizeof(Float64)
pdu = MyVectorPDU(1, [1.0, 2.0, 3.0])
bytes = PDU.encode(pdu)
@assert length(bytes) == 2 + 3 * sizeof(Float64)
pdu2 = PDU.decode(bytes, MyVectorPDU)
@assert pdu == pdu2
PDUs with nested PDUs
We can even nest PDUs:
struct InnerPDU <: AbstractPDU
a::Int8
b::Float32
end
struct OuterPDU <: AbstractPDU
x::Int16
y::InnerPDU
z::Int8
end
pdu = OuterPDU(1, InnerPDU(2, 3f0), 4)
and encode and decode them effortlessly:
bytes = PDU.encode(pdu)
@assert length(bytes) == 2 + (1 + 4) + 1
pdu2 = PDU.decode(bytes, OuterPDU)
@assert pdu2.y == pdu.y # inner PDU matches
@assert pdu == pdu2 # so does the outer PDU
We can infer sizes of variable length fields in nested PDUs too:
struct InnerPDU2 <: AbstractPDU
a::Int8
b::String
end
struct OuterPDU2 <: AbstractPDU
x::Int16
y::InnerPDU2
z::Int8
end
Base.length(::Type{InnerPDU2}, ::Val{:b}, info) = info.length - 1
Base.length(::Type{OuterPDU2}, ::Val{:y}, info) = info.length - 3
pdu = OuterPDU2(1, InnerPDU2(2, "hello world!"), 4)
bytes = PDU.encode(pdu)
@assert length(bytes) == 2 + (1 + 12) + 1
pdu2 = PDU.decode(bytes, OuterPDU2)
@assert pdu2.y == pdu.y
@assert pdu == pdu2
PDUs with dependent fields
A PDU may contain a field that depends on another field. We saw an example above, where field a of MyLessSimplePDU determined the length of field b. A good way to ensure consistency is to populate dependent fields at construction:
struct MyVectorPDU2 <: AbstractPDU
a::Int16
b::Vector{Float64}
end
MyVectorPDU2(b::Vector{Float64}) = MyVectorPDU2(length(b), b)
pdu = MyVectorPDU2([1.0, 2.0, 3.0])
@assert pdu.a == 3
However, since vector b can be mutated after construction, consistency at construction does not guarantee consistency at encoding. We could enforce consistency at encoding using a pre-encode hook:
using Accessors
function PDU.preencode(pdu::MyVectorPDU2)
@set pdu.a = length(pdu.b)
end
This will ensure that field a is populated correctly at the time of encoding:
push!(pdu.b, 4.0) # add 4th element to b
@assert pdu.a == 3 # now pdu is inconsistent, since pdu.a == 3
bytes = PDU.encode(pdu)
@assert bytes[2] == 4 # encoded bytes show 4 elements correctly
pdu2 = PDU.decode(bytes, MyVectorPDU2)
@assert pdu2.a == 4 # decoded pdu also shows 4 elements correctly
@assert length(pdu2.b) == 4 # and it indeed contains 4 elements
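The reason the hook returns a new PDU, rather than updating the existing one, is that the struct is immutable while the vector it holds is not. The following plain Julia sketch shows the same situation without the package or Accessors. The type CountedVec and the function refresh are hypothetical names used only for illustration.

```julia
# Illustration only: plain Julia, independent of ProtocolDataUnits.jl and Accessors.
# `CountedVec` and `refresh` are hypothetical names.
struct CountedVec
    n::Int16              # dependent field: number of elements in v
    v::Vector{Float64}
end

# populate the dependent field at construction
CountedVec(v::Vector{Float64}) = CountedVec(length(v), v)

# build a fresh, consistent copy (roughly what an update via @set would produce)
refresh(c::CountedVec) = CountedVec(length(c.v), c.v)

c = CountedVec([1.0, 2.0, 3.0])
push!(c.v, 4.0)           # the vector is mutable, so this is allowed
@assert c.n == 3          # but the immutable struct still holds the stale count

c2 = refresh(c)
@assert c2.n == 4         # the new struct is consistent
@assert c.n == 3          # the original is left untouched
```

This is why, in the example above, pdu.a remains 3 after encoding while the encoded bytes and the decoded PDU report 4.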
PDUs with CRCs
Sometimes we may want to pre-process PDUs to compute a CRC, or post-process them to modify their content or perform CRC checks. To see how we can do this, let's go back to our EthernetFrame example and define a pre-encoding hook to compute the CRC, and a post-decoding hook to check it:
using CRC32
function PDU.preencode(pdu::EthernetFrame)
bytes = PDU.encode(pdu; hooks=false) # encode without computing CRC
crc = crc32(bytes[1:end-4]) # compute CRC
@set pdu.crc = crc # make a new frame with CRC filled in
end
function PDU.postdecode(pdu::EthernetFrame)
bytes = PDU.encode(pdu; hooks=false) # re-encode the frame for CRC computation
pdu.crc == crc32(bytes[1:end-4]) || throw(ErrorException("CRC check failed"))
pdu # return unaltered pdu if CRC OK
end
frame = EthernetFrame(
dstaddr = (0x01, 0x02, 0x03, 0x04, 0x05, 0x06),
srcaddr = (0x11, 0x12, 0x13, 0x14, 0x15, 0x16),
ethtype = 0x0800,
payload = [0x01, 0x02, 0x03, 0x04, 0x11, 0x12, 0x13, 0x14]
)
buf = PDU.encode(frame)
frame2 = PDU.decode(buf, EthernetFrame)
@assert frame.payload == frame2.payload
However, if there was an error in the buffer, the CRC check would fail:
buf[5] += 1
PDU.decode(buf, EthernetFrame) # should throw an exception
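To illustrate why a corrupted byte is detected, here is a minimal bitwise CRC-32 sketch in plain Julia, using the reflected IEEE 802.3 polynomial. It is assumed, but not verified here, that the crc32 function used above computes this same standard CRC-32. The helper name crc32_sketch is hypothetical, and a table-driven implementation would be preferable in practice.

```julia
# Illustration only: bitwise CRC-32 (IEEE 802.3 polynomial, reflected form).
# `crc32_sketch` is a hypothetical helper, not the crc32 function used above.
function crc32_sketch(data::AbstractVector{UInt8})
    crc = 0xffffffff                       # initial value
    for byte in data
        crc ⊻= UInt32(byte)
        for _ in 1:8
            # shift right; apply the reflected polynomial when the low bit was set
            crc = (crc & 0x00000001) == 0x00000001 ? (crc >> 1) ⊻ 0xedb88320 : crc >> 1
        end
    end
    return crc ⊻ 0xffffffff                # final inversion
end

msg = Vector{UInt8}("123456789")
@assert crc32_sketch(msg) == 0xcbf43926    # standard CRC-32 check value

corrupted = copy(msg)
corrupted[5] += 0x01                       # corrupt one byte, as with buf[5] += 1
@assert crc32_sketch(corrupted) != crc32_sketch(msg)
```

Because the checksum of the corrupted data no longer matches the stored value, a post-decoding check like the one above can reject the frame.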
PDUs with union types
Consider a PDU with the first byte specifying the header length, which is followed by a header and then a payload. Two versions of headers may be used, depending on the application needs, with the header length allowing the receiver to differentiate between the two. We can define the PDU with a header field that uses a union type:
struct Header_v1 <: AbstractPDU
src::UInt32
dst::UInt32
port::UInt8
end
struct Header_v2 <: AbstractPDU
src::UInt64
dst::UInt64
port::UInt16
end
struct AppPDU <: AbstractPDU
hdrlen::UInt8
hdr::Union{Header_v1,Header_v2}
payload::Vector{UInt8}
end
# convenience constructors to auto-populate hdrlen
AppPDU(hdr::Header_v1, payload) = AppPDU(9, hdr, payload)
AppPDU(hdr::Header_v2, payload) = AppPDU(18, hdr, payload)
# hdr is v2 if hdrlen field matches its size, otherwise default to v1
function PDU.fieldtype(::Type{AppPDU}, ::Val{:hdr}, info)
info.get(:hdrlen) == 18 && return Header_v2
Header_v1
end
# payload length is the frame length less the header
Base.length(::Type{AppPDU}, ::Val{:payload}, info) = info.length - info.get(:hdrlen) - 1
We can now create either type of PDU and decode it without having a priori knowledge of the header type:
# v1 header
pdu = AppPDU(Header_v1(1, 2, 3), UInt8[4, 5, 6])
bytes = PDU.encode(pdu)
@assert length(bytes) == 13
pdu2 = PDU.decode(bytes, AppPDU)
@assert pdu2.hdr isa Header_v1
@assert pdu == pdu2
# v2 header
pdu = AppPDU(Header_v2(1, 2, 3), UInt8[4, 5, 6])
bytes = PDU.encode(pdu)
@assert length(bytes) == 22
pdu2 = PDU.decode(bytes, AppPDU)
@assert pdu2.hdr isa Header_v2
@assert pdu == pdu2
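The header lengths 9 and 18 used by the convenience constructors, and the encoded sizes 13 and 22, follow directly from the field sizes. The arithmetic can be checked in plain Julia without the package. The function payload_length is a hypothetical stand-in for the length declaration above.

```julia
# Illustration only: plain Julia, independent of ProtocolDataUnits.jl.
v1_size = sum(sizeof, (UInt32, UInt32, UInt8))     # src + dst + port = 4 + 4 + 1
v2_size = sum(sizeof, (UInt64, UInt64, UInt16))    # src + dst + port = 8 + 8 + 2
@assert v1_size == 9
@assert v2_size == 18

# hypothetical stand-in for: info.length - info.get(:hdrlen) - 1
payload_length(total, hdrlen) = total - hdrlen - 1

@assert payload_length(13, v1_size) == 3    # 1 (hdrlen) + 9 (header) + 3 (payload) = 13
@assert payload_length(22, v2_size) == 3    # 1 (hdrlen) + 18 (header) + 3 (payload) = 22
```

Since the two header sizes differ, the single hdrlen byte is enough for the receiver to tell them apart.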
PDUs as parametrized types
For type stability, it is often desirable not to use a union type as a field in the struct, but instead use a parametrized struct. We support parametrized PDUs too:
struct ParamAppPDU{T} <: AbstractPDU
hdrlen::UInt8
hdr::T
payload::Vector{UInt8}
end
# convenience constructors to auto-populate hdrlen
ParamAppPDU(hdr::Header_v1, payload) = ParamAppPDU{Header_v1}(9, hdr, payload)
ParamAppPDU(hdr::Header_v2, payload) = ParamAppPDU{Header_v2}(18, hdr, payload)
# hdr is v2 if hdrlen field matches its size, otherwise default to v1
function PDU.fieldtype(::Type{<:ParamAppPDU}, ::Val{:hdr}, info)
info.get(:hdrlen) == 18 && return Header_v2
Header_v1
end
# payload length is the frame length less the header
Base.length(::Type{<:ParamAppPDU}, ::Val{:payload}, info) = info.length - info.get(:hdrlen) - 1
pdu = ParamAppPDU(Header_v1(1, 2, 3), UInt8[4, 5, 6])
bytes = PDU.encode(pdu)
@assert length(bytes) == 13
pdu2 = PDU.decode(bytes, ParamAppPDU)
@assert pdu2.hdr isa Header_v1
@assert pdu == pdu2
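The type-stability motivation for parametrized PDUs can be checked with Base's reflection functions. The sketch below uses two hypothetical structs, UnionBox and ParamBox, that are not PDUs, and calls Base.fieldtype explicitly to avoid confusion with the PDU.fieldtype method defined above.

```julia
# Illustration only: plain Julia, independent of ProtocolDataUnits.jl.
# `UnionBox` and `ParamBox` are hypothetical types.
struct UnionBox
    x::Union{Int8,Int16}     # field declared as a union
end

struct ParamBox{T}
    x::T                     # field type is a type parameter
end

# The union field has no single concrete type known to the compiler ...
@assert !isconcretetype(Base.fieldtype(UnionBox, :x))

# ... while the field of a fully parametrized type does.
@assert isconcretetype(Base.fieldtype(ParamBox{Int8}, :x))
@assert ParamBox(Int8(1)) isa ParamBox{Int8}    # parameter inferred at construction
```

With a concrete field type, code that accesses the field can be specialized by the compiler for that type.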
PDUs with optional fields
Extending the idea of union fields, we can define PDUs with optional fields:
struct App2PDU <: AbstractPDU
hdrlen::UInt8
hdr::Union{Header_v1,Header_v2,Nothing}
payload::Vector{UInt8}
end
# convenience constructor to auto-populate hdrlen
function App2PDU(; hdr=nothing, payload=UInt8[])
hdrlen = 0
hdr isa Header_v1 && (hdrlen = 9)
hdr isa Header_v2 && (hdrlen = 18)
App2PDU(hdrlen, hdr, payload)
end
# hdr is v1, v2 or nothing, depending on hdrlen
function PDU.fieldtype(::Type{App2PDU}, ::Val{:hdr}, info)
info.get(:hdrlen) == 9 && return Header_v1
info.get(:hdrlen) == 18 && return Header_v2
Nothing
end
# payload length is the frame length less the header
Base.length(::Type{App2PDU}, ::Val{:payload}, info) = info.length - info.get(:hdrlen) - 1
and work with PDUs with or without headers:
# v1 header
pdu = App2PDU(hdr=Header_v1(1, 2, 3), payload=UInt8[4, 5, 6])
bytes = PDU.encode(pdu)
@assert length(bytes) == 13
pdu2 = PDU.decode(bytes, App2PDU)
@assert pdu2.hdr isa Header_v1
@assert pdu == pdu2
# no header
pdu = App2PDU(payload=UInt8[4, 5, 6, 7, 8, 9])
bytes = PDU.encode(pdu)
@assert length(bytes) == 7
pdu2 = PDU.decode(bytes, App2PDU)
@assert pdu2.hdr === nothing
@assert pdu == pdu2