Message and connection semantics: UDP vs. TCP

The parsers have these stub implementations:

module foo;

public type Request = unit {
    payload: bytes &eod;
};

public type Response = unit {
    payload: bytes &eod;
};

We have used &eod to denote that we want to extract all data. What "all data" means differs between UDP and TCP parsers:

UDP: UDP has no connection concept, so Zeek synthesizes UDP "connections" from flows by grouping UDP messages with the same 5-tuple within a time window. UDP has no reassembly, so a new parser instance is created for each UDP packet; &eod means until the end of the current packet.
TCP: TCP supports connections and packet reassembly, so both sides of a connection are modelled as streams of reassembled data; &eod means until the end of the stream. The stream is unbounded, so a field using &eod would not complete until the connection ends.

For this reason, one usually wants to model the parsing of a TCP connection as a vector of protocol messages, e.g.,

public type Requests = unit {
    : Request[];
};

type Request = unit {
    # TODO: Parse protocol message.
};

The length of the vector of messages is unspecified, so it is detected dynamically.
To avoid storing an unbounded vector of messages, we use an anonymous field for the vector.
Parsing of the protocol messages is responsible for detecting when a message ends.
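
As an illustration of the last point, the following is a minimal sketch of how the Request stub could be filled in so that each message detects its own end. It assumes a hypothetical wire format in which every message starts with a 2-byte length in network byte order, followed by that many bytes of payload; the field names len and payload are chosen for this example and are not part of any real protocol.

```spicy
module foo;

public type Requests = unit {
    # Anonymous field: messages are parsed one after another but not stored.
    : Request[];
};

type Request = unit {
    # Hypothetical framing: a 2-byte length prefix (network byte order by default).
    len: uint16;

    # Consume exactly `len` bytes, so the message ends at a known offset
    # instead of running to the end of the TCP stream as &eod would.
    payload: bytes &size=self.len;
};
```

With framing like this, each Request finishes as soon as its payload has been read, and the enclosing vector simply continues with the next message until the stream ends. Protocols without a length prefix would need a different way to find the message boundary, for example a delimiter.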
