Composable Object Streams
In my last post I introduced the pcap-socket module to help test against real, captured network data. I was rather happy with how that module turned out, so I decided to mock out dgram next in order to support testing UDP packets as well.
I almost immediately ran into a few issues:
- The dgram module does not implement a streams2 duplex API. It still provides an old-style “spew-stream”.
- I wanted to share code with pcap-socket for parsing ethernet frames and IP headers.
- I also wanted to implement some missing features such as IP fragment reassembly. These types of features require operating across multiple packets and therefore fit the streaming model better than the simple utility function approach.
Using the basic rebuffering byte streams provided by the new streams2 API seemed problematic. For one, UDP packets are distinct and shouldn’t be summarily rebuffered. Also, I needed a way to extract packet header information and pass it along with the byte stream.
I was considering a couple of ways to proceed when Raynos was kind enough to implement object streams.
This seemed to solve a lot of my problems. I could now turn off rebuffering and I could pass arbitrary objects around.
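To make the rebuffering difference concrete, here is a small sketch using the core PassThrough stream. The packet contents are made up for illustration. It relies on the documented behavior that read() with no size returns everything in the internal buffer for a byte stream, but only a single item for an object-mode stream.

```javascript
'use strict';
const { PassThrough } = require('stream');

// Byte mode: two separate writes are merged in the internal buffer,
// so the original "packet" boundaries are lost on read().
const bytes = new PassThrough();
bytes.write(Buffer.from('packet-1'));
bytes.write(Buffer.from('packet-2'));
const merged = bytes.read(); // a single Buffer holding both writes

// Object mode: each write stays a distinct item, and arbitrary
// objects (not just Buffers or strings) can be passed along.
const objects = new PassThrough({ objectMode: true });
objects.write({ data: Buffer.from('packet-1') });
objects.write({ data: Buffer.from('packet-2') });
const first = objects.read();  // only the first object
const second = objects.read(); // only the second object
```

For UDP this matters because each datagram is a complete unit, and merging two of them would make the result meaningless.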
The new object mode, however, did create one new issue.
Now that streams are not just ordered bytes, how can I write general purpose composable streams other people could easily use? If every person uses their own object structure then it could be very difficult to put together separate stream modules in a useful way.
Over time I came up with an approach that seemed to work well for the network protocol domain. Essentially, I structured messages like this:
var msg = {
  data: buf,
  offset: 14,
  ether: {
    src: '01:02:03:04:05:06',
    dst: '06:05:04:03:02:01',
    type: 'ip',
    length: 14
  }
};
Each message is the combination of some binary Buffer data and additional meta-data. In this example I have an ethernet frame that’s been parsed off the start of the msg.data buffer.
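As a rough illustration of how a stream could produce a message like the one above, here is a minimal sketch built on the core Transform class in object mode. This is not the actual ether-stream implementation. The names formatMac, parseEther and createEtherStream are hypothetical, and the sketch only recognizes the IPv4 ethertype (0x0800).

```javascript
'use strict';
const { Transform } = require('stream');

// Format 6 raw bytes starting at `start` as a colon-separated MAC string.
function formatMac(buf, start) {
  const parts = [];
  for (let i = start; i < start + 6; i++) {
    parts.push(buf[i].toString(16).padStart(2, '0'));
  }
  return parts.join(':');
}

// Hypothetical helper: parse the 14-byte ethernet header found at
// msg.offset, store the result under msg.ether, and advance msg.offset.
function parseEther(msg) {
  const start = msg.offset || 0;
  if (msg.data.length < start + 14) {
    throw new RangeError('buffer too short for an ethernet header');
  }
  const typeCode = msg.data.readUInt16BE(start + 12);
  msg.ether = {
    dst: formatMac(msg.data, start),     // destination comes first on the wire
    src: formatMac(msg.data, start + 6),
    type: typeCode === 0x0800 ? 'ip' : 'unknown',
    length: 14
  };
  msg.offset = start + 14;
  return msg;
}

// Wrap the parser in an object-mode Transform so it can be piped.
function createEtherStream() {
  return new Transform({
    objectMode: true,
    transform(msg, encoding, callback) {
      try {
        callback(null, parseEther(msg));
      } catch (err) {
        callback(err);
      }
    }
  });
}
```

Note that msg.data is never sliced or copied. Each stage only records what it found and moves msg.offset forward for the next stage.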
After a full parsing chain:
var ipstream = new IpStream();
var udpstream = new UdpStream();
ipstream.pipe(udpstream);
ipstream.write(msg);
var out = udpstream.read();
The resulting out message might look like this:
var out = {
  data: buf,
  offset: 42,
  ether: {
    src: '01:02:03:04:05:06',
    dst: '06:05:04:03:02:01',
    type: 'ip',
    length: 14
  },
  ip: {
    src: '1.1.1.1',
    dst: '2.2.2.2',
    flags: {
      df: false,
      mf: false
    },
    protocol: 'udp',
    protocolCode: 17,
    offset: 0,
    id: 12345,
    length: 20
  },
  udp: {
    srcPort: 5432,
    dstPort: 52,
    dataLength: 500,
    length: 8
  }
};
This lets us inspect all of the extracted information at the end of the processing pipeline.
This approach also lets different stream implementations work together. For example, the IpStream can inspect the msg.ether.type property provided by EtherStream to see if msg represents an IP packet or not.
This approach also allows streams to avoid stepping on each other’s toes. Both the EtherStream and IpStream produce src properties, but they don’t conflict because they are namespaced under ether and ip.
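The cooperation and namespacing described above can be sketched as follows. Again, this is not the real ip-stream code. The names formatIpv4, parseIp and createIpStream are hypothetical, only a few IPv4 header fields are extracted, and silently dropping non-IP messages is just one possible policy that I am assuming for the example.

```javascript
'use strict';
const { Transform } = require('stream');

// Format 4 raw bytes starting at `start` as a dotted-quad IPv4 address.
function formatIpv4(buf, start) {
  return Array.from(buf.subarray(start, start + 4)).join('.');
}

// Hypothetical helper: use the meta-data left by an earlier ethernet
// stage to decide whether this message carries an IPv4 packet.
// Returns null when it does not.
function parseIp(msg) {
  if (!msg.ether || msg.ether.type !== 'ip') {
    return null;
  }
  const start = msg.offset;
  // The low 4 bits of the first byte hold the header length in 32-bit words.
  const headerLength = (msg.data[start] & 0x0f) * 4;
  msg.ip = {
    src: formatIpv4(msg.data, start + 12), // lives beside msg.ether.src
    dst: formatIpv4(msg.data, start + 16),
    protocolCode: msg.data[start + 9],
    length: headerLength
  };
  msg.offset = start + headerLength;
  return msg;
}

// Object-mode Transform that forwards parsed IP messages and drops
// everything else (an assumption made for this sketch).
function createIpStream() {
  return new Transform({
    objectMode: true,
    transform(msg, encoding, callback) {
      const out = parseIp(msg);
      if (out) {
        callback(null, out);
      } else {
        callback();
      }
    }
  });
}
```

Because each stage writes only under its own key, msg.ether.src and msg.ip.src can both survive to the end of the pipeline.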
To help solicit feedback on this approach I started a gist that outlines the approach. If you’re interested or have an opinion please check it out. I’d love to know if there is a better, more standard way to build these sorts of object streams.
Oh, and I did finally implement the dgram mock object. See the pcap-dgram module for examples. Both it and pcap-socket are built on top of the new ether-stream and ip-stream. And ip-stream does indeed now support fragmentation reassembly.
All of these composable object streams are implemented using a new base class module called object-transform. It makes it fairly easy to write these kinds of transformations.
Of course, this still leaves me with my first issue. The dgram core module still does not provide a streams2 API. If this message structure makes sense, however, it should now be possible to provide this API without losing the rinfo meta-data provided in the current 'message' event. UDP wouldn’t benefit from the back pressure improvements, but this would allow dgram to easily pipe() into other composable stream modules.
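To show roughly what that could look like, here is a minimal sketch that wraps anything emitting dgram-style 'message' events in an object-mode Readable. This is not an existing dgram API. The createMessageStream name and the choice to store the remote info under msg.rinfo are my own assumptions for illustration.

```javascript
'use strict';
const { Readable } = require('stream');

// Hypothetical wrapper: turn a dgram-style socket into an object stream.
// Each datagram becomes one message with its rinfo kept as meta-data.
function createMessageStream(socket) {
  const stream = new Readable({
    objectMode: true,
    // Datagrams arrive whether or not the consumer is ready, so there
    // is nothing useful to do when the stream asks for more data.
    read() {}
  });

  socket.on('message', (buf, rinfo) => {
    // The return value of push() is ignored: UDP cannot be paused.
    stream.push({ data: buf, offset: 0, rinfo: rinfo });
  });
  socket.on('close', () => stream.push(null));
  socket.on('error', (err) => stream.destroy(err));

  return stream;
}
```

With a real socket this might be used as createMessageStream(dgram.createSocket('udp4')), followed by a pipe() into any stream that understands the data, offset and meta-data layout.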
Again, if you have an opinion on whether this is useful or how it can be improved, please comment on the gist or send me a tweet.
Thank you!