Generic Binary-Data Pack/Unpack Module?

Having just written yet another module to pack/unpack data into a particular binary format, I'm wondering if someone has a more effective method they'd like to share?  I'm thinking of some module that can handle RLE/TLV-type structures, complex "encodings" such as UTF-8 continuation bytes, struct-like pack/unpack for static elements, efficient loading of arrays, and bit-field pack/unpack.  It would also need some way to customize handling of a given element within the format (so you could write constraints or what have you), some way to provide defaults, both static and calculated, some way to specify derivative fields, checksums and the like, some way to...
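To make the wish list concrete, here is a minimal sketch of the kind of hand-written code in question, using only the standard library. The record layout (a 1-byte tag, a 2-byte big-endian length, the value, then a CRC32 of the value) is invented for illustration, as are the names pack_tlv, unpack_tlv and unpack_all; it is not any particular real format.

```python
import struct
import zlib

# Hypothetical layout, made up for this example:
#   tag (1 byte) | length (2 bytes, big-endian) | value | CRC32 of value (4 bytes)
HEADER = struct.Struct(">BH")
TRAILER = struct.Struct(">I")


def pack_tlv(tag, value):
    """Pack one record; the length and checksum fields are derived from value."""
    checksum = zlib.crc32(value) & 0xFFFFFFFF
    return HEADER.pack(tag, len(value)) + value + TRAILER.pack(checksum)


def unpack_tlv(data, offset=0):
    """Unpack the record starting at offset; return (tag, value, next_offset)."""
    tag, length = HEADER.unpack_from(data, offset)
    start = offset + HEADER.size
    end = start + length
    if end + TRAILER.size > len(data):
        raise ValueError("truncated record")
    value = bytes(data[start:end])
    (checksum,) = TRAILER.unpack_from(data, end)
    if checksum != zlib.crc32(value) & 0xFFFFFFFF:
        raise ValueError("checksum mismatch")
    return tag, value, end + TRAILER.size


def unpack_all(data):
    """Walk a buffer of back-to-back records and return [(tag, value), ...]."""
    records = []
    offset = 0
    while offset < len(data):
        tag, value, offset = unpack_tlv(data, offset)
        records.append((tag, value))
    return records
```

Even in this toy case the length and checksum are "derivative fields" that the caller never supplies, which is exactly the sort of thing that ends up re-implemented for every new format.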

Whenever I look at generalizing this pattern, it always seems to turn into "you should have the equivalent of a compiler/parser-generator produce this code", but if someone has a good, generic solution, I'd love to hear about it.
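A rough sketch of where that generalization tends to lead, assuming a very small declarative field table that gets "compiled" into a pack/unpack pair. The helper compile_format and the HEADER_FIELDS layout are hypothetical names made up for this example; only struct.Struct and collections.namedtuple are real library calls, and this covers static fields only.

```python
import struct
from collections import namedtuple


def compile_format(name, fields, byte_order=">"):
    """Build a record type plus pack/unpack functions from (name, code) pairs.

    Each code is a standard struct format character such as "B", "H" or "I".
    """
    layout = struct.Struct(byte_order + "".join(code for _, code in fields))
    record = namedtuple(name, [field for field, _ in fields])

    def pack(**values):
        # Going through the namedtuple enforces that every field is supplied.
        return layout.pack(*record(**values))

    def unpack(data):
        return record._make(layout.unpack(data))

    return record, pack, unpack


# Hypothetical 8-byte header, invented for illustration.
HEADER_FIELDS = [
    ("version", "B"),
    ("flags", "B"),
    ("length", "H"),
    ("sequence", "I"),
]
Header, pack_header, unpack_header = compile_format("Header", HEADER_FIELDS)
```

Calling pack_header(version=1, flags=0, length=12, sequence=7) produces the 8 packed bytes, and unpack_header turns them back into a Header tuple. Defaults, constraints, variable-length fields and checksums would each need more machinery on top, which is where this starts to look like a small compiler.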


  1. Dirkjan Ochtman

    Dirkjan Ochtman on 03/30/2010 1:35 a.m. #

    This kind of sounds like Hachoir:

  2. Charlie

    Charlie on 03/30/2010 2:53 a.m. #

    I'm not sure if it'll meet your requirements, but I worked with Construct a while ago and loved it. The way it outlines the data's structure so explicitly made the code 100% clearer.

  3. rdm

    rdm on 03/30/2010 5:37 a.m. #

    hdf5 library maybe?

  4. Mike C. Fletcher

    Mike C. Fletcher on 03/30/2010 9:56 a.m. #

    Construct looks like exactly the kind of project I was envisioning. Same basic interface as my current code (in a number of projects).

    Hachoir actually looks sexier for an "out of box" collection of primitives (given it has some 70 formats defined already).

    Not sure what performance is like, but it looks like I've got some playing to do when I get the time :) .

  5. bs

    bs on 03/30/2010 11:25 a.m. #

    Sounds like you want Protocol Buffers, until you mention that you don't like code generation. In that case, Avro might fit the bill; however, you pay a performance penalty when you use Avro's dynamic packing/unpacking. Avro does have code generation as an optimization feature, though.

  6. Mike Fletcher

    Mike Fletcher on 03/30/2010 11:33 a.m. #

    I'm fine with code generation; as I mentioned, it looks like a compiler/parser would be the appropriate path :) . Avro and Protocol Buffers are, AFAIK, *particular* binary formats. I'm looking for a mechanism that lets you interact with already-defined binary formats (efficiently, with a reasonable API).

  7. Felix Schwarz

    Felix Schwarz on 04/10/2010 4:48 a.m. #

    As 'construct' is not a nice google keyword, I'd like to point interested visitors to the construct homepage:
