I've always been a fan of data-compression and encryption technologies, or, as I think of them and numerous related areas, data-transformation systems, since (for lossless compression and encryption, the only kinds I've had much interest in) transformations are all they really are.
Lately I've been most interested in a surprisingly little-researched area, namely the compression of hundreds or thousands of very small items, typically each less than a hundred bytes. This has obvious applications in on-line gaming and, for example, in compressing the values in a large database of keys, as in MU*s such as Fuzzball.
Far too much research has focused on absolute compression ratios, but a LOT of those techniques actually do worse on very small files because of the 'warm up' time they need before they really shine. It's very much akin to optimizing a vehicle for stop-and-go versus highway driving.
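To make the warm-up cost concrete, here is a rough Python sketch using the standard zlib module (my pick purely for illustration; it is not one of the techniques discussed here). The record format `player<N>:hp=<N>` is made up. It compresses a thousand tiny records one at a time, and then again as one concatenated stream.

```python
import zlib

# Hypothetical tiny records, each well under a hundred bytes.
records = [f"player{i}:hp={i % 100}".encode() for i in range(1000)]

raw_total = sum(len(r) for r in records)

# Each record compressed on its own: the compressor starts cold every time
# and pays the fixed header/checksum overhead once per record.
per_record_total = sum(len(zlib.compress(r)) for r in records)

# All records compressed as one stream: the compressor can reuse what it
# has already seen in earlier records.
single_stream_total = len(zlib.compress(b"".join(records)))

print("raw bytes:            ", raw_total)
print("compressed one by one:", per_record_total)
print("compressed as stream: ", single_stream_total)
```

On records like these, the one-by-one total should come out larger than the raw data, while the single stream comes out smaller than it.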
Lately I've been stepping through a couple of compression techniques, in particular dynamic (Vitter) Huffman compression, and ran across something I found useful and interesting: a bijective version of Huffman compression. It has a unique trait, namely that any bitstream is valid input. That makes it very useful for things such as gaming traffic, because you can always safely call the decompress routine on a packet of data, regardless of content, and it will transform without error.
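For contrast, here is what the usual, non-bijective situation looks like. This is a small Python sketch that uses the standard zlib module as a stand-in for an ordinary compressed format; it is not the bijective Huffman code itself. The helper name `try_decompress` and the sample bytes are made up for illustration.

```python
import zlib

def try_decompress(packet: bytes):
    """Return the decompressed bytes, or None if the packet is not a valid zlib stream."""
    try:
        return zlib.decompress(packet)
    except zlib.error:
        # An ordinary format rejects most arbitrary byte strings.
        return None

valid = zlib.compress(b"move north")
garbage = b"\x00\x01\x02\x03"  # arbitrary bytes, not a zlib stream

print(try_decompress(valid))    # the original bytes come back
print(try_decompress(garbage))  # rejected, so the helper returns None
```

With a bijective coder there would be no error branch to write, since every possible packet decodes to something (though not necessarily to anything meaningful).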
This led me to download the code and try to understand it. I... don't believe I've ever seen such horribly mangled code before in my life, not even in the horrible Pascal/x86 assembler mashups from my demo-coding, tutorial-reading days.
That in turn has me rewriting the entire block of code, but while doing that I realized... I really don't know of any good repository for 'publishing' example source code for things like Huffman compression or range coding. Google can find examples, but there's no single site for posting wiki-editable, or even just commentable, chunks of code that can be used directly but are meant to be illustrative and readable more than fast or efficient.
Does anyone think setting up such a site would actually be useful? I'm picturing something halfway between Wikipedia, MathWorld, and SourceForge, meant more for discussion and community editing of a single file of self-contained source code, but very heavily biased towards functional source code that's easy to port (for example, favoring straight C over C++ or C#, with assembly being unwelcome) and not just commentary and mathematical equations.