The expanses of WolfWings' land
scratched on the wall for all to see


December 21st, 2006
December 21st, 2006
December 21st, 2006
December 21st, 2006
December 21st, 2006

[User Picture]01:22 am - Slowly rummaging through the code everywhere...

Stuck up in San Jose because there's nowhere for me to sleep at my grandmother's this week, or possibly next week either. I'll find out this weekend what's up.

On the coding front, I'm almost done re-writing (since I can't really call it a conversion anymore, it's a full re-write from base math/concepts now) the bijective Arithmetic encoder in straight C.

OldNew
void SimpleAdaptiveModel::AddP(int sym,U32 n)
  {
  for (sym+=symzeroindex;sym;sym>>=1)
    probheap[sym]+=n;
  prob1=probheap[1];
  }
void UpdateSymbol (heap *heap, signed int symbol, unsigned int weight) {
	/* -1 = No Symbol */
	if (symbol < 0)
		return;

	/* Set 'extra high' bit so we can bit-shift to update the tree. */
	symbol = symbol + heap->heapsizehalf;

	/* Root node, 2 child nodes, 4 grandchildren, etc, etc. */
	/* Actual update happens from the bottom up though.     */
	while (symbol) {
		heap->heap[symbol - 1]++; /* Increment this entry on the heap.       */
		symbol >>= 1;              /* Shift the counter one bit to the right. */
	}
}

I've realized one other nice trick that comes as a side-effect of Bijective compression. The routines, when properly written, by their very definition do not require 'end of file' markers, or anything except X markers for X symbols in an alphabet. I.E. If you don't have an explicit 'end of file' marker, and have to be able to have decompress(compress(file)) always return the same results as compress(decompress(file)) then you can't have things like file-length or original file-name tags inside the resulting files either. They are truly self-contained, opaque, compressed blocks of raw data. This makes them great for encapsulating in other formats, though it also makes them incredibly fragile because if any bits get changed the entire morass goes splat.

On a side-note, I've started looking into adding this compression to the Quake 3 networking protocol. This led me into looking up better methods of compressing floating-point numbers, and a rather straight-forward observation: IEEE floating-point numbers are not formatted in a way that's good to compress, but it is is a format that's easy to do comparison operations on. Amusingly, some simple bit-shuffling makes the numbers a lot more compressible byte-wise. Long story short, glue the sign bit onto the top of the Mantissa, and deal with the Exponent separately since it's already an 8-bit number. I'm sure that's in some publication out there somewhere, but I only had reason to figure it out for myself tonight for the first time ever.

The only complication will be the sheer quantity of encoder-heaps I'll have to track if I want to support ongoing heaps (the core bit of math-fu in the arithmetic encoder) between server frames instead of only one-shots for each frame. Each heap takes up about 2k, and I think I'll need four heaps tracked ideally. (Floating-point Exponent, Floating-point Sign/Mantissa, 32/16-bit Integers, and Text) So, about 8k per client connection per frame of sync delay... so roughly 8k per client per 50ms of latency between them and the server, on average. Worst-case, about 10MB of memory used if a fully-loaded 64-player Quake3 server had all the players >999 ping. At that point I believe some kind of 'skip the delta, here the full state' mode would be better. More realistically, I'll probably allocate a fixed buffer of 'heaps' ahead of time, and if no more can be allocated it'll force a client to 'zero state' frames, and flush all their heaps back into the pool. Just means I'll have to allocate one of the 'reserved' bits to flag a frame as 'heap based on delta-from frame' or 'heap based on zero' but that's pretty minor.

0 commentsLeave a comment
?

Log in

No account? Create an account