The expanses of WolfWings' land
scratched on the wall for all to see

August 26th, 2009
August 26th, 2009
August 26th, 2009
August 26th, 2009
August 26th, 2009

[User Picture]08:33 am - Diffie-Hellman Key Exchange...

I've been working on building an SSH client, focussing on only supporting SSH 2.0, and ignoring all unneeded features and backwards-compatability options so this will fundamentally be a Java ME MIDP 2.0/CIDP 1.1 SSH 2.0-only client; no port forwarding, no X11 forwarding, no strange keyboard or terminal mappings just multiple shells on Linux servers running recent distro's.

Anyways, I've gotten to the Diffie-Hellman Key Exchange part, and I hunted for hours to find a good implementation of the math that didn't gobble up scads of memory and run way too slow for use in Java on my BlackBerry. It was a deep-recursive function, that buried itself as many layers deep as the numbers it was working with were wide.

But... there was a saving grace here, I unwound the function-calls, and verified that it was actually a linear test. It just had to start at the MSB of one operand, and run down the line to the LSB performing a core loop with one optional branch. Huzzah, bit-wise operands can convert this to a loop!

Well... I built and tested the recursive function against this new version, woo, it worked. Then I realized this may be a bit of code that others would find useful. So... posting it here for now in case anyone else needs/wants it. It's fully C99-compliant, and I'll update this post when I make a Java version, since the built-in function for this isn't available in Java ME as far as I can find.

/* Non-recursively calculates X to the power of Y, modulus N. */
unsigned long long int XpowYmodN(
                unsigned long long int x,
                unsigned long long int y,
                unsigned long long int n) {
        unsigned long long int accum, slice;
        int i;

        /* Find the single highest bit set of Y.
         *      This is done with by bitshifting and boolean-ORing
         * Y against itself to 'bit fill' all the way to the LSB.
         *      We then increment the accumulator to convert to a
         * single-bit value we use as a bitfield mask from then on.
         *      Note that we initially bit-shift Y one bit to the
         * right to avoid an overflow condition, as we'd have to
         * bit-shift the result one to the right otherwise.
        slice = y >> 1;
        i = 1;
        for (;;) {
                /* Stop when I is large enough to zero out
                 * the number with a bitwise right-shift. */
                accum = (slice >> i);
                if (accum == 0) {
                slice |= accum;
                i += i;

        /* This is based on power chaining over modulus N.
         * Normally this is a recursive computation, this
         * construction reduces it to a tight inner loop
         * that most compilers can keep in-register. */
        accum = x % n; /* The 'Y == 1' deepest starting point. */
        for (;;) {
                slice >>= 1;
                if (slice == 0) {
                        break; /* Out of bits to process. */
                accum = ((accum * accum) % n);  /* Always a power-2 cycle. */
                if (y & slice) {                /* The optional *X cycle. */
                        accum = ((accum * x) % n);

        return accum;
3 commentsLeave a comment

Log in

No account? Create an account