Fast JavaScript / C++ version #11

jjrv · 2024-09-25T18:31:58Z

This branchless version in JavaScript:
https://gist.github.com/jjrv/b99d3c79eea6e7cc51bb148f309135db

maps the first 256 points along the curve to x, y coordinates and back on my 2.2 GHz CPU in about 1.8 ms.

I've tested it for all about 4 billion 32-bit indices. The test took 120 seconds on a single thread.

Here's a slightly different branchless version in C++:
https://github.com/rawrunprotected/hilbert_curves

becheran · 2024-09-29T18:04:35Z

Cool. Is it also a generic version which takes any kind of integer type as input (such as u8, u16, u32, and u64)? Or is it fixed to one specific length? I already have a branch with the cpp version of the hilber curve in the benchmark resulst: https://github.com/becheran/fast-hilbert/blob/cpp_cmp/benches/benchmark.rs

I just have no Idea how the algorithm works and if it can be extended to work with other integer types as fast hilbert does right now.

jjrv · 2024-09-30T10:09:41Z

Here's the C++ code ported to TypeScript and extended to support 64-bit output. Sorry, I just prefer it for dev and testing. It's not far from the original. I removed some temporary variables that could be avoided. Every round with the shift amounts doubled, adds support for twice the number of input / output bits. In JavaScript integers are 32 bits. In Rust you can use 64-bit integers and add another round (also to zip, which then needs longer masks) to support 128-bit output.

On the other hand every halving of bit count removes the need for one round. A round here is 6 lines of code. I've added separator comments between them. I'd expect that if inputs are short like only 8 bits, a lookup table probably makes more sense.

I mean at least for 4-bit inputs just:
return table[(y << 4) | x]
That's the entire algorithm. The table for 8-bit inputs isn't that big either (and the 4-bit case can use the same table, if you shift by 8). Should be pretty fast.

I'm using this for a Hilbert R-tree. Note that if the bit representation of a float is interpreted as an integer, sign bit flipped and all other bits also flipped for negative numbers, the order of the numbers is preserved. They can be radix sorted or meaningfully mapped on the Hilbert curve. So 128-bit curve indices make some sense for 2D 64-bit floating point coordinates used as-is, without scaling to any particular integer range.

/** Interleave bits with 0, so input bits are stored in odd bits and even bits are zeroed.
  * Equivalent to Morton code with one coordinate as the input and the other zero. */

export function zip(t: number): number {
	t = (t | (t << 8)) & 0x00ff00ff;
	t = (t | (t << 4)) & 0x0f0f0f0f;
	t = (t | (t << 2)) & 0x33333333;
	return (t | (t << 1)) & 0x55555555;
}

/** Calculate 64-bit position along 2D Hilbert curve at coordinates (x, y).
  * @param x x coordinate.
  * @param y y coordinate.
  * @return Position along curve. */

export function fromHilbert(x: number, y: number): number {
	const odd = x ^ y;

	let a = x & ~y;
	let b = ~(x | y);

	let t = a >>> 1;
	a ^= (odd & (b >>> 1)) ^ t;
	b ^= (b >>> 1) ^ (t & ~odd);

	let c = (odd >>> 1) ^ odd;
	let d = odd | ~odd >>> 1;

	t = a >>> 2;
	a ^= ((c & (b >>> 2)) ^ (t & (d ^ c)));
	b ^= ((d & (b >>> 2)) ^ (t & c));

	// --- Cut rounds below to support output only up to 8 bits (but a table lookup would be better then) ---

	t = c;
	c = ((d & (c >>> 2)) ^ (c & ((d ^ c) >>> 2)));
	d = ((d & (d >>> 2)) ^ (t & (t >>> 2)));

	t = a >>> 4;
	a ^= ((c & (b >>> 4)) ^ (t & (d ^ c)));
	b ^= ((d & (b >>> 4)) ^ (t & c));

	// --- Cut rounds below to support output only up to 16 bits (but a table lookup might be better then) ---

	t = c;
	c = ((d & (c >>> 4)) ^ (c & ((d ^ c) >>> 4)));
	d = ((d & (d >>> 4)) ^ (t & (t >>> 4)));

	t = a >>> 8;
	a ^= ((c & (b >>> 8)) ^ (t & (d ^ c)));
	b ^= ((d & (b >>> 8)) ^ (t & c));

	// --- Cut rounds below to support output only up to 32 bits ---

	t = c;
	c = ((d & (c >>> 8)) ^ (c & ((d ^ c) >>> 8)));
	d = ((d & (d >>> 8)) ^ (t & (t >>> 8)));

	t = a >>> 16;
	a ^= ((c & (b >>> 16)) ^ (t & (d ^ c)));
	b ^= ((d & (b >>> 16)) ^ (t & c));

	// --- End of rounds ---

	// Add another round with shifts 16 and 32 to support 128-bit output.
	// Then even and odd will be 64-bit numbers, and can be interleaved
	// into low and high halves of 128-bit output at the end.

	const even = (a ^ (a >>> 1)) | ~(odd | (b ^ (b >>> 1)));

	// All intermediates are 32 bits until here. Interleave them into 64-bit output
	// (technically float with 53 mantissa bits in JS).
	return (
		(
			zip(even >> 16) * 2 +
			zip(odd >> 16)
		) * 0x100000000 +
		zip(even & 0xffff) * 2 +
		zip(odd & 0xffff)
	);
}

jjrv changed the title ~~Faster JavaScript version~~ Fast JavaScript version Sep 25, 2024

jjrv changed the title ~~Fast JavaScript version~~ Fast JavaScript / C++ version Sep 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fast JavaScript / C++ version #11

Fast JavaScript / C++ version #11

jjrv commented Sep 25, 2024 •

edited

Loading

becheran commented Sep 29, 2024

jjrv commented Sep 30, 2024 •

edited

Loading

Fast JavaScript / C++ version #11

Fast JavaScript / C++ version #11

Comments

jjrv commented Sep 25, 2024 • edited Loading

becheran commented Sep 29, 2024

jjrv commented Sep 30, 2024 • edited Loading

jjrv commented Sep 25, 2024 •

edited

Loading

jjrv commented Sep 30, 2024 •

edited

Loading