Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fast JavaScript / C++ version #11

Open
jjrv opened this issue Sep 25, 2024 · 2 comments
Open

Fast JavaScript / C++ version #11

jjrv opened this issue Sep 25, 2024 · 2 comments

Comments

@jjrv
Copy link

jjrv commented Sep 25, 2024

This branchless version in JavaScript:
https://gist.github.com/jjrv/b99d3c79eea6e7cc51bb148f309135db

maps the first 256 points along the curve to x, y coordinates and back on my 2.2 GHz CPU in about 1.8 ms.

I've tested it for all about 4 billion 32-bit indices. The test took 120 seconds on a single thread.

Here's a slightly different branchless version in C++:
https://github.com/rawrunprotected/hilbert_curves

@jjrv jjrv changed the title Faster JavaScript version Fast JavaScript version Sep 25, 2024
@jjrv jjrv changed the title Fast JavaScript version Fast JavaScript / C++ version Sep 26, 2024
@becheran
Copy link
Owner

Cool. Is it also a generic version which takes any kind of integer type as input (such as u8, u16, u32, and u64)? Or is it fixed to one specific length? I already have a branch with the cpp version of the hilber curve in the benchmark resulst: https://github.com/becheran/fast-hilbert/blob/cpp_cmp/benches/benchmark.rs

I just have no Idea how the algorithm works and if it can be extended to work with other integer types as fast hilbert does right now.

@jjrv
Copy link
Author

jjrv commented Sep 30, 2024

Here's the C++ code ported to TypeScript and extended to support 64-bit output. Sorry, I just prefer it for dev and testing. It's not far from the original. I removed some temporary variables that could be avoided. Every round with the shift amounts doubled, adds support for twice the number of input / output bits. In JavaScript integers are 32 bits. In Rust you can use 64-bit integers and add another round (also to zip, which then needs longer masks) to support 128-bit output.

On the other hand every halving of bit count removes the need for one round. A round here is 6 lines of code. I've added separator comments between them. I'd expect that if inputs are short like only 8 bits, a lookup table probably makes more sense.

I mean at least for 4-bit inputs just:
return table[(y << 4) | x]
That's the entire algorithm. The table for 8-bit inputs isn't that big either (and the 4-bit case can use the same table, if you shift by 8). Should be pretty fast.

I'm using this for a Hilbert R-tree. Note that if the bit representation of a float is interpreted as an integer, sign bit flipped and all other bits also flipped for negative numbers, the order of the numbers is preserved. They can be radix sorted or meaningfully mapped on the Hilbert curve. So 128-bit curve indices make some sense for 2D 64-bit floating point coordinates used as-is, without scaling to any particular integer range.

/** Interleave bits with 0, so input bits are stored in odd bits and even bits are zeroed.
  * Equivalent to Morton code with one coordinate as the input and the other zero. */

export function zip(t: number): number {
	t = (t | (t << 8)) & 0x00ff00ff;
	t = (t | (t << 4)) & 0x0f0f0f0f;
	t = (t | (t << 2)) & 0x33333333;
	return (t | (t << 1)) & 0x55555555;
}

/** Calculate 64-bit position along 2D Hilbert curve at coordinates (x, y).
  * @param x x coordinate.
  * @param y y coordinate.
  * @return Position along curve. */

export function fromHilbert(x: number, y: number): number {
	const odd = x ^ y;

	let a = x & ~y;
	let b = ~(x | y);

	let t = a >>> 1;
	a ^= (odd & (b >>> 1)) ^ t;
	b ^= (b >>> 1) ^ (t & ~odd);

	let c = (odd >>> 1) ^ odd;
	let d = odd | ~odd >>> 1;

	t = a >>> 2;
	a ^= ((c & (b >>> 2)) ^ (t & (d ^ c)));
	b ^= ((d & (b >>> 2)) ^ (t & c));

	// --- Cut rounds below to support output only up to 8 bits (but a table lookup would be better then) ---

	t = c;
	c = ((d & (c >>> 2)) ^ (c & ((d ^ c) >>> 2)));
	d = ((d & (d >>> 2)) ^ (t & (t >>> 2)));

	t = a >>> 4;
	a ^= ((c & (b >>> 4)) ^ (t & (d ^ c)));
	b ^= ((d & (b >>> 4)) ^ (t & c));

	// --- Cut rounds below to support output only up to 16 bits (but a table lookup might be better then) ---

	t = c;
	c = ((d & (c >>> 4)) ^ (c & ((d ^ c) >>> 4)));
	d = ((d & (d >>> 4)) ^ (t & (t >>> 4)));

	t = a >>> 8;
	a ^= ((c & (b >>> 8)) ^ (t & (d ^ c)));
	b ^= ((d & (b >>> 8)) ^ (t & c));

	// --- Cut rounds below to support output only up to 32 bits ---

	t = c;
	c = ((d & (c >>> 8)) ^ (c & ((d ^ c) >>> 8)));
	d = ((d & (d >>> 8)) ^ (t & (t >>> 8)));

	t = a >>> 16;
	a ^= ((c & (b >>> 16)) ^ (t & (d ^ c)));
	b ^= ((d & (b >>> 16)) ^ (t & c));

	// --- End of rounds ---

	// Add another round with shifts 16 and 32 to support 128-bit output.
	// Then even and odd will be 64-bit numbers, and can be interleaved
	// into low and high halves of 128-bit output at the end.

	const even = (a ^ (a >>> 1)) | ~(odd | (b ^ (b >>> 1)));

	// All intermediates are 32 bits until here. Interleave them into 64-bit output
	// (technically float with 53 mantissa bits in JS).
	return (
		(
			zip(even >> 16) * 2 +
			zip(odd >> 16)
		) * 0x100000000 +
		zip(even & 0xffff) * 2 +
		zip(odd & 0xffff)
	);
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants