Cleanup the testing, the readme, and use a different mix function in the hashmap.
sheredom committed Mar 20, 2023
1 parent 9744b1a commit a3ed0ef
Showing 8 changed files with 365 additions and 986 deletions.
51 changes: 35 additions & 16 deletions README.md
@@ -16,17 +16,9 @@ The current supported platforms are Linux, macOS and Windows.

### Fundamental Design

The hashmap is made to work with UTF-8 string slices - sections of strings that
are passed with a pointer and an explicit length. The reason for this design
choice was that the hashmap is being used, by the author, to map symbols that
are already resident in memory from a source file of a programming language. To
save from causing millions of additional allocations to move these UTF-8 string
slices into null-terminated strings, an explicit length is always passed.

Note also that while the API passes char* pointers as the key - these keys are
never used with the C API string functions. Instead `memcmp` is used to compare
keys. This allows us to use UTF-8 strings in place of regular ASCII strings with
no additional code.
The hashmap is made to work with any arbitrary data keys - you just provide a
pointer and size, and it'll hash that data. Comparison is done using `memcmp`,
so zeroing out padding data is advised in structs.
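
The padding advice above can be illustrated with a small sketch. The names `example_key`, `example_key_init` and `example_key_equal` are hypothetical and are not part of hashmap.h; the sketch only assumes that keys are compared byte-wise with `memcmp`, as the README states. Strictly speaking the C standard leaves padding bytes unspecified after a member store, so this relies on the usual behaviour of mainstream compilers.

```c
#include <string.h>

/* Hypothetical key type: most ABIs insert padding bytes after `tag`. */
struct example_key {
  char tag;
  int id;
};

/* Initialise a key in place so that every byte, padding included, starts
   from zero before the fields are written. */
static void example_key_init(struct example_key *key, char tag, int id) {
  memset(key, 0, sizeof(*key));
  key->tag = tag;
  key->id = id;
}

/* Byte-wise equality over the whole struct, mirroring a memcmp-based key
   comparison. */
static int example_key_equal(const struct example_key *a,
                             const struct example_key *b) {
  return 0 == memcmp(a, b, sizeof(*a));
}
```

Without the `memset`, two keys holding the same `tag` and `id` could differ in their padding bytes and would then be treated as distinct keys.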

### Create a Hashmap

@@ -41,8 +33,26 @@ if (0 != hashmap_create(initial_size, &hashmap)) {
```

The `initial_size` parameter only sets the initial size of the hashmap - which
can grow if multiple keys hit the same hash entry. The initial size must be a
power of two, and creation will fail if it is not.
can grow if multiple keys hit the same hash entry. The size of the hashmap is
rounded up to the nearest power of two above the provided `initial_size`.
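
As a rough illustration of the rounding described above, here is a minimal sketch of rounding a 32-bit value up to a power of two. The function name `example_round_up_pow2` is hypothetical, and this is not necessarily how hashmap.h computes the capacity. In particular, the sketch assumes a value that is already a power of two is kept unchanged, which the README text does not spell out.

```c
#include <stdint.h>

/* Round x up to a power of two. Exact powers of two are returned unchanged
   (an assumption of this sketch). Inputs 0 and 1 both yield 1, and inputs
   above 2^31 wrap around to 0. */
static uint32_t example_round_up_pow2(uint32_t x) {
  if (x <= 1u) {
    return 1u;
  }

  /* Smear the highest set bit of (x - 1) into every lower bit position. */
  x--;
  x |= x >> 1;
  x |= x >> 2;
  x |= x >> 4;
  x |= x >> 8;
  x |= x >> 16;

  /* All bits below the top one are now set, so adding one carries into the
     next power of two. */
  return x + 1u;
}
```

Under these assumptions an `initial_size` of 42 would map to 64 buckets.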

There is also an extended creation function `hashmap_create_ex`:

```c
struct hashmap_s hashmap;
struct hashmap_create_options_s options;
memset(&options, 0, sizeof(options));

// You can set a custom hasher that the hashmap should use.
options.hasher = &my_hasher;

// You can also specify the initial capacity of the hashmap.
options.initial_capacity = 42;

if (0 != hashmap_create_ex(options, &hashmap)) {
  // error!
}
```
### Put Something in a Hashmap
@@ -169,6 +179,15 @@ To get the number of entries that have been put into a hashmap use the
unsigned num_entries = hashmap_num_entries(&hashmap);
```

### Get the Capacity of a Hashmap

To get the actual number of buckets allocated in the hashmap (the capacity) use
the `hashmap_capacity` function:

```c
unsigned capacity = hashmap_capacity(&hashmap);
```

### Destroy a Hashmap

To destroy a hashmap when you are finished with it use the `hashmap_destroy`
@@ -189,10 +208,10 @@ by Elliott Back. The authors have applied the following further changes:
external projects).
- Used an explicitly public domain license for the code - the
[unlicense](https://unlicense.org/).
- Changed the API to take string slices (pointer & length) instead of null
terminated strings.
- Changed the API to take arbitrary data pointers and length (it was originally
solely for UTF-8 string slices).
- Did a pass to clean up the comments and function signatures.
- Added second iterator, tests and documentation. (Samuel D. Crow)
- Added second iterator, tests and documentation. (Samuel D. Crow)
## License
21 changes: 10 additions & 11 deletions hashmap.h
@@ -270,8 +270,8 @@ HASHMAP_ALWAYS_INLINE hashmap_uint32_t hashmap_clz(const hashmap_uint32_t x);
#define HASHMAP_PTR_CAST(type, x) reinterpret_cast<type>(x)
#define HASHMAP_NULL NULL
#else
#define HASHMAP_CAST(type, x) ((type)x)
#define HASHMAP_PTR_CAST(type, x) ((type)x)
#define HASHMAP_CAST(type, x) ((type)(x))
#define HASHMAP_PTR_CAST(type, x) ((type)(x))
#define HASHMAP_NULL 0
#endif

@@ -569,15 +569,14 @@ hashmap_uint32_t hashmap_crc32_hasher(const hashmap_uint32_t seed,
}
#endif

  /* Robert Jenkins' 32 bit Mix Function */
  crc32val += (crc32val << 12);
  crc32val ^= (crc32val >> 22);
  crc32val += (crc32val << 4);
  crc32val ^= (crc32val >> 9);
  crc32val += (crc32val << 10);
  crc32val ^= (crc32val >> 2);
  crc32val += (crc32val << 7);
  crc32val ^= (crc32val >> 12);
  // Use the mix function from murmur3.
  crc32val ^= len;

  crc32val ^= crc32val >> 16;
  crc32val *= 0x85ebca6b;
  crc32val ^= crc32val >> 13;
  crc32val *= 0xc2b2ae35;
  crc32val ^= crc32val >> 16;

  return crc32val;
}
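
For reference, the murmur3 finalization step that the new lines follow can be sketched as a standalone function. The names `example_fmix32` and `example_finalize` are hypothetical and this is not a copy of the hashmap.h code path; only the shift amounts and multiplier constants are taken from the hunk above.

```c
#include <stdint.h>

/* The 32-bit murmur3 finalizer: alternating xor-shifts and odd-constant
   multiplies. Each step is invertible, so the whole mix is a bijection on
   32-bit values and maps 0 to 0. */
static uint32_t example_fmix32(uint32_t h) {
  h ^= h >> 16;
  h *= 0x85ebca6bu;
  h ^= h >> 13;
  h *= 0xc2b2ae35u;
  h ^= h >> 16;
  return h;
}

/* Fold the key length into the running hash before the final mix, as the
   hunk above does with `crc32val ^= len`. */
static uint32_t example_finalize(uint32_t hash, uint32_t len) {
  return example_fmix32(hash ^ len);
}
```

Because the mix is a bijection, two different pre-mix values can never collide after mixing; the step only spreads entropy across the bits so that masking with a power-of-two capacity uses all of them.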
46 changes: 0 additions & 46 deletions test/main.c
@@ -24,51 +24,5 @@
// For more information, please refer to <http://unlicense.org/>

#include "utest.h"
#include "hashmap.h"

UTEST_MAIN()

UTEST(main, one_byte) {
  unsigned char data[256];
  int i;
  struct hashmap_s hashmap;

  for (i = 0; i < 256; i++) {
    data[i] = (unsigned char)i;
  }

  ASSERT_EQ(0, hashmap_create(1, &hashmap));

  for (i = 0; i < 256; i++) {
    ASSERT_EQ(0, hashmap_put(&hashmap, &data[i], 1, NULL));
  }

  ASSERT_EQ(hashmap_num_entries(&hashmap), 256u);
  ASSERT_LE(hashmap_capacity(&hashmap), 2048u);

  hashmap_destroy(&hashmap);
}

UTEST(main, two_bytes) {
  unsigned short *data;
  int i;
  struct hashmap_s hashmap;

  data = (unsigned short *)malloc(sizeof(unsigned short) * 16384);

  for (i = 0; i < 16384; i++) {
    data[i] = (unsigned short)i;
  }

  ASSERT_EQ(0, hashmap_create(1, &hashmap));

  for (i = 0; i < 16384; i++) {
    ASSERT_EQ(0, hashmap_put(&hashmap, &data[i], 2, NULL));
  }

  ASSERT_EQ(hashmap_num_entries(&hashmap), 16384u);
  ASSERT_LE(hashmap_capacity(&hashmap), 65536u);

  hashmap_destroy(&hashmap);
  free(data);
}