Off heap large collections is a group of collections that are stored directly off heap, this is suitable for very large data structures that might not work with regular situations due to GC.
Also, as a big advantage off heap large collections (so far) has no dependencies at all, that means it's super light-weight and would nott cause an compatibility issues with other packages.
The goal of this project is to support varied collections similar to the JDK default collections.
Currently supported collections:
- LargeHashSet: an open-addressing based hash set
- LargeHashMap: an open-addressing based hash map
Planned to support for version 1.1:
- LargeTreeSet
- LargeTreeMap
// Create key and value serializers
StringSerializer stringSerializer = new StringSerializer();
IntSerializer intSerializer = new IntSerializer();
// Initialize and fill map
LargeMap<String, Integer> nameToAge = LargeHashMap.of(stringSerializer, intSerializer);
try (Stream<String> lines = Files.lines(Paths.get("nameToAge.csv"))) {
lines.map(line -> line.split(","))
.forEach(parts -> nameToAge.put(parts[0], Integer.parseInt(parts[1])));
}
// Do lookups with map, possibly load it as a server side cache
// ...
// Do not forgot to close the map to release the allocated memory
nameToAge.close();
Off heap large collections requires the use of serializers to serialize and deserialize the data into and out of memory; there are two types of serializers:
- Variable size serializers: As the name implies these should be used to store variable sized data, the downside here is using an additional 4 bytes per object storing it's size in bytes, there are two implemented in the framework
StringSerializer
andArraySerializer
, you can easily implement your own by implementing theObjectSerializer
interface - Fixed size serializers: These store fixed width data, hence it doesn't use the additional 4 bytes required by the variable serializers, the framework currently implements all the basic types (
BooleanSerializer
,ByteSerializer
,CharSerializer
,DoubleSerializer
,DoubleSerializer
,FloatSerializer
,IntSerializer
,LongSerializer
,ShortSerializer
) using this method, you can also easily implement your own by extending theFixedSizeObjectSerializer
abstract base class
<dependencies>
<dependency>
<groupId>com.github.mina-asham</groupId>
<artifactId>offheap-largecollections</artifactId>
<version>1.0</version>
</dependency>
</dependencies>
com.github.mina-asham:offheap-largecollections:1.1-SNAPSHOT
At the moment this project is compatible with JDK8+, at the moment the project will not compile with JDK9 as lombok does not support the JDK9 compiler
Clone:
git clone https://github.com/mina-asham/OffHeapLargeCollections.git
Download dependencies:
mvn clean install
Run tests:
mvn clean test
The off heap allocation, reading, and writing heavily relies on the sun.misc.Unsafe
module, this module will be available in Java 9 but might require special flags to enable, this will be updated when Java 9 is released.
Coming soon.
MIT License
Copyright (c) 2017 Mina Asham
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.