Elsa is an object graph serialization framework for Java. It has good compatibility with Java Serialization, but is faster and more space efficient. Elsa is great for storing objects on disk, network transfer, deep cloning, etc.
Elsa handles cyclic references and Java Serialization features such as Externalizable or writeReplace().
Elsa was originally part of the MapDB database engine, but was moved into a separate library.
The manual is hosted on GitBook.
TODO once it is finished, make readme.md shorter.
Elsa is available in the Maven repository. Jar files can be downloaded here. Currently Elsa has no dependencies and requires Java 6. The Maven snippet is below, latest VERSION is
<dependency>
    <groupId>org.mapdb</groupId>
    <artifactId>elsa</artifactId>
    <version>VERSION</version>
</dependency>
Code examples are on GitHub.
Here is a simple Hello World example:
// import org.mapdb.elsa.*;
// import java.io.*;
// data to be serialized
String data = "Hello World";
// Construct Elsa Serializer
// Elsa uses Maker Pattern to configure extra features
ElsaSerializer serializer = new ElsaMaker().make();
// Elsa Serializer takes DataOutput and DataInput.
// Use streams to create it.
ByteArrayOutputStream out = new ByteArrayOutputStream();
DataOutputStream out2 = new DataOutputStream(out);
// write data into OutputStream
serializer.serialize(out2, data);
// Construct DataInput
DataInputStream in = new DataInputStream(
        new ByteArrayInputStream(out.toByteArray()));
// now deserialize data using DataInput
String data2 = (String)serializer.deserialize(in);
Bug reports go to the issue tracker.
For questions and suggestions use MapDB support channels (chat, mailing list, subreddit). We also provide professional support and consulting.
Documentation is provided in the form of examples. TODO javadoc on web.
To speed up serialization, Elsa comes with serializers for well-known java.lang and java.util classes.
Serializers are recursive and continue graph traversal; for example, the Map serializer continues graph traversal over keys and values.
Users can also install their own serializers.
For objects with no serializer, Elsa uses slower field traversal to dive into the object graph.
By default Elsa has serializers for the following classes:
- All primitive types and their arrays: double, long, int, byte[] ...
- All primitive wrappers: Double, Long, Integer ...
- Generic array Object[]
- Collections: ArrayList, LinkedList, HashSet, LinkedHashSet and TreeSet
- Maps: HashMap, LinkedHashMap, TreeMap and Properties
- BigDecimal, BigInteger, UUID and Date
- java.lang.Class
It is possible to register custom serializers. Those are part of graph traversal and are applied to objects inside the graph (collection entries and field values).
TODO better documentation for custom serializers
Consider the following example:
List list = new ArrayList();
list.add(list);
Object a = "some huge object";
list.add(a);
list.add(a);
The list contains itself: that is a cyclic reference and could send graph traversal into an infinite loop.
Object a is in the graph twice and would cause space overhead if serialized twice.
To prevent that, Elsa tracks already visited objects in an IdentityHashMap during serialization.
A second visit to the same object only writes a number as a reference.
On deserialization, references are restored and identity is preserved.
Reference tracking also works for user-defined serializers and for collection serializers.
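The idea can be sketched with nothing but the JDK. The class below is a hypothetical illustration, not Elsa's actual implementation: it walks a graph of lists, records each visited instance in an IdentityHashMap, and emits a short back-reference when it meets the same instance again. The textual trace format (`list(n)`, `value`, `ref#n`) is made up for readability.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of reference tracking; not part of Elsa.
class ReferenceTrackingSketch {

    /** Returns a textual trace of what a reference-tracking writer would emit. */
    public static String write(Object root) {
        StringBuilder out = new StringBuilder();
        write(root, new IdentityHashMap<Object, Integer>(), out);
        return out.toString();
    }

    private static void write(Object obj, Map<Object, Integer> visited, StringBuilder out) {
        Integer id = visited.get(obj);
        if (id != null) {
            // Already visited: emit only the reference number and stop descending.
            out.append("ref#").append(id).append(' ');
            return;
        }
        // First visit: assign the next free number before descending,
        // so that a cycle back to this object is detected.
        visited.put(obj, visited.size());
        if (obj instanceof List) {
            List<?> list = (List<?>) obj;
            out.append("list(").append(list.size()).append(") ");
            for (Object item : list) {
                write(item, visited, out);
            }
        } else {
            out.append("value ");
        }
    }

    public static void main(String[] args) {
        List<Object> list = new ArrayList<Object>();
        list.add(list);                 // cyclic reference
        Object a = "some huge object";
        list.add(a);
        list.add(a);                    // same instance twice
        System.out.println(write(list));
    }
}
```

Because IdentityHashMap relies on identity rather than hashCode() and equals(), looking up the self-containing list does not recurse into the list itself.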
Maintaining an IdentityHashMap has some overhead.
So there is an option to disable this feature completely. Use ElsaMaker.referenceDisable() to disable reference tracking.
Alternatively, the IdentityHashMap can be replaced with a simple Object[], which is scanned in a for-loop with an identity (==) check on each item.
That is faster on very small graphs with only a few items. Use ElsaMaker.referenceArrayEnable() to enable identity array checks.
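A minimal sketch of such an identity array follows, assuming a growable Object[] and a linear scan. The class and method names are hypothetical and only illustrate the technique; Elsa's internal structure may differ.

```java
import java.util.Arrays;

// Hypothetical sketch of an identity array; not part of Elsa.
class IdentityArraySketch {
    private Object[] seen = new Object[8];
    private int size = 0;

    /** Returns the index of a previously seen instance, or -1 after recording a new one. */
    public int indexOrAdd(Object obj) {
        for (int i = 0; i < size; i++) {
            if (seen[i] == obj) {   // identity check, equals() is never called
                return i;
            }
        }
        if (size == seen.length) {
            seen = Arrays.copyOf(seen, size * 2);
        }
        seen[size++] = obj;
        return -1;
    }
}
```

The scan is linear in the number of tracked objects, which is why this variant only pays off for graphs with a handful of items.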
Finally, there is an option to deduplicate references by replacing the IdentityHashMap with a regular HashMap. In this case two equal objects which are not identical will become identical after deserialization. This adds some overhead on serialization for hashing and equality checks, but has no overhead on deserialization.
Use ElsaMaker.referenceHashMapEnable() to enable it.
There is a reference handling example with all configuration options.
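The difference between identity tracking and deduplication comes down to which map does the tracking. The following JDK-only sketch (a hypothetical helper, not Elsa code) feeds two equal but distinct strings into both kinds of map and counts how many entries each one keeps.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical sketch comparing identity tracking with deduplication; not part of Elsa.
class DeduplicationSketch {

    /** Adds each object to the tracker if absent and returns the number of tracked entries. */
    public static int countTracked(Map<Object, Integer> tracker, Object... objects) {
        for (Object o : objects) {
            if (!tracker.containsKey(o)) {
                tracker.put(o, tracker.size());
            }
        }
        return tracker.size();
    }

    public static void main(String[] args) {
        String first = new String("same text");
        String second = new String("same text");   // equal to first, but a different instance

        // Identity tracking keeps both instances apart.
        System.out.println(countTracked(new IdentityHashMap<Object, Integer>(), first, second));
        // A regular HashMap treats them as one object, so only one copy would be written.
        System.out.println(countTracked(new HashMap<Object, Integer>(), first, second));
    }
}
```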
Elsa tries to be compatible with Java Serialization. We require all classes to implement Serializable. We handle Externalizable interfaces correctly.
Elsa also provides hacked versions of java.io.ObjectInputStream and java.io.ObjectOutputStream. And finally, it handles lesser-known writeReplace methods and so on.
In some cases Elsa will fall back to using Java Serialization.
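For readers unfamiliar with these Java Serialization features, here is a small sketch of Externalizable and writeReplace. It uses only java.io streams, not Elsa, and the Temperature and TemperatureProxy classes are invented for illustration. The readResolve hook shown here is standard Java Serialization behaviour; whether Elsa supports it is not stated above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical classes illustrating plain Java Serialization hooks; not part of Elsa.
class WriteReplaceSketch {

    /** Value class that asks to be written as a compact proxy instead of itself. */
    public static class Temperature implements Serializable {
        private static final long serialVersionUID = 1L;
        public final double celsius;

        public Temperature(double celsius) { this.celsius = celsius; }

        // Invoked by Java Serialization before writing; the returned object is written instead.
        private Object writeReplace() { return new TemperatureProxy(celsius); }
    }

    /** Proxy that controls its own binary format through Externalizable. */
    public static class TemperatureProxy implements Externalizable {
        private double celsius;

        public TemperatureProxy() { }   // Externalizable needs a public no-arg constructor
        TemperatureProxy(double celsius) { this.celsius = celsius; }

        @Override public void writeExternal(ObjectOutput out) throws IOException {
            out.writeDouble(celsius);
        }

        @Override public void readExternal(ObjectInput in) throws IOException {
            celsius = in.readDouble();
        }

        // Invoked after reading; converts the proxy back into the original type.
        private Object readResolve() { return new Temperature(celsius); }
    }

    /** Serializes and deserializes an object with the standard java.io object streams. */
    public static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(original);
            out.close();
            ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()));
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `WriteReplaceSketch.roundTrip(new WriteReplaceSketch.Temperature(21.5))` returns a Temperature again, even though only the proxy was written to the stream.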
TODO
Use serializer.clone(object).
TODO
A serialization format usually stores class structure metadata (field names, field order, data types) together with the serialized data. The size of the serialized data can be greatly reduced by externalizing class structure information. In the example below it is 5 bytes versus 55 bytes.
Elsa can store class structure information outside of the serialized data. There are several ways to do this. MapDB uses a Class Catalog to handle class format versions, renamed fields and so on.
A simpler and more accessible way assumes that the class format never changes, that is, serialization and deserialization share classes with exactly the same structure (no renamed fields etc.). In that case we can use simple class registration.
The simplest way to externalize class structure metadata is to register classes in ElsaMaker.
Each registered class is parsed into structural information and added into the Class Catalog.
There is an example of how to register classes; here is a shorter version:
ElsaSerializer serializer = new ElsaMaker()
        // this registers the Bean and Bean2 classes in the class catalog
        .registerClasses(Bean.class, Bean2.class)
        .make();
In the binary format a class is represented by its index in an array. So it is critical to register classes in the same order every time. Otherwise you will be unable to deserialize the data.
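The following sketch shows why the order matters. It is a hypothetical stand-in for a class catalog, not Elsa's implementation: a class is written as its position in the registered array, and a reader built with a different order resolves the same byte to a different class.

```java
// Hypothetical sketch of an index-based class catalog; not part of Elsa.
class ClassCatalogSketch {
    private final Class<?>[] catalog;

    public ClassCatalogSketch(Class<?>... registered) {
        this.catalog = registered.clone();
    }

    /** Writes the position of the class in the catalog instead of its name. */
    public byte[] writeClass(Class<?> clazz) {
        for (int i = 0; i < catalog.length; i++) {
            if (catalog[i] == clazz) {
                return new byte[] { (byte) i };
            }
        }
        throw new IllegalArgumentException("class not registered: " + clazz);
    }

    /** Resolves an index read from binary data back to a class. */
    public Class<?> readClass(byte[] data) {
        return catalog[data[0]];
    }

    public static void main(String[] args) {
        ClassCatalogSketch writer = new ClassCatalogSketch(String.class, Integer.class);
        ClassCatalogSketch swapped = new ClassCatalogSketch(Integer.class, String.class);
        byte[] data = writer.writeClass(String.class);
        // Same order resolves correctly; swapped order silently yields the wrong class.
        System.out.println(writer.readClass(data));
        System.out.println(swapped.readClass(data));
    }
}
```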
Elsa has a callback to notify the user about classes not present in the Class Catalog. This way you can assemble a list of all classes used in an object graph.
TODO provide an example. ElsaMaker unknownClassNotification(ClassCallback callback)
Some special instances can be treated as singletons.
Those do not have to be serializable; Elsa just uses the instance supplied by the user.
In the example below we serialize Thread.currentThread() in binary format.
Elsa does not try to serialize the singleton into binary form; it just writes the singleton ID. On deserialization it finds the ID and uses the singleton instance.
Singletons have reference equality (==) preserved even after binary deserialization.
It is easy to register a singleton in ElsaMaker. Here is a shorter example:
ElsaSerializer s = new ElsaMaker()
        // the current thread is registered as a singleton
        .singletons(Thread.currentThread())
        .make();
In the binary format a singleton is represented by its index in an array. So it is critical to register singletons in the same order every time. Otherwise you will be unable to deserialize the data.
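As a rough illustration of the ID mechanism, the hypothetical table below (not Elsa's actual code) maps a registered instance to its index on write and returns the very same instance on read, which is why reference equality survives and why the instance itself never needs to be serializable.

```java
// Hypothetical sketch of an index-based singleton table; not part of Elsa.
class SingletonTableSketch {
    private final Object[] singletons;

    public SingletonTableSketch(Object... singletons) {
        this.singletons = singletons.clone();
    }

    /** Returns the singleton ID (its index), or -1 if the object is not a registered singleton. */
    public int idOf(Object obj) {
        for (int i = 0; i < singletons.length; i++) {
            if (singletons[i] == obj) {
                return i;
            }
        }
        return -1;
    }

    /** Resolves an ID read from binary data back to the registered instance. */
    public Object resolve(int id) {
        return singletons[id];
    }
}
```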