Skip to content
colestewart edited this page May 7, 2015 · 1 revision

##Introduction##

###License

MercuryDB (HgDB) is licensed under the MIT Open Source License

###Note about Examples in This Document

The Weborders example is located in the root folder of the repository. We will use an HgDB instance built from the weborders package from this example project to demonstrate every aspect of the API.

#Single Table Queries#

Every query on an HgDB instance is done via static methods in the HgDB class in the org.mercurydb.queryutils package. For single table queries, we use the following method:

public static <T> HgStream<T>
query(AbstractFieldExtractablePredicate<T,?>... fieldPredicates)

As an example, we can query the Customer table like so to retrieve all Customer instances where the name (cname) is equal to “Bartholomew Jameson”:

HgStream<Customer> stream =
HgDB.query(CustomerTable.eq.cname(“Bartholomew Jameson”);

for (Customer c : stream) {
    // do something with c
}

By default 7 relations are statically defined in org.mercurydb.HgRelation:

EQ : a==b
NE : a!=b
LT : a<b
LE : a<=b
GT : a>b
GE : a>=b
IN : a∈b

Every public field observed in the database has each of these relations defined for use in queries. EQ and NE require that the field’s type overloads equals() as appropriate (unless the default implementation, object address identity, is the desired implementation). Inequality relations additionally require that the field type TYPE implements the Comparator<TYPE> interface to define the desired “natural order” of the elements. The IN relation requires that the type of the element on the right hand side of the relation is a Java Collection, and that equals() is overloaded as appropriate for the test. For maximum flexibility, the == operator is not used for this test, but we rather use equals().

Note: we explicitly do not provide a mechanism for using these relations on fields of type TYPE along with a Comparator<TYPE> parameter because we will optimize tables for the built-in comparisons specified by equal() and the natural ordering defined by implementing Comparable<TYPE>. Since the optimizations would be lost, we provide a different API which accepts predicates (as defined below) to provide alternate comparisons which don’t use the generated indexes based on the natural ordering.

Users are also free to define their own predicates. Note that in the following code segments we will mostly adhere to the Java 8 lambda syntax for simplicity. However, all HgDB code is backwards compatible down to Java 6 by using alternate syntax (implementing anonymous classes instead of using lambdas). Observe the following stream declarations that result in equivalent streams as above:

// Java 7 syntax
HgPredicate<Customer> cPred = new HgPredicate<Customer> {
    public boolean test(Customer value) {
        return value.cname.equals(“Bartholomew Jameson”);
    }
}

// Java 8 Syntax
// Using Customer predicate, same as above
HgPredicate<Customer> cPred = c -> c.cname.equals(“Bartholomew Jameson”);

HgStream<Customer> stream = HgDB.query(CustomerTable.predicate(cPred));

// Using String predicate on field cname
HgPredicate<String> nPred = cname -> cname.equals(“Bartholomew Jameson”);
stream = HgDB.query(CustomerTable.predicates.cname(nPred));

The above code demonstrates that we may define a predicate over a field specifically via the predicates class member of the table, or we may define a more general predicate over the containing class of the field via the predicate method.

Note: for simplicity and readability, it is not necessary to store the HgPredicate in a variable. As with all Java code, the choice of whether to do this with a temporary variable or in-line in the method call is left up to the author.

Looking back at the prototype of the query method, we see that the method accepts a variable number of predicates. There is an implicit and relation between all of the arguments, such that an entry in the table must pass each of the predicates passed as arguments to be included in the resulting stream. The query method will seed the result with the smallest indexed argument, if there is one, and then perform a filter operation for the rest of the arguments. In order to or the results of two queries, we can either define a custom predicate or concatenate query results using the HgStream’s concat method. Both methods are shown below:

HgStream<Order> stream =
HgDB.query(OrderTable.predicates.ono(o -> o == 1020 || o == 1021));

stream =
HgDB.query(OrderTable.eq.ono(1020))
.concat(HgDB.query(OrderTable.eq.ono(1020));

One other way of further filtering results manually is to use HgStream’s filter method. As an example we will retrieve an unfiltered stream of Order instances using the Order table’s stream() method explicitly and filter on records in which the order number is equal to 1020.

HgStream<Order> stream = 
OrderTable.stream().filter(OrderTable.eq.ono(1020);

#Multi-Table Queries (Joins)#

Multi-Table Queries, or table “joins”, are performed using a set of statically defined methods in the org.mercurydb.HgDB class. We will start by observing a simple join of the Order table and Order Detail table on the order number field present in both classes:

HgTupleStream stream =
HgDB.join(OrderTable.on.ono(), OdetailTable.on.ono());

For those accustomed to relational database queries in SQL, this is functionally equivalent to:

SELECT * FROM ORDERS INNER JOIN ODETAILS ON ORDERS.ONO=ODETAILS.ONO

All join operations in HgDB are inner joins. That is, joins return all records where the results from table 1 intersect with table 2. The default relation for the join is equality, and in the above query the relation was omitted. As a third argument to the join, we can use any of the 7 defined relations in HgRelation, or we can also define our own custom predicate. We will rewrite the above query in a couple of ways:

// With the explicit equality relation as the third argument
HgTupleStream stream =
HgDB.join(OrderTable.on.ono(), OdetailTable.on.ono(), HgRelation.EQ);

// With a custom predicate
HgBiPredicate<Integer, Integer> eqPred = (o, od) -> o == od;
stream = HgDB.join(OrderTable.on.ono(), OdetailTable.on.ono(), eqPred);

Like the HgPredicate class for predicates involving records from single table queries, we define an HgBiPredicate class for defining custom predicates involving records from two tables.

Joins involving more than two fields must be performed by wrapping the join arguments in JoinPredicate instances or by feeding the resulting HgTupleStream of one join result into another join. In the first case HgDB will try and perform the joins in an optimal order. In the second case the order is defined by the user. The first case is shown below:

// Using the HgDB optimizer
HgTupleStream stream =
HgDB.join(
    new JoinPredicate(
        OrderTable.on.ono(), 
        OdetailTable.on.ono()),
    new JoinPredicate(
        CustomerTable.on.zipcode(), 
        OdetailTable.on.storezip()),
    new JoinPredicate(
        ZipcodeTable.on.zipcode(),
        CustomerTable.on.zipcode()));

This is functionally equivalent to the following SQL:

SELECT * FROM ORDERS, ODETAILS, CUSTOMERS, ZIPCODES
  WHERE ORDERS.ONO=ODETAILS.ONO
    AND CUSTOMERS.ZIPCODE=ODETAILS.STOREZIP
    AND ZIPCODES.ZIPCODE=CUSTOMERS.ZIPCODE

Notice that in the three JoinPredicate instances in the query, the third implicit argument to specify the relation or predicate is omitted. Other relations and predicates could be specified as well.

Also notice that the JoinPredicates must unify and be compatible with one another for the method to work correctly, otherwise the method will throw an exception to the user.

Now we must discuss how elements are retrieved from the HgTupleStream instances returned by join operations in HgDB. An HgTupleStream is essentially an HgStream<HgTuple> with additional functionality for defining join metadata, like the field we are joining on. There is an ID system in place for retrieving elements from HgTuple instances. By default every table has a publicly defined TableID instance named ID that is globally unique. Using these id’s we can retrieve elements from a tuple like so:

HgTupleStream stream =
HgDB.join(OrderTable.on.ono(), OdetailTable.on.ono());

for (HgTuple t : stream) {
    Order o = t.get(OrderTable.ID);
    Odetail od = t.get(OdetailTable.ID);
}

#Join Aliases#

There is also an aliasing system in place that is necessary for performing self joins. An example is shown below:

TableID<Part> partAlias = PartTable.createAlias();
HgTupleStream stream =
HgDB.join(
    PartTable.on.price(),               // standard PartTable.ID used
    PartTable.as(partAlias).on.price(), // created partAlias used
    HgRelation.LT);

for (HgTuple t : stream) {
    Part p1 = t.get(PartTable.ID);
    Part p2 = t.get(partAlias);
    assert(p1.price < p2.price);
}

We allow the client to continue to make permanent identifiers, or “names”, until the first alias is created. In this way it is encouraged that if an alias is to be used often another permanent identifier A name can be created using the TableID.createName() method, which will reserve an id for the life of the program.

TableID<Part> partID = TableID<Part>.createName();

#Database Indexes#

Indexes are created in the source code by annotating the desired field in the source class with the @HgIndex annotation. Placing an index on a field in HgDB results in a Map<FieldType, TableType> for that index in the table classes. Note that currently all indexes and fields monitored by HgDB must be publicly visible, but we are working on other solutions. Observe the Order class from the weborders example and the generated table class by the HgDB bootstrap process.

public class Order {
    … 
    @HgIndex
    public int ono;
    …
}


public class OrderTable {
    …
    TreeMap<Integer, Order> onoIndex = new TreeMap<Integer, Order>();
    …
}

The HgIndex annotation has one parameter, ordered. By default this value is true. If ordered is true, and the indexed field implements the Comparable interface, then the resulting index will be a TreeMap. Otherwise the index will be a HashMap. The benefits for using an ordered index are clear if joins using an inequality relation are used. The following relations are supported by ordered indexes in HgDB:

HgRelation.EQ  // =
HgRelation.NE  // !=
HgRelation.LT  // <
HgRelation.LE  // <=
HgRelation.GT  // > 
HgRelation.GE  // >=

Only the equality relations, HgRelation.EQ and HgRelation.NE, are defined for non-ordered joins, as it is not possible to implement the other ordered inequality relations using a HashMap. However, the equality relation will be slightly faster if a non-ordered index is used instead of an ordered index.

#Join Algorithms#

4 join algorithms are currently supported in HgDB:

Nested Loops
Index Scan
Temporary Index Join
Index Intersection

If an index is present, and the join relation is known to be supported by the index, an index scan is performed. If there is no index present, and it is possible to index the join relation, then a temporary index is built and used to perform the join operation. If a custom predicate is defined by the client, then nested loops must be used. In our tests, it has not proven worthwhile to perform an index intersection if an index is present on both fields in the join relation. There may be some cases where an index intersection is faster than an index scan, but we have not yet explored these cases.

Clone this wiki locally