GumTree Data Model
nxi edited this page Apr 9, 2015
·
1 revision
Created by Tony Lam, last modified by Norman Xiong on Mar 16, 2010
This section explains how to become a member of Codehaus and check out the GDM project.
- Apply for a user account at Codehaus.
- Visit http://xircles.codehaus.org/signup. Fill in the details to apply for an account.
- Apply to join as a GumTree Developer
- Check out the following projects:
- https://svn.codehaus.org/gumtree/datamodel/trunk/ncsa.hdf
- https://svn.codehaus.org/gumtree/datamodel/trunk/ucar.netcdf
- https://svn.codehaus.org/gumtree/datamodel/trunk/org.freehep.jas.jas3
- https://svn.codehaus.org/gumtree/datamodel/trunk/org.freehep.jas.jas3.win32.x86 (optional)
- https://svn.codehaus.org/gumtree/datamodel/trunk/org.freehep.jas.jas3.linux.x86 (optional)
- https://svn.codehaus.org/gumtree/datamodel/trunk/org.jdom
- https://svn.codehaus.org/gumtree/datamodel/trunk/org.gumtree.data.core
- https://svn.codehaus.org/gumtree/datamodel/trunk/org.gumtree.data.nexus
- https://svn.codehaus.org/gumtree/datamodel/trunk/org.gumtree.data.test
- Or simply check out all projects under https://svn.codehaus.org/gumtree/datamodel/trunk.
- Find the test cases in the org.gumtree.data.test plug-in. Run the following JUnit test cases:
  - TestWriteToRoot.java (read and write HDF files as GDM objects)
  - TestReadNexusFile.java (read NeXus files)
  - TestCopyNexusFile.java (write NeXus files)
- In the Eclipse environment, you can run the JUnit test cases in two modes:
  - Run as JUnit test. In the run configuration, add the VM argument -Djava.library.path=${workspace_loc}/ncsa.hdf. You also need to include the log4j logging library in your classpath.
  - Run as JUnit plug-in test. Set the run mode to headless.
- There are example NeXus files in the org.gumtree.data.test/storage folder for testing purposes. There are also common HDF files, which are generated by the test cases.
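To confirm that the -Djava.library.path VM argument mentioned above was actually picked up by the JVM, a small check such as the following can help. This is a generic JVM sketch and not part of GDM; the class name `LibraryPathCheck` is made up for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of GDM: lists the entries of java.library.path.
final class LibraryPathCheck {

    // Splits a path list on the platform path separator and drops empty entries.
    static List<String> entries(String pathList) {
        List<String> result = new ArrayList<>();
        if (pathList == null) {
            return result;
        }
        for (String entry : pathList.split(File.pathSeparator)) {
            if (!entry.isEmpty()) {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A value set with -Djava.library.path=... shows up in this system property.
        for (String entry : entries(System.getProperty("java.library.path"))) {
            System.out.println(entry);
        }
    }
}
```

If the ncsa.hdf folder does not appear in the printed entries, the run configuration is probably not the one being launched.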
- Packaging: The GDM project is packaged as Eclipse plug-ins.
  - org.gumtree.data.core plug-in:
    - org.gumtree.data – The GDM interface package. All classes of the tree structure used in the GDM model implement the interfaces in this package.
    - org.gumtree.data.exception – Exceptions.
    - org.gumtree.data.io – The IWriter interface, which performs exporting. A default implementation for HDF file export is included.
    - org.gumtree.data.math – The maths library for GDM arrays.
    - org.gumtree.data.netcdf – The default (NetCDF) implementations of the GDM interfaces.
    - org.gumtree.data.utils – The utilities for the GDM model.
  - org.gumtree.data.nexus plug-in:
    - org.gumtree.data.nexus – The interfaces for the NeXus file format.
    - org.gumtree.data.nexus.netcdf – The default (NetCDF) implementations of the NeXus interfaces.
    - org.gumtree.data.nexus.utils – The utility classes for the NeXus file format.
- Package dependency diagram
The GDM project is designed to allow different implementations of the model. The default implementation, which uses the NetCDF library, is included in the plug-in. All utility logic, such as I/O, fitting and maths, is to be decoupled from the default implementation so that it remains valid for other implementations.
The NeXus model project is kept in a separate, optional plug-in. The way a NeXus file is mapped into the model currently serves ANSTO usage only. Whether we should provide a more generic model is open for discussion.
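The decoupling described above can be pictured with a minimal sketch: a utility written only against an interface keeps working when the implementation is swapped. All names here (`SimpleArray`, `InMemoryArray`, `ArrayMath`) are hypothetical stand-ins chosen for illustration, not the actual GDM types.

```java
// Hypothetical stand-in for a model interface; the real GDM interfaces are richer.
interface SimpleArray {
    long size();
    double get(long i);
}

// One possible implementation, backed by a plain Java array.
final class InMemoryArray implements SimpleArray {
    private final double[] values;

    InMemoryArray(double... values) {
        this.values = values.clone();
    }

    @Override
    public long size() {
        return values.length;
    }

    @Override
    public double get(long i) {
        return values[(int) i];
    }
}

// Utility code depends only on the interface, so it stays valid for other implementations.
final class ArrayMath {
    static double sum(SimpleArray array) {
        double total = 0.0;
        for (long i = 0; i < array.size(); i++) {
            total += array.get(i);
        }
        return total;
    }
}
```

A second implementation, for example one backed by a file, could be passed to `ArrayMath.sum` without changing the utility.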
- The class diagram of the GumTree data model (GDM) is shown in the picture below.
- The concrete types: When a physical data set is loaded into the Java virtual machine, the GumTree data model can map it into four types of objects: Dataset, Group, DataItem or Attribute. We call them concrete types.
  - Dataset – is mapped to a physical data file or memory section.
  - Group – is a logical collection of other groups and data items.
  - DataItem – is a logical container of data.
  - Attribute – is the metadata of groups and data items.
  - Array – is an abstraction of array data of different types, ranks and shapes.
  - Object – is the abstraction of Group and DataItem.
- The functional types: The rest of the types are used for performing certain functions.
  - Dictionary – is used to map an x-path to a key name when looking for a sub-group or data item.
  - Dimension – is used to associate data items that share similar dimension information.
  - Index – is used to locate a single value in an array of data.
  - ArrayIterator – is used to iterate through an array of data.
  - SliceIterator – is used to iterate through the slices of an array.
  - Range – is used to describe a section of an array.
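To make the relationship between the concrete types more tangible, here is a minimal in-memory sketch of a group tree with data items and attributes. The classes `Node`, `GroupNode` and `ItemNode` are hypothetical; they only mirror the roles of Object, Group and DataItem described above and are not the GDM classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plays the role of "Object": the common abstraction of groups and data items.
abstract class Node {
    final String name;
    // Attributes are metadata attached to a group or data item.
    final Map<String, String> attributes = new LinkedHashMap<>();

    Node(String name) {
        this.name = name;
    }
}

// Plays the role of "DataItem": a logical container of data.
final class ItemNode extends Node {
    final double[] data;

    ItemNode(String name, double[] data) {
        super(name);
        this.data = data;
    }
}

// Plays the role of "Group": a logical collection of other groups and data items.
final class GroupNode extends Node {
    private final Map<String, Node> children = new LinkedHashMap<>();

    GroupNode(String name) {
        super(name);
    }

    // Attaches a child and returns it, so calls can be chained while building a tree.
    <T extends Node> T add(T child) {
        children.put(child.name, child);
        return child;
    }

    // Returns the directly attached child with this name, or null if there is none.
    Node child(String childName) {
        return children.get(childName);
    }
}
```

A tree is then built by adding groups and items to a root `GroupNode`, and metadata is attached through the `attributes` map of any node.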
- The original data model used by the default implementation, the Common Data Model (CDM), is shown below. When it is mapped to the GDM interfaces above, extended classes are used to implement Dataset, Group and DataItem. The Array type is simplified and wrapped in the default implementation. More information about the CDM is available at http://www.unidata.ucar.edu/software/netcdf-java/CDM/ .
Image:attachments/141623313/141754372.png
Copyright of this diagram belongs to the UCAR.
- You can use the GDM project to perform the following tasks:
- Open an HDF file.
- Map the contents of the file into GDM objects.
- Find a particular data item by its key name.
- Read the data as arrays.
- Carry out maths or other logic.
- Write arrays as data items in existing files or new files.
- Generate new data structures in the memory only.
- Below is sample code for reading an HDF file. The file is mapped into a dataset with a group tree structure. When the Dataset instance is created, the file is not yet open. You need to call dataset.open() to make the root of the file accessible.

    URI fileURI = new URI("file:/C:/example.hdf");
    IDataset dataset = Factory.createDatasetInstance(fileURI);
    // The file is only opened here, not when the instance is created.
    dataset.open();
    IGroup root = dataset.getRootGroup();
    …
    dataset.close();
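Because the dataset has to be closed after use, one common Java pattern is to call close() in a finally block so that the file is released even if reading fails. The sketch below uses a hypothetical `FakeDataset` stand-in so that it runs without GDM; it does not claim anything about which exceptions the real IDataset methods throw.

```java
// Hypothetical stand-in for a dataset with an open/close lifecycle; not the GDM IDataset.
final class FakeDataset {
    private boolean open;

    void open() {
        open = true;
    }

    boolean isOpen() {
        return open;
    }

    String readRootName() {
        if (!open) {
            throw new IllegalStateException("dataset is not open");
        }
        return "root";
    }

    void close() {
        open = false;
    }
}

final class OpenCloseDemo {
    // Opens the dataset, reads from it, and always closes it afterwards.
    static String readRootName(FakeDataset dataset) {
        dataset.open();
        try {
            return dataset.readRootName();
        } finally {
            dataset.close();
        }
    }
}
```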
- Once the dataset is open, you can use the root group to access any sub-group or data items in those groups.
- Use the get() methods to access any object that is directly attached to a group or data item. For example:

    String groupName = "sub_group_name";
    IGroup subGroup = root.getGroup(groupName);
    String itemName = "data_item_name";
    IDataItem item = root.getDataItem(itemName);
- Use the find() methods to access any object that is referenced by a key name in the dictionary. For example:

    String keyGroupName = "key_name_of_a_group";
    IGroup aGroup = group.findGroup(keyGroupName);
    String keyItemName = "key_name_of_a_data_item";
    IDataItem anItem = group.findDataItem(keyItemName);
- Use the code below to get the metadata (an attribute) of a group or data item.

    String attributeName = "name_of_attribute";
    IAttribute attribute = gdmObject.getAttribute(attributeName);
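The difference between the get and find lookups can be illustrated with a small sketch: get looks up a directly attached child by name, while find first translates a key into a path through a dictionary and then walks that path. `TinyGroup` and its path syntax (segments separated by `/`) are assumptions made for this illustration; the actual GDM dictionary format may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical group with a key-to-path dictionary; not the GDM IGroup.
final class TinyGroup {
    private final Map<String, TinyGroup> children = new HashMap<>();
    private final Map<String, String> dictionary = new HashMap<>();

    TinyGroup addGroup(String name) {
        TinyGroup child = new TinyGroup();
        children.put(name, child);
        return child;
    }

    void addKey(String key, String path) {
        dictionary.put(key, path);
    }

    // get: only looks at directly attached children.
    TinyGroup getGroup(String name) {
        return children.get(name);
    }

    // find: resolves the key to a path, then follows the path segment by segment.
    TinyGroup findGroup(String key) {
        String path = dictionary.get(key);
        if (path == null) {
            return null;
        }
        TinyGroup current = this;
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                continue; // skip the empty segment produced by a leading "/"
            }
            current = current.getGroup(segment);
            if (current == null) {
                return null;
            }
        }
        return current;
    }
}
```

With a key such as "det" registered for the path "/entry/detector", `findGroup("det")` reaches the nested group from the root, whereas `getGroup("detector")` on the root returns null because the group is not a direct child.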
- When you find your data item, use the code below to read the data out as an Array.

    // Read the whole data of the item.
    IArray array = item.getData();
    // Or read only a section, given its origin and shape.
    int[] origin = new int[]{0, 0, 0};
    int[] shape = new int[]{3, 5, 5};
    IArray section = item.getData(origin, shape);
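What an origin and a shape select can be shown with a self-contained sketch that copies a rectangular section out of a flat array. It assumes row-major storage (last dimension varies fastest), which is an assumption of this example and not a statement about how GDM stores data; `SectionReader` is a hypothetical helper.

```java
// Hypothetical helper, not part of GDM: copies a rectangular section out of a
// row-major flat array, given the full shape, the section origin and the section shape.
final class SectionReader {

    static double[] section(double[] flat, int[] fullShape, int[] origin, int[] shape) {
        int rank = fullShape.length;
        int count = 1;
        for (int d = 0; d < rank; d++) {
            if (origin[d] < 0 || shape[d] < 1 || origin[d] + shape[d] > fullShape[d]) {
                throw new IllegalArgumentException("section exceeds dimension " + d);
            }
            count *= shape[d];
        }
        double[] result = new double[count];
        int[] position = new int[rank]; // coordinates relative to the origin
        for (int n = 0; n < count; n++) {
            // Row-major offset of (origin + position) in the full array.
            int offset = 0;
            for (int d = 0; d < rank; d++) {
                offset = offset * fullShape[d] + origin[d] + position[d];
            }
            result[n] = flat[offset];
            // Advance the position like an odometer, last dimension fastest.
            for (int d = rank - 1; d >= 0; d--) {
                if (++position[d] < shape[d]) {
                    break;
                }
                position[d] = 0;
            }
        }
        return result;
    }
}
```

For a 3x4 array holding the values 0 to 11, origin {1, 1} and shape {2, 2} select the values 5, 6, 9 and 10.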
- To access the values of an Array object, use a SliceIterator, ArrayIterator or Index.
- SliceIterator helps to iterate through an array as slices. For example, to iterate through a 3x2x2 array as 3 array slices, each with a 2x2 shape, use the following code:

    ISliceIterator sliceIterator = array.getSliceIterator(2);
    while (sliceIterator.hasNext()) {
        IArray slice = sliceIterator.getArrayNext();
        …
    }
- ArrayIterator helps to iterate through each individual value of an array. For example, for an array of type double, use the following code to access each value:

    IArrayIterator iterator = array.getIterator();
    while (iterator.hasNext()) {
        double value = iterator.getNextDouble();
    }
- Index helps to locate a value at an arbitrary position in the array. For example, the following code reads the value at the [1, 1, 2] coordinate position, assuming the shape of the array is large enough to contain it.

    IIndex index = array.getIndex();
    index.set(1, 1, 2);
    double value = array.getDouble(index);
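The ideas behind Index and SliceIterator can be sketched without GDM by treating an array as flat, row-major storage: an index becomes an offset, and slices are consecutive chunks covering the last dimensions. `RowMajor` is a hypothetical helper, and both the row-major layout and the reading of the iterator argument as a slice rank are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helpers, not part of GDM, assuming a row-major flat storage layout.
final class RowMajor {

    // Offset of a coordinate in a flat array of the given shape.
    static int offset(int[] shape, int... index) {
        int offset = 0;
        for (int d = 0; d < shape.length; d++) {
            if (index[d] < 0 || index[d] >= shape[d]) {
                throw new IndexOutOfBoundsException("dimension " + d);
            }
            offset = offset * shape[d] + index[d];
        }
        return offset;
    }

    // Splits a flat array into consecutive slices covering the last sliceRank dimensions.
    static List<double[]> slices(double[] flat, int[] shape, int sliceRank) {
        if (sliceRank < 0 || sliceRank > shape.length) {
            throw new IllegalArgumentException("invalid slice rank " + sliceRank);
        }
        int sliceSize = 1;
        for (int d = shape.length - sliceRank; d < shape.length; d++) {
            sliceSize *= shape[d];
        }
        if (sliceSize < 1) {
            throw new IllegalArgumentException("empty slices are not supported");
        }
        List<double[]> result = new ArrayList<>();
        for (int start = 0; start + sliceSize <= flat.length; start += sliceSize) {
            double[] slice = new double[sliceSize];
            System.arraycopy(flat, start, slice, 0, sliceSize);
            result.add(slice);
        }
        return result;
    }
}
```

Under these assumptions, a 3x2x2 array split with a slice rank of 2 gives 3 slices of 4 values each, and the coordinate [1, 1, 2] in a 3x5x5 array corresponds to offset 32.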
- To carry out maths logic on GDM arrays, please read the Javadoc of:
  - org.gumtree.data.Array,
  - org.gumtree.data.math.GMath, and
  - org.gumtree.data.math.EMath (for error propagation).
- To carry out other logic, please read the Javadoc of:
  - org.gumtree.data.Array,
  - org.gumtree.data.utils.Utilities, and
  - org.gumtree.data.fitting.Fitter.
- To create an empty Dataset or Group in memory, use the factory methods. Here is an example that uses the Factory of the default implementation:

    IDataset newDataset = Factory.createEmptyDatasetInstance();
    IGroup newGroup = Factory.createGroup(newDataset.getRootGroup(), "name_of_new_group", true);
    // Alternative form that takes only a name.
    IGroup anotherGroup = Factory.createGroup("name_of_new_group");
- The I/O package provides an interface and a default implementation for writing GDM objects into HDF files. The example below shows how to write a group to the root of an HDF file:

    File file = new File("C:/test.hdf");
    IWriter writer = new NcHdfWriter(file);
    writer.writeToRoot(group);
Document generated by Confluence on Apr 01, 2015 00:11
Home | Developer Guide | Copyright © 2013 ANSTO