DataSource Manipulation

gourlaysama edited this page Aug 13, 2012 · 4 revisions

Gdms offers a vast API to create, edit, and transform data.

The most important interface in Gdms is the DataSource, which is used to access the content of any data set.

Accessing the content of a DataSource

The main goal of Gdms is to provide a common way to access the content of many data sources, whatever their format. This section shows some ways to read the content of a data source using Gdms.

Discover the data

If we get a DataSource object from the DataSourceFactory:

DataSourceFactory dsf = new DataSourceFactory();

DataSource ds = dsf.getDataSource(new File("/home/me/somefile.shp"));
ds.open();

We can then get its metadata:

Metadata md = ds.getMetadata();

An instance of Metadata, or rather of one of its implementations, manages a set of pairs, one per column. Each pair contains:

  • The type of the data that will be contained in the associated column.
  • The name of the column in the table.

The following methods retrieve the number of fields, the type of a field, and its name:

int number = md.getFieldCount();
Type t = md.getFieldType(0);
String name = md.getFieldName(0);

Note that we can also get the type or name of a column directly from the data source:

Type t = ds.getFieldType(0);
String s = ds.getFieldName(0);
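
As a minimal sketch of how these accessors combine, the following self-contained example walks a schema to list its fields and to find a column index by name. The Schema interface and the sample(), indexOf() and describe() methods are hypothetical stand-ins written for this illustration. They mirror the three accessors shown above but are not part of Gdms, and type codes are plain ints here rather than Type objects.

```java
import java.util.List;

final class SchemaSketch {
    // Hypothetical stand-in for the three metadata accessors (not the Gdms API).
    interface Schema {
        int getFieldCount();
        String getFieldName(int index);
        int getFieldTypeCode(int index);
    }

    // Builds an in-memory schema from two parallel lists.
    static Schema of(List<String> names, List<Integer> typeCodes) {
        return new Schema() {
            public int getFieldCount() { return names.size(); }
            public String getFieldName(int i) { return names.get(i); }
            public int getFieldTypeCode(int i) { return typeCodes.get(i); }
        };
    }

    // Illustrative two-column schema: a geometry (4096) and a string (512).
    static Schema sample() {
        return of(List.of("the_geom", "name"), List.of(4096, 512));
    }

    // Returns the index of the first field with the given name, or -1 if absent.
    static int indexOf(Schema schema, String name) {
        for (int i = 0; i < schema.getFieldCount(); i++) {
            if (schema.getFieldName(i).equals(name)) {
                return i;
            }
        }
        return -1;
    }

    // Produces one line per field: "<index>: <name> (type code <code>)".
    static String describe(Schema schema) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < schema.getFieldCount(); i++) {
            sb.append(i).append(": ").append(schema.getFieldName(i))
              .append(" (type code ").append(schema.getFieldTypeCode(i)).append(")\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Schema schema = sample();
        System.out.print(describe(schema));
        System.out.println(indexOf(schema, "name")); // 1
    }
}
```

With a real DataSource, the same loop would run from 0 to getFieldCount() and call getFieldName and getFieldType on each index.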

Handling spatial data

The DataSource interface provides methods to directly access the (first) spatial column of the table. For example, to access the geometry of the first row, we can do:

Geometry geom = ds.getGeometry(0);

Traveling through a source

We've seen how to open a data source and how to inspect the type of its data. Finally, we need to travel through the source to perform the operations we need. To avoid reading past the end of the table, we first retrieve its row count, then access the values directly by index:

long size = ds.getRowCount();
for (long i = 0; i < size; i++) {
  Value v = ds.getFieldValue(i, 1);
  // ... use the value
}

The getFieldValue method takes two arguments:

  • The row index (line number).
  • The field index (column number).

A word about values

The data is encapsulated in instances of the Value interface. The getFieldValue method we've just seen retrieves one of these raw values. It is up to us to handle it correctly, for instance by asking it which type of Value it actually is. The interface Type defines a set of constants that are used to describe the nature of a value. The constants are:

        int BINARY = 1;
        int BOOLEAN = 2;
        int BYTE = 4;
        int DATE = 8;
        int DOUBLE = 16;
        int FLOAT = 32;
        int INT = 64;
        int LONG = 128;
        int SHORT = 256;
        int STRING = 512;
        int TIMESTAMP = 1024;
        int TIME = 2048;
        int GEOMETRY = 4096;
        int RASTER = 8192;
        int NULL = -1;
        int COLLECTION = 16384;
        int POINT = 32768 | Type.GEOMETRY;
        int LINESTRING = 65536 | Type.GEOMETRY;
        int POLYGON = 131072 | Type.GEOMETRY;
        int MULTIPOLYGON = 262144 | Type.GEOMETRYCOLLECTION;
        int MULTILINESTRING = 524288 | Type.GEOMETRYCOLLECTION;
        int MULTIPOINT = 1048576 | Type.GEOMETRYCOLLECTION;
        int GEOMETRYCOLLECTION = 2097152 | Type.GEOMETRY;

A value implements the method getType(), which returns an int corresponding to the associated Type (for example 64, i.e. Type.INT, for an IntValue). In addition, all values contained in a column always share the same type or have a common parent. Because the code of a subtype includes the bits of its parent, testing for inheritance is a simple bit mask:

if ((v.getType() & Type.GEOMETRYCOLLECTION) == Type.GEOMETRYCOLLECTION) {
  // v is a GEOMETRYCOLLECTION or one of its subtypes, like MULTIPOLYGON
}
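
The mask test can be tried in isolation. The sketch below copies a few of the constants listed above into a stand-alone class and wraps the test in a hypothetical isA() helper. Neither the class nor the helper is part of Gdms. Note that in Java, == binds more tightly than &, so the parentheses around the mask are required.

```java
// Stand-alone sketch: constants copied from the list above.
// This class is NOT the Gdms Type interface.
final class TypeMaskSketch {
    static final int INT = 64;
    static final int GEOMETRY = 4096;
    static final int POINT = 32768 | GEOMETRY;
    static final int GEOMETRYCOLLECTION = 2097152 | GEOMETRY;
    static final int MULTIPOLYGON = 262144 | GEOMETRYCOLLECTION;

    // Hypothetical helper: true if typeCode carries every bit of parentCode.
    static boolean isA(int typeCode, int parentCode) {
        return (typeCode & parentCode) == parentCode;
    }

    public static void main(String[] args) {
        System.out.println(isA(MULTIPOLYGON, GEOMETRYCOLLECTION)); // true
        System.out.println(isA(MULTIPOLYGON, GEOMETRY));           // true
        System.out.println(isA(POINT, GEOMETRYCOLLECTION));        // false
        System.out.println(isA(INT, GEOMETRY));                    // false
    }
}
```

One caveat follows from the listed values: NULL is -1, which has every bit set, so a mask test on a NULL type code succeeds for any parent. Checking for NULL first avoids that surprise.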

Editing

Adding or removing rows in a DataSource is just as easy as reading them.

When is a DataSource editable?

There are two points to check before a DataSource can be edited:

  • Is the underlying format of the source read/write and not just read?
  • Is the current DataSource object allowed to actually edit the content? By default, a DataSource returned by the DataSourceFactory does not allow editing:
final DataSourceFactory dsf = new DataSourceFactory();
final DataSource ds = dsf.getDataSource(new File("/home/me/myFile.shp"));
ds.open();

// the following call will throw an UnsupportedOperationException
ds.deleteRow(0);

...

An additional flag is needed to be able to edit a DataSource:

final DataSourceFactory dsf = new DataSourceFactory();
final DataSource ds = dsf.getDataSource(new File("/home/toto/myFile.shp"), DataSourceFactory.EDITABLE);
ds.open();

long count = ds.getRowCount();

// this now deletes the first row of the DataSource
ds.deleteRow(0);

// this flushes the change we just made to the actual file
ds.commit();

ds.close();

What can be changed in a DataSource?

A DataSource opened for editing allows the manipulation of both:

  • the data
  • the metadata

Manipulating data

Given a DataSource opened for editing, it is possible to:

  • add a row
// this adds an empty row at the end of the table
ds.insertEmptyRow();

// this adds an empty row at a specific index; the rows below are moved 1 index down
ds.insertEmptyRowAt(18);

// this adds a filled row at the end of the table
ds.insertFilledRow(new Value[] {ValueFactory.createValue("a string"), 
    ValueFactory.createValue(14.15E-2)});

// this adds a filled row at a specific index; the rows below are moved 1 index down
ds.insertFilledRowAt(18, new Value[] {ValueFactory.createValue("another string"),
    ValueFactory.createValue(0.0001)});
  • delete a row
// this deletes the row at the given index
ds.deleteRow(4);
  • change the content of a row
long rowIndex = 14;
int column = 0;

// this sets the value at row 14, column 0 to "hey!"
ds.setFieldValue(rowIndex, column, ValueFactory.createValue("hey!"));
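
The index shift caused by inserting at a specific position can be modeled with a plain list. The sketch below is only an analogy built on java.util.ArrayList, and insertAt() is a hypothetical helper, not a Gdms method. It shows why a row index saved before an insertion may point at a different row afterwards.

```java
import java.util.ArrayList;
import java.util.List;

final class RowInsertSketch {
    // Returns a copy of rows with the new row inserted at index;
    // every row at or after that index moves one position down.
    static List<String> insertAt(List<String> rows, int index, String row) {
        List<String> copy = new ArrayList<>(rows);
        copy.add(index, row);
        return copy;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("a", "b", "c");
        List<String> after = insertAt(rows, 1, "new");
        System.out.println(after);              // [a, new, b, c]
        System.out.println(after.indexOf("b")); // 2: "b" moved from index 1 to index 2
    }
}
```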

Manipulating metadata

Given a DataSource opened for editing, it is possible to:

  • add a column
// this adds a column named "price" of type SHORT as the last column of the table
ds.addField("price", TypeFactory.createType(Type.SHORT));
// by default it only contains NULL values
  • remove a column
// this removes the third column (columns, like rows, start at index 0)
ds.removeField(2);
  • rename a column
// this changes the name of the 5th column to "the_column"
ds.setFieldName(4, "the_column");

To commit or not to commit?

After making multiple changes to a DataSource, it is possible to cancel some or all of them:

  • to go back to the original state of the source
// this completely resets the source. All changes will be lost!
ds.syncWithSource();
  • to undo the last editing action
ds.undo();

It can then be redone with:

ds.redo();

The methods canUndo() and canRedo() report whether undo() and redo() will work. If one of them returns false, calling the corresponding action throws an IllegalStateException.

Full example:

ds.syncWithSource();
// there is no change

// let's change a thing twice
ds.setFieldValue(1,0, ValueFactory.createNullValue());
ds.setFieldValue(1,0, ValueFactory.createValue("test"));

// undo an action
ds.undo();

// now the value is back to its previous NULL state
boolean isNull = ds.getFieldValue(1, 0).isNull(); // isNull == true

// this will put it back to the state it had before the first change
ds.undo();

// this will redo the NULL action
ds.redo();

// this will redo the setting to "test"
ds.redo();

A DataSource opened for editing keeps a stack of commands that can be undone and redone. The stack can hold 41 commands (this number currently cannot be changed). On the 42nd action, the oldest command is removed from the undo/redo stack. From then on, the only way to go back to the original state of the source is to call syncWithSource().
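
To make the effect of the fixed-size stack concrete, here is a minimal sketch of a bounded undo/redo history. It is not the Gdms implementation: the class BoundedHistorySketch and its record() method are hypothetical, actions are represented as plain strings, and clearing the redo stack on a new action is an assumption of this sketch rather than documented Gdms behavior.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class BoundedHistorySketch {
    private final int capacity;
    private final Deque<String> undoStack = new ArrayDeque<>();
    private final Deque<String> redoStack = new ArrayDeque<>();

    BoundedHistorySketch(int capacity) {
        this.capacity = capacity;
    }

    // Records a new action; when the history is full, the oldest one is dropped.
    void record(String action) {
        if (undoStack.size() == capacity) {
            undoStack.removeLast(); // the oldest entry sits at the tail
        }
        undoStack.push(action);
        redoStack.clear(); // assumption of this sketch: a new action discards pending redos
    }

    boolean canUndo() { return !undoStack.isEmpty(); }
    boolean canRedo() { return !redoStack.isEmpty(); }

    // Undoes the most recent action and makes it available for redo.
    String undo() {
        if (!canUndo()) throw new IllegalStateException("nothing to undo");
        String action = undoStack.pop();
        redoStack.push(action);
        return action;
    }

    // Redoes the most recently undone action.
    String redo() {
        if (!canRedo()) throw new IllegalStateException("nothing to redo");
        String action = redoStack.pop();
        undoStack.push(action);
        return action;
    }

    public static void main(String[] args) {
        BoundedHistorySketch history = new BoundedHistorySketch(2);
        history.record("a");
        history.record("b");
        history.record("c"); // "a" falls off the history here
        System.out.println(history.undo());    // c
        System.out.println(history.undo());    // b
        System.out.println(history.canUndo()); // false: "a" can no longer be undone
    }
}
```

With a capacity of 41, the same logic means that after 42 recorded actions, the first one can no longer be reached by undoing.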

Getting a DataSource

The main entry point of Gdms is the DataSourceFactory class. An instance of this class allows the creation of DataSource instances for common sources, like files or databases.

From a File

Getting a DataSource from a file is very easy. For example, to open a file at /home/me/myFile.shp:

// we need a DataSourceFactory
DataSourceFactory dsf = new DataSourceFactory();

File f = new File("/home/me/myFile.shp");
DataSource ds = dsf.getDataSource(f);
ds.open();
// use the ds
...

ds.close();

This way of manipulating a file implies that there is a file driver for the specific file format that allows in-place reads (and optionally writes). This is different from importing the content of a file: it works directly on the specified file, in the original format.

From a Database Management System

Gdms can be used to work with a Database Management System (DBMS) as well as with file data. Connecting to a DBMS starts with the creation of an object describing the database, how to connect to it, and which table is needed: a DBSource.

final DataSourceFactory dsf = new DataSourceFactory();

// let's create a DBSource describing the table we want to import.
final DBSource source = new DBSource("myserver.mycompany.com", 5432, "mydb",
    "myuser", "mypassword", "mytable", "jdbc:postgresql");

// create a DataSource
DataSource ds = dsf.getDataSource(source);

// that's all! We can transparently use the data from the remote table
...