MrGeo Build Instructions

Dave Johnson edited this page Sep 25, 2017 · 19 revisions

Building MrGeo

Before attempting to build MrGeo, please be sure you have set up its dependencies as described in Dependencies.

MrGeo uses Maven as its build system. The Maven POMs are fairly complex and need many different arguments, so a build script, scripts/mvn-build, wraps them to make development easier.

Here is the result of scripts/mvn-build --help:

Usage: scripts/mvn-build [module] [build type] [phase] <args>
-----------------------------
module:
  all          - all modules
  core         - mrgeo-core
  vector       - mrgeo-core/mrgeo-vector
  mapalgebra   - mrgeo-core/mrgeo-vector/mrgeo-mapalgebra
  cmd          - mrgeo-core/mrgeo-vector/mrgeo-cmd
  dataprovider - mrgeo-core/mrgeo-dataprovider
  services     - mrgeo-core/mrgeo-vector/mrgeo-services
  python       - mrgeo-core/mrgeo-vector/mrgeo-dataprovider/mrgeo-mapalgebra/mrgeo-python
  ogc          - mrgeo-core/mrgeo-services-(core, wms, wcs, tms)
build type:
  apache    - apache hadoop
  apache273 - Apache hadoop, version 2.7.3
  apache272 - Apache hadoop, version 2.7.2
  apache271 - Apache hadoop, version 2.7.1
  apache260 - Apache hadoop, version 2.6.0
  apache252 - Apache hadoop, version 2.5.2
  apache251 - Apache hadoop, version 2.5.1
  apache250 - Apache hadoop, version 2.5.0
  apache241 - Apache hadoop, version 2.4.1
  apache240 - Apache hadoop, version 2.4.0
  cdh5121   - Cloudera hadoop 5.12.1
  cdh5120   - Cloudera hadoop 5.12.0
  cdh5111   - Cloudera hadoop 5.11.1
  cdh5110   - Cloudera hadoop 5.11.0
  cdh5101   - Cloudera hadoop 5.10.1
  cdh5100   - Cloudera hadoop 5.10.0
  cdh591    - Cloudera hadoop 5.9.1
  cdh590    - Cloudera hadoop 5.9.0
  cdh582    - Cloudera hadoop 5.8.2
  cdh580    - Cloudera hadoop 5.8.0
  cdh571    - Cloudera hadoop 5.7.1
  cdh570    - Cloudera hadoop 5.7.0
  cdh560    - Cloudera hadoop 5.6.0
  cdh552    - Cloudera hadoop 5.5.2
  cdh551    - Cloudera hadoop 5.5.1
  cdh550    - Cloudera hadoop 5.5.0
  emr580 - Amazon EMR 5.8.0 with hadoop 2.7.3
  emr530 - Amazon EMR 5.3.0 with hadoop 2.7.3
  emr500 - Amazon EMR 5.0.0 with hadoop 2.7.2
  emr472 - Amazon EMR 4.7.2 with hadoop 2.7.2
  emr471 - Amazon EMR 4.7.1 with hadoop 2.7.2
  emr470 - Amazon EMR 4.7.0 with hadoop 2.7.2
  emr460 - Amazon EMR 4.6.0 with hadoop 2.7.2
  emr450 - Amazon EMR 4.5.0 with hadoop 2.7.2
  emr440 - Amazon EMR 4.4.0 with hadoop 2.7.1
  mapr      - mapr hadoop
phase:
  build     - build a deployable version
  test      - build, then run unit tests
  verify    - run integration tests
  deploy    - build and deploy
  clean     - clean the build
  version   - change the version within all poms (use 'revert' to revert to previously saved version, if any)
  eclipse   - build eclipse files for the project
args:
  -c  --conf <path>              - location of hadoop conf files (/usr/local/hadoop/conf)
  -f  --failfast                 - fail fast tests (immediately stop on test failure)
  -g  --geowave                  - build the GeoWave data provider
  -gj --generate-javadocs        - generate javadoc jars
  -gs --generate-sources         - generate source jars
  -j  --javadocs                 - include javadocs of dependencies (if available)
  -jv  --javaversion <version>   - java version to use (1.7, 1.8)
  -l  --license                  - generate licenses (normall off)
  -p  --profile                  - turn on leak detection profiling
  -s  --source                   - include source jars of dependencies (if available)
  -sh --shade                    - generate the shaded (jar with dependencies) jars
  -y  --yarn                     - use hadoop YARN (for hadoop 2+, instead of mr1)
  -q  --quiet                    - quiet (no prints from this script)
other:
  buildtype  - return the build type (apache273, cdh5121, etc.), no further processing is done
 
  all other args will be passed to maven directly

NOTE: The list of supported Hadoop versions grows over time. Run the script with the --help option to see the complete, up-to-date list.

NOTE: The mvn-build script prints the Maven command it is executing so you can reference what is actually happening in the build system.
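
As a minimal sketch of the usage line above, the following composes a direct scripts/mvn-build call from a module, a build type, and a phase. The particular combination (core, cdh5121, test, --yarn, --failfast) is only an illustration assembled from the help output and has not been checked against a real cluster. The MODULE, BUILD_TYPE, PHASE, and CMD variable names are hypothetical, and the guard on scripts/mvn-build is an assumption so that the snippet does nothing outside a MrGeo checkout.

```shell
# Positional arguments follow the order in the usage line:
#   scripts/mvn-build [module] [build type] [phase] <args>
MODULE=core          # one of the modules listed in the help output
BUILD_TYPE=cdh5121   # one of the build types listed in the help output
PHASE=test           # "build, then run unit tests" per the help output

CMD="scripts/mvn-build $MODULE $BUILD_TYPE $PHASE --yarn --failfast"
echo "$CMD"

# Execute only when run from the top level of a MrGeo checkout.
if [ -f scripts/mvn-build ]; then
  # $CMD is intentionally unquoted so it splits into separate arguments.
  $CMD
fi
```

Any argument the script does not recognize is passed to Maven directly, so -D properties can be appended to CMD in the same way.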

For even more convenience, the top-level MrGeo directory contains scripts for building MrGeo, running unit tests, running integration tests, and generating Eclipse projects. They are "build", "test", "verify", and "eclipse", respectively.

Run the build script as follows from the top-level directory of MrGeo. Replace the cdh5121 argument with the build type that matches the version of Hadoop you run against; the available build types are shown in the usage output above. The two -Dskip.mrgeo.dataprovider arguments bypass the testing of the Accumulo data provider. The -jv argument selects the Java version, 1.7 in this example.

./build cdh5121 -jv 1.7 -Dskip.mrgeo.dataprovider.integration.tests=true -Dskip.mrgeo.dataprovider.tests=true

If you are using YARN, then include an additional argument to the above command, "--yarn".

The "test", "verify" and "eclipse" scripts all take the same set of arguments as the "build" script.
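
Because the scripts share one argument set, a single variable can drive all of them. The sketch below assumes it is run from the top-level MrGeo directory and reuses the arguments from the build example above with --yarn added. The ARGS variable name is hypothetical, and the check for scripts/mvn-build is only there so the loop prints the commands without running them outside a MrGeo checkout.

```shell
# ARGS is a hypothetical variable name; the values come from the build example.
ARGS="cdh5121 --yarn -jv 1.7 -Dskip.mrgeo.dataprovider.integration.tests=true -Dskip.mrgeo.dataprovider.tests=true"

for script in build test verify; do
  echo "./$script $ARGS"
  # Execute only inside a MrGeo checkout; stop at the first failing step.
  if [ -f scripts/mvn-build ]; then
    # $ARGS is intentionally unquoted so it splits into separate arguments.
    "./$script" $ARGS || break
  fi
done
```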

The mvn-build script also checks for a MRGEO_BUILD_OPTIONS environment variable and, if it exists, takes its arguments from it. This lets you run the scripts without retyping all of the arguments each time.

export MRGEO_BUILD_OPTIONS="cdh5121 -jv 1.7 -Dskip.mrgeo.dataprovider.integration.tests=true -Dskip.mrgeo.dataprovider.tests=true"
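
With the variable exported, the top-level scripts can then be run without arguments. This is a short sketch, assuming a bash-like shell with the top-level MrGeo directory as the working directory. How mvn-build combines the variable with arguments that are also given on the command line is not described here, so the sketch passes none.

```shell
export MRGEO_BUILD_OPTIONS="cdh5121 -jv 1.7 -Dskip.mrgeo.dataprovider.integration.tests=true -Dskip.mrgeo.dataprovider.tests=true"
echo "MRGEO_BUILD_OPTIONS=$MRGEO_BUILD_OPTIONS"

# Execute only inside a MrGeo checkout: build first, then run the unit tests.
if [ -f scripts/mvn-build ]; then
  ./build && ./test
fi
```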

Building MrGeo Web Services

MrGeo contains code that implements a subset of OGC web services. It can be run on the command line for testing and development. It can also be hosted within a web server.

To build the stand-alone web server, from the MrGeo top-level directory, run the following (changing settings as required for your environment):

./build ogc apache272 -Pstandalone-webserver -jv 1.8 -Dgdal.version=1.11.4

To also build a WAR file for deploying MrGeo to a web server, add "-Pinclude-war" to the build command line. The WAR file will be in the distribution/distribution-war/target subdirectory.

If you are using AWS and your MrGeo images are stored in S3, you can add -Penable-s3a on the build command line to enable access to S3 from your web service.

When the build is complete, the distribution/distribution-tgz/target subdirectory contains a tarball with the install footprint of the MrGeo web services.
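
Putting the web-service options together, the sketch below adds the optional profiles to the stand-alone build command and then lists the two output directories named in this section. The INCLUDE_WAR and ENABLE_S3A toggles and the PROFILES and CMD variables are hypothetical names used only for illustration; the profiles and paths are the ones described above.

```shell
INCLUDE_WAR=yes   # hypothetical toggle: also build the WAR file
ENABLE_S3A=no     # hypothetical toggle: enable access to images stored in S3

PROFILES="-Pstandalone-webserver"
if [ "$INCLUDE_WAR" = yes ]; then PROFILES="$PROFILES -Pinclude-war"; fi
if [ "$ENABLE_S3A" = yes ]; then PROFILES="$PROFILES -Penable-s3a"; fi

CMD="./build ogc apache272 $PROFILES -jv 1.8 -Dgdal.version=1.11.4"
echo "$CMD"

# Execute only inside a MrGeo checkout, then list whatever the build produced.
if [ -f scripts/mvn-build ] && $CMD; then
  for dir in distribution/distribution-tgz/target distribution/distribution-war/target; do
    if [ -d "$dir" ]; then ls -1 "$dir"; fi
  done
fi
```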

build

build builds the specified MrGeo modules (defaults to all). This command only builds; no tests are run.

examples

build apache220

Builds for Apache Hadoop 2.2.0, using defaults for all other options. Note: this uses MapReduce v1, not MapReduce v2 (YARN).

build cdh590 --yarn

Builds for Cloudera Hadoop 5.9.0, using YARN.

build apache272 --yarn -jv 1.8 --shade

Builds for Apache Hadoop 2.7.2, using YARN, forcing Java 8, and also creating shaded (jar with dependencies) jars.

clean

clean removes all existing build artifacts

examples

clean

Removes all build artifacts

deploy

deploy runs the Maven deploy phase, which builds and deploys MrGeo to a Maven repository. This can be used to release MrGeo into internal Artifactory repositories.

examples

eclipse

eclipse generates the necessary .project and .classpath files for Eclipse.

examples

integration

integration builds and runs the MrGeo integration tests

rebuild

rebuild combines the clean and build commands
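
In terms of the mvn-build phases listed earlier, a rebuild amounts to a clean followed by a build. The sketch below spells out that sequence by calling scripts/mvn-build directly. It illustrates the order of operations and is not the contents of the rebuild command itself; the module and build type are example values, and the variable names are hypothetical.

```shell
MODULE=all          # example value: all modules
BUILD_TYPE=cdh590   # example value: Cloudera hadoop 5.9.0

for phase in clean build; do
  CMD="scripts/mvn-build $MODULE $BUILD_TYPE $phase"
  echo "$CMD"
  # Execute only inside a MrGeo checkout; skip the build if the clean fails.
  if [ -f scripts/mvn-build ]; then
    $CMD || break
  fi
done
```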

release

release does what?

shade

test

test builds and runs unit tests only

verify

verify builds and runs unit and integration tests

version
