Skip to content

JDBC plugin feeder mode as an alternative to the deprecated Elasticsearch River API

Jörg Prante edited this page Mar 31, 2015 · 2 revisions

How to set up a JDBC feeder script for Elasticsearch

Now that rivers are deprecated, I want to show how to set up an alternative method for indexing data from a JDBC source into Elasticsearch.

The setup is very easy.

In this example, I assume you are on Linux and want to connect to a MySQL DB.

  1. Install Elasticsearch as usual

  2. Install JDBC plugin with the ./bin/plugin tool and add a JDBC driver jar as usual. This means, you must install JDBC plugin on all nodes of the cluster and you must restart all nodes to activate the plugin cluster-wide.

  3. Now we decide not to use the river API, but instead the feeder mechanism, which works by starting a separate JVM.

  4. Set a bash variable ES_HOME to the home directory of Elasticsearch (if not already exists)

  5. Create a JDBC feed script directory

    mkdir -p $ES_HOME/bin/feeder/
    
  6. Create JDBC feed script $ES_HOME/bin/feeder/mysql.sh

    #!/bin/bash
    
    # ES_HOME required to detect elasticsearch jars
    
    export ES_HOME=~es/elasticsearch-1.4.4
    
    echo '
    {
    "elasticsearch" : {
    	 "cluster" : "elasticsearch",
    	 "host" : "localhost",
    	 "port" : 9300
    },
    "type" : "jdbc",
    "jdbc" : {
    	"url" : "jdbc:mysql://localhost:3306/test",
    	"user" : "",
    	"password" : "",
    	"sql" :  "select *, order_id as _id from orders"
      }
    }
    ' | java \
    -cp "${ES_HOME}/lib/*:${ES_HOME}/plugins/jdbc/*" \
    org.xbib.elasticsearch.plugin.jdbc.feeder.Runner \  
    org.xbib.elasticsearch.plugin.jdbc.feeder.JDBCFeeder
    
  7. Set execution attribute with

    chmod +x $ES_HOME/bin/feeder/mysql.sh
    
  8. Create a log4j.properties file and add it to the classpath, e.g. $ES_HOME/lib/log4j.properties

    log4j.rootLogger=DEBUG, out
    log4j.appender.out=org.apache.log4j.ConsoleAppender
    log4j.appender.out.layout=org.apache.log4j.PatternLayout
    log4j.appender.out.layout.ConversionPattern=[%d{ABSOLUTE}][%-5p][%-25c][%t] %m%n
    
  9. Start feeder with

    $ES_HOME/bin/feeder/mysql.sh
    

and now you will see debug messages how the feeder starts up and tries to connect to the specified JDBC source and Elasticsearch node, performing the heavy work.

Voilà!

Clone this wiki locally