This repository has been archived by the owner on Oct 5, 2021. It is now read-only.

Configuring SparkGraphComputer for OLAP #269

Open
sandeepdoctily opened this issue Apr 14, 2018 · 2 comments

sandeepdoctily commented Apr 14, 2018

Hi all,

I am trying to configure SparkGraphComputer with DynamoDB Local. Please find my configuration below; kindly help me out.

# TinkerPop Hadoop Graph for OLAP
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph

# Set the default OLAP computer for graph.traversal().withComputer()
gremlin.hadoop.defaultGraphComputer=org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer
gremlin.hadoop.graphInputFormat=org.apache.hadoop.dynamodb.read.DynamoDBInputFormat
gremlin.hadoop.graphOutputFormat=org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat

####################################
# SparkGraphComputer Configuration
####################################
spark.master=local[*]
spark.executor.memory=200m
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.akka.timeout=500000
#spark.kryo.registrationRequired=false
spark.storage.memoryFraction=0.2
spark.eventLog.enabled=true
spark.eventLog.dir=/tmp/spark-event-logs
spark.ui.killEnabled=true
spark.dynamicAllocation.enabled=false
spark.network.timeout=60000
spark.rpc.askTimeout=80000
spark.sql.broadcastTimeout=90000
#spark.serializer=org.apache.spark.serializer.KryoSerializer

#janusgraphmr.ioformat.conf.storage.backend=com.amazon.janusgraph.diskstorage.dynamodb.DynamoDBStoreManager
#janusgraphmr.ioformat.conf.storage.dynamodb.client.credentials.class-name=com.amazonaws.auth.BasicAWSCredentials
#janusgraphmr.ioformat.conf.storage.dynamodb.client.credentials.constructor-args=access,secret
#janusgraphmr.ioformat.conf.storage.dynamodb.client.signing-region=us-east-1
#janusgraphmr.ioformat.conf.storage.dynamodb.client.endpoint=http://localhost:8000
#gremlin.graph=org.janusgraph.core.JanusGraphFactory

#metrics.enabled=true
#metrics.prefix=j
#metrics.csv.interval=1000
#metrics.csv.directory=metrics

storage.write-time=1 ms
storage.read-time=1 ms
storage.backend=com.amazon.janusgraph.diskstorage.dynamodb.DynamoDBStoreManager
storage.dynamodb.client.credentials.class-name=com.amazonaws.auth.BasicAWSCredentials
storage.dynamodb.client.credentials.constructor-args=access,secret
storage.dynamodb.client.signing-region=us-east-1
storage.dynamodb.client.endpoint=http://localhost:8000

When I run a query I get the exception below:
gremlin> g.V().count()

java.lang.RuntimeException: class org.apache.hadoop.dynamodb.read.DynamoDBInputFormat not org.apache.hadoop.mapreduce.InputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2221)
at org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer.lambda$submitWithExecutor$0(SparkGraphComputer.java:177)
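The stack trace points at the likely root cause: Hadoop's `Configuration.getClass` verifies that the configured class is assignable to the expected interface, and the EMR connector's `DynamoDBInputFormat` implements the older `org.apache.hadoop.mapred.InputFormat` API rather than the newer `org.apache.hadoop.mapreduce.InputFormat` that SparkGraphComputer requests. A minimal, self-contained illustration of that check — the interfaces and classes below are stand-ins, not the real Hadoop types:

```java
// Stand-ins for the two incompatible Hadoop APIs.
interface NewApiInputFormat {}   // stands in for org.apache.hadoop.mapreduce.InputFormat
interface OldApiInputFormat {}   // stands in for org.apache.hadoop.mapred.InputFormat

// The EMR connector's DynamoDBInputFormat implements only the old API.
class StandInDynamoDBInputFormat implements OldApiInputFormat {}

public class InputFormatCheck {
    // Mirrors the assignability check inside Configuration.getClass:
    // the configured class must implement the expected interface.
    static Class<?> getClassChecked(Class<?> configured, Class<?> expected) {
        if (!expected.isAssignableFrom(configured)) {
            throw new RuntimeException(
                "class " + configured.getName() + " not " + expected.getName());
        }
        return configured;
    }

    public static void main(String[] args) {
        try {
            getClassChecked(StandInDynamoDBInputFormat.class, NewApiInputFormat.class);
        } catch (RuntimeException e) {
            // Same shape as the error in the issue:
            // class StandInDynamoDBInputFormat not NewApiInputFormat
            System.out.println(e.getMessage());
        }
    }
}
```

The same class name passes when checked against the old API and fails against the new one, which is exactly the mismatch the exception reports.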

@sandeepdoctily sandeepdoctily changed the title Using SparkGraphComputer for OLAP Configuring SparkGraphComputer for OLAP Apr 14, 2018
amcp (Contributor) commented May 9, 2018

DynamoDBInputFormat is not implemented yet, but could be implemented by copying from or depending on the DynamoDB EMR connector. https://github.com/awslabs/emr-dynamodb-connector/blob/master/emr-dynamodb-hadoop/src/main/java/org/apache/hadoop/dynamodb/read/DynamoDBInputFormat.java
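In other words, wiring the EMR connector class straight into `gremlin.hadoop.graphInputFormat` cannot work: TinkerPop's HadoopGraph expects a `org.apache.hadoop.mapreduce.InputFormat` that emits `VertexWritable` records, which the connector does not provide. If such an input format were implemented as suggested above, the configuration would be wired roughly as sketched below — the input-format class name is hypothetical, no such class exists in this repository:

```properties
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.defaultGraphComputer=org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer

# HYPOTHETICAL class name: an input format like this would first have to be
# written, e.g. by building on the emr-dynamodb-connector as suggested above.
gremlin.hadoop.graphInputFormat=com.amazon.janusgraph.hadoop.DynamoDBVertexInputFormat

# Storage settings for the OLAP input format are passed through the
# janusgraphmr.ioformat.conf.* prefix (note: single '=', not '==').
janusgraphmr.ioformat.conf.storage.backend=com.amazon.janusgraph.diskstorage.dynamodb.DynamoDBStoreManager
janusgraphmr.ioformat.conf.storage.dynamodb.client.endpoint=http://localhost:8000
```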

@danielwhatmuff

Is DynamoDBInputFormat now implemented?
