
Encounter Kryo serialization error when calculating maximum cliques #17

Open
HongHuangNeu opened this issue Jul 16, 2019 · 2 comments

@HongHuangNeu

When I launch the maximum clique computation on Spark, I encounter the following error (I bumped the Spark version from 2.0.0 to 2.1.0):

com.esotericsoftware.kryo.KryoException: java.lang.NullPointerException
Serialization trace:
vertexPositions (io.arabesque.pattern.JBlissPattern)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:144)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at com.twitter.chill.Tuple2Serializer.read(TupleSerializers.scala:41)
at com.twitter.chill.Tuple2Serializer.read(TupleSerializers.scala:33)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:244)
at org.apache.spark.serializer.DeserializationStream.readKey(Serializer.scala:157)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:189)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:186)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:154)
at org.apache.spark.Aggregator.combineCombinersByKey(Aggregator.scala:50)
at org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:85)
at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:109)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at com.koloboke.collect.impl.hash.MutableLHashParallelKVIntIntMapGO.put(MutableLHashParallelKVIntIntMapGO.java:474)
at com.koloboke.collect.impl.hash.MutableLHashParallelKVIntIntMapGO.put(MutableLHashParallelKVIntIntMapGO.java:45)
at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:162)
at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:39)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
... 30 more

Does the Spark version of Arabesque work with Spark 2.1.0?

@ghanemabdo
Collaborator

Arabesque is tested only with Spark 2.0. We faced several stability and memory issues when we tried upgrading to Spark 2.1+ (this NPE is one of them). We strongly recommend sticking to Spark 2.0 to get the expected results.
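
If your application build pulls in Spark itself, one way to follow this advice is to pin the Spark dependency to 2.0.x explicitly. A minimal sketch assuming an sbt build (the coordinates are the standard Spark artifacts, not taken from Arabesque's own build; adjust to your build tool and Scala version):

```scala
// build.sbt -- pin Spark to the version Arabesque is tested against
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.0.0" % "provided"
)
```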

@HongHuangNeu
Author

HongHuangNeu commented Jul 25, 2019

I fixed the issue by forcing Spark to use the Java serializer. In Spark 2.1+ the Kryo serializer is used by default in some of the shuffle stages, which breaks the serialization process, since most of the classes in the implementation rely on Java serialization.
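
For reference, a minimal sketch of how the serializer can be forced when constructing the context (the app name here is illustrative, not from Arabesque's launch scripts; the same setting can also be passed to spark-submit via --conf spark.serializer=org.apache.spark.serializer.JavaSerializer):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Override Spark's serializer choice so shuffle data goes through
// Java serialization instead of Kryo.
val conf = new SparkConf()
  .setAppName("arabesque-max-cliques") // illustrative name
  .set("spark.serializer", "org.apache.spark.serializer.JavaSerializer")
val sc = new SparkContext(conf)
```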
