I am trying to use Spark Streaming with Kafka (version 1.1.0), but the Spark job keeps crashing with this error:
    14/11/21 12:39:23 ERROR TaskSetManager: Task 3967.0:0 failed 4 times; aborting job
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 3967.0:0 failed 4 times, most recent failure: Exception failure in TID 43518 on host ********: java.lang.Exception: Could not compute split, block input-0-1416573258200 not found
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
            at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
    Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3967.0:0 failed 4 times, most recent failure: Exception failure in TID 43518 on host ********: java.lang.Exception: Could not compute split, block input-0-1416573258200 not found
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
            at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
The only relevant information I can find in the logs is:
    14/11/21 12:34:18 INFO MemoryStore: Block input-0-1416573258200 stored as bytes to memory (size 85.8 KB, free 2.3 GB)
    14/11/21 12:34:18 INFO BlockManagerMaster: Updated info of block input-0-1416573258200
    14/11/21 12:34:18 INFO BlockGenerator: Pushed block input-0-1416573258200
    org.apache.spark.SparkException: Error sending message to BlockManagerMaster [message = GetLocations(input-0-1416573258200)]
    java.lang.Exception: Could not compute split, block input-0-1416573258200 not found
    14/11/21 12:37:35 INFO BlockManagerInfo: Added input-0-1416573258200 in memory on ********:43117 (size: 85.8 KB, free: 2.3 GB)
    org.apache.spark.SparkException: Error sending message to BlockManagerMaster [message = GetLocations(input-0-1416573258200)]
    java.lang.Exception: Could not compute split, block input-0-1416573258200 not found
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 3967.0:0 failed 4 times, most recent failure: Exception failure in TID 43518 on host ********: java.lang.Exception: Could not compute split, block input-0-1416573258200 not found
    java.lang.Exception: Could not compute split, block input-0-1416573258200 not found
    Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3967.0:0 failed 4 times, most recent failure: Exception failure in TID 43518 on host ********: java.lang.Exception: Could not compute split, block input-0-1416573258200 not found
    java.lang.Exception: Could not compute split, block input-0-1416573258200 not found
Sample code:
    SparkConf conf = new SparkConf();
    JavaSparkContext sc = new JavaSparkContext(conf);
    JavaStreamingContext jssc = new JavaStreamingContext(sc, new Duration(5000));
    jssc.checkpoint(checkpointDir);

    // Map of topic name -> number of receiver threads
    HashMap<String, Integer> topics = new HashMap<String, Integer>();
    topics.put(KAFKA_TOPIC, 1);

    HashMap<String, String> kafkaParams = new HashMap<String, String>();
    kafkaParams.put("group.id", "spark-streaming-test");
    kafkaParams.put("zookeeper.connect", ZOOKEEPER_QUORUM);
    kafkaParams.put("zookeeper.connection.timeout.ms", "1000");
    kafkaParams.put("auto.offset.reset", "smallest");

    JavaPairReceiverInputDStream<String, String> kafkaStream =
        KafkaUtils.createStream(jssc, String.class, String.class,
            StringDecoder.class, StringDecoder.class,
            kafkaParams, topics, StorageLevels.MEMORY_AND_DISK_SER);

    JavaPairDStream<String, String> streamPair = kafkaStream.flatMapToPair(...).reduceByKey(...);
I am not sure what is causing this problem.