Tuesday, August 1, 2017

Test Spark failover

In my project I perform side effects inside RDD actions and transformations. I want to test that my business logic still works even if the Spark engine has to retry the computation for some partitions, so I'm trying to simulate failures during the computation.

import org.apache.spark.{SparkConf, SparkContext, TaskContext}

object Test extends App {
  val conf = new SparkConf()
  conf.setMaster("local[4]")
  conf.setAppName(getClass.getName)
  val sc = new SparkContext(conf)
  try {
    val data = sc.parallelize(1 to 10)
    val res = data.map { n =>
      // Fail the task on its first attempt for one element,
      // expecting Spark to retry that partition.
      if (TaskContext.get().attemptNumber() == 0 && n == 5) {
        sys.error("boom!")
      }
      n * 2
    }.collect()
  }
  finally {
    sc.stop()
  }
}

But it doesn't work: the exception is propagated to the driver. It seems that Spark only fails over from its own internal errors. Is there any way to test this?
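One thing that may be worth checking (my assumption, not part of the original post) is how many task attempts local mode allows: with a plain local[4] master, Spark gives each task a single attempt, whereas the local[N, maxFailures] master string permits retries. A minimal sketch of the same test with retries enabled might look like this (the object name TestWithRetries and the value 4 for maxFailures are arbitrary choices for illustration):

import org.apache.spark.{SparkConf, SparkContext, TaskContext}

object TestWithRetries extends App {
  val conf = new SparkConf()
  // local[N, maxFailures]: 4 worker threads, up to 4 attempts per task
  conf.setMaster("local[4, 4]")
  conf.setAppName(getClass.getName)
  val sc = new SparkContext(conf)
  try {
    val res = sc.parallelize(1 to 10).map { n =>
      // Fail only the first attempt, so a retried task should succeed
      if (TaskContext.get().attemptNumber() == 0 && n == 5) {
        sys.error("boom!")
      }
      n * 2
    }.collect()
    println(res.mkString(", "))
  }
  finally {
    sc.stop()
  }
}

If retries are enabled this way, the failed task's second attempt would take the non-failing branch and the collect should return the doubled values, which would let the test exercise the retry path of the business logic.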
