Wednesday, July 19, 2017

What's the best way to test Spark Java applications?

Currently, I'm creating classes that use some Datasets from the Spark Java API. These Datasets are populated from a Hive table via the spark.sql() method.

So, after performing some sql operations (like joins), I have a final dataset.

Right now, I'm struggling with how to write unit tests for these classes. Here is an example of one method from such a class:

public Dataset<Row> loadDataSetA() {

    // 'sc' is a SparkSession field, initialized in another class
    final Dataset<Row> dataSetA = sc.sql("QUERY")
                                .where(upper(col(COL_A)).isin(TYPES));

    final Dataset<Row> dataSetAFinal = dataSetA.select(col(COL_A));

    return dataSetAFinal;
}
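One common way to make a method like this testable is to inject the SparkSession and to split the Hive query from the Dataset-to-Dataset transformation, so the transformation can be exercised against a small in-memory Dataset. This is only a sketch: the class name DataSetALoader, the method name transform, the column name "COL_A", and the type values "TYPE_1"/"TYPE_2" are all placeholders invented here, not taken from the original code.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.upper;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DataSetALoader {

    private final SparkSession spark;

    public DataSetALoader(SparkSession spark) {
        this.spark = spark; // injected, so a test can pass in a local session
    }

    // Thin wrapper around the Hive query; covered by integration tests, if at all.
    public Dataset<Row> loadDataSetA() {
        return transform(spark.sql("QUERY"));
    }

    // Pure Dataset-to-Dataset logic: no Hive, no SQL string, easy to unit test
    // by passing a hand-built Dataset.
    static Dataset<Row> transform(Dataset<Row> input) {
        return input
                .where(upper(col("COL_A")).isin("TYPE_1", "TYPE_2"))
                .select(col("COL_A"));
    }
}
```

With this split, a unit test never touches the Hive table at all: it builds a tiny Dataset with createDataFrame and asserts on the output of transform.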

I'd like to know if there's a framework, or any examples I could consult, to write good tests for this kind of method and class.
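On tooling: one library often mentioned for this is Holden Karau's spark-testing-base, which provides base classes for Spark tests. Even without it, you can unit test with plain JUnit by building a local SparkSession and hand-made input Datasets. Below is a minimal JUnit 4 sketch; the schema, the column name "COL_A", and the type values are invented for illustration, and the filter/select is inlined where a real test would call your extracted transformation method instead.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.upper;
import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;

public class LoadDataSetATest {

    private static SparkSession spark;

    @BeforeClass
    public static void setUp() {
        // Local master: no cluster and no Hive metastore needed.
        spark = SparkSession.builder()
                .master("local[2]")
                .appName("unit-test")
                .getOrCreate();
    }

    @AfterClass
    public static void tearDown() {
        spark.stop();
    }

    @Test
    public void keepsOnlyRowsWhoseTypeIsWhitelisted() {
        StructType schema = new StructType().add("COL_A", DataTypes.StringType);
        List<Row> rows = Arrays.asList(
                RowFactory.create("type_1"),  // kept: upper() gives TYPE_1
                RowFactory.create("other"));  // filtered out
        Dataset<Row> input = spark.createDataFrame(rows, schema);

        // Same filter/select as the method in the question, inlined here.
        Dataset<Row> result = input
                .where(upper(col("COL_A")).isin("TYPE_1", "TYPE_2"))
                .select(col("COL_A"));

        assertEquals(1, result.count());
        assertEquals("type_1", result.first().getString(0));
    }
}
```

Creating the session once per class in @BeforeClass matters in practice: spinning up a SparkSession per test method makes the suite very slow.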

Regards
