Wednesday, July 8, 2015

Using a text file as a Spark Streaming source for testing purposes

I want to write a test for my Spark Streaming application that consumes a Flume source.

http://ift.tt/1dzuqET suggests using ManualClock, but for the moment reading a file and verifying the output would be enough for me.

So I would like to use:

JavaStreamingContext streamingContext = ...
JavaDStream<String> stream = streamingContext.textFileStream(dataDirectory);
stream.print();
streamingContext.awaitTermination();
streamingContext.start();

Unfortunately it does not print anything.

I tried:

  • dataDirectory = "http://hdfsnode:port/absolute/path/on/hdfs/"
  • dataDirectory = "http://fileC:\\absolute\\path\\on\\windows\\"
  • adding the text file to the directory BEFORE the program begins
  • adding the text file to the directory WHILE the program runs

Nothing works.

Any suggestions on how to read from a text file?

Thanks,

Martin
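
A few things may be worth checking in the snippet above. Spark's documented order is start() followed by awaitTermination(), so calling awaitTermination() first can block before the stream is ever started. textFileStream is also documented as picking up only files that appear in the monitored directory after the stream starts, and as expecting them to be moved or renamed into place atomically, with the directory given as a Hadoop-compatible URI (hdfs://... or file:///...) rather than http://. The sketch below uses only the Java standard library and shows one way a test could prepare such a directory and drop a file into it. The class name StreamTestFiles and the helpers toDirectoryUri and dropFile are hypothetical names made up for illustration, and the Spark calls appear only as comments.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

public class StreamTestFiles {

    // Hypothetical helper: turns a local directory into a file: URI string
    // that could be passed to textFileStream(...).
    public static String toDirectoryUri(Path dir) {
        return dir.toAbsolutePath().toUri().toString();
    }

    // Hypothetical helper: writes the file completely in a staging directory,
    // then moves it into the watched directory in one atomic step, so a
    // directory monitor never sees a half-written file.
    public static Path dropFile(Path stagingDir, Path watchedDir,
                                String name, List<String> lines) throws IOException {
        Path staged = stagingDir.resolve(name);
        Files.write(staged, lines, StandardCharsets.UTF_8);
        return Files.move(staged, watchedDir.resolve(name),
                          StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        // Both directories live under one parent so the atomic move stays
        // on a single filesystem.
        Path root = Files.createTempDirectory("stream-test");
        Path staging = Files.createDirectory(root.resolve("staging"));
        Path watched = Files.createDirectory(root.resolve("watched"));

        String dataDirectory = toDirectoryUri(watched);
        System.out.println("dataDirectory = " + dataDirectory);

        // Assumed order in the Spark test (not executed here):
        //   stream = streamingContext.textFileStream(dataDirectory);
        //   stream.print();
        //   streamingContext.start();
        // ...then drop the file while the stream is running:
        Path dropped = dropFile(staging, watched, "input.txt",
                                Arrays.asList("first line", "second line"));
        //   streamingContext.awaitTermination();

        System.out.println("dropped " + dropped.getFileName());
    }
}
```

In a real test, the file would be dropped from the test thread after start() returns, and the test would use a timed wait rather than blocking forever. On HDFS the same idea applies, with the file written to a temporary location and then renamed into the monitored directory.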
