We’re pleased to announce version 0.3.0 of Spark Kafka Writer.
Spark Kafka Writer is a library that lets tou save your Spark RDD
s and
DStream
s to Kafka seamlessly.
The repository is on GitHub and you can find the latest version on maven central.
In this post we’ll cover the new Callback
API as well as the other small
updates this release brings.
The major update for this release is the possibility to have a Kafka Callback
called whenever a message is produced.
As an example, we could log a message whenever the production of a message failed:
import com.github.benfradet.spark.kafka010.writer._
import org.apache.kafka.clients.producer.{Callback, ProducerRecord, RecordMetadata}
@transient lazy val log = org.apache.log4j.Logger.getLogger("spark-kafka-writer")
val producerConfig: java.util.Properties = ...
// with a RDD
val rdd: RDD[String] = ...
rdd.writeToKafka(
producerConfig,
s => new ProducerRecord[String, String](topic, s),
Some(new Callback with Serializable {
override def onCompletion(metadata: RecordMetadata, e: Exception): Unit = {
if (Option(e).isDefined) {
log.warn("error sending message", e)
} else {
log.info(s"everything went fine, record offset was ${metadata.offset()}")
}
}
})
)
// with a DStream
val dStream: DStream[String] = ...
dStream.writeToKafka(
producerConfig,
s => new ProducerRecord[String, String](topic, s),
Some(new Callback with Serializable {
override def onCompletion(metadata: RecordMetadata, e: Exception): Unit = {
if (Option(e).isDefined) {
log.warn("error sending message", e)
} else {
log.info(s"everything went fine, record offset was ${metadata.offset()}")
}
}
})
)
Refer to the Kafka javadoc for sending a message with a Callback through the Kafka producer to know more about callbacks in Kafka.
Thanks a lot to Lawrence Carvalho for this cool new feature!
Version 0.3.0 brings another couple of updates:
For version 0.4.0, we’re aiming to provide an API to write DataFrams and Datasets to Kafka.
If you’d like to get involved, there are different ways you can contribute to the project:
You can also ask questions and discuss the project on the Gitter channel and check out the Scaladoc.