Spark uses micro-batching to handle streams, whereas Flink natively uses streaming for all types of workloads (García-Gil et al. 2017).

Spark plans computations as a sequence of stateless, deterministic operators that can run on any node. For each iteration a new sequence is planned and scheduled (García-Gil et al. 2017; Marcu et al. 2016). In Flink, the operators are planned and scheduled only once, which allows for stateful operators (Marcu et al. 2016).
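The contrast between the two execution models can be illustrated with a small sketch. It is a toy model in plain Python, not the Spark or Flink API; the names `micro_batches`, `run_micro_batched`, `RunningSum` and `run_streaming` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Iterator, List


def micro_batches(stream: Iterable[int], batch_size: int) -> Iterator[List[int]]:
    """Cut a stream into small finite batches (hypothetical helper)."""
    batch: List[int] = []
    for record in stream:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the last, possibly shorter, batch
        yield batch


def run_micro_batched(stream: Iterable[int], batch_size: int) -> List[int]:
    """Micro-batch style: a stateless job is set up anew for every batch."""
    results = []
    for batch in micro_batches(stream, batch_size):
        job = sum  # stands in for a freshly planned, stateless operator
        results.append(job(batch))
    return results


class RunningSum:
    """Streaming style: one long-lived operator that keeps state across records."""

    def __init__(self) -> None:
        self.total = 0

    def process(self, record: int) -> int:
        self.total += record
        return self.total


def run_streaming(stream: Iterable[int]) -> List[int]:
    operator = RunningSum()  # created once, reused for every record
    return [operator.process(record) for record in stream]
```

In this sketch the micro-batched run produces one result per batch, while the streaming operator emits an updated result per record because its state survives between records.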

Many existing streaming systems are based on a continuous operator model, in which each node processes its incoming stream one record at a time and forwards its output to other operators in the pipeline. Recovery is often performed through either replication, where there are two copies of each node, or upstream backup, where the sender keeps a backup of the data it has sent so that it can be resent to a new copy of the failed node. Replication increases resource consumption during normal operation, while upstream backup increases latency on failure (Zaharia et al. 2013).
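Upstream backup can be sketched as follows. This is a simplified illustration under stated assumptions, not code from any particular system: the classes `CountingNode` and `BackupSender` are hypothetical, and the sketch omits the acknowledgements a real system would use to trim the backup buffer.

```python
from typing import List


class CountingNode:
    """Downstream operator whose state (a running count) lives only in memory."""

    def __init__(self) -> None:
        self.count = 0

    def receive(self, record: str) -> None:
        self.count += 1


class BackupSender:
    """Upstream operator that keeps a copy of every record it has sent."""

    def __init__(self) -> None:
        self.backup: List[str] = []  # extra memory used during normal operation

    def send(self, record: str, node: CountingNode) -> None:
        self.backup.append(record)
        node.receive(record)

    def recover(self) -> CountingNode:
        """Start a new copy of the failed node and resend the backup to it."""
        replacement = CountingNode()
        for record in self.backup:  # serial replay: takes longer as the backup grows
            replacement.receive(record)
        return replacement
```

The replacement node only reaches the failed node's state after the whole backup has been replayed through it, which is where the added latency on failure comes from in this model.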

Spark implements Discretized Streams, an unbounded data processing model that aims to overcome the challenges of expensive fault recovery and slow nodes. It is based on fault-tolerant Resilient Distributed Datasets (RDDs), an in-memory data structure used to store intermediate results while allowing efficient recovery. The operations applied to RDDs are tracked so that lost data can be reconstructed in case of failure, and all nodes may participate in the recovery (Zaharia et al. 2013).
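The idea of tracking operations so that lost data can be rebuilt can be sketched with a toy partitioned dataset. This is not the RDD API; the class `Dataset` and its methods `map`, `lose` and `recover` are hypothetical names used only to illustrate recomputation from recorded operations.

```python
from typing import Callable, List, Optional


class Dataset:
    """Toy partitioned dataset that remembers how it was derived."""

    def __init__(
        self,
        partitions: List[Optional[List[int]]],
        parent: Optional["Dataset"] = None,
        fn: Optional[Callable[[int], int]] = None,
    ) -> None:
        self.partitions = partitions  # a lost partition is represented by None
        self.parent = parent          # dataset this one was computed from
        self.fn = fn                  # operation that was applied to the parent

    def map(self, fn: Callable[[int], int]) -> "Dataset":
        new_parts = [[fn(x) for x in part] for part in self.partitions]
        return Dataset(new_parts, parent=self, fn=fn)

    def lose(self, index: int) -> None:
        """Simulate a node failure that drops one in-memory partition."""
        self.partitions[index] = None

    def recover(self, index: int) -> List[int]:
        """Rebuild a lost partition by re-applying the recorded operation."""
        if self.partitions[index] is None:
            if self.parent is None:
                raise RuntimeError("source partition lost; nothing to recompute from")
            parent_part = self.parent.recover(index)
            self.partitions[index] = [self.fn(x) for x in parent_part]
        return self.partitions[index]
```

Each lost partition is rebuilt independently of the others in this sketch, which hints at why recovery work can be spread over several nodes rather than being handled by a single replacement node.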

Recovery in Flink relies on checkpoints taken at regular intervals and on partial re-execution. These snapshots include the state of all operators. In case of failure, the operators are reset to the state in the last snapshot and the stream is reprocessed (Carbone et al. 2015).
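Checkpoint-based recovery can be sketched with a single stateful operator. This is a toy simulation, not Flink's snapshotting mechanism: `CountingOperator` and `run_with_checkpoints` are hypothetical names, and the sketch assumes the input can be re-read from a stored position.

```python
import copy
from typing import Dict, List, Optional


class CountingOperator:
    """Stateful operator that counts occurrences of each record."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}

    def process(self, record: str) -> None:
        self.counts[record] = self.counts.get(record, 0) + 1


def run_with_checkpoints(
    records: List[str], interval: int, fail_at: Optional[int] = None
) -> Dict[str, int]:
    """Process records, snapshotting every `interval` records.

    If `fail_at` is given, one failure is simulated just before the record
    at that position is processed.
    """
    operator = CountingOperator()
    snapshot_state: Dict[str, int] = {}  # operator state at the last checkpoint
    snapshot_pos = 0                     # input position at the last checkpoint
    pos = 0
    failed = False
    while pos < len(records):
        if fail_at is not None and pos == fail_at and not failed:
            failed = True
            # Reset the operator to the last snapshot and rewind the input.
            operator.counts = copy.deepcopy(snapshot_state)
            pos = snapshot_pos
            continue
        operator.process(records[pos])
        pos += 1
        if pos % interval == 0:
            snapshot_state = copy.deepcopy(operator.counts)
            snapshot_pos = pos
    return operator.counts
```

Only the records processed since the last snapshot are handled again after the simulated failure, so a run with a failure ends in the same state as a run without one.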

Spark and Flink were initially built to process different types of data. Spark was built to process static data, and hence it now processes streams using batch processing, while Flink was built for processing streaming data (García-Gil et al. 2017).

Regarding performance, Lopez and Vieru (2016) state that Flink processes streams at higher throughput with consistently low latencies compared to Spark, while Marcu et al. (2016) conclude that Spark performs 1.5 to 1.7 times faster than Flink when processing graphs of different sizes. We can therefore conclude that neither Spark nor Flink is outperformed by the other at the task it was built for.


Carbone, Paris, Asterios Katsifodimos, Stephan Ewen, Volker Markl, Seif Haridi, and Kostas Tzoumas. 2015. "Apache Flink: Stream and Batch Processing in a Single Engine." Bulletin of the IEEE Computer Society Technical Committee on Data Engineering 36 (4).

García-Gil, Diego, Sergio Ramírez-Gallego, Salvador García, and Francisco Herrera. 2017. "A Comparison on Scalability for Batch Big Data Processing on Apache Spark and Apache Flink." Big Data Analytics 2 (1): 1.

Lopez, Javier, and Mihail Vieru. 2016. "Apache Showdown: Flink Vs. Spark -- Zalando Tech Blog." Zalando Jobs. https://jobs.zalando.com/tech/blog/apache-showdown-flink-vs.-spark/?gh_src=4n3gxh1.

Marcu, Ovidiu-Cristian, Alexandru Costan, Gabriel Antoniu, and María S Pérez-Hernández. 2016. "Spark Versus Flink: Understanding Performance in Big Data Analytics Frameworks." In Cluster Computing (CLUSTER), 2016 IEEE International Conference on, 433--42. IEEE.

Zaharia, Matei, Tathagata Das, Haoyuan Li, Timothy Hunter, Scott Shenker, and Ion Stoica. 2013. "Discretized Streams: Fault-Tolerant Streaming Computation at Scale." In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, 423--38. ACM.