Discretized streams

Zaharia, Matei; Das, Tathagata; Li, Haoyuan; Hunter, Timothy; Shenker, Scott; Stoica, Ion

doi:10.1145/2517349.2522737

Article, Oct 8, 2013, Gold OA

Matei Zaharia, Tathagata Das, Haoyuan Li, Timothy Hunter, Scott Shenker

University of California, Berkeley

Indexed in Crossref

Abstract

Many "big data" applications must act on data in real time. Running these applications at ever-larger scales requires parallel platforms that automatically handle faults and stragglers. Unfortunately, current distributed stream processing models provide fault recovery in an expensive manner, requiring hot replication or long recovery times, and do not handle stragglers. We propose a new processing model, discretized streams (D-Streams), that overcomes these challenges. D-Streams enable a parallel recovery mechanism that improves efficiency over traditional replication and backup schemes, and tolerates stragglers. We show that they support a rich set of operators while attaining high per-node throughput similar…

Citation impact

Total citations: 961

FWCI: 231.60
Percentile: 100%
References: 39

Authors: 6

Topics & keywords

Keywords

Computer science
Stream processing
Backup
Data stream mining
Distributed computing
Throughput
Fault tolerance
Big data
