In-Stream Big Data Processing

The shortcomings of batch-oriented data processing were recognized by the Big Data community long ago. It became clear that real-time query processing and in-stream processing are an immediate need in many practical applications. In recent years, this idea has gained a lot of traction, and a whole range of solutions such as Twitter’s Storm, Yahoo’s S4, Cloudera’s Impala, Apache Spark, and Apache Tez have appeared and joined the army of Big Data and NoSQL systems. This article explores the techniques used by developers of in-stream data processing systems, traces the connections of these techniques to massive batch processing and OLTP/OLAP databases, and discusses how one unified query engine can support in-stream, batch, and OLAP processing at the same time.

At Grid Dynamics, we recently needed to build an in-stream data processing system designed to crunch about 8 billion events daily, providing…

View the original post (5,219 more words)
