官术网_书友最值得收藏!

Overview of Storm

Storm is an open source, distributed, resilient, real-time processing engine. It was started by Nathan Marz in late 2010. He was working at BackType. On his blog, he mentioned the challenges he faced while building Storm. It is a must read: http://nathanmarz.com/blog/history-of-apache-storm-and-lessons-learned.html.

Here is the crux of the whole blog: initially, real-time processing was implemented like pushing messages into a queue and then reading the messages from it using Python or any other language and processing them one by one. The challenges with this approach are:

  • In case of failure of the processing of any message, it has to be put back into the queue for reprocessing
  • Keeping queues and the worker (processing unit) up and running all the time

What follows are two sparking ideas by Nathan that make Storm capable of being a highly reliable and real-time engine:

  • Abstraction: Storm is a distributed abstraction in the form of streams. Streams can be produced and processed in parallel. Spouts can produce new streams and a bolt is a small unit of processing in a stream. Topology is top-level abstraction. The advantage of abstraction here is that nobody need worry about what is going on internally, such as serialization/deserialization, sending/receiving messages between different processes, and so on. The user can focus on writing the business logic.
  • A guaranteed message processing algorithm is the second idea. Nathan developed an algorithm based on random numbers and XORs that would only require about 20 bytes to track each spout tuple, regardless of how much processing was triggered downstream.

In May 2011 BackType was acquired by Twitter. After becoming popular in public forums, Storm started to be called "real-time Hadoop". In September, 2011, Nathan officially released Storm. In September, 2013, he officially proposed Storm in Apache Incubator. In September, 2014, Storm became a top-level project in Apache.

主站蜘蛛池模板: 水城县| 多伦县| 清镇市| 虞城县| 北流市| 朝阳市| 阿瓦提县| 米脂县| 和平县| 巩义市| 九龙县| 上杭县| 丰顺县| 静宁县| 金华市| 安宁市| 芷江| 木兰县| 靖江市| 武夷山市| 旌德县| 安图县| 中超| 宁津县| 罗源县| 额济纳旗| 平山县| 孝昌县| 交口县| 庄浪县| 和龙市| 田阳县| 望城县| 芦山县| 随州市| 资中县| 新泰市| 高碑店市| 三河市| 高要市| 潮州市|