
Logical flow of processing

In the current era of data explosion and the always-connected paradigm, organizations collect colossal volumes of data continuously, in real time or near-real time. The value of this data surge depends on the ability to extract actionable, contextual insights in a timely fashion. Streaming applications have a strong mandate to derive real-time, actionable insights from massive data ingestion pipelines, and they have to react to data in real time. For instance, as a data stream arrives, it should trigger a multitude of dependent actions and capture the reactions to them. The most critical part of building streaming solutions is understanding the interplay between input, output, and query processing at scale. Note also that streaming applications never exist in a silo; they are part of a larger ecosystem of applications.

The following illustration provides a high-level conceptual view of how the different components interact. Starting with a stream of data, reference data is brought in to enrich the arriving events, queries are executed and responses pushed out, followed by notifications to end users and storage of the final results in the data store for future reference:

Logical view of streaming flow processing 

In a traditional transactional data processing workload, all the data is collected before processing starts. In stream processing, queries are run against the data while it is in flight, as illustrated here:

Queries executed on streaming data
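
To make the contrast concrete, here is a minimal Python sketch of the two styles; the sensor readings, the 60-second tumbling window, and the function names are illustrative assumptions rather than anything prescribed by a particular streaming engine:

from collections import defaultdict

# Batch style: the complete dataset is collected first, then the query runs once over it.
def batch_average(readings):
    totals, counts = defaultdict(float), defaultdict(int)
    for sensor, value in readings:
        totals[sensor] += value
        counts[sensor] += 1
    return {sensor: totals[sensor] / counts[sensor] for sensor in totals}

# Streaming style: the same aggregate is evaluated continuously against data in flight,
# emitting a result per tumbling window instead of waiting for all the data to arrive.
def streaming_average(events, window_seconds=60):
    window_start, totals, counts = None, defaultdict(float), defaultdict(int)
    for timestamp, sensor, value in events:        # events arrive one at a time
        if window_start is None:
            window_start = timestamp
        if timestamp - window_start >= window_seconds:
            yield window_start, {s: totals[s] / counts[s] for s in totals}
            window_start, totals, counts = timestamp, defaultdict(float), defaultdict(int)
        totals[sensor] += value
        counts[sensor] += 1

# Usage: feed a (timestamp, sensor, value) stream and react to each window as it closes.
stream = [(t, "sensor-1", 20.0 + (t % 50) / 10.0) for t in range(0, 180, 10)]
print(batch_average([(s, v) for _, s, v in stream]))
for window_start, averages in streaming_average(stream):
    print(window_start, averages)

Note that the streaming version keeps only the current window's running totals in memory, which foreshadows the limited working-memory constraint discussed next.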

When data is continually in motion, keeping track of its state is challenging: the state is held in memory, that is, in limited working memory. Additionally, networking issues creep in, resulting in late-arriving data or missing data sets. Patterns such as Command Query Responsibility Segregation are used to scale out reads and writes separately.

Command Query Responsibility Segregation (CQRS) is an architecture pattern for separating concerns: reads and writes are handled by separate models, which allows each to be scaled and served faster as separate streams. Each event stored in the event store is immutable and carries a timestamp.
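
As a rough sketch of that separation, assuming a simple account-balance example with hypothetical Event, EventStore, CommandHandler, and BalanceView names, the write side below appends immutable, timestamped events to an append-only store while the read side builds its own view from those events:

import time
from dataclasses import dataclass
from typing import Dict, List

# An immutable, timestamped event; frozen=True prevents mutation after creation.
@dataclass(frozen=True)
class Event:
    timestamp: float
    entity_id: str
    kind: str
    payload: dict

class EventStore:
    """Append-only log of immutable events."""
    def __init__(self):
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def events(self) -> List[Event]:
        return list(self._events)

class CommandHandler:
    """Write model: turns commands into events and records them."""
    def __init__(self, store: EventStore):
        self._store = store

    def handle_deposit(self, account_id: str, amount: float) -> None:
        self._store.append(Event(time.time(), account_id, "deposited", {"amount": amount}))

class BalanceView:
    """Read model: a separate, denormalized view built from the event log,
    queried and scaled independently of the write path."""
    def __init__(self):
        self.balances: Dict[str, float] = {}

    def apply(self, event: Event) -> None:
        if event.kind == "deposited":
            self.balances[event.entity_id] = self.balances.get(event.entity_id, 0.0) + event.payload["amount"]

Commands flow through CommandHandler and queries go to BalanceView, so each side can be tuned and scaled on its own.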

In the preceding architecture, immutable, timestamped events are sent through the event pipe and split between immediate event actions and long-term data retention. Because every event is stored with a timestamp, the state of the system at any previous point in time can be determined by querying the events. Splitting the data streams into multiple channels yields higher throughput.
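
Continuing the hypothetical sketch above (reusing its Event, EventStore, CommandHandler, and BalanceView definitions), replaying only the events recorded up to a given timestamp reconstructs the system state as of that moment:

def state_as_of(store: EventStore, as_of: float) -> BalanceView:
    """Rebuild the read model from events recorded at or before the given time."""
    view = BalanceView()
    for event in store.events():
        if event.timestamp <= as_of:      # later events are ignored
            view.apply(event)
    return view

store = EventStore()
handler = CommandHandler(store)
handler.handle_deposit("acct-1", 100.0)
time.sleep(0.01)                          # keep the checkpoint strictly between the two events
checkpoint = time.time()
time.sleep(0.01)
handler.handle_deposit("acct-1", 50.0)

print(state_as_of(store, checkpoint).balances)    # {'acct-1': 100.0}
print(state_as_of(store, time.time()).balances)   # {'acct-1': 150.0}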
