
Understanding emerging cloud-based application architectures

In this section, we describe common architecture patterns and the deployment of some of the main processing models used for batch processing, streaming applications, and machine learning pipelines. The underlying architecture for these processing models must support ingesting very large volumes of various types of data arriving at high velocity at one end, while making the output data available to analytical tools and to reporting and modeling software at the other.

The software platforms supporting such applications provide the key mechanisms required to access data across a diverse set of data sources and formats and to prepare it for downstream applications, either as low-latency streaming data or as high-throughput historical data stores. For example, Apache Spark is an emerging platform that leverages distributed storage and processing frameworks to support querying, reporting, analytics, and intelligent applications at scale.
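To make the distinction between the two delivery modes concrete, here is a minimal, library-agnostic sketch in plain Python. It does not use the Spark API. The record layout, the `clean` step, and the function names `batch_counts` and `streaming_counts` are all hypothetical and chosen only for illustration. It shows one shared preparation step feeding both a batch path over historical data and a micro-batch path over a stream.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator

Record = Dict[str, str]  # hypothetical record layout: {"user": ..., "event": ...}


def clean(record: Record) -> Record:
    """Shared preparation step: trim whitespace and lower-case the event type."""
    return {
        "user": record["user"].strip(),
        "event": record["event"].strip().lower(),
    }


def batch_counts(records: Iterable[Record]) -> Counter:
    """High-throughput path: aggregate the full historical data set in one pass."""
    return Counter(clean(r)["event"] for r in records)


def streaming_counts(stream: Iterable[Record], batch_size: int) -> Iterator[Counter]:
    """Low-latency path: emit a snapshot of running counts after each micro-batch."""
    running: Counter = Counter()
    pending = 0
    for record in stream:
        running[clean(record)["event"]] += 1
        pending += 1
        if pending == batch_size:
            yield Counter(running)  # copy, so later updates do not alter it
            pending = 0
    if pending:
        # Flush the final, partially filled micro-batch.
        yield Counter(running)


# Illustrative input only; not taken from any real data set.
events = [
    {"user": "a", "event": "Click"},
    {"user": "b", "event": "view "},
    {"user": "a", "event": "click"},
]

historical = batch_counts(events)
incremental = list(streaming_counts(iter(events), batch_size=2))
```

Because both paths reuse `clean`, the last streaming snapshot matches the batch result. A unified engine aims to give the same consistency between batch and streaming outputs at much larger scale.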

For more details on Apache Spark-based architectures, refer to Learning Spark SQL by Aurobindo Sarkar, Packt Publishing.

The following figure shows a high-level architecture that incorporates these requirements in typical Spark-based batch and streaming applications:
