官术网_书友最值得收藏!

Summary

We live in a connected digital era and are witnessing unprecedented growth of data. Organizations that are able to analyze Big Data are demonstrating significant return on investment by detecting fraud, improved operations, and reduced time to analyze a scale-out architecture. Apache Hadoop is the leading open source big data platform with strong and diverse ecosystem projects that enable organizations to build a modern data architecture. At the core, Hadoop has two key components—Hadoop Distributed File System also known as HDFS and a cluster resource manager known as YARN. YARN has enabled Hadoop to be a true multiuse data platform that can handle batch processing, real-time streaming, interactive SQL, and others.

Microsoft HDInsight is an enterprise-ready distribution of Hadoop on the cloud that has been developed in partnership with Hortonworks and Microsoft. Key benefits of HDInsight include: scale up/down as required, analysis using Excel, connect on-premise Hadoop cluster with the cloud, and flexible programming and support for NoSQL transactional database.

In the next chapter, we will take a look at how to build an Enterprise Data Lake using HDInsight.

主站蜘蛛池模板: 咸丰县| 同仁县| 敦化市| 齐齐哈尔市| 东兰县| 永昌县| 修文县| 荥经县| 兰考县| 且末县| 舞阳县| 札达县| 襄城县| 正镶白旗| 延吉市| 阜城县| 余江县| 抚松县| 茂名市| 清原| 惠来县| 天祝| 黄大仙区| 子洲县| 玉田县| 广西| 澎湖县| 方正县| 石河子市| SHOW| 孙吴县| 边坝县| 尉氏县| 金山区| 泊头市| 浠水县| 抚宁县| 云南省| 昂仁县| 莱阳市| 大竹县|