The fundamentals of Hadoop

In 2006, Doug Cutting, the creator of Hadoop, was working at Yahoo!. He was actively engaged in Nutch, an open source project to develop a large-scale web crawler. At a high level, a web crawler is software that browses and indexes web pages on the internet, generally automatically. This requires efficient storage of, and computation across, large volumes of data. In late January 2006, Doug formally announced the start of Hadoop. The first line of the request, still available at https://issues.apache.org/jira/browse/INFRA-700, read: "The Lucene PMC has voted to split part of Nutch into a new subproject named Hadoop." And thus, Hadoop was born.

At the outset, Hadoop had two core components: the Hadoop Distributed File System (HDFS) and MapReduce. This first iteration is now known as Hadoop 1. Later, in 2012, a third component was added, YARN (Yet Another Resource Negotiator), which decoupled resource management and job scheduling from MapReduce. Before we delve into the core components in more detail, it helps to understand the fundamental premises of Hadoop:

Doug Cutting's post at https://issues.apache.org/jira/browse/NUTCH-193 announced his intent to separate the Nutch Distributed File System (NDFS) and MapReduce into a new subproject called Hadoop.
