
Dataset

The first dataset is a table of cities containing the city ID and the name of the city:

Id,City
1,Boston
2,New York
3,Chicago
4,Philadelphia
5,San Francisco
7,Las Vegas

This file, cities.csv, is available as a download. Once downloaded, you can copy it into HDFS by running the following command:

hdfs dfs -copyFromLocal cities.csv /user/normal

The second dataset holds daily temperature measurements. Each row contains the date of the measurement, the city ID, and the temperature recorded for that city on that date:

Date,Id,Temperature
2018-01-01,1,21
2018-01-01,2,22
2018-01-01,3,23
2018-01-01,4,24
2018-01-01,5,25
2018-01-01,6,22
2018-01-02,1,23
2018-01-02,2,24
2018-01-02,3,25

This file, temperatures.csv, is available as a download. Once downloaded, you can copy it into HDFS by running the following command:

hdfs dfs -copyFromLocal temperatures.csv /user/normal
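The two files are related through the shared Id column. As a rough illustration of that relationship, the following sketch joins the sample rows in plain Python on a single machine. It is not MapReduce code and it does not read from HDFS: the rows are inlined as strings, and the names read_rows, joined, cities_without_readings, and readings_without_city are hypothetical, chosen only for this example.

```python
import csv
import io

# Sample rows copied from cities.csv and temperatures.csv as listed above.
CITIES_CSV = """Id,City
1,Boston
2,New York
3,Chicago
4,Philadelphia
5,San Francisco
7,Las Vegas
"""

TEMPERATURES_CSV = """Date,Id,Temperature
2018-01-01,1,21
2018-01-01,2,22
2018-01-01,3,23
2018-01-01,4,24
2018-01-01,5,25
2018-01-01,6,22
2018-01-02,1,23
2018-01-02,2,24
2018-01-02,3,25
"""


def read_rows(text):
    """Parse CSV text with a header row into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


# Lookup table from city ID to city name.
cities = {row["Id"]: row["City"] for row in read_rows(CITIES_CSV)}
temperatures = read_rows(TEMPERATURES_CSV)

# Inner join on Id: keep only readings whose city ID is in cities.csv.
joined = [
    (row["Date"], cities[row["Id"]], int(row["Temperature"]))
    for row in temperatures
    if row["Id"] in cities
]

# IDs that appear in only one of the two files.
temperature_ids = {row["Id"] for row in temperatures}
cities_without_readings = sorted(set(cities) - temperature_ids)
readings_without_city = sorted(temperature_ids - set(cities))

for record in joined:
    print(record)
print("cities without readings:", cities_without_readings)
print("readings without a city:", readings_without_city)
```

With the sample rows, city ID 7 (Las Vegas) has no temperature readings, and ID 6 appears in temperatures.csv without a matching row in cities.csv, so an inner join on Id drops both.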

The following are the programming components of a MapReduce program:
