
Summary

This chapter focused on how to fetch and process data directly from the Web, including some problems with downloading files, processing XML and JSON formats, parsing HTML tables, applying XPath selectors to extract data from HTML pages, and interacting with RESTful APIs.

Although some examples in this chapter might have looked like a needless struggle with the Socrata API, the RSocrata package turned out to provide production-ready access to all that data. Bear in mind, however, that you will face situations where no ready-made R package exists; as a data hacker, you will then have to get your hands dirty with the raw JSON, HTML, and XML sources.

In the next chapter, we will discover how to filter and aggregate data that has already been acquired and loaded, using the most widely used methods for reshaping and restructuring data.
