- Hands-On Data Science with R
- Vitor Bianchi Lanzetta, Nataraj Dasgupta, Ricardo Anjoleto Farias
- 2021-06-10 19:12:34
Data Wrangling with R
– Daniel Keys Moran
Data wrangling has long been one of R's core strengths, owing to its relatively fast in-memory processing and a wide array of packages that speed up the data curation work that wrangling involves.
R is especially valuable when working with datasets that exceed the Microsoft Excel worksheet limit of 1,048,576 rows (roughly 1 million), or with files on the order of gigabytes. Thanks to several easy-to-use functions for common day-to-day tasks such as aggregations, joins, and pivots, R is also arguably much simpler to use than some of the GUI-based tools available for similar tasks.
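As a minimal sketch of the three everyday tasks named above (aggregation, join, and pivot), the following uses only base R so that it runs without extra packages. The `sales` and `managers` data frames and all their values are invented for illustration; they are not taken from the book.

```r
# Toy data, invented for illustration only.
sales <- data.frame(
  region  = c("East", "East", "West", "West", "West"),
  quarter = c("Q1", "Q2", "Q1", "Q2", "Q2"),
  amount  = c(100, 150, 200, 50, 25),
  stringsAsFactors = FALSE
)
managers <- data.frame(
  region  = c("East", "West"),
  manager = c("Ana", "Raj"),
  stringsAsFactors = FALSE
)

# Aggregation: total amount per region.
totals <- aggregate(amount ~ region, data = sales, FUN = sum)

# Join: attach the manager to each regional total (inner join on "region").
joined <- merge(totals, managers, by = "region")

# Pivot: regions as rows, quarters as columns, summed amounts as cells.
pivot <- xtabs(amount ~ region + quarter, data = sales)

print(joined)
print(pivot)
```

Packages such as dplyr and data.table offer their own equivalents of these operations; base R is used here only to keep the sketch dependency-free.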
At a high level, the core categories of data wrangling with R are data extraction, data cleansing, data transformation, and data consolidation. This is a simplified categorization of the basic tenets of data wrangling, and we'll delve deeper into each of these areas in the next few sections. The challenge stems largely from the fact that data comes in a range of data types and data formats from a diverse pool of data sources. Here, data type refers to the characteristics of the contents of the files, format refers to the file format in which data is delivered, and source refers to the systems from which you receive data. There is no universal convention for these: the data may exist in a CSV file or a binary SAS file, or be present in a database, each of which can have its own nuances and challenges.
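The four categories can be illustrated with a small base R sketch, under stated assumptions: the CSV content is inlined as text so the example is self-contained, and the column names, values, and the `segments` table (standing in for a second source such as a database) are all hypothetical.

```r
# Extraction: read a CSV source (inlined here instead of a real file).
raw_csv <- "id,name,signup_date,spend
1, Alice ,2021-01-15,120.5
2,Bob,2021-02-20,NA
3, Carol,2021-03-05,80"
customers <- read.csv(text = raw_csv, stringsAsFactors = FALSE)

# Cleansing: trim stray whitespace, fix the date type, drop rows with no spend.
customers$name <- trimws(customers$name)
customers$signup_date <- as.Date(customers$signup_date)
clean <- customers[!is.na(customers$spend), ]

# Transformation: derive a year-month column from the signup date.
clean$signup_month <- format(clean$signup_date, "%Y-%m")

# Consolidation: combine with a second (hypothetical) source keyed on "id".
segments <- data.frame(
  id      = c(1L, 3L),
  segment = c("gold", "silver"),
  stringsAsFactors = FALSE
)
consolidated <- merge(clean, segments, by = "id")

str(consolidated)
```

Calling `str()` at the end shows the type of each column, which is a quick way to check that the data types came through as intended after each step.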
In this chapter, we will cover the following topics:
- Introduction to data wrangling with R
- The foundational tools of data wrangling: dplyr, data.table, and others
- ETL with R: data extraction
- ETL with R: data transformation
- ETL with R: data load
- Helpful data wrangling tools for everyday use
- Tutorial