- Statistics for Data Science
- James D. Miller
- 2021-07-02 14:58:54
Munging and wrangling
The terms munging and wrangling are jargon for the work of changing the format of a dataset, recordset, or file in order to prepare it for further processing or evaluation.
If you have a data development background, you are most likely familiar with the idea of Extract, Transform, and Load (ETL). In much the same way, a data developer may munge or wrangle data during the transformation steps of an ETL process.
Common munging and wrangling tasks include removing punctuation or HTML tags, parsing, filtering, transforming, and mapping data, and tying together systems and interfaces that were not specifically designed to interoperate. Munging can also describe processing or filtering raw data into another form that is more convenient to consume elsewhere.
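As a rough illustration of the tasks just listed (removing HTML tags and punctuation, filtering, and mapping), here is a minimal Python sketch that uses only the standard library. The function names, the sample records, and the `city_codes` lookup table are hypothetical and chosen for illustration. The regular expression is a deliberately naive tag matcher, so real HTML would call for a proper parser.

```python
import html
import re
import string

# Naive tag matcher: an assumption that suits simple snippets, not arbitrary HTML.
TAG_RE = re.compile(r"<[^>]+>")

def strip_html(text):
    """Replace simple HTML tags with spaces and decode entities such as &amp;."""
    return html.unescape(TAG_RE.sub(" ", text))

def strip_punctuation(text):
    """Delete every ASCII punctuation character."""
    return text.translate(str.maketrans("", "", string.punctuation))

def munge(raw_records, mapping):
    """Clean each record, drop empty ones, and map known values to codes."""
    cleaned = []
    for raw in raw_records:
        text = strip_punctuation(strip_html(raw))
        text = " ".join(text.split()).lower()    # collapse whitespace, lowercase
        if not text:                             # filtering: skip empty records
            continue
        cleaned.append(mapping.get(text, text))  # mapping: fall back to the text
    return cleaned

# Hypothetical sample data and lookup table.
raw = ["<p>New York!</p>", "  <b>N.Y.</b> ", "<br/>", "Boston &amp; Cambridge"]
city_codes = {"new york": "NYC", "ny": "NYC"}

result = munge(raw, city_codes)
print(result)
```

Two differently formatted spellings of the same city end up with the same code, which is the kind of convenience for downstream consumers that munging aims for.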
Munging and wrangling might be performed multiple times within a data science process, or at different steps as the process evolves. Some data scientists use the term munging more broadly to include data visualization, data aggregation, training a statistical model, and much other related work. In general, munging and wrangling follow a flow that begins with extracting the data in raw form, continues with applying the munging logic, and ends with placing the resulting content into a structure for use.
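The extract, munge, and place-into-a-structure flow described above might be sketched as follows. This is a minimal example under stated assumptions, not a prescribed method: the inline CSV text, the `sales` table, and the `extract`, `munge`, and `load` function names are all hypothetical, and an in-memory SQLite database stands in for whatever target structure a real project would use.

```python
import csv
import io
import sqlite3

# Hypothetical raw input: inconsistent casing, stray spaces, a bad amount.
RAW_CSV = """name,amount
 Alice ,"1,200"
bob,n/a
CAROL,35
"""

def extract(text):
    """Step 1: read the raw data as a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def munge(rows):
    """Step 2: normalize names, parse amounts, drop unparseable rows."""
    clean = []
    for row in rows:
        name = row["name"].strip().title()
        amount_text = row["amount"].replace(",", "").strip()
        if not amount_text.isdigit():  # filter out values such as "n/a"
            continue
        clean.append((name, int(amount_text)))
    return clean

def load(records):
    """Step 3: place the cleaned records into a structure for use."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    conn.commit()
    return conn

conn = load(munge(extract(RAW_CSV)))
rows = conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
print(rows)
```

Keeping the three steps as separate functions makes it easy to rerun only the munging step when the cleaning rules change, which fits the point that munging may happen several times in one process.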
Although there are many valid options for munging, wrangling, preprocessing, and manipulating data, one tool that is popular with many data scientists today is Trifacta, whose vendor claims that it is the number one data wrangling solution in many industries.