- Learning pandas (Second Edition)
- Michael Heydt
- 227 words
- 2021-07-02 20:36:57
Data manipulation
Data is distributed all over the planet, stored in different formats, and of widely varying quality. Because of this, there is a need for tools and processes that pull data together into a form that can be used for decision making. Preparing data for analysis requires many different tasks and capabilities from a data manipulation tool. The features needed from such a tool include:
- Programmability for reuse and sharing
- Access to data from external sources
- Storing data locally
- Indexing data for efficient retrieval
- Alignment of data in different sets based upon attributes
- Combining data in different sets
- Transformation of data into other representations
- Cleaning cruft from data
- Effective handling of bad data
- Grouping data into common baskets
- Aggregation of data of like characteristics
- Application of functions to calculate meaning or perform transformations
- Querying and slicing to explore pieces of the whole
- Restructuring into other forms
- Modeling distinct categories of data such as categorical, continuous, discrete, and time series
- Resampling data to different frequencies
There are many data manipulation tools in existence. Each differs in its support for the items on this list, in how it is deployed, and in how it is used. These tools include relational databases (SQL Server, Oracle), spreadsheets (Excel), event processing systems (such as Spark), and more generic tools such as R and pandas.
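To make a few items from the feature list concrete, here is a minimal sketch in pandas, one of the tools named above. It is only an illustration: the data, the column names (`city`, `amount`, `region`), and the variable names are all hypothetical and do not come from the book. It touches on handling bad data, combining sets, grouping and aggregation, alignment, and resampling.

```python
import pandas as pd

# Hypothetical data: two small sets that share a "city" attribute.
sales = pd.DataFrame({
    "city": ["Austin", "Boston", "Austin", "Denver"],
    "amount": [100.0, 250.0, None, 75.0],
})
regions = pd.DataFrame({
    "city": ["Austin", "Boston", "Denver"],
    "region": ["South", "East", "West"],
})

# Handling bad data: drop rows where the amount is missing.
clean = sales.dropna(subset=["amount"])

# Combining data in different sets, matched on the shared attribute.
combined = clean.merge(regions, on="city", how="left")

# Grouping into baskets and aggregating within each basket.
totals = combined.groupby("region")["amount"].sum()

# Alignment: arithmetic on two Series matches values by index label,
# producing NaN where a label exists in only one of them.
a = pd.Series([1, 2], index=["x", "y"])
b = pd.Series([10, 20], index=["y", "z"])
aligned_sum = a + b

# Resampling: convert daily observations to two-day totals.
ts = pd.Series(
    [1, 2, 3, 4],
    index=pd.date_range("2021-01-01", periods=4, freq="D"),
)
two_day = ts.resample("2D").sum()

print(totals)
print(aligned_sum)
print(two_day)
```

Each step maps to one bullet in the list; a real workflow would typically begin by loading data from an external source rather than building it inline.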