What this book covers

Chapter 1, Introducing Data Analysis and Libraries, describes the typical steps involved in a data analysis task. It also describes a couple of existing data analysis software packages.

Chapter 2, NumPy Arrays and Vectorized Computation, dives right into the core of the PyData ecosystem by introducing the NumPy package for high-performance computing. Its basic data structure is a typed multidimensional array, which supports various functions, among them typical linear algebra tasks. The data structure and functions are explained along with examples.

Chapter 3, Data Analysis with Pandas, introduces a prominent and popular data analysis library for Python called Pandas. It is built on NumPy, but makes a lot of real-world tasks simpler. Pandas comes with its own core data structures, which are explained in detail.

Chapter 4, Data Visualization, focuses on another important aspect of data analysis: the understanding of data through graphical representations. The Matplotlib library is introduced in this chapter. It is one of the most popular 2D plotting libraries for Python and is well integrated with Pandas.

Chapter 5, Time Series, shows how to work with time-oriented data in Pandas. Date and time handling can quickly become a difficult, error-prone task when implemented from scratch. We show how Pandas can be of great help there, by looking in detail at some of the functions for date parsing and date sequence generation.

Chapter 6, Interacting with Databases, deals with some typical scenarios of reading data from storage systems. Your data does not live in a vacuum, and it might not always be available as CSV files either. MongoDB is a NoSQL database and Redis is a data structure server, although many people think of it as a key-value store first. Both storage systems are introduced to help you interact with data from real-world systems.

Chapter 7, Data Analysis Application Examples, applies many of the things covered in the previous chapters to deepen your understanding of typical data analysis workflows. How to clean, inspect, reshape, merge, and group data: these are the concerns of this chapter. The library of choice is again Pandas.

Chapter 8, Machine Learning Models with scikit-learn, familiarizes you with a popular machine learning package for Python. While it supports dozens of models, we look at only four: two supervised and two unsupervised. Even if this is not mentioned explicitly, this chapter brings together a lot of the existing tools. Pandas is often used to prepare data for machine learning, and Matplotlib is used to create plots that facilitate understanding.
