
Getting started with EDA

As mentioned earlier, we are going to use Python as the main tool for data analysis. Yay! Why Python? It has been consistently ranked among the top 10 programming languages and is widely adopted for data analysis and data mining by data science experts. In this book, we assume you have a working knowledge of Python; if you are not familiar with it, it's probably too early to get started with data analysis. In particular, you should be comfortable with the following Python topics and packages:


        

Python programming: fundamental concepts of variables, strings, and data types; conditionals and functions; sequences, collections, and iterations; working with files; object-oriented programming

NumPy: creating, copying, and dividing arrays; performing different operations on NumPy arrays; array selections, advanced indexing, and expanding; working with multi-dimensional arrays; linear algebraic functions and built-in NumPy functions

pandas: understanding and creating DataFrame objects; subsetting and indexing data; arithmetic functions and mapping with pandas; managing the index; building styles for visual analysis

Matplotlib: loading linear datasets; adjusting axes, grids, labels, titles, and legends; saving plots

SciPy: importing the package; using statistical packages from SciPy; performing descriptive statistics; inference and data analysis
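If you want a quick self-test, the following sketch touches one or two items from each library in the list. The toy data and every variable name here are made up for illustration and do not come from the book. If each line reads naturally to you, you are probably at the assumed level.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy import stats

# NumPy: create an array, copy it, divide it, reshape it, and index it.
values = np.arange(1, 13)            # integers 1 through 12
backup = values.copy()               # independent copy of the data
halves = np.split(values, 2)         # two arrays of six elements each
matrix = values.reshape(3, 4)        # the same data as a 3x4 array
evens = values[values % 2 == 0]      # boolean-mask (advanced) indexing

# pandas: build a DataFrame, subset it, map over a column, set the index.
df = pd.DataFrame({"x": values, "y": values ** 2})
subset = df.loc[df["x"] > 6, ["x", "y"]]
df["parity"] = df["x"].map(lambda v: "even" if v % 2 == 0 else "odd")
df = df.set_index("x")

# SciPy: descriptive statistics for one column.
summary = stats.describe(df["y"].to_numpy())

# Matplotlib: plot, label, and save (to an in-memory buffer in this sketch).
fig, ax = plt.subplots()
ax.plot(df.index, df["y"], label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Toy dataset")
ax.grid(True)
ax.legend()
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)

print(matrix.shape, evens.tolist(), len(subset), summary.mean)
```

To write the plot to disk instead, pass a file path such as `"plot.png"` to `fig.savefig`.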

 

Before diving into details about analysis, we need to make sure we are on the same page. Let's go through the checklist and verify that you meet all of the prerequisites to get the best out of this book:
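One part of that verification can be scripted. The sketch below assumes Python 3.8 or later (for `importlib.metadata`) and checks whether NumPy, pandas, Matplotlib, and SciPy are installed. The `REQUIRED` list and the `installed_versions` helper are illustrative names, not part of the book.

```python
from importlib import metadata

# Distribution names as published on PyPI; adjust the list to your own setup.
REQUIRED = ["numpy", "pandas", "matplotlib", "scipy"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'not installed'}")
```

Any package reported as not installed can usually be added with `pip install <name>` inside your environment.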

 

Next, let's look at the basic operations of EDA using the NumPy library.
