Introduction

So far, we have taken a glance at the data science ecosystem and jumped into learning about Jupyter, the tool that we'll be using throughout this book for our coding exercises and activities. Now, we'll shift our focus away from learning about Jupyter and start actually using it for analysis.

Data visualization and exploration are important steps in the data science process. They are how you learn about your data and confirm that you understand it. Visualizations help you discover unusual records in a dataset and present what you find to others.

In addition to understanding and gaining fundamental trust in data, your analysis may lead to the discovery of patterns and insights in the data. In some cases, these patterns can prompt further research and ultimately be very beneficial to your business.

Applied knowledge of a high-level programming language such as Python or R will make datasets accessible to you, from top-level aggregations to granular details. However, it's also possible to learn a lot from data with tools that are easier to pick up and use, such as Tableau or Microsoft Power BI.

In addition to learning about the tools to create them, it's important to have a conceptual understanding of different types of visualizations and their uses. Similarly, there are a handful of important techniques relating to data exploration, such as aggregation and filtering for outliers or missing samples.
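As a rough illustration of the exploration techniques just mentioned, the following minimal sketch uses pandas to count and drop missing samples, filter outliers, and aggregate by group. The toy data, the column names `city` and `price`, and the 1.5 × IQR outlier rule are assumptions made for this example only; they are one common choice, not a prescribed method.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "city": ["A", "A", "B", "B", "B", "C"],
    "price": [10.0, 12.0, 11.0, np.nan, 250.0, 9.0],
})

# Missing samples: count them, then drop the affected rows.
n_missing = int(df["price"].isna().sum())
clean = df.dropna(subset=["price"])

# Outliers: keep only rows within 1.5 * IQR of the quartiles
# (an assumed rule of thumb; other thresholds are equally valid).
q1 = clean["price"].quantile(0.25)
q3 = clean["price"].quantile(0.75)
iqr = q3 - q1
in_range = clean["price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
filtered = clean[in_range]

# Aggregation: number of records and mean price per city.
summary = filtered.groupby("city")["price"].agg(["count", "mean"])

print(f"missing values: {n_missing}")
print(summary)
```

Whether to drop, impute, or keep missing and extreme values depends on the dataset and the question being asked, so treat the thresholds here as a starting point rather than a rule.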

In this chapter, we'll start by learning the basics of working with datasets in Jupyter using pandas DataFrames. Then, we'll explore datasets with the Seaborn visualization library and do some basic modeling with scikit-learn.
