In the last 20 years, the amount of data created has grown massively. The need to understand this data, communicate what it means, and use it to make better decisions has grown with it. What has not changed is human biology, so our brains must make sense of this ever-increasing amount of information. Because pictures are easier to understand than numbers, good visualisations have become more important as data grows in quantity, size and complexity.

Data comes in different kinds, so it demands different methods to make sense of it. No single tool or program will work for all datasets, so we must be flexible. Often we have to manipulate data before we can visualise it. In fact, a visualisation is typically part of a wider analysis, so we must learn to write code to analyse and visualise the data. Programming is the means by which we gain that flexibility.

Now comes the first choice: in what programming language shall we write the code? We have to choose at least one, and the authors of this book have chosen Python.

Python is a widely used general-purpose programming language that is easy to learn. It has been embraced by a large scientific computing community, which has created an open ecosystem of packages for analysing and visualising data. By choosing Python, these packages become available to you, free of charge. For example, key packages like NumPy and Pandas, which are covered in Chapter 2, make it possible to represent data in sequences and in tables, and they provide many useful methods to act on this data.

The next choice is which package or packages to use for visualisation. The authors have three choices for you: Matplotlib, Seaborn and plotnine. Are they good choices? Yes, they are.

Matplotlib is the most widely used package for data visualisation in Python. Powerful and versatile, it can be used to create figures for publication or to create interactive environments. In 1999, Leland Wilkinson introduced an elegant way to think about data visualisation in the book "The Grammar of Graphics". This "Grammar" gives us a structured way to transform data into a visualisation, and it makes it easy to create many kinds of complicated plots. This is where the Seaborn and plotnine packages come in: they are built on top of Matplotlib and are inspired by ggplot2, an implementation of "The Grammar of Graphics" by Hadley Wickham.

The programming language and key packages are choices made for you, but making beautiful visualisations requires many more choices. These choices change depending on the data, display medium and audience; they are what this book will help you learn to make. Here you will be exposed to a variety of plots, learn the advantages of different plots for the same data, learn about "The Grammar of Graphics", learn how to create visualisations with multiple plots, and learn how to customise them. Ultimately, you will learn how to make beautiful visualisations.

Now you have no choice but to proceed.

Hassan Kibirige

Author/Maintainer of plotnine

9 January 2020
