
About the dataset

The dataset that we will focus on throughout this chapter is the Auto.MPG dataset, which is widely used in R examples. It contains fuel economy data for 38 popular car models for the years 1999 and 2008. The dataset also ships with the ggplot2 package, which we will cover in the coming chapters.
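As a quick check of the packaged copy, the following minimal sketch loads the dataset from ggplot2 when that package is installed. It assumes the packaged data frame is exposed as `ggplot2::mpg`; the `requireNamespace` guard and the name `mpg_pkg` are our additions for illustration, so the snippet does not fail on systems without ggplot2.

```r
# Load the copy of the dataset bundled with ggplot2, if the package is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  # Convert the packaged tibble to a plain data frame.
  mpg_pkg <- as.data.frame(ggplot2::mpg)
  print(dim(mpg_pkg))            # number of rows and columns
  print(unique(mpg_pkg$year))    # model years covered
} else {
  message("ggplot2 is not installed; use the CSV import instead.")
}
```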

For now, we will focus on importing the dataset from the CSV file, which you can download from the following link: 

https://github.com/PacktPublishing/Hands-On-Exploratory-Data-Analysis-with-R/tree/master/ch03

For more details pertaining to the dataset, you can refer to the following link:

https://archive.ics.uci.edu/ml/datasets/auto+mpg

Once the download is complete, we can import the CSV file into a data frame. This makes the dataset available in the R workspace:

> mpg <- read.csv("highway_mpg.csv", stringsAsFactors = FALSE)
> View(mpg)

From this, we get the following output:

As shown in the preceding screenshot, the Auto.MPG dataset includes the following attributes: manufacturer, model, displ, year, cyl, trans, drv, cty, hwy, fl, and class.

The dataset, which is represented in tabular format, is as follows:
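The import step described earlier can be tried without the downloaded file. The sketch below writes a four-row sample to a temporary CSV and reads it back with `read.csv`. The sample rows reuse the first values shown by `str(mpg)` in this section; the names `sample_csv` and `mpg_sample` are hypothetical and are used only for this illustration.

```r
# Write a four-row sample to a temporary file so the example runs
# without the downloaded CSV.
sample_csv <- tempfile(fileext = ".csv")
writeLines(c(
  "manufacturer,model,displ,year,cyl,trans,drv,cty,hwy,fl,class",
  "audi,a4,1.8,1999,4,auto(l5),f,18,29,p,compact",
  "audi,a4,1.8,1999,4,manual(m5),f,21,29,p,compact",
  "audi,a4,2,2008,4,manual(m6),f,20,31,p,compact",
  "audi,a4,2,2008,4,auto(av),f,21,30,p,compact"
), sample_csv)

# Stop early if the file is missing.
stopifnot(file.exists(sample_csv))

# stringsAsFactors = FALSE keeps text columns as character vectors
# instead of converting them to factors.
mpg_sample <- read.csv(sample_csv, stringsAsFactors = FALSE)

# Confirm the shape and the column types of the imported data frame.
print(dim(mpg_sample))
print(sapply(mpg_sample, class))
```

Checking the dimensions and column classes right after the import is a cheap way to catch a wrong separator or an unexpected type conversion before any analysis starts.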

A description of the dataset, including the data type of each attribute, can be obtained with the following command:

> str(mpg)
'data.frame':  234 obs. of  11 variables:
 $ manufacturer: chr  "audi" "audi" "audi" "audi" ...
 $ model       : chr  "a4" "a4" "a4" "a4" ...
 $ displ       : num  1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
 $ year        : int  1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
 $ cyl         : int  4 4 4 4 6 6 6 4 4 4 ...
 $ trans       : chr  "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
 $ drv         : chr  "f" "f" "f" "f" ...
 $ cty         : int  18 21 20 21 16 18 18 18 16 20 ...
 $ hwy         : int  29 29 31 30 26 26 27 26 25 28 ...
 $ fl          : chr  "p" "p" "p" "p" ...
 $ class       : chr  "compact" "compact" "compact" "compact" ...

The str function serves as a diagnostic alternative to the summary function. It displays the internal structure of an R object in a compact manner.
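To make the contrast concrete, here is a minimal sketch that runs both functions on a small stand-in data frame. The data frame `cars_sample` and the variable `col_types` are hypothetical names chosen for this illustration; the values are a few entries taken from the output above, not the full dataset.

```r
# A small stand-in data frame (sample values only, not the full dataset).
cars_sample <- data.frame(
  model = c("a4", "a4", "a4"),
  displ = c(1.8, 2.0, 2.8),
  cyl   = c(4L, 4L, 6L),
  stringsAsFactors = FALSE
)

# str shows the structure: dimensions, column types, and the first values.
str(cars_sample)

# summary shows per-column statistics, such as quartiles and the mean
# for numeric columns.
print(summary(cars_sample))

# Collect only the column types as a named character vector.
col_types <- sapply(cars_sample, class)
print(col_types)
```

In practice, str is handy for a first look at types and sample values, while summary is more useful once you want the distribution of each numeric column.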
