
  • Machine Learning with R
  • Brett Lantz
  • 2021-07-23 15:49:47

Vectors

The fundamental R data structure is the vector, which stores an ordered set of values called elements. A vector can contain any number of elements. However, all the elements must be of the same type; for instance, a vector cannot contain both numbers and text.
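As a small illustration of the single-type rule, the sketch below shows what c() does when it is handed both a number and text. The name mixed is introduced here purely for illustration. Rather than storing two types side by side, R converts every element to one common type, in this case character.

```r
# Hypothetical example: try to combine a number and a text value
mixed <- c(98.6, "John Doe")

# Both elements are now stored as character strings
print(mixed)

# The vector as a whole has a single type
print(class(mixed))  # "character"
```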

There are several vector types commonly used in machine learning: integer (numbers without decimals), numeric (numbers with decimals), character (text data), and logical (TRUE or FALSE values). There are also two special values: NULL, which indicates the absence of any value, and NA, which indicates a missing value.
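The following sketch checks each of these types with class() and contrasts NA with NULL. All variable names and values are made up for illustration; only the functions c(), class(), length(), and is.na() are standard base R.

```r
# Illustrative values, one vector per type (names are hypothetical)
int_vec <- c(1L, 2L, 3L)          # the L suffix requests integer storage
num_vec <- c(98.1, 98.6, 101.4)
chr_vec <- c("a", "b")
lgl_vec <- c(TRUE, FALSE)

print(class(int_vec))  # "integer"
print(class(num_vec))  # "numeric"
print(class(chr_vec))  # "character"
print(class(lgl_vec))  # "logical"

# NA marks a missing value but still occupies a position in the vector
with_na <- c(98.1, NA, 101.4)
print(length(with_na))  # 3
print(is.na(with_na))   # FALSE TRUE FALSE

# NULL is the absence of any value and has length zero
print(length(NULL))     # 0
```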

It is tedious to enter large amounts of data manually, but simple vectors can be created using the combine function c(). The vector can also be given a name using the arrow operator <-, which is R's assignment operator and is used in much the same way as the = assignment operator in many other programming languages.

For example, let's construct a set of vectors containing data on three medical patients. We'll create a character vector named subject_name containing the three patient names, a numeric vector named temperature containing each patient's body temperature, and a logical vector named flu_status containing each patient's diagnosis: TRUE if the patient has influenza, FALSE otherwise. As shown in the following listing, the three vectors are:

> subject_name <- c("John Doe", "Jane Doe", "Steve Graves")
> temperature <- c(98.1, 98.6, 101.4)
> flu_status <- c(FALSE, FALSE, TRUE)

Because R vectors are inherently ordered, individual records can be accessed by counting the item's position in the set, beginning at 1, and placing this number in square brackets (that is, [ and ]) after the name of the vector. For instance, to obtain the body temperature for patient Jane Doe, who is element 2 in the temperature vector, simply type:

> temperature[2]
[1] 98.6

R offers a variety of convenient methods for extracting data from vectors. A range of values can be obtained using the colon operator. For instance, to obtain the body temperature of Jane Doe and Steve Graves, type:

> temperature[2:3]
[1]  98.6 101.4
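The colon operator is shorthand for a sequence of consecutive integers, so the same range can also be requested with an explicit index vector. A minimal sketch, re-creating temperature from the earlier listing so that it runs on its own (idx is a hypothetical helper name):

```r
temperature <- c(98.1, 98.6, 101.4)

# 2:3 expands to the integer sequence 2, 3
idx <- 2:3
print(idx)

# Indexing with the sequence or with c(2, 3) selects the same elements
print(identical(temperature[idx], temperature[c(2, 3)]))  # TRUE
```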

Items can be excluded by specifying a negative item number. To exclude Jane Doe's temperature data, type:

> temperature[-2]
[1]  98.1 101.4

Finally, it is also sometimes useful to specify a logical vector indicating whether each item should be included. For example, to include the first two temperature readings but exclude the third, type:

> temperature[c(TRUE, TRUE, FALSE)]
[1] 98.1 98.6
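In practice, the logical vector is often one that already exists or is computed from the data rather than typed by hand. The sketch below reuses the three patient vectors from the earlier listing. The name has_fever and the 100-degree cutoff are assumptions made only for illustration.

```r
subject_name <- c("John Doe", "Jane Doe", "Steve Graves")
temperature <- c(98.1, 98.6, 101.4)
flu_status <- c(FALSE, FALSE, TRUE)

# Use the diagnosis vector to keep only the patients with influenza
print(subject_name[flu_status])  # "Steve Graves"

# A comparison yields a logical vector that can be used the same way;
# the cutoff of 100 is an arbitrary illustrative threshold
has_fever <- temperature > 100
print(has_fever)                 # FALSE FALSE TRUE
print(temperature[has_fever])    # 101.4
```

Because the three vectors are ordered consistently, a logical vector derived from one of them can be used to select the matching records in another.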

As you will see shortly, the vector provides the foundation for many other R data structures. Therefore, knowing the various vector operations is crucial for working with data in R.

Tip

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
