Title: R for Data Science Cookbook
Author: Yu Wei Chiu (David Chiu)
Chapter word count: 514
Last updated: 2021-07-14 10:51:29

Adding new records

If you are familiar with databases, you may already know how to perform an insert operation to append a new record to a dataset, or an alter operation to add a new column (attribute) to a table. In R, you can perform both operations, and much more easily. This recipe introduces the rbind and cbind functions, which let you append a new record or a new attribute to the current dataset.

Getting ready

Refer to the Converting data types recipe and convert each attribute of the imported data into the proper data type. Also, rename the columns of the employees and salaries datasets by following the steps in the Renaming the data variable recipe.

How to do it…

Perform the following steps to add a new record or new variable into the dataset:

First, use rbind to insert a new record into employees. On its own, this call only returns the combined data frame and leaves employees unchanged:

> rbind(employees, c(10011, '1960-01-01', 'Jhon', 'Doe', 'M', '1988-01-01'))

We can then reassign the combined results of the data frame employees and new records back to employees:
```
> employees <- rbind(employees, c(10011, '1960-01-01', 'Jhon', 'Doe', 'M', '1988-01-01'))
```
Besides adding a new record to the original dataset, we can add a new position attribute with NA as the default value:
```
> cbind(employees, position = NA)
```

Furthermore, we can add a new age attribute, based on a calculation using the current date and birth_date of each employee:

> library(lubridate)
> span <- interval(ymd(employees$birth_date), now())
> time_period <- as.period(span)
> employees$age <- year(time_period)

Alternatively, we can use the transform function to add multiple variables:

> transform(employees, age = year(time_period), position = "RD", marital = NA)
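
The age calculation in the steps above relies on the lubridate package. As a rough sketch of the same idea in base R only, the following computes age in completed years. The birth dates and the fixed reference date are made-up values chosen for reproducibility, and the variable names `birth_date`, `ref_date`, `b`, and `r` are hypothetical, not part of the recipe.

```r
# Hypothetical sample birth dates; the recipe itself uses employees$birth_date.
birth_date <- as.Date(c("1953-09-02", "1964-06-02"))

# Fixed reference date (an assumption) instead of the current date,
# so that the result does not change from day to day.
ref_date <- as.Date("2021-07-14")

# POSIXlt exposes the year, month, and day-of-month components.
b <- as.POSIXlt(birth_date)
r <- as.POSIXlt(ref_date)

# Year difference, minus one if the birthday has not yet occurred in the reference year.
not_yet <- (r$mon < b$mon) | (r$mon == b$mon & r$mday < b$mday)
age <- (r$year - b$year) - not_yet
age
```

Replacing `ref_date` with `Sys.Date()` gives the age as of today, which is closer to what the lubridate version with now() computes.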

How it works…

As with database operations, we can add a new record to a data frame as long as it follows the schema of the dataset (the number of attributes and the data type of each attribute). Here, we first showed how to use the rbind function to add a new record to a data frame. As the employees dataset consists of six columns, we can add a record with six values to it with rbind. The first column, emp_no, is in integer format, so we do not have to wrap its input value in single quotes. For the first_name and last_name attributes, we can input any character string as a value because we already converted them to character type. Finally, for the gender attribute, which is of factor type, we can only input either M or F as a value.
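
One caveat worth illustrating: c() coerces mixed values to a single character vector, so column types may not survive the rbind call in every case. A possible alternative, sketched below, is to pass the new record as a one-row data frame whose columns already have the intended types. The two existing rows and the name `new_record` are hypothetical stand-ins for the imported employees data, not taken from the recipe.

```r
# Hypothetical two-row stand-in for the employees data frame used in this recipe.
employees <- data.frame(
  emp_no     = c(1L, 2L),
  birth_date = as.Date(c("1953-09-02", "1964-06-02")),
  first_name = c("Ann", "Bob"),
  last_name  = c("Lee", "Ray"),
  gender     = factor(c("F", "M"), levels = c("M", "F")),
  hire_date  = as.Date(c("1986-06-26", "1985-11-21")),
  stringsAsFactors = FALSE
)

# Build the new record as a one-row data frame so each value carries its column's type.
new_record <- data.frame(
  emp_no     = 10011L,
  birth_date = as.Date("1960-01-01"),
  first_name = "Jhon",
  last_name  = "Doe",
  gender     = factor("M", levels = c("M", "F")),
  hire_date  = as.Date("1988-01-01"),
  stringsAsFactors = FALSE
)

# rbind matches the columns by name and keeps their types.
employees <- rbind(employees, new_record)
str(employees)
```

The trade-off is verbosity: the c() form from the recipe is shorter, while the data frame form states the type of every value explicitly.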

In addition to adding a new record to a target dataset, we can add a new variable with the cbind function. To do so, we pass the variable name with a default value when calling cbind. Here, we use NA as the default value for a new position variable. We can also use results calculated from other columns as the value of the new variable. In this demonstration, we first computed each employee's age from their birth date and the current date. Then, we used the dollar sign to assign the computed value to a new attribute, age. Besides using the dollar sign to assign a new variable, we can use the transform function to create the age, position, and marital variables in the employees dataset.
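
Note that cbind and transform both return a new data frame rather than modifying the original, so the result has to be assigned if the new columns should persist. The sketch below shows this on a small made-up data frame; the sample rows, the `with_position` name, and the `birth_year` column are illustrative assumptions, and the year is extracted with base R's format instead of lubridate.

```r
# Hypothetical two-row stand-in for the employees data frame.
employees <- data.frame(
  emp_no     = c(1L, 2L),
  birth_date = as.Date(c("1953-09-02", "1964-06-02"))
)

# cbind returns a new data frame; employees itself is unchanged here.
with_position <- cbind(employees, position = NA)
ncol(employees)      # still 2
ncol(with_position)  # 3

# transform adds several columns at once; assign the result to keep them.
employees <- transform(
  employees,
  position   = "RD",
  marital    = NA,
  birth_year = as.integer(format(birth_date, "%Y"))
)
names(employees)
```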

There's more…

Besides using the dollar sign and transform function, we can use the with function to create new variables:

> with(employees, year(birth_date))
 [1] 1953 1964 1959 1954 1955 1953 1957 1958 1952 1963
> employees$birth_year <- with(employees, year(birth_date))
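
A closely related base R function, within, evaluates a block of assignments against the data frame and returns a modified copy, which can be convenient when several derived variables depend on one another. The following is a minimal sketch on made-up rows; the column names `hire_year` and `year_diff` are hypothetical and not part of the recipe.

```r
# Hypothetical two-row stand-in for the employees data frame.
employees <- data.frame(
  emp_no     = c(1L, 2L),
  birth_date = as.Date(c("1953-09-02", "1964-06-02")),
  hire_date  = as.Date(c("1986-06-26", "1985-11-21"))
)

# Each assignment inside the block becomes a new column in the returned copy.
employees <- within(employees, {
  birth_year <- as.integer(format(birth_date, "%Y"))
  hire_year  <- as.integer(format(hire_date, "%Y"))
  year_diff  <- hire_year - birth_year  # difference in calendar years only
})
employees
```

Unlike with, which returns only the value of the evaluated expression, within returns the whole data frame, so the result is assigned back to employees.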
