Excel is another popular tool used to store and analyze data. Of course, one can convert Excel files to CSV files or other text formats by using Excel. Alternatively, to simplify the process, you can use install and load the xlsx package to read and process Excel data in R.
Getting ready
In this recipe, you need to prepare your environment with R installed and a computer that can access the Internet.
How to do it…
Please perform the following steps to read Excel documents:
In this recipe, we covered how to read and write an Excel file containing world development indicators with the xlsx package. In the first step, we needed to install and load the xlsx package, which enables the user to read and write Excel files in the R command prompt through the use of a Java POI package. Thus, to utilize the Java POI package, the installation process will also install rJava and xlsxjars at the same time. You can find the Java POI .jar file under <R Installed Path>\library\xlsx]jars\java. Using the author's computer as an example, which has the Windows 7 operating system installed, the .jar files are located at the C:\Program Files\R\R-3.2.1\library\xlsxjars\java path.
Next, we downloaded world economy indicator data from the link (http://data.worldbank.org/topic/economy-and-growth) with the download.file function. By default, download.file downloads the file in ASCII mode. To download the file in binary mode, we need to set download mode to wb.
After the Excel file is downloaded, we can examine it with Excel. The screenshot of the Excel file shows that the economy indicator starts from row 4 in Sheet 1. Therefore, we can use the read.xlsx2 function to read world economy indicators from this location. The xlsx package provides two functions to read data from Excel: read.xlsx and read.xlsx2. As the read.xlsx2 function mainly processes data in Java, read.xlsx2 performs better (in particular, read.xlsx2 processes data considerably faster on sheets with more than 100,000 cells).
After we have read the contents of the worksheet into an R data frame, we can select variables Country.Name, Country.Code, Indicator.Name, Indicator.Code, and X2014 out of the extracted R data frame. Next, we can use the dim function to examine the dimensions of the data frame. Finally, we can use write.xlsx2 to write transformed data to an Excel file, 2014wbdata.xlsx.