Preface

Big data, the Internet of Things, and artificial intelligence have become the hottest technology buzzwords in recent years. Although there are many different terms used to define these technologies, the common concept is that they're all driven by data. Simply having data is not enough; being able to unlock its value is essential. Therefore, data scientists have begun to focus on how to gain insights from raw data.

Data science has become one of the most popular subjects in both academia and industry. However, because data science is a very broad discipline, mastering it can be challenging. A beginner must learn how to prepare, process, aggregate, and visualize data. More advanced techniques involve machine learning, mining various data formats (text, image, and video), and, most importantly, using data to generate business value. The role of a data scientist is demanding and requires a great deal of effort, so a successful data scientist needs a capable tool to help solve day-to-day problems.

In this field, one of the most widely used tools among data scientists is the R language, which is open source and free. As a language designed for statistical computing, it provides many packages for data processing, machine learning, and visualization, allowing users to analyze data on the fly. R helps users quickly perform analysis and run machine learning algorithms on their datasets without knowing every detail of the underlying mathematical models.

R for Data Science Cookbook takes a practical approach to teaching you how to put data science into practice with R. The book has 12 chapters, each of which breaks its topic down into several simple recipes. By following the step-by-step instructions in each recipe, you can apply what you have learned using a variety of R packages.

The first section of this book deals with how to create R functions to avoid unnecessary duplication of code. You will learn how to prepare, process, and perform sophisticated ETL operations on heterogeneous data sources with R packages. A data manipulation example illustrates how to use the dplyr and data.table packages to process larger data structures efficiently, and a section on ggplot2 covers how to create advanced figures for data exploration. You will also learn how to build an interactive report using the ggvis package.

This book also explains how to use data mining to discover items that are frequently purchased together. Later chapters offer insight into time series analysis of financial data and cover machine learning in detail, including classification, regression, clustering, and dimension reduction.

With R for Data Science Cookbook in hand, I can assure you that you will find data science has never been easier.