Preface

As data scientists and machine learning professionals, our job is to build models for detecting fraud, predicting customer churn, or, more broadly, turning data into insights. To do this, we sometimes need to process huge amounts of data and handle complicated computations. We are therefore always excited to see new computing tools, such as Spark, and we spend a lot of time learning about them. Plenty of learning materials are available for these tools, but most of them take a computing perspective and are often written by computer scientists.

As users of Spark, we data scientists and machine learning professionals are more concerned with how the new systems can help us build models with better predictive accuracy, and how they can make data processing and coding easier for us. This is the main reason why this book was developed, and why it was written by a data scientist.

At the same time, we have already developed our own frameworks and processes, and have used some good model-building tools, such as R and SPSS. We understand that some of the new tools, such as Spark's MLlib, may replace certain old tools, but not all of them. Using Spark together with our existing tools is therefore essential to us as Spark users. It is one of the main focuses of this book, and one of the critical elements that make this book different from other Spark books.

Overall, this is a Spark book written by a data scientist for data scientists and machine learning professionals, with the aim of making machine learning with Spark easy for us.
