Summary

In this chapter, we gave you a brief glimpse into the life of a data scientist: what the work entails and some of the challenges that data scientists consistently face. In light of these challenges, we feel that the Apache Spark project is ideally positioned to help with each stage of the work, from data ingestion and feature extraction/creation to model building and deployment. We intentionally kept this chapter short and light on verbiage because we feel that working through examples and different use cases is a better use of time than speaking abstractly and at length about a given data science topic. Throughout the rest of this book, we will focus on this hands-on approach while giving best-practice tips and recommended reading along the way for readers who wish to learn more. Before embarking on your next data science project, be sure to clearly define the problem, so you can ask an intelligent question of your data and (hopefully) get an intelligent answer!

One awesome website for all things data science is KDnuggets (http://www.kdnuggets.com). Here's a great article on the language all data scientists must learn in order to be successful (http://www.kdnuggets.com/2015/09/one-language-data-scientist-must-master.html).