What this book covers

Chapter 1, Data Science Overview, covers what the term data science means, the need for data science, the difference compared with traditional BI/DWH, and the competencies and knowledge required in order to be a data scientist.

Chapter 2, SQL Server 2017 as a Data Science Platform, explains the architecture of SQL Server from a data science perspective: In-Memory OLTP for data acquisition; Integration Services as a transformation feature set; Reporting Services for visualizing input as well as output data; and, probably most important of all, T-SQL as a language for data exploration and transformation, and Machine Learning Services for building the models themselves.

Chapter 3, Data Sources for Analytics, covers relational databases and NoSQL concepts side by side as valuable sources of data that call for different approaches to use. It also provides an overview of technologies such as HDInsight, Apache Hadoop, and Cosmos DB, and of querying against such data sources.

Chapter 4, Data Transforming and Cleaning with T-SQL, demonstrates T-SQL techniques that are useful for making data consumable and complete for further use in data science, along with database architectures that are useful for transformation and cleansing tasks.

Chapter 5, Data Exploration and Statistics with T-SQL, takes a deep dive into T-SQL capabilities, including common grouping and aggregations, framing/windowing, running aggregates, and (if needed) features such as custom CLR aggregates (with performance considerations).

Chapter 6, Custom Aggregations on SQL Server, explains how to create your own aggregations in order to enhance core T-SQL functionality.

Chapter 7, Data Visualization, explains the importance of visualizing data to reveal hidden patterns in it, along with examples using Reporting Services, Power View, and Power BI. As an alternative, an overview of R/Python visualization features is also provided (as these languages will play a vital role later in the book).

Chapter 8, Data Transformations with Other Tools, explains how to use Integration Services, and possibly R or Python, to transform data into a useful format. It covers replacing missing values, detecting mistakes in datasets, normalization and its purpose, categorization, and finally data denormalization using views for better analytic purposes.

Chapter 9, Predictive Model Training and Evaluation, concerns a wide set of predictive models (clustering, N-point Bayes machines, recommenders) and their implementations via Machine Learning Studio, R, or Python.

Chapter 10, Making Predictions, explains how to use the models created, evaluated, and scored in previous chapters. We will also learn how to make a model self-learning from the predictions it has made.

Chapter 11, Getting It All Together – a Real-World Example, demonstrates how to use certain features to grab, transform, and analyze data for a successful data science case.

Chapter 12, Next Steps with Data Science and SQL, summarizes the main points of all the preceding chapters and draws conclusions from them. The chapter also provides ideas on how to continue working with data science, which trends can be expected in the future, and which other technologies will play strong roles in data science.
