The Data Science Workshop
Latest chapter: Summary
Where there’s data, there’s insight. With so much data being generated, there is immense scope to extract meaningful information that’ll boost business productivity and profitability. By learning to convert raw data into game-changing insights, you’ll open new career paths and opportunities. The Data Science Workshop begins by introducing different types of projects and showing you how to incorporate machine learning algorithms in them. You’ll learn to select a relevant metric and even assess the performance of your model. To tune the hyperparameters of an algorithm and improve its accuracy, you’ll get hands-on with approaches such as grid search and random search. Next, you’ll learn dimensionality reduction techniques to easily handle many variables at once, before exploring how to use model ensembling techniques and create new features to enhance model performance. In a bid to help you automatically create new features that improve your model, the book demonstrates how to use the automated feature engineering tool. You’ll also understand how to use the orchestration and scheduling workflow to deploy machine learning models in batch. By the end of this book, you’ll have the skills to start working on data science projects confidently.
Contents (121 chapters)
- Cover
- Copyright information
- Preface
- About the Book
- 1. Introduction to Data Science in Python
- Introduction
- Application of Data Science
- Overview of Python
- Python for Data Science
- Scikit-Learn
- Summary
- 2. Regression
- Introduction
- Simple Linear Regression
- Multiple Linear Regression
- Conducting Regression Analysis Using Python
- Multiple Regression Analysis
- Assumptions of Regression Analysis
- Explaining the Results of Regression Analysis
- Summary
- 3. Binary Classification
- Introduction
- Understanding the Business Context
- Feature Engineering
- Data-Driven Feature Engineering
- Correlation Matrix and Visualization
- Summary
- 4. Multiclass Classification with RandomForest
- Introduction
- Training a Random Forest Classifier
- Evaluating the Model's Performance
- Maximum Depth
- Minimum Sample in Leaf
- Maximum Features
- Summary
- 5. Performing Your First Cluster Analysis
- Introduction
- Clustering with k-means
- Interpreting k-means Results
- Choosing the Number of Clusters
- Initializing Clusters
- Calculating the Distance to the Centroid
- Standardizing Data
- Summary
- 6. How to Assess Performance
- Introduction
- Splitting Data
- Assessing Model Performance for Regression Models
- Assessing Model Performance for Classification Models
- The Confusion Matrix
- Receiver Operating Characteristic Curve
- Area Under the ROC Curve
- Saving and Loading Models
- Summary
- 7. The Generalization of Machine Learning Models
- Introduction
- Overfitting
- Underfitting
- Data
- Random State
- Cross-Validation
- cross_val_score
- LogisticRegressionCV
- Hyperparameter Tuning with GridSearchCV
- Hyperparameter Tuning with RandomizedSearchCV
- Model Regularization with Lasso Regression
- Ridge Regression
- Summary
- 8. Hyperparameter Tuning
- Introduction
- What Are Hyperparameters?
- Finding the Best Hyperparameterization
- Tuning Using Grid Search
- GridSearchCV
- Random Search
- Summary
- 9. Interpreting a Machine Learning Model
- Introduction
- Linear Model Coefficients
- RandomForest Variable Importance
- Variable Importance via Permutation
- Partial Dependence Plots
- Local Interpretation with LIME
- Summary
- 10. Analyzing a Dataset
- Introduction
- Exploring Your Data
- Analyzing Your Dataset
- Analyzing the Content of a Categorical Variable
- Summarizing Numerical Variables
- Visualizing Your Data
- Boxplots
- Summary
- 11. Data Preparation
- Introduction
- Handling Row Duplication
- Converting Data Types
- Handling Incorrect Values
- Handling Missing Values
- Summary
- 12. Feature Engineering
- Introduction
- 13. Imbalanced Datasets
- Introduction
- Understanding the Business Context
- Challenges of Imbalanced Datasets
- Strategies for Dealing with Imbalanced Datasets
- Generating Synthetic Samples
- Summary
- 14. Dimensionality Reduction
- Introduction
- Creating a High-Dimensional Dataset
- Strategies for Addressing High-Dimensional Datasets
- Comparing Different Dimensionality Reduction Techniques
- Summary
- 15. Ensemble Learning
- Introduction
- Ensemble Learning
- Simple Methods for Ensemble Learning
- Advanced Techniques for Ensemble Learning
- Summary
Updated: 2021-06-11 18:27:53