- Python: Advanced Predictive Analytics
- Ashish Kumar, Joseph Babcock
- 2021-07-02 20:09:27
Summary
In this chapter, we skimmed through the basic concepts of statistics. Here is a brief summary of the concepts we learned:
- Hypothesis testing is used to assess whether the data provide statistically significant evidence against a hypothesis. The hypothesis that already exists or is assumed to be true is the null hypothesis; the one that is being proposed as an alternative premise is the alternative hypothesis.
- One needs to calculate a statistic and the associated p-value to conduct the test.
- In a regression model, hypothesis testing (via p-values) is used to check whether each estimated coefficient is significantly different from zero.
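- The chi-square test is used to test for an association (not a causal relationship) between two categorical variables, such as a predictor and an output variable. It can also be used as a goodness-of-fit test, to check whether observed data are consistent with an expected distribution (for example, whether the data look fair or fabricated).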
- The correlation coefficient ranges from -1 to 1. The closer it is to either extreme, the stronger the linear relationship between the two variables; a value near 0 indicates little or no linear relationship.
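The ideas in this list can be illustrated with a minimal sketch using only the Python standard library. The data, the 2x2 table, and the helper names `pearson_r` and `chi_square_2x2` are hypothetical and chosen for illustration; they are not taken from the chapter. The p-value shortcut used here is valid only for a chi-square statistic with 1 degree of freedom (a 2x2 table), and no continuity correction is applied.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient; the result always lies in [-1, 1]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of independence.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical paired observations (illustration only).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
r = pearson_r(hours, scores)
print(f"correlation coefficient: {r:.3f}")

# Hypothetical counts: rows are two groups, columns are two outcomes.
table = [[30, 10], [20, 40]]
stat, p_value = chi_square_2x2(table)
print(f"chi-square statistic: {stat:.3f}, p-value: {p_value:.6f}")

# Null hypothesis: group and outcome are independent.
alpha = 0.05  # assumed significance level
reject_null = p_value < alpha
print("reject the null hypothesis" if reject_null else "fail to reject the null hypothesis")
```

A small p-value here indicates an association between the two categorical variables; it does not by itself establish that one causes the other.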
Linear regression belongs to the family of supervised algorithms, because the dataset on which such a model is built has an output variable. In a sense, this output variable governs, or supervises, the development of the model, hence the name. More on this is covered in the next chapter.
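As a rough sketch of what "supervised" means in practice, the following fits a straight line by ordinary least squares in plain Python. The data and the helper name `fit_line` are hypothetical; the point is only that the known output values `ys` drive the estimation of the coefficients.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The output variable ys enters both estimates: it "supervises" the fit.
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data generated from y = 2x + 1 (illustration only).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(f"slope: {slope:.2f}, intercept: {intercept:.2f}")
```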