The statistical approach versus the machine learning approach

In 2001, Leo Breiman published a paper titled Statistical Modeling: The Two Cultures (http://projecteuclid.org/euclid.ss/1009213726) that underlined the differences between the statistical approach, which focuses on validating and explaining the underlying process in the data, and the machine learning approach, which is more concerned with the results.

Roughly put, a classic statistical analysis proceeds as follows:

  1. A hypothesis called the null hypothesis is stated. This null hypothesis usually states that the observation is due to randomness.
  2. The probability (or p-value) of observing a result at least as extreme as the one seen, assuming the null hypothesis is true, is then calculated.
  3. If that probability is below a chosen threshold (usually p < 0.05), the null hypothesis is rejected, which means that the observation is judged unlikely to be a random fluke.

p > 0.05 does not imply that the null hypothesis is true. It only means that you cannot reject it, as the probability of the observation happening by chance is not small enough.
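The three steps above can be sketched with a small coin-flip example. This is only an illustration under stated assumptions: the coin scenario, the function names binomial_two_sided_p_value and decide, and the 100-flip figures are hypothetical and not taken from the text. The p-value is computed as an exact two-sided binomial p-value, which is one common convention among several.

```python
from math import comb

ALPHA = 0.05  # the usual rejection threshold mentioned in the text


def binomial_two_sided_p_value(successes, trials, p_null=0.5):
    """Exact two-sided p-value for a binomial observation.

    Sums, under the null hypothesis, the probabilities of every outcome
    that is no more likely than the observed one.
    """
    pmf = [
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # A tiny relative tolerance guards against floating-point ties.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


def decide(successes, trials, alpha=ALPHA):
    """Steps 1-3: null hypothesis 'the coin is fair', p-value, decision."""
    p_value = binomial_two_sided_p_value(successes, trials)
    verdict = "reject" if p_value < alpha else "fail to reject"
    return p_value, verdict


if __name__ == "__main__":
    for heads in (60, 61):
        p_value, verdict = decide(heads, 100)
        print(f"{heads} heads out of 100: p = {p_value:.4f}, {verdict} the null")
```

With these hypothetical numbers, 60 heads out of 100 gives a p-value of roughly 0.057, so the null hypothesis is not rejected, while 61 heads falls below the threshold. Neither outcome proves the coin fair or unfair; the test only says whether the evidence against the null hypothesis is strong enough.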

This methodology is geared toward explaining and discovering the factors that influence the phenomenon. The goal here is to build a somewhat static and fully known model that fits the observations as well as possible and can therefore predict future patterns, behaviors, and observations.

In the machine learning approach used in predictive analytics, an explicit representation of the model is not the focus. The goal is to build the model that predicts best, and the model builds itself from the observations. The internals of the model are not explicit, which is why such a model is often called a black box.

By removing the need for explicit modeling of the data, the ML approach has a stronger potential for predictions. ML focuses on making the most accurate predictions possible by minimizing a model's prediction error, at the expense of explainability.
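The trade-off between an explicit model and raw prediction error can be sketched as follows. Everything here is an assumption made for illustration: the synthetic sine-shaped data, the noise level, the choice of a straight line as the explicit model, and the use of k-nearest neighbors as a stand-in for a model with no explicit formula. The two are compared only on held-out prediction error.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def make_data(n):
    """Synthetic observations: a sine curve plus Gaussian noise (assumed)."""
    xs = [rng.uniform(0.0, 2 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Explicit model: ordinary least squares line with readable parameters."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def knn_predict(x, xs, ys, k=5):
    """Implicit model: average the targets of the k nearest training points."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(predictions, targets):
    """Mean squared prediction error."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


train_x, train_y = make_data(200)
test_x, test_y = make_data(100)

slope, intercept = fit_line(train_x, train_y)
linear_mse = mse([slope * x + intercept for x in test_x], test_y)
knn_mse = mse([knn_predict(x, train_x, train_y) for x in test_x], test_y)

if __name__ == "__main__":
    print(f"explicit line: y = {slope:.3f} * x + {intercept:.3f}")
    print(f"held-out MSE, line: {linear_mse:.4f}")
    print(f"held-out MSE, k-NN: {knn_mse:.4f}")
```

On this assumed data the line yields two parameters that can be read and discussed, but it predicts the curved pattern poorly. The nearest-neighbor predictor offers no formula to inspect, yet its held-out error is lower, which is the kind of exchange of explainability for accuracy described above.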