書名： Statistics for Data Science
作者名： James D. Miller
本章字數： 163字
更新時間： 2021-07-02 14:58:53

Statistical comparison

Simply put, when you hear the term statistical comparison, one is usually referring to the act of a data scientist performing a process of analysis to view the similarities or variances of two or more groups or populations (or recordsets).

As a data developer, one might be familiar with various utilities such as FC Compare, UltraCompare, or WinDiff, which aim to provide the developer with a line-by-line comparison of the contents of two or more (even binary) files.

In statistics (data science), this process of comparing is a statistical technique to compare populations or recordsets. In this method, a data scientist will conduct what is called an Analysis of Variance (ANOVA), compare categorical variables (within the recordsets), and so on.

ANOVA is an assortment of statistical methods that are used to analyze the differences among group means and their associated procedures (such as variations among and between groups, populations, or recordsets). This method eventually evolved into the Six Sigma dataset comparisons.

官术网_书友最值得收藏!

Statistics for Data Science

Statistical comparison