The chi-square statistic is defined by the following formula:

$$\chi^2 = \frac{(n - 1)\, s^2}{\sigma^2}$$
Here, n is the size of the sample, s is the standard deviation of the sample, and σ is the standard deviation of the population.
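As a minimal sketch of this definition, the statistic can be computed for a single sample as (n - 1) times the sample variance, divided by the population variance. The function name, the sample values, and the population standard deviation below are hypothetical and chosen only for illustration.

```python
import statistics

def chi_square_statistic(sample, population_sd):
    """Return (n - 1) * s**2 / sigma**2 for one sample."""
    n = len(sample)
    # statistics.variance uses the n - 1 denominator, so it equals s**2.
    s_squared = statistics.variance(sample)
    return (n - 1) * s_squared / population_sd ** 2

# Hypothetical sample of 7 measurements; sigma = 2.0 is assumed to be known.
sample = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0, 5.0]
chi2_value = chi_square_statistic(sample, population_sd=2.0)
print(chi2_value)  # approximately 2.5
```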
If we repeatedly take samples and compute the chi-square statistic for each one, the resulting values follow a chi-square distribution, which is defined by the following probability density function:

$$Y = Y_0 \cdot (\chi^2)^{(v/2 - 1)} \cdot e^{-\chi^2 / 2}$$
Here, Y0 is a constant that depends on the number of degrees of freedom, χ² is the chi-square statistic, v = n - 1 is the number of degrees of freedom, and e is the base of the natural logarithm (approximately 2.71828).
Y0 is defined so that the area under the chi-square curve is equal to one.
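The following sketch illustrates both points numerically. It assumes the standard closed form of the normalizing constant, Y0 = 1 / (2^(v/2) Γ(v/2)), which is not stated in the text above, and checks that the area under the curve is close to one. It also simulates repeated sampling from a normal population and checks that the average statistic is close to v, the mean of a chi-square distribution. The sample size, the population standard deviation, and the number of repetitions are arbitrary choices.

```python
import math
import random
import statistics

def chi_square_pdf(x, v):
    """Y = Y0 * x**(v/2 - 1) * exp(-x/2), where x is the chi-square value."""
    y0 = 1.0 / (2 ** (v / 2) * math.gamma(v / 2))  # assumed closed form of Y0
    return y0 * x ** (v / 2 - 1) * math.exp(-x / 2)

def area_under_curve(v, upper=100.0, steps=100_000):
    """Midpoint-rule approximation of the area between 0 and `upper`."""
    h = upper / steps
    return h * sum(chi_square_pdf((i + 0.5) * h, v) for i in range(steps))

n = 10       # sample size (arbitrary)
v = n - 1    # degrees of freedom
sigma = 2.0  # population standard deviation (arbitrary)

area = area_under_curve(v)

# Repeatedly draw samples and compute the chi-square statistic for each one.
random.seed(0)
simulated = []
for _ in range(5000):
    sample = [random.gauss(0.0, sigma) for _ in range(n)]
    simulated.append((n - 1) * statistics.variance(sample) / sigma ** 2)
mean_statistic = sum(simulated) / len(simulated)

print(round(area, 4))   # close to 1
print(mean_statistic)   # close to v = 9
```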
Chi-square for the goodness of fit
The chi-square test can be used to test whether the observed data differs significantly from the expected data. Let's take the example of a die. The die is rolled 36 times, and the probability that each face lands upwards is 1/6. So, the expected distribution is 36 × 1/6 = 6 occurrences of each of the six faces.
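A minimal sketch of this test follows, using only the standard library. The observed counts are hypothetical, since the original data is not reproduced here, and the function names are illustrative. The p-value comes from a series expansion of the regularized incomplete gamma function. If SciPy is installed, scipy.stats.chisquare(observed, expected) should return the same pair of values.

```python
import math

def chi_square_sf(x, df, terms=200):
    """P(chi-square with df degrees of freedom > x), computed from a series
    for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2, x / 2
    term = 1.0 / a
    total = term
    for k in range(1, terms):
        term *= half_x / (a + k)
        total += term
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return 1.0 - lower

def goodness_of_fit(observed, expected):
    """Return (chi-square value, p-value) for observed vs. expected counts."""
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = chi_square_sf(statistic, df=len(observed) - 1)
    return statistic, p_value

expected = [6, 6, 6, 6, 6, 6]   # 36 rolls * 1/6 for each face
observed = [7, 5, 3, 9, 6, 6]   # hypothetical counts, summing to 36

statistic, p_value = goodness_of_fit(observed, expected)
print(round(statistic, 2), round(p_value, 2))  # 3.33 0.65
```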
The test returns two values: the first is the chi-square value and the second is the p-value, which is very high in this example. This means that we cannot reject the null hypothesis: the observed values do not differ significantly from the expected values.