Comparing two proportions

Sometimes, we may want to compare two proportions from two populations. Crucially, we will assume that they are independent of each other. It's difficult to analytically compute the probability that one proportion is less than another, so we often rely on Monte Carlo methods, otherwise known as simulation or random sampling.

We randomly generate the two proportions from their respective posterior distributions, and then track how often one is less than the other. We use the frequency we observed in our simulation to estimate the desired probability.
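In symbols, writing the two proportions as θA and θB and the N simulated pairs with a superscript (i) (notation chosen here for illustration), the estimate described above is roughly:

```latex
P(\theta_A < \theta_B \mid \text{data}) \approx \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\!\left[\theta_A^{(i)} < \theta_B^{(i)}\right]
```

The indicator equals 1 when the simulated θA is the smaller of the pair and 0 otherwise, so the right-hand side is simply the fraction of simulations in which that happened.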

So, let's see this in action; we have two parameters: θA and θB. These correspond to the proportion of individuals who click on an ad from format A or format B. Users are randomly assigned to one format or the other, and the website tracks how many viewers click on the ad in the different formats.

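Format A was shown to 516 visitors, 108 of whom clicked it. Format B was shown to 510 visitors, 144 of whom clicked it. We use the same prior for both θA and θB, which is Beta(3, 3). With a beta prior and binomial data, the posterior is again a beta distribution whose parameters are the prior parameters plus the observed numbers of clicks and non-clicks. The posterior distribution for θA is therefore Beta(3 + 108, 3 + 408) = Beta(111, 411), and for θB it is Beta(3 + 144, 3 + 366) = Beta(147, 369). This results in the following output: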
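The posterior parameters can be checked with a few lines of Python. This is a minimal sketch; the helper name beta_posterior is hypothetical and simply applies the standard conjugate update for a beta prior with binomial data.

```python
def beta_posterior(prior_a, prior_b, successes, trials):
    """Conjugate update (hypothetical helper): Beta prior + binomial data."""
    failures = trials - successes
    return prior_a + successes, prior_b + failures


# Beta(3, 3) prior for both formats, with the click counts from the text
post_a = beta_posterior(3, 3, successes=108, trials=516)
post_b = beta_posterior(3, 3, successes=144, trials=510)

print(post_a)  # (111, 411)
print(post_b)  # (147, 369)
```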

We now want to know the probability that θA is less than θB, which is difficult to compute analytically. We can randomly simulate θA and θB, and then use those draws to estimate this probability. So, let's randomly simulate one θA, as follows:
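A minimal sketch of such a draw, assuming Python's standard-library random.betavariate as the sampler and an arbitrary fixed seed (both are illustrative choices, not taken from the text):

```python
import random

# Fixed seed (an assumption) so that the draw is repeatable
rng = random.Random(0)

# One random value of theta_A from its Beta(111, 411) posterior
theta_a = rng.betavariate(111, 411)
print(theta_a)
```

The value should land near the posterior mean, 111 / 522 ≈ 0.21, but it changes with the seed.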

Then, randomly simulate one θB, as follows:
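The same kind of draw for θB might look like this; again, random.betavariate and the seed are assumptions made for illustration:

```python
import random

# Fixed seed (an assumption) so that the draw is repeatable
rng = random.Random(1)

# One random value of theta_B from its Beta(147, 369) posterior
theta_b = rng.betavariate(147, 369)
print(theta_b)
```

This value should land near the posterior mean, 147 / 516 ≈ 0.28.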

Finally, we're going to do 1,000 simulations by computing 1,000 θA values and 1,000 θB values, as follows:
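One way to sketch the full simulation with only the standard library is shown here. The variable names and the seed are hypothetical, and the exact count of cases where θA is smaller depends on the random draws, so it will vary slightly from one seed to another.

```python
import random

rng = random.Random(42)  # arbitrary seed (an assumption) for repeatability
n_sims = 1000

# 1,000 draws from each posterior distribution
theta_a = [rng.betavariate(111, 411) for _ in range(n_sims)]
theta_b = [rng.betavariate(147, 369) for _ in range(n_sims)]

# Pair the draws up and record whether theta_A < theta_B in each pair
a_less_than_b = [a < b for a, b in zip(theta_a, theta_b)]

count = sum(a_less_than_b)   # number of simulations with theta_A < theta_B
estimate = count / n_sims    # Monte Carlo estimate of P(theta_A < theta_B)
print(count, estimate)
```

Because True counts as 1 and False as 0, the sum is the number of simulations in which θA was smaller, and dividing by the number of simulations gives the estimated probability.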

This is what we end up with; here, we can see how often θA is less than θB: θA was less than θB in 996 of the 1,000 simulations. So, what's the average of this? Well, it is 0.996; this is an estimate of the probability that θA is less than θB. Given this, it seems highly likely that format B has a higher click-through proportion than format A.
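Because 0.996 comes from a finite number of random draws, it carries some simulation error of its own. As a rough check, and assuming the usual binomial standard-error formula for a simulated proportion (a step not taken in the text), that error can be sketched as follows:

```python
import math

p_hat = 0.996   # estimated probability reported in the text
n_sims = 1000   # number of simulations reported in the text

# Approximate Monte Carlo standard error of the estimated probability
std_error = math.sqrt(p_hat * (1 - p_hat) / n_sims)
print(round(std_error, 4))  # 0.002
```

An error of about 0.002 is small compared with the estimate, so running more simulations would refine the figure but is unlikely to change the conclusion.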

That's it for proportions. Next up, we will look at Bayesian methods for analyzing the means of quantitative data.
