  • Machine Learning in Java
  • AshishSingh Bhatia, Bostjan Kaluza
  • 357 words
  • 2021-06-10 19:29:56

Sampling traps

Data collection may involve many traps. To demonstrate one, let me share a story. There is supposed to be a global, unwritten rule that regular mail between students can be sent for free: if you write "student to student" in the place where the stamp should be, the mail is delivered to the recipient at no charge. Now, suppose Jacob sends a set of postcards to Emma. Emma receives some of them and concludes that all of the postcards were delivered and that the rule indeed holds true. However, she does not know about the postcards that Jacob sent but that were never delivered; hence, she cannot account for them in her inference. What Emma experienced is survivorship bias; that is, she drew her conclusion based only on the data that survived.

For your information, postcards that are sent with "student to student" written in place of a stamp get a circled black letter T stamped on them, which means that postage is due and that the receiver should pay it, including a small fine. However, mail services often incur higher costs by applying such fees and hence do not apply them (Magalhães, 2010).
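The postcard story can be illustrated with a small simulation. The following is a minimal sketch, not taken from the book: the class name `SurvivorshipBiasDemo`, its method names, the delivery probability of 0.6, and the random seed are all assumptions chosen for illustration. It compares the delivery rate computed from every postcard that was sent (Jacob's view) with the rate computed only from postcards that arrived (Emma's view).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of survivorship bias; all names and numbers are assumptions.
final class SurvivorshipBiasDemo {

    /** Simulates sending postcards; each list entry is true if that postcard was delivered. */
    static List<Boolean> sendPostcards(int sent, double deliveryProbability, long seed) {
        Random random = new Random(seed);
        List<Boolean> outcomes = new ArrayList<>();
        for (int i = 0; i < sent; i++) {
            outcomes.add(random.nextDouble() < deliveryProbability);
        }
        return outcomes;
    }

    /** Sender's view: delivered postcards divided by all postcards sent. */
    static double trueDeliveryRate(List<Boolean> outcomes) {
        long delivered = outcomes.stream().filter(d -> d).count();
        return (double) delivered / outcomes.size();
    }

    /** Receiver's view: the sample contains only postcards that survived delivery. */
    static double rateSeenByReceiver(List<Boolean> outcomes) {
        List<Boolean> observed = new ArrayList<>();
        for (boolean delivered : outcomes) {
            if (delivered) {
                observed.add(delivered); // undelivered postcards never reach the sample
            }
        }
        if (observed.isEmpty()) {
            return Double.NaN; // nothing arrived, so there is nothing to estimate from
        }
        long deliveredInSample = observed.stream().filter(d -> d).count();
        return (double) deliveredInSample / observed.size();
    }

    public static void main(String[] args) {
        // Assumed values: 1000 postcards, 60% chance that each one is delivered.
        List<Boolean> outcomes = sendPostcards(1000, 0.6, 42L);
        System.out.printf("True delivery rate:     %.2f%n", trueDeliveryRate(outcomes));
        System.out.printf("Rate observed by Emma:  %.2f%n", rateSeenByReceiver(outcomes));
    }
}
```

Whatever delivery probability is assumed, the receiver's estimate is 1.0 whenever at least one postcard arrives, because the failures were filtered out before the sample was formed.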

Another example is a study that found that the profession with the lowest average age at death was student. Being a student does not cause you to die at an early age; rather, being a student means you are young, so anyone who dies while a student necessarily dies young. This is why the average is so low (Gelman and Nolan, 2002).

Furthermore, consider a study that found that only 1.5% of drivers in accidents reported that they were using a cell phone, whereas 10.9% reported that another occupant in the car distracted them. Can we conclude that using a cell phone is safer than speaking with another occupant (Uts, 2003)? Not necessarily. To answer this question, we need to know the prevalence of cell phone use. It is likely that, during the period when the data was collected, more people talked to another occupant in the car while driving than talked on a cell phone.
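The role of prevalence can be made concrete with a short calculation. By Bayes' rule, P(accident | activity) / P(accident) = P(activity | accident) / P(activity), so the share of an activity among accidents is only informative once it is divided by how common the activity is among all drivers. The sketch below uses the 1.5% and 10.9% figures from the study; the prevalence values (1% and 20%), the class name `BaseRateDemo`, and the method name are purely hypothetical assumptions, not data from the study.

```java
// Hypothetical illustration of a base-rate comparison; prevalence values are assumptions.
final class BaseRateDemo {

    /**
     * Returns P(activity | accident) / P(activity).
     * A value above 1 means the activity is over-represented among accidents
     * relative to how often it occurs among drivers in general.
     */
    static double liftOverBaseline(double shareAmongAccidents, double prevalence) {
        if (prevalence <= 0.0 || prevalence > 1.0) {
            throw new IllegalArgumentException("prevalence must be in (0, 1]");
        }
        return shareAmongAccidents / prevalence;
    }

    public static void main(String[] args) {
        // Shares among drivers in accidents, as reported in the text.
        double cellPhoneShare = 0.015;
        double occupantShare = 0.109;

        // Assumed prevalence among all drivers; these numbers are NOT from the study.
        double cellPhonePrevalence = 0.01;
        double occupantPrevalence = 0.20;

        System.out.printf("Cell phone lift: %.3f%n",
                liftOverBaseline(cellPhoneShare, cellPhonePrevalence));
        System.out.printf("Occupant lift:   %.3f%n",
                liftOverBaseline(occupantShare, occupantPrevalence));
    }
}
```

Under these assumed prevalences, cell phone use would be over-represented among accidents and talking to an occupant under-represented, which is the opposite of what the raw percentages suggest. With different prevalence values the conclusion could change, which is exactly why the raw percentages alone cannot answer the question.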
