
  • Machine Learning in Java
  • AshishSingh Bhatia, Bostjan Kaluza
  • 357 words
  • 2021-06-10 19:29:56

Sampling traps

Data collection involves many traps. To demonstrate one, let me share a story. There is supposed to be a global, unwritten rule for sending regular mail between students for free: if you write "student to student" in the place where the stamp should be, the mail is delivered to the recipient at no charge. Now, suppose Jacob sends a set of postcards to Emma. Because Emma receives some of them, she concludes that all of the postcards were delivered and that the rule holds true. However, she does not know about the postcards that Jacob sent but that never arrived; hence, she cannot account for them in her inference. What Emma experienced is survivorship bias; that is, she drew her conclusion based only on the data that survived. For your information, postcards sent with a "student to student" mark in place of a stamp get a circled black letter T stamped on them, which means that postage is due and the receiver should pay it, including a small fine. However, collecting such a fee often costs the mail service more than the fee is worth, so the fee is frequently not collected (Magalhães, 2010).
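To make the effect concrete, the postcard story can be simulated in a few lines. The following is only a minimal sketch: the class name `SurvivorshipBiasSketch`, the 20 postcards, the 60% delivery rate, and the random seed are all hypothetical values chosen for illustration and do not come from the story or its source.

```java
import java.util.Random;

class SurvivorshipBiasSketch {

    // Simulates sending `sent` postcards, each delivered independently
    // with probability `deliveryRate` (a hypothetical parameter).
    static int countDelivered(int sent, double deliveryRate, long seed) {
        Random random = new Random(seed);
        int delivered = 0;
        for (int i = 0; i < sent; i++) {
            if (random.nextDouble() < deliveryRate) {
                delivered++;
            }
        }
        return delivered;
    }

    // Emma's estimate: she only knows about postcards that arrived,
    // so every postcard in her sample was delivered.
    static double naiveRate(int delivered) {
        return delivered == 0 ? Double.NaN : (double) delivered / delivered;
    }

    // Jacob's estimate: he knows how many postcards were actually sent.
    static double trueRate(int delivered, int sent) {
        return (double) delivered / sent;
    }

    public static void main(String[] args) {
        int sent = 20;             // hypothetical number of postcards
        double deliveryRate = 0.6; // hypothetical delivery probability
        int delivered = countDelivered(sent, deliveryRate, 42L);

        System.out.printf("Emma's estimate:  %.2f%n", naiveRate(delivered));
        System.out.printf("Jacob's estimate: %.2f%n", trueRate(delivered, sent));
    }
}
```

Whenever at least one postcard arrives, Emma's estimate is 1.0 regardless of the delivery rate, because the undelivered postcards never enter her sample. Only the sender's count of the postcards that were mailed makes the true rate recoverable.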

Another example is a study that found that the profession with the lowest average age of death was student. Being a student does not cause you to die at an early age; rather, being a student means you are young, so any student who dies is necessarily young. This is why the average is so low (Gelman and Nolan, 2002).

Furthermore, a study found that only 1.5% of drivers in accidents reported that they were using a cell phone, whereas 10.9% reported that another occupant in the car distracted them. Can we conclude that using a cell phone is safer than speaking with another occupant (Uts, 2003)? To answer this question, we need to know the prevalence of cell phone use among all drivers, not only among those in accidents. It is likely that far more people talked to another occupant in the car while driving than talked on a cell phone during the period when the data was collected.
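The role of prevalence can be shown with a short calculation. By Bayes' rule, dividing a behavior's share among accident drivers by its share among all drivers gives the accident risk for that behavior relative to the average driver. The sketch below uses the 1.5% and 10.9% figures from the text, but the prevalence values (1% for cell phone use and 30% for talking with an occupant) are purely hypothetical assumptions, as is the class name `BaseRateSketch`; the study's actual prevalence figures are not given here.

```java
class BaseRateSketch {

    // Risk relative to the average driver:
    // P(accident | behavior) / P(accident)
    //   = P(behavior | accident) / P(behavior)
    static double relativeRisk(double shareAmongAccidents, double shareAmongDrivers) {
        if (shareAmongDrivers <= 0.0 || shareAmongDrivers > 1.0) {
            throw new IllegalArgumentException("prevalence must be in (0, 1]");
        }
        return shareAmongAccidents / shareAmongDrivers;
    }

    public static void main(String[] args) {
        // Shares among drivers in accidents, as quoted in the text.
        double phoneAmongAccidents = 0.015;
        double occupantAmongAccidents = 0.109;

        // Hypothetical prevalence among all drivers (assumed for illustration).
        double phonePrevalence = 0.01;
        double occupantPrevalence = 0.30;

        System.out.printf("Cell phone relative risk: %.2f%n",
                relativeRisk(phoneAmongAccidents, phonePrevalence));
        System.out.printf("Occupant relative risk:   %.2f%n",
                relativeRisk(occupantAmongAccidents, occupantPrevalence));
    }
}
```

Under these assumed prevalence values, cell phone use comes out riskier than average and talking with an occupant safer than average, which reverses the naive reading of the raw percentages. Different prevalence values could produce a different ranking, which is exactly why the raw accident shares alone cannot answer the question.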
