Getting started

Before discussing the process of tidying data, it is worth pointing out that, however you tidy your data, you should be sure to:

  1. Create and save your scripts so that you can use them again for new or similar data sources. This is referred to as reusability. Why spend time recreating the same code, rules, or logic if you don't have to? This applies both to new data within the project the scripts were developed for and to new projects you may be involved with in the future.
  2. Tidy your data as "far upstream" as possible, perhaps even at the original source. In other words, save and maintain the original data, but use programmatic scripts to clean it, fix mistakes, and save that cleaned dataset for further analysis.
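
As a rough illustration of both points, the following sketch wraps a couple of cleaning rules in a small reusable script that reads a raw CSV file and writes a separate cleaned copy, leaving the original untouched. It is only a minimal sketch under stated assumptions: the function names (tidy_rows, tidy_file), the file names, and the two cleaning rules (trimming whitespace and dropping exact duplicate rows) are hypothetical and stand in for whatever rules your own data requires.

```python
import csv
from pathlib import Path


def tidy_rows(rows):
    """Apply the cleaning rules to an iterable of dict rows.

    The rules here are illustrative assumptions, not a fixed recipe.
    Assumes every row has the same fields as the header.
    """
    seen = set()
    cleaned = []
    for row in rows:
        # Rule 1: trim stray whitespace from column names and values
        # (a missing value becomes an empty string).
        tidy = {key.strip(): (value or "").strip() for key, value in row.items()}
        # Rule 2: skip records that exactly duplicate an earlier one.
        fingerprint = tuple(tidy.items())
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(tidy)
    return cleaned


def tidy_file(raw_path, clean_path):
    """Read the raw file, clean it, and save a separate cleaned dataset.

    Returns the number of cleaned rows written.
    """
    raw_path, clean_path = Path(raw_path), Path(clean_path)
    # Keep the original data: never write the cleaned output over it.
    if raw_path.resolve() == clean_path.resolve():
        raise ValueError("refusing to overwrite the original data")
    with raw_path.open(newline="", encoding="utf-8") as src:
        cleaned = tidy_rows(csv.DictReader(src))
    if not cleaned:
        return 0
    with clean_path.open("w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=list(cleaned[0].keys()))
        writer.writeheader()
        writer.writerows(cleaned)
    return len(cleaned)
```

Because the rules live in tidy_rows, the same script can be rerun when new data arrives, for example with tidy_file("survey_raw.csv", "survey_clean.csv") (hypothetical file names), and the raw file remains available if a rule later turns out to be wrong.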