
Introduction

Reproducible data analysis is a cornerstone of good science, and in today's rapidly evolving world of science and technology it is a hot topic. Reproducibility is about lowering barriers for other people who want to repeat and check your work. The extra effort may seem strange or unnecessary, but reproducible analysis is essential to getting your work acknowledged by others. If many people confirm your results, it will have a positive effect on your career. However, reproducible analysis is hard. Reproducibility also has important economic consequences, as you can read in Freedman LP, Cockburn IM, Simcoe TS (2015) The Economics of Reproducibility in Preclinical Research. PLoS Biol 13(6): e1002165. doi:10.1371/journal.pbio.1002165.

So reproducibility is important for society and for you, but how does it apply to Python users? Well, we want to lower barriers for others by:

  • Giving information about the software and hardware we used, including versions.
  • Sharing virtual environments.
  • Logging program behavior.
  • Unit testing the code. This also serves as documentation of sorts.
  • Sharing configuration files.
  • Seeding random generators and making sure program behavior is as deterministic as possible.
  • Standardizing reporting, data access, and code style.
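Three of the points above (reporting versions, logging, and seeding random generators) can be illustrated with the standard library alone. The following is only a minimal sketch and does not use dautil; the function names describe_environment and seeded_sample and the logger name reproducibility_demo are hypothetical names chosen for this illustration.

```python
import logging
import platform
import random
import sys


def describe_environment():
    """Collect basic software and hardware details worth reporting.

    Hypothetical helper for illustration; not part of dautil.
    """
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "system": platform.system(),
        "machine": platform.machine(),
    }


def seeded_sample(seed, n=3):
    """Draw n pseudo-random floats from a generator seeded with `seed`.

    A dedicated random.Random instance keeps the example independent
    of the global random state.
    """
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


# Log to standard output so the run leaves a record of what happened.
logging.basicConfig(level=logging.INFO, stream=sys.stdout)
logger = logging.getLogger("reproducibility_demo")  # hypothetical name

env = describe_environment()
logger.info("Environment: %s", env)

# The same seed should give the same sequence on every run.
first = seeded_sample(42)
second = seeded_sample(42)
logger.info("Same seed gives same numbers: %s", first == second)
```

Recording the environment next to the results, and fixing the seed, lets another person compare their run with yours line by line.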

I created the dautil package for this book, which you can install with pip or from the source archive provided in this book's code bundle. If you are in a hurry, run $ python install_ch1.py to install most of the software for this chapter, including dautil. I created a test Docker image, which you can use if you don't want to install anything except Docker (see the recipe, Sandboxing Python applications with Docker images).
