
Collecting data

This should be somewhat obvious: without at least some data, we cannot perform any of the subsequent steps. One might argue that inference is an exception, but that would be inappropriate. There is no magic in data science. We, as data scientists, don't make something from nothing. Inference (which we'll define later in this chapter) requires at least some data to begin with.

Some aspects of collecting data may be new to a data developer. Data can be collected from many sources, and the number and types of data sources continue to grow daily. In addition, how data is collected might require a new perspective: data for data science isn't always sourced from a relational database, but often from machine-generated log files, online surveys, performance statistics, and so on. Again, the list is ever evolving.
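To make the idea of non-relational sources concrete, here is a minimal sketch that reads two of the source types mentioned above: a machine-generated log and a set of online survey responses. The log layout (timestamp, level, message), the survey column names, and the sample values are all assumptions made up for illustration; real sources will differ.

```python
import csv
import io
import re

# Assumed (hypothetical) log layout: "<timestamp> <LEVEL> <message>"
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


def parse_log_lines(lines):
    """Turn raw log lines into dictionaries, skipping lines that do not match."""
    records = []
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match:
            records.append(match.groupdict())
    return records


def parse_survey_csv(text):
    """Read survey responses from CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


# Made-up sample inputs standing in for a log file and a survey export.
sample_log = [
    "2024-01-05T10:00:00 INFO job started",
    "2024-01-05T10:00:07 ERROR disk quota exceeded",
    "not a log line",
]
sample_survey = "respondent_id,satisfaction\n1,4\n2,5\n"

log_records = parse_log_lines(sample_log)
survey_records = parse_survey_csv(sample_survey)
```

Both functions return plain lists of dictionaries, so records from quite different sources end up in one common shape that later steps can work with. Note that values read from CSV arrive as strings and usually need converting before analysis.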

Another point to ponder: collecting data also involves supplementation. For example, a data scientist might determine that additional demographics need to be added to a particular pool of application data that was previously collected, processed, and reviewed.
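Supplementation can be pictured as attaching extra attributes to records that already exist. The sketch below is one possible way to do this, assuming both data sets share a key; the `supplement` function, the `customer_id` key, and the sample values are hypothetical and serve only as illustration.

```python
def supplement(records, lookup, key):
    """Return copies of records with extra fields merged in from lookup.

    Records whose key has no entry in lookup are kept unchanged, so no
    previously collected data is lost (comparable to a left join).
    """
    enriched = []
    for record in records:
        extra = lookup.get(record[key], {})
        enriched.append({**record, **extra})
    return enriched


# Made-up application data that was collected earlier.
applications = [
    {"customer_id": 101, "product": "loan"},
    {"customer_id": 102, "product": "card"},
]

# Made-up demographics from a second source, indexed by the shared key.
demographics = {
    101: {"age_band": "30-39", "region": "north"},
}

enriched = supplement(applications, demographics, "customer_id")
```

Here the first record gains `age_band` and `region`, while the second is carried through as it was because no demographics exist for it. Keeping unmatched records, rather than dropping them, is a design choice that preserves the original pool of data.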
