
Introduction

There's not much data analysis that can be done without data, so the first step in any project is to evaluate the data we have and the data that we need. Once we have some idea of what we'll need, we have to figure out how to get it.

Many of the recipes in this chapter and in this book use Incanter (http://incanter.org/) to import the data and target Incanter datasets. Incanter is a library for statistical analysis and graphics in Clojure, similar to R (http://www.r-project.org/), an open source language for statistical computing. Incanter might not be suitable for every task (for example, we'll use the Weka library for machine learning later), but it is still an important part of our toolkit for doing data analysis in Clojure. This chapter has a collection of recipes that can be used to gather data and make it accessible to Clojure.

For the very first recipe, we'll take a look at how to start a new project. From there, we'll start with very simple formats such as comma-separated values (CSV) and move on to reading data from relational databases using JDBC. Finally, we'll examine more complicated data sources, such as web scraping and linked data (RDF).
