What you need for this book

To complete the projects in this book, you will need Python 3.5 or higher. I recommend using Anaconda Python, but any Python distribution will do as long as it is up to date and contains the following packages: NumPy, Matplotlib, NetworkX, PyMySQL, Gensim, and NLTK. In Chapter 1, Expanding Your Data Mining Toolbox, we will walk through an easy installation of Python and all these libraries, and each time a library is used later in the book, we will install or upgrade it together.
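
If you would like to check your setup before starting, the following is a minimal sketch, not part of the book's own code, that reports whether the Python version is recent enough and which of the packages listed above cannot be found. The import names in `REQUIRED` are the ones I am assuming for these libraries (for example, PyMySQL is imported as `pymysql`), and the helper names `missing_packages` and `python_is_recent_enough` are hypothetical, chosen only for this illustration.

```python
import importlib.util
import sys

# Assumed import names for the packages listed above
# (PyMySQL, for instance, is imported as "pymysql").
REQUIRED = ["numpy", "matplotlib", "networkx", "pymysql", "gensim", "nltk"]


def missing_packages(names):
    """Return the names that cannot be located on the current Python path."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_recent_enough(minimum=(3, 5)):
    """Return True if the running interpreter is at least the given version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python 3.5 or higher:", python_is_recent_enough())
    print("Missing packages:", missing_packages(REQUIRED))
```

An empty list of missing packages means every library can be imported. The check only looks for each package and does not verify that it is up to date.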

Because data mining is data-centric, and because the data sets we work with are sometimes large or require some type of persistent storage, I chose to implement some of the data mining algorithms alongside a relational database system. I chose MySQL because it is an established piece of infrastructure that is easy to download and install. MySQL comes into play with the memory-intensive algorithms in Chapter 2, Association Rule Mining, and Chapter 3, Entity Matching. I also use MySQL for some of the examples in Chapter 9, Mining for Data Anomalies, but it is possible to go through that chapter without MySQL.
