
Scientific libraries used in the book

Throughout this book, certain libraries are necessary to implement the machine-learning techniques discussed in each chapter. We briefly describe the most relevant of these libraries here:

  • SciPy is a collection of mathematical methods built on the NumPy array object. It is an open source project, so it continuously gains additional methods written by developers around the world. Python software that employs SciPy routines becomes part of advanced projects or applications comparable to those built with similar frameworks such as MATLAB, Octave, or RLab. A wide range of methods is available, from data manipulation and visualization functions to parallel computing routines, which enhances the versatility and power of the Python language (a short usage sketch follows this list).
  • scikit-learn (sklearn) is an open source machine learning module for the Python programming language. It implements a variety of algorithms for clustering, classification, and regression, including support vector machines, Naive Bayes, decision trees, random forests, k-means, and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and it interacts natively with numerical Python libraries such as NumPy and SciPy. Although most of the routines are written in Python, some functions are implemented in Cython to achieve better performance. For instance, support vector machines and logistic regression are written in Cython, wrapping external libraries (LIBSVM, LIBLINEAR). A short classifier sketch follows this list.
  • The Natural Language Toolkit (NLTK) is a collection of libraries and functions for Natural Language Processing (NLP) in Python. NLTK is designed to support research and teaching in NLP and related topics, including artificial intelligence, cognitive science, information retrieval, linguistics, and machine learning. It also features a series of text processing routines for tokenization, stemming, tagging, parsing, semantic reasoning, and classification. NLTK includes sample code, sample data, and interfaces to more than 50 corpora and lexical databases (a tokenization example follows this list).
  • Scrapy is an open source web crawling framework for the Python programming language. Originally designed for scraping websites, it can also be used as a general-purpose crawler or to extract data through APIs. A Scrapy project is built around spiders, self-contained crawlers that are given a set of instructions. It also features an interactive crawling shell that allows developers to test their scraping code before actually implementing a full spider (a minimal spider sketch follows this list). Scrapy is currently maintained by Scrapinghub Ltd., a web scraping development and services company.
  • Django is a free and open source web application framework implemented in Python, following the model-view-controller architectural pattern. Django is designed for the creation of complex, database-driven websites. It also allows us to manage the application through an administrative interface, which can create, read, update, or delete data used by the application (a minimal model sketch follows this list). A number of well-established websites currently use Django, such as Pinterest, Instagram, Mozilla, The Washington Times, and Bitbucket.
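
As an illustration of the kind of routine SciPy provides, the following minimal sketch uses the scipy.optimize module to find the minimum of a simple function; the function itself is just an example, not taken from a later chapter:

    from scipy import optimize

    # A simple quadratic with its minimum at x = 2.
    def f(x):
        return (x - 2.0) ** 2 + 1.0

    # minimize_scalar searches for the value of x that minimizes f.
    result = optimize.minimize_scalar(f)
    print(result.x)    # approximately 2.0
    print(result.fun)  # approximately 1.0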
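
The next sketch shows a typical scikit-learn workflow, training one of the support vector machine classifiers mentioned above on the iris dataset bundled with the library; depending on the installed version, train_test_split may live in sklearn.cross_validation rather than sklearn.model_selection:

    from sklearn import datasets
    from sklearn.svm import SVC
    from sklearn.model_selection import train_test_split

    # Load the bundled iris dataset and split it into training and test sets.
    iris = datasets.load_iris()
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data, iris.target, test_size=0.3, random_state=0)

    # Fit a linear support vector machine and evaluate it on the held-out data.
    clf = SVC(kernel='linear')
    clf.fit(X_train, y_train)
    print(clf.score(X_test, y_test))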
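
A small NLTK example of tokenization, part-of-speech tagging, and stemming is sketched below, assuming the tokenizer and tagger resources have already been fetched once with nltk.download(); the sentence is purely illustrative:

    import nltk
    from nltk.stem import PorterStemmer

    sentence = "The cats are chasing the mice in the garden."

    # Tokenization and part-of-speech tagging.
    tokens = nltk.word_tokenize(sentence)
    print(nltk.pos_tag(tokens))

    # Stemming each token with the Porter stemmer.
    stemmer = PorterStemmer()
    print([stemmer.stem(token) for token in tokens])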
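
A minimal Scrapy spider looks like the sketch below; the spider name, start URL, and CSS selector are illustrative placeholders. It can be run without creating a full project via scrapy runspider followed by the file name:

    import scrapy

    class TitleSpider(scrapy.Spider):
        # The name identifies the spider; start_urls lists the pages to crawl.
        name = 'title_spider'
        start_urls = ['http://example.com']

        def parse(self, response):
            # The parse callback receives each downloaded page and yields
            # the extracted data (here, just the page title).
            yield {'title': response.css('title::text').extract_first()}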
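
Finally, the sketch below gives an idea of how a Django application describes its data and exposes it to the administrative interface; in a real project the model would live in an app's models.py and the registration in admin.py, and the Article model here is purely hypothetical:

    from django.db import models
    from django.contrib import admin

    class Article(models.Model):
        # Each field maps to a column in the underlying database table.
        title = models.CharField(max_length=200)
        body = models.TextField()
        published = models.DateTimeField(auto_now_add=True)

        def __str__(self):
            return self.title

    # Registering the model lets the admin interface create, read,
    # update, and delete Article records.
    admin.site.register(Article)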