Apache Mahout

The Apache Mahout project aims to build a scalable machine learning library. It is built atop scalable, distributed architectures, such as Hadoop, and uses the MapReduce paradigm. MapReduce is an approach for processing and generating large datasets with a parallel, distributed algorithm running on a cluster of servers.
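To make the MapReduce paradigm concrete, the following is a minimal, single-machine sketch of its three phases (map, shuffle, reduce) applied to word counting. It is not Hadoop or Mahout code: the class name MapReduceSketch and its methods are hypothetical, and everything runs in memory, whereas a real cluster would spread the map and reduce calls across many servers.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory illustration of the MapReduce phases (not the Hadoop API).
public class MapReduceSketch {

    // Map phase: emit a (word, 1) pair for every word in one input record.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by their key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse the values of one key into a single result.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Chains the three phases; on a cluster each phase would run in parallel.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(emitted).entrySet()) {
            counts.put(group.getKey(), reduce(group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Mahout runs on Hadoop", "Hadoop runs MapReduce jobs");
        System.out.println(wordCount(lines));
    }
}
```

Because each call to map sees only one record and each call to reduce sees only one key, the work can be split across machines without the calls needing to communicate, which is what makes the approach scale.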

Mahout provides a console interface and a Java API to scalable algorithms for clustering, classification, and collaborative filtering. It is able to solve three business problems:

  • Item recommendation: Recommending items, in the style of "People who liked this movie also liked"
  • Clustering: Sorting of text documents into groups of topically-related documents
  • Classification: Learning which topic to assign to an unlabelled document
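As an illustration of the first problem, the sketch below answers "people who liked this movie also liked" by counting how often other items appear alongside a given item in users' liked sets. It is a plain-Java toy, not Mahout's recommender: the class AlsoLikedSketch, its methods, and the sample users and movies are all made up for the example, and simple co-occurrence counting stands in for the similarity measures a real library would offer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical co-occurrence recommender (not the Mahout API).
public class AlsoLikedSketch {

    // Returns up to topN items most often liked together with the given item.
    static List<String> alsoLiked(Map<String, Set<String>> likesByUser, String item, int topN) {
        // TreeMap keeps candidates in alphabetical order before ranking.
        Map<String, Integer> coCounts = new TreeMap<>();
        for (Set<String> liked : likesByUser.values()) {
            if (!liked.contains(item)) {
                continue; // this user says nothing about the item
            }
            for (String other : liked) {
                if (!other.equals(item)) {
                    coCounts.merge(other, 1, Integer::sum);
                }
            }
        }
        // Sort by descending count; the sort is stable, so ties stay alphabetical.
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(coCounts.entrySet());
        ranked.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(topN, ranked.size()); i++) {
            result.add(ranked.get(i).getKey());
        }
        return result;
    }

    // Made-up sample data: user name mapped to the set of movies that user liked.
    static Map<String, Set<String>> sampleLikes() {
        Map<String, Set<String>> likes = new LinkedHashMap<>();
        likes.put("alice", new HashSet<>(Arrays.asList("Alien", "Blade Runner", "Solaris")));
        likes.put("bob", new HashSet<>(Arrays.asList("Alien", "Blade Runner")));
        likes.put("carol", new HashSet<>(Arrays.asList("Alien", "Solaris", "Amelie")));
        likes.put("dave", new HashSet<>(Arrays.asList("Amelie", "Solaris")));
        return likes;
    }

    public static void main(String[] args) {
        System.out.println(alsoLiked(sampleLikes(), "Alien", 2));
    }
}
```

Raw co-occurrence favors items that are popular overall, which is one reason collaborative filtering libraries typically normalize with a similarity measure instead of using counts directly.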

Mahout is distributed under the commercially friendly Apache License, which means that you can use it, including in commercial products, as long as you keep the Apache license included and display it in your program's copyright notice.

Mahout features the following libraries:

  • org.apache.mahout.cf.taste: These are collaborative filtering algorithms based on user-based and item-based collaborative filtering and matrix factorization with ALS
  • org.apache.mahout.classifier: These are in-memory and distributed implementations of classifiers, including logistic regression, Naive Bayes, random forest, hidden Markov models (HMM), and multilayer perceptron
  • org.apache.mahout.clustering: These are clustering algorithms such as canopy clustering, k-means, fuzzy k-means, streaming k-means, and spectral clustering
  • org.apache.mahout.common: These are utility methods for algorithms, including distances, MapReduce operations, iterators, and so on
  • org.apache.mahout.driver: This implements a general-purpose driver to run main methods of other classes
  • org.apache.mahout.ep: This implements evolutionary optimization using recorded-step mutation
  • org.apache.mahout.math: These are various math utility methods and implementations in Hadoop
  • org.apache.mahout.vectorizer: These are classes for data presentation, manipulation, and MapReduce jobs
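
Of the clustering algorithms listed above, k-means is the easiest to show in a few lines. The following is a minimal in-memory sketch of the standard assign-then-update iteration, assuming squared Euclidean distance and caller-supplied starting centroids. It does not use or reproduce the org.apache.mahout.clustering classes; the class KMeansSketch and its method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical in-memory k-means (not the Mahout implementation).
public class KMeansSketch {

    // Squared Euclidean distance between two points of equal dimension.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Index of the centroid closest to the point (lowest index wins ties).
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(point, centroids[c]) < squaredDistance(point, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    // Updates centroids in place and returns the assignment from the last iteration run.
    static int[] cluster(double[][] points, double[][] centroids, int maxIterations) {
        int[] assignment = new int[points.length];
        int dims = centroids[0].length;
        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach every point to its nearest centroid.
            for (int p = 0; p < points.length; p++) {
                assignment[p] = nearest(points[p], centroids);
            }
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[centroids.length][dims];
            int[] counts = new int[centroids.length];
            for (int p = 0; p < points.length; p++) {
                counts[assignment[p]]++;
                for (int d = 0; d < dims; d++) {
                    sums[assignment[p]][d] += points[p][d];
                }
            }
            boolean moved = false;
            for (int c = 0; c < centroids.length; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid
                }
                for (int d = 0; d < dims; d++) {
                    double mean = sums[c][d] / counts[c];
                    if (mean != centroids[c][d]) {
                        centroids[c][d] = mean;
                        moved = true;
                    }
                }
            }
            if (!moved) {
                break; // converged: no centroid changed
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        double[][] points = {{1, 1}, {1, 2}, {8, 8}, {9, 8}};
        double[][] centroids = {{0, 0}, {10, 10}};
        int[] assignment = cluster(points, centroids, 10);
        System.out.println(Arrays.toString(assignment));
        System.out.println(Arrays.deepToString(centroids));
    }
}
```

Both steps are sums over independent points, so they map naturally onto the MapReduce style described earlier: the assignment step resembles a map, and the per-cluster averaging resembles a reduce.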