
Data analysis packages for Scala

By data analysis packages, we mean software designed to analyze data in some way. A package that performs simple statistical regression is one example; software implementing machine-learning algorithms is another.
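To make "a simple statistical regression" concrete, here is a minimal sketch of an ordinary least-squares line fit written in plain Scala, with no external library. The object name `SimpleRegression` and its `fit` method are hypothetical names chosen for this illustration; they do not come from any of the packages described in this section, which offer far more complete implementations.

```scala
// Illustrative only: a least-squares fit of y = intercept + slope * x.
// SimpleRegression and fit are hypothetical names, not a library API.
object SimpleRegression {

  /** Returns (intercept, slope) of the least-squares line through the points. */
  def fit(xs: Seq[Double], ys: Seq[Double]): (Double, Double) = {
    require(xs.length == ys.length && xs.length >= 2,
      "need at least two paired observations")

    val n = xs.length.toDouble
    val meanX = xs.sum / n
    val meanY = ys.sum / n

    // Sum of cross-deviations and sum of squared deviations of x.
    val sxy = xs.zip(ys).map { case (x, y) => (x - meanX) * (y - meanY) }.sum
    val sxx = xs.map(x => (x - meanX) * (x - meanX)).sum
    require(sxx != 0.0, "x values must not all be identical")

    val slope = sxy / sxx
    (meanY - slope * meanX, slope)
  }
}
```

For example, `SimpleRegression.fit(Seq(1.0, 2.0, 3.0), Seq(3.0, 5.0, 7.0))` fits points that lie exactly on the line y = 1 + 2x, so it returns an intercept of 1.0 and a slope of 2.0.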

Saddle

Saddle is Scala's answer to R and Python's pandas package. It reads structured data in several formats, including CSV and HDF5. The data is loaded into frames, which can then be manipulated much as in those tools. You can run statistical analyses on the frames and build your own statistical methods on top of the data structures Saddle provides. Saddle is examined in detail in a separate chapter dedicated to it. It can be found at the following website:

https://saddle.github.io/

MLlib

Apache's MLlib library provides machine-learning algorithms for the Spark platform. The library can be accessed from Scala as well as from Java and Python. It supports basic statistical methods for data analysis, various regression and classification methods, clustering via k-means, dimensionality reduction, and optimization methods. The number of algorithms in the library is constantly growing. The MLlib library can be found at the following website:

http://spark.apache.org/mllib/
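As a rough picture of what "clustering via k-means" computes, the following sketch runs a few iterations of Lloyd's algorithm in plain Scala on an in-memory collection. It is not MLlib's API or implementation: `KMeansSketch`, `nearest`, and `cluster` are hypothetical names, the starting centers are passed in by the caller, and the distributed execution that Spark provides is left out entirely.

```scala
// Illustrative only: plain-Scala k-means (Lloyd's algorithm), not MLlib code.
// KMeansSketch, nearest, and cluster are hypothetical names.
object KMeansSketch {
  type Point = Vector[Double]

  private def squaredDistance(a: Point, b: Point): Double =
    a.zip(b).map { case (p, q) => (p - q) * (p - q) }.sum

  /** Index of the center closest to `p`. */
  def nearest(centers: Seq[Point], p: Point): Int =
    centers.indices.minBy(i => squaredDistance(centers(i), p))

  /** Refines the starting centers for a fixed number of iterations. */
  def cluster(points: Seq[Point], initial: Seq[Point], iterations: Int): Seq[Point] =
    (1 to iterations).foldLeft(initial) { (centers, _) =>
      // Assignment step: group every point under its nearest center.
      val groups = points.groupBy(p => nearest(centers, p))
      // Update step: move each center to the mean of its assigned points.
      centers.indices.map { i =>
        groups.get(i) match {
          case Some(members) =>
            members.transpose.map(_.sum / members.size).toVector
          case None =>
            centers(i) // a center that attracted no points stays where it is
        }
      }
    }
}
```

Given the four points (0, 0), (0, 2), (10, 10), and (10, 12) with starting centers (0, 0) and (10, 10), the centers settle at (0, 1) and (10, 11), the means of the two obvious groups.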
