
Data analysis packages for Scala

By data analysis packages, we mean software designed to analyze data in some way. A library that performs simple statistical regression is one example; software implementing machine learning algorithms is another.
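To make "a simple statistical regression" concrete, the following is a minimal sketch of an ordinary least-squares line fit written in plain Scala with no external packages. The object name `SimpleRegression`, the method `fit`, and the sample data are illustrative assumptions, not part of any library discussed here; a real analysis package would add diagnostics, error handling, and support for multiple predictors.

```scala
// Hypothetical helper for illustration only; not taken from any package.
object SimpleRegression {

  /** Ordinary least-squares fit of y = intercept + slope * x.
    * Returns the pair (intercept, slope).
    */
  def fit(xs: Seq[Double], ys: Seq[Double]): (Double, Double) = {
    require(xs.length == ys.length, "xs and ys must have the same length")
    require(xs.length >= 2, "at least two points are needed")

    val n = xs.length
    val meanX = xs.sum / n
    val meanY = ys.sum / n

    // Sum of cross-deviations and sum of squared deviations of x.
    val sxy = xs.zip(ys).map { case (x, y) => (x - meanX) * (y - meanY) }.sum
    val sxx = xs.map(x => (x - meanX) * (x - meanX)).sum
    require(sxx != 0.0, "x values must not all be identical")

    val slope = sxy / sxx
    val intercept = meanY - slope * meanX
    (intercept, slope)
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample points that lie exactly on y = 2x + 1.
    val xs = Seq(1.0, 2.0, 3.0, 4.0)
    val ys = Seq(3.0, 5.0, 7.0, 9.0)
    val (intercept, slope) = fit(xs, ys)
    println(s"intercept = $intercept, slope = $slope")
  }
}
```

Because the sample points lie on a straight line, the fit should recover an intercept of 1 and a slope of 2.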

Saddle

Saddle is Scala's answer to R and Python's pandas package. It can read structured data in a variety of formats, including CSV and HDF5. The data can be loaded into frames and then manipulated much as you would in R or pandas. You can perform statistical analysis with the built-in methods or build your own on top of the data structures that Saddle provides. Saddle is examined in detail in a separate chapter dedicated to it. It can be found at the following website:

https://saddle.github.io/
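As a rough illustration of the frame-based workflow described above, here is a minimal sketch that assumes the Saddle 1.3.x API (`org.saddle._`) is on the classpath; later releases may differ. The object name `SaddleSketch`, its helper methods, and the sample values are made up for this example.

```scala
import org.saddle._

// Hypothetical example object; assumes the Saddle 1.3.x API.
object SaddleSketch {

  /** Builds a small frame with row keys (names) and two columns. */
  def build(): Frame[String, String, Double] = {
    // A Series pairs an index of keys with a vector of values.
    val height = Series("alice" -> 1.70, "bob" -> 1.82, "carol" -> 1.65)
    val weight = Series("alice" -> 62.0, "bob" -> 80.0, "carol" -> 58.0)

    // A Frame is a set of Series aligned on a shared row index.
    Frame("height" -> height, "weight" -> weight)
  }

  /** Mean of the second column (position 1, "weight"). */
  def meanWeight(frame: Frame[String, String, Double]): Double =
    frame.colAt(1).toVec.mean

  def main(args: Array[String]): Unit = {
    val frame = build()
    println(frame)
    println(s"mean weight = ${meanWeight(frame)}")
  }
}
```

Here the data is constructed in memory to keep the sketch short; in practice the frame would more often be loaded from a file such as a CSV.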

MLlib

Apache's MLlib library provides machine learning algorithms for the Spark platform. The library can be accessed from Scala as well as from Java and Python. It supports basic statistical methods for data analysis, various regression and classification methods, clustering via k-means, dimensionality reduction, and optimization methods. The number of algorithms in the library is constantly growing. The MLlib library can be found at the following website:

http://spark.apache.org/mllib/
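To give a feel for the k-means clustering mentioned above, the following is a minimal sketch using MLlib's RDD-based API (`org.apache.spark.mllib`). It assumes Spark and MLlib are on the classpath and runs Spark in local mode; the object name `KMeansSketch`, the sample points, and the choice of two clusters and 20 iterations are illustrative assumptions. Newer Spark releases also offer a DataFrame-based API in `org.apache.spark.ml`, which this sketch does not cover.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

// Hypothetical example object; requires Spark core and MLlib as dependencies.
object KMeansSketch {
  def main(args: Array[String]): Unit = {
    // Run Spark locally, using all available cores.
    val conf = new SparkConf().setAppName("KMeansSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    try {
      // Made-up 2D points forming two well-separated groups.
      val points = sc.parallelize(Seq(
        Vectors.dense(0.0, 0.0),
        Vectors.dense(0.1, 0.2),
        Vectors.dense(0.2, 0.1),
        Vectors.dense(9.0, 9.0),
        Vectors.dense(9.1, 9.2),
        Vectors.dense(9.2, 9.1)
      )).cache()

      // Train with k = 2 clusters and at most 20 iterations.
      val model = KMeans.train(points, 2, 20)

      // Print the learned cluster centers.
      model.clusterCenters.foreach(println)

      // Assign a new point to its nearest cluster (index 0 or 1).
      val cluster = model.predict(Vectors.dense(8.5, 9.5))
      println(s"new point assigned to cluster $cluster")
    } finally {
      sc.stop()
    }
  }
}
```

The cluster indices are arbitrary and can vary between runs, so only the grouping of points, not the index values, is meaningful.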
