Book title: Apache Spark Machine Learning Blueprints
Author: Alex Liu
Updated: 2021-07-16 10:39:48

Spark computing for machine learning

With its innovations in RDDs (resilient distributed datasets) and in-memory processing, Apache Spark has made distributed computing readily accessible to data scientists and machine learning professionals. According to the Apache Spark team, Apache Spark can run on the Mesos cluster manager, which lets it share resources with Hadoop and other applications. Apache Spark can also read from any Hadoop input source, such as HDFS.

For these reasons, the Apache Spark computing model is well suited to distributed computing for machine learning. It is an especially good fit for rapid interactive machine learning, parallel computing, and complicated modelling at scale.

According to the Spark development team, Spark's philosophy is to make life easy and productive for data scientists and machine learning professionals. To that end, Apache Spark offers:

Well-documented, expressive APIs
Powerful domain-specific libraries
Easy integration with storage systems
Caching to avoid data movement
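
As a rough illustration of the caching point, the following is a minimal PySpark sketch, not code from the book. It assumes the pyspark package is installed and uses a local master; the function names parse_point and run_demo, the application name, and the sample data are all hypothetical. It shows an RDD being parsed once, marked with cache(), and then reused by two actions.

```python
# Minimal PySpark sketch: cache an RDD so repeated passes reuse in-memory data.
# Assumes the pyspark package is installed; all names and data are illustrative.


def parse_point(line):
    """Turn a comma-separated line such as '1.0,2.5,3.0' into a list of floats."""
    return [float(field) for field in line.split(",")]


def run_demo():
    # Imported here so parse_point stays usable without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "CachingSketch")
    try:
        # A tiny in-memory dataset stands in for real input; sc.textFile(path)
        # could read from a Hadoop source such as HDFS instead.
        lines = sc.parallelize(["1.0,2.0", "3.0,4.0", "5.0,6.0"])

        # cache() marks the parsed RDD for in-memory storage once the
        # first action has computed it.
        points = lines.map(parse_point).cache()

        count = points.count()                    # first action: computes and caches
        total = points.map(lambda p: p[0]).sum()  # second action: reuses cached data
        return count, total
    finally:
        sc.stop()
```

Calling run_demo() in an environment with Spark available runs both actions against the same cached RDD. In an iterative machine learning job, the same pattern keeps the training data in memory across passes instead of re-reading and re-parsing it each time.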

According to an introduction by Patrick Wendell, co-founder of Databricks, Spark was built specifically for large-scale data processing. Apache Spark supports agile data science by allowing rapid iteration, and it integrates easily with IBM and other solutions.
