- Learning Salesforce Einstein
- Mohith Shrivastava
- 2021-07-02 21:44:02
Architecture and integration with applications
The architecture is well covered in the official documentation at http://predictionio.incubator.apache.org/system/. This section expands on its most important aspects to show how flexible the platform is and what it offers.
The following diagram is from the official documentation of PredictionIO:

The key things to understand from the preceding diagram are as follows:
- Event Server provides a RESTful endpoint to which applications send events in real time. For an application such as a product recommender, events may include a buyer viewing a product, a buyer adding a product to a cart, data from IoT devices, and so on. In the current version of PredictionIO, Event Server can use PostgreSQL 9.1/MySQL 5.1 or Apache HBase/Elasticsearch as the event data store. PredictionIO allows different engines to be used in training, but many algorithms come from Spark's MLlib. For scalable applications with large data volumes, it is better to consider Apache HBase, an open source, distributed, versioned, non-relational database capable of handling billions of transactions as training data.
- Training: PredictionIO uses Apache Spark to train models on the dataset. Apache Spark offers developers extensive API support for working with data structures, and most templates use libraries such as Spark MLlib to directly access machine learning functions developed by data scientists.
- Prediction Server provides a RESTful endpoint to which applications submit queries in real time and receive predictive results. The output of training has two parts: a model and its metadata. The model is then stored in the Hadoop Distributed File System (HDFS), on a local file system, or in Elasticsearch.
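As a rough illustration of how an application might drop an event on Event Server, here is a minimal Python sketch. It assumes a local Event Server on port 7070, a placeholder access key, and a user-to-item event shaped like those in the PredictionIO Event API (`event`, `entityType`, `entityId`, `targetEntityType`, `targetEntityId`, `eventTime`). The helper names `build_event`, `build_event_request`, and `send` are hypothetical and exist only for this example.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timezone

# Assumptions for illustration: a local Event Server and a placeholder key.
EVENT_SERVER_URL = "http://localhost:7070"
ACCESS_KEY = "YOUR_ACCESS_KEY"


def build_event(event, user_id, item_id, event_time=None):
    """Build the JSON body for one user-to-item event, such as a product view."""
    when = event_time or datetime.now(timezone.utc)
    return {
        "event": event,                  # e.g. "view" or "add-to-cart"
        "entityType": "user",
        "entityId": user_id,
        "targetEntityType": "item",
        "targetEntityId": item_id,
        # ISO 8601 timestamp with millisecond precision and a UTC offset
        "eventTime": when.isoformat(timespec="milliseconds"),
    }


def build_event_request(payload, base_url=EVENT_SERVER_URL, access_key=ACCESS_KEY):
    """Wrap the payload in an HTTP POST aimed at the events endpoint."""
    query = urllib.parse.urlencode({"accessKey": access_key})
    return urllib.request.Request(
        url=f"{base_url}/events.json?{query}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=5.0):
    """Send the request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With an Event Server running, a product view could then be recorded with `send(build_event_request(build_event("view", "u1", "i42")))`. Keeping request construction separate from sending makes it possible to inspect the payload without a server.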
HDFS is Hadoop's distributed filesystem; it allows storage to be shared among clustered machines. PredictionIO (PIO) uses it to stage data for batch import, to export Event Server datasets, and to store some models. Elasticsearch is a distributed, RESTful search and analytics engine at the core of the Elastic Stack; it stores your data centrally so that it can be searched and analyzed.
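Once a trained model is stored and deployed, the Prediction Server described above can be queried over REST. The following Python sketch shows one way a client might do that. It assumes an engine deployed locally on port 8000 and a recommendation-style template whose query looks like `{"user": "1", "num": 4}` and whose response carries an `itemScores` list; both shapes depend on the engine template in use. The helper names `build_query_request` and `top_items` are hypothetical.

```python
import json
import urllib.request

# Assumption for illustration: an engine deployed locally on port 8000.
ENGINE_URL = "http://localhost:8000"


def build_query_request(query, base_url=ENGINE_URL):
    """Wrap a query dict in an HTTP POST aimed at the queries endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/queries.json",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def top_items(response_body):
    """Extract (item, score) pairs from a recommendation-style response.

    The "itemScores" key is an assumption about the template's output format;
    a missing key yields an empty list.
    """
    result = json.loads(response_body)
    return [(entry["item"], entry["score"]) for entry in result.get("itemScores", [])]


def predict(query, timeout=5.0):
    """Send the query to the deployed engine (needs a running server)."""
    request = build_query_request(query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return top_items(response.read().decode("utf-8"))
```

With an engine deployed, `predict({"user": "1", "num": 4})` would return the recommended items and their scores for that user. Other templates define their own query and response fields, so `top_items` would need adjusting accordingly.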