- The Artificial Intelligence Infrastructure Workshop
- Chinmay Arankalle Gareth Dwyer Bas Geerdink Kunal Gera Kevin Liao Anand N.S.
- 2021-06-11 18:35:26
Summary
In this chapter, we discussed many ways to prepare data for machine learning and other forms of AI. Raw data from source systems has to be transported across the data layers of a modern data lake, including a historical data archive, a set of (virtualized) analytics datasets, and a machine learning environment. There are several tools for creating such a data pipeline: simple scripts and traditional software, ETL tools, big data processing frameworks, and streaming data engines.
We also introduced the concept of feature engineering. This is an important piece of work in any AI system, in which data is prepared for consumption by a machine learning model. Regardless of the programming language and frameworks chosen, an AI team has to spend significant time writing the features and ensuring that the resulting code and binaries are well managed and deployed together with the models themselves.
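As a reminder of what feature engineering looks like in practice, here is a minimal sketch in plain Python. It assumes a hypothetical raw record with a `timestamp` and a `text` field (loosely modeled on a tweet). The function name `build_features` and the chosen features are illustrative assumptions, not code from the chapter's exercises.

```python
from datetime import datetime


def build_features(record):
    """Turn one raw record (a dict) into a flat dict of numeric features.

    The field names "timestamp" and "text" are assumptions made for
    this illustration only.
    """
    ts = datetime.fromisoformat(record["timestamp"])
    text = record["text"]
    return {
        # Time-based features derived from the raw timestamp
        "hour_of_day": ts.hour,
        "is_weekend": int(ts.weekday() >= 5),  # Saturday=5, Sunday=6
        # Simple text-based features
        "text_length": len(text),
        "hashtag_count": sum(1 for word in text.split() if word.startswith("#")),
    }


# Hypothetical raw record, as it might arrive from a source system
raw = {"timestamp": "2021-06-12T18:35:26", "text": "Learning #Spark and #AI today"}
features = build_features(raw)
print(features)
```

Keeping the transformation in a single, versioned function like this is one way to make sure the same feature logic is deployed alongside the model that was trained on it.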
We worked through exercises and activities using Bash scripts, Jupyter Notebooks, Spark, and finally, stream processing with live Twitter data.
In the next chapter, we will look into a less technical but very important topic for data engineering and machine learning: the ethics of AI.