
Summary

In this chapter, we discussed the non-functional requirements for data storage solutions. A data lake, which is an evolution of the data warehouse, consists of multiple layers, each with its own requirements and therefore its own technology choices. We discussed the key requirements for a raw data store, where primarily flat files need to be stored robustly; for a historical database, where temporal information is saved; and for analytics data stores, where fast querying is necessary. We also explained the requirements for a streaming data engine and for a model development environment. In all cases, requirements management is an ongoing process in an AI project. Rather than setting all the requirements in stone at the start of the project, architects and developers should be agile, revisiting and revising the requirements after every iteration.

In the next chapter, we will connect the layers of the architecture we have explored in this chapter by creating a data processing pipeline that transforms data from the raw data layer to the historical data layer and to the analytics layer. We will do this to ensure that all the data has been prepared for use in machine learning models. We will also cover data preparation for streaming data scenarios.
