
Title: Mastering Spark for Data Science
Authors: Andrew Morgan, Antoine Amend, David George, Matthew Hallett
Updated: 2021-07-09 18:49:33

Summary

In this chapter, we walked through the full setup of an Apache NiFi GDELT ingest pipeline, complete with metadata forks and a brief introduction to visualizing the resulting data. This section is particularly important because GDELT is used extensively throughout the book, and the NiFi method is a highly effective, scalable, and modular way to source data.

In the next chapter, we will get to grips with what to do with the data once it's landed, by looking at schemas and formats.
