
Landing–staging–target scenario

As mentioned in the preceding section, actual data sources are sometimes unreliable. This is why we add an extra layer, called landing, to the architecture to guard against most of the uncertainties that come from data sources. The landing database is a zone used for only one thing: catching data from data sources regardless of their schema stability, accessibility, or data quality. The following screenshot shows a complete architecture containing the landing database:

As seen in the preceding screenshot, the landing database is added to the staging/target database architecture. The landing database plays a vital role in scenarios in which data sources vary. As an example, take a set of CSV files stored on an FTP site, or web services with XML or JSON responses. The schema of such data is not reliable enough, so the landing database can help us to recognize schema changes between two loads from a data source, and it can also help us to handle data from non-relational data sources.
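To make the idea of recognizing schema changes between two loads more concrete, here is a minimal sketch in Python. It is only an illustration: the file contents, the column names, and the helper functions `read_header` and `diff_schema` are hypothetical and do not come from any particular tool. It compares the header rows of two CSV loads and reports which columns appeared or disappeared.

```python
import csv
import io


def read_header(csv_text):
    """Return the list of column names from the first row of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


def diff_schema(previous, current):
    """Compare two column lists and report added and removed columns."""
    prev_set, curr_set = set(previous), set(current)
    return {
        "added": [c for c in current if c not in prev_set],
        "removed": [c for c in previous if c not in curr_set],
    }


# Two hypothetical loads of the same file (assumed sample data).
load_monday = "id,name,price\n1,apple,0.50\n"
load_tuesday = "id,name,unit_price,currency\n1,apple,0.50,EUR\n"

changes = diff_schema(read_header(load_monday), read_header(load_tuesday))
print(changes)
# {'added': ['unit_price', 'currency'], 'removed': ['price']}
```

In a real landing database, the same comparison would more likely be made against stored metadata from the previous load, but the principle is the same: catch the data first, then compare its shape with what was expected.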

The previous sections described typical database architectures for data transformations. The databases in the staging-target and landing-staging-target scenarios do not have to be separated into isolated instances or even isolated databases; every part of the data transformation can be created simply as a separate schema in a single database. This decision depends on several factors, such as available resources, licenses, or security requirements.
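The single-database variant can be sketched as follows. This is an illustrative example under stated assumptions, not a recommended implementation: it uses Python's built-in `sqlite3` module with an in-memory database, and because SQLite has no named schemas, the table-name prefixes `landing_`, `staging_`, and `target_` stand in for the three schemas. The table names, columns, sample rows, and the `to_typed` helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One database, three zones. The prefixes stand in for separate schemas.
conn.executescript("""
    CREATE TABLE landing_products (id TEXT, name TEXT, price TEXT);
    CREATE TABLE staging_products (id INTEGER, name TEXT, price REAL);
    CREATE TABLE target_products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        price REAL NOT NULL
    );
""")

# Landing catches everything as text, whatever its quality (assumed sample rows).
raw_rows = [("1", " apple ", "0.50"), ("2", "pear", "n/a"), ("x", "plum", "0.30")]
conn.executemany("INSERT INTO landing_products VALUES (?, ?, ?)", raw_rows)


def to_typed(row):
    """Convert a landing row to typed values, or return None if it is invalid."""
    raw_id, raw_name, raw_price = row
    try:
        return int(raw_id), raw_name.strip(), float(raw_price)
    except ValueError:
        return None


# Landing -> staging: keep only rows that can be converted to proper types.
landing = conn.execute("SELECT id, name, price FROM landing_products").fetchall()
typed = [t for t in map(to_typed, landing) if t is not None]
conn.executemany("INSERT INTO staging_products VALUES (?, ?, ?)", typed)

# Staging -> target: load the cleaned rows into the final table.
conn.execute(
    "INSERT INTO target_products SELECT id, name, price FROM staging_products"
)

target_rows = conn.execute(
    "SELECT id, name, price FROM target_products ORDER BY id"
).fetchall()
print(target_rows)
# [(1, 'apple', 0.5)]
```

The landing table keeps all three raw rows, including the two that fail conversion, so rejected data can still be inspected later without going back to the source.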

The question is how to develop data movements between certain data sources and databases. We have a wide set of options available for this.
