官术网_书友最值得收藏!

  • ETL with Azure Cookbook
  • Christian Cote Matija Lah Madina Saitakhmetova
  • 550字
  • 2021-06-11 18:02:40

Loading data before its transformation

ELT is very similar to ETL, but with a crucial difference: the order of the transform and load steps are inverted. This is very useful with big data in the cloud or when we do not have an ETL tool on-premises. This recipe will be much simpler than the previous one, as we'll implement ELT using a database, so no tools are involved here except for calling the ELT task.

It also relies on the previous recipe, Creating a simple ETL package, since we're going to use the SalesLT.CustomerFullName table data to implement the ELT pattern.

There are essentially two parts to this recipe:

  1. Extract and load data into our data lake. Here, we don't have a real data lake; we're using AdventureWorksLT on Azure to mimic the data lake concept.
  2. Transform the data inside the database using a simple SQL script. We're going to add the FullName column to the SalesLT.Customer table and update it using this script.

Getting ready

This recipe assumes that you have installed SSMS, Visual Studio 2019, and SSIS.

How to do it…

Let's dig into the recipe:

  1. In the SSIS solution, right-click on the SSIS packages folder and select New SSIS package. Rename the package from Package1 to SalesLT.Customer.
  2. Drag and drop Execute SQL Task from the Favorites section of the SSIS toolbox onto the control flow and name it SQL_ELT_SalesLT_Customer.
  3. We're going to add the FullName column to SalesLT.Customer. Open SSMS and connect it to our Azure database. Make sure that the context is set to the AdventureWorksLT database as shown in the following screenshot:

    Figure 2.32 – SSMS AdventureWorksLT database context

  4. In the query window, type the following DDL statement:

    ALTER TABLE SalesLT.Customer ADD FullName NVARCHAR(350)

  5. Execute the command and verify that the FullName column has been successfully added in the Object Explorer as shown in the following screenshot:

    Figure 2.33 – SSMS Object Explorer FullName column

  6. Go back to the SSIS package. Double-click on the SQL_ELT_SalesLT_Customer task to open Execute SQL Task Editor. Set the properties as follows:

    a) Connection: cmgr_etlinazurecookbook.database.windows.net.AdventureWorksLT.ETLAdmin

    b) SQL SourceType: Direct input

    c) SQL Statement:

    UPDATE c

    SET FullName = cfn.FullName

    FROM SalesLT.Customer AS c

    INNER JOIN SalesLT.CustomerFullName as cfn

    ON c.CustomerID = cfn.CustomerID;

    Click on OK to close the editor and go back to the control flow.

  7. Execute the package.
  8. Go back into SSMS, create a new query, and type the following SQL statement:

    SELECT [CustomerID]

           ,[FullName]

          ,[NameStyle]

          ,[Title]

          ,[FirstName]

          ,[MiddleName]

          ,[LastName]

          ,[Suffix]

          ,[CompanyName]

          ,[SalesPerson]

          ,[EmailAddress]

          ,[Phone]

      FROM [SalesLT].[Customer]

    The result should look like the following screenshot:

Figure 2.34 – SSMS Query Editor result

How it works…

This recipe showed you the pattern of ELT. Data was extracted from the source first, then loaded into the database (as we saw in the previous recipe, Creating a simple ETL package). We then transformed the SalesLT.Customer data by using already loaded data.

主站蜘蛛池模板: 昌邑市| 青阳县| 泗阳县| 巨鹿县| 读书| 裕民县| 宁乡县| 阜城县| 华宁县| 新安县| 阜新市| 丁青县| 旬阳县| 蒙自县| 姜堰市| 全州县| 仁布县| 宜兰县| 常山县| 莲花县| 平武县| 叶城县| 保德县| 大丰市| 潮州市| 从江县| 石嘴山市| 许昌县| 聊城市| 阿鲁科尔沁旗| 周口市| 安福县| 太谷县| 蒲城县| 江油市| 双城市| 涿鹿县| 务川| 易门县| 松阳县| 三穗县|