- ETL with Azure Cookbook
- Christian Cote Matija Lah Madina Saitakhmetova
- 550字
- 2021-06-11 18:02:40
Loading data before its transformation
ELT is very similar to ETL, but with a crucial difference: the order of the transform and load steps are inverted. This is very useful with big data in the cloud or when we do not have an ETL tool on-premises. This recipe will be much simpler than the previous one, as we'll implement ELT using a database, so no tools are involved here except for calling the ELT task.
It also relies on the previous recipe, Creating a simple ETL package, since we're going to use the SalesLT.CustomerFullName table data to implement the ELT pattern.
There are essentially two parts to this recipe:
- Extract and load data into our data lake. Here, we don't have a real data lake; we're using AdventureWorksLT on Azure to mimic the data lake concept.
- Transform the data inside the database using a simple SQL script. We're going to add the FullName column to the SalesLT.Customer table and update it using this script.
Getting ready
This recipe assumes that you have installed SSMS, Visual Studio 2019, and SSIS.
How to do it…
Let's dig into the recipe:
- In the SSIS solution, right-click on the SSIS packages folder and select New SSIS package. Rename the package from Package1 to SalesLT.Customer.
- Drag and drop Execute SQL Task from the Favorites section of the SSIS toolbox onto the control flow and name it SQL_ELT_SalesLT_Customer.
- We're going to add the FullName column to SalesLT.Customer. Open SSMS and connect it to our Azure database. Make sure that the context is set to the AdventureWorksLT database as shown in the following screenshot:
- In the query window, type the following DDL statement:
ALTER TABLE SalesLT.Customer ADD FullName NVARCHAR(350)
- Execute the command and verify that the FullName column has been successfully added in the Object Explorer as shown in the following screenshot:
- Go back to the SSIS package. Double-click on the SQL_ELT_SalesLT_Customer task to open Execute SQL Task Editor. Set the properties as follows:
a) Connection: cmgr_etlinazurecookbook.database.windows.net.AdventureWorksLT.ETLAdmin
b) SQL SourceType: Direct input
c) SQL Statement:
UPDATE c
SET FullName = cfn.FullName
FROM SalesLT.Customer AS c
INNER JOIN SalesLT.CustomerFullName as cfn
ON c.CustomerID = cfn.CustomerID;
Click on OK to close the editor and go back to the control flow.
- Execute the package.
- Go back into SSMS, create a new query, and type the following SQL statement:
SELECT [CustomerID]
,[FullName]
,[NameStyle]
,[Title]
,[FirstName]
,[MiddleName]
,[LastName]
,[Suffix]
,[CompanyName]
,[SalesPerson]
,[EmailAddress]
,[Phone]
FROM [SalesLT].[Customer]
The result should look like the following screenshot:
How it works…
This recipe showed you the pattern of ELT. Data was extracted from the source first, then loaded into the database (as we saw in the previous recipe, Creating a simple ETL package). We then transformed the SalesLT.Customer data by using already loaded data.