ETL testing.

ETL (Extraction, Transformation, Loading) Testing.

ETL testing ensures that data transferred from source to target accurately and as per requirements.

Extract

Extract required data from source.

Data can have one or multiple sources with different data formats.

Transform

Transform data into required format.

It involves data changes like simplifying data, removing errors, data deduplication (removing duplicate data) and other mappings.

Load

After transforming move data into target warehouse.

data loading can be done by two methods

1. Full load

Entire data from source is transformed to target.

We need to do full load first time when loading data from source to target.

2. Incremental load

in incremental loads delta between source and target at regular time intervals. It loads data which is extracted after last date only new records will be added.

Example of ETL testing.

We have to transfer one table from source to target with few basic transformations logics

Source table :

IDNameSurnameCityState
1ManojPatilPuneMaharashtra
2AkashPawarMysoreKarnataka
3SwatiPatilMumbaiMaharashtra

In above table we have to apply some transformation logics

  1. Create extra column which indicates username it should be first letter of name and full surname ending with ID.
  2. Another transformation logic is to add city and state to create address column.

So the resulting table will look like below.

IDNameSurnameUsernameAddress
1ManojPatilMPatil1Pune, Maharashtra
2AkashPawarAPawar2Mysore,Karnataka
3SwatiPatilSPatil3Mumbai, Maharashtra

As per given transformation logics we should validate that data has been transformed as per given logics.

to note that organization like banking domain have billions of data in their databases.

Types of validations.

let’s see what types of data validations needs to be done in ETL testing.

there are few basic validations such as null constraint check, duplicate data check, metadata check.

Also some validations are based on business requirements that we can do based on use cases/ test cases.

Null Check.

If there is the requirement for few columns that it should not have null value then we need to run SQL query for null check to verify the columns are.

select * from table_name where column1 is not null;

if above query returns 0 records then given column don’t have null values but result shows records then it’s not as per requirement.

If we are having multiple columns to check not null constraint we can write query as below:

select * from table_name where column1 is not null or column2 is not null or column3 is not null;

Duplicate check.

we need to ensure that given data should not contains duplicate record and for that we need to run query to find out if there is any duplicate record in the table.

To find out duplicate record we need to select all the columns with group by condition as below

Select column1, column2, column3 from table_name group by column1, column2, column3 having count(*) > 1;

If the above query gives zero result then data is not having duplicate values.

Metadata / Table structure validation.

In this validation we need to check if the column names of table and data type is matching as per requirements

if ID column having requirment that it should be integer data type and name should be string then in database also it should match we can validate by comparing from requirements and database

to get column names and data types of table we need to simply run below query.

desc table_name;

Count check.

We need to validate that source data count and target data count is matching as per requirements.

to check records count we can write queries as below.

For source:

select count(*) from source_table;

For target :

select count(*) from target_table;

Mapping Rule Validation.

In this validation we need to ensure all the data transferred into target as per given mapping document. Change in data should be as per given requirements.

For example we have given requirement that transform name column to lower case.

Then we need to run query as required and check the results.

select lower(name) from source_table
Minus
select name from target_table;

Above query will return zero records if the transformation is accurate.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top