What is The Process of ETL Testing and Its Types
What Is ETL?
ETL stands for Extract Transform and Load. ETL is a process in which data transfer takes place in multiple stages. Data transfer starts from the legacy source to the staging server, from staging to a data warehouse, and finally from a data warehouse to data marts, and then loads the data from the data mart into cubes of browsing.
What Is a Data Warehouse?
Data warehousing is a technique of managing data after collecting it from various sources to provide meaningful business understanding. It allows the planned use of data by combining technologies and components.
The data warehouse is the electronic storage of a huge amount of information through business. It is designed for query and analysis instead of transaction processing. It is a process of transforming data into information and making it available to users in an appropriate manner to make a difference.
Why Do We Need ETL Testing?
If we want to create a data warehouse that combines data from different sources, we will need a way to get that data into the warehouse. Each step in the ETL process is highly vulnerable and subject to errors or loss of data or erroneous transfer of data. This is the main reason to involve the concept of testing in the ETL cycles.
ETL Testing Process
Different stages of the ETL testing process are:-
- Understand business requirements.
- Test Planning and estimation on the basis of a number of tables and volume of the data.
- Designing of test cases from all available inputs, design mapping data, SQL scripts, and preparation of test data.
- Test execution and bug reporting, execution phase includes running ETL jobs, monitoring job runs, SQL script execution, defect logging, defect retesting, and regression testing.
- Summarizing reports and result in analysis, in this phase sign-off is given to promote the job or move to the next phase.
- Test closure
Read Also:- Accessibility Testing: Need, Benefits, and Types
These are the five pillars of the ETL Testing process:
- Upstream data issues.
- Timelines and accuracy of data.
- Security of data.
- History of data and its maintenance.
- Availability of data for decision.
Difference between ETL Testing and Database Testing
ETL Testing | Database Testing |
---|---|
ETL testing verifies whether the data is moved as per requirement or not. | Database testing verifies whether the data is following the rules, as defined in the Data Model. |
It verifies if the counts in the source and target are equal. | It verifies if foreign and primary key relation is maintained and also checks for any orphan records. |
ETL checks if foreign and primary key references are maintained during the ETL process. | Database verifies whether the database is ideally normalized and whether there are no redundant tables. |
It verifies that there is no duplicity in the data loaded into the data warehouse. | It checks for any missing data in the required columns. |
Types of ETL Testing
- Data Transformation Testing: Multiple SQL queries require to be run for each and every row to verify data transformation correctly according to various business requirements and rules.
- Source to Target Testing: It includes matching the count of records and data validation at the source and target system.
- End-User Testing: It includes report generation to verify if the data is as per expectation by cross-checking the data in the target system for report validation.
- Retesting: It is a process of data validation by running the reports again once the bugs and defects are fixed in the target system.
- Production Validation Testing: It includes data validation in the production system & compares it against the source data.
- Data Integration Testing: It makes sure that the data from different sources has been loaded properly into the target system and that all the threshold values check.
- Application Migration Testing: This testing ensures that the ETL application is working fine on moving to a new platform.
Read Also:- How To Learn About API testing with REST Assured
ETL Tools
ETL tools use for building and applying transformation logic to move data from one source to another. They also use for mapping the schemas from source to destination that may occur in various ways. They also transform and clean data before moving data to the destination.
Some of the ETL tools are:
-
- RightData
- Informatica Data Validation
- QuerySurge
- ICEDQ
- Datagaps ETL Validator
- QualiDI
- Talend Open Studio for Data Integration
Creating Test Cases In ETL testing
ETL test case writing includes the following documents:
- ETL Mapping Sheets: This contains all information about the source and destination tables and their These sheets help in creating SQL queries while doing testing.
- Database Schemas of Source and Destination Table: ETL mapping sheets are kept up to date with Database Schema for data validation.
Types of ETL Bugs
- Calculation Bugs: It is reported when the end result is deviated due to some mathematical error.
- Input/Output Bugs: Accept invalid values and rejects valid/ correct values.
- User Interface Bugs: It is related to the GUI of an application like color, font size, alignment, etc
- Hardware Bugs: When the device doesn’t respond properly to the application due to some hardware issues.
- Load Condition Bugs: When the system doesn’t allow multiple users or customers expected load.
Related Articles:-