ETL testing covers three phases:
1) Extract Data
2) Transform Data
3) Load Data
Extract Data : In this phase, ETL tools read data from the different sources and extract it. Extraction simply means copying data from a source into a temporary (staging) store. The sources are often heterogeneous databases, so during extraction similar data should be collected together and copied to the temporary database. Each data source has its own distinct set of characteristics that must be managed and integrated into the ETL system in order to extract data effectively. Data can be extracted as XML or in any other supported file format, but before the load you have to confirm which format the destination database accepts.
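As a rough illustration of extracting from heterogeneous sources into a temporary store, the sketch below copies rows from a CSV source and an XML source into one staging table. The sample data, the `staging_customers` table and the function names are all assumptions made for this example; a real job would read from files or source databases using whatever ETL tool is in place.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample sources standing in for real files or databases.
CSV_SOURCE = "id,name,country\n1,Anna,Sweden\n2,Bob,USA\n"
XML_SOURCE = (
    "<customers>"
    "<customer id='3'><name>Carl</name><country>Sweden</country></customer>"
    "</customers>"
)

def extract_csv(text):
    """Yield (id, name, country) tuples from CSV text."""
    for row in csv.DictReader(io.StringIO(text)):
        yield int(row["id"]), row["name"], row["country"]

def extract_xml(text):
    """Yield (id, name, country) tuples from XML text."""
    for node in ET.fromstring(text).findall("customer"):
        yield int(node.get("id")), node.findtext("name"), node.findtext("country")

def extract_to_staging(conn):
    """Copy rows from both sources into one staging table; return rows copied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging_customers "
        "(id INTEGER, name TEXT, country TEXT)"
    )
    rows = list(extract_csv(CSV_SOURCE)) + list(extract_xml(XML_SOURCE))
    conn.executemany("INSERT INTO staging_customers VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)

staging = sqlite3.connect(":memory:")  # temporary database for the example
extracted = extract_to_staging(staging)
staged = staging.execute("SELECT COUNT(*) FROM staging_customers").fetchone()[0]
```

Comparing `extracted` with `staged` is the simplest extraction check: the number of rows read from the sources should equal the number of rows that arrived in staging.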
Transform Data : Once the data has been extracted and copied to the temporary database, it is time to transform it into a format that the destination database can load. Always keep in mind that data must be transformed according to the rules of the destination database. For example, the source data might store the country as a full name such as Sweden or USA, while the destination accepts only a short country code such as SE or US. If the names are not converted, this will cause problems during the load.
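The country example can be sketched as a small lookup-based transformation. The mapping below is an assumption (it uses ISO 3166-1 alpha-2 style codes); the real codes are whatever the destination database expects. The function names are hypothetical. Rows that cannot be mapped are set aside instead of being passed on to the load.

```python
# Assumed mapping from source country names to destination codes.
COUNTRY_CODES = {"Sweden": "SE", "USA": "US"}

def transform_country(name):
    """Return the destination code for a source country name."""
    key = name.strip()
    if key not in COUNTRY_CODES:
        raise ValueError(f"no country code mapping for {name!r}")
    return COUNTRY_CODES[key]

def transform_rows(rows):
    """Split rows into (transformed, rejected) lists.

    Each row is a dict with a 'country' key. Rows whose country has no
    mapping are returned unchanged in the rejected list.
    """
    transformed, rejected = [], []
    for row in rows:
        try:
            transformed.append({**row, "country": transform_country(row["country"])})
        except ValueError:
            rejected.append(row)
    return transformed, rejected

ok, bad = transform_rows([
    {"id": 1, "country": "Sweden"},
    {"id": 2, "country": " USA "},
    {"id": 3, "country": "Atlantis"},  # deliberately unmapped
])
```

A useful test here is to assert that the rejected list is empty for a production batch, or at least that every rejected row is reported before the load starts.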
Load Data : Finally, once the data has been extracted and transformed, it is ready to be loaded into the destination database, usually a data warehouse. Data can be loaded as often as your requirements demand: once, daily, weekly, or monthly. The load can be scheduled, depending on the tool you use. One more important thing to keep in mind during the load is that data may already exist in the destination database, so the new data could either overwrite the old data or be added alongside it as duplicates. Duplicated data can cause problems later.
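One possible way to keep repeated loads from duplicating existing rows is to key the target table and overwrite on conflict. The sketch below uses SQLite's `INSERT OR REPLACE` purely as an illustration; the `customers` table and the `load` function are assumptions, and other databases offer their own merge or upsert statements.

```python
import sqlite3

def load(conn, rows):
    """Load (id, name, country) rows, overwriting rows that share an id.

    Returns the number of rows in the target table after the load.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    with conn:  # commits on success, rolls back if the batch fails
        conn.executemany(
            "INSERT OR REPLACE INTO customers (id, name, country) VALUES (?, ?, ?)",
            rows,
        )
    return conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

warehouse = sqlite3.connect(":memory:")
first = load(warehouse, [(1, "Anna", "SE"), (2, "Bob", "US")])
# The second batch repeats id 2, so that row is overwritten, not duplicated.
second = load(warehouse, [(2, "Robert", "US"), (3, "Carl", "SE")])
```

Whether overwriting is the right behavior depends on the requirement; if history must be kept, the old row would be archived or versioned instead of replaced.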
There are a few key points to take care of when you perform ETL testing:
- Make sure the data is extracted or copied successfully, without failures.
- No data should be lost while loading into the target database.
- Data should not be duplicated in the target database.
- Data should be inserted in the proper format and into the proper rows and columns.
- The data volume should not exceed the capacity of the database.
- Once the data has loaded successfully, check performance and the behavior of the loaded data.
- Try to avoid loading all the data at once; it could cause data loss.
- Always take a backup of the existing data in the destination database.
- Make sure a data recovery rule is defined for the destination database in case any data is lost or not loaded successfully.
- Make sure all the data is loaded in the right place.
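Some of these checks (no lost rows, no duplicates) can be automated with simple reconciliation queries. The following is a minimal sketch, assuming a staging table and a target table that share a key column; the table names, the `check_load` function and the sample data are hypothetical.

```python
import sqlite3

def check_load(conn, source_table, target_table, key):
    """Compare a source table with a target table after a load.

    Table and column names are interpolated into the SQL, so they must
    come from trusted code, never from user input.
    """
    def count(table):
        return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

    # Keys that occur more than once in the target.
    duplicates = conn.execute(
        f"SELECT {key} FROM {target_table} "
        f"GROUP BY {key} HAVING COUNT(*) > 1 ORDER BY {key}"
    ).fetchall()
    # Keys present in the source but absent from the target.
    missing = conn.execute(
        f"SELECT s.{key} FROM {source_table} s "
        f"LEFT JOIN {target_table} t ON s.{key} = t.{key} "
        f"WHERE t.{key} IS NULL ORDER BY s.{key}"
    ).fetchall()
    return {
        "source_rows": count(source_table),
        "target_rows": count(target_table),
        "duplicate_keys": [row[0] for row in duplicates],
        "missing_keys": [row[0] for row in missing],
    }

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (id INTEGER)")
conn.execute("CREATE TABLE target (id INTEGER)")
conn.executemany("INSERT INTO staging VALUES (?)", [(1,), (2,), (3,)])
# Faulty load for the demo: id 2 was loaded twice and id 3 was lost.
conn.executemany("INSERT INTO target VALUES (?)", [(1,), (2,), (2,)])
report = check_load(conn, "staging", "target", "id")
```

In this demo the row counts match (3 and 3) even though one key is duplicated and another is missing, which is why a row count comparison alone is not a sufficient test.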