I am trying to work out a strategy to automate the testing scenarios below. Data (CSV files) has been ingested from on-premises servers into Amazon S3 and then transformed and loaded (ETL) into Amazon Redshift using AWS Glue.
- Compare data between the on-premises file and the Amazon S3 object (CSV), i.e. compare the entire content of two files that reside on two different servers.
- Compare data between Amazon S3 and Amazon Redshift (after the data has been extracted, transformed and loaded (ETL) from S3 into Redshift).
Please suggest any SIT (system integration testing) framework for testing an on-premises to AWS Cloud migration.
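For the first scenario, one option might be to compare checksums of the two files instead of moving either file to the other server. The following is only a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The function names and the `local_path`, `bucket` and `key` parameters are hypothetical, not part of any framework. It only reports a match when the two files are byte-for-byte identical, so a difference in line endings or encoding counts as a mismatch.

```python
import hashlib


def stream_digest(stream, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a binary file-like object, read in chunks."""
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        digest.update(chunk)
    return digest.hexdigest()


def local_file_digest(path):
    """Digest of a file on the on-premises server."""
    with open(path, "rb") as handle:
        return stream_digest(handle)


def s3_object_digest(bucket, key):
    """Digest of an S3 object, streamed so the whole file is never held in memory."""
    import boto3  # imported lazily so the helpers above work without AWS access

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"]
    return stream_digest(body)


def same_content(local_path, bucket, key):
    """True when the local file and the S3 object have identical bytes."""
    return local_file_digest(local_path) == s3_object_digest(bucket, key)
```

Hashing on each side means only the short digest has to be compared, which may matter when the files are large.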
Is reading the data from S3 and Redshift into pandas DataFrames with Python (in PyCharm) an option, so that we can compare the data in the DataFrames? If so, please suggest how to read it into DataFrames.
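To make the idea concrete, here is a rough sketch of what such a comparison could look like. It assumes pandas and boto3 are installed, and that `con` is a connection that `pandas.read_sql` accepts (for Redshift this would typically be a SQLAlchemy engine, which is an assumption about the setup). The function names `read_s3_csv`, `read_table` and `diff_frames` are hypothetical.

```python
import io

import pandas as pd


def read_s3_csv(bucket, key):
    """Read one CSV object from S3 into a DataFrame (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so diff_frames can be used without AWS access

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"]
    return pd.read_csv(io.BytesIO(body.read()))


def read_table(query, con):
    """Read a query result into a DataFrame; `con` is assumed to point at Redshift."""
    return pd.read_sql(query, con)


def diff_frames(source, target):
    """Return the rows found in only one frame.

    The outer merge joins on all shared column names; the `_merge` column is
    'left_only' for rows only in `source` and 'right_only' for rows only in `target`.
    """
    merged = source.merge(target, how="outer", indicator=True)
    return merged[merged["_merge"] != "both"].reset_index(drop=True)
```

For example, `diff_frames(read_s3_csv("my-bucket", "data/file.csv"), read_table("SELECT * FROM my_table", engine))` would return an empty frame when both sides hold the same rows, where the bucket, key, table and `engine` are placeholders. Two caveats apply to this sketch: column names and dtypes probably need to be normalized first, because the Glue job may rename or cast columns, and a merge like this does not detect a row that is merely duplicated a different number of times on each side.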