Below is the code for finding duplicate rows in a DataFrame:
duplicate_su_g = df_ft_g[df_ft_g.duplicated(keep=False)].sort_values(
    by=['effective_date', 'ind_cd', 'ent_id', 'sec_id', 'sec_region', 'sec_sector', 'domain'])
print(duplicate_su_g[['effective_date', 'ind_cd', 'ent_id', 'sec_id', 'sec_region', 'sec_sector', 'is_acwi']])
print('no of duplicate records in transaction_FINAL table ' + str(df_ft_g.duplicated(keep=False).sum()))
Can I use it as a test case in pytest and get a summary of the duplicate records in the pytest report?
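One possible approach, offered as a rough sketch rather than a definitive answer: move the duplicate check into a helper and assert in a test function that the result is empty, putting the count and the offending rows into the assertion message so that pytest prints them in its failure report. The names `find_duplicates`, `load_transactions`, `KEY_COLUMNS` and the sample data are hypothetical placeholders; `load_transactions` stands in for however `df_ft_g` is actually built.

```python
import pandas as pd

# Columns used for sorting and for the report (taken from the question;
# adjust to the real table as needed).
KEY_COLUMNS = ["effective_date", "ind_cd", "ent_id", "sec_id",
               "sec_region", "sec_sector"]


def find_duplicates(df: pd.DataFrame) -> pd.DataFrame:
    """Return every fully duplicated row, sorted by the key columns."""
    return df[df.duplicated(keep=False)].sort_values(by=KEY_COLUMNS)


def load_transactions() -> pd.DataFrame:
    # Hypothetical stand-in for loading df_ft_g; the values are made up.
    return pd.DataFrame({
        "effective_date": ["2020-01-31", "2020-01-31", "2020-02-29"],
        "ind_cd": ["A", "B", "A"],
        "ent_id": [1, 2, 1],
        "sec_id": [10, 20, 10],
        "sec_region": ["EU", "US", "EU"],
        "sec_sector": ["Tech", "Energy", "Tech"],
    })


def test_no_duplicate_records():
    df_ft_g = load_transactions()
    duplicates = find_duplicates(df_ft_g)
    # The message is only shown when the assertion fails, so the pytest
    # report lists the duplicate rows exactly when there are any.
    assert duplicates.empty, (
        f"no of duplicate records in transaction_FINAL table: {len(duplicates)}\n"
        f"{duplicates[KEY_COLUMNS].to_string()}"
    )
```

Running `pytest -ra` adds a short summary section for the tests that did not pass. If duplicates should be reported without failing the run, the same message could be emitted with `warnings.warn` instead of `assert`, since pytest collects warnings into a warnings summary.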