Data Cleaning

Data cleaning means checking data for errors and inconsistencies that can lower its quality and reliability, then correcting them.

Tasks:

  • Tidy data: make sure each variable is in its own column and each observation is in its own row.
  • Check for and remove duplicate values.
  • Handle missing values.
  • Clean noisy data.
  • Handle outliers if necessary.
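
The tasks above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a recommended pipeline: the records, the field names (`id`, `city`, `temp_c`), and the `clean` helper are all hypothetical, and the specific choices (dropping exact duplicates, the 1.5 × IQR rule for outliers, median imputation for missing values) are common conventions assumed here, not the only options. The input is already tidy, with one row per observation and one key per variable.

```python
from statistics import median, quantiles

# Hypothetical raw records: one dict per observation, one key per variable.
raw_rows = [
    {"id": 1, "city": " Paris ", "temp_c": 21.0},
    {"id": 2, "city": "paris", "temp_c": None},    # missing value
    {"id": 2, "city": "paris", "temp_c": None},    # exact duplicate
    {"id": 3, "city": "PARIS", "temp_c": 19.5},
    {"id": 4, "city": "Paris", "temp_c": 20.5},
    {"id": 5, "city": "Paris", "temp_c": 22.0},
    {"id": 6, "city": "Paris", "temp_c": 250.0},   # probable entry error
]


def clean(rows):
    """Return a cleaned copy of rows; the input list is left untouched."""
    # 1. Remove exact duplicates, keeping the first occurrence.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))

    # 2. Clean noisy text: trim whitespace and normalise capitalisation.
    for row in unique:
        row["city"] = row["city"].strip().title()

    # 3. Drop outliers using the 1.5 * IQR rule (an assumed convention).
    known = [r["temp_c"] for r in unique if r["temp_c"] is not None]
    q1, _, q3 = quantiles(known, n=4, method="inclusive")
    low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    kept = [
        r for r in unique
        if r["temp_c"] is None or low <= r["temp_c"] <= high
    ]

    # 4. Fill missing values with the median of the remaining values.
    fill = median(r["temp_c"] for r in kept if r["temp_c"] is not None)
    for row in kept:
        if row["temp_c"] is None:
            row["temp_c"] = fill
    return kept


cleaned = clean(raw_rows)
for row in cleaned:
    print(row)
```

The order of the steps matters in this sketch: outliers are removed before imputation so that an extreme value does not distort the fill value. Whether to drop, cap, or keep outliers depends on the data and should be decided case by case.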

Learning Material: