Run statistics systematically for each variable and check for outliers. Consider the data collection methods and any bias involved, including which answer options the respondents were given. Check for missing data and decide whether to present the existing data as it is or to fill in the gaps (e.g. with multiple imputation). For example, add a new column that counts the missing values in each entry, then remove every entry with more than one missing value. Check that all entries are valid: look at how long each user took to submit the survey, and use questions that contradict each other, or several questions with similar intent but slightly different wording, to detect users who are clicking answers at random. These invalid entries can then be removed.
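
The checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`age`, `q1`, `q1_reworded`, `seconds`), the sample rows, the 60-second minimum, the allowed gap of 2 points between the paired questions, and the 1.5 x IQR outlier rule are all hypothetical choices that the notes do not specify.

```python
from statistics import quantiles

# Hypothetical survey rows; None marks a missing answer.
rows = [
    {"id": 1, "age": 34, "q1": 4, "q1_reworded": 4, "seconds": 310},
    {"id": 2, "age": None, "q1": 2, "q1_reworded": None, "seconds": 280},
    {"id": 3, "age": 29, "q1": 5, "q1_reworded": 1, "seconds": 295},
    {"id": 4, "age": 41, "q1": 3, "q1_reworded": 3, "seconds": 12},
    {"id": 5, "age": 38, "q1": None, "q1_reworded": 2, "seconds": 330},
]

ANSWER_FIELDS = ("age", "q1", "q1_reworded")
MAX_MISSING = 1    # from the notes: drop entries with more than one missing value
MIN_SECONDS = 60   # assumed threshold for a plausible completion time
MAX_GAP = 2        # assumed tolerance between two similarly worded questions


def count_missing(row):
    """Number of unanswered fields in one entry."""
    return sum(row[field] is None for field in ANSWER_FIELDS)


def is_consistent(row):
    """Compare two questions with the same intent but different wording."""
    a, b = row["q1"], row["q1_reworded"]
    if a is None or b is None:
        return True  # nothing to compare, so do not flag the entry
    return abs(a - b) <= MAX_GAP


def clean(entries):
    """Add an n_missing column and keep only entries that pass every check."""
    kept = []
    for row in entries:
        row = {**row, "n_missing": count_missing(row)}
        if row["n_missing"] > MAX_MISSING:
            continue  # too many gaps
        if row["seconds"] < MIN_SECONDS:
            continue  # submitted implausibly fast
        if not is_consistent(row):
            continue  # contradictory answers suggest random clicking
        kept.append(row)
    return kept


def iqr_outliers(values):
    """Values outside 1.5 * IQR of the quartiles (one common rule of thumb)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - spread or v > q3 + spread]


cleaned = clean(rows)
kept_ids = [row["id"] for row in cleaned]
print(kept_ids)
```

In this sketch an entry with exactly one missing value is kept, which leaves room to fill that gap later by imputation rather than discarding the whole response. The thresholds would need tuning to the actual survey length and answer scale.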