Perform Preprocessing Independently: fit every preprocessing step (such as scaling or encoding) on the training set only, then apply the fitted transformation to the test set unchanged.
Use Pipelines: implement preprocessing steps within a pipeline so that the same fitted steps are applied consistently and no information leaks from the test set into training.
Handle Missing Values Appropriately: if missing values are imputed, compute the fill values (for example, a column mean or median) from the training set only. Avoid statistics computed over the full dataset or derived from the test set.
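
The three practices above can be sketched together in a few lines. The example below is a minimal illustration, assuming scikit-learn and NumPy are available; the synthetic data, the choice of mean imputation, standard scaling, and logistic regression, and the names pipe, train_means, and learned_means are all assumptions made for the example, not part of the original text.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (illustrative only): 200 rows, 3 features, roughly 10% missing.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X[rng.random(X.shape) < 0.1] = np.nan

# Split first, so nothing fitted below can see the test rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Imputation and scaling live inside the pipeline, so calling fit()
# learns their statistics from the training rows only.
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
    ("model", LogisticRegression()),
])
pipe.fit(X_train, y_train)

# At scoring time the test rows are transformed with the training statistics.
test_accuracy = pipe.score(X_test, y_test)

# Check: the imputer's fill values are the training-set column means.
train_means = np.nanmean(X_train, axis=0)
learned_means = pipe.named_steps["impute"].statistics_
print(np.allclose(learned_means, train_means))
```

The same pipeline object can be passed to a cross-validation routine such as cross_val_score, in which case the preprocessing is refit on the training portion of each fold rather than on the full dataset.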