Data preprocessing is an important part of automated machine learning, because building a usable dataset for prediction and classification is among the most time-consuming tasks in data science. Most machine learning algorithms work only with well-structured data, yet most real-world data needs considerable work before it is usable.
In this 30-minute webinar, we’ll examine some of the most common data preprocessing tasks: handling missing values, scaling feature values for algorithms that need it, encoding cyclic features, and removing low-variance or highly correlated features. We’ll also look at some of the ways AutoML tools put the “auto” into data preprocessing, with a detailed look at how Auger approaches it.
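For readers who want a concrete picture of these tasks before the session, here is a minimal sketch in plain Python. It is an illustration only, not Auger's implementation: the function names (`impute_mean`, `min_max_scale`, `encode_cyclic`, `drop_low_variance`) are hypothetical, mean imputation and min-max scaling are just one common choice each, and correlated-feature removal is left out for brevity.

```python
import math
from statistics import mean, pvariance


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no scale information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def encode_cyclic(value, period):
    """Map a cyclic value (e.g. hour of day) to a point on the unit circle."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def drop_low_variance(columns, threshold=0.0):
    """Keep only columns whose population variance exceeds the threshold."""
    return {
        name: col
        for name, col in columns.items()
        if pvariance(col) > threshold
    }


ages = impute_mean([20, None, 40])       # missing value filled with the mean
scaled_ages = min_max_scale(ages)        # values now lie in [0, 1]
late, early = encode_cyclic(23, 24), encode_cyclic(0, 24)
kept = drop_low_variance({"constant": [1, 1, 1], "varied": [1, 2, 3]})
```

The sine/cosine encoding is what makes cyclic features usable: 23:00 and 00:00 end up close together on the unit circle, whereas the raw values 23 and 0 look far apart to most algorithms.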
Presenter: Vladyslav Khizhanov