Putting “Automating AutoML” to Work

In my last post I discussed why we at Auger believe that AI will eat software. Enterprises will move beyond solving only their biggest problems with painstakingly built predictive models that teams of data scientists spend months on. Instead, every enterprise application can build predictive models wherever it has access to data. Such predictive models can replace hand-coded, rule-of-thumb “business rules” (sort orders, if-then-else statements, switch-case statements, complex menus that users must navigate) and enable optimal decisions.

The enabler for this transition is truly accurate AutoML: accuracy better than that of human data scientists. That is the threshold at which AutoML becomes transformative. First-generation AutoML was primarily used for “baselining”: acting as a target for human data scientists to beat. But a second generation of AutoML tools has emerged that uses Bayesian optimization to search among algorithm/hyperparameter combinations (versus random or grid search). A side effect of our comparisons of AutoML tool accuracy is that we determined that AutoML consistently outperforms human data scientists, even data science contest winners (look at the results for the OpenML winners by comparison).

So what are some use cases for building such predictive models automatically into applications (the A²ML approach)? Here are some we have seen, both from Auger customers and from embedding machine learning into applications in general.

  • show best treatments in order given patient, identified symptoms and quantitative disease markers

For a healthcare non-profit investigating various treatment alternatives for difficult or incurable diseases, the best treatments are a matter of degree. If there were only one and it worked consistently, these diseases would not be considered difficult. Sorting among treatment alternatives based on patient information (city of residence, genetic attributes, age, sex, weight, body mass index) and quantitative markers of the disease (such as protein levels or time between treatments) is a powerful way to show the options available to patients, without the responsibility of “picking one”.

  • suggest learning resources based on attributes of the student, the resource and the learning objective

In education the concept of “efficacy” has long been applied to digital curriculum, and simple statistical models have been deployed to measure how well a particular piece of curriculum (textbook, worksheet, video, game) affects subsequent educational outcomes (such as a test). But the best resource is heavily dependent on attributes of the student (such as their general performance on the same subject matter), the time and location of consumption, and further details of the subsequent event (when, where and how outcomes are measured). For a large educational publisher we built a complex multi-attribute machine learning model with AutoML. The error, measured as unexplained variance (1 − R²), was half that of the prior simple statistical measure of efficacy (R² was 0.88 when it had been 0.76). The AutoML-generated model was programmatically embedded inside a larger resource library recommendation system via A²ML.
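The halving can be checked directly from the two R² values quoted above. This is a minimal sketch that assumes the error in question is the unexplained variance, 1 − R²; the function name is illustrative and not part of any library.

```python
def unexplained_variance(r_squared: float) -> float:
    """Share of outcome variance the model leaves unexplained (1 - R^2)."""
    return 1.0 - r_squared


# R^2 values quoted in the post.
previous = unexplained_variance(0.76)  # prior simple statistical measure
automl = unexplained_variance(0.88)    # AutoML-generated model

# Ratio of new error to old error; 0.5 means the error was halved.
ratio = automl / previous
print(f"{previous:.2f} -> {automl:.2f} (ratio {ratio:.2f})")
```

A gain of 0.12 in R² looks modest on its own, but measured against what the earlier model left unexplained it removes half of the remaining error.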

  • show aviation components that should be scheduled for maintenance

While much aviation maintenance is performed for regulatory compliance, most day-to-day component failures that cause delays and cancellations occur well before regulatory deadlines. Most aviation maintenance software puts work into the queue for scheduling based on a set of criteria: number of flight miles exceeds some limit, takeoffs and landings exceed a threshold, calendar time since last maintenance exceeds some level.

Of course this doesn’t predict true failures well. To provide the best prevention, all of these factors need to be weighted, and machine learning can learn those weights. Models for this problem benefit from the accuracy provided by AutoML. Feeding predictions about impending failures and cost right into the maintenance scheduling system ensures that maintenance jobs are performed optimally. The resulting schedules both reduce component failure and simplify the application logic necessary to define the criteria.
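The difference between threshold criteria and a weighted model can be sketched as follows. Everything in this example is hypothetical: the thresholds, weights, bias and the sample component are placeholders chosen for illustration, and the logistic score stands in for whatever model AutoML would actually select and train.

```python
import math
from dataclasses import dataclass


@dataclass
class Component:
    flight_miles: float
    cycles: int              # takeoffs and landings
    days_since_service: int


# Hypothetical thresholds; real limits come from the maintenance program.
MAX_MILES, MAX_CYCLES, MAX_DAYS = 500_000, 2_000, 180


def rule_based_due(c: Component) -> bool:
    """Hand-coded criteria: queue the part only if any single limit is exceeded."""
    return (
        c.flight_miles > MAX_MILES
        or c.cycles > MAX_CYCLES
        or c.days_since_service > MAX_DAYS
    )


# Placeholder weights; in practice a trained model supplies them.
WEIGHTS = {"miles": 1.5, "cycles": 2.0, "days": 1.0}
BIAS = -3.0


def failure_score(c: Component) -> float:
    """Weighted combination of all factors, squashed into the range 0..1."""
    z = (
        BIAS
        + WEIGHTS["miles"] * c.flight_miles / MAX_MILES
        + WEIGHTS["cycles"] * c.cycles / MAX_CYCLES
        + WEIGHTS["days"] * c.days_since_service / MAX_DAYS
    )
    return 1.0 / (1.0 + math.exp(-z))


# A part sitting just under every individual limit.
part = Component(flight_miles=450_000, cycles=1_900, days_since_service=170)
print(rule_based_due(part), round(failure_score(part), 2))
```

The rule never fires for this part because no single limit is crossed, while the weighted score combines the three near-limit readings into an elevated risk estimate that a scheduler could act on.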

  • optimize conversion rate and cost of digital advertising based on attributes of ad content, publisher and time

Picking the best venues, times and locations to place digital ad content is highly dependent on many attributes of the ad content itself, what publication it’s presented in, and what users the publisher wants to target. While some ad vendors purport to optimize this for the ad publisher (buyer), these black-box systems are not maximizing conversion rate and minimizing cost. An AutoML-generated machine learning model allows the ad publisher software to present the best opportunities along with predicted click-through rates, conversion volumes and cost per conversion.

  • classify relationships between learning objectives

A large educational testing company wanted to predict which learning objectives are related to which other learning objectives and present the predicted relationships to subject matter experts for confirmation.

  • predict worker absenteeism for job scheduling

Worker scheduling has always been recognized as a difficult problem ripe for algorithmic application (it was actually one of the first applications of linear programming in the 1950s). The lack of an accurate predictive model of certain critical factors, such as how many employees are actually available to work, has typically been the weak link in production deployment of such systems. Using AutoML to select the best possible algorithm and hyperparameters reduced the error (unexplained variance, 1 − R²) of the previous hand-tuned model by fully half (from an R² in the low 0.7s to an R² above 0.85). Automated retraining (via A²ML) against specific enterprise-generated data improves the accuracy even further.

In each of these cases, the AutoML-generated predictive model far outperformed simple statistical models or hand-tuned use of specific machine learning algorithms. The “Automating AutoML” (A²ML) approach (embedding the training and retraining via AutoML inside larger applications) enabled the applications to be deployed much earlier, when there was barely enough data to generate valid predictive models. As the data accumulated, the models were regenerated via our AutoML. Most fundamentally, as underlying conditions changed, scripts written around the A²ML framework detected the model degradation that A²ML monitors in its “Review Phase” and re-executed the A²ML pipeline to rescue the degraded models.
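The kind of script described here, one that watches model quality and triggers retraining, might look roughly like the sketch below. It does not use the A²ML API: `score_model` and `retrain` are hypothetical callables the surrounding application would supply, and the baseline and tolerance values are assumptions chosen for illustration.

```python
from typing import Callable


def review_and_retrain(
    score_model: Callable[[], float],
    retrain: Callable[[], None],
    baseline: float,
    tolerance: float = 0.05,
) -> bool:
    """Retrain when the live score falls more than `tolerance` below `baseline`.

    Returns True if retraining was triggered, False otherwise.
    """
    current = score_model()
    if current < baseline - tolerance:
        retrain()
        return True
    return False


# Usage with stand-in callables (hypothetical values).
calls = []
retrained = review_and_retrain(
    score_model=lambda: 0.78,               # score measured on recent data
    retrain=lambda: calls.append("retrain"),  # stand-in for re-running the pipeline
    baseline=0.88,                          # score recorded at deployment
)
print(retrained, calls)
```

Run on a schedule, a check like this keeps the application code free of model-specific logic: the application only reports a score and calls back into the pipeline when that score slips.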

The concept of “AutoML Review”, and the necessity of always monitoring the performance of your model, is an undercovered topic in AutoML and machine learning in general. In our next post we will dig deeper into the “AutoML Review” phase and how it can be used to ensure evergreen model accuracy.

In the meantime, if you’d like to try the A²ML pipeline for your own automated predictive model generation, feel free to clone the open source project (which supports Google AutoML, with Microsoft Azure AutoML support underway) from here. More importantly, if the idea of Automating AutoML intrigues you and you have some requirements for the framework (including support for yet more AutoML software providers), please contact me. If you’re interested in adding such support on your own, you are of course welcome to fork the project, but perhaps we can collaborate on moving the framework forward together.