Of Grapes and Mushrooms: From AutoML to OptiML

Something I rarely tell people is that I am a dropout from UC Davis’ Ph.D. viticulture program. One of the first models we built in Wine Production (which covered both grape growing and the later production of wine) was based mostly on exogenous variables not under grower control: temperature and humidity at various points in the growing process, nitrogen and other nutrient levels in the soil (which for various reasons were generally taken as a given even though fertilizer was used), rainfall, hours of sunshine, even wind and air quality. And, of course, the grape varietal and its characteristics. We built statistical models that did an admirable job of predicting yields given this wealth of data. Such predictions are quite useful, since knowing the overall yield helps plan actual wine production and grape reselling. Auger.AI, with its ability to test a wide range of predictive algorithms across a large number of hyperparameter settings and to search that space intelligently, would very likely have built a better predictive model than we did. I wish I had had it at the time.


By contrast, I also happen to (randomly enough) know several mushroom growers. They have a much higher degree of control over the environmental variables affecting their product. These variables include temperature, humidity, soil moisture, growing area per plant (density), compost per square meter, and growing period (in days), all of which are under grower control. Their problem is largely one of optimizing the variables they control. Yes, a predictive model is helpful, as it predicts the yield for any given settings of those variables. You can try various sets of values and ask “what if?”, and the predictive model will tell you the yield. Try several sets of values that you were considering, pick the best and, voilà, you will probably increase your yields over the status quo.
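To make the “what if?” loop concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `predict_yield` is a made-up formula standing in for a trained predictive model (it is not real agronomy and not an Auger.AI API), and the candidate settings are invented numbers.

```python
from typing import Dict

Settings = Dict[str, float]


def predict_yield(s: Settings) -> float:
    """Hypothetical stand-in for a trained model; the formula is invented."""
    return (
        25.0
        - 0.4 * (s["temperature_c"] - 17.0) ** 2
        - 0.01 * (s["humidity_pct"] - 88.0) ** 2
        - 0.002 * (s["compost_kg_per_m2"] - 90.0) ** 2
    )


# A handful of settings the grower was already considering (invented values).
candidates: Dict[str, Settings] = {
    "status_quo": {
        "temperature_c": 19.0, "humidity_pct": 80.0, "compost_kg_per_m2": 70.0,
    },
    "cooler_and_wetter": {
        "temperature_c": 18.0, "humidity_pct": 85.0, "compost_kg_per_m2": 80.0,
    },
    "more_compost": {
        "temperature_c": 16.0, "humidity_pct": 92.0, "compost_kg_per_m2": 100.0,
    },
}

# "What if?": ask the model for the predicted yield of each candidate...
predicted = {name: predict_yield(s) for name, s in candidates.items()}

# ...and pick the best of the ones we happened to try.
best_name = max(predicted, key=predicted.get)
print(best_name, round(predicted[best_name], 2))
```

Note that this only ranks the candidates someone thought to try. It says nothing about settings nobody considered.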

But the data scientists among you should now protest: “Who cares if it's better? It's not optimal!” And you would be right. Now, it's possible that the AutoML process came up with a model that can be approximated by a simple linear function of those variables. In that case, there are certainly well-known mathematical techniques (linear programming, for instance) to quickly traverse the space of possible settings and compute the optimal result. But it is quite unlikely that you would be driven to Automated Machine Learning if you had such a simple problem. And indeed, mushroom growing has so many controllable variables, with such a complex interplay between them, that it does not reduce to a simple linear formula. So, once you have a predictive model, you are left asking: “So what is the best possible group of settings for these variables?” What we have seen in customer models is that there is almost always some demand for using the predictive model to “pick best values.” Most real-world problems fall somewhere on the spectrum from pure prediction (grapes) to prediction driving optimization (mushrooms).
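To see why the linear case is the easy one, consider this small Python sketch. It assumes the model really is linear and that each variable is only limited by its own lower and upper bound. Under those assumptions the best setting can be read straight off the signs of the coefficients, with no search at all. The coefficients and bounds below are invented for illustration. With additional constraints that tie variables together you would reach for a linear programming solver instead.

```python
from itertools import product
from typing import Dict, Tuple

# Invented coefficients of a hypothetical linear yield model.
intercept = 20.0
coefficients: Dict[str, float] = {
    "temperature_c": -0.3,
    "humidity_pct": 0.05,
    "compost_kg_per_m2": 0.02,
}

# Invented (low, high) limits the grower can set each variable to.
bounds: Dict[str, Tuple[float, float]] = {
    "temperature_c": (14.0, 20.0),
    "humidity_pct": (80.0, 95.0),
    "compost_kg_per_m2": (60.0, 110.0),
}


def linear_yield(s: Dict[str, float]) -> float:
    return intercept + sum(coefficients[k] * s[k] for k in coefficients)


# For a linear model over independent bounds, push each variable to its
# upper bound if its coefficient is positive, otherwise to its lower bound.
optimal = {
    k: (high if coefficients[k] > 0 else low)
    for k, (low, high) in bounds.items()
}

# Sanity check: compare against every corner of the box of allowed settings.
names = list(bounds)
corner_yields = [
    linear_yield(dict(zip(names, corner)))
    for corner in product(*(bounds[k] for k in names))
]
print(optimal, linear_yield(optimal) >= max(corner_yields))
```

Once the variables interact, as they do in mushroom growing, this shortcut no longer applies and some form of search is needed.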

So, with apologies if the grape-to-mushroom metaphor leaves you nonplussed, what we want to discuss is the value of predictive models driving optimization, that is, making the best choices. Optimization today doesn’t have much to do with AutoML. BUT… it turns out that if you have AutoML infrastructure in place that can intelligently search the space of hyperparameter settings (trading off between exploration and exploitation along the way), that same infrastructure can also help search the (also quite unbounded) space of possible variable settings fed into a predictive model to intelligently find the best answer. This holds even when the predictive model must be executed to come up with the predicted target for each set of input variables. We call this approach “OptiML”. It is not exposed in Auger.AI today. But if you have successfully built a predictive model and are asking “OK, but NOW how do I use this to pick the best values for the inputs I control?”, please write to us, and we can almost certainly help you. If you are just an enthusiast of the AutoML industry niche and want to know when this OptiML thing is available on a self-service basis, also ping us and stay tuned…
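For the curious, here is a rough Python sketch of what searching the input space with an exploration/exploitation trade-off can look like. To be clear about the assumptions: this is not the OptiML algorithm (which is not described here), just a simple randomized search chosen for illustration. `predict_yield` is an invented stand-in for a trained model, and `search_inputs`, its parameters, and all the numbers are hypothetical.

```python
import random
from typing import Callable, Dict, Tuple

Settings = Dict[str, float]
Bounds = Dict[str, Tuple[float, float]]


def predict_yield(s: Settings) -> float:
    """Hypothetical stand-in for a trained model; the formula is invented."""
    return (
        25.0
        - 0.4 * (s["temperature_c"] - 17.0) ** 2
        - 0.01 * (s["humidity_pct"] - 88.0) ** 2
        - 0.002 * (s["compost_kg_per_m2"] - 90.0) ** 2
    )


def search_inputs(
    model: Callable[[Settings], float],
    bounds: Bounds,
    start: Settings,
    iterations: int = 2000,
    explore_prob: float = 0.2,
    step_fraction: float = 0.05,
    seed: int = 0,
) -> Tuple[Settings, float]:
    """Randomized search that mixes exploration with exploitation."""
    rng = random.Random(seed)
    best, best_score = dict(start), model(start)
    for _ in range(iterations):
        if rng.random() < explore_prob:
            # Exploration: sample anywhere inside the allowed ranges.
            candidate = {k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}
        else:
            # Exploitation: nudge the best settings found so far,
            # clamping each value back into its allowed range.
            candidate = {}
            for k, (lo, hi) in bounds.items():
                nudged = best[k] + rng.gauss(0.0, step_fraction * (hi - lo))
                candidate[k] = min(max(nudged, lo), hi)
        score = model(candidate)  # the model is executed for every candidate
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


bounds: Bounds = {
    "temperature_c": (14.0, 20.0),
    "humidity_pct": (80.0, 95.0),
    "compost_kg_per_m2": (60.0, 110.0),
}
status_quo: Settings = {
    "temperature_c": 19.0, "humidity_pct": 80.0, "compost_kg_per_m2": 70.0,
}

best, best_score = search_inputs(predict_yield, bounds, status_quo)
print(best, best_score)
```

The search treats the model as a black box: it only needs to call it, so the same loop works whether the model behind it is linear, a tree ensemble, or a neural network.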