What Cricket Can Teach Us About Ensemble Models

Machine Learning, Artificial Intelligence

In cricket, no team would seriously consider a bowling attack in which every bowler has the same style. A good bowling strategy uses a variety of styles: fast, swing, spin, etc. Each challenges the batsman in a different way, increasing the likelihood that the batsman gets out. A good AI/ML system is built similarly: it is made up of many complementary models, each trying to achieve the same goal (or wicket!) from a different perspective. Such systems are known as Ensemble Models.

However, a key requirement for Ensemble Models to work is that each model's weaknesses are unique. The value of an ensemble comes not from the perfection of any individual model, but from the fact that each model makes mistakes that the other models do not make. Given this diversity of errors, one model's blind spots get picked up by the other models. Statistically, we say that ensemble models should have low error covariance, or that the errors of the models should be uncorrelated. Thus, to build a good ensemble, one needs to think carefully about what errors each model is likely to make, and combine models so that their error patterns are diverse and uncorrelated.
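To make the point about error covariance concrete, here is a minimal sketch under simplifying assumptions that are not part of the cricket analogy itself. It assumes the ensemble simply averages its models, that every model's error has the same variance sigma^2, and that every pair of models has the same error correlation rho. In that setting the variance of the averaged error is sigma^2 * (rho + (1 - rho) / M) for M models. The function names below are hypothetical and chosen only for illustration.

```python
import random
import statistics


def theoretical_ensemble_error_variance(n_models, rho, sigma=1.0):
    """Variance of the averaged error of n_models whose errors each have
    variance sigma**2 and pairwise correlation rho (assumed 0 <= rho <= 1)."""
    return sigma ** 2 * (rho + (1.0 - rho) / n_models)


def simulated_ensemble_error_variance(n_models, rho, sigma=1.0,
                                      n_trials=20000, seed=0):
    """Estimate the same quantity by simulation.

    Each model's error is a mix of one shared component (the weakness that
    every model has in common) and one private component (a weakness unique
    to that model). The mix is chosen so that each error has variance
    sigma**2 and any two errors have correlation rho.
    """
    rng = random.Random(seed)
    averaged_errors = []
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)  # blind spot common to all models
        errors = [
            sigma * (rho ** 0.5 * shared
                     + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0))
            for _ in range(n_models)
        ]
        averaged_errors.append(sum(errors) / n_models)
    return statistics.pvariance(averaged_errors)


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        theory = theoretical_ensemble_error_variance(5, rho)
        estimate = simulated_ensemble_error_variance(5, rho)
        print(f"rho={rho}: theory={theory:.3f}, simulated={estimate:.3f}")
```

With five models and sigma = 1, the formula gives a variance of 0.2 when the errors are uncorrelated (rho = 0) but 0.92 when they are highly correlated (rho = 0.9), which is barely better than a single model's variance of 1. In the terms of the analogy, five bowlers who all fail for the same reason are worth little more than one.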

Using the cricket analogy, if a team's fast bowlers are unable to take the wicket of a batsman for XYZ reason, hopefully the spin bowlers do not suffer from the XYZ reason and are able to get the batsman out. If the spin bowlers also suffer from the XYZ reason, there is correlation and overlap between the fast and spin bowlers, and together this is detrimental to the bowling team. Here, the objective is not to make the fast bowlers better, but rather to accept that they will perform poorly in some scenarios, provided the other bowling styles do not perform poorly in those same scenarios. Thus, rather than optimising one bowling style in a cricket team or one model in an ensemble, the objective is to minimise the error or failure covariance between the different bowling styles/models.

Not Every Bowler Should Be a Fast Bowler — And Not Every Model Should Be Perfect!