
When building predictive models, accuracy, measured by metrics like area under the curve (AUC), has traditionally been the primary driver of model design and operationalization. Increasingly, though, ML practitioners are leaning toward operationalizing models that perform decently and predictably in production, rather than models that score highly at test time but fail to deliver on that promise once deployed.

For time-series data (the most common form of data powering ML models), drift is a measure of the "distance" between the data observed at two different points in time.
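As a minimal sketch of this idea, the snippet below scores drift between two time slices of a numeric attribute using the Kolmogorov-Smirnov statistic; the article does not specify which distance it has in mind, so the metric, the window names, and the example data are illustrative assumptions.

```python
# Hedged sketch: one possible drift score between two time windows of a feature.
# The KS statistic is an assumption here, not the article's prescribed metric.
import numpy as np
from scipy.stats import ks_2samp

def drift(values_t0: np.ndarray, values_t1: np.ndarray) -> float:
    """Return a drift score in [0, 1]: 0 means the two windows look
    identically distributed, 1 means they are completely separated."""
    result = ks_2samp(values_t0, values_t1)
    return float(result.statistic)

# Example: compare last month's observations against this month's (synthetic data).
rng = np.random.default_rng(0)
window_t0 = rng.normal(loc=0.0, scale=1.0, size=5_000)
window_t1 = rng.normal(loc=0.3, scale=1.2, size=5_000)  # slightly shifted
print(f"drift score: {drift(window_t0, window_t1):.3f}")
```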

Stable data elements are likely to lead to stable features, which are likely to power stable (or resilient) models.

Knowing how stable each input attribute is, modelers can favor stable data elements over unstable ones when building features, which in turn power resilient models.
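One way this selection could look in practice is sketched below: score each attribute's drift between two time windows and keep only the columns that stay under a threshold. The column-wise KS drift, the `max_drift` cutoff of 0.1, and the function name are all illustrative assumptions rather than the article's method.

```python
# Hedged sketch: rank input attributes by drift and keep the stable ones
# as candidates for feature engineering. Threshold and metric are assumptions.
import pandas as pd
from scipy.stats import ks_2samp

def stable_columns(window_t0: pd.DataFrame,
                   window_t1: pd.DataFrame,
                   max_drift: float = 0.1) -> list:
    """Return the numeric columns whose drift between the two time
    windows stays at or below max_drift."""
    keep = []
    for col in window_t0.select_dtypes("number").columns:
        score = ks_2samp(window_t0[col].dropna(),
                         window_t1[col].dropna()).statistic
        if score <= max_drift:
            keep.append(col)
    return keep
```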
