How to reduce the cost of machine learning inference

4 years ago towardsdatascience.com

Summary: This is a summary of an article originally published by the source.

When the cost of machine learning is discussed, it is usually in the context of model training. What these discussions tend to leave out is the bottleneck created by inference costs, particularly for real-time inference.

Here, I want to share what we’ve learned as a short checklist of opportunities for cost optimization in inference pipelines. Depending on how optimized your current pipeline already is, these steps could reduce inference costs by more than 80%. Those costs, especially for real-time inference, are driven largely by cloud compute.
