When we display relevant first-party ads to buyers in the Etsy marketplace, we rank them using a combination of outputs from ML models. Both the relevance of the ads shown to buyers and the costs charged to sellers are highly sensitive to the output distributions of these models. Several factors shape those outputs, including the makeup of the training data, the model architecture, and the input features. To make the system more robust and resilient to modeling changes, we have calibrated all ML models that power ranking and bidding.
In this talk, we will first discuss the pain points and use cases that revealed the need for calibration in our system. We will then share the journey, lessons, and challenges of calibrating our machine learning models, along with the implications of calibrated outputs. Finally, we will explain how we use the calibrated outputs in downstream applications and explore the opportunities that calibration unlocks at Etsy.