Tecton
Training Large-Scale Recommendation Models with TPUs

Posted Apr 12
# Model training
# Production Use Case
# Systems and Architecture
SPEAKER
Aymeric Damien
Machine Learning Engineer @ Snap Inc.

Aymeric is an ML Engineer at Snap, leading various efforts to optimize Snap's Ad-Ranking ML systems. His work includes training and inference pipeline optimization, modeling efficiency, and collaboration with Google, Nvidia, and Intel on ML hardware optimization.

SUMMARY

At Snap, we train a large number of deep learning models every day to continuously improve ad recommendation quality for Snapchatters and provide more value to advertisers. These ad ranking models have hundreds of millions of parameters and are trained on billions of examples. Training an ad ranking model is computation-intensive and memory-lookup-heavy. It requires a state-of-the-art distributed system and performant hardware to complete training reliably and in a timely manner. This session will describe how we leveraged Google's Tensor Processing Units (TPUs) for fast and efficient training.
