
Training Large-Scale Recommendation Models with TPUs

Posted Apr 12, 2022 | Views 896
# Model training
# Production Use Case
# Systems and Architecture
SPEAKER
Aymeric Damien
Machine Learning Engineer @ Snap Inc.

Aymeric is an ML Engineer at Snap, where he leads various efforts to optimize Snap's ad-ranking ML systems. His work includes training and inference pipeline optimization, modeling efficiency, and collaboration with Google, Nvidia, and Intel on ML hardware optimization.

SUMMARY

At Snap, we train a large number of deep learning models every day to continuously improve ad recommendation quality for Snapchatters and provide more value to advertisers. These ad ranking models have hundreds of millions of parameters and are trained on billions of examples. Training an ad ranking model is a computation-intensive and memory-lookup-heavy task. It requires a state-of-the-art distributed system and performant hardware to complete training reliably and in a timely manner. This session will describe how we leveraged Google's Tensor Processing Units (TPUs) for fast and efficient training.
