Programmatic Supervision for Software 2.0

Posted Mar 28, 2021 | Views 331
# Data labeling
# Open Source
# Systems and Architecture
SPEAKER
Alex Ratner
Co-Founder & CEO @ Snorkel AI

Alex Ratner is the co-founder and CEO of Snorkel AI and an Assistant Professor of Computer Science at the University of Washington. Before Snorkel AI and UW, he completed his Ph.D. in CS at Stanford, advised by Christopher Ré, where he started and led the Snorkel open source project. His research there focused on applying data management and statistical learning techniques to emerging machine learning workflows, such as creating and managing training data, and on applying these to real-world problems in medicine, knowledge base construction, and more. Previously, he earned his A.B. in Physics from Harvard University.

SUMMARY

One of the major bottlenecks in the development and deployment of AI applications is the need for the massive labeled training datasets that drive modern ML approaches. These training datasets are traditionally labeled by hand at great expense in time and money, and often cannot practically be hand-labeled at all due to privacy, expertise, and/or rate-of-change requirements in real-world settings such as healthcare.

This talk will cover a range of programmatic (often called "weak supervision") approaches to building, labeling, augmenting, and structuring training datasets, as well as the broader effects on end-to-end ML and AI application development. Specifically, this talk will cover techniques for programmatic labeling, such as the data programming and Snorkel approaches; data augmentation techniques for augmenting datasets with transformed copies of data to increase model robustness; data structuring or "slicing" techniques for highlighting, monitoring, and enabling models to attend to critical and/or difficult subsets of the data; and other key techniques for training data management.
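As a rough illustration of what programmatic labeling means in practice, the sketch below writes a few heuristic labeling functions and combines their votes. Everything in it is an assumption made for illustration: the spam/ham task, the function names, and the heuristics are hypothetical, and the simple majority vote stands in for the learned label model used in data programming. It is not the Snorkel API.

```python
from collections import Counter

# Hypothetical label values for an illustrative spam/ham task.
ABSTAIN, HAM, SPAM = -1, 0, 1


def lf_contains_link(text):
    """Vote SPAM if the text contains a link, otherwise abstain."""
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_mentions_prize(text):
    """Vote SPAM if the text mentions a prize, otherwise abstain."""
    return SPAM if "prize" in text.lower() else ABSTAIN


def lf_short_message(text):
    """Vote HAM for very short messages, otherwise abstain."""
    return HAM if len(text.split()) <= 4 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_mentions_prize, lf_short_message]


def majority_vote(text, lfs=LABELING_FUNCTIONS):
    """Combine labeling-function votes by simple majority.

    Returns ABSTAIN when no function votes or when the top labels tie.
    A plain majority vote is a stand-in here, not a learned label model.
    """
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]


if __name__ == "__main__":
    examples = [
        "Claim your prize at http://example.com now friends today",
        "see you soon",
    ]
    for text in examples:
        print(majority_vote(text), text)
```

Each labeling function may abstain, so the combined labels cover only part of the data and can be noisy; the design point is that labels are produced by editable code rather than by hand.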

More broadly, this talk will address how these new programmatic approaches lead to a whole new end-to-end ML/AI application development process. Using the example of Snorkel Flow, a new platform for this process, I will cover these ideas and how they extend to model training, monitoring, and analysis. I will also cover the feedback loops that lead to actionable modification or extension of the programmatic supervision approaches, and that make the development and deployment process for ML and AI applications more iterative and more driven by error analysis.
