Интересное что-то

Forwarded from Artem Ryblov’s Data Science Weekly

Machine Learning in Production by Carnegie Mellon University

This is a course for those who want to build software products with machine learning, not just models and demos. We assume that you can train a model or build prompts to make predictions, but what does it take to turn the model into a product and actually deploy it, have confidence in its quality, and successfully operate and maintain it at scale?

The course is designed to establish a working relationship between software engineers and data scientists: both contribute to building ML-enabled systems but have different expertise and focuses. To work together, they need a mutual understanding of each other's roles, tasks, concerns, and goals. This course is aimed at software engineers who want to build robust and responsible products that meet the specific challenges of working with ML components, and at data scientists who want to understand the requirements a model must meet for production use and to help move a prototype model into production; it facilitates communication and collaboration between both roles. The course is a good fit for students looking at a career as an ML engineer. It focuses on all the steps needed to turn a model into a production system in a responsible and reliable manner.

It covers topics such as:
- How to design for wrong predictions the model may make?
How to assure safety and security despite possible mistakes? How to design the user interface and the entire system to operate in the real world?
- How to reliably deploy and update models in production?
How can we test the entire machine learning pipeline? How can MLOps tools help to automate and scale the deployment process? How can we experiment in production (A/B testing, canary releases)? How do we detect data quality issues, concept drift, and feedback loops in production?
- How to scale production ML systems?
How do we design a system to process huge amounts of training data, telemetry data, and user requests? Should we use stream processing, batch processing, lambda architecture, or data lakes?
- How to test and debug production ML systems?
How can we evaluate the quality of a model’s predictions in production? How can we test the entire ML-enabled system, not just the model? What lessons can we learn from software testing, automated test case generation, simulation, and continuous integration for testing production machine learning?
- Which qualities matter beyond a model’s prediction accuracy?
How can we identify and measure important quality requirements, including learning and inference latency, operating cost, scalability, explainability, fairness, privacy, robustness, and safety? Does the application need to be able to operate offline, and how often do we need to update the models? How do we identify what’s important in an ML-enabled product in a production setting for a business? How do we resolve conflicts and tradeoffs?
- How to work effectively in interdisciplinary teams?
How can we bring data scientists, software engineers, UI designers, managers, domain experts, big data specialists, operators, legal counsel, and other roles together and develop a shared understanding and team culture?

Link: Course

Navigational hashtags: #armcourses
General hashtags: #ml #dl #machinelearning #deeplearning #mlsystemdesign #mlops #mlsysdes

@data_science_weekly
