How is 𝗖𝗜/𝗖𝗗 𝗽𝗿𝗼𝗰𝗲𝘀𝘀 𝗱𝗶𝗳𝗳𝗲𝗿𝗲𝗻𝘁 𝗳𝗼𝗿 𝗠𝗮𝗰𝗵𝗶𝗻𝗲 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴 𝗣𝗿𝗼𝗷𝗲𝗰𝘁𝘀 compared to 𝗥𝗲𝗴𝘂𝗹𝗮𝗿 𝘀𝗼𝗳𝘁𝘄𝗮𝗿𝗲?
The important difference that the Machine Learning aspect of the projects brings to the CI/CD process is the treatment of the Machine Learning Training pipeline as a first class citizen of the software world.
➡️ CI/CD pipeline is a separate entity from Machine Learning Training pipeline. There are frameworks and tools that provide capabilities specific to Machine Learning pipelining needs (e.g. KubeFlow Pipelines, Sagemaker Pipelines etc.).
➡️ ML Training pipeline is an artifact produced by Machine Learning project and should be treated in the CI/CD pipelines as such.
What does it mean? Let’s take a closer look:
Regular CI/CD pipelines will usually be composed of at-least three main steps. These are:
𝗦𝘁𝗲𝗽 𝟭: Unit Tests - you test your code so that the functions and methods produce desired results for a set of predefined inputs.
𝗦𝘁𝗲𝗽 𝟮: Integration Tests - you test specific pieces of the code for ability to integrate with systems outside the boundaries of your code (e.g. databases) and between the pieces of the code itself.
𝗦𝘁𝗲𝗽 𝟯: Delivery - you deliver the produced artifact to a pre-prod or prod environment depending on which stage of GitFlow you are in.
What does it look like when ML Training pipelines are involved?
𝗦𝘁𝗲𝗽 𝟭: Unit Tests - in mature MLOps setup the steps in ML Training pipeline should be contained in their own environments and Unit Testable separately as these are just pieces of code composed of methods and functions.
𝗦𝘁𝗲𝗽 𝟮: Integration Tests - you test if ML Training pipeline can successfully integrate with outside systems, this includes connecting to a Feature Store and extracting data from it, ability to hand over the ML Model artifact to the Model Registry, ability to log metadata to ML Metadata Store etc. This CI/CD step also includes testing the integration between each of the Machine Learning Training pipeline steps, e.g. does it succeed in passing validation data from training step to evaluation step.
𝗦𝘁𝗲𝗽 𝟯: Delivery - the pipeline is delivered to a pre-prod or prod environment depending on which stage of GitFlow you are in. If it is a production environment, the pipeline is ready to be used for Continuous Training. You can trigger the training or retraining of your ML Model ad-hoc, periodically or if the deployed model starts showing signs of Feature/Concept Drift.
The important difference that the Machine Learning aspect of the projects brings to the CI/CD process is the treatment of the Machine Learning Training pipeline as a first class citizen of the software world.
➡️ CI/CD pipeline is a separate entity from Machine Learning Training pipeline. There are frameworks and tools that provide capabilities specific to Machine Learning pipelining needs (e.g. KubeFlow Pipelines, Sagemaker Pipelines etc.).
➡️ ML Training pipeline is an artifact produced by Machine Learning project and should be treated in the CI/CD pipelines as such.
What does it mean? Let’s take a closer look:
Regular CI/CD pipelines will usually be composed of at-least three main steps. These are:
𝗦𝘁𝗲𝗽 𝟭: Unit Tests - you test your code so that the functions and methods produce desired results for a set of predefined inputs.
𝗦𝘁𝗲𝗽 𝟮: Integration Tests - you test specific pieces of the code for ability to integrate with systems outside the boundaries of your code (e.g. databases) and between the pieces of the code itself.
𝗦𝘁𝗲𝗽 𝟯: Delivery - you deliver the produced artifact to a pre-prod or prod environment depending on which stage of GitFlow you are in.
What does it look like when ML Training pipelines are involved?
𝗦𝘁𝗲𝗽 𝟭: Unit Tests - in mature MLOps setup the steps in ML Training pipeline should be contained in their own environments and Unit Testable separately as these are just pieces of code composed of methods and functions.
𝗦𝘁𝗲𝗽 𝟮: Integration Tests - you test if ML Training pipeline can successfully integrate with outside systems, this includes connecting to a Feature Store and extracting data from it, ability to hand over the ML Model artifact to the Model Registry, ability to log metadata to ML Metadata Store etc. This CI/CD step also includes testing the integration between each of the Machine Learning Training pipeline steps, e.g. does it succeed in passing validation data from training step to evaluation step.
𝗦𝘁𝗲𝗽 𝟯: Delivery - the pipeline is delivered to a pre-prod or prod environment depending on which stage of GitFlow you are in. If it is a production environment, the pipeline is ready to be used for Continuous Training. You can trigger the training or retraining of your ML Model ad-hoc, periodically or if the deployed model starts showing signs of Feature/Concept Drift.
👍2
Leetcode patterns you should definitely checkout to Learn DSA(Java) from scratch
1️⃣ Arrays: Data structures, such as arrays, store elements in contiguous memory locations. They are versatile and useful for a wide variety of purposes.
LeetCode Problems:
• Search in Rotated Sorted Array (Problem #33)
• Product of Array Except Self (Problem #238)
• Find the Missing Number (Problem #268)
2️⃣Two Pointers: In Two Pointers, two pointers are maintained in the collection and can be manipulated to solve a problem efficiently.
LeetCode problems:
• Trapping Rain Water (Problem #42)
• Longest Substring Without Repeating Characters (Problem #3)
• Squares of a Sorted Array (Problem #977)
3️⃣In-place Linked List Traversal: As an explanation, in-place traversal is a technique for modifying linked list nodes without using extra space.
LeetCode Problems:
• Remove Nth Node From End of List (Problem #19)
• Reorder List (Problem #143)
4️⃣Fast & Slow Pointers: This pattern uses two pointers to traverse a sequence at different speeds (fast and slow), often used to detect cycles or find a specific position in the sequence.
LeetCode Problems:
• Happy Number (Problem #202)
• Subarray Sum Equals K (Problem #560)
• Intersection of Two Linked Lists (Problem #160)
5️⃣Merge Intervals: This pattern involves merging overlapping intervals in a collection, often used in problems dealing with intervals or ranges.
LeetCode problems:
• Non-overlapping Intervals (Problem #435)
• Minimum Number of Arrows to Burst Balloons (Problem #452)
Join for more: https://t.me/crackingthecodinginterview
DSA Interview Preparation Resources: https://topmate.io/coding/886874
ENJOY LEARNING 👍👍
1️⃣ Arrays: Data structures, such as arrays, store elements in contiguous memory locations. They are versatile and useful for a wide variety of purposes.
LeetCode Problems:
• Search in Rotated Sorted Array (Problem #33)
• Product of Array Except Self (Problem #238)
• Find the Missing Number (Problem #268)
2️⃣Two Pointers: In Two Pointers, two pointers are maintained in the collection and can be manipulated to solve a problem efficiently.
LeetCode problems:
• Trapping Rain Water (Problem #42)
• Longest Substring Without Repeating Characters (Problem #3)
• Squares of a Sorted Array (Problem #977)
3️⃣In-place Linked List Traversal: As an explanation, in-place traversal is a technique for modifying linked list nodes without using extra space.
LeetCode Problems:
• Remove Nth Node From End of List (Problem #19)
• Reorder List (Problem #143)
4️⃣Fast & Slow Pointers: This pattern uses two pointers to traverse a sequence at different speeds (fast and slow), often used to detect cycles or find a specific position in the sequence.
LeetCode Problems:
• Happy Number (Problem #202)
• Subarray Sum Equals K (Problem #560)
• Intersection of Two Linked Lists (Problem #160)
5️⃣Merge Intervals: This pattern involves merging overlapping intervals in a collection, often used in problems dealing with intervals or ranges.
LeetCode problems:
• Non-overlapping Intervals (Problem #435)
• Minimum Number of Arrows to Burst Balloons (Problem #452)
Join for more: https://t.me/crackingthecodinginterview
DSA Interview Preparation Resources: https://topmate.io/coding/886874
ENJOY LEARNING 👍👍
👍2
https://topmate.io/coding/886874
If you're a job seeker, these well structured document DSA resources will help you to know and learn all the real time DSA & OOPS Interview questions with their exact answer. folks who are having 0-4+ years of experience have cracked the interview using this guide!
Please use the above link to avail them!👆
NOTE: -Most people hoard resources without actually opening them even once! The reason for keeping a small price for these resources is to ensure that you value the content available inside this and encourage you to make the best out of it.
Hope this helps in your job search journey... All the best!👍✌️
If you're a job seeker, these well structured document DSA resources will help you to know and learn all the real time DSA & OOPS Interview questions with their exact answer. folks who are having 0-4+ years of experience have cracked the interview using this guide!
Please use the above link to avail them!👆
NOTE: -Most people hoard resources without actually opening them even once! The reason for keeping a small price for these resources is to ensure that you value the content available inside this and encourage you to make the best out of it.
Hope this helps in your job search journey... All the best!👍✌️
❤2👍1