Machine Learning And AI

https://geekycodesin.wordpress.com/2024/11/08/searching-in-a-rotated-sorted-array/

233 views 15:44

𝗗𝗮𝘁𝗮 𝗦𝗰𝗶𝗲𝗻𝗰𝗲 𝗜𝗻𝘁𝗲𝗿𝘃𝗶𝗲𝘄 𝗤𝘂𝗲𝘀𝘁𝗶𝗼𝗻:
How would you extend SVM for multi-class classification?

Two common ways are -

𝗢𝗻𝗲-𝘃𝘀-𝗥𝗲𝘀𝘁 (𝗢𝘃𝗥) (𝗼𝗿 𝗢𝗻𝗲-𝘃𝘀-𝗔𝗹𝗹)
Each classifier is trained to separate one class from all others. For K classes, OvR builds K SVM models, where each model is trained with the class of interest labeled as positive and all other classes labeled as negative. For a new instance, each classifier outputs a score, and the class with the highest score is chosen as the predicted class.

Pros of OvR -
🧤 Computationally efficient when there are many classes, as it needs only K classifiers (versus K(K−1)/2 for OvO).
🧤 Works well when the dataset is large, and class overlap isn’t significant.

Cons of OvR -
🔻 The negative class for each classifier can be a mix of very different classes, which can make the boundary between classes less distinct.
🔻 May struggle with overlapping classes, as it requires each classifier to make broad distinctions between one class and all others.

𝗢𝗻𝗲-𝘃𝘀-𝗢𝗻𝗲 (𝗢𝘃𝗢)
This method involves building a separate binary classifier for each pair of classes, resulting in (K(K−1))/2 classifiers for K classes. Each classifier learns to distinguish between just two classes. For classification, each binary classifier votes for a class, and the class with the most votes is selected.

Pros of OvO -
🧤 Creates simpler decision boundaries, as each classifier only has to separate two classes.
🧤 Often yields higher accuracy for complex, overlapping classes since it doesn't force each classifier to distinguish between all classes.

Cons of OvO -
🔻 Computationally intensive for large numbers of classes, as the number of classifiers grows quadratically with K (though each one is trained on only two classes' data).
🔻 Prediction time can be slower as it requires voting among all classifiers, which can be significant if there are many classes.
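
The two prediction rules described above (highest score for OvR, majority vote for OvO) can be sketched in plain Python. This is only an illustration: the scores and pairwise winners are made-up numbers standing in for trained SVM outputs, and `ovr_predict` / `ovo_predict` are hypothetical helper names, not a library API.

```python
from collections import Counter
from itertools import combinations

def ovr_predict(scores):
    """One-vs-Rest: pick the class whose classifier gives the highest score."""
    return max(scores, key=scores.get)

def ovo_predict(classes, pairwise_winner):
    """One-vs-One: every pair of classes casts one vote; most votes wins."""
    votes = Counter(pairwise_winner[pair] for pair in combinations(classes, 2))
    return votes.most_common(1)[0][0]

classes = ["cat", "dog", "fox"]

# Hypothetical decision scores from K = 3 one-vs-rest SVMs for one sample.
ovr_scores = {"cat": -0.4, "dog": 1.3, "fox": 0.2}

# Hypothetical winners from K(K-1)/2 = 3 pairwise SVMs for the same sample.
ovo_winners = {("cat", "dog"): "dog", ("cat", "fox"): "fox", ("dog", "fox"): "dog"}

print(ovr_predict(ovr_scores))            # dog
print(ovo_predict(classes, ovo_winners))  # dog
```

Note that a real OvO setup also needs a tie-breaking rule when two classes collect the same number of votes; this sketch simply takes whichever tied class was counted first.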

𝗖𝗵𝗼𝗼𝘀𝗶𝗻𝗴 𝗕𝗲𝘁𝘄𝗲𝗲𝗻 𝗢𝘃𝗥 𝗮𝗻𝗱 𝗢𝘃𝗢
The choice between OvR and OvO depends largely on the specific dataset characteristics and computational constraints:
👉 If computational resources are limited and the number of classes is high, OvR may be preferred, as it requires fewer classifiers and is often faster to train and predict with.
👉 If accuracy is critical and the classes overlap significantly, OvO often performs better since it learns more specialized decision boundaries for each pair of classes.
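
To make the trade-off concrete, here is a small sketch of how the number of classifiers grows with K under each scheme. The function name `n_classifiers` is just an illustrative choice.

```python
def n_classifiers(k):
    """Return (OvR count, OvO count) for k classes."""
    return k, k * (k - 1) // 2

for k in (3, 10, 100):
    ovr, ovo = n_classifiers(k)
    print(f"K={k}: OvR={ovr}, OvO={ovo}")
# K=3: OvR=3, OvO=3
# K=10: OvR=10, OvO=45
# K=100: OvR=100, OvO=4950
```

The counts alone do not settle training cost: each OvO model only sees the data of its two classes, while each OvR model is trained on the full dataset.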

421 views 04:52

https://geekycodesin.wordpress.com/2024/11/09/finding-triple-sum-in-a-binary-tree/

370 views 05:44

https://geekycodesin.wordpress.com/2024/11/09/sorting-array-based-on-k-th-bit-a-detailed-explanation/

239 views 13:02

So what should an entry-level interview experience look like?

Having been on both sides of the process, I find this format, IMO, the most effective one.

Round 1:
⭐️ 30 minutes LeetCode, 30 minutes SQL
The goal? Understand how the candidate approaches the problem: how they clarify ambiguity, address edge cases, and write code.
Passing a few test cases is required, but not all.
Better than brute force is required, optimal solution is not.

Round 2:
⭐️ Machine Learning/Statistics and Resume-based
The goal? Make sure they understand basic concepts (bias vs. variance, hypothesis testing, data cleaning, etc.) and learn how they have approached ML formulation, metric selection, and modelling in the past.

Round 3:
⭐️ Hiring Manager (+ senior team member) to review work on resume + culture fit
The goal? For the HM and senior team members to assess whether the candidate is a culture fit with the team, and to review prior work to see whether the way they think about solving a data/ML problem would work in the team (or whether the person is coachable).
Join our channel for more information like this

432 views 14:12

https://geekycodesin.wordpress.com/2024/11/09/count-ways-to-reach-the-nth-stair-a-dynamic-programming-approach/

297 views 16:13

https://geekycodesin.wordpress.com/2024/11/10/reversing-a-doubly-linked-list-a-step-by-step-guide/

269 views 17:16

https://geekycodesin.wordpress.com/2024/11/10/finding-the-third-greatest-element-in-an-array-a-step-by-step-guide/

Geeky Codes

Finding the Third Greatest Element in an Array: A Step-by-Step Guide

In many programming problems, you’re asked to find the third largest element in a given list or array. This problem is quite common in coding interviews and algorithmic challenges. Understand…

288 views 17:28

https://geekycodesin.wordpress.com/2024/11/11/understanding-normalization-indexing-sql-queries-and-constraints-in-dbms/

Geeky Codes

Understanding Normalization, Indexing, SQL Queries, and Constraints in DBMS

In this blog post, we will delve into several crucial topics related to Database Management Systems (DBMS), including normalization, indexing, SQL queries for common operations, and constraints. Th…

268 views 15:43

https://geekycodesin.wordpress.com/2024/11/11/understanding-oops-concepts-a-deep-dive-with-real-life-examples/

Geeky Codes

Understanding OOPS Concepts: A Deep Dive with Real-Life Examples

In this blog post, we will explore the fundamental concepts of Object-Oriented Programming (OOP), including Abstraction, Encapsulation, Polymorphism, and Inheritance. We will also look at Dynamic a…

280 views 16:21

https://geekycodesin.wordpress.com/2024/11/11/understanding-the-singleton-design-pattern-ensuring-a-single-instance/

Geeky Codes

Understanding the Singleton Design Pattern: Ensuring a Single Instance

When designing software systems, managing resources and ensuring consistency across different parts of an application are essential goals. One way to achieve this is through the Singleton Design Pa…

264 views 18:01

https://geekycodesin.wordpress.com/2024/11/12/understanding-extension-methods-in-programming/

Geeky Codes

Understanding Extension Methods in Programming

When working with object-oriented programming languages, it’s not uncommon to encounter situations where you want to add new functionality to existing classes—without modifying their source c…

299 views 18:33

https://geekycodesin.wordpress.com/2024/11/12/understanding-primary-and-secondary-constraints-in-sql/

Geeky Codes

Understanding Primary and Secondary Constraints in SQL

When designing a relational database, it’s crucial to ensure data integrity, consistency, and structure. One way to enforce these rules is through constraints. Constraints are conditions or rules a…

279 views 12:26

https://youtu.be/P1QX6bhnojk

YouTube

Big Countries | Leet Code SQL Day 3 | 50 Day Challenge

Hey Guys,
Welcome to my YouTube channel, Geeky Codes. In this channel, I am trying to create videos related to data science. I've started a series, the SQL 50 Day Plan. This is the 3rd video in this series. Please do not forget to subscribe to my channel if you're…

404 views 13:39

https://geekycodesin.wordpress.com/2024/11/13/understanding-filters-in-mvc-a-deep-dive/

Geeky Codes

Understanding Filters in MVC: A Deep Dive

Filters in MVC (Model-View-Controller) are a powerful feature that can help developers centralize code, manage cross-cutting concerns, and improve the maint…

270 views 14:43

https://geekycodesin.wordpress.com/2024/11/13/understanding-scrum-methodology-a-comprehensive-guide/

Geeky Codes

Understanding Scrum Methodology: A Comprehensive Guide

In today’s fast-paced, ever-changing world of software development, traditional project management approaches often struggle to keep up with the demands of innovation, speed, and flexibility. Enter…

303 views 15:31

https://geekycodesin.wordpress.com/2024/11/13/understanding-dependency-injection-in-machine-learning-a-comprehensive-guide/

Geeky Codes

Understanding Dependency Injection in Machine Learning: A Comprehensive Guide

In the world of software development, dependency injection (DI) has long been used as a key design pattern to improve the flexibility, scalability, and testability of applications. But what about d…

287 views 15:43

https://geekycodesin.wordpress.com/2024/11/13/sorting-an-array-with-0s-1s-and-2s-in-one-pass-python-implementation/

Geeky Codes

Sorting an Array with 0s, 1s, and 2s in One Pass: Python Implementation.

When it comes to sorting an array that contains only 0s, 1s, and 2s, the task becomes relatively simpler compared to sorting general arrays. This type of problem is often referred to as the Dutch N…

320 views 15:59

Amazon Data Science Interview Question:
In a linear regression model, what are the key assumptions that need to be satisfied for the model to be valid? How would you evaluate whether these assumptions hold in your dataset?

This is also the most common question I see across companies!

So the assumptions are -

𝗟𝗶𝗻𝗲𝗮𝗿𝗶𝘁𝘆
The relationship between the independent variables (predictors) and the dependent variable is linear. This means that the effect of each predictor on the outcome is constant and additive.
How to evaluate - Scatter plots of each predictor vs. the dependent variable, and residual vs. fitted value plots. A curved or otherwise systematic pattern in the residuals signals non-linearity.
How to fix - Apply feature transformations (e.g., log, square root, polynomial) or use non-linear models.
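
As a rough illustration of the residual check, the sketch below fits a straight line to deliberately quadratic toy data and prints the residuals. The data and the helper name `fit_line` are assumptions made up for this example; in practice you would plot residuals against fitted values rather than read them off a list.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Toy data with a quadratic (non-linear) relationship.
x = [0, 1, 2, 3, 4, 5, 6]
y = [xi ** 2 for xi in x]

slope, intercept = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(slope, intercept)  # 6.0 -5.0
print(residuals)         # [5.0, 0.0, -3.0, -4.0, -3.0, 0.0, 5.0]
```

The residuals are positive at both ends and negative in the middle. That U shape is the kind of systematic pattern that suggests the linearity assumption is violated.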

𝗡𝗼𝗿𝗺𝗮𝗹𝗶𝘁𝘆 𝗼𝗳 𝗘𝗿𝗿𝗼𝗿𝘀
The residuals are normally distributed. This matters mainly for conducting statistical tests and constructing confidence intervals.
How to evaluate - Q-Q plot or histogram of the residuals, or a Shapiro-Wilk test. (Residual autocorrelation plots and the Durbin-Watson test check a related but separate assumption, independence of errors, which matters mostly for time-series data. For non-time-series data, independence can often be assumed if the data is randomly sampled.)
How to fix - Transform the dependent variable (log, Box-Cox) and/or check for outliers.
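
The Durbin-Watson statistic mentioned above is simple enough to compute by hand: it is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. A minimal sketch on made-up residual sequences:

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Made-up residuals: one sequence flips sign every step, one drifts slowly.
alternating = [1, -1, 1, -1, 1, -1]
trending = [-3, -2, -1, 1, 2, 3]

print(round(durbin_watson(alternating), 2))  # 3.33 (negative autocorrelation)
print(round(durbin_watson(trending), 2))     # 0.29 (positive autocorrelation)
```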

𝗛𝗼𝗺𝗼𝘀𝗰𝗲𝗱𝗮𝘀𝘁𝗶𝗰𝗶𝘁𝘆 (𝗖𝗼𝗻𝘀𝘁𝗮𝗻𝘁 𝗩𝗮𝗿𝗶𝗮𝗻𝗰𝗲 𝗼𝗳 𝗘𝗿𝗿𝗼𝗿𝘀)
The variance of the residuals (errors) is constant across all levels of the independent variables. In other words, the spread of residuals should not increase or decrease as the predicted values increase.
How to evaluate - Plot the residuals against fitted values. A "fan" shape (i.e., increasing or decreasing spread of residuals) indicates heteroscedasticity. The Breusch-Pagan test gives a formal check.
How to fix - Transform the dependent variable (log, Box-Cox), use weighted least squares regression, or report robust standard errors.
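
A quick numeric stand-in for eyeballing the "fan" shape is to compare the residual spread at low fitted values with the spread at high fitted values. The sketch below is a crude split-sample variance ratio, loosely inspired by the Goldfeld-Quandt idea and not a formal test; the helper name `spread_ratio` and the data are made up for illustration.

```python
from statistics import pvariance

def spread_ratio(fitted, residuals):
    """Residual variance in the upper half of fitted values over the lower half."""
    ordered = [r for _, r in sorted(zip(fitted, residuals))]
    half = len(ordered) // 2
    low, high = ordered[:half], ordered[-half:]
    return pvariance(high) / pvariance(low)

fitted = [1, 2, 3, 4, 5, 6, 7, 8]
fan = [0.1, -0.1, 0.2, -0.2, 1.0, -1.0, 2.0, -2.0]    # spread grows with fitted value
flat = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]   # constant spread

print(round(spread_ratio(fitted, fan), 1))   # 100.0
print(round(spread_ratio(fitted, flat), 1))  # 1.0
```

A ratio far from 1 hints at heteroscedasticity; a ratio near 1 is consistent with constant variance.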

𝗡𝗼 𝗠𝘂𝗹𝘁𝗶𝗰𝗼𝗹𝗹𝗶𝗻𝗲𝗮𝗿𝗶𝘁𝘆
The independent variables (predictors) are not highly correlated with each other. High correlation between predictors can lead to multicollinearity, which makes it difficult to determine the individual effect of each predictor on the dependent variable.
How to evaluate - Calculate the Variance Inflation Factor (VIF) for each predictor. Values above roughly 5 to 10 are a common rule-of-thumb warning sign.
How to fix - Remove highly correlated predictors, combine them into a single predictor (e.g., using Principal Component Analysis), or use regularized regression models like Ridge or Lasso regression.
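
For predictor j, VIF is 1 / (1 − R²), where R² comes from regressing predictor j on all the other predictors. With only two predictors this reduces to 1 / (1 − r²), with r the Pearson correlation between them, which is easy to sketch in plain Python. The data and function names below are illustrative assumptions; with more predictors you would need a full regression per predictor.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r ** 2)

x1 = [1, 2, 3, 4, 5]
x2_related = [1, 2, 3, 4, 6]     # almost a copy of x1
x2_unrelated = [2, 1, 0, 1, 2]   # uncorrelated with x1

print(round(vif_two_predictors(x1, x2_related), 1))    # 37.0
print(round(vif_two_predictors(x1, x2_unrelated), 1))  # 1.0
```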

470 views 08:38
