If tomorrow I were hit by a bus, got a concussion, and lost all of my memory, I would be SAD, but I'd still want to learn ML engineering from scratch. Here is how I would do it in 2024, mostly in order:
Learn math:
- linear algebra: Matrices, vectors, linear equations and transformations.
- probability and statistics: Distributions, hypothesis testing, Bias-Variance Tradeoff, and conditional probability.
- calculus: Derivatives, integrals, gradient descent.
Learn Python:
- data structures: Lists, arrays, and dictionaries.
- libraries: NumPy and pandas for data manipulation, scikit-learn for ML, and Matplotlib for data viz.
- code organization and control flow: Functions, loops, and conditionals.
Data Preprocessing:
- learn how to handle missing values, duplicates, and outliers
- read on feature engineering
- try out data normalization and scaling
Core ML Algorithms:
- Supervised Learning: Linear and Logistic Regression, and Classification algorithms (K-NN, Decision Trees, SVM)
- Unsupervised Learning: Clustering (K-means) and dimensionality reduction techniques (PCA)
- Reinforcement Learning: Basics of agents, environments, and reward systems
Model Evaluation and Validation:
- learn metrics like accuracy, precision, recall, and F1 score
- cross-validation techniques
- learn about overfitting and underfitting, and how to address them
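To make the metrics above concrete, here is a minimal sketch that computes accuracy, precision, recall, and F1 by hand for a binary classifier. The function name `classification_metrics` and the toy labels are made up for illustration; in practice you would likely reach for a library implementation instead.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy labels (hypothetical): 3 true positives, 1 false positive,
# 1 false negative, 3 true negatives.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels all four metrics come out to 0.75, which is a coincidence of the data; on imbalanced datasets accuracy and F1 can diverge sharply.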
Get Hands-On Experience with Deployment:
- use datasets from Kaggle and/or models from Hugging Face.
- learn to deploy with Flask, FastAPI, and on the cloud (AWS, Azure, or GCP)
- participate in ML competitions or workshops.
Digest everything before diving into advanced topics like:
- ML frameworks like PyTorch and TensorFlow
- neural networks, convolutional networks, and recurrent networks
- natural language processing (NLP) and computer vision (CV) applications, and ensemble techniques
Some frequently asked SQL questions, with answers, from data analyst interviews:
1. Write a SQL query to find the average purchase amount for each customer. Assume you have two tables: Customers (CustomerID, Name) and Orders (OrderID, CustomerID, Amount).
SELECT c.CustomerID, c.Name, AVG(o.Amount) AS AveragePurchase
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID
GROUP BY c.CustomerID, c.Name;
2. Write a query to find the employee with the minimum salary in each department from a table Employees with columns EmployeeID, Name, DepartmentID, and Salary.
SELECT e1.DepartmentID, e1.EmployeeID, e1.Name, e1.Salary
FROM Employees e1
WHERE e1.Salary = (SELECT MIN(e2.Salary) FROM Employees e2 WHERE e2.DepartmentID = e1.DepartmentID);
3. Write a SQL query to find all products that have never been sold. Assume you have a table Products (ProductID, ProductName) and a table Sales (SaleID, ProductID, Quantity).
SELECT p.ProductID, p.ProductName
FROM Products p
LEFT JOIN Sales s ON p.ProductID = s.ProductID
WHERE s.ProductID IS NULL;
4. Given a table Orders with columns OrderID, CustomerID, OrderDate, and a table OrderItems with columns OrderID, ItemID, Quantity, write a query to find the customer with the highest total order quantity.
SELECT o.CustomerID, SUM(oi.Quantity) AS TotalQuantity
FROM Orders o
JOIN OrderItems oi ON o.OrderID = oi.OrderID
GROUP BY o.CustomerID
ORDER BY TotalQuantity DESC
LIMIT 1;
5. Write a SQL query to find the earliest order date for each customer from a table Orders (OrderID, CustomerID, OrderDate).
SELECT CustomerID, MIN(OrderDate) AS EarliestOrderDate
FROM Orders
GROUP BY CustomerID;
6. Given a table Employees with columns EmployeeID, Name, ManagerID, write a query to find the number of direct reports for each manager.
SELECT ManagerID, COUNT(*) AS NumberOfReports
FROM Employees
WHERE ManagerID IS NOT NULL
GROUP BY ManagerID;
7. Given a table Customers with columns CustomerID, Name, JoinDate, and a table Orders with columns OrderID, CustomerID, OrderDate, write a query to find customers who placed their first order within the last 30 days.
SELECT c.CustomerID, c.Name
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID
WHERE o.OrderDate = (SELECT MIN(o2.OrderDate) FROM Orders o2 WHERE o2.CustomerID = c.CustomerID)
AND o.OrderDate >= CURRENT_DATE - INTERVAL '30 day';
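To try a couple of these queries without setting up a database server, one option is Python's built-in `sqlite3` module. The sketch below runs questions 1 and 3 against an in-memory database; the sample rows are made up, and an `ORDER BY` is added to the first query only so the printed output is stable. Note that the other answers may need adjusting for SQLite (for example, the `INTERVAL` syntax in question 7 is not SQLite syntax).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, Amount REAL);
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE Sales (SaleID INTEGER PRIMARY KEY, ProductID INTEGER, Quantity INTEGER);

-- Hypothetical sample data
INSERT INTO Customers VALUES (1, 'Asha'), (2, 'Ben');
INSERT INTO Orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 30.0);
INSERT INTO Products VALUES (1, 'Pen'), (2, 'Notebook'), (3, 'Stapler');
INSERT INTO Sales VALUES (1, 1, 5), (2, 2, 1);
""")

# Question 1: average purchase amount per customer.
avg_purchase = conn.execute("""
    SELECT c.CustomerID, c.Name, AVG(o.Amount) AS AveragePurchase
    FROM Customers c
    JOIN Orders o ON c.CustomerID = o.CustomerID
    GROUP BY c.CustomerID, c.Name
    ORDER BY c.CustomerID
""").fetchall()

# Question 3: products with no matching row in Sales.
never_sold = conn.execute("""
    SELECT p.ProductID, p.ProductName
    FROM Products p
    LEFT JOIN Sales s ON p.ProductID = s.ProductID
    WHERE s.ProductID IS NULL
""").fetchall()

print(avg_purchase)  # [(1, 'Asha', 75.0), (2, 'Ben', 30.0)]
print(never_sold)    # [(3, 'Stapler')]
conn.close()
```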
GridSearchCV vs RandomizedSearchCV in Machine Learning: Differences, Advantages & Disadvantages of Each, and Use Cases
1. GridSearchCV
- Definition: GridSearchCV is an exhaustive search over specified parameter values for an estimator. It uses cross-validation to evaluate the performance of each combination of parameter values.
How it Works:
- Parameter Grid: Define a grid of parameters to search over.
- Exhaustive Search: Evaluate all possible combinations of parameters in the grid.
- Cross-Validation: For each combination, perform cross-validation to assess the model's performance.
- Best Parameters: Select the combination that results in the best performance based on a predefined metric (e.g., accuracy, F1-score).
2. RandomizedSearchCV
- Definition: RandomizedSearchCV performs a random search over specified parameter values for an estimator. It samples a fixed number of parameter settings from the specified distributions.
How it Works:
- Parameter Distributions: Define distributions from which to sample parameter values.
- Random Sampling: Randomly sample a fixed number of parameter combinations.
- Cross-Validation: For each sampled combination, perform cross-validation to assess the model's performance.
- Best Parameters: Select the combination that results in the best performance based on a predefined metric.
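The two procedures above can be sketched in plain Python to show where they differ. This is only an illustration of the idea, not scikit-learn's implementation: `cv_score` is a hypothetical stand-in for a cross-validated score, and the parameter names `C` and `gamma` and their ranges are assumptions chosen for the example.

```python
import itertools
import random

def cv_score(params):
    """Hypothetical stand-in for a cross-validated score; higher is better."""
    return -((params["C"] - 3.0) ** 2) - (params["gamma"] - 0.2) ** 2

def grid_search(grid):
    """Score every combination in the grid and return the best one."""
    names = list(grid)
    candidates = [dict(zip(names, values))
                  for values in itertools.product(*(grid[n] for n in names))]
    return max(candidates, key=cv_score), len(candidates)

def random_search(samplers, n_iter, seed=0):
    """Score n_iter randomly sampled settings and return the best one."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    candidates = [{name: draw(rng) for name, draw in samplers.items()}
                  for _ in range(n_iter)]
    return max(candidates, key=cv_score), len(candidates)

grid = {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]}
samplers = {
    "C": lambda rng: rng.uniform(0.1, 10.0),
    "gamma": lambda rng: rng.uniform(0.01, 1.0),
}

best_grid, n_grid = grid_search(grid)
best_random, n_random = random_search(samplers, n_iter=5)
print(best_grid, n_grid)      # grid search evaluated 3 * 3 = 9 candidates
print(best_random, n_random)  # random search evaluated exactly 5
```

The grid search can only ever return values that were listed in the grid, while the random search can land anywhere in the sampled ranges, and its cost is set by `n_iter` rather than by the size of the search space.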
Advantages and Disadvantages
- GridSearchCV:
-- Advantages:
1. Exhaustive Search: Guarantees finding the optimal combination within the specified grid.
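2. Deterministic: Produces the same results for the same parameter grid and data, provided the estimator and the cross-validation splits are themselves deterministic (e.g., fixed random seeds).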
-- Disadvantages:
1. Computationally Expensive: Evaluates all combinations, which can be very slow for large grids.
2. Scalability Issues: Not feasible for high-dimensional parameter spaces.
- RandomizedSearchCV:
-- Advantages:
1. Efficiency: Can be faster than GridSearchCV by evaluating a fixed number of parameter combinations.
2. Scalability: More feasible for high-dimensional parameter spaces.
3. Exploration: Can potentially find good parameter combinations that GridSearchCV might miss due to its limited grid.
-- Disadvantages:
1. Non-Exhaustive: May not find the optimal combination if the number of iterations is too low.
2. Randomness: Results can vary between runs unless a random seed is set.
Use Cases
- GridSearchCV:
1. Small Parameter Spaces: Suitable when the parameter grid is small and computational resources are sufficient.
2. High Precision: When the goal is to find the exact optimal parameters within the defined grid.
3. Ample Time: When there is enough time to perform an exhaustive search.
- RandomizedSearchCV:
1. Large Parameter Spaces: Suitable for larger and high-dimensional parameter spaces where an exhaustive search is impractical.
2. Time Efficiency: When there is a need to balance between time and performance, providing a good solution quickly.
3. Exploratory Analysis: Useful in the early stages of model tuning to quickly identify promising parameter regions.
Yesterday's Llama 3.1 release marked a big milestone for LLM researchers and practitioners. Llama 3.1 405B is the biggest and most capable of the openly available LLMs. And particularly exciting is that the new Llama release comes with a 93-page research paper this time. Below, I want to share a few interesting facts from the paper, and I will likely write a longer analysis this weekend.
Model sizes
Llama 3.1 now comes in 3 sizes: 8B, 70B, and 405B parameters. The 8B and 70B variants are slight upgrades from the previous Llama 3 models that were released in April 2024. (See the figure below for a brief performance comparison.) The 405B model was used to improve the 8B and 70B via synthetic data during the finetuning stages.
Pretraining Data
The 93-page report by Meta (a link to the report is in the comments below) offers amazing detail. Particularly, the section on preparing the 15.6 trillion tokens for pretraining offers so much detail that it would make it possible to reproduce the dataset preparation. However, Meta doesn't share the dataset sources. All we know is that it's trained primarily on "web data." This is probably because of the usual copyright concerns and to prevent lawsuits.
Still, it's a great writeup if you plan to prepare your own pretraining datasets as it shares recipes on deduplication, formatting (removal of markdown markers), quality filters, removal of unsafe content, and more.
Long-context Support
The models support a context size of up to 128k tokens. The researchers achieved this via a multi-stage process. First, they pretrained the models on 8k context windows (due to resource constraints), followed by continued pretraining on longer windows of up to 128k tokens. In the continued pretraining, they increased the context length in 6 stages. Moreover, they also observed that the finetuning data needs to include 0.1% long-context instruction samples; otherwise, the long-context capabilities will decline.
Alignment
Contrary to earlier rumors, Llama 3 was not finetuned with both RLHF via proximal policy optimization (PPO) and direct preference optimization (DPO). Following a supervised instruction finetuning (SFT) stage, the models were trained only with DPO, not PPO. (Unfortunately, unlike in the Llama 2 paper, the researchers didn't include a chart analyzing the improvements made via this process.) Although they didn't use PPO, they used a reward model for rejection sampling during the instruction finetuning stage.
Inference
The 405B model required 16k H100 GPUs for training. During inference, the bfloat16 version of the model still requires 16 H100 GPUs. However, Meta also has an FP8 version that runs on a single server node (that is, 8xH100s).
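A rough back-of-the-envelope check shows why the GPU counts above are plausible. The sketch assumes 80 GB of memory per H100 and counts only the memory for the weights; it ignores the KV cache, activations, and framework overhead, so real requirements are higher than these numbers.

```python
import math

PARAMS = 405_000_000_000      # parameters in the 405B model
GPU_MEMORY_GB = 80            # assumption: 80 GB per H100
GPUS_PER_NODE = 8             # one server node, as in the post

def weights_gb(bytes_per_param):
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return PARAMS * bytes_per_param / 1e9

def nodes_needed(bytes_per_param):
    """Whole 8-GPU nodes needed just to hold the weights."""
    node_memory_gb = GPU_MEMORY_GB * GPUS_PER_NODE
    return math.ceil(weights_gb(bytes_per_param) / node_memory_gb)

for name, nbytes in [("bfloat16", 2), ("FP8", 1)]:
    nodes = nodes_needed(nbytes)
    print(f"{name}: {weights_gb(nbytes):.0f} GB of weights, "
          f"{nodes} node(s), {nodes * GPUS_PER_NODE} GPUs")
```

Under these assumptions the bfloat16 weights (810 GB) exceed the 640 GB of a single 8-GPU node, so two nodes (16 GPUs) are needed, while the FP8 weights (405 GB) fit on one node with room left over.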
Performance
You are probably curious about how it compares to other models. The short answer is "very favorably", on par with GPT-4.
Here are 5 beginner-friendly data science project ideas:
Loan Approval Prediction
Predict whether a loan will be approved based on customer demographic and financial data. This requires data preprocessing, feature engineering, and binary classification techniques.
Credit Card Fraud Detection
Detect fraudulent transactions in a dataset of credit card payments. This is a good project for learning about imbalanced datasets and anomaly detection methods.
Netflix Movies and TV Shows Analysis
Analyze Netflix's movies and TV shows to discover trends in ratings, popularity, and genre distributions. Visualization tools and exploratory data analysis are key components here.
Sentiment Analysis of Tweets
Analyze the sentiment of tweets to determine whether they are positive, negative, or neutral. This project involves natural language processing and working with text data.
Weather Data Analysis
Analyze historical weather data from the National Oceanic and Atmospheric Administration (NOAA) to look for seasonal trends, weather anomalies, or climate change indicators. This project involves time series analysis and data visualization.
https://youtu.be/ZOJvKbbc6cw
Hi guys, a lot of you have not subscribed to my channel yet. If you're reading this message, don't forget to subscribe to my channel and comment with your views. At least half of you, please go and subscribe.
Thank you in advance
YouTube
Find Customer Referee Leet Code | SQL Day 2 Where and COALESCE
Remove Consecutive Duplicates in Python from a list – Data Science With Ved
https://datasciencewithved.wordpress.com/2024/08/28/remove-consecutive-duplicates-in-python-from-a-list/
When working with lists in Python, you might encounter a situation where you need to remove consecutive duplicate elements. Whether you’re processing data from a file, cleaning up user inputs…
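One common way to do this in Python is with `itertools.groupby`, which groups runs of equal adjacent elements. This is a minimal sketch and not necessarily the approach taken in the linked post; the function name `remove_consecutive_duplicates` is made up for the example.

```python
from itertools import groupby

def remove_consecutive_duplicates(items):
    """Keep one element from each run of equal adjacent elements."""
    # groupby yields (key, group) once per run, so taking the keys
    # collapses each run to a single element while preserving order.
    return [key for key, _ in groupby(items)]

print(remove_consecutive_duplicates([1, 1, 2, 2, 2, 3, 1, 1]))  # [1, 2, 3, 1]
```

Note that the trailing `1` is kept: only adjacent repeats are dropped, which is different from removing all duplicates with something like `set`.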
📚 Understanding Linear Regression Through a Student’s Journey
Let’s take a trip back to your student days to understand linear regression, one of the most fundamental concepts in machine learning.
Alex, a dedicated student, is trying to predict their final exam score based on the number of hours they study each week. They gather data over the semester and notice a pattern—more hours studied generally leads to higher scores. To quantify this relationship, Alex uses linear regression.
What is Linear Regression?
Linear regression is like drawing a straight line through a scatterplot of data points that best predicts the dependent variable (exam scores) from the independent variable (study hours). The equation of the line looks like this:
Score = Intercept + Slope * Study Hours
Here, the intercept is the score Alex might expect with zero study hours (hopefully not too low!), and the slope shows how much the score increases with each additional hour of study.
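To see where the intercept and slope come from, here is a minimal sketch of an ordinary least-squares fit for one variable, written in plain Python. The study-hours data is made up for Alex's example, and `fit_line` is a hypothetical helper name.

```python
def fit_line(hours, scores):
    """Least-squares fit of: score = intercept + slope * hours."""
    n = len(hours)
    mean_x = sum(hours) / n
    mean_y = sum(scores) / n
    # Slope = covariance of (hours, scores) divided by variance of hours.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
    sxx = sum((x - mean_x) ** 2 for x in hours)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: weekly study hours and exam scores.
study_hours = [2, 4, 6, 8, 10]
exam_scores = [55, 62, 71, 78, 84]

intercept, slope = fit_line(study_hours, exam_scores)
predicted_7h = intercept + slope * 7
print(f"Score = {intercept:.1f} + {slope:.1f} * Study Hours")
print(f"Predicted score for 7 hours: {predicted_7h:.1f}")
```

On this toy data the fit is roughly Score = 47.8 + 3.7 * Study Hours, so each extra hour of study is associated with about 3.7 more points.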
Linear regression works under several assumptions:
1. Linearity: The relationship between study hours and exam scores should be linear. Each additional hour of study should raise the expected score by the same amount, whether it is the second hour or the twentieth. But what if the benefit of extra hours diminishes over time? That’s where the linearity assumption can break down.
2. Independence: Each data point (study hours vs. exam score) should be independent of others. If Alex’s friends start influencing their study habits, this assumption might be violated.
3. Homoscedasticity: The variance of errors (differences between predicted and actual scores) should be consistent across all levels of study hours. If Alex’s predictions are more accurate for students who study a little but less accurate for those who study a lot, this assumption doesn’t hold.
4. Normality of Errors: The errors should follow a normal distribution. If the errors are skewed, it might suggest that factors beyond study hours are influencing scores.
Despite its simplicity, linear regression isn’t perfect. Here are a few limitations of linear regression.
- Non-Linearity: If the relationship between study hours and exam scores isn’t linear (e.g., diminishing returns after a certain point), linear regression might not capture the true pattern.
- Outliers: A few students who study a lot but still score poorly can heavily influence the regression line, leading to misleading predictions.
- Overfitting: If Alex adds too many variables (like study environment, type of study material, etc.), the model might become too complex, fitting the noise rather than the true signal.
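The outlier limitation is easy to demonstrate with a small sketch. It fits a least-squares slope twice on made-up data: once on five well-behaved points, and once after adding a single hypothetical student who studied 12 hours but scored 40.

```python
def ols_slope(xs, ys):
    """Least-squares slope for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: weekly study hours and exam scores.
hours = [2, 4, 6, 8, 10]
scores = [55, 62, 71, 78, 84]

slope_clean = ols_slope(hours, scores)
# Add one outlier: many hours studied, poor score.
slope_with_outlier = ols_slope(hours + [12], scores + [40])

print(f"Slope without outlier: {slope_clean:.2f}")
print(f"Slope with outlier:    {slope_with_outlier:.2f}")
```

On this toy data one extra point drags the slope from about 3.7 points per hour down to roughly zero, which would wrongly suggest that studying barely matters.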
In Alex’s case, while linear regression provides a simple and interpretable model, it’s important to remember these assumptions and limitations. By understanding them, Alex can better assess when to rely on linear regression and when it might be necessary to explore more advanced methods.
🚨 Major Announcement: Mukesh Ambani to transform Rel'AI'ince into a deeptech company
He is focused on driving AI adoption across Reliance Industries Limited's operations through several initiatives:
➡️ Developing cost-effective generative AI models and partnering with tech companies to optimize AI inferencing
➡️ Introducing Jio Brain, a comprehensive suite of AI tools designed to enhance decision-making, predictions, and customer insights across Reliance’s ecosystem
➡️ Building a large-scale, AI-ready data center in Jamnagar, Gujarat, equipped with advanced AI inference facilities
➡️ Launching JioAI Cloud with a special Diwali offer of up to 100 GB of free cloud storage
➡️ Collaborating with Jio Institute to create AI programs for upskilling
➡️ Introducing "Hello Jio," a generative AI voice assistant integrated with JioTV OS to help users find content on Jio set-top boxes
➡️ Launching "JioPhoneCall AI," a feature that uses generative AI to transcribe, summarize, and translate phone calls.