Prepare for GATE: The Right Time is NOW!
GeeksforGeeks brings you everything you need to crack GATE 2026 – 900+ live hours, 300+ recorded sessions, and expert mentorship to keep you on track.
What’s inside?
✔ Live & recorded classes with India’s top educators
✔ 200+ mock tests to track your progress
✔ Study materials - PYQs, workbooks, formula book & more
✔ 1:1 mentorship & AI doubt resolution for instant support
✔ Interview prep for IITs & PSUs to help you land opportunities
Learn from Experts Like:
Satish Kumar Yadav – Trained 20K+ students
Dr. Khaleel – Ph.D. in CS, 29+ years of experience
Chandan Jha – Ex-ISRO, AIR 23 in GATE
Vijay Kumar Agarwal – M.Tech (NIT), 13+ years of experience
Sakshi Singhal – IIT Roorkee, AIR 56 CSIR-NET
Shailendra Singh – GATE 99.24 percentile
Devasane Mallesham – IIT Bombay, 13+ years of experience
Use code UPSKILL30 to get an extra 30% OFF (Limited time only)
📌 Enroll for a free counseling session now: https://gfgcdn.com/tu/UI2/
Python project-based interview questions for a data analyst role, along with tips and sample answers [Part-1]
1. Data Cleaning and Preprocessing
- Question: Can you walk me through the data cleaning process you followed in a Python-based project?
- Answer: In my project, I used Pandas for data manipulation. First, I handled missing values by imputing them with the median for numerical columns and the most frequent value for categorical columns using fillna(). I also removed outliers by setting a threshold based on the interquartile range (IQR). Additionally, I standardized numerical columns using StandardScaler from Scikit-learn and performed one-hot encoding for categorical variables using Pandas' get_dummies() function.
- Tip: Mention specific functions you used, like dropna(), fillna(), apply(), or replace(), and explain your rationale for selecting each method.
2. Exploratory Data Analysis (EDA)
- Question: How did you perform EDA in a Python project? What tools did you use?
- Answer: I used Pandas for data exploration, generating summary statistics with describe() and checking for correlations with corr(). For visualization, I used Matplotlib and Seaborn to create histograms, scatter plots, and box plots. For instance, I used sns.pairplot() to visually assess relationships between numerical features, which helped me detect potential multicollinearity. Additionally, I applied pivot tables to analyze key metrics by different categorical variables.
- Tip: Focus on how you used visualization tools like Matplotlib, Seaborn, or Plotly, and mention any specific insights you gained from EDA (e.g., data distributions, relationships, outliers).
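The cleaning and EDA steps described in answers 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the method of any specific project: the toy DataFrame and its column names (age, income, city) are made up, the 1.5 * IQR cutoff is a common convention rather than something the answer specifies, and the z-score is computed by hand with the population standard deviation (the same formula StandardScaler applies) so that the sketch needs only Pandas and NumPy.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are invented for illustration.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 41, 29, 120],
    "income": [40_000, 52_000, 61_000, np.nan, 48_000, 55_000],
    "city": ["Pune", "Delhi", None, "Delhi", "Pune", "Delhi"],
})
num_cols = ["age", "income"]
cat_cols = ["city"]

# Impute: median for numeric columns, most frequent value for categorical ones.
df[num_cols] = df[num_cols].fillna(df[num_cols].median())
for col in cat_cols:
    df[col] = df[col].fillna(df[col].mode()[0])

# Remove outliers: keep rows whose age lies within 1.5 * IQR of the quartiles.
q1, q3 = df["age"].quantile([0.25, 0.75])
iqr = q3 - q1
clean = df[df["age"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)].copy()

# EDA on the cleaned data: summary statistics, correlations, and a pivot table.
summary = clean[num_cols].describe()
corr = clean[num_cols].corr()
pivot = clean.pivot_table(values="income", index="city", aggfunc="mean")

# Standardize numeric columns (z-score with population std, ddof=0).
scaled = clean.copy()
scaled[num_cols] = (clean[num_cols] - clean[num_cols].mean()) / clean[num_cols].std(ddof=0)

# One-hot encode the categorical column.
encoded = pd.get_dummies(scaled, columns=cat_cols)

print(summary)
print(corr)
print(pivot)
print(encoded.head())
```

In an interview, being able to say why the median was preferred over the mean (it is less sensitive to the outlier in age) is the kind of rationale the tip above asks for.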
3. Pandas Operations
- Question: Can you explain a situation where you had to manipulate a large dataset in Python using Pandas?
- Answer: In a project, I worked with a dataset containing over a million rows. I optimized my operations by using vectorized column operations instead of Python loops, and reserved apply() with a lambda function for transformations that had no vectorized equivalent, since apply() still calls the function once per element. I used groupby() to aggregate data by multiple dimensions efficiently, and I leveraged merge() to join datasets on common keys.
- Tip: Emphasize your understanding of efficient data manipulation with Pandas, mentioning functions like groupby(), merge(), concat(), or pivot().
4. Data Visualization
- Question: How do you create visualizations in Python to communicate insights from data?
- Answer: I primarily use Matplotlib and Seaborn for static plots and Plotly for interactive dashboards. For example, in one project, I used sns.heatmap() to visualize the correlation matrix and sns.barplot() for comparing categorical data. For time-series data, I used Matplotlib to create line plots that displayed trends over time. When presenting the results, I tailored visualizations to the audience, ensuring clarity and simplicity.
- Tip: Mention the specific plots you created and how you customized them (e.g., adding labels, titles, adjusting axis scales). Highlight the importance of clear communication through visualization.
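As a rough sketch of how the Pandas operations from answer 3 feed the plots from answer 4, the example below builds a small synthetic dataset, derives a column with vectorized arithmetic, joins and aggregates it, and then draws a bar chart and a correlation heatmap. Everything here is an assumption made for illustration: the orders and stores tables, their column names, and the output filename are hypothetical, and the plots use plain Matplotlib (bar() and imshow()) in place of sns.barplot() and sns.heatmap() to keep the dependencies small.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical synthetic data; table and column names are invented.
rng = np.random.default_rng(0)
n = 1_000
orders = pd.DataFrame({
    "store_id": rng.integers(1, 5, size=n),  # store ids 1 to 4
    "units": rng.integers(1, 20, size=n),
    "unit_price": rng.uniform(5, 50, size=n).round(2),
})
stores = pd.DataFrame({
    "store_id": [1, 2, 3, 4],
    "region": ["North", "North", "South", "South"],
})

# Vectorized: one column-wise multiplication instead of a per-row apply().
orders["revenue"] = orders["units"] * orders["unit_price"]

# Join on the common key, then aggregate revenue per region.
merged = orders.merge(stores, on="store_id", how="left")
by_region = merged.groupby("region", as_index=False)["revenue"].sum()

fig, (ax_bar, ax_heat) = plt.subplots(1, 2, figsize=(10, 4))

# Bar chart comparing a metric across categories, with labels and a title.
ax_bar.bar(by_region["region"], by_region["revenue"])
ax_bar.set_title("Revenue by region")
ax_bar.set_xlabel("Region")
ax_bar.set_ylabel("Revenue")

# Correlation matrix drawn as a heatmap with a fixed -1 to 1 color scale.
corr = orders[["units", "unit_price", "revenue"]].corr()
im = ax_heat.imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
ax_heat.set_xticks(range(len(corr.columns)))
ax_heat.set_xticklabels(corr.columns, rotation=45, ha="right")
ax_heat.set_yticks(range(len(corr.columns)))
ax_heat.set_yticklabels(corr.columns)
ax_heat.set_title("Correlation matrix")
fig.colorbar(im, ax=ax_heat)

fig.tight_layout()
fig.savefig("summary_plots.png")
```

Fixing the color scale to the full -1 to 1 range keeps heatmaps comparable across datasets, which is one concrete customization worth mentioning alongside labels and titles.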
Like this post if you want the next part of this interview series 👍❤️
How to convert an image to PDF in Python
# Install the dependencies first:
#   pip3 install pillow img2pdf

# Python 3 program to convert an image to PDF
# using the img2pdf library

# importing necessary libraries
import img2pdf
from PIL import Image

# path of the input image
img_path = "Input.png"

# path of the output PDF
pdf_path = "file_pdf.pdf"

# opening the image; Pillow raises an error here if the file
# is missing or is not a recognizable image
with Image.open(img_path) as image:
    # converting the image file into PDF bytes using img2pdf
    pdf_bytes = img2pdf.convert(image.filename)

# creating (or overwriting) the PDF file and writing the bytes;
# the with-block closes the file automatically
with open(pdf_path, "wb") as file:
    file.write(pdf_bytes)

# output
print("Successfully made pdf file")