💡 Building a Simple Convolutional Neural Network (CNN)
Constructing a basic Convolutional Neural Network (CNN) is a fundamental step in deep learning for image processing. Using TensorFlow's Keras API, we can define a network with convolutional, pooling, and dense layers to classify images. This example sets up a simple CNN to recognize handwritten digits from the MNIST dataset.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
import numpy as np
# 1. Load and preprocess the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
# Reshape images for CNN: (batch_size, height, width, channels)
# MNIST images are 28x28 grayscale, so channels = 1
train_images = train_images.reshape((60000, 28, 28, 1)).astype('float32') / 255
test_images = test_images.reshape((10000, 28, 28, 1)).astype('float32') / 255
# 2. Define the CNN architecture
model = models.Sequential()
# First Convolutional Block
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
# Second Convolutional Block
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
# Flatten the 3D output to 1D for the Dense layers
model.add(layers.Flatten())
# Dense (fully connected) layers
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax')) # Output layer for 10 classes (digits 0-9)
# 3. Compile the model
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
# Print a summary of the model layers
model.summary()
# 4. Train the model (uncomment to run training)
# print("\nTraining the model...")
# model.fit(train_images, train_labels, epochs=5, batch_size=64, validation_split=0.1)
# 5. Evaluate the model (uncomment to run evaluation)
# print("\nEvaluating the model...")
# test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
# print(f"Test accuracy: {test_acc:.4f}")
Code explanation: This script defines a simple CNN using Keras. It loads and normalizes MNIST images. The Sequential model adds Conv2D layers for feature extraction, MaxPooling2D for downsampling, a Flatten layer to transition to 1D, and Dense layers for classification. The model is then compiled with an optimizer, loss function, and metrics, and a summary of its architecture is printed. Training and evaluation steps are included as commented-out examples.
#Python #DeepLearning #CNN #Keras #TensorFlow
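To see where the numbers in model.summary() come from, the layer shapes and parameter counts can be worked out by hand. The sketch below is plain Python and assumes the Keras defaults used in the script (stride 1 and 'valid' padding for Conv2D, pooling stride equal to the pool size). The helper names conv_out and pool_out are made up for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output height/width of a convolution ('valid' padding when padding=0)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, pool):
    """Output height/width of max pooling with stride equal to the pool size."""
    return size // pool


size = 28                      # MNIST images are 28x28
size = conv_out(size, 3)       # Conv2D(32, 3x3): 28 -> 26
size = pool_out(size, 2)       # MaxPooling2D(2x2): 26 -> 13
size = conv_out(size, 3)       # Conv2D(64, 3x3): 13 -> 11
size = pool_out(size, 2)       # MaxPooling2D(2x2): 11 -> 5
flat_units = size * size * 64  # Flatten: 5 * 5 * 64 = 1600

# Parameters per layer: (weights per output unit + 1 bias) * number of output units
params = {
    "conv2d_1": (3 * 3 * 1 + 1) * 32,
    "conv2d_2": (3 * 3 * 32 + 1) * 64,
    "dense_1": (flat_units + 1) * 64,
    "dense_2": (64 + 1) * 10,
}
total_params = sum(params.values())

print("Flattened units:", flat_units)
print("Parameters per layer:", params)
print("Total parameters:", total_params)
```

Most of the parameters sit in the first Dense layer, which is typical for small CNNs that flatten straight into a fully connected layer.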
━━━━━━━━━━━━━━━
By: @DataScienceM ✨
💡 Python: Simple K-Means Clustering Project
K-Means is a popular unsupervised machine learning algorithm used to partition n observations into k clusters, where each observation belongs to the cluster with the nearest mean (centroid). This simple project demonstrates K-Means on the classic Iris dataset using scikit-learn to group similar flowers based on their measurements.
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import numpy as np
# 1. Load the Iris dataset
iris = load_iris()
X = iris.data # Features (sepal length, sepal width, petal length, petal width)
y = iris.target # True labels (0, 1, 2 for different species) - not used by KMeans
# 2. (Optional but recommended) Scale the features
# K-Means is sensitive to the scale of features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
# 3. Define and train the K-Means model
# We know there are 3 species in Iris, so we set n_clusters=3
kmeans = KMeans(n_clusters=3, random_state=42, n_init=10) # n_init is important for robust results
kmeans.fit(X_scaled)
# 4. Get the cluster assignments for each data point
labels = kmeans.labels_
# 5. Get the coordinates of the cluster centroids
centroids = kmeans.cluster_centers_
# 6. Visualize the clusters (using first two features for simplicity)
plt.figure(figsize=(8, 6))
# Plot each cluster
colors = ['red', 'green', 'blue']
for i in range(3):
    plt.scatter(X_scaled[labels == i, 0], X_scaled[labels == i, 1],
                s=50, c=colors[i], label=f'Cluster {i+1}', alpha=0.7)
# Plot the centroids
plt.scatter(centroids[:, 0], centroids[:, 1],
            s=200, marker='X', c='black', label='Centroids', edgecolor='white')
plt.title('K-Means Clustering on Iris Dataset (Scaled Features)')
plt.xlabel('Scaled Sepal Length')
plt.ylabel('Scaled Sepal Width')
plt.legend()
plt.grid(True)
plt.show()
# You can also compare with true labels (for evaluation, not part of clustering process itself)
# print("True labels:", y)
# print("K-Means labels:", labels)
Code explanation: This script loads the Iris dataset, scales its features using StandardScaler, and then applies KMeans to group the data into 3 clusters. It visualizes the resulting clusters and their centroids using a scatter plot with the first two scaled features.
#Python #MachineLearning #KMeans #Clustering #DataScience
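For intuition about what KMeans does internally, here is a minimal NumPy sketch of the assign-then-update loop (Lloyd's algorithm). It is not scikit-learn's implementation: it uses plain random initialization instead of k-means++, runs only once, and the function name kmeans_sketch and the toy data are made up for this illustration.

```python
import numpy as np


def kmeans_sketch(X, k, n_iters=100, seed=0):
    """Tiny K-Means: alternate nearest-centroid assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points picked at random
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: distance from every point to every centroid
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: move each centroid to the mean of its points
        # (an empty cluster keeps its previous centroid)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving, so the assignments are stable
        centroids = new_centroids
    return labels, centroids


# Toy data: two well-separated blobs of 50 points each
rng = np.random.default_rng(42)
X_demo = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])
demo_labels, demo_centroids = kmeans_sketch(X_demo, k=2)
print("Centroids:\n", demo_centroids)
```

A single random start can land in a poor local optimum on harder data, which is why the scikit-learn call in the script sets n_init=10 and keeps the best of several runs.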
━━━━━━━━━━━━━━━
By: @DataScienceM ✨
🤖🧠 Reflex: Build Full-Stack Web Apps in Pure Python — Fast, Flexible and Powerful
🗓️ 29 Oct 2025
📚 AI News & Trends
Building modern web applications has traditionally required mastering multiple languages and frameworks from JavaScript for the frontend to Python, Java or Node.js for the backend. For many developers, switching between different technologies can slow down productivity and increase complexity. Reflex eliminates that problem. It is an innovative open-source full-stack web framework that allows developers to ...
#Reflex #FullStack #WebDevelopment #Python #OpenSource #WebApps
💡 Pandas Cheatsheet
A quick guide to essential Pandas operations for data manipulation, focusing on creating, selecting, filtering, and grouping data in a DataFrame.
1. Creating a DataFrame
The primary data structure in Pandas is the DataFrame. It's often created from a dictionary.
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 32, 28],
'City': ['New York', 'Paris', 'New York']}
df = pd.DataFrame(data)
print(df)
# Name Age City
# 0 Alice 25 New York
# 1 Bob 32 Paris
# 2 Charlie 28 New York
• A dictionary is defined where keys become column names and values become the data in those columns. pd.DataFrame() converts it into a tabular structure.
2. Selecting Data with .loc and .iloc
Use .loc for label-based selection and .iloc for integer-position based selection.
# Select the first row by its integer position (0)
print(df.iloc[0])
# Select the row with index label 1 and only the 'Name' column
print(df.loc[1, 'Name'])
# Output for df.iloc[0]:
# Name Alice
# Age 25
# City New York
# Name: 0, dtype: object
#
# Output for df.loc[1, 'Name']:
# Bob
• .iloc[0] gets all data from the row at index position 0.
• .loc[1, 'Name'] gets the data at the intersection of index label 1 and column label 'Name'.
3. Filtering Data
Select subsets of data based on conditions.
# Select rows where Age is greater than 27
filtered_df = df[df['Age'] > 27]
print(filtered_df)
# Name Age City
# 1 Bob 32 Paris
# 2 Charlie 28 New York
• The expression df['Age'] > 27 creates a boolean Series (True/False).
• Using this Series as an index df[...] returns only the rows where the value was True.
4. Grouping and Aggregating
The "group by" operation involves splitting data into groups, applying a function, and combining the results.
# Group by 'City' and calculate the mean age for each city
city_ages = df.groupby('City')['Age'].mean()
print(city_ages)
# City
# New York 26.5
# Paris 32.0
# Name: Age, dtype: float64
• .groupby('City') splits the DataFrame into groups based on unique city values.
• ['Age'].mean() then calculates the mean of the 'Age' column for each of these groups.
#Python #Pandas #DataAnalysis #DataScience #Programming
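The same split-apply-combine pattern can return several statistics at once. As a small sketch on the cheatsheet's own data, .agg() accepts a list of aggregation names and produces one column per statistic. The variable name summary is just for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 32, 28],
    'City': ['New York', 'Paris', 'New York'],
})

# One row per city, one column per aggregation
summary = df.groupby('City')['Age'].agg(['mean', 'min', 'max', 'count'])
print(summary)

# Individual values can be read back with .loc (row label, column label)
print(summary.loc['New York', 'mean'])
```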
━━━━━━━━━━━━━━━
By: @DataScienceM ✨
💡 SciPy: Scientific Computing in Python
SciPy is a fundamental library for scientific and technical computing in Python. Built on NumPy, it provides a wide range of user-friendly and efficient numerical routines for tasks like optimization, integration, linear algebra, and statistics.
import numpy as np
from scipy.optimize import minimize
# Define a function to minimize: f(x) = (x - 3)^2
def f(x):
    return (x - 3)**2
# Find the minimum of the function with an initial guess
res = minimize(f, x0=0)
print(f"Minimum found at x = {res.x[0]:.4f}")
# Output:
# Minimum found at x = 3.0000
• Optimization: scipy.optimize.minimize is used to find the minimum value of a function.
• We provide the function (f) and an initial guess (x0=0).
• The result object (res) contains the solution in the .x attribute.
from scipy.integrate import quad
# Define the function to integrate: f(x) = sin(x)
def integrand(x):
    return np.sin(x)
# Integrate sin(x) from 0 to pi
result, error = quad(integrand, 0, np.pi)
print(f"Integral result: {result:.4f}")
print(f"Estimated error: {error:.2e}")
# Output:
# Integral result: 2.0000
# Estimated error: 2.22e-14
• Numerical Integration: scipy.integrate.quad calculates the definite integral of a function over a given interval.
• It returns a tuple containing the integral result and an estimate of the absolute error.
from scipy.linalg import solve
# Solve the linear system Ax = b
# 3x + 2y = 12
# x - y = 1
A = np.array([[3, 2], [1, -1]])
b = np.array([12, 1])
solution = solve(A, b)
print(f"Solution (x, y): {solution}")
# Output:
# Solution (x, y): [2.8 1.8]
• Linear Algebra: scipy.linalg provides more advanced linear algebra routines than NumPy.
• solve(A, b) efficiently finds the solution vector x for a system of linear equations defined by a matrix A and a vector b.
from scipy import stats
# Create two independent samples
sample1 = np.random.normal(loc=5, scale=2, size=100)
sample2 = np.random.normal(loc=5.5, scale=2, size=100)
# Perform an independent t-test
t_stat, p_value = stats.ttest_ind(sample1, sample2)
print(f"T-statistic: {t_stat:.4f}")
print(f"P-value: {p_value:.4f}")
# Output (will vary):
# T-statistic: -1.7432
# P-value: 0.0829
• Statistics: scipy.stats is a powerful module for statistical analysis.
• ttest_ind calculates the T-test for the means of two independent samples.
• The p-value helps determine if the difference between sample means is statistically significant (a low p-value, e.g., < 0.05, suggests it is).
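Because the samples above are drawn at random, the t-statistic and p-value change on every run. A small sketch of one way to make the experiment repeatable is to draw from a seeded NumPy generator and compare the p-value against a chosen threshold. The helper run_ttest, the seed value, and the 0.05 threshold are choices made for this illustration.

```python
import numpy as np
from scipy import stats


def run_ttest(seed):
    """Draw two seeded samples and return (t_stat, p_value) as plain floats."""
    rng = np.random.default_rng(seed)
    sample_a = rng.normal(loc=5.0, scale=2.0, size=100)
    sample_b = rng.normal(loc=5.5, scale=2.0, size=100)
    t_stat, p_value = stats.ttest_ind(sample_a, sample_b)
    return float(t_stat), float(p_value)


t_stat, p_value = run_ttest(seed=0)
print(f"T-statistic: {t_stat:.4f}")
print(f"P-value: {p_value:.4f}")

alpha = 0.05  # significance threshold chosen before looking at the data
if p_value < alpha:
    print("The difference in means is statistically significant at this level.")
else:
    print("No statistically significant difference detected at this level.")
```

Running run_ttest with the same seed returns the same numbers every time, which makes the result easy to share and check.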
#SciPy #Python #DataScience #ScientificComputing #Statistics
━━━━━━━━━━━━━━━
By: @DataScienceM ✨
#CNN #DeepLearning #Python #Tutorial
Lesson: Building a Convolutional Neural Network (CNN) for Image Classification
This lesson will guide you through building a CNN from scratch using TensorFlow and Keras to classify images from the CIFAR-10 dataset.
---
Part 1: Setup and Data Loading
First, we import the necessary libraries and load the CIFAR-10 dataset. This dataset contains 60,000 32x32 color images in 10 classes.
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
import numpy as np
# Load the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()
# Check the shape of the data
print("Training data shape:", x_train.shape)
print("Test data shape:", x_test.shape)
#TensorFlow #Keras #DataLoading
---
Part 2: Data Exploration and Preprocessing
We need to prepare the data before feeding it to the network. This involves:
• Normalization: Scaling pixel values from the 0-255 range to the 0-1 range.
• One-Hot Encoding: Converting class vectors (integers) to a binary matrix.
Let's also visualize some images to understand our data.
# Define class names for CIFAR-10
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
# Visualize a few images
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(x_train[i])
    plt.xlabel(class_names[y_train[i][0]])
plt.show()
# Normalize pixel values to be between 0 and 1
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
# One-hot encode the labels
y_train = tf.keras.utils.to_categorical(y_train, num_classes=10)
y_test = tf.keras.utils.to_categorical(y_test, num_classes=10)
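To make the one-hot step concrete, here is a rough NumPy equivalent of what the encoding produces: each integer label becomes a row with a single 1 at that label's position. This is only an illustration of the idea, not the Keras implementation, and the helper name one_hot is made up.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1 at the label index."""
    # CIFAR-10 labels arrive with shape (n, 1), so flatten them to (n,) first
    labels = np.asarray(labels).reshape(-1)
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype='float32')[labels]


example = one_hot([[3], [0], [9]], num_classes=10)
print(example.shape)  # (3, 10)
print(example[0])     # the only 1 is at index 3, the 'cat' position in class_names
```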
#DataPreprocessing #Normalization #Visualization
---
Part 3: Building the CNN Model
Now, we'll construct our CNN model. A common architecture consists of a stack of Conv2D and MaxPooling2D layers, followed by Dense layers for classification.
• Conv2D: Extracts features (like edges, corners) from the input image.
• MaxPooling2D: Reduces the spatial dimensions (downsampling), which helps in making the feature detection more robust.
• Flatten: Converts the 2D feature maps into a 1D vector.
• Dense: A standard fully-connected neural network layer.
model = models.Sequential()
# Convolutional Base
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
# Flatten and Dense Layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax')) # 10 output classes
# Print the model summary
model.summary()
#ModelBuilding #CNN #KerasLayers
---
Part 4: Compiling the Model
Before training, we need to configure the learning process. This is done via the compile() method, which requires:
• Optimizer: An algorithm to update the model's weights (e.g., 'adam').
• Loss Function: A function to measure how inaccurate the model is during training (e.g., 'categorical_crossentropy' for multi-class classification).
• Metrics: Used to monitor the training and testing steps (e.g., 'accuracy').
model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
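As a rough illustration of what the 'categorical_crossentropy' loss measures, the NumPy sketch below computes it by hand for one-hot targets: the loss for a sample is the negative log of the probability the model assigned to the true class. The clipping constant eps and the toy predictions are assumptions for this example, and Keras' own implementation may differ in details.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot targets and predicted probabilities."""
    # Clip so that log(0) can never occur
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.sum(y_true * np.log(y_pred), axis=1)


# Two samples, three classes (toy numbers)
y_true = np.array([[0.0, 0.0, 1.0],
                   [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.2, 0.7],
                   [0.5, 0.3, 0.2]])

losses = categorical_crossentropy(y_true, y_pred)
print(losses)         # -log(0.7) and -log(0.5), roughly 0.357 and 0.693
print(losses.mean())  # averaged over the batch
```

A confident correct prediction gives a loss near 0, while a low probability on the true class is penalized heavily.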
#ModelCompilation #Optimizer #LossFunction
---
#YOLOv8 #ComputerVision #ObjectDetection #IndustrialAI #Python
Applying YOLOv8 for Industrial Automation: Counting Plastic Bottles
This lesson will guide you through a complete computer vision project using YOLOv8. The goal is to detect and count plastic bottles in an image from an industrial setting, such as a conveyor belt or a storage area.
---
Step 1: Setup and Installation
First, we need to install the necessary libraries. The ultralytics library provides the YOLOv8 model, and opencv-python is essential for image processing tasks.
#Setup #Installation
# Open your terminal or command prompt and run this command:
pip install ultralytics opencv-python
---
Step 2: Loading the Model and the Target Image
We will load a pre-trained YOLOv8 model. These models are trained on the large COCO dataset, so they can already identify common objects like 'bottle'. Then, we'll load our industrial image. Ensure you have an image named factory_bottles.jpg in your project folder.
#ModelLoading #DataHandling
import cv2
from ultralytics import YOLO
# Load a pre-trained YOLOv8 model (yolov8n.pt is the smallest and fastest)
model = YOLO('yolov8n.pt')
# Load the image from the industrial setting
image_path = 'factory_bottles.jpg' # Make sure this image is in your directory
img = cv2.imread(image_path)
# A quick check to ensure the image was loaded correctly
if img is None:
    print(f"Error: Could not load image at {image_path}")
else:
    print("YOLOv8 model and image loaded successfully.")
---
Step 3: Performing Detection on the Image
With the model and image loaded, we can now run the detection. The ultralytics library makes this process incredibly simple. The model will analyze the image and identify all the objects it recognizes.
#Inference #ObjectDetection
# Run the model on the image to get detection results
results = model(img)
print("Detection complete. Processing results...")
---
Step 4: Filtering and Counting the Bottles
The model detects many types of objects. Our task is to go through the results, filter for only the 'bottle' class, and count how many there are. We'll also store the locations (bounding boxes) of each detected bottle for visualization.
#DataProcessing #Filtering
# Initialize a counter for the bottles
bottle_count = 0
bottle_boxes = []
# The model returns a list of results, so we loop through it
for result in results:
    # Each result has a 'boxes' attribute with the detections
    boxes = result.boxes
    for box in boxes:
        # Get the class ID of the detected object
        class_id = int(box.cls)
        # Check if the class name is 'bottle'
        if model.names[class_id] == 'bottle':
            bottle_count += 1
            # Store the bounding box coordinates (x1, y1, x2, y2)
            bottle_boxes.append(box.xyxy[0])
print(f"Total plastic bottles detected: {bottle_count}")
---
Step 5: Visualizing the Results
A number is good, but seeing what the model detected is better. We will draw the bounding boxes and the final count directly onto the image to create a clear visual output.
#Visualization #OpenCV
#Pandas #DataAnalysis #Python #DataScience #Tutorial
Top 30 Pandas Functions & Methods
This lesson covers 30 essential Pandas functions for data manipulation and analysis, each with a standalone example and its output.
---
1. pd.DataFrame()
Creates a new DataFrame (a 2D labeled data structure) from various inputs like dictionaries or lists.
import pandas as pd
data = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data)
print(df)
col1 col2
0 1 3
1 2 4
---
2. pd.Series()
Creates a new Series (a 1D labeled array).
import pandas as pd
s = pd.Series([10, 20, 30, 40], name='MyNumbers')
print(s)
0 10
1 20
2 30
3 40
Name: MyNumbers, dtype: int64
---
3. pd.read_csv()
Reads data from a CSV file into a DataFrame. (Assuming a file data.csv exists).
import pandas as pd
# Create a dummy csv file first
with open('data.csv', 'w') as f:
    f.write('Name,Age\nAlice,25\nBob,30')
df = pd.read_csv('data.csv')
print(df)
Name Age
0 Alice 25
1 Bob 30
---
4. df.to_csv()
Writes a DataFrame to a CSV file.
import pandas as pd
df = pd.DataFrame({'Name': ['Charlie'], 'Age': [35]})
# index=False prevents writing the DataFrame index to the file
df.to_csv('output.csv', index=False)
# You can check that 'output.csv' has been created.
print("File 'output.csv' created.")
File 'output.csv' created.
#PandasIO #DataFrame #Series
---
5. df.head()
Returns the first n rows of the DataFrame (default is 5).
import pandas as pd
data = {'Name': ['A', 'B', 'C', 'D', 'E', 'F'], 'Value': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
print(df.head(3))
Name Value
0 A 1
1 B 2
2 C 3
---
6. df.tail()
Returns the last n rows of the DataFrame (default is 5).
import pandas as pd
data = {'Name': ['A', 'B', 'C', 'D', 'E', 'F'], 'Value': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
print(df.tail(2))
Name Value
4 E 5
5 F 6
---
7. df.info()
Provides a concise summary of the DataFrame, including data types and non-null values.
import pandas as pd
import numpy as np
data = {'col1': [1, 2, 3], 'col2': [4.0, 5.0, np.nan], 'col3': ['A', 'B', 'C']}
df = pd.DataFrame(data)
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 col1 3 non-null int64
1 col2 2 non-null float64
2 col3 3 non-null object
dtypes: float64(1), int64(1), object(1)
memory usage: 200.0+ bytes
---
8. df.shape
Returns a tuple representing the dimensionality (rows, columns) of the DataFrame.
import pandas as pd
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
print(df.shape)
(2, 3)
#DataInspection #PandasBasics
---
9. df.describe()
Generates descriptive statistics for numerical columns (count, mean, std, min, max, etc.).
import pandas as pd
df = pd.DataFrame({'Age': [22, 38, 26, 35, 29]})
print(df.describe())