# Initialize the TF-IDF Vectorizer
vectorizer = TfidfVectorizer()
# Fit the vectorizer on the training data and transform it
X_train_tfidf = vectorizer.fit_transform(X_train)
# Only transform the test data using the already-fitted vectorizer
X_test_tfidf = vectorizer.transform(X_test)
print("Shape of training data vectors:", X_train_tfidf.shape)
print("Shape of testing data vectors:", X_test_tfidf.shape)
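The split between fitting on training data and only transforming test data can be illustrated with a toy example. The sketch below is not TF-IDF and not scikit-learn: `TinyCountVectorizer` is a hypothetical word-count stand-in, used only to show that the vocabulary is learned once from the training texts and that words first seen at test time are ignored.

```python
class TinyCountVectorizer:
    """Toy stand-in for a vectorizer: learns a vocabulary, then counts words."""

    def fit(self, texts):
        # The vocabulary comes from the training texts only
        words = sorted({w for text in texts for w in text.lower().split()})
        self.vocabulary_ = {word: i for i, word in enumerate(words)}
        return self

    def transform(self, texts):
        rows = []
        for text in texts:
            row = [0] * len(self.vocabulary_)
            for w in text.lower().split():
                idx = self.vocabulary_.get(w)
                if idx is not None:  # words unseen during fit are ignored
                    row[idx] += 1
            rows.append(row)
        return rows


# Made-up texts for illustration
train = ["great movie", "boring movie"]
test = ["great plot"]

vec = TinyCountVectorizer().fit(train)
print(vec.vocabulary_)
print(vec.transform(test))
```

Because the test vectors reuse the training vocabulary, they have the same number of columns as the training vectors, which is what the model expects at prediction time.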
---
Step 5: Training the NLP Model
Now we can train a machine learning model. Multinomial Naive Bayes is a simple yet powerful algorithm that works very well for text classification tasks.
#ModelTraining #NaiveBayes
# Initialize and train the Naive Bayes classifier
model = MultinomialNB()
model.fit(X_train_tfidf, y_train)
print("Model training complete.")
---
Step 6: Making Predictions and Evaluating the Model
With our model trained, let's use it to make predictions on our unseen test data and see how well it performs.
#Evaluation #ModelPerformance #Prediction
# Make predictions on the test set
y_pred = model.predict(X_test_tfidf)
# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy * 100:.2f}%\n")
# Display a detailed classification report
print("Classification Report:")
print(classification_report(y_test, y_pred, target_names=['Negative', 'Positive']))
---
Step 7: Discussion of Results
#Results #Discussion
Our model achieved 100% accuracy on this very small test set.
Accuracy: This is the percentage of correct predictions. 100% is perfect, but this is expected on such a tiny, clean dataset. In the real world, an accuracy of 85-95% is often considered very good.
Precision: Of all the times the model predicted "Positive", what percentage were actually positive?
Recall: Of all the actual "Positive" texts, what percentage did the model correctly identify?
F1-Score: The harmonic mean of Precision and Recall. It is high only when both are high.
Limitations: Our dataset is extremely small. A real model would need thousands of examples to be reliable and generalize well to new, unseen text.
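To make the metric definitions above concrete, here is a minimal sketch that computes them by hand for the "Positive" class. The labels in `y_true` and `y_pred` are made-up illustration data, not results from this tutorial's model, and the helper name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up labels for illustration: 1 = Positive, 0 = Negative
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
precision, recall, f1 = binary_metrics(y_true, y_pred)
print(f"Precision: {precision:.2f}  Recall: {recall:.2f}  F1: {f1:.2f}")
```

In practice, classification_report already reports these values per class; the sketch only shows where the numbers come from.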
---
Step 8: Testing the Model on New Sentences
Let's see how our complete pipeline works on brand new text.
#RealWorldNLP #Inference
# Function to predict sentiment of a new sentence
def predict_sentiment(sentence):
    # 1. Preprocess the text
    processed_sentence = preprocess_text(sentence)
    # 2. Vectorize the text using the SAME vectorizer
    vectorized_sentence = vectorizer.transform([processed_sentence])
    # 3. Make a prediction
    prediction = model.predict(vectorized_sentence)
    # 4. Return the result
    return "Positive" if prediction[0] == 1 else "Negative"
# Test with new sentences
new_sentence_1 = "The movie was absolutely amazing!"
new_sentence_2 = "I was very bored and did not like it."
print(f"'{new_sentence_1}' -> Sentiment: {predict_sentiment(new_sentence_1)}")
print(f"'{new_sentence_2}' -> Sentiment: {predict_sentiment(new_sentence_2)}")
━━━━━━━━━━━━━━━
By: @CodeProgrammer ✨
#YOLOv8 #ComputerVision #TrafficManagement #Python #AI #SmartCity
Lesson: Detecting Traffic Congestion in Road Lanes with YOLOv8
This tutorial will guide you through building a system to monitor traffic on a highway from a video feed. We'll use YOLOv8 to detect vehicles and then define specific zones (lanes) to count the number of vehicles within them, determining if a lane is congested.
---
#Step 1: Project Setup and Dependencies
We need to install ultralytics for YOLOv8 and opencv-python for video and image processing. numpy is also essential for handling the coordinates of our detection zones.
pip install ultralytics opencv-python numpy
Create a Python script (e.g., traffic_monitor.py) and import the necessary libraries.
import cv2
import numpy as np
from ultralytics import YOLO
# Hashtags: #Setup #Python #OpenCV #YOLOv8
---
#Step 2: Model Loading and Lane Definition
We'll load a pre-trained YOLOv8 model, which is excellent at detecting common objects like cars, trucks, and buses. The most critical part of this step is defining the zones of interest (our lanes) as polygons on the video frame. You will need to adjust these coordinates to match the perspective of your specific video.
You will also need a video file, for example, traffic_video.mp4.
# Load a pre-trained YOLOv8 model (yolov8n.pt is small and fast)
model = YOLO('yolov8n.pt')
# Path to your video file
VIDEO_PATH = 'traffic_video.mp4'
# Define the polygons for two lanes.
# IMPORTANT: You MUST adjust these coordinates for your video's perspective.
# Each polygon is a numpy array of [x, y] coordinates.
LANE_1_POLYGON = np.array([[20, 400], [450, 400], [450, 250], [20, 250]], np.int32)
LANE_2_POLYGON = np.array([[500, 400], [980, 400], [980, 250], [500, 250]], np.int32)
# Define the congestion threshold. If vehicle count > this, the lane is congested.
CONGESTION_THRESHOLD = 10
# Hashtags: #Configuration #AIModel #SmartCity
---
#Step 3: Main Loop for Detection and Counting
This is the core of our program. We will loop through each frame of the video, run vehicle detection, and then check if the center of each detected vehicle falls inside our predefined lane polygons. We will keep a count for each lane.
cap = cv2.VideoCapture(VIDEO_PATH)
while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    # Run YOLOv8 inference on the frame
    results = model(frame)
    # Initialize vehicle counts for each lane for the current frame
    lane_1_count = 0
    lane_2_count = 0
    # Process detection results
    for r in results:
        for box in r.boxes:
            # Check if the detected object is a vehicle
            class_id = int(box.cls[0])
            class_name = model.names[class_id]
            # The pre-trained COCO model names this class 'motorcycle', not 'motorbike'
            if class_name in ['car', 'truck', 'bus', 'motorcycle']:
                # Get bounding box coordinates
                x1, y1, x2, y2 = map(int, box.xyxy[0])
                # Calculate the center point of the bounding box
                center_x = (x1 + x2) // 2
                center_y = (y1 + y2) // 2
                # Check if the center point is inside Lane 1
                if cv2.pointPolygonTest(LANE_1_POLYGON, (center_x, center_y), False) >= 0:
                    lane_1_count += 1
                # Check if the center point is inside Lane 2
                elif cv2.pointPolygonTest(LANE_2_POLYGON, (center_x, center_y), False) >= 0:
                    lane_2_count += 1
# Hashtags: #RealTime #ObjectDetection #VideoProcessing
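The center-in-polygon check is the step that assigns a vehicle to a lane. As a rough illustration of what that test decides, here is a pure-Python ray-casting sketch. It is a stand-in for cv2.pointPolygonTest, not its implementation; the helper name `point_in_polygon` and the sample bounding box are hypothetical, and points lying exactly on an edge may be classified differently than OpenCV would classify them.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Same corner coordinates as LANE_1_POLYGON in Step 2
lane_1 = [(20, 400), (450, 400), (450, 250), (20, 250)]

# Center of a hypothetical bounding box (x1, y1, x2, y2)
box = (200, 300, 260, 340)
center = ((box[0] + box[2]) // 2, (box[1] + box[3]) // 2)

print(center, point_in_polygon(center, lane_1))
print((600, 300), point_in_polygon((600, 300), lane_1))
```

Using the box center rather than the whole box keeps the test cheap, at the cost of occasionally assigning a wide vehicle that straddles two lanes to only one of them.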
(Note: The code below should be placed inside the while loop of Step 3.)
---
#Step 4: Visualization and Displaying Results
After counting the vehicles, we need to visualize the results on the video frame. We will draw the lane polygons, display the vehicle count for each, and change the lane's color to red if it is considered congested based on our threshold.
    # --- Visualization --- (This code continues inside the while loop)
    # Draw the lane polygons on the frame (OpenCV colors are BGR, so yellow is (0, 255, 255))
    cv2.polylines(frame, [LANE_1_POLYGON], isClosed=True, color=(0, 255, 255), thickness=2)
    cv2.polylines(frame, [LANE_2_POLYGON], isClosed=True, color=(0, 255, 255), thickness=2)
    # Check for congestion and display status for Lane 1
    if lane_1_count > CONGESTION_THRESHOLD:
        status_1 = "CONGESTED"
        color_1 = (0, 0, 255)  # Red
    else:
        status_1 = "NORMAL"
        color_1 = (0, 255, 0)  # Green
    cv2.putText(frame, f"Lane 1: {lane_1_count} ({status_1})", (50, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, color_1, 2)
    # Check for congestion and display status for Lane 2
    if lane_2_count > CONGESTION_THRESHOLD:
        status_2 = "CONGESTED"
        color_2 = (0, 0, 255)  # Red
    else:
        status_2 = "NORMAL"
        color_2 = (0, 255, 0)  # Green
    cv2.putText(frame, f"Lane 2: {lane_2_count} ({status_2})", (530, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, color_2, 2)
    # Display the frame with detections and status
    cv2.imshow("Traffic Congestion Monitor", frame)
    # Press 'q' to stop early
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# After the loop: release the video and close the window
cap.release()
cv2.destroyAllWindows()
# Hashtags: #DataVisualization #OpenCV #TrafficFlow
---
#Step 5: Results and Discussion
When you run the script, a video window will appear. You will see:
• Yellow polygons outlining the defined lanes.
• Text at the top indicating the number of vehicles in each lane and its status ("NORMAL" or "CONGESTED").
• The status text and its color will change in real-time based on the vehicle count exceeding the CONGESTION_THRESHOLD.
Discussion of Results:
Threshold is Key: The CONGESTION_THRESHOLD is the most important variable to tune. A value of 10 might be too high for a short lane or too low for a long one. It must be calibrated based on the specific camera view and what is considered "congested" for that road.
Polygon Accuracy: The system's accuracy is highly dependent on how well you define the lane polygon coordinates (LANE_1_POLYGON and LANE_2_POLYGON). They must accurately map to the lanes in the video, accounting for perspective.
Limitations: This method only measures vehicle density (number of vehicles in an area). It does not measure traffic flow (vehicle speed). A lane could have many cars moving quickly (high density, but not congested) or a few stopped cars (low density, but very congested).
Potential Improvements:
Object Tracking: Implement an object tracker (like DeepSORT or BoT-SORT) to assign a unique ID to each car. This would allow you to calculate the average speed of vehicles within each lane, providing a much more reliable measure of congestion.
Time-Based Analysis: Analyze data over time. A lane that is consistently above the threshold for more than a minute is a stronger indicator of a traffic jam than a brief spike in vehicle count.
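The time-based idea can be sketched without any video at all. The class below is a hypothetical helper, not part of the tutorial's script: it reports congestion only after the per-frame count has stayed above the threshold for a minimum duration, and the sample data is made up.

```python
class SustainedCongestionDetector:
    """Reports congestion only when the count stays above a threshold long enough."""

    def __init__(self, threshold, min_duration_seconds):
        self.threshold = threshold
        self.min_duration_seconds = min_duration_seconds
        self.above_since = None  # time the count first exceeded the threshold

    def update(self, timestamp, count):
        """Feed one count sample; return True once congestion is sustained."""
        if count > self.threshold:
            if self.above_since is None:
                self.above_since = timestamp
            return timestamp - self.above_since >= self.min_duration_seconds
        # Any dip to or below the threshold resets the timer
        self.above_since = None
        return False


detector = SustainedCongestionDetector(threshold=10, min_duration_seconds=60)

# Hypothetical (seconds, vehicle count) samples, one every 20 seconds
samples = [(0, 12), (20, 14), (40, 9), (60, 11), (80, 13), (100, 15), (120, 12)]
flags = [detector.update(t, c) for t, c in samples]
print(flags)
```

In the main loop, one detector per lane could be fed the lane count for each frame, with the timestamp derived from the frame index divided by the video's frame rate (cap.get(cv2.CAP_PROP_FPS)).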
#ProjectComplete #AIforCities #Transportation
━━━━━━━━━━━━━━━
#TelegramBot #Python #SQLite #DataExport #CSV #API
Lesson: Creating a Telegram Bot to Export Channel Members to a CSV File
This tutorial guides you through building a Telegram bot from scratch. When added as an administrator to a channel, the bot will respond to an /export command from the channel owner, exporting the list of members (Username, User ID, etc.) to a CSV file and storing a log of the action in a SQLite database.
IMPORTANT NOTE: Due to Telegram's privacy policy, bots CANNOT access users' phone numbers. This field will be marked as "N/A".
---
#Step 1: Bot Creation and Project Setup
First, create a bot via the @BotFather on Telegram. Send it the /newbot command, follow the instructions, and save the HTTP API token it gives you.
Next, set up your Python environment. Install the necessary library: python-telegram-bot.
pip install python-telegram-bot
Create a new Python file named bot.py and add the basic structure. Replace 'YOUR_TELEGRAM_API_TOKEN' with the token you got from BotFather.
import logging
from telegram import Update
from telegram.ext import Application, CommandHandler, ContextTypes
# Enable logging
logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO)
TOKEN = 'YOUR_TELEGRAM_API_TOKEN'
def main() -> None:
    """Start the bot."""
    application = Application.builder().token(TOKEN).build()
    # We will add command handlers here in the next steps
    # application.add_handler(CommandHandler("export", export_members))
    print("Bot is running...")
    application.run_polling()
if __name__ == "__main__":
    main()
# Hashtags: #Setup #TelegramAPI #PythonBot #BotFather
---
#Step 2: Database Setup for Logging (database.py)
To fulfill the requirement of using a database, we will log every export request. Create a new file named database.py. This separates our data logic from the bot logic.
import sqlite3
from datetime import datetime
DB_NAME = 'bot_logs.db'
def setup_database():
    """Creates the database table if it doesn't exist."""
    conn = sqlite3.connect(DB_NAME)
    cursor = conn.cursor()
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS export_logs (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            chat_id INTEGER NOT NULL,
            chat_title TEXT,
            requested_by_id INTEGER NOT NULL,
            requested_by_username TEXT,
            timestamp TEXT NOT NULL
        )
    ''')
    conn.commit()
    conn.close()
def log_export_action(chat_id, chat_title, user_id, username):
    """Logs a successful export action to the database."""
    conn = sqlite3.connect(DB_NAME)
    cursor = conn.cursor()
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    cursor.execute(
        "INSERT INTO export_logs (chat_id, chat_title, requested_by_id, requested_by_username, timestamp) VALUES (?, ?, ?, ?, ?)",
        (chat_id, chat_title, user_id, username, timestamp)
    )
    conn.commit()
    conn.close()
# Hashtags: #SQLite #DatabaseDesign #Logging #DataPersistence
---
#Step 3: Implementing the Export Command and Permission Check
Now, we'll write the core function in bot.py. This function will handle the /export command. The most crucial part is to check if the user who sent the command is the creator of the channel. This prevents other admins from exporting the data.
# Add these imports to the top of bot.py
import csv
import io
import database as db
async def export_members(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    """Exports channel members to a CSV file."""
    user = update.effective_user
    chat = update.effective_chat
    if chat.type != 'channel':
        await update.message.reply_text("This command can only be used in a channel.")
        return
    # --- Permission Check ---
    try:
        admins = await context.bot.get_chat_administrators(chat.id)
        is_creator = False
        for admin in admins:
            if admin.user.id == user.id and admin.status == 'creator':
                is_creator = True
                break
        if not is_creator:
            await update.message.reply_text("Sorry, only the channel creator can use this command.")
            return
    except Exception as e:
        await update.message.reply_text(f"An error occurred while checking permissions: {e}")
        return
    await update.message.reply_text("Exporting members... This may take a moment for large channels.")
    # The rest of the logic will go here in the next step
# In the main() function of bot.py, uncomment and add the handler:
# application.add_handler(CommandHandler("export", export_members))
# At the start of main(), also add the database setup call:
# db.setup_database()
# Hashtags: #BotLogic #Permissions #Security #TelegramAPI
---
#Step 4: Fetching Members and Generating the CSV File
This is the continuation of the export_members function. After passing the permission check, the bot will fetch the list of administrators (as a reliable way to get some members) and generate a CSV file in memory.
NOTE: The Bot API does not provide a method for listing a channel's regular (non-admin) subscribers, so this example exports administrators only. A more advanced "userbot" would be needed to fetch all members.
    # This code continues inside the export_members function from Step 3
    # --- Data Fetching and CSV Generation ---
    try:
        # Using get_chat_administrators is a reliable way for bots to get a list of some members.
        # Getting all subscribers can be limited by the API for standard bots.
        members_to_export = await context.bot.get_chat_administrators(chat.id)
        # Create a CSV in-memory
        output = io.StringIO()
        writer = csv.writer(output)
        # Write header row
        header = ['User ID', 'First Name', 'Last Name', 'Username', 'Phone Number']
        writer.writerow(header)
        # Write member data
        for admin in members_to_export:
            member = admin.user
            row = [
                member.id,
                member.first_name,
                member.last_name or 'N/A',
                f"@{member.username}" if member.username else 'N/A',
                'N/A (Privacy Protected)'  # Bots cannot access phone numbers
            ]
            writer.writerow(row)
        # Prepare the file for sending
        output.seek(0)
        file_to_send = io.BytesIO(output.getvalue().encode('utf-8'))
        file_to_send.name = f"{chat.title or 'channel'}_members.csv"
        await context.bot.send_document(chat_id=user.id, document=file_to_send,
                                        caption="Here is the list of exported channel members.")
        # Log the successful action
        db.log_export_action(chat.id, chat.title, user.id, user.username)
    except Exception as e:
        logging.error(f"Failed to export members for chat {chat.id}: {e}")
        await update.message.reply_text(f"An error occurred during export: {e}")
# Hashtags: #CSV #DataHandling #InMemoryFile #PythonCode
Make sure to add the handler line in main() as mentioned in Step 3 to activate the command.
---
#Step 5: Final Results and Discussion
To use the bot:
• Run the bot.py script.
• Add your bot as an administrator to your Telegram channel.
• As the creator of the channel, send the /export command in the channel.
• The bot will respond that it's processing the request.
• You will receive a private message from the bot containing the members.csv file.
• A log entry will be created in the bot_logs.db file in your project directory.
Discussion of Results and Limitations:
Privacy is Paramount: The most significant result is observing Telegram's privacy protection in action. Bots cannot and should not access sensitive user data like phone numbers. This is a crucial security feature of the platform.
Permission Model: The check for admin.status == 'creator' ensures that only the channel owner can trigger this sensitive data export, aligning with good security practices.
API Limitations: A standard bot created with BotFather can retrieve a chat's administrators, its member count, and individual members by user ID, but the Bot API has no method for listing all regular subscribers. For that, developers often turn to "userbots" (using a regular user's API credentials), which have different rules and are against Telegram's Terms of Service if used for spam or abuse.
Database Logging: The use of SQLite provides a simple yet effective audit trail. You can see which channels were exported, by whom, and when. This is essential for accountability.
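As an illustration of reading that audit trail, the sketch below recreates the export_logs schema from Step 2 in an in-memory database and summarizes who exported which channel. The rows are made-up sample data, and the in-memory connection is an assumption chosen so the example does not touch a real bot_logs.db.

```python
import sqlite3

# In-memory database so the sketch does not touch bot_logs.db
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE export_logs (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        chat_id INTEGER NOT NULL,
        chat_title TEXT,
        requested_by_id INTEGER NOT NULL,
        requested_by_username TEXT,
        timestamp TEXT NOT NULL
    )
""")

# Made-up rows for illustration
rows = [
    (-1001, "Demo Channel", 42, "alice", "2024-01-01 10:00:00"),
    (-1001, "Demo Channel", 42, "alice", "2024-01-02 09:30:00"),
    (-1002, "Other Channel", 77, "bob", "2024-01-03 18:15:00"),
]
conn.executemany(
    "INSERT INTO export_logs (chat_id, chat_title, requested_by_id, "
    "requested_by_username, timestamp) VALUES (?, ?, ?, ?, ?)",
    rows,
)
conn.commit()

# Who exported which channel, how often, and when most recently
summary = conn.execute("""
    SELECT chat_title, requested_by_username, COUNT(*), MAX(timestamp)
    FROM export_logs
    GROUP BY chat_id, requested_by_id
    ORDER BY MAX(timestamp) DESC
""").fetchall()
for row in summary:
    print(row)
conn.close()
```

Storing timestamps as "YYYY-MM-DD HH:MM:SS" text, as log_export_action does, keeps them sortable as plain strings, which is why MAX and ORDER BY work here.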
This project demonstrates a practical use for a Telegram bot while also highlighting the importance of working within the security and privacy constraints of an API.
#ProjectComplete #EthicalAI #DataPrivacy #TelegramDev
━━━━━━━━━━━━━━━
#PDF #EPUB #TelegramBot #Python #SQLite #Project
Lesson: Building a PDF <> EPUB Telegram Converter Bot
This lesson walks you through creating a fully functional Telegram bot from scratch. The bot will accept PDF or EPUB files, convert them to the other format, and log each transaction in an SQLite database.
---
Part 1: Prerequisites & Setup
First, we need to install the necessary Python library for the Telegram Bot API. We will also rely on Calibre's command-line tools for conversion.
Important: You must install Calibre on the system where the bot will run and ensure its ebook-convert tool is in your system's PATH.
pip install python-telegram-bot==20.3
#Setup #Prerequisites
---
Part 2: Database Initialization
We'll use SQLite to log every successful conversion. Create a file named database_setup.py and run it once to create the database file and the table.
# database_setup.py
import sqlite3
def setup_database():
    conn = sqlite3.connect('conversions.db')
    cursor = conn.cursor()
    # Create table to store conversion logs
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS conversions (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id INTEGER NOT NULL,
            original_filename TEXT NOT NULL,
            converted_filename TEXT NOT NULL,
            conversion_type TEXT NOT NULL,
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
        )
    ''')
    conn.commit()
    conn.close()
    print("Database setup complete. 'conversions.db' is ready.")
if __name__ == '__main__':
    setup_database()
#Database #SQLite #Initialization
---
Part 3: The Main Bot Script - Imports & Basic Commands
Now, let's create our main bot file, converter_bot.py. We'll start with imports and the initial /start and /help commands.
# converter_bot.py
import logging
import os
import sqlite3
import subprocess
from telegram import Update
from telegram.ext import Application, CommandHandler, MessageHandler, filters, ContextTypes
# Enable logging
logging.basicConfig(format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', level=logging.INFO)
# --- Bot Token ---
TELEGRAM_TOKEN = "YOUR_TELEGRAM_BOT_TOKEN"
# --- Command Handlers ---
async def start(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    user = update.effective_user
    await update.message.reply_html(
        rf"Hi {user.mention_html()}! Send me a PDF or EPUB file to convert.",
    )
async def help_command(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    await update.message.reply_text("Simply send a .pdf file to get an .epub, or send an .epub file to get a .pdf. Note: Conversion quality depends on the source file's structure.")
#TelegramBot #Python #Boilerplate
---
Part 4: The Core Conversion Logic
This function will be the heart of our bot. It uses the ebook-convert command-line tool (from Calibre) to perform the conversion. It's crucial that Calibre is installed correctly for this to work.
# converter_bot.py (continued)
def run_conversion(input_path: str, output_path: str) -> bool:
    """Runs the ebook-convert command and returns True on success."""
    try:
        command = ['ebook-convert', input_path, output_path]
        result = subprocess.run(command, check=True, capture_output=True, text=True)
        logging.info(f"Calibre output: {result.stdout}")
        return True
    except FileNotFoundError:
        logging.error("CRITICAL: 'ebook-convert' command not found. Is Calibre installed and in the system's PATH?")
        return False
    except subprocess.CalledProcessError as e:
        logging.error(f"Conversion failed for {input_path}. Error: {e.stderr}")
        return False
#Conversion #Calibre #Subprocess
---
Part 5: Database Logging Function
This helper function will connect to our SQLite database and insert a new record for each successful conversion.
# converter_bot.py (continued)
def log_to_db(user_id: int, original_file: str, converted_file: str, conv_type: str):
    """Logs a successful conversion to the SQLite database."""
    try:
        conn = sqlite3.connect('conversions.db')
        cursor = conn.cursor()
        cursor.execute(
            "INSERT INTO conversions (user_id, original_filename, converted_filename, conversion_type) VALUES (?, ?, ?, ?)",
            (user_id, original_file, converted_file, conv_type)
        )
        conn.commit()
        conn.close()
    except sqlite3.Error as e:
        logging.error(f"Database error: {e}")
#Database #Logging #SQLite
---
Part 6: Handling Incoming Files
This is the main handler that will be triggered when a user sends a document. It downloads the file, determines the target format, calls the conversion function, sends the result back, logs it, and cleans up.
# converter_bot.py (continued)
async def handle_document(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    doc = update.message.document
    file_id = doc.file_id
    file_name = doc.file_name
    input_path = os.path.join("downloads", file_name)
    os.makedirs("downloads", exist_ok=True)  # Ensure download directory exists
    new_file = await context.bot.get_file(file_id)
    await new_file.download_to_drive(input_path)
    await update.message.reply_text(f"Received '{file_name}'. Starting conversion...")
    output_path = ""
    conversion_type = ""
    if file_name.lower().endswith('.pdf'):
        output_path = input_path.rsplit('.', 1)[0] + '.epub'
        conversion_type = "PDF -> EPUB"
    elif file_name.lower().endswith('.epub'):
        output_path = input_path.rsplit('.', 1)[0] + '.pdf'
        conversion_type = "EPUB -> PDF"
    else:
        await update.message.reply_text("Sorry, I only support PDF and EPUB files.")
        os.remove(input_path)
        return
    # Run the conversion
    success = run_conversion(input_path, output_path)
    if success and os.path.exists(output_path):
        await update.message.reply_text("Conversion successful! Uploading your file...")
        # Use a with-block so the file handle is closed before cleanup
        with open(output_path, 'rb') as converted_file:
            await context.bot.send_document(chat_id=update.effective_chat.id, document=converted_file)
        # Log to database
        log_to_db(update.effective_user.id, file_name, os.path.basename(output_path), conversion_type)
    else:
        await update.message.reply_text("An error occurred during conversion. Please check the file and try again. The file might be corrupted or protected.")
    # Cleanup
    if os.path.exists(input_path):
        os.remove(input_path)
    if os.path.exists(output_path):
        os.remove(output_path)
#FileHandler #BotLogic
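The step that picks the target format and builds the file paths can be isolated and tested on its own. The sketch below is one possible variant, not the handler's actual code: `plan_conversion` and `SUPPORTED` are hypothetical names, and stripping directory components from the client-supplied file name is a defensive assumption in case a name ever contains path separators.

```python
import os

# Source extension -> target extension
SUPPORTED = {".pdf": ".epub", ".epub": ".pdf"}


def plan_conversion(file_name, download_dir="downloads"):
    """Return (input_path, output_path), or None for unsupported files."""
    # Keep only the final path component of the client-supplied name
    safe_name = os.path.basename(file_name.replace("\\", "/"))
    stem, ext = os.path.splitext(safe_name)
    target_ext = SUPPORTED.get(ext.lower())
    if not stem or target_ext is None:
        return None
    input_path = os.path.join(download_dir, safe_name)
    output_path = os.path.join(download_dir, stem + target_ext)
    return input_path, output_path


print(plan_conversion("Book.PDF"))
print(plan_conversion("../../etc/notes.epub"))
print(plan_conversion("image.png"))
```

A helper like this also makes it possible to reject unsupported files before downloading them, rather than downloading first and deleting afterwards.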
---
Part 7: Main Execution Block
Finally, this block sets up the application, registers all our handlers, and starts the bot. This code goes at the end of converter_bot.py.
# converter_bot.py (continued)
def main() -> None:
    """Start the bot."""
    application = Application.builder().token(TELEGRAM_TOKEN).build()
    # Register handlers
    application.add_handler(CommandHandler("start", start))
    application.add_handler(CommandHandler("help", help_command))
    application.add_handler(MessageHandler(filters.Document.ALL, handle_document))
    # Run the bot until the user presses Ctrl-C
    print("Bot is running...")
    application.run_polling()
if __name__ == '__main__':
    main()
#Main #Execution #RunBot
---
Part 8: Results & Discussion
To Run:
• Run python database_setup.py once.
• Replace "YOUR_TELEGRAM_BOT_TOKEN" in converter_bot.py with your actual token from BotFather.
• Run python converter_bot.py.
• Send a PDF or EPUB file to your bot on Telegram.
Expected Results:
• The bot will acknowledge the file.
• After a short processing time, it will send back the converted file.
• A new entry will be added to the conversions.db file.
Viewing the Database:
You can inspect the conversions.db file using a tool like "DB Browser for SQLite" or the command line:
sqlite3 conversions.db "SELECT * FROM conversions;"
Discussion & Limitations:
• Dependency: The bot is entirely dependent on a local installation of Calibre. This makes it hard to deploy on simple hosting services. A Docker-based deployment would be a good solution.
• Conversion Quality: Converting from PDF, especially those with complex layouts, images, and columns, can result in poor EPUB formatting. This is a fundamental limitation of PDF-to-EPUB conversion, not just a flaw in the bot.
• Synchronous Processing: The bot handles one file at a time. If two users send files simultaneously, one has to wait. For a larger scale, a task queue system (like Celery with Redis) would be necessary to handle conversions asynchronously in the background.
• Error Handling: The current error messaging is generic. Advanced versions could parse Calibre's error output to give users more specific feedback (e.g., "This PDF is password-protected").
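Short of a full task queue, a lighter first step toward the asynchronous processing mentioned above is to move the blocking call off the event loop. The sketch below uses asyncio.to_thread from the standard library (Python 3.9+); `blocking_conversion` is a hypothetical stand-in for run_conversion that just sleeps, so the example runs without Calibre.

```python
import asyncio
import time


def blocking_conversion(name, seconds):
    """Stand-in for run_conversion(): blocks the calling thread."""
    time.sleep(seconds)
    return f"{name} converted"


async def main():
    start = time.perf_counter()
    # Each blocking call runs in a worker thread, so the event loop
    # stays free while both conversions are in progress.
    results = await asyncio.gather(
        asyncio.to_thread(blocking_conversion, "a.pdf", 0.2),
        asyncio.to_thread(blocking_conversion, "b.epub", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f}s")
```

In the handler this would correspond to something like `success = await asyncio.to_thread(run_conversion, input_path, output_path)`. Note that python-telegram-bot also processes updates one after another by default, so the application builder's concurrent_updates option would likely need to be enabled as well; treat that as an assumption to verify against the library's documentation.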
#Results #Discussion #Limitations #Scalability
━━━━━━━━━━━━━━━
Top 100 Data Analysis Commands & Functions
#DataAnalysis #Pandas #DataLoading #Inspection
Part 1: Pandas - Data Loading & Inspection
#1.
pd.read_csv()
Reads a comma-separated values (csv) file into a Pandas DataFrame.
import pandas as pd
from io import StringIO
csv_data = "col1,col2,col3\n1,a,True\n2,b,False"
df = pd.read_csv(StringIO(csv_data))
print(df)
col1 col2 col3
0 1 a True
1 2 b False
#2.
df.head()
Returns the first n rows of the DataFrame (default is 5).
import pandas as pd
df = pd.DataFrame({'A': range(10), 'B': list('abcdefghij')})
print(df.head(3))
A B
0 0 a
1 1 b
2 2 c
#3.
df.tail()
Returns the last n rows of the DataFrame (default is 5).
import pandas as pd
df = pd.DataFrame({'A': range(10), 'B': list('abcdefghij')})
print(df.tail(3))
A B
7 7 h
8 8 i
9 9 j
#4.
df.info()
Prints a concise summary of a DataFrame, including data types and non-null values.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, 2, np.nan], 'B': ['x', 'y', 'z']})
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 A 2 non-null float64
1 B 3 non-null object
dtypes: float64(1), object(1)
memory usage: 176.0+ bytes
#5.
df.describe()
Generates descriptive statistics for numerical columns.
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3, 4, 5]})
print(df.describe())
A
count 5.000000
mean 3.000000
std 1.581139
min 1.000000
25% 2.000000
50% 3.000000
75% 4.000000
max 5.000000
#6.
df.shape
Returns a tuple representing the dimensionality (rows, columns) of the DataFrame.
import pandas as pd
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
print(df.shape)
(2, 3)
#7.
df.columns
Returns the column labels of the DataFrame.
import pandas as pd
df = pd.DataFrame({'Name': ['Alice'], 'Age': [30]})
print(df.columns)
Index(['Name', 'Age'], dtype='object')
#8.
df.dtypes
Returns the data types of each column.
import pandas as pd
df = pd.DataFrame({'A': [1, 2], 'B': [1.1, 2.2], 'C': ['x', 'y']})
print(df.dtypes)
A int64
B float64
C object
dtype: object
#9.
df['col'].value_counts()
Returns a Series containing counts of unique values in a column.
import pandas as pd
df = pd.DataFrame({'Fruit': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple']})
print(df['Fruit'].value_counts())
Apple 3
Banana 2
Orange 1
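Name: Fruit, dtype: int64
(Note: In pandas 2.0 and later, the last line reads "Name: count, dtype: int64" and "Fruit" appears as a header above the values)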
#10.
df['col'].unique()
Returns an array of the unique values in a column.
import pandas as pd
df = pd.DataFrame({'Fruit': ['Apple', 'Banana', 'Apple', 'Orange']})
print(df['Fruit'].unique())
['Apple' 'Banana' 'Orange']
#11.
df['col'].nunique()
Returns the number of unique values in a column.
import pandas as pd
df = pd.DataFrame({'Fruit': ['Apple', 'Banana', 'Apple', 'Orange']})
print(df['Fruit'].nunique())
3
#12.
df.isnull()
Returns a DataFrame of boolean values indicating missing values.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, np.nan], 'B': [np.nan, 'x']})
print(df.isnull())
A B
0 False True
1 True False
#13.
df.isnull().sum()
Returns the number of missing values in each column.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, np.nan, 3, np.nan], 'B': [5, 6, 7, 8]})
print(df.isnull().sum())
A 2
B 0
dtype: int64
#14.
df.to_csv()
Writes the DataFrame to a comma-separated values (csv) file, or returns the CSV text as a string when no path is given.
import pandas as pd
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
csv_output = df.to_csv(index=False)
print(csv_output)
A,B
1,3
2,4
#15.
df.copy()
Creates a deep copy of a DataFrame.
import pandas as pd
df1 = pd.DataFrame({'A': [1]})
df2 = df1.copy()
df2.loc[0, 'A'] = 99
print(f"Original df1:\n{df1}")
print(f"Copied df2:\n{df2}")
Original df1:
A
0 1
Copied df2:
A
0 99
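To see why the copy matters, it can help to compare it with plain assignment. The sketch below uses illustrative variable names (alias, independent) that are not part of the original example: assignment only creates a second name for the same DataFrame, while .copy() creates a separate object.

```python
import pandas as pd

# Plain assignment: both names refer to the same DataFrame
df1 = pd.DataFrame({'A': [1]})
alias = df1
alias.loc[0, 'A'] = 99
print(df1.loc[0, 'A'])  # 99 (the original changed too)

# .copy(): the new DataFrame is independent of the original
df3 = pd.DataFrame({'A': [1]})
independent = df3.copy()
independent.loc[0, 'A'] = 99
print(df3.loc[0, 'A'])  # 1 (the original is untouched)
```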
---
#DataAnalysis #Pandas #Selection #Indexing
Part 2: Pandas - Data Selection & Indexing
#16.
df['col']
Selects a single column as a Series.
import pandas as pd
df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [30, 25]})
print(df['Name'])
0 Alice
1 Bob
Name: Name, dtype: object
#17.
df[['col1', 'col2']]
Selects multiple columns as a new DataFrame.
import pandas as pd
df = pd.DataFrame({'Name': ['Alice'], 'Age': [30], 'City': ['New York']})
print(df[['Name', 'City']])
Name City
0 Alice New York
#18.
df.loc[]
Accesses a group of rows and columns by label(s) or a boolean array.
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3]}, index=['x', 'y', 'z'])
print(df.loc['y'])
A 2
Name: y, dtype: int64
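The example above only selects one row by label. A short sketch of the other two uses mentioned in the description, a boolean array and a row/column label combination, might look like this (the extra column 'B' is added here purely for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]}, index=['x', 'y', 'z'])

# Boolean array for the rows, a single label for the column
filtered = df.loc[df['A'] > 1, 'B']
print(filtered)

# Label slices include BOTH endpoints, unlike positional slices
sliced = df.loc['x':'y', ['A']]
print(sliced)
```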
#19.
df.iloc[]
Accesses a group of rows and columns by integer position(s).
import pandas as pd
df = pd.DataFrame({'A': [10, 20, 30]})
print(df.iloc[1])
A 20
Name: 1, dtype: int64
#20.
df[df['col'] > value]
Selects rows based on a boolean condition (boolean indexing).
import pandas as pd
df = pd.DataFrame({'Age': [22, 35, 18, 40]})
print(df[df['Age'] > 30])
Age
1 35
3 40
#21.
df.set_index()
Sets the DataFrame index using existing columns.
import pandas as pd
df = pd.DataFrame({'Country': ['USA', 'UK'], 'Code': [1, 44]})
df_indexed = df.set_index('Country')
print(df_indexed)
Code
Country
USA 1
UK 44
#22.
df.reset_index()
Resets the index of the DataFrame and uses the default integer index.
import pandas as pd
df = pd.DataFrame({'Code': [1, 44]}, index=['USA', 'UK'])
df_reset = df.reset_index()
print(df_reset)
index Code
0 USA 1
1 UK 44
#23.
df.at[]
Accesses a single value by row/column label pair. Faster than .loc.
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3]}, index=['x', 'y', 'z'])
print(df.at['y', 'A'])
2
#24.
df.iat[]
Accesses a single value by row/column integer position. Faster than .iloc.
import pandas as pd
df = pd.DataFrame({'A': [10, 20, 30]})
print(df.iat[1, 0])
20
#25.
df.sample()
Returns a random sample of rows (or columns) from the DataFrame.
import pandas as pd
df = pd.DataFrame({'A': range(10)})
print(df.sample(n=3))
A
8 8
2 2
5 5
(Note: Output rows will be random)
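If a repeatable sample is needed, for example in a tutorial or a test, df.sample() accepts a random_state argument. A minimal sketch (the seed value 42 is arbitrary):

```python
import pandas as pd

df = pd.DataFrame({'A': range(10)})

# The same seed produces the same rows on every call
first = df.sample(n=3, random_state=42)
second = df.sample(n=3, random_state=42)
print(first.equals(second))  # True
```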
---
#DataAnalysis #Pandas #DataCleaning #Manipulation
Part 3: Pandas - Data Cleaning & Manipulation
#26.
df.dropna()
Removes rows (or columns) that contain missing values.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, np.nan, 3]})
print(df.dropna())
A
0 1.0
2 3.0
#27.
df.fillna()
Fills missing (NA/NaN) values with a specified value or fill method.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, np.nan, 3]})
print(df.fillna(0))
A
0 1.0
1 0.0
2 3.0
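Filling with a constant is only one option. As a small sketch of two common alternatives, the column mean and a forward fill (variable names here are illustrative):

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, np.nan, 3]})

# Replace missing values with the mean of the non-missing values
mean_filled = df['A'].fillna(df['A'].mean())
print(mean_filled.tolist())  # [1.0, 2.0, 3.0]

# Carry the last valid value forward
forward_filled = df['A'].ffill()
print(forward_filled.tolist())  # [1.0, 1.0, 3.0]
```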
#28.
df.astype()
Casts a pandas object to a specified dtype.
import pandas as pd
df = pd.DataFrame({'A': [1.1, 2.7, 3.5]})
df['A'] = df['A'].astype(int)
print(df)
A
0 1
1 2
2 3
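Note that astype(int) in the example above drops the fractional part (2.7 becomes 2) rather than rounding. If rounding is what you want, one possible approach is to round first and cast afterwards:

```python
import pandas as pd

df = pd.DataFrame({'A': [1.1, 2.7, 3.5]})

# Round to the nearest whole number, then cast to int
rounded = df['A'].round().astype(int)
print(rounded.tolist())  # [1, 3, 4]
```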
#29.
df.rename()
Alters axis labels (column or index names).
import pandas as pd
df = pd.DataFrame({'a': [1], 'b': [2]})
df_renamed = df.rename(columns={'a': 'A', 'b': 'B'})
print(df_renamed)
A B
0 1 2
#30.
df.drop()
Drops specified labels from rows or columns.
import pandas as pd
df = pd.DataFrame({'A': [1], 'B': [2], 'C': [3]})
df_dropped = df.drop(columns=['B'])
print(df_dropped)
A C
0 1 3
#31.
pd.to_datetime()
Converts argument to datetime.
import pandas as pd
s = pd.Series(['2023-01-01', '2023-01-02'])
dt_s = pd.to_datetime(s)
print(dt_s)
0 2023-01-01
1 2023-01-02
dtype: datetime64[ns]
#32.
df.apply()
Applies a function along an axis of the DataFrame.
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3]})
df['B'] = df['A'].apply(lambda x: x * 2)
print(df)
A B
0 1 2
1 2 4
2 3 6
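The example above applies a function to a single column. To illustrate the "along an axis" part of the description, here is a sketch that applies a function to each row with axis=1 (the 'Total' column name is just for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]})

# axis=1 passes each row to the function as a Series
df['Total'] = df.apply(lambda row: row['A'] + row['B'], axis=1)
print(df['Total'].tolist())  # [11, 22, 33]
```

For simple arithmetic like this, df['A'] + df['B'] gives the same result and is usually faster; apply is more useful when the per-row logic cannot be vectorized.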
#33.
df['col'].map()
Maps values of a Series according to an input mapping or function.
import pandas as pd
df = pd.DataFrame({'Gender': ['M', 'F', 'M']})
df['Gender_Full'] = df['Gender'].map({'M': 'Male', 'F': 'Female'})
print(df)
Gender Gender_Full
0 M Male
1 F Female
2 M Male
#34.
df.replace()
Replaces values given in to_replace with value.
import pandas as pd
df = pd.DataFrame({'Score': [10, -99, 15, -99]})
df_replaced = df.replace(-99, 0)
print(df_replaced)
Score
0 10
1 0
2 15
3 0
#35.
df.duplicated()
Returns a boolean Series denoting duplicate rows.
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 1], 'B': ['a', 'b', 'a']})
print(df.duplicated())
0 False
1 False
2 True
dtype: bool
#36.
df.drop_duplicates()
Returns a DataFrame with duplicate rows removed.
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 1], 'B': ['a', 'b', 'a']})
print(df.drop_duplicates())
A B
0 1 a
1 2 b
#37.
df.sort_values()
Sorts by the values along either axis.
import pandas as pd
df = pd.DataFrame({'Age': [25, 22, 30]})
print(df.sort_values(by='Age'))
Age
1 22
0 25
2 30
#38.
df.sort_index()
Sorts object by labels (along an axis).
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3]}, index=[10, 5, 8])
print(df.sort_index())
A
5 2
8 3
10 1
#39.
pd.cut()
Bins values into discrete intervals.
import pandas as pd
ages = pd.Series([22, 35, 58, 8, 42])
age_bins = pd.cut(ages, bins=[0, 18, 35, 60], labels=['Child', 'Adult', 'Senior'])
print(age_bins)
0 Adult
1 Adult
2 Senior
3 Child
4 Senior
dtype: category
Categories (3, object): ['Child' < 'Adult' < 'Senior']
#40.
pd.qcut()
Quantile-based discretization function (bins into roughly equal-sized groups).
import pandas as pd
data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
quartiles = pd.qcut(data, 4, labels=False)
print(quartiles)
0 0
1 0
2 0
3 1
4 1
5 2
6 2
7 3
8 3
9 3
dtype: int64
#41.
s.str.contains()
Tests if a pattern or regex is contained within a string of a Series.
import pandas as pd
s = pd.Series(['apple', 'banana', 'apricot'])
print(s[s.str.contains('ap')])
0 apple
2 apricot
dtype: object
#42.
s.str.split()
Splits strings around a given separator/delimiter.
import pandas as pd
s = pd.Series(['a_b', 'c_d'])
print(s.str.split('_', expand=True))
0 1
0 a b
1 c d
#43.
s.str.lower()
Converts strings in the Series to lowercase.
import pandas as pd
s = pd.Series(['HELLO', 'World'])
print(s.str.lower())
0 hello
1 world
dtype: object
#44.
s.str.strip()
Removes leading and trailing whitespace.
import pandas as pd
s = pd.Series([' hello ', ' world '])
print(s.str.strip())
0 hello
1 world
dtype: object
#45.
s.dt.year
Extracts the year from a datetime Series.
import pandas as pd
s = pd.to_datetime(pd.Series(['2023-01-01', '2024-05-10']))
print(s.dt.year)
0 2023
1 2024
dtype: int64
(Note: pandas 2.0 and later report dtype: int32 here)
---
#DataAnalysis #Pandas #Grouping #Aggregation
Part 4: Pandas - Grouping & Aggregation
#46.
df.groupby()
Groups a DataFrame using a mapper or by a Series of columns.
import pandas as pd
df = pd.DataFrame({'Team': ['A', 'B', 'A', 'B'], 'Points': [10, 8, 12, 6]})
grouped = df.groupby('Team')
print(grouped)
<pandas.core.groupby.generic.DataFrameGroupBy object at 0x...>
#47.
groupby.agg()
Aggregates using one or more operations over the specified axis.
import pandas as pd
df = pd.DataFrame({'Team': ['A', 'B', 'A', 'B'], 'Points': [10, 8, 12, 6]})
agg_df = df.groupby('Team').agg(['mean', 'sum'])
print(agg_df)
Points
mean sum
Team
A 11.0 22
B 7.0 14
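Passing a list of function names produces two-level column headers, as seen above. If flat, descriptive column names are preferred, named aggregation is one option. A minimal sketch (avg_points and total_points are names chosen here for illustration):

```python
import pandas as pd

df = pd.DataFrame({'Team': ['A', 'B', 'A', 'B'], 'Points': [10, 8, 12, 6]})

# Each keyword becomes an output column: name=(source column, function)
summary = df.groupby('Team').agg(
    avg_points=('Points', 'mean'),
    total_points=('Points', 'sum'),
)
print(summary)
```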
#48.
groupby.size()
Computes group sizes.
import pandas as pd
df = pd.DataFrame({'Team': ['A', 'B', 'A', 'B', 'A']})
print(df.groupby('Team').size())
Team
A 3
B 2
dtype: int64
#49.
groupby.count()
Computes the count of non-NA cells for each group.
import pandas as pd
import numpy as np
df = pd.DataFrame({'Team': ['A', 'B', 'A'], 'Score': [1, np.nan, 3]})
print(df.groupby('Team').count())
Score
Team
A 2
B 0
#50.
groupby.mean()
Computes the mean of group values.
import pandas as pd
df = pd.DataFrame({'Team': ['A', 'B', 'A', 'B'], 'Points': [10, 8, 12, 6]})
print(df.groupby('Team').mean())
Points
Team
A 11.0
B 7.0
#51.
groupby.sum()
Computes the sum of group values.
import pandas as pd
df = pd.DataFrame({'Team': ['A', 'B', 'A', 'B'], 'Points': [10, 8, 12, 6]})
print(df.groupby('Team').sum())
Points
Team
A 22
B 14
#52.
groupby.min()
Computes the minimum of group values.
import pandas as pd
df = pd.DataFrame({'Team': ['A', 'B', 'A', 'B'], 'Points': [10, 8, 12, 6]})
print(df.groupby('Team').min())
Points
Team
A 10
B 6
#53.
groupby.max()
Computes the maximum of group values.
import pandas as pd
df = pd.DataFrame({'Team': ['A', 'B', 'A', 'B'], 'Points': [10, 8, 12, 6]})
print(df.groupby('Team').max())
Points
Team
A 12
B 8
#54.
df.pivot_table()
Creates a spreadsheet-style pivot table as a DataFrame. Values are aggregated with the mean by default.
import pandas as pd
df = pd.DataFrame({'A': ['foo', 'foo', 'bar'], 'B': ['one', 'two', 'one'], 'C': [1, 2, 3]})
pivot = df.pivot_table(values='C', index='A', columns='B')
print(pivot)
B one two
A
bar 3.0 NaN
foo 1.0 2.0
#55.
pd.crosstab()
Computes a cross-tabulation of two (or more) factors.
import pandas as pd
df = pd.DataFrame({'A': ['foo', 'foo', 'bar'], 'B': ['one', 'two', 'one']})
crosstab = pd.crosstab(df.A, df.B)
print(crosstab)
B one two
A
bar 1 0
foo 1 1
---
#DataAnalysis #Pandas #Merging #Joining
Part 5: Pandas - Merging & Concatenating
#56.
pd.merge()
Merges DataFrame or named Series objects with a database-style join.
import pandas as pd
df1 = pd.DataFrame({'key': ['A', 'B'], 'val1': [1, 2]})
df2 = pd.DataFrame({'key': ['A', 'B'], 'val2': [3, 4]})
merged = pd.merge(df1, df2, on='key')
print(merged)
key val1 val2
0 A 1 3
1 B 2 4
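In the example above every key appears in both DataFrames, so the join type does not matter. The sketch below uses partially overlapping keys (an assumption made for illustration) to show how the how parameter changes the result:

```python
import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B'], 'val1': [1, 2]})
df2 = pd.DataFrame({'key': ['B', 'C'], 'val2': [3, 4]})

inner = pd.merge(df1, df2, on='key')               # Only keys in both: B
left = pd.merge(df1, df2, on='key', how='left')    # All keys from df1: A, B
outer = pd.merge(df1, df2, on='key', how='outer')  # All keys from either: A, B, C

print(len(inner), len(left), len(outer))  # 1 2 3
```

Rows without a match on the other side are filled with NaN in the missing columns.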
#57.
pd.concat()
Concatenates pandas objects along a particular axis.
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2]})
df2 = pd.DataFrame({'A': [3, 4]})
concatenated = pd.concat([df1, df2])
print(concatenated)
A
0 1
1 2
0 3
1 4
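Notice that the original index labels (0, 1, 0, 1) are kept, so the result contains duplicate labels. If a fresh 0..n-1 index is wanted instead, ignore_index=True is one way to get it:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2]})
df2 = pd.DataFrame({'A': [3, 4]})

# Discard the original labels and build a new integer index
combined = pd.concat([df1, df2], ignore_index=True)
print(list(combined.index))  # [0, 1, 2, 3]
```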
#58.
df.join()
Joins columns with other DataFrame(s) on index or on a key column.
import pandas as pd
df1 = pd.DataFrame({'val1': [1, 2]}, index=['A', 'B'])
df2 = pd.DataFrame({'val2': [3, 4]}, index=['A', 'B'])
joined = df1.join(df2)
print(joined)
val1 val2
A 1 3
B 2 4
#59.
pd.get_dummies()
Converts categorical variable into dummy/indicator variables (one-hot encoding).
import pandas as pd
s = pd.Series(list('abca'))
dummies = pd.get_dummies(s)
print(dummies)
a b c
0 1 0 0
1 0 1 0
2 0 0 1
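3 1 0 0
(Note: pandas 2.0 and later return True/False instead of 1/0 by default; pass dtype=int to get integers)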
#60.
df.nlargest()
Returns the first n rows ordered by columns in descending order.
import pandas as pd
df = pd.DataFrame({'population': [100, 500, 200, 800]})
print(df.nlargest(2, 'population'))
population
3 800
1 500
---
#DataAnalysis #NumPy #Arrays
Part 6: NumPy - Array Creation & Manipulation
#61.
np.array()
Creates a NumPy ndarray.
import numpy as np
arr = np.array([1, 2, 3])
print(arr)
[1 2 3]
#62.
np.arange()
Returns an array with evenly spaced values within a given interval (the stop value is excluded).
import numpy as np
arr = np.arange(0, 5)
print(arr)
[0 1 2 3 4]
#63.
np.linspace()
Returns an array with evenly spaced numbers over a specified interval (the stop value is included by default).
import numpy as np
arr = np.linspace(0, 10, 5)
print(arr)
[ 0. 2.5 5. 7.5 10. ]
#64.
np.zeros()
Returns a new array of a given shape and type, filled with zeros.
import numpy as np
arr = np.zeros((2, 3))
print(arr)
[[0. 0. 0.]
[0. 0. 0.]]
#65.
np.ones()
Returns a new array of a given shape and type, filled with ones.
import numpy as np
arr = np.ones((2, 3))
print(arr)
[[1. 1. 1.]
[1. 1. 1.]]
#66.
np.random.rand()
Creates an array of the given shape and populates it with random samples from a uniform distribution over [0, 1).
import numpy as np
arr = np.random.rand(2, 2)
print(arr)
[[0.13949386 0.2921446 ]
[0.52273283 0.77122228]]
(Note: Output values will be random)
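When reproducible random numbers are needed, one option is NumPy's Generator API with a fixed seed. This is a sketch of an alternative to np.random.rand(), not a change to the example above; the seed 42 is arbitrary:

```python
import numpy as np

# Two generators created with the same seed yield the same values
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

first = rng_a.random((2, 2))   # Uniform samples over [0, 1)
second = rng_b.random((2, 2))

print(np.array_equal(first, second))  # True
```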
#67.
arr.reshape()
Gives a new shape to an array without changing its data.
import numpy as np
arr = np.arange(6)
reshaped_arr = arr.reshape((2, 3))
print(reshaped_arr)
[[0 1 2]
[3 4 5]]
#68.
np.concatenate()
Joins a sequence of arrays along an existing axis.
import numpy as np
a = np.array([[1, 2]])
b = np.array([[3, 4]])
print(np.concatenate((a, b), axis=0))
[[1 2]
[3 4]]
#69.
np.vstack()
Stacks arrays in sequence vertically (row wise).
import numpy as np
a = np.array([1, 2])
b = np.array([3, 4])
print(np.vstack((a, b)))
[[1 2]
[3 4]]
#70.
np.hstack()
Stacks arrays in sequence horizontally (column wise).
import numpy as np
a = np.array([1, 2])
b = np.array([3, 4])
print(np.hstack((a, b)))
[1 2 3 4]
---
#DataAnalysis #NumPy #Math #Statistics
Part 7: NumPy - Mathematical & Statistical Functions
#71.
np.mean()
Computes the arithmetic mean along the specified axis.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(np.mean(arr))
3.0
#72.
np.median()
Computes the median along the specified axis.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(np.median(arr))
3.0
#73.
np.std()
Computes the standard deviation along the specified axis.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(np.std(arr))
1.4142135623730951
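This value differs from the std of 1.581139 that df.describe() reported in #5 for the same numbers. np.std() defaults to the population standard deviation (ddof=0), while pandas uses the sample standard deviation (ddof=1). A short sketch comparing the two:

```python
import numpy as np
import pandas as pd

arr = np.array([1, 2, 3, 4, 5])

population_std = np.std(arr)        # Divides by n (ddof=0)
sample_std = np.std(arr, ddof=1)    # Divides by n - 1
pandas_std = pd.Series(arr).std()   # pandas default is ddof=1

print(population_std, sample_std, pandas_std)
```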
#74.
np.sum()
Sums array elements over a given axis.
import numpy as np
arr = np.array([[1, 2], [3, 4]])
print(np.sum(arr))
10
#75.
np.min()
Returns the minimum of an array or minimum along an axis.
import numpy as np
arr = np.array([5, 2, 8, 1])
print(np.min(arr))
1
#76.
np.max()
Returns the maximum of an array or maximum along an axis.
import numpy as np
arr = np.array([5, 2, 8, 1])
print(np.max(arr))
8
#77.
np.sqrt()
Returns the non-negative square-root of an array, element-wise.
import numpy as np
arr = np.array([4, 9, 16])
print(np.sqrt(arr))
[2. 3. 4.]
#78.
np.log()
Calculates the natural logarithm, element-wise.
import numpy as np
arr = np.array([1, np.e, np.e**2])
print(np.log(arr))
[0. 1. 2.]
#79.
np.dot()
Calculates the dot product of two arrays.
import numpy as np
a = np.array([1, 2])
b = np.array([3, 4])
print(np.dot(a, b))
11
#80.
np.where()
Returns elements chosen from x or y depending on a condition.
import numpy as np
arr = np.array([10, 5, 20, 15])
print(np.where(arr > 12, 'High', 'Low'))
['Low' 'Low' 'High' 'High']
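When called with only a condition, np.where() behaves differently: it returns the positions where the condition is true, as a tuple with one array per dimension. A minimal sketch using the same array:

```python
import numpy as np

arr = np.array([10, 5, 20, 15])

# One-argument form: a tuple of index arrays, one per dimension
indices = np.where(arr > 12)[0]
print(indices)  # [2 3]
```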
---
#DataAnalysis #Matplotlib #Seaborn #Visualization
Part 8: Matplotlib & Seaborn - Data Visualization
#81.
plt.plot()
Plots y versus x as lines and/or markers.
import matplotlib.pyplot as plt
plt.plot([1, 2, 3, 4], [1, 4, 9, 16])
plt.show()  # Opens the figure window
Output: A figure window opens displaying a line plot.
#82.
plt.scatter()
A scatter plot of y vs. x with varying marker size and/or color.
import matplotlib.pyplot as plt
plt.scatter([1, 2, 3, 4], [1, 4, 9, 16])
plt.show()  # Opens the figure window
Output: A figure window opens displaying a scatter plot.
#83.
plt.hist()
Computes and draws the histogram of x.
import matplotlib.pyplot as plt
import numpy as np
data = np.random.randn(1000)
plt.hist(data, bins=30)
plt.show()  # Opens the figure window
Output: A figure window opens displaying a histogram.
#84.
plt.bar()
Makes a bar plot.
import matplotlib.pyplot as plt
plt.bar(['A', 'B', 'C'], [10, 15, 7])
plt.show()  # Opens the figure window
Output: A figure window opens displaying a bar chart.
#85.
plt.boxplot()
Makes a box and whisker plot.
import matplotlib.pyplot as plt
import numpy as np
data = [np.random.normal(0, std, 100) for std in range(1, 4)]
plt.boxplot(data)
plt.show()  # Opens the figure window
Output: A figure window opens displaying a box plot.
#86.
sns.heatmap()
Plots rectangular data as a color-encoded matrix.
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
data = np.random.rand(10, 12)
sns.heatmap(data)
plt.show()  # Opens the figure window
Output: A figure window opens displaying a heatmap.
#87.
sns.pairplot()
Plots pairwise relationships in a dataset.