Python | Machine Learning | Coding

# 📚 Python Tutorial: Convert EPUB to PDF (Preserving Images)
#Python #EPUB #PDF #EbookConversion #Automation

This comprehensive guide will show you how to convert EPUB files (including those with images) to high-quality PDFs using Python.

---

## 🔹 Required Tools & Libraries
We'll use these Python packages:
- ebooklib - For EPUB parsing
- pdfkit (wrapper for wkhtmltopdf) - For PDF generation
- beautifulsoup4 - For inspecting the chapter HTML
- Pillow - For image validation (optional)

```bash
pip install ebooklib pdfkit beautifulsoup4 pillow
```

Also install the wkhtmltopdf system binary:

```bash
# On Ubuntu/Debian
sudo apt-get install wkhtmltopdf

# On macOS
brew install wkhtmltopdf

# On Windows: download the installer from wkhtmltopdf.org
```
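
Because pdfkit only wraps the wkhtmltopdf binary, it can help to confirm the binary is reachable before converting anything. The sketch below is one minimal way to do that, assuming the binary is either on PATH or at a location you pass in yourself. The helper name `find_wkhtmltopdf` is hypothetical and not part of pdfkit.

```python
import os
import shutil


def find_wkhtmltopdf(explicit_path=None):
    """Return the path to the wkhtmltopdf binary, or None if it is not found."""
    # An explicit path wins; useful when the binary is installed
    # somewhere that is not on PATH (this can happen on Windows).
    if explicit_path and os.path.isfile(explicit_path):
        return explicit_path
    # Otherwise fall back to a normal PATH lookup.
    return shutil.which("wkhtmltopdf")


if __name__ == "__main__":
    path = find_wkhtmltopdf()
    if path is None:
        print("wkhtmltopdf not found; install it or pass its full path")
    else:
        print(f"Using wkhtmltopdf at {path}")
        # pdfkit can then be pointed at this binary explicitly:
        #   config = pdfkit.configuration(wkhtmltopdf=path)
        #   pdfkit.from_file(..., configuration=config)
```

If the lookup returns `None`, the conversion steps that follow will fail no matter how the Python side is written, so this check is worth running first.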

---

## 🔹 Step 1: Extract EPUB Contents
First, we'll unpack the EPUB file to access its HTML and images.

```python
import os

import ebooklib
from ebooklib import epub
from bs4 import BeautifulSoup

def extract_epub(epub_path, output_dir):
    book = epub.read_epub(epub_path)

    # Create output directory
    os.makedirs(output_dir, exist_ok=True)

    html_files = []
    # Extract chapters and images (the item type constants live in the
    # top-level ebooklib module, not in ebooklib.epub)
    for item in book.get_items():
        if item.get_type() not in (ebooklib.ITEM_IMAGE, ebooklib.ITEM_DOCUMENT):
            continue
        # Item names can include subfolders (e.g. "Images/cover.jpg"),
        # so create the parent directory before writing
        target = os.path.join(output_dir, item.get_name())
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, 'wb') as f:
            f.write(item.get_content())
        if item.get_type() == ebooklib.ITEM_DOCUMENT:
            html_files.append(item.get_name())

    return html_files
```
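
One thing to watch: the chapter list comes back in the order `get_items()` yields it, which is not necessarily the reading order. In an EPUB the reading order is defined by the spine. The sketch below reorders the extracted names to follow it, assuming `book` is the object returned by `epub.read_epub()` (it relies on its `spine` attribute and `get_item_with_id()` method). The helper name `order_by_spine` is hypothetical.

```python
def order_by_spine(book, extracted_names):
    """Reorder extracted chapter names so they follow the EPUB spine."""
    extracted = list(extracted_names)
    remaining = set(extracted)
    ordered = []
    for entry in book.spine:
        # Spine entries are usually (idref, linear) pairs; accept bare ids too
        idref = entry[0] if isinstance(entry, (tuple, list)) else entry
        item = book.get_item_with_id(idref)
        name = item.get_name() if item is not None else None
        # Only keep names that were actually written to disk
        if name in remaining:
            ordered.append(name)
            remaining.discard(name)
    # Anything the spine did not mention keeps its original relative order
    ordered.extend(n for n in extracted if n in remaining)
    return ordered
```

To use it, `extract_epub` would need to hand back the `book` object as well, for example by returning `order_by_spine(book, html_files)` instead of the plain list.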

---

## 🔹 Step 2: Convert HTML to PDF
Now we'll convert the extracted HTML files to PDF while preserving images.

```python
import pdfkit
from PIL import Image  # For image validation (optional)

def html_to_pdf(html_files, output_pdf, base_dir):
    options = {
        'encoding': "UTF-8",
        'quiet': '',
        'enable-local-file-access': '',  # Critical for local images
        'no-outline': None,
        'margin-top': '15mm',
        'margin-right': '15mm',
        'margin-bottom': '15mm',
        'margin-left': '15mm',
    }

    # Validate images (optional)
    for html_file in html_files:
        html_path = os.path.join(base_dir, html_file)
        with open(html_path, 'rb') as f:
            soup = BeautifulSoup(f, 'html.parser')
        removed = False
        for img in soup.find_all('img'):
            # src is relative to the chapter file, not to base_dir
            img_path = os.path.join(os.path.dirname(html_path), img.get('src', ''))
            try:
                with Image.open(img_path) as im:
                    im.verify()  # Validate image
            except Exception as e:
                print(f"Image error in {html_file}: {e}")
                img.decompose()  # Remove broken images
                removed = True
        # Write the cleaned HTML back, otherwise the removal has no effect
        if removed:
            with open(html_path, 'w', encoding='utf-8') as f:
                f.write(str(soup))

    # Convert to PDF
    pdfkit.from_file(
        [os.path.join(base_dir, f) for f in html_files],
        output_pdf,
        options=options
    )
```
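
Image `src` values in EPUB chapters are usually relative to the chapter file itself, may be URL-encoded, and can also point at remote or inline (`data:`) images that have no local file to check. The sketch below shows one way to resolve them before validation, under those assumptions. The helper name `resolve_img_src` is hypothetical.

```python
import os
from urllib.parse import unquote, urlparse


def resolve_img_src(html_path, src):
    """Map an <img src> value to a local file path, or None if it is not local."""
    parsed = urlparse(src)
    # Anything with a scheme or host (http:, https:, data:, //cdn/...) is not a local file
    if parsed.scheme or parsed.netloc:
        return None
    # Drop any query/fragment and decode %20-style escapes
    relative = unquote(parsed.path)
    # Resolve relative to the chapter file, not the extraction root
    return os.path.normpath(os.path.join(os.path.dirname(html_path), relative))
```

A validation loop could call this for each `<img>` tag and skip the tag when the result is `None`, so remote and inline images are left alone instead of being treated as broken.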

---

## 🔹 Step 3: Complete Conversion Function
Combine everything into a single workflow.

```python
import shutil

def epub_to_pdf(epub_path, output_pdf, temp_dir="temp_epub"):
    try:
        print(f"Converting {epub_path} to PDF...")

        # Step 1: Extract EPUB
        print("Extracting EPUB contents...")
        html_files = extract_epub(epub_path, temp_dir)

        # Step 2: Convert to PDF
        print("Generating PDF...")
        html_to_pdf(html_files, output_pdf, temp_dir)

        print(f"Success! PDF saved to {output_pdf}")
        return True

    except Exception as e:
        print(f"Conversion failed: {e}")
        return False
    finally:
        # Clean up temporary files
        if os.path.exists(temp_dir):
            shutil.rmtree(temp_dir)
```
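
A fixed `temp_dir` name means the cleanup step will also delete a folder of that name that already existed before the run. One possible alternative, sketched below, is to let `tempfile` create a unique folder per conversion. The function name `convert_in_private_temp_dir` is hypothetical, and the two steps are passed in as callables (with the same signatures as `extract_epub` and `html_to_pdf`) only so the sketch stays self-contained.

```python
import os
import tempfile


def convert_in_private_temp_dir(epub_path, output_pdf, extract, render):
    """Run extract -> render inside a fresh temporary directory.

    extract(epub_path, temp_dir) must return the list of chapter files;
    render(html_files, output_pdf, temp_dir) must write the PDF.
    """
    # Resolve the output path up front so it does not depend on temp_dir
    output_pdf = os.path.abspath(output_pdf)
    # TemporaryDirectory creates a unique folder and removes it on exit,
    # even if one of the steps raises
    with tempfile.TemporaryDirectory(prefix="epub2pdf_") as temp_dir:
        html_files = extract(epub_path, temp_dir)
        render(html_files, output_pdf, temp_dir)
    return output_pdf
```

With the functions from this tutorial, a call would look like `convert_in_private_temp_dir("book.epub", "book.pdf", extract_epub, html_to_pdf)`.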

---

## 🔹 Advanced Options
### 1. Custom Styling
Add CSS to improve PDF appearance:

```python
def html_to_pdf(html_files, output_pdf, base_dir):
    # Write the stylesheet first so wkhtmltopdf can load it
    css = """
    body { font-family: "Times New Roman", serif; font-size: 12pt; }
    img { max-width: 100%; height: auto; }
    """
    css_path = os.path.abspath(os.path.join(base_dir, 'styles.css'))
    with open(css_path, 'w', encoding='utf-8') as f:
        f.write(css)

    options = {
        # ... previous options ...
        'user-style-sheet': css_path,  # Custom CSS (same file written above)
    }

    pdfkit.from_file(
        [os.path.join(base_dir, f) for f in html_files],
        output_pdf,
        options=options
    )
```


🚀 Comprehensive Tutorial: Build a Folder Monitoring & Intruder Detection System in Python

In this comprehensive, step-by-step tutorial, you will learn how to build a real-time folder monitoring and intruder detection system using Python.

🔐 Your Goal:
Create a background program that:
- Monitors a specific folder on your computer.
- Instantly captures a photo using the webcam whenever someone opens that folder.
- Saves the photo with a timestamp in a secure folder.
- Runs automatically when Windows starts.
- Keeps running until you manually stop it (e.g., via Task Manager or a hotkey).

Read and get code: https://hackmd.io/@husseinsheikho/Build-a-Folder-Monitoring

#Python #Security #FolderMonitoring #IntruderDetection #OpenCV #FaceCapture #Automation #Windows #TaskScheduler #ComputerVision

✉️ Our Telegram channels: https://t.me/addlist/0f6vfFbEMdAwODBk

📱 Our WhatsApp channel: https://whatsapp.com/channel/0029VaC7Weq29753hpcggW2A


🥇

This repo is like gold for every data scientist!

✅ Just open your browser; a ton of interactive exercises and real experiences await you. Any question about statistics, probability, Python, or machine learning, you'll get the answer right there! With code, charts, even animations. This way, you don't waste time, and what you learn really sticks in your mind!

⬅️ Data science statistics and probability topics
⬅️ Clustering
⬅️ Principal Component Analysis (PCA)
⬅️ Bagging and Boosting techniques
⬅️ Linear regression
⬅️ Neural networks and more...

┌ 📂 Int Data Science Python Dash
└ 🐱 GitHub-Repos

👉

@codeprogrammer

#Python #OpenCV #Automation #ML #AI #DEEPLEARNING #MACHINELEARNING #ComputerVision
