Cheat Sheet for Beautiful Soup 4

Beautiful Soup is a library for extracting data from HTML and XML files, which makes it well suited to web scraping.

1. Installation

pip install beautifulsoup4 requests  # requests is imported below and used in section 7

2. Import

from bs4 import BeautifulSoup
import requests

3. Basic parsing

html_doc = "<html><body><p class='text'>Hello, world!</p></body></html>"
soup = BeautifulSoup(html_doc, 'html.parser')  # or 'lxml', 'html5lib' (each installed separately)
print(soup.p.text)  # Hello, world!

4. Finding elements

# First found element
first_p = soup.find('p')

# Search by class or attribute
text_elem = soup.find('p', class_='text')
text_elem = soup.find('p', {'class': 'text'})

# All elements
all_p = soup.find_all('p')
all_text_class = soup.find_all(class_='text')
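
The difference between find() and find_all() matters most when nothing matches. The sketch below uses a hypothetical sample snippet (not from the post above) to show that find() gives back None while find_all() gives back an empty list, and that class_ matches a tag carrying several classes.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration
html_doc = """
<div>
  <p class="text">First</p>
  <p class="text note">Second</p>
  <p>Third</p>
</div>
"""
soup = BeautifulSoup(html_doc, "html.parser")

# find() returns the first match, or None when nothing matches
missing = soup.find("table")
first_text = soup.find("p", class_="text")

# find_all() always returns a list-like result, possibly empty
text_paragraphs = [p.get_text() for p in soup.find_all("p", class_="text")]
all_paragraphs = soup.find_all("p")
no_tables = soup.find_all("table")

print(missing)
print(first_text.get_text())
print(text_paragraphs)
print(len(all_paragraphs), len(no_tables))
```

Because find() can return None, it is usually worth checking the result before reading .text or an attribute from it.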

5. Working with attributes and text

a_tag = soup.find('a')
print(a_tag['href'])    # value of the href attribute
print(a_tag.get_text()) # text inside the tag
print(a_tag.text)       # alternative
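
Indexing a tag with ['href'] raises KeyError when the attribute is absent, so .get() is often the safer choice. A minimal sketch, assuming a made-up snippet with one link that has an href and one that does not:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration
html_doc = '<p>See <a href="/docs" class="button primary"> the docs </a> or <a>this</a>.</p>'
soup = BeautifulSoup(html_doc, "html.parser")
with_href, without_href = soup.find_all("a")

href = with_href["href"]                  # direct lookup; KeyError if missing
classes = with_href["class"]              # class is multi-valued, so this is a list
missing = without_href.get("href")        # None instead of an exception
fallback = without_href.get("href", "#")  # or a default of your choice
label = with_href.get_text(strip=True)    # text with surrounding whitespace removed

print(href, classes, missing, fallback, label)
print(with_href.attrs)                    # all attributes as a dict
```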

6. Navigating the tree

# Moving to parent, children, siblings
parent = soup.p.parent
children = soup.ul.children         # an iterator; wrap it in list() to index it
next_sibling = soup.p.next_sibling  # may be a whitespace string rather than a tag

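# Finding the previous/next element relative to a tag
# (calling find_previous on the soup itself always returns None)
elem = soup.find('div')
prev_elem = elem.find_previous('p')
next_elem = elem.find_next('div')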
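
A common surprise with tree navigation is that whitespace between tags counts as a node. The sketch below, built on a hypothetical list snippet, compares .next_sibling with find_next_sibling() and shows find_next()/find_previous() called on a tag.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Hypothetical sample markup, used only for illustration
html_doc = """<ul>
  <li>one</li>
  <li>two</li>
</ul>
<div>after</div>"""
soup = BeautifulSoup(html_doc, "html.parser")
first_li = soup.find("li")

parent_name = first_li.parent.name              # the enclosing <ul>
raw_sibling = first_li.next_sibling             # whitespace text between the two <li> tags
tag_sibling = first_li.find_next_sibling("li")  # skips text nodes, returns <li>two</li>

# Direct child tags only, ignoring whitespace strings
child_tags = [child.name for child in soup.ul.find_all(recursive=False)]

# Searching forward and backward in document order from a tag
next_div = first_li.find_next("div")
prev_li = tag_sibling.find_previous("li")

print(parent_name, repr(raw_sibling), tag_sibling.get_text())
print(child_tags, next_div.get_text(), prev_li.get_text())
```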

7. Parsing a real page

response = requests.get('https://example.com', timeout=10)
response.raise_for_status()  # stop early on HTTP errors
soup = BeautifulSoup(response.text, 'html.parser')
title = soup.title.text
links = [a['href'] for a in soup.find_all('a', href=True)]
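
Links collected this way are often relative, so they usually need to be resolved against the page URL before they can be requested. A minimal sketch, assuming a hypothetical base URL and inline HTML in place of a real response so it runs without network access:

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

# Hypothetical page URL and markup, standing in for response.url and response.text
base_url = "https://example.com/blog/"
html = """<a href="post-1.html">Post</a>
<a href="/about">About</a>
<a href="https://example.org/x">External</a>
<a name="anchor-only">No href</a>"""

soup = BeautifulSoup(html, "html.parser")

# href=True skips <a> tags without an href; urljoin resolves relative paths
absolute_links = [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]

for link in absolute_links:
    print(link)
```

With a live page, response.url can play the role of base_url, since it reflects any redirects that requests followed.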

8. CSS selectors

# More powerful and concise search
items = soup.select('div.content > p.text')
first_item = soup.select_one('a.button')
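
To see what the `>` combinator changes, the sketch below runs both a child selector and a descendant selector against a hypothetical snippet. The markup and the extra selectors are illustrative assumptions, not part of the post above.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration
html_doc = """
<div class="content">
  <p class="text">direct child</p>
  <section><p class="text">nested deeper</p></section>
  <a class="button" href="/buy">Buy</a>
</div>
"""
soup = BeautifulSoup(html_doc, "html.parser")

# '>' matches direct children only; a space matches any descendant
direct = [p.get_text() for p in soup.select("div.content > p.text")]
descendants = [p.get_text() for p in soup.select("div.content p.text")]

# select_one() returns the first match, or None when nothing matches
button = soup.select_one("a.button")
missing = soup.select_one("a.missing")

# Attribute selectors work too, here: href values starting with "/"
rooted_links = [a["href"] for a in soup.select("a[href^='/']")]

print(direct)
print(descendants)
print(button["href"], missing, rooted_links)
```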

tags: #cheat_sheet #useful
