Learn Python Coding
38.7K subscribers
1.06K photos
37 videos
24 files
855 links
Learn Python through simple, practical examples and real coding ideas. Clear explanations, useful snippets, and hands-on learning for anyone starting or improving their programming skills.

Admin: @HusseinSheikho || @Hussein_Sheikho
Download Telegram
📱 Cheat Sheet for Beautiful Soup 4

Beautiful Soup — a library for extracting data from HTML and XML files, which is perfect for web scraping.

1. Installation
pip install beautifulsoup4


2. Import
from bs4 import BeautifulSoup
import requests


3. Basic parsing
html_doc = "<html><body><p class='text'>Hello, world!</p></body></html>"
soup = BeautifulSoup(html_doc, 'html.parser')  # or 'lxml', 'html5lib'
print(soup.p.text)  # Hello, world!


4. Finding elements
# First found element
first_p = soup.find('p')

# Search by class or attribute
text_elem = soup.find('p', class_='text')
text_elem = soup.find('p', {'class': 'text'})

# All elements
all_p = soup.find_all('p')
all_text_class = soup.find_all(class_='text')


5. Working with attributes and text
a_tag = soup.find('a')
print(a_tag['href&#39])    # value of the href attribute
print(a_tag.get_text()) # text inside the tag
print(a_tag.text)       # alternative


6. Navigating the tree
# Moving to parent, children, siblings
parent = soup.p.parent
children = soup.ul.children
next_sibling = soup.p.next_sibling

# Finding the previous/next element
prev_elem = soup.find_previous('p')
next_elem = soup.find_next('div')


7. Parsing a real page
response = requests.get('https://example.com')
soup = BeautifulSoup(response.text, 'html. parser')
title = soup.title.text
links = [a['href'] for a in soup.find_all('a', href=True)]


8. CSS selectors
# More powerful and concise search
items = soup.select('div.content > p.text')
first_item = soup.select_one('a.button')


tags: #cheat_sheet #useful

https://t.me/DataScience4
Please open Telegram to view this post
VIEW IN TELEGRAM
3👍1