Tech C**P
Python and Linux instructor and programmer @alirezastack
To extract only the text from an HTML page robustly, without resorting to regexes, raw urllib handling, or similar, use the Python library below, which converts HTML into Markdown-formatted plain text:
https://github.com/aaronsw/html2text

Usage from the terminal:
Usage: html2text.py [(filename|url) [encoding]]

To use it inside your Python code:
import html2text
# html2text() takes an HTML string and returns its text (Markdown) rendering
print(html2text.html2text("<p>Hello, world.</p>"))

#python #html2text #github #html #text