A Python library for pulling data out of HTML and XML files. It works with your favorite parser to provide idiomatic ways of navigating, searching, and modifying the parse tree.
Beautiful Soup

- Python
- Data analysis
- Information technology (IT), Search engine, Software, Software Development Kit (SDK), User interface (UI)
Features:
- HTML parsing, XML parsing, web scraping, data extraction, navigating parse trees, searching (by tag, attributes, text), modifying the parse tree, handling broken HTML
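As an illustration of the features listed above, here is a minimal sketch of navigating, searching, and modifying a parse tree. The sample markup and the variable names are invented for this example; the built-in `html.parser` is used so that no extra parser has to be installed.

```python
from bs4 import BeautifulSoup

# Sample markup invented for this illustration.
html = """
<html><body>
  <h1 id="title">Catalog</h1>
  <ul>
    <li class="item"><a href="/a">Alpha</a></li>
    <li class="item"><a href="/b">Beta</a></li>
  </ul>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Navigating: walk down the tree by tag name.
heading = soup.body.h1
heading_text = heading.get_text()

# Searching by tag and attribute.
items = soup.find_all("li", class_="item")
links = [a["href"] for a in soup.find_all("a")]

# Searching by text content.
beta = soup.find("a", string="Beta")

# Modifying: replace a tag's text and change an attribute in place.
heading.string = "Updated catalog"
beta["href"] = "/beta"
```

After the last two statements, the changes are part of the tree itself, so `str(soup)` or `soup.prettify()` would serialize the modified document.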
Pricing:
- Free
Pros:
- Easy to use for web scraping, handles malformed markup well, well documented, integrates with several parsers (lxml, html5lib, Python's html.parser).
Cons:
- Does not fetch web pages itself (an HTTP library such as `requests` is needed), can be slow on very large documents unless paired with the lxml parser, and only parses markup: it does not render pages.
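The points about parser choice, malformed markup, and fetching can be sketched as follows. The helper names `make_soup` and `fetch_soup` are hypothetical and not part of Beautiful Soup. The sketch assumes that lxml may or may not be installed and falls back to the built-in parser when it is missing; `fetch_soup` additionally assumes the `requests` package is available and is shown only to illustrate the division of labor.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred="lxml"):
    """Parse markup with the preferred parser, falling back to html.parser.

    Hypothetical helper: Beautiful Soup raises FeatureNotFound when the
    requested parser is not installed.
    """
    try:
        return BeautifulSoup(markup, preferred)
    except FeatureNotFound:
        return BeautifulSoup(markup, "html.parser")


def fetch_soup(url):
    """Download a page with requests, then hand the text to Beautiful Soup.

    Hypothetical helper; requests is imported here so the rest of the
    example still runs when it is not installed.
    """
    import requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return make_soup(response.text)


# Malformed markup: neither <p> nor <b> is ever closed.
broken = "<p>Unclosed paragraph<b>bold text"
soup = make_soup(broken)
bold_text = soup.b.get_text()
```

Both parsers close the dangling tags, so the bold text can still be read from the tree, although the exact structure each parser builds around it can differ.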
Ideal for:
- Web scraping and extracting structured data from HTML and XML documents for data collection and analysis.
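A typical use of this kind is turning an HTML table into records ready for analysis. The following is a rough sketch under stated assumptions: the table markup, the `prices` id, and the `table_to_records` helper are all invented for illustration, and the table is assumed to have a single header row of `<th>` cells.

```python
from bs4 import BeautifulSoup

# Sample table invented for this illustration.
html = """
<table id="prices">
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""


def table_to_records(markup, table_id):
    """Return one dict per data row, keyed by the header cell text."""
    soup = BeautifulSoup(markup, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        return []

    rows = table.find_all("tr")
    if not rows:
        return []

    # Assumption: the first row holds the column names.
    headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]

    records = []
    for row in rows[1:]:
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        # Skip rows whose cell count does not match the header.
        if len(cells) == len(headers):
            records.append(dict(zip(headers, cells)))
    return records


records = table_to_records(html, "prices")
```

The values stay as strings here; converting prices to numbers or loading the records into a data-analysis library is left to the caller.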