A Python library for pulling data out of HTML and XML files. It works with your favorite parser to provide idiomatic ways of navigating, searching, and modifying the parse tree.
Beautiful Soup

- Python
- Data analysis
- Information technology (IT), Search engine, Software, Software development kit (SDK), User interface (UI)
Features:
- HTML parsing, XML parsing, web scraping, data extraction, navigating parse trees, searching (by tag, attributes, text), modifying the parse tree, handling broken HTML
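
The features listed above can be illustrated with a minimal sketch. It assumes the `beautifulsoup4` package is installed and uses Python's built-in `html.parser`; the sample markup, tag ids, and variable names are invented for illustration only.

```python
from bs4 import BeautifulSoup

# Sample markup (hypothetical, for illustration only).
html = """
<html><head><title>Example catalog</title></head>
<body>
<p class="intro">Welcome to the <b>catalog</b>.</p>
<a href="/a" id="first">Item A</a>
<a href="/b">Item B</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Navigating the parse tree by attribute access.
title_text = soup.title.string

# Searching by tag, by attribute, and by text.
links = [a["href"] for a in soup.find_all("a")]
first = soup.find("a", id="first")
intro = soup.find("p", class_="intro")
item_b = soup.find("a", string="Item B")

# Modifying the parse tree in place.
first["href"] = "/a-updated"
intro.b.string = "store"

# Handling broken HTML: the unclosed tags are still parsed into a tree.
broken = BeautifulSoup("<p>Unclosed <b>bold", "html.parser")
bold_text = broken.b.get_text()
```

After the modifications, `str(soup)` would serialize the updated tree, which is one way to write the changed document back out.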
Pricing:
- Free
Pros:
- Easy to use for web scraping, handles malformed markup well, well documented, works with several parsers (lxml, html5lib, Python's html.parser).
Cons:
- Does not fetch web pages (a library such as `requests` is needed for that), can be slow on very large documents, especially when lxml is not the underlying parser, and is a parsing library only, not a rendering engine.
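
Because fetching and parsing are separate steps, a typical setup pairs Beautiful Soup with an HTTP client and picks a parser explicitly. The sketch below is one possible arrangement, not a prescribed pattern: `fetch_html` and `make_soup` are hypothetical helper names, `fetch_html` assumes `requests` is installed, and the fallback assumes `lxml` may or may not be available.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def fetch_html(url, timeout=10):
    """Hypothetical helper: download a page with requests and return its text."""
    import requests  # imported lazily so the parsing code runs without it

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def make_soup(markup, preferred="lxml"):
    """Hypothetical helper: prefer the faster lxml parser, else use the stdlib one."""
    try:
        return BeautifulSoup(markup, preferred)
    except FeatureNotFound:
        # Raised by Beautiful Soup when the requested parser is not installed.
        return BeautifulSoup(markup, "html.parser")


# Parsing works on any string, whether fetched or built locally.
soup = make_soup("<ul><li>one</li><li>two</li></ul>")
items = [li.get_text() for li in soup.find_all("li")]
```

In real use the two helpers would be chained, for example `make_soup(fetch_html(url))`, with the URL supplied by the caller.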
Ideal for:
- Web scraping and extracting structured data from HTML and XML documents for data collection and analysis.
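
As a rough illustration of extracting structured data for analysis, the following sketch turns an HTML table into a list of dictionaries. The table contents, the `prices` id, and the column names are made up for the example; real pages usually need more defensive handling.

```python
from bs4 import BeautifulSoup

# Hypothetical table, standing in for markup scraped from a page.
html = """
<table id="prices">
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", id="prices")

# Column names come from the header cells.
headers = [th.get_text(strip=True) for th in table.find_all("th")]

# Each data row becomes a dict keyed by column name; the header row has no <td>.
rows = []
for tr in table.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:
        rows.append(dict(zip(headers, cells)))
```

The resulting `rows` list can then be handed to `csv.DictWriter` or a data-analysis library for further processing.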