html.soupparser

External interface to the BeautifulSoup HTML parser.

Module Contents

Classes

_PseudoTag(self,contents)

Functions

fromstring(data,beautifulsoup=None,makeelement=None,**bsargs) Parse a string of HTML data into an Element tree using the BeautifulSoup parser.
parse(file,beautifulsoup=None,makeelement=None,**bsargs) Parse a file into an ElementTree using the BeautifulSoup parser.
convert_tree(beautiful_soup_tree,makeelement=None) Convert a BeautifulSoup tree to a list of Element trees.
_parse(source,beautifulsoup,makeelement,**bsargs)
_convert_tree(beautiful_soup_tree,makeelement)
_init_node_converters(makeelement)
unescape(string)
fromstring(data, beautifulsoup=None, makeelement=None, **bsargs)

Parse a string of HTML data into an Element tree using the BeautifulSoup parser.

Returns the root <html> Element of the tree.

You can pass a different BeautifulSoup parser through the beautifulsoup keyword, and a different Element factory function through the makeelement keyword. By default, the standard BeautifulSoup class and the default factory of lxml.html are used.
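
A minimal usage sketch of fromstring, assuming both lxml and the bs4 package are installed. The markup and variable names are illustrative only and do not come from the original documentation.

```python
from lxml.html.soupparser import fromstring

# Deliberately sloppy markup: neither <p> nor <b> is closed.
broken_html = "<p>Hello <b>world"

# BeautifulSoup repairs the markup; the result is the root <html> Element.
root = fromstring(broken_html)

print(root.tag)                 # html
bold_text = root.findtext(".//b")
print(bold_text)
```

Because the return value is an ordinary lxml Element, the usual ElementTree-style calls (find, findall, iter) should work on it.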

parse(file, beautifulsoup=None, makeelement=None, **bsargs)

Parse a file into an ElementTree using the BeautifulSoup parser.

You can pass a different BeautifulSoup parser through the beautifulsoup keyword, and a different Element factory function through the makeelement keyword. By default, the standard BeautifulSoup class and the default factory of lxml.html are used.
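
A short sketch of parse, again assuming lxml and bs4 are installed. An in-memory io.StringIO object stands in for a real file here; the sample document is made up for illustration.

```python
import io

from lxml.html.soupparser import parse

# Any object with a read() method can serve as the file argument in this sketch.
source = io.StringIO("<html><body><h1>Title</h1></body></html>")

# Unlike fromstring, parse wraps the root element in an ElementTree.
tree = parse(source)
root = tree.getroot()

print(root.tag)                 # html
heading = tree.findtext(".//h1")
print(heading)
```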

convert_tree(beautiful_soup_tree, makeelement=None)

Convert a BeautifulSoup tree to a list of Element trees.

Returns a list instead of a single root Element to support HTML-like soup with more than one root element.

You can pass a different Element factory through the makeelement keyword.
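
The multi-root case can be sketched as follows, assuming lxml and bs4 are installed. The soup is built directly with bs4's BeautifulSoup class and the html.parser backend; the markup is an illustrative example, not taken from the original documentation.

```python
from bs4 import BeautifulSoup
from lxml.html.soupparser import convert_tree

# Two sibling <p> elements with no enclosing <html>: a multi-root "soup".
soup = BeautifulSoup("<p>one</p><p>two</p>", "html.parser")

# convert_tree returns a list, so both top-level elements survive.
elements = convert_tree(soup)

tags = [el.tag for el in elements]
texts = [el.text for el in elements]
print(tags)
print(texts)
```

This is mainly useful when a BeautifulSoup tree already exists and only the conversion step is needed; for raw strings or files, fromstring and parse cover both steps.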

_parse(source, beautifulsoup, makeelement, **bsargs)
class _PseudoTag(contents)
__init__(contents)
__iter__()
_convert_tree(beautiful_soup_tree, makeelement)
_init_node_converters(makeelement)
unescape(string)