🥫 The simple, fast, and modern web scraping library
Soup now automatically formats and indents (pretty print) HTML where possible
Soup.get("url") alternative initializer
.find is now able to capture malformed void tags (<img />, vs. <img>) (thanks for the Issue @mallegrini!)
.find(..., strict=) is now .find(..., partial=)
.remove_tags is now .strip
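
The void-tag entry above concerns markup where the same element appears both as <img> and as the self-closing <img />. The following is only a rough sketch of that idea and does not use this library: it relies on Python's standard html.parser module, and the ImgCollector class and collect_images function are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Hypothetical helper: records the attributes of every <img> it sees."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Called for <img>. HTMLParser's default handle_startendtag also
        # routes the self-closing form <img /> through this method.
        if tag == "img":
            self.images.append(dict(attrs))


def collect_images(html):
    """Return one attribute dict per <img> element, in document order."""
    parser = ImgCollector()
    parser.feed(html)
    parser.close()
    return parser.images


html = '<p><img src="a.png"><img src="b.png" /></p>'
images = collect_images(html)
print(images)
```

Both spellings end up in the same list, which is the behaviour a caller would presumably want from a tag search that tolerates either form.
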
0.9.2 (2020-04-21)
find(..., mode='first') now returns None instead of raising an IndexError (thanks, psyonara!)
Fixed UnicodeEncodeError lurking beneath get (thanks for the "Issue" mlehotay!)
find method now properly handles non-closing HTML tags
remove_tags method for isolating formatted text in a block of HTML
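
The mode='first' entry above describes a change in contract: an empty result yields None rather than an IndexError. A minimal sketch of that pattern, independent of the library's actual implementation, might look like the following; first_or_none is a hypothetical helper and the match strings are made-up sample data.

```python
def first_or_none(matches):
    """Hypothetical helper: return the first match, or None if there are none."""
    # Indexing an empty list (matches[0]) would raise IndexError,
    # so check for emptiness before indexing.
    return matches[0] if matches else None


found = first_or_none(["<span>1</span>", "<span>2</span>"])
missing = first_or_none([])
print(found, missing)
```

With this contract, calling code can test the result with `is None` instead of wrapping the lookup in a try/except block.
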