Htmldate Versions

Fast and robust date extraction from web pages, with Python or on the command line

v1.8.1

1 month ago
  • fix: more restrictive YYYYMM pattern to prevent ValueError with @b3n4kh (#145)
  • maintenance: add pre-commit checks with @nadasuhailAyesh12 (#142)
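
The YYYYMM fix above can be illustrated with a minimal sketch. The pattern and the helper `parse_yyyymm` are hypothetical and are not htmldate's actual code; the year range is an arbitrary assumption. The sketch shows how restricting the month to 01-12 inside the regular expression keeps invalid strings from ever reaching `datetime`, which would otherwise raise `ValueError`.

```python
import re
from datetime import datetime
from typing import Optional

# Hypothetical pattern (not htmldate's own): a year starting with 19 or 20,
# followed by a month restricted to 01-12, as a standalone six-digit token.
YYYYMM_STRICT = re.compile(r"\b((?:19|20)[0-9]{2})(0[1-9]|1[0-2])\b")


def parse_yyyymm(text: str) -> Optional[datetime]:
    """Return the first day of the matched month, or None if nothing matches."""
    match = YYYYMM_STRICT.search(text)
    if match is None:
        return None
    # The month is guaranteed to be valid here, so datetime() cannot raise.
    return datetime(int(match[1]), int(match[2]), 1)
```

With a looser pattern such as six arbitrary digits, a string like `202413` would match and the `datetime` constructor would fail on month 13; here it is simply rejected by the pattern.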

v1.8.0

2 months ago
  • change license to Apache 2.0 (#140)
  • compile XPath expressions (#136)
  • update docs with @EkaterineSheshelidze (#135)

v1.7.0

4 months ago
  • fix meta property updated vs. original behavior (#121)
  • support for lxml version 5.0+ (#127)
  • fix image links in README

v1.6.1

4 months ago
  • fix for macOS: pin lxml dependency with @adamh-oai

v1.6.0

5 months ago
  • focus on precision, stricter extraction patterns (#103, #105, #106, #112)
  • simplified code base (#108, #109)
  • replaced lxml.html.Cleaner (#104)
  • extended evaluation

Full Changelog: https://github.com/adbar/htmldate/compare/v1.5.2...v1.6.0

v1.5.2

7 months ago
  • fix for missing months keys in custom extractor (#100)
  • fix for None in try_date_expr() (#101)

v1.5.1

8 months ago
  • fix regression for fast extraction introduced in e8b3538 (#96)
  • fix setup by making backports-datetime-fromisoformat optional (#95)
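
A common way to make such a backport optional is a guarded import, sketched below. This is an assumption about the general technique, not a copy of htmldate's setup code; the helper `parse_iso` is hypothetical. The `MonkeyPatch.patch_fromisoformat()` call is the entry point documented by the backports-datetime-fromisoformat package.

```python
from datetime import datetime

try:
    # Optional dependency: only used if the backport happens to be installed.
    from backports.datetime_fromisoformat import MonkeyPatch

    MonkeyPatch.patch_fromisoformat()
except ImportError:
    # Backport not installed: rely on the standard library implementation.
    pass


def parse_iso(value: str) -> datetime:
    """Parse an ISO 8601 date string with whichever implementation is active."""
    return datetime.fromisoformat(value)
```

The package then installs and imports cleanly whether or not the backport is available.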

v1.5.0

8 months ago
  • slightly higher accuracy with revised heuristics
  • simplified code structure for better performance
  • setup: support for Python 3.12, fromisoformat backport if applicable
  • HTML parsing fixes: more lenient parsing, pinned lxml version for macOS

v1.4.3

1 year ago
  • maintenance release: upgrade urllib3 dependency

v1.4.2

1 year ago
  • support min_date/max_date as datetimes or datetime strings with @kernc (#73)
  • add date attributes to HTML extraction with @kernc (#74)
  • fix for extraction of updated and original dates in time elements
  • code refactoring and maintenance
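
Accepting bounds as either datetimes or datetime strings usually comes down to a small normalization step. The sketch below is hypothetical (the name `to_datetime_bound` and the fallback behavior are assumptions, not htmldate's implementation) and only shows the general idea.

```python
from datetime import datetime
from typing import Optional, Union


def to_datetime_bound(
    value: Optional[Union[datetime, str]], default: datetime
) -> datetime:
    """Normalize a min/max bound given as datetime, ISO string, or None."""
    if value is None:
        return default
    if isinstance(value, datetime):
        return value
    try:
        # Strings are expected in ISO 8601 form, e.g. "2020-01-01".
        return datetime.fromisoformat(value)
    except ValueError:
        # Assumption: an unparsable string falls back to the default bound.
        return default
```

For example, `to_datetime_bound("2020-01-01", datetime(1995, 1, 1))` and `to_datetime_bound(datetime(2020, 1, 1), datetime(1995, 1, 1))` yield the same bound.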

Full Changelog: https://github.com/adbar/htmldate/compare/v1.4.1...v1.4.2