Datatest Versions

Tools for test driven data-wrangling and data validation.

0.11.1

3 years ago
  • Fixed validation, predicate, and difference handling of non-comparable objects.
  • Fixed bug in normalization of Queries from squint package.
  • Changed failure output to improve error reporting with pandas accessors.
  • Changed predicate failure message to quote code objects using backticks.

0.11.0

3 years ago
  • Removed deprecated decorators: skip(), skipIf(), skipUnless() (use unittest.skip(), etc. instead).
  • Removed deprecated aliases Selector and ProxyGroup.
  • Removed the long-deprecated allowed interface.
  • Removed deprecated acceptances: "specific", "limit", etc.
  • Removed deprecated Select, Query, and Result API. Use the squint package instead.
  • Removed deprecated get_reader() function. Use the get-reader package instead.

0.10.0

3 years ago
  • Fixed bug where ValidationErrors were crashing pytest-xdist workers.

  • Added tighter Pandas integration using Pandas' extension API.

    After calling the new register_accessors() function, your existing DataFrame, Series, Index, and MultiIndex objects will have a validate() method that can be used instead of the validate() function:

    import pandas as pd
    import datatest as dt
    
    dt.register_accessors()  # <- Activate Pandas integration.
    
    df = pd.DataFrame(...)
    df[['A', 'B']].validate((str, int))  # <- New accessor method.
    
  • Changed Pandas validation behavior:

    • DataFrame and Series: These objects are treated as sequences when they use a RangeIndex (the default index type when no index is specified). They are treated as dictionaries when they use an index of any other type--the index values become the dictionary keys.

    • Index and MultiIndex: These objects are treated as sequences.

  • Changed repr behavior of Deviation to make timedeltas more readable.

  • Added Predicate matching support for NumPy types np.character, np.integer, np.floating, and np.complexfloating.

  • Added improved NaN handling:

    • Added NaN support to accepted.keys(), accepted.args(), and validate.interval().
    • Improved existing NaN support for difference comparisons.
    • Added how-to documentation for NaN handling.
  • Added data handling support for squint.Select objects.

  • Added deprecation warnings for soon-to-be-removed functions and classes:

    • Added DeprecationWarning to get_reader function. This function is now available from the get-reader package on PyPI:

      https://pypi.org/project/get-reader/

    • Added DeprecationWarning to Select, Query, and Result classes. These classes will be removed in the next release but are now available from the squint package on PyPI:

      https://pypi.org/project/squint/

  • Changed validate.subset() and validate.superset() behavior:

    The semantics are now inverted. This behavior was flipped to more closely match user expectations. The previous semantics were used because they reflect the internal structure of datatest more precisely. But these are implementation details, and they are not as important as having a more intuitive API.

  • Added a temporary warning when using the new subset() and superset() methods to alert users to the new behavior. This warning will be removed from future versions of datatest.

  • Added Python 3.9 and 3.10 testing and support.

  • Removed Python 3.1 testing and support. If you were still using this version of Python, please email me--this is a story I need to hear.
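
The subset() and superset() inversion described above can be pictured with plain Python sets. This is only a sketch, not datatest's implementation: it assumes the 0.10.0 reading is "the data is a subset of the requirement" and that the earlier reading was the reverse. Both helper names are hypothetical.

```python
# Hypothetical helpers that mimic the two readings with plain sets.
# They are NOT part of datatest; they only illustrate the inversion.

def is_subset_new(data, requirement):
    """Assumed 0.10.0+ reading: every element of data is in requirement."""
    return set(data) <= set(requirement)


def is_subset_old(data, requirement):
    """Assumed pre-0.10.0 reading: every element of requirement is in data."""
    return set(requirement) <= set(data)


data = ['A', 'B']
requirement = {'A', 'B', 'C'}

print(is_subset_new(data, requirement))  # True
print(is_subset_old(data, requirement))  # False
```

The same data and requirement pass under one reading and fail under the other, which is presumably why a temporary warning accompanies the change.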

0.9.6

4 years ago
  • Changed acceptance API to make it both less verbose and more expressive:

    • Consolidated specific-instance and class-based acceptances into a single interface.

    • Added a new accepted.tolerance() method that subsumes the behavior of accepted.deviation() by supporting Missing and Extra quantities in addition to Deviation objects.

    • Deprecated old methods:

      Old Syntax                 New Syntax
      accepted.specific(...)     accepted(...)
      accepted.missing()         accepted(Missing)
      accepted.extra()           accepted(Extra)
      NO EQUIVALENT              accepted(CustomDifferenceClass)
      accepted.deviation(...)    accepted.tolerance(...)
      accepted.limit(...)        accepted.count(...)
      NO EQUIVALENT              accepted.count(..., scope='group')

      Other methods--accepted.args(), accepted.keys(), etc.--remain unchanged.

  • Changed validation to generate Deviation objects for a broader definition of quantitative values (like datetime objects)--not just for subclasses of numbers.Number.

  • Changed handling for pandas.Series objects to treat them as sequences instead of mappings.

  • Added handling for DBAPI2 cursor objects to automatically unwrap single-value rows.

  • Removed acceptance classes from datatest namespace--these were inadvertently added in a previous version but were never part of the documented API. They can still be referenced via the acceptances module:

    from datatest.acceptances import ...
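
To see why a broader definition of "quantitative" matters for Deviation objects, consider datetime values: they are not numbers.Number subclasses, yet subtracting one from another gives a meaningful deviation. The sketch below uses only the standard library; quantitative_difference() is a hypothetical helper, not a datatest function.

```python
from datetime import datetime, timedelta
from numbers import Number


def quantitative_difference(actual, expected):
    """Hypothetical helper: return actual - expected for any type that
    supports subtraction, not only numbers.Number subclasses."""
    return actual - expected


actual = datetime(2020, 1, 2, 12, 0)
expected = datetime(2020, 1, 1, 12, 0)

# datetime objects are not registered as numbers.Number...
print(isinstance(actual, Number))  # False

# ...but their difference is still a well-defined quantity.
delta = quantitative_difference(actual, expected)
print(delta)  # 1 day, 0:00:00
```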

0.9.5

5 years ago
  • Changed difference objects to make them hashable (can now be used as set members or as dict keys).
  • Added __slots__ to difference objects to reduce memory consumption.
  • Changed name of Selector class to Select (Selector now deprecated).
  • Changed language and class names from allowed and allowance to accepted and acceptance to bring datatest more in line with manufacturing and engineering terminology. The existing allowed API is now deprecated.
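
The hashability and __slots__ changes noted above follow a common Python pattern. The sketch below shows that pattern on a hypothetical stand-in class (MissingSketch is not datatest's Missing): defining __eq__ together with __hash__ makes instances usable in sets and as dict keys, and __slots__ removes the per-instance __dict__.

```python
class MissingSketch:
    """Hypothetical stand-in for a difference object (not datatest's class)."""

    __slots__ = ('value',)  # no per-instance __dict__, so less memory

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return type(other) is type(self) and other.value == self.value

    def __hash__(self):
        # Hash on the same fields that __eq__ compares.
        return hash((type(self), self.value))

    def __repr__(self):
        return f'MissingSketch({self.value!r})'


seen = {MissingSketch('A'), MissingSketch('A'), MissingSketch('B')}
counts = {MissingSketch('A'): 2}

print(len(seen))                                 # 2
print(counts[MissingSketch('A')])                # 2
print(hasattr(MissingSketch('A'), '__dict__'))   # False
```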

0.9.4

5 years ago
  • Added Python 3.8 testing and support.
  • Added new validate methods (moved from how-to recipes into core module):
    • Added approx() method to require approximate numeric equality.
    • Added fuzzy() method to require strings by approximate match.
    • Added interval() method to require elements within a given interval.
    • Added set(), subset(), and superset() methods for explicit membership checking.
    • Added unique() method to require unique elements.
    • Added order() method to require elements by relative order.
  • Changed default sequence validation to check elements by index position rather than checking by relative order.
  • Added fuzzy-matching allowance to allow strings by approximate match.
  • Added Predicate class to formalize behavior--also provides inverse-matching with the inversion operator (~).
  • Added new methods to Query class:
    • Added unwrap() to remove single-element containers and return their unwrapped contents.
    • Added starmap() to unpack grouped arguments when applying a function to elements.
  • Fixed improper use of assert statements by replacing them with appropriate conditional checks and error behavior.
  • Added requirement class hierarchy (using BaseRequirement). This gives users a cleaner way to implement custom validation behavior and makes the underlying codebase easier to maintain.
  • Changed name of ProxyGroup to RepeatingContainer.
  • Changed "How To" examples to use the new validation methods.
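
Inverse matching with the inversion operator (~), mentioned above for the Predicate class, relies on Python's __invert__ hook. The following is a minimal sketch of that mechanism using a hypothetical PredicateSketch class; it does not reproduce datatest's actual Predicate, which matches against more kinds of objects than plain callables.

```python
class PredicateSketch:
    """Hypothetical predicate wrapper that supports inverse matching."""

    def __init__(self, func, inverted=False):
        self._func = func
        self._inverted = inverted

    def __call__(self, value):
        result = bool(self._func(value))
        return not result if self._inverted else result

    def __invert__(self):
        # ~predicate returns a new predicate with the opposite outcome.
        return PredicateSketch(self._func, not self._inverted)


is_upper = PredicateSketch(str.isupper)
not_upper = ~is_upper

print(is_upper('ABC'))   # True
print(not_upper('ABC'))  # False
print(not_upper('abc'))  # True
```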

0.9.3

5 years ago
  • Changed bundled pytest plugin to version 0.1.3:
    • This update adds testing and support for latest versions of Pytest and Python (now tested using Pytest 3.3 to 4.1 and Python 2.7 to 3.7).
    • Changed handling for 'mandatory' marker to support older and newer Pytest versions.

0.9.2

5 years ago

Improved data handling features and support for Python 3.7:

  • Changed Query class:
    • Added flatten() method to serialize dictionary results.
    • Added to_csv() method to quickly save results as a CSV file.
    • Changed reduce() method to accept initializer_factory as an optional argument.
    • Changed filter() method to support predicate matching.
  • Added True and False as predicates to support "truth value testing" on arbitrary objects (to match on truthy or falsy).
  • Added ProxyGroup class for performing the same operations on groups of objects at the same time (a common need when testing against reference data).
  • Changed Selector class keyword filtering to support predicate matching.
  • Added handling to get_reader() to support datatest's Selector and Result objects.
  • Fixed get_reader() bug that prevented encoding-fallback recovery when reading from StringIO buffers in Python 2.
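
As a rough illustration of the initializer_factory argument added to reduce(), the sketch below reduces several groups with functools.reduce and calls a factory once per group, so that a mutable starting value is never shared. It assumes the factory is called with no arguments to produce the initial value; reduce_groups() and append() are hypothetical helpers, not part of the Query class.

```python
from functools import reduce


def reduce_groups(groups, function, initializer_factory=None):
    """Hypothetical helper: reduce each group's values separately."""
    result = {}
    for key, values in groups.items():
        if initializer_factory is None:
            result[key] = reduce(function, values)
        else:
            # A fresh initial value for every group.
            result[key] = reduce(function, values, initializer_factory())
    return result


def append(acc, item):
    acc.append(item)
    return acc


groups = {'x': [1, 2], 'y': [3]}
collected = reduce_groups(groups, append, initializer_factory=list)
print(collected)  # {'x': [1, 2], 'y': [3]}
```

Passing a single shared list as the initial value would instead accumulate every group's items into the same object, which is the problem a factory avoids.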

0.9.1

5 years ago
  • Added improved docstrings and other documentation.
  • Changed bundled pytest plugin to version 0.1.2:
    • Added handling for a mandatory marker to support incremental testing (stops session early when a mandatory test fails).
    • Added --ignore-mandatory option to continue tests even when a mandatory test fails.

0.9.0

6 years ago
  • Added bundled version of the pytest plugin to the base installation.
  • Added universal composability for all allowances (using UNION and INTERSECTION via "|" and "&" operators).
  • Added allowed factory class to simplify allowance imports.
  • Changed is_valid() to valid().
  • Changed ValidationError to display differences in sorted order.
  • Added Python 2 and 3 compatible get_reader() to quickly load csv.reader-like interface for Unicode CSV, MS Excel, pandas.DataFrame, DBF, etc.
  • Added formal order of operations for allowance resolution.
  • Added formal predicate object handling.
  • Added Sphinx-tabs style docs for clear separation of pytest and unittest style examples.
  • Changed DataSource to Selector, DataQuery to Query, and DataResult to Result.
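
Composability through the "|" and "&" operators, as described above for allowances, is typically built on Python's __or__ and __and__ hooks. The sketch below shows the idea with a hypothetical AllowanceSketch class that wraps a simple check function; datatest's real allowances operate on difference objects and follow a formal order of operations that is not modeled here.

```python
class AllowanceSketch:
    """Hypothetical allowance: wraps a check that accepts or rejects a value."""

    def __init__(self, check):
        self.check = check

    def __call__(self, diff):
        return bool(self.check(diff))

    def __or__(self, other):
        # UNION: allowed if either side allows it.
        return AllowanceSketch(lambda d: self.check(d) or other.check(d))

    def __and__(self, other):
        # INTERSECTION: allowed only if both sides allow it.
        return AllowanceSketch(lambda d: self.check(d) and other.check(d))


small = AllowanceSketch(lambda d: abs(d) <= 5)
positive = AllowanceSketch(lambda d: d > 0)

either = small | positive
both = small & positive

print(either(-3), either(100), either(-100))  # True True False
print(both(3), both(-3), both(100))           # True False False
```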