pyspi Versions

Comparative analysis of pairwise interactions in multivariate time series.

v1.0.3

1 month ago

SPI Reproducibility Fix

pyspi v1.0.3 is a patch update that addresses inconsistent results observed in several Information Theoretic and Convergent Cross-Mapping (ccm) SPIs when running multiple trials on the same benchmarking dataset. As of this update, all 284 SPIs should now produce identical results across multiple runs on the same dataset.

Key Changes:

  • Upgraded the jidt dependency from v1.5 to v1.6.1. jidt v1.6.1 adds a NOISE_SEED property to all jidt calculators, enabling consistent results across multiple runs; see the jidt v1.6.1 release notes for details. Since jidt is self-contained within the pyspi package, upgrading the jidt version should not introduce any breaking changes for users who have already installed pyspi.
  • Added random seed support within pyspi for jidt-based SPIs. All SPIs that rely on the jidt library now utilise a fixed random seed to ensure reproducibility across runs.
  • Introduced random seed for Convergent Cross-Mapping (ccm) SPI. The ccm SPI now uses a fixed random seed, addressing the previously observed stochasticity in its outputs.
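The effect of fixing a seed can be illustrated with a toy stochastic estimator (a stdlib sketch, not pyspi's actual seeding code; `noisy_mean` is a hypothetical name): two runs with the same seed produce identical estimates, mirroring how the seeded jidt and ccm SPIs now behave across trials.

```python
import random

def noisy_mean(values, seed=None):
    """Toy stochastic estimator: a bootstrap-style mean with optional seeding."""
    rng = random.Random(seed)  # local RNG so the seed does not leak globally
    resample = [rng.choice(values) for _ in values]
    return sum(resample) / len(resample)

data = [1.0, 2.0, 3.0, 4.0]
# Unseeded runs can disagree across trials; seeded runs are reproducible.
assert noisy_mean(data, seed=42) == noisy_mean(data, seed=42)
```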

Important Note to Users: The addition of fixed random seeds for the affected SPIs may result in slightly different output values compared to previous versions of pyspi. This difference is an expected consequence of enforcing consistent, reproducible SPI outputs. Please keep this in mind if making exact numerical comparisons with previous versions of pyspi.

Affected SPIs: The following SPIs, which previously produced varying outputs across multiple trials, should now yield consistent results:

  • ccm (all 9 estimators)
  • cce_kozachenko
  • ce_kozachenko
  • di_kozachenko
  • je_kozachenko
  • si_kozachenko_k-1

v1.0.2

1 month ago

New SPI - Gromov-Wasserstein Distance (GWτ)

This patch update introduces a new distance-based SPI, GWτ (called gwtau in pyspi). An in-depth tutorial for incorporating new SPIs into the existing pyspi framework, using gwtau as a prototypical example, is now available in the documentation.

What is it?

Based on the algorithm proposed by Kravtsova et al. (2023), GWτ is a new distance measure for comparing time series data, especially suited for biological applications. It works by representing each time series as a metric space and computing the distances from the start of each time series to every point. These distance distributions are then compared using the Wasserstein distance, which finds the optimal way to match the distances between two time series, making it robust to shifts and perturbations. The "tau" in GWτ emphasises that this distance measure is based on comparing the distributions of distances from the root (i.e., the starting point) to all other points in each time series, which is analogous to comparing the branch lengths in two tree-like structures. GWτ can be computed efficiently and is scalable.
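The core idea can be sketched in a few lines of plain Python (an illustrative toy, not the actual pyspi implementation, which follows Kravtsova et al. (2023)): represent each series by its distances from the root, then compare the sorted distance distributions. For equal-length samples, the 1D Wasserstein-1 distance reduces to the mean absolute difference of sorted values.

```python
def gw_tau_sketch(x, y):
    """Toy GW-tau: compare root-to-point distance distributions.

    Assumes x and y are equal-length numeric sequences; this is a
    simplified illustration, not the pyspi `gwtau` code.
    """
    # Distances from the root (the first observation) to every point.
    dx = sorted(abs(xi - x[0]) for xi in x)
    dy = sorted(abs(yi - y[0]) for yi in y)
    # For equal-length samples, the 1D Wasserstein-1 distance is the
    # mean absolute difference between sorted values.
    return sum(abs(a - b) for a, b in zip(dx, dy)) / len(dx)

# Robust to constant shifts: shifting a series leaves its root distances unchanged.
print(gw_tau_sketch([0, 1, 2, 4], [10, 11, 12, 14]))  # → 0.0
```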

How can I use it?

Currently, both the default (subset = all) SPI set and the fast (subset = fast) subset include gwtau, so no extra configuration is needed unless you would like to compute gwtau in isolation. Simply instantiate the calculator object and compute SPIs as usual. You can access the matrix of pairwise interactions for gwtau using its identifier in the results table:

from pyspi.calculator import Calculator

calc = Calculator(dataset=...)
calc.compute()
gwtau_results = calc.table['gwtau']

For technical details about the specific implementation of gwtau, such as theoretical properties of this distance measure, see the original paper by Kravtsova et al. (2023). You can also find the original implementation of the algorithm in MATLAB in this GitHub repository.

v1.0.1

2 months ago

Bug Fixes

File location handling improvement for the filter_spis function:

  • Modified the filter_spis function to allow the user to specify the exact location of the source config YAML file.
  • Implemented a default file mechanism where, if no file is specified by the user, the function defaults to using the pre-defined config.yaml file located in the script's directory as the source file.
  • Updated unit tests to reflect the changes.
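The default-file mechanism can be sketched as follows (a hypothetical helper for illustration; `resolve_configfile` is not part of the pyspi API): if the user supplies no path, the function falls back to the config.yaml sitting alongside the script.

```python
from pathlib import Path

def resolve_configfile(configfile=None):
    """Hypothetical sketch of the fallback logic: default to the
    config.yaml located in this script's directory when no source
    config file is specified by the user."""
    if configfile is None:
        return Path(__file__).parent / "config.yaml"
    return Path(configfile)
```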

v1.0.0

2 months ago

Introduction to pyspi v1.0.0

This major release (1.0.0) brings several updates to pyspi including optional dependency checks and the ability to filter SPIs based on keywords.

Highlights of this release

  • SPI Filtering: A new filter_spis function has been added to the pyspi.utils module. This function allows users to create subsets of SPIs based on keywords (e.g., "linear", "non-linear"). It takes three arguments:
    • keywords: a list of one or more labels to filter the SPIs, e.g., ["linear", "signed"].
    • output_name: the name of the output YAML file, defaulting to {random_string}_config.yaml if no name is provided as an argument.
    • configfile: the path to the source config file. If no configfile is provided, defaults to using config.yaml in the pyspi directory.

Example usage:

from pyspi.utils import filter_spis

# using the default config.yaml as the source file
filter_spis(keywords=["linear", "signed"], output_name="linear_signed")  # saves `linear_signed.yaml` in the cwd

# or using a user-specified configfile as the source file
filter_spis(keywords=["linear", "signed"], output_name="linear_signed", configfile="myconfig.yaml")

A new yaml file is saved in the current working directory with the filtered subset of SPIs. This filtered config file can be loaded into the Calculator object using the configfile argument as would be the case for a typical custom YAML file (see the docs for more info):

calc = Calculator(configfile="./linear_signed.yaml")
  • Optional Dependency Checks: When instantiating a Calculator object, pyspi now automatically checks for optional dependencies (Java and Octave). If any are missing, the user is notified of which SPIs will be excluded and which missing dependency is responsible, and can then choose to proceed with a reduced set of SPIs or install the missing dependencies.
  • Restructured SPI Config File: The SPI configuration YAML file has been restructured to include the following keys for each base SPI:
    • labels: base SPI specific labels (e.g., linear, non-linear, signed, etc.) that can be used by the filter function to create user-specified subsets of SPIs.
    • dependencies: external/system dependencies required by the base SPI (e.g., Octave for integrated information SPIs).
    • configs: estimator settings and configurations, e.g., EmpiricalCovariance for the Covariance base SPI.

Example YAML: Here is an example of how the phi_star_t1_norm-0 SPI would be specified:

IntegratedInformation:
    labels:
      - undirected
      - nonlinear
      - unsigned
      - bivariate
      - time-dependent
    dependencies: 
      - octave
    configs:
      - phitype: "star"

Breaking Changes

This major version release introduces breaking changes for users who rely on custom SPI subsets (i.e., custom YAML files). Users relying on the pyspi default and pre-defined subsets are unaffected by these changes.

  • The octaveless subset has been removed, as it is no longer necessary due to the automatic dependency checks. Users without Octave installed can now run pyspi without specifying octaveless as a subset in the Calculator object.
  • Users who want to define a custom subset of SPIs should follow the new guide in the documentation to ensure their custom YAML file conforms to the new structure with labels, dependencies, and configs as keys.

Migration Guide

If you are an existing user of pyspi and have custom SPI subsets (custom YAML files), follow these steps to migrate to the new version:

  1. Review the updated structure of the SPI configuration YAML file (see the above example), which now includes labels, dependencies, and configs keys for each base SPI.
  2. Update your custom YAML files to match the new structure.
  3. If you were previously using the octaveless subset, you no longer need to specify it when instantiating the Calculator object. The dependency checks will automatically exclude Octave-dependent SPIs if Octave is not installed.

For more detailed instructions and examples, refer to the updated documentation.

Documentation

The pyspi documentation has been updated to reflect the new features and changes introduced in this release. You can find the latest documentation here.

Testing

  • Added unit tests for the new filter_spis function.
  • Added unit tests for the CalculatorFrame and CorrelationFrame.
  • Updated the GitHub Actions workflow file to use the latest checkout and Python setup actions.

v0.4.2

3 months ago

Introduction

This patch release brings a few minor updates including a new high contrast logo for dark mode users, improved SPI unit testing (with a new benchmarking dataset) and fixes for potential security vulnerability issues.

Highlights of this release

  • New high contrast logo for dark-mode users.
  • Improved SPI unit testing with z-scoring approach to flag SPIs with differing outputs.
  • New coupled map lattice (CML) benchmarking dataset.
  • Fix for potential security vulnerability issues in scikit-learn.

What's Changed

  • Replaced the old standard_normal.npy benchmarking dataset with a coupled map lattice (cml7.npy), along with its associated .pkl file containing the benchmark values (CML7_benchmark_tables.pkl) generated in a fresh Ubuntu environment.
  • Updated the README to automatically select either the regular or new dark mode logo based on the user's theme.
  • Added new conftest.py file for pytest to customise the unit testing outputs.
  • Added a new pyproject.toml file for configuring the package for publishing to PyPI.

New features

  • Improved SPI unit testing with a new coupled map lattice benchmarking dataset (cml7.npy) consisting of 7 processes and 100 observations per process.
  • Z-scoring approach in unit testing pipeline to flag potential changes in SPI outputs as a result of algorithmic changes, etc. SPIs with outputs differing by more than a specified threshold are "flagged" and summarised in a table.
  • Added a darkmode pyspi logo to the README which is shown for users with the dark-mode GitHub theme.
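The z-scoring approach can be sketched as follows (a simplified illustration, not pyspi's actual test code; the names `flag_spis`, `benchmark`, and `threshold` are assumptions): each SPI's newly computed output is standardised against its benchmark mean and standard deviation, and any SPI whose |z| exceeds the threshold is flagged for review.

```python
def flag_spis(benchmark, new_values, threshold=3.0):
    """Flag SPIs whose new output deviates from the benchmark distribution.

    benchmark maps SPI name -> (mean, std) of benchmark outputs;
    new_values maps SPI name -> newly computed output on the same dataset.
    """
    flagged = []
    for spi, (mean, std) in benchmark.items():
        z = (new_values[spi] - mean) / std  # standardise against the benchmark
        if abs(z) > threshold:
            flagged.append((spi, round(z, 2)))
    return flagged

benchmark = {"cov_EmpiricalCovariance": (0.50, 0.01), "ccm": (0.20, 0.05)}
new_values = {"cov_EmpiricalCovariance": 0.51, "ccm": 0.90}
print(flag_spis(benchmark, new_values))  # only ccm exceeds the threshold
```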

Bug Fixes

  • Fixed a scikit-learn security vulnerability issue with severity "high" (pertaining to denial of service) by upgrading scikit-learn from version 0.24.1 to version 1.0.1.
  • Fixed Int64 deprecation issue (cannot import name Int64Index from pandas) by fixing pandas to version 1.5.0.
  • Fixed unknown character issue for Windows users resulting from not specifying an encoding when loading the "README" in setup.py. Now fixed to utf-8 for consistency across platforms.

v0.4.1

3 months ago

Introduction

pyspi v0.4.1 introduces several minor changes to the existing README, and migrates the documentation from readthedocs to an all-new GitBook page. Simple unit testing has also been incorporated for each of the SPIs, using a benchmarking dataset to check the consistency of outputs.

What's Changed

  • Removal of old /docs directory
  • Addition of a /tests directory for unit testing
  • Updated README
  • Addition of CODE_OF_CONDUCT.md and SECURITY.md

New features

  • Basic unit testing incorporated into a GitHub Actions workflow.
  • Updated README file with links to the new GitBook-hosted documentation, replacing the old readthedocs documentation.
  • Added a code of conduct markdown
  • Added a security policy markdown

Bug Fixes

  • Fixed a PyTorch security vulnerability issue with severity "critical" (pertaining to arbitrary code execution) by updating torch from version 1.10.0 to 1.13.1.

v0.4

1 year ago
  • The directed info measure now uses entropy rate in its calculation to more closely resemble the streaming method described in the literature.
  • The code now (mostly) uses black formatting for readability

pynats-v0.1

1 year ago

This release is the version that was used for computing the results in the paper.

v0.3

2 years ago
  • Included fast compute option for the calculator (~10x speedup for bivariate time series)
  • Minor bug fixes

v0.2.0

2 years ago