Disent Versions

🧶 Modular VAE disentanglement framework for python built with PyTorch Lightning ▸ Including metrics and datasets ▸ With strongly supervised, weakly supervised and unsupervised methods ▸ Easily configured and run with Hydra config ▸ Inspired by disentanglement_lib

v0.7.2

1 year ago

What's Changed

Full Changelog: https://github.com/nmichlo/disent/compare/v0.7.1...v0.7.2

v0.7.1

1 year ago

What's Changed

Full Changelog: https://github.com/nmichlo/disent/compare/v0.7.0...v0.7.1

v0.8.0

1 year ago

What's Changed

Full Changelog: https://github.com/nmichlo/disent/compare/v0.7.2...v0.8.0

v0.7.0

1 year ago

What's Changed

Full Changelog: https://github.com/nmichlo/disent/compare/v0.6.3...v0.7.0

v0.6.3

1 year ago

What's Changed

New Contributors

Full Changelog: https://github.com/nmichlo/disent/compare/v0.6.2...v0.6.3

v0.6.2

1 year ago

Fixes

  • Fix examples that use num_workers != 0. When using DataLoaders with multiple workers, these need to be run from within if __name__ == '__main__': ...
  • Fix torch_optimizer>=0.1.0,!=0.2 in requirements.txt
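
The num_workers fix above can be illustrated with a minimal sketch. It assumes only the standard PyTorch DataLoader API; the dataset and the function names make_loader and main are placeholders for illustration and are not taken from the disent examples.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_loader(num_workers: int) -> DataLoader:
    # Tiny stand-in dataset: 8 samples with 3 features each.
    data = torch.arange(24, dtype=torch.float32).reshape(8, 3)
    return DataLoader(TensorDataset(data), batch_size=4, num_workers=num_workers)


def main(num_workers: int = 2) -> int:
    # Iterating the loader is what starts the worker processes when num_workers > 0.
    return sum(batch[0].shape[0] for batch in make_loader(num_workers))


if __name__ == '__main__':
    # On platforms that spawn workers (e.g. Windows and macOS), every worker
    # re-imports this module. The guard stops that import from re-running the loop.
    print(main())
```

Keeping all loader iteration inside functions that are only called under the guard means the module can be imported safely by worker processes.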

v0.6.1

1 year ago

Fixes

  • fix circular import in disent.frameworks.vae and disent.frameworks.ae
  • fix requirements.txt, limit: pytorch-lightning>=1.4.0,<1.7

Tests

  • Add test for circular import

Additions

  • add dfc pairwise loss to dfc loss module

v0.6.0

1 year ago

Fixes

  • MPI3D was not correctly loaded, first few factors were misaligned
    • Recomputed statistics for new datasets and updated configs

Additions

  • Added disent.data.data.DataFileSprites, a custom version of sprites
    • Added experiment configs and computed dataset statistics
  • Multiple versions of disent.dataset.data.Mpi3dData now exist for different use cases, because the dataset is so large
    • added Mpi3dHdf5Data -- converts the files to hdf5 to stream from disk, but very slow to load into memory directly
    • added Mpi3dNumpyData -- loads the files directly into memory (quick), cannot read from disk
    • changed: Mpi3dData is now a wrapper around both of the above, and the mode can be specified with in_memory
  • disent.dataset.util.state_space.StateSpace
    • Added init checks
    • Added helper method invert_factor_idxs that returns the unspecified factor indices, or the inverse set.
    • Added helper method sample_indices that samples valid indices in the range of the dataset.
    • Improved sampling and other methods that take in factors to first call normalise_factor_idxs so that we can use factor names in these functions instead.
    • Added helper method sample_random_factor_traversal_grid that samples a grid of traversals, one for each ground-truth factor.
  • Added disent.util.inout.paths.modify_ext(...) that modifies the extension of a path
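
The StateSpace helpers described above (accepting factor names as well as indices, and returning the unspecified factors) can be sketched roughly as follows. This is a standalone illustration, not disent's implementation: the free functions, their signatures and the factor names are assumptions, and the real methods live on StateSpace.

```python
from typing import Sequence, Tuple, Union

FactorRef = Union[int, str]


def normalise_factor_idxs(factors: Sequence[FactorRef], factor_names: Sequence[str]) -> Tuple[int, ...]:
    """Map a mix of factor names and integer indices to a tuple of indices."""
    idxs = []
    for f in factors:
        # Names are looked up by position; integers are used as-is.
        i = factor_names.index(f) if isinstance(f, str) else int(f)
        if not 0 <= i < len(factor_names):
            raise IndexError(f"factor index out of range: {f!r}")
        idxs.append(i)
    if len(set(idxs)) != len(idxs):
        raise ValueError("duplicate factors given")
    return tuple(idxs)


def invert_factor_idxs(factors: Sequence[FactorRef], factor_names: Sequence[str]) -> Tuple[int, ...]:
    """Return the indices of all factors that were NOT specified."""
    chosen = set(normalise_factor_idxs(factors, factor_names))
    return tuple(i for i in range(len(factor_names)) if i not in chosen)


names = ("shape", "scale", "x", "y")  # hypothetical ground-truth factor names
print(invert_factor_idxs(["scale", 3], names))  # (0, 2)
```

Normalising first is what lets every downstream helper accept names and indices interchangeably.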

Breaking changes

  • move disent.dataset.util.npz to disent.dataset.util.formats.npz
  • move disent.dataset.util.hdf5 to disent.dataset.util.formats.hdf5
  • disent.util.inout.hashing.hash_file now has missing_ok=False by default
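
The effect of the new missing_ok=False default can be sketched with a stand-in function. Everything here is an assumption for illustration: hash_file_sketch is hypothetical, it uses SHA-256 from the standard library, and it assumes that a missing file yields None when missing_ok=True. The real hash_file signature and return values may differ.

```python
import hashlib
import os
from typing import Optional


def hash_file_sketch(path: str, missing_ok: bool = False) -> Optional[str]:
    """Return the SHA-256 hex digest of a file (hypothetical stand-in, not disent's hash_file)."""
    if not os.path.isfile(path):
        if missing_ok:
            # Assumed behaviour: tolerate the missing file and signal it with None.
            return None
        raise FileNotFoundError(f"cannot hash missing file: {path}")
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large dataset files are never loaded whole.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Under this assumption, callers that relied on missing files being tolerated would now need to pass missing_ok=True explicitly.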

Minor Fixes

  • stalefile now correctly handles missing files
  • Various plotting fixes; functions now support RGBA images, not just grey or RGB images.

New Tests

  • Added some new tests for both dataset formats and state spaces

TODO:

  • Added Teapots3dData, but it is not complete. It needs to be converted to a "random" dataset, because it does not have valid ground-truth factors in the form of a state space; its factors are randomly sampled instead.

v0.5.1

1 year ago

Fixes

  • Fix #32 Ada-GVAE averaging regressions

v0.5.0

2 years ago

This release marks the end of my MSc. and splitting the research out into its own repository!

  • The repo was previously set up such that development took place on an xdev branch. An automated script was then used to clean this branch of research code and commit the changes to the dev branch, which was then published.
  • This has now been disabled in favour of standard dev practice. I no longer need to maintain the old research code and can incorporate this functionality directly into disent.

MSc. Additions

  • disent.dataset.data - Various new datasets!

    • XYObjectData and XYObjectShadedData are equivalent datasets with different representations of their ground-truth factors. Disentanglement performance is affected by the choice of ground-truth factors even if the data is exactly the same!
    • XYSquaresData is an adversarial dataset for VAEs that use pixel-wise reconstruction losses. VAEs usually perform terribly on this dataset in terms of disentanglement performance. This dataset contains three squares that can move across a non-overlapping grid.
    • XYSingleSquareData is like XYSquaresData but only has a single square that can move across the image.
    • XColumnsData is a simplistic version of XYSquaresData that is still adversarial, but only moves columns left and right instead of an object across a grid.
  • disent.frameworks.vae

    • AdaNegTripletVae aka. "ada_tvae": Supervised disentanglement framework that uses our proposed Adaptive Triplet Loss to disentangle representations and introduce axis-alignment. Triplets are constructed using the L1 distance between ground-truth factors.
    • DataOverlapTripletVae aka. "ada_tvae_d": Unsupervised version of the AdaNegTripletVae that orders triplets using the distances between datapoints in terms of the reconstruction loss. Distances within disentanglement datasets often correspond to the distances between ground-truth factors, suggesting disentanglement is accidental!
  • disent.frameworks.ae

    • AdaNegTripletAe aka. "ada_tae" - The AE version of AdaNegTripletVae
    • DataOverlapTripletAe aka. "ada_tae_d" - The AE version of DataOverlapTripletVae
    • AdaAe - The AE version of the AdaVae
  • disent.metrics

    • flatness_components consists of three separate metrics
      • distances: measure the rank correlation between ground-truth distances and latent distances
      • linearity: measure how well factor traversal embeddings lie on an arbitrarily rotated n-dimensional line
      • axis-alignment: measure how well factor traversal embeddings correspond to a single latent variable (ie. an n-dimensional line that is axis-aligned).
    • flatness: an older metric that measures the path length of factor traversal embeddings over the max distance between points.
  • experiment/configs updated to include configs for all the added classes, frameworks, datasets, metrics and features!

    • new schedules schedule/adanegtvae_*.yaml that should be used with the Adaptive Triplet frameworks. Otherwise these frameworks do not learn.
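
The supervised triplet construction described above for AdaNegTripletVae (ordering by the L1 distance between ground-truth factors) can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework's code: l1_distance and order_triplet are hypothetical helper names, and the factor tuples are made up.

```python
from typing import Sequence, Tuple

Factors = Sequence[int]


def l1_distance(a: Factors, b: Factors) -> int:
    """Sum of absolute differences between two ground-truth factor vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def order_triplet(anchor: Factors, first: Factors, second: Factors) -> Tuple[Factors, Factors, Factors]:
    """Return (anchor, positive, negative), where the positive is closer to the anchor in L1."""
    if l1_distance(anchor, first) <= l1_distance(anchor, second):
        return anchor, first, second
    return anchor, second, first


# The second candidate differs from the anchor by 1, the first by 3, so they are swapped.
anchor, pos, neg = order_triplet((0, 2, 5), (3, 2, 5), (0, 3, 5))
print(pos, neg)  # (0, 3, 5) (3, 2, 5)
```

According to the notes above, the unsupervised DataOverlapTripletVae uses the same ordering idea but measures distance between datapoints with the reconstruction loss instead of between ground-truth factors.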

MSc. Removals

  • All the remaining research code contained in research/* has been deleted

Add Examples

  • Added an example, docs/examples/extend_experiment, of how to override or extend the disent experiment configs! This is useful for creating your own research!
  • Added an example, docs/examples/plotting_examples, of plotting various aspects of disent.

Fixes

  • Fixed tests for new locations
  • Added appropriate entries to the registry