pySCENIC Versions

pySCENIC is a lightning-fast Python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering), which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.

0.12.1

1 year ago

Updates:

  • Add support for running arboreto_with_multiprocessing.py with spawn instead of fork as the multiprocessing start method.
  • Use ravel instead of flatten to avoid unnecessary memory copy in aucell
  • Update the Docker image file and add a separate Dockerfile for pySCENIC with scanpy.
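
The ravel/flatten change above relies on a general NumPy behaviour, which the following minimal sketch illustrates. The array here is a made-up stand-in and is not taken from the pySCENIC code base: flatten always allocates a new array, while ravel returns a view whenever the underlying data is contiguous.

```python
import numpy as np

# A contiguous 2-D array standing in for a matrix of per-cell values
# (illustrative data only, not pySCENIC's actual input).
matrix = np.arange(12, dtype=np.float64).reshape(3, 4)

flat_copy = matrix.flatten()  # always allocates a new array
flat_view = matrix.ravel()    # returns a view when the data is contiguous

# np.shares_memory reports whether two arrays use the same buffer.
copy_shares = np.shares_memory(matrix, flat_copy)
view_shares = np.shares_memory(matrix, flat_view)
print(copy_shares, view_shares)
```

For non-contiguous input, ravel may still have to copy, so the saving depends on the memory layout of the array being flattened.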

0.12.0

1 year ago

Updates:

  • Only databases in Feather v2 format are supported now (ctxcore >= 0.2), which allows the use of recent versions of pyarrow (>=8.0.0) instead of very old ones (<0.17). Databases in the new format can be downloaded from https://resources.aertslab.org/cistarget/databases/ and end with *.genes_vs_motifs.rankings.feather or *.genes_vs_tracks.rankings.feather.
  • Support clustered motif databases.
  • Use custom multiprocessing instead of dask, by default.
  • The Docker image uses Python 3.10 and contains only the pySCENIC dependencies needed for CLI usage.
  • Remove unneeded scripts and notebooks for unused/deprecated database formats.
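
As a quick sanity check before running the pipeline, one could screen database file names against the two suffixes listed above. This is only a sketch: the helper name and the example file names are hypothetical, and a matching name does not prove that the file contents are actually in Feather v2 format.

```python
# Suffixes of the new-format ranking databases, as listed in the release notes.
SUPPORTED_SUFFIXES = (
    ".genes_vs_motifs.rankings.feather",
    ".genes_vs_tracks.rankings.feather",
)


def looks_like_v2_ranking_db(filename: str) -> bool:
    """Return True if the file name ends with one of the new-format suffixes.

    Hypothetical helper for illustration; it inspects the name only,
    not the file contents.
    """
    return filename.endswith(SUPPORTED_SUFFIXES)


# Hypothetical file names, used only to exercise the helper.
new_style = looks_like_v2_ranking_db("example.genes_vs_motifs.rankings.feather")
old_style = looks_like_v2_ranking_db("example.old_format.feather")
print(new_style, old_style)
```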

0.11.2

3 years ago

Major changes:

  • Split some core cisTarget functions out into a separate repository, ctxcore. This is now a required package for pySCENIC.
  • Documentation updates

0.11.1

3 years ago

  • Fix bug in motif url construction (#275)
  • Fix for export2loom with sparse dataframe (#278)
  • Fix sklearn t-SNE import (#285)
  • Updates to Docker image (expose port 8787 for Dask dashboard)

0.11.0

3 years ago

Major features:

  • Updated Arboreto release (GRN inference step) includes:

    • Support for sparse matrices (using the --sparse flag in pyscenic grn, or passing a sparse matrix to grnboost2/genie3).
    • Fixes to avoid dask metadata mismatch error
  • Updated cisTarget:

    • Fix for metadata mismatch in ctx prune2df step
    • Support for databases in Apache Parquet format
    • Faster loading from feather databases
    • Bugfix: loading genes from a database (previously missing the last gene name in the database)
  • Support for Anndata input and output

  • Package updates:

    • Upgrade to newer pandas version
    • Upgrade to newer numba version
    • Upgrade to newer versions of dask, distributed
  • Input checks and more descriptive error messages.

    • Check that regulons loaded are not empty.
  • Bugfixes:

    • In the regulons output from the cisTarget step, the gene weights were incorrectly assigned to their respective target genes (PR #254).
    • Motif url construction fixed when running ctx without pruning
    • Compression of intermediate files in the CLI steps
    • Handle loom files with non-standard gene/cell attribute names
    • Reformat the genesig gmt input/output
    • Fix AUCell output to loom with non-standard loom attributes

0.10.4

3 years ago

Updates:

  • Added a new, optional CLI command that adds correlation information to the GRN adjacencies file. It is invoked as pyscenic add_cor. (vib-singlecell-nf/vsn-pipelines/issues/254)
    • The correlation calculation is subsequently skipped if using this adjacencies + correlations file as the input into pyscenic ctx.
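
To make the idea concrete, the sketch below attaches a correlation value to each TF-target pair of a toy adjacency list. It is not the pySCENIC implementation: the function name, the toy data, and the choice of Pearson correlation are assumptions made for illustration.

```python
import numpy as np

# Toy expression matrix: rows are cells, columns are genes (made-up values).
genes = ["TF1", "geneA", "geneB"]
expression = np.array([
    [1.0, 2.0, 9.0],
    [2.0, 4.0, 7.0],
    [3.0, 6.0, 5.0],
    [4.0, 8.0, 3.0],
])

# Toy adjacencies as (TF, target, importance) tuples.
adjacencies = [("TF1", "geneA", 5.2), ("TF1", "geneB", 3.1)]


def add_correlation(adjacencies, expression, genes):
    """Append the Pearson correlation of each TF-target pair (hypothetical helper)."""
    column = {gene: i for i, gene in enumerate(genes)}
    rows = []
    for tf, target, importance in adjacencies:
        rho = np.corrcoef(expression[:, column[tf]],
                          expression[:, column[target]])[0, 1]
        rows.append((tf, target, importance, float(rho)))
    return rows


with_cor = add_correlation(adjacencies, expression, genes)
for row in with_cor:
    print(row)
```

Storing the correlation alongside each adjacency is what allows a later step to reuse it instead of recomputing it from the expression matrix.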

0.10.3

3 years ago

Updates:

  • Fix bug in motif url construction (#158)
  • Integrate arboreto multiprocessing script into pySCENIC CLI
  • cisTarget step: Check for modules with zero db overlap and skip them (#158, #177, #132, #85)
  • Bugfix in TF-gene correlation calculation. Quit with error if there is a mismatch between the genes present in the GRN and the expression matrix (#103, #149)
  • Error message when regulons file is empty (#133)
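
The gene-mismatch check described above can be pictured as a simple set comparison. The following is a minimal sketch under that assumption; the function name and error wording are hypothetical and do not reproduce pySCENIC's own code or messages.

```python
def check_grn_genes(grn_genes, expression_genes):
    """Raise if any gene in the GRN is absent from the expression matrix.

    Hypothetical helper for illustration only.
    """
    missing = sorted(set(grn_genes) - set(expression_genes))
    if missing:
        raise ValueError(
            f"{len(missing)} GRN gene(s) not found in the expression matrix, "
            f"for example: {missing[:5]}"
        )


# Matching gene sets pass silently.
check_grn_genes(["TF1", "geneA"], ["TF1", "geneA", "geneB"])

# A mismatch stops the run with a descriptive error.
try:
    check_grn_genes(["TF1", "geneX"], ["TF1", "geneA"])
    mismatch_detected = False
except ValueError as err:
    mismatch_detected = True
    print(err)
```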

0.10.2

3 years ago

Updates:

  • Bugfix for CLI grn step

0.10.1

3 years ago

Updates:

  • CLI: file compression (optionally) enabled for intermediate files for the major steps: grn (adjacencies matrix), ctx (regulons), and aucell (auc matrix). Compression is used when the file name argument has a .gz ending.
  • Restrict packages (pyarrow, pandas) for compatibility.
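
Choosing compression from the file name, as described for the .gz ending above, can be sketched with the standard library alone. The helper names and file names below are hypothetical, and this is not how the pySCENIC CLI is necessarily implemented.

```python
import gzip
import os
import tempfile


def open_for_writing(path):
    """Open a text file for writing, gzip-compressed if the name ends in .gz."""
    if path.endswith(".gz"):
        return gzip.open(path, "wt", encoding="utf-8")
    return open(path, "w", encoding="utf-8")


def is_gzip_file(path):
    """Check for the two-byte gzip magic number at the start of the file."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"


with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "adjacencies.tsv")
    packed = os.path.join(tmp, "adjacencies.tsv.gz")
    for path in (plain, packed):
        with open_for_writing(path) as handle:
            handle.write("TF\ttarget\timportance\n")
    plain_is_gz = is_gzip_file(plain)
    packed_is_gz = is_gzip_file(packed)

print(plain_is_gz, packed_is_gz)
```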

0.10.0

4 years ago

Updates:

  • Added a helper script scripts/arboreto_with_multiprocessing.py that runs the Arboreto GRN algorithms (GRNBoost2, GENIE3) without Dask for compatibility.
  • Initial support for the use of sparse expression matrices (currently applies only to the GRN step in the CLI). Using sparse matrices with the GRN step requires a patch to Arboreto (tmoerman/arboreto#20).
  • AUCell uses a random sampling of the expression matrix to break ties in the ranking step. The CLI parameter --seed, or the seed parameter of the aucell function, sets a fixed seed for this step. The regulon thresholds also depend on random sampling in the binarization step (bimodality test: np.random.uniform), and the seed parameters apply here as well.
  • Fixed typo in regulon threshold labeling
  • Security patch: Bump bleach from 3.1.0 to 3.1.1.
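
The effect of a fixed seed on random tie-breaking can be shown with a small sketch. This is not the AUCell implementation; the function name and the shuffle-then-stable-sort approach are assumptions chosen to show why the same seed gives the same ranking.

```python
import numpy as np


def rank_with_random_ties(values, seed=None):
    """Rank values in descending order, breaking ties at random.

    Hypothetical helper: positions are shuffled first, then stable-sorted,
    so tied values keep their shuffled (random) relative order.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    shuffled = rng.permutation(len(values))
    order = shuffled[np.argsort(-values[shuffled], kind="stable")]
    ranks = np.empty(len(values), dtype=int)
    ranks[order] = np.arange(len(values))
    return ranks


expression = [5.0, 1.0, 5.0, 1.0, 3.0]  # made-up values with two tied pairs

first = rank_with_random_ties(expression, seed=42)
second = rank_with_random_ties(expression, seed=42)
print(first, second)
```

With the same seed the two rankings are identical; without a seed, the order within each tied pair may differ from run to run.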