PySCENIC Versions

pySCENIC is a lightning-fast Python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering), which enables biologists to infer transcription factors, gene regulatory networks, and cell types from single-cell RNA-seq data.

0.12.1

1 year ago

Updates:

Add support for running arboreto_with_multiprocessing.py with spawn instead of fork as the multiprocessing start method for Pool.
Use ravel instead of flatten to avoid an unnecessary memory copy in aucell.
Update the Docker image file and add a separate Dockerfile for pySCENIC with scanpy.
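
The ravel/flatten change above relies on general NumPy behaviour, which the following minimal sketch illustrates. The array is a made-up stand-in, not pySCENIC's actual rankings data, and the snippet does not reproduce the aucell code itself.

```python
import numpy as np

# Hypothetical stand-in for a small rankings matrix (not real pySCENIC data).
rankings = np.arange(12, dtype=np.int32).reshape(3, 4)

# flatten() always allocates a new array and copies the data.
flat_copy = rankings.flatten()

# ravel() returns a view of the same buffer when the array is contiguous,
# and only falls back to a copy when a view is not possible.
flat_view = rankings.ravel()

copy_shares = bool(np.shares_memory(rankings, flat_copy))
view_shares = bool(np.shares_memory(rankings, flat_view))
print("flatten shares memory:", copy_shares)
print("ravel shares memory:", view_shares)
```

Because ravel may return a view, writing to its result can modify the original array, so it is only a drop-in replacement where the flattened data is read and not mutated.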

0.12.0

1 year ago

Updates:

Only databases in Feather v2 format are supported now (ctxcore >= 0.2), which allows the use of recent versions of pyarrow (>=8.0.0) instead of very old ones (<0.17). Databases in the new format can be downloaded from https://resources.aertslab.org/cistarget/databases/ and end with *.genes_vs_motifs.rankings.feather or *.genes_vs_tracks.rankings.feather.
Support clustered motif databases.
Use custom multiprocessing instead of dask, by default.
Docker image uses Python 3.10 and contains only the pySCENIC dependencies needed for CLI usage.
Remove unneeded scripts and notebooks for unused/deprecated database formats.
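
As a quick sanity check before running the cisTarget step, the file-name convention for the new databases can be tested with a small helper. This is only a sketch: the function name and the example file names are hypothetical, and the check looks at the name only, so it does not verify that a file is really stored as Feather v2.

```python
# Suffixes listed in the 0.12.0 notes for databases in the new format.
SUPPORTED_SUFFIXES = (
    ".genes_vs_motifs.rankings.feather",
    ".genes_vs_tracks.rankings.feather",
)


def looks_like_new_rankings_db(filename: str) -> bool:
    """Return True if the file name follows the new naming convention.

    Hypothetical helper: it inspects the name only, not the file contents.
    """
    return filename.endswith(SUPPORTED_SUFFIXES)


# Made-up file names, used purely for illustration.
candidates = [
    "example_db.genes_vs_motifs.rankings.feather",
    "example_db.genes_vs_tracks.rankings.feather",
    "example_db.mc9nr.feather",
]
for name in candidates:
    print(name, "->", looks_like_new_rankings_db(name))
```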

0.11.2

3 years ago

Major changes:

Split some core cisTarget functions out into a separate repository, ctxcore. This is now a required package for pySCENIC.
Documentation updates

0.11.1

3 years ago

Fix bug in motif url construction (#275)
Fix for export2loom with sparse dataframe (#278)
Fix sklearn t-SNE import (#285)
Updates to Docker image (expose port 8787 for Dask dashboard)

0.11.0

3 years ago

Major features:

Updated Arboreto release (GRN inference step) includes:
- Support for sparse matrices (using the --sparse flag in pyscenic grn, or passing a sparse matrix to grnboost2/genie3).
- Fixes to avoid dask metadata mismatch error
Updated cisTarget:
- Fix for metadata mismatch in ctx prune2df step
- Support for databases in Apache Parquet format
- Faster loading from feather databases
- Bugfix: loading genes from a database (previously missing the last gene name in the database)
Support for Anndata input and output
Package updates:
- Upgrade to newer pandas version
- Upgrade to newer numba version
- Upgrade to newer versions of dask, distributed
Input checks and more descriptive error messages.
- Check that regulons loaded are not empty.
Bugfixes:
- In the regulons output from the cisTarget step, the gene weights were incorrectly assigned to their respective target genes (PR #254).
- Motif url construction fixed when running ctx without pruning
- Compression of intermediate files in the CLI steps
- Handle loom files with non-standard gene/cell attribute names
- Reformat the genesig gmt input/output
- Fix AUCell output to loom with non-standard loom attributes

0.10.4

3 years ago

Updates:

Included new (optional) CLI option to add correlation information to the GRN adjacencies file. This can be called with pyscenic add_cor. (vib-singlecell-nf/vsn-pipelines/issues/254)
- The correlation calculation is subsequently skipped if using this adjacencies + correlations file as the input into pyscenic ctx.
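
To make the idea of adding correlation information to an adjacencies table concrete, here is a minimal sketch. It is not the code behind pyscenic add_cor: the function name, the toy data, the tuple layout, and the choice of Pearson correlation are all assumptions made for illustration.

```python
import numpy as np

# Toy expression profiles (gene -> expression across four cells); made-up values.
expression = {
    "TF1": np.array([1.0, 2.0, 3.0, 4.0]),
    "GeneA": np.array([2.0, 4.0, 6.0, 8.0]),
    "GeneB": np.array([4.0, 3.0, 2.0, 1.0]),
}

# Toy adjacencies as (TF, target, importance) tuples; made-up values.
adjacencies = [("TF1", "GeneA", 10.5), ("TF1", "GeneB", 3.2)]


def add_correlation_column(adjacencies, expression):
    """Append the TF-target Pearson correlation to every adjacency row.

    Hypothetical helper for illustration only.
    """
    rows = []
    for tf, target, importance in adjacencies:
        # corrcoef returns a 2x2 matrix; the off-diagonal entry is the correlation.
        rho = float(np.corrcoef(expression[tf], expression[target])[0, 1])
        rows.append((tf, target, importance, rho))
    return rows


with_correlation = add_correlation_column(adjacencies, expression)
for row in with_correlation:
    print(row)
```

Storing the correlation next to each adjacency is what lets a later step reuse it instead of recomputing it from the expression matrix.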

0.10.3

3 years ago

Updates:

Fix bug in motif url construction (#158)
Integrate arboreto multiprocessing script into pySCENIC CLI
cisTarget step: Check for modules with zero db overlap and skip them (#158, #177, #132, #85)
Bugfix in TF-gene correlation calculation. Quit with error if there is a mismatch between the genes present in the GRN and the expression matrix (#103, #149)
Error message when regulons file is empty (#133)

0.10.2

3 years ago

Updates:

Bugfix for CLI grn step

0.10.1

3 years ago

Updates:

CLI: file compression (optionally) enabled for intermediate files for the major steps: grn (adjacencies matrix), ctx (regulons), and aucell (auc matrix). Compression is used when the file name argument has a .gz ending.
Restrict packages (pyarrow, pandas) for compatibility.
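
The rule that compression is chosen from the file name ending can be sketched with the standard library alone. This illustrates the convention and is not pySCENIC's own writer: the helper names and the file names below are hypothetical.

```python
import gzip
import os
import tempfile


def open_for_writing(path: str):
    """Open a text file for writing, gzip-compressed if the name ends in .gz.

    Hypothetical helper for illustration only.
    """
    if path.endswith(".gz"):
        return gzip.open(path, "wt", encoding="utf-8")
    return open(path, "w", encoding="utf-8")


def is_gzip_file(path: str) -> bool:
    """Check for the two-byte gzip magic number at the start of the file."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"


with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "adjacencies.tsv")
    packed_path = os.path.join(tmp_dir, "adjacencies.tsv.gz")
    for path in (plain_path, packed_path):
        with open_for_writing(path) as handle:
            handle.write("TF\ttarget\timportance\n")
    plain_is_gzip = is_gzip_file(plain_path)
    packed_is_gzip = is_gzip_file(packed_path)
    with gzip.open(packed_path, "rt", encoding="utf-8") as handle:
        packed_content = handle.read()

print("plain file compressed:", plain_is_gzip)
print("gz file compressed:", packed_is_gzip)
```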

0.10.0

4 years ago

Updates:

Added a helper script scripts/arboreto_with_multiprocessing.py that runs the Arboreto GRN algorithms (GRNBoost2, GENIE3) without Dask for compatibility.
Initial support for the use of sparse expression matrices (applies only to the GRN step in the CLI currently). Using sparse matrices with the GRN step requires a patch to Arboreto (tmoerman/arboreto#20)
AUCell uses random sampling of the expression matrix to break ties in the ranking step. The CLI parameter --seed, or the seed parameter of the aucell function, sets a fixed seed for this step. The regulon thresholds also depend on random sampling in the binarization step (bimodality test: np.random.uniform), and the seed parameters apply here as well.
Fixed typo in regulon threshold labeling
Security patch: Bump bleach from 3.1.0 to 3.1.1.
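
The effect of a fixed seed on the tie-breaking described for AUCell above can be illustrated with a small ranking sketch. This is not pySCENIC's implementation: the function name is hypothetical, and using a random secondary sort key is just one possible way to break ties at random.

```python
import numpy as np


def rank_with_random_ties(values, seed=None):
    """Rank values in descending order (0 = highest), breaking ties at random.

    Hypothetical helper: a fixed seed makes the tie-breaking reproducible.
    """
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Random secondary key that decides the order among equal values.
    tiebreak = rng.random(values.shape[0])
    # lexsort treats the LAST key as primary: sort by -values, then by tiebreak.
    order = np.lexsort((tiebreak, -values))
    ranks = np.empty_like(order)
    ranks[order] = np.arange(values.shape[0])
    return ranks


# Expression of five genes in one cell; the three zeros are tied.
cell = [5.0, 0.0, 0.0, 3.0, 0.0]
ranks_a = rank_with_random_ties(cell, seed=42)
ranks_b = rank_with_random_ties(cell, seed=42)
print(ranks_a)
print(ranks_b)
```

With the same seed the two calls give identical rankings. Without a seed, the order of the tied genes can change from run to run, which is why downstream values that depend on the ranking may differ slightly between runs.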