Near-optimal RNA-Seq quantification
The kallisto index version is now 13 (kallisto v0.50.0 had index version 12)
New features (kallisto index):
New features (kallisto bus technologies):
The improved kallisto index reduces memory consumption for large FASTA files and features a d-list option to improve k-mer mapping specificity. Additionally, new input and output features have been added as well as support for sample barcodes (which can be recorded in addition to cell barcodes).
For this release HDF5 is not a required dependency for running kallisto bus for single cell RNA-seq analysis. It is still required for compatibility with sleuth and other downstream tools. By default kallisto will not be built with HDF5 support; this can be enabled by running
cmake .. -DUSE_HDF5=ON
The binaries for this release are compiled with HDF5 built in, but we will switch away from HDF5 in future versions (coordinated with sleuth).
When running kallisto quant without HDF5 support:
quant without bootstrapping will create the same files as before, except for abundance.h5.
quant with bootstrapping, -b, will not perform bootstrapping but displays the following warning:
Warning: kallisto was not compiled with HDF5 support so no bootstrapping will be performed. Run quant with --plaintext option or recompile with HDF5 support to obtain bootstrap estimates.
quant with -b k and --plaintext will create the bootstrap values in files bs_abundance_i.tsv for i=0..k-1.
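The plaintext bootstrap files described above can be combined with a short script. The following is a minimal sketch, not part of kallisto: it assumes each bs_abundance_i.tsv is tab-separated and has the same header as abundance.tsv (target_id, length, eff_length, est_counts, tpm). The function names load_bootstrap_counts and summarize are hypothetical.

```python
import csv
from pathlib import Path
from statistics import mean, pvariance


def load_bootstrap_counts(out_dir, k):
    """Collect est_counts per target from bs_abundance_0.tsv .. bs_abundance_{k-1}.tsv.

    Assumption: the files share the tab-separated layout of abundance.tsv,
    with 'target_id' and 'est_counts' columns.
    """
    counts = {}
    for i in range(k):
        path = Path(out_dir) / f"bs_abundance_{i}.tsv"
        with path.open(newline="") as fh:
            for row in csv.DictReader(fh, delimiter="\t"):
                counts.setdefault(row["target_id"], []).append(float(row["est_counts"]))
    return counts


def summarize(counts):
    """Return {target_id: (mean, population variance)} over the bootstrap samples."""
    return {tid: (mean(vals), pvariance(vals)) for tid, vals in counts.items()}
```

For example, summarize(load_bootstrap_counts("output", 100)) would summarize a run made with -b 100 and --plaintext, where "output" stands in for the directory passed to quant.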
For users relying on HDF5 support we recommend compiling kallisto with HDF5 or downloading the kallisto binaries.
Over the next releases HDF5 will gradually be phased out and information on bootstraps will be replaced with a new format.
kallisto pseudo outputs a file of transcript ids
kallisto bus allows having sequence split across more than one file, closes #226
This release adds options for parsing the inDrops technology (versions 2 and 3 are new) as well as specifying input from BAM files rather than raw FASTQ files.
This version adds the option of specifying an arbitrary single cell technology for the bus command in kallisto.
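To make the idea of an arbitrary technology concrete, here is a small sketch of how such a specification can be read. It rests on an assumption taken from the kallisto manual rather than from these notes: the custom technology is written as bc:umi:seq, each part being a file,start,stop triplet, with a stop of 0 meaning "to the end of the read" (so 0,0,16:0,16,26:1,0,0 would describe a 16 bp barcode and 10 bp UMI in the first file and the sequence in the second). The parser itself (Segment, parse_technology) is hypothetical and handles only one triplet per part.

```python
from typing import Dict, NamedTuple, Optional


class Segment(NamedTuple):
    file_index: int        # 0-based index of the input file
    start: int             # 0-based start position within the read
    stop: Optional[int]    # exclusive end; None means "to the end of the read"


def parse_technology(spec: str) -> Dict[str, Segment]:
    """Parse a 'bc:umi:seq' string of file,start,stop triplets (assumed format)."""
    parts = spec.split(":")
    if len(parts) != 3:
        raise ValueError("expected three ':'-separated parts (bc:umi:seq)")
    segments = []
    for part in parts:
        fields = part.split(",")
        if len(fields) != 3:
            raise ValueError(f"expected file,start,stop but got {part!r}")
        file_index, start, stop = (int(f) for f in fields)
        # A stop of 0 is treated as "no limit", per the assumption above.
        segments.append(Segment(file_index, start, None if stop == 0 else stop))
    return dict(zip(("bc", "umi", "seq"), segments))
```

Calling parse_technology("0,0,16:0,16,26:1,0,0") yields one Segment each for the barcode, UMI and sequence, which is a convenient way to sanity-check a specification before passing it to the bus command.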
This release adds 10xv3 as a technology option for the bus command.
Bug fixes
pseudo mode.
-l flag for bus was inactive.
Changes from v0.44.0
kallisto can now process raw FASTQ files for single cell RNA-Seq and create an output in BUS format which can be further processed using bustools.
To process single cell data run kallisto with the bus command. To see a list of supported technologies, run with the --list option:
> kallisto bus --list
List of supported single cell technologies
short name       description
----------       -----------
10Xv1            10X chemistry version 1
10Xv2            10X chemistry verison 2
DropSeq          DropSeq
inDrop           inDrop
CELSeq           CEL-Seq
CELSeq2          CEL-Seq version 2
SCRBSeq          SCRB-Seq
Changes from v0.43.1
kallisto can now project pseudoalignments from transcripts down to genomic coordinates. This requires a GTF file corresponding to the transcriptome used to construct the index. The resulting BAM file is sorted by genomic coordinates and indexed.
The --pseudobam option works as before in transcript coordinates, but creates a single output pseudoalignments.bam in the output folder. This mode no longer writes SAM format to standard output, but writes the binary BAM file directly. Multithreaded --pseudobam works now.
The --genomebam option writes pseudoalignments to the file pseudoalignments.bam in sorted genomic coordinates; it requires a --gtf option and optionally a --chromosomes option.
Adds a --single-overhang option that does not discard reads where the unobserved rest of the fragment is predicted to lie outside a transcript. This is mainly useful for mapping 3' biased reads from single cell experiments.
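As an illustration of how the options above fit together, the sketch below assembles a quant command line for genome-projected output. It only builds the argument list and does not run kallisto. The helper name genomebam_command and all file names are hypothetical; -i and -o are the usual index and output-directory options of quant.

```python
import shlex


def genomebam_command(index, out_dir, gtf, reads, chromosomes=None, single_overhang=False):
    """Build an argument list for 'kallisto quant --genomebam' (illustrative helper)."""
    cmd = ["kallisto", "quant", "-i", index, "-o", out_dir, "--genomebam", "--gtf", gtf]
    if chromosomes:
        # Optional file given to --chromosomes.
        cmd += ["--chromosomes", chromosomes]
    if single_overhang:
        # Keep reads whose unobserved fragment end is predicted to fall outside the transcript.
        cmd.append("--single-overhang")
    cmd += list(reads)
    return cmd


# Hypothetical file names, shown only to display the resulting command line.
print(shlex.join(genomebam_command(
    "transcripts.idx", "output", "annotation.gtf",
    ["reads_1.fastq.gz", "reads_2.fastq.gz"],
    chromosomes="chrom.txt",
)))
```

The returned list can be handed to subprocess.run(cmd, check=True) once the paths point at real files; per the notes above, the result is a sorted pseudoalignments.bam in the output folder.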
Adds QC information to run_info.json in the output folder. The added fields are
n_pseudoaligned: number of fragments that could be pseudoaligned
p_pseudoaligned: percentage of fragments that could be pseudoaligned
n_unique: number of fragments that could be pseudoaligned to a unique target sequence
p_unique: percentage of fragments that could be pseudoaligned to a unique target sequence
Changes from v0.43.0
kallisto can now find reads which span potential fusion breakpoints. The quant mode adds a --fusion flag which identifies read pairs involved in fusions and writes output to fusion.txt; this file is then processed by pizzly for downstream analysis.
Switched to a uniform starting point for the EM algorithm that works better in highly ambiguous cases.
Several fixes to the pseudobam output so that the resulting SAM/BAM file can be validated with picard.
nan for tpm values).