goleft is a collection of bioinformatics tools distributed under the MIT license in a single static binary.
Indexcov was tuned on bam indexes, and the cram (crai) indexes always looked much worse. See below for a before and after on the same data.
The improvement requires running indexcov with the --extranormalize flag. This is highly recommended if you are using crai files, and it can clean up bam data as well. A reason not to use it would be if you want to do your own scaling/cleanup on the (more) raw data.
[Figures: indexcov plots of the same data before and after --extranormalize]
indexcov: better message on empty crai.
indexcov: allow sending indexes as globs (to avoid argument length limit).
indexcov: if expected sex chroms are X,Y also find chrX,chrY (and use sorted order for output).
indexcov: more defense for bad crais.
indexcov: now works on just *.bai if a .fai is also given with -f.
indexcov: better error message and handling of excluded chromosomes.
indexcov: if given crais and no fai, indexcov will try to read the cram header using samtools view.
indexsplit: fix rare panic in CRAI files due to an off-by-one error (thanks Lavanya for reporting and providing a test-case).
indexcov: fix for long reads with cram. (#43)
indexsplit: fix off-by-one that resulted in double-counting some regions.
indexcov: cram edge-cases.
indexcov: better normalization to 1 for all cases. Fixes bug for bams with many (e.g. > 10K) chromosomes of which many have very low or no coverage. (#36)
indexcov: don't error when no sex chromosomes are found (#27).
indexcov: don't error when some chromosomes have a single region and others have 0.
indexcov: better checking on short sex chroms and other CRAI fixes (thanks to Javier Prado for several test-cases on CRAM).
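
The glob entry above can be illustrated with a minimal Go sketch. When the shell expands `*.crai` itself, every path becomes a separate argument and a large cohort can exceed the operating system's argument length limit; if the pattern is passed through unexpanded, the program can expand it internally. The helper name `expandIndexArgs` and the choice to pass non-matching arguments through unchanged are assumptions made for illustration, not goleft's actual implementation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// expandIndexArgs is a hypothetical helper (not goleft's code). It treats
// each argument as a glob pattern and expands it. An argument that matches
// nothing is kept unchanged so a later "file not found" error can name it.
func expandIndexArgs(args []string) ([]string, error) {
	var paths []string
	for _, arg := range args {
		matches, err := filepath.Glob(arg)
		if err != nil {
			// filepath.Glob only fails on a malformed pattern.
			return nil, fmt.Errorf("bad pattern %q: %w", arg, err)
		}
		if len(matches) == 0 {
			paths = append(paths, arg)
			continue
		}
		paths = append(paths, matches...)
	}
	return paths, nil
}

func main() {
	paths, err := expandIndexArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

With this approach the pattern has to be quoted on the command line (for example `'samples/*.crai'`) so that the shell hands it over as a single argument.
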
indexcov: report and plot number of mapped and unmapped reads as reported by the index.
covmed: rename to covstats.
covstats: report samplename(s) from read groups as well as bam path.
covstats: skip first 100K reads to give better estimates of depth.
covstats: report percent of bad (QC-Fail|Duplicate) and of unmapped reads.
indexcov: automatically exclude chromosomes that match pattern: ^chrEBV$|^NC|_random$|Un_|^HLA\-|_alt$|hap\d$. This can be adjusted from the command-line.
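
To see which chromosome names the default exclude pattern quoted above would drop, it can be tried against a few names with Go's `regexp` package. This is a minimal sketch: the function name `isExcluded` and the sample chromosome names are chosen for illustration, and the code only tests the pattern itself; it does not reproduce how indexcov applies it internally.

```go
package main

import (
	"fmt"
	"regexp"
)

// excludePattern is the default pattern quoted in the release note.
var excludePattern = regexp.MustCompile(`^chrEBV$|^NC|_random$|Un_|^HLA\-|_alt$|hap\d$`)

// isExcluded (hypothetical name) reports whether a chromosome name
// matches any alternative of the pattern.
func isExcluded(chrom string) bool {
	return excludePattern.MatchString(chrom)
}

func main() {
	// Illustrative names in the style of common human reference builds.
	names := []string{
		"chr1", "chrX", "chrEBV", "chrUn_gl000220",
		"chr1_gl000191_random", "HLA-A*01:01:01:01",
		"chr6_apd_hap1", "chr1_KI270762v1_alt", "NC_007605",
	}
	for _, name := range names {
		fmt.Printf("%-24s excluded=%v\n", name, isExcluded(name))
	}
}
```

Note that the alternatives are anchored differently: `^chrEBV$` must match the whole name, `Un_` may appear anywhere, and `_random$`, `_alt$` and `hap\d$` only match at the end of the name.
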