Seqkit Versions

A cross-platform and ultrafast toolkit for FASTA/Q file manipulation

v2.8.1

1 month ago

Notice: I forgot to update the version number, so seqkit version will return 2.8.0.

Changelog

SeqKit v2.8.1 - 2024-04-07
- seqkit sana:
  - Add support for FASTQ files with IDs in the separator (+, 3rd) lines. #446, #429, #408
- seqkit subseq:
  - Add some docs to show how to keep the original order of sequences when extracting with BED: compress the input FASTA file. #451
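
To illustrate the FASTQ layout mentioned for seqkit sana, here is a minimal Python sketch of a 4-line FASTQ parser that tolerates a separator line repeating the record ID after the "+". The function name parse_fastq is hypothetical, and this is not how SeqKit itself is implemented; it only shows the shape of such records.

```python
def parse_fastq(text):
    """Parse strict 4-line FASTQ records from a string.

    The third line must start with '+', and anything after the '+'
    (for example a repeated ID) is accepted and ignored.
    This is an illustrative sketch, not SeqKit's parser.
    """
    lines = [ln for ln in text.splitlines() if ln]
    if len(lines) % 4 != 0:
        raise ValueError("expected 4 lines per FASTQ record")
    records = []
    for i in range(0, len(lines), 4):
        header, seq, sep, qual = lines[i:i + 4]
        if not header.startswith("@"):
            raise ValueError(f"bad header line: {header!r}")
        if not sep.startswith("+"):
            raise ValueError(f"bad separator line: {sep!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        records.append((header[1:], seq, qual))
    return records


example = "@r1 desc\nACGT\n+r1 desc\nIIII\n@r2\nGG\n+\n##\n"
for name, seq, qual in parse_fastq(example):
    print(name, seq, qual)
```

Both records parse the same way, whether the separator line is a bare "+" or carries the ID.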

Links

OS	Arch	File, China mirror
Linux	32-bit	seqkit_linux_386.tar.gz, China mirror
Linux	64-bit	seqkit_linux_amd64.tar.gz, China mirror
Linux	arm64	seqkit_linux_arm64.tar.gz, China mirror
macOS	64-bit	seqkit_darwin_amd64.tar.gz, China mirror
macOS	arm64	seqkit_darwin_arm64.tar.gz, China mirror
Windows	32-bit	seqkit_windows_386.exe.tar.gz, China mirror
Windows	64-bit	seqkit_windows_amd64.exe.tar.gz, China mirror

Notes

Please open an issue to request binaries for other platforms.
Run seqkit version to check for updates.
Run seqkit genautocomplete to update the shell autocompletion script.

Please cite:

Wei Shen*, Botond Sipos, and Liuyang Zhao. 2024. SeqKit2: A Swiss Army Knife for Sequence and Alignment Processing. iMeta e191. doi:10.1002/imt2.191.
Wei Shen, Shuai Le, Yan Li*, and Fuquan Hu*. SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLOS ONE. doi:10.1371/journal.pone.0163962.

v2.8.0

2 months ago

Changelog

SeqKit v2.8.0 - 2024-03-11
- seqkit stats:
  - Add column N50_num, an alias of L50, #15.
- seqkit seq/locate/fish/watch:
  - Remove the flag -V/--validate-seq-length. The whole sequence is now checked if -v/--validate-seq is given.
- seqkit amplicon:
  - Fix the speed problem, introduced in v2.7.0. #439.
  - Slightly faster by reusing objects.
- seqkit seq:
  - Change the threshold sequence length for parallelizing complement sequence computation, 1kb->1Mb.
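
For readers unfamiliar with the N50 and L50 statistics behind the N50_num column, the following Python sketch computes both from a list of sequence lengths using the common definition (the shortest length in the smallest set of longest sequences covering at least half of the total bases, and the size of that set). The function name n50_l50 is hypothetical, and SeqKit's own implementation may handle edge cases differently.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a non-empty list of sequence lengths."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count


n50, l50 = n50_l50([80, 70, 50, 40, 30, 20])
print(n50, l50)  # prints "70 2"
```

Here the total is 290 bases, and the two longest sequences (80 + 70 = 150) already cover half of it, so N50 is 70 and L50 is 2.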

v2.7.0

3 months ago

Changelog

SeqKit v2.7.0 - 2024-01-31
- seqkit:
  - Grouping subcommands in help message, which is intuitive for beginners.
- seqkit grep:
  - New flag: -D/--allow-duplicated-patterns for outputting records multiple times when duplicated patterns are given. #427
- seqkit subseq:
  - Use the ID regular expression from the option --id-regexp to create the FASTA index file. This fixes the panic that occurred for sequences containing tabs in the headers. #432
- seqkit split/sort/shuffle:
  - When using the two-pass mode (-2/--two-pass), replace possible tabs in the sequence header.
- seqkit rmdup:
  - Write an empty file of duplicate numbers and lists of IDs even if there are no duplicates when using -D/--dup-num-file. #436
- seqkit stats:
  - New flag -S/--skip-file-check to skip input file checking when given files or a file list. It's very useful if you run it with millions of files.

v2.6.1

6 months ago

Changelog

SeqKit v2.6.1 - 2023-11-18
- seqkit:
  - fix panic of nil pointer introduced in v2.6.0, which happens when handling multiple input files and some of them have file sizes of zero.
- seqkit seq:
  - fix panic (close of closed channel) when using -v to check sequences.

v2.6.0

6 months ago

Changes

SeqKit v2.6.0 - 2023-11-09
- seqkit:
  - add the shortcut -X for the flag --infile-list.
- seqkit common:
  - add a new flag -e/--check-embedded-seqs for detecting embedded sequences.
  - for matching by sequences: reduce memory usage and correct the numbers in the log. #416
- seqkit stat:
  - add a new column AvgQual for average quality score. #411
- seqkit split2:
  - fix the panic for invalid input.
- seqkit subseq:
  - add a new flag -R/--region-coord for appending coordinates to sequence ID for -r/--region. #413
- seqkit locate:
  - add a new flag -s/--max-len-to-show to show at most X characters for the search pattern or matched sequences.
- seqkit seq:
  - change the nucleotide color theme. #412
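
As background for the AvgQual column, here is a small Python sketch of one common way to average FASTQ quality scores: convert each Phred score to an error probability, average the probabilities, and convert back to a Phred value. The Phred+33 encoding and the function name average_quality are assumptions for illustration; the changelog does not state which formula SeqKit uses.

```python
import math


def average_quality(qual, offset=33):
    """Average quality of a FASTQ quality string via mean error probability.

    Assumes Phred+33 encoding by default. This is an illustrative
    convention, not necessarily the formula SeqKit applies.
    """
    if not qual:
        return 0.0
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in qual]
    return -10 * math.log10(sum(probs) / len(probs))


print(round(average_quality("IIII"), 2))
print(round(average_quality("I!"), 2))
```

Averaging probabilities rather than raw scores means a few very low-quality bases pull the result down sharply, which a plain arithmetic mean of Phred scores would hide.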

v2.5.1

9 months ago

Changes

SeqKit v2.5.1 - 2023-08-09
- seqkit stats:
  - fix a concurrency bug (file name error) introduced in v2.5.0. #405
- seqkit subseq:
  - sequence/chromosome IDs are case-sensitive now. #400

v2.5.0

10 months ago

Changes

SeqKit v2.5.0 - 2023-07-16
- new command seqkit merge-slides: merge sliding windows generated from seqkit sliding. #390
- seqkit stats:
  - added a new flag -N/--N for appending other N50-like stats as new columns. #393
  - added a progress bar for > 1 input files.
  - write the result of each file immediately (no output buffer) when using -T/--tabular.
- seqkit translate:
  - add options -s/--out-subseqs and -m/--min-len to write ORFs longer than x amino acids as individual records. #389
- seqkit sum:
  - do not remove possible '*' by default and delete confusing warnings. Thanks to @photocyte. #399
  - added a progress bar for > 1 input files.
- seqkit pair:
  - remove the restriction of requiring FASTQ format, i.e., FASTA files are also supported.
- seqkit seq:
  - update help messages. #387
- seqkit fxtab:
  - faster alphabet computation (-a/--alphabet) with a new data structure. Thanks to @elliotwutingfeng #388
- seqkit subseq:
  - accept reverse coordinates in BED/GTF. #392

v2.4.0

1 year ago

Changes

SeqKit v2.4.0 - 2023-03-17
- seqkit:
  - support bzip2 format. #361
  - support setting compression level for gzip, zstd, and bzip2 format via --compress-level. #320
  - the global flag --infile-list accepts stdin (-) now.
  - wrap the help message of flags.
- seqkit locate:
  - do not remove embedded regions when searching with regular expressions. #368
- seqkit amplicon:
  - fix BED coordinates for amplicons found in the minus strand. #367
- seqkit split:
  - fix the missing file extension when using --two-pass. #332
- seqkit stats:
  - fix the computation of Q1 and Q3 of sequence length for one record. #353
- seqkit grep:
  - fix count number (-C) for matching with mismatch (-m > 0). #370
- seqkit replace:
  - add some flags to edit only the records that match; these flags are ported from seqkit grep. #348
- seqkit faidx:
  - allow empty lines at the end of sequences.
- seqkit faidx/sort/shuffle/split/subseq:
  - new flag -U/--update-faidx: update the FASTA index file if it exists, to guarantee the index file matches the FASTA files. #364
  - improve log info and update help message. #365
- seqkit seq:
  - allow filtering sequences of length zero. thanks to @penglbio.
- seqkit rename:
  - new flag -s/--separator for setting separator between original ID/name and the counter (default "_"). #360
  - new flag -N/--start-num for setting starting count number for duplicated IDs/names (default 2). #360
  - new flag -1/--rename-1st-rec for renaming the first record as well. #360
  - do not append a space if there's no description after the sequence ID.
- seqkit sliding:
  - new flag -S/--suffix for changing the suffix added to the sequence ID (default: "_sliding").
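
The seqkit rename defaults listed above (separator "_" and starting count 2) can be pictured with the following Python sketch. The function rename_duplicates and its parameter names are hypothetical; it models only the default behavior described in the changelog, does not cover -1/--rename-1st-rec, and does not check whether a generated ID collides with an existing one.

```python
def rename_duplicates(ids, separator="_", start_num=2):
    """Append a counter to repeated IDs, leaving the first occurrence as-is."""
    seen = {}
    renamed = []
    for seq_id in ids:
        occurrence = seen.get(seq_id, 0) + 1
        seen[seq_id] = occurrence
        if occurrence == 1:
            renamed.append(seq_id)
        else:
            # The second occurrence gets start_num, the third start_num + 1, ...
            renamed.append(f"{seq_id}{separator}{start_num + occurrence - 2}")
    return renamed


print(rename_duplicates(["a", "b", "a", "a"]))  # prints ['a', 'b', 'a_2', 'a_3']
```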

v2.3.1

1 year ago

Changes

SeqKit v2.3.1 - 2022-09-22
- seqkit grep/locate: fix bug of FMIndex building for empty sequences. #321
- seqkit split2: fix bug of splitting two FASTA files. #325
- seqkit faidx: --id-regexp works now.

v2.3.0

1 year ago

Changes

SeqKit v2.3.0 - 2022-08-12
- seqkit grep/rename:
  - reduce memory consumption for a large number of search patterns, and it's faster. #305
  - 2X faster -s/--by-seq.
- seqkit split:
  - fix outputting an empty file when the number of sequences equals the split size. #293
  - add options to set output file prefix and extension. #296
- seqkit split2:
  - reduce memory consumption. #304
  - add options to set output file prefix
- seqkit stats:
  - add GC content. #294
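
As a quick illustration of the GC content statistic, the Python sketch below computes the percentage of G and C bases in a sequence. The function name gc_content is hypothetical, and the choice to count every character (including N or other ambiguous bases) in the denominator is an assumption; SeqKit may treat such bases differently.

```python
def gc_content(seq):
    """Return the percentage of G and C bases in seq (case-insensitive)."""
    if not seq:
        return 0.0
    upper = seq.upper()
    gc = upper.count("G") + upper.count("C")
    return 100.0 * gc / len(upper)


print(gc_content("ACGT"))  # prints 50.0
print(gc_content("ggcc"))  # prints 100.0
```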