Ncbi Genome Download Versions

Scripts to download genomes from the NCBI FTP servers

0.3.3

9 months ago

This is release 0.3.3 of ncbi-genome-download.

This release has no new features on top of 0.3.2 but adds some information on how to cite the software.

Detailed changes:

Kai Blin (5):
      README: Add citation information
      CITATION: Add a citation metadata file
      CITATION: Second attempt at generating a valid CITATION.rff file
      CITATION: use the correct file name for the citation metadata file
      Bump version to 0.3.3

0.3.2.zenodo-init

9 months ago

This is a re-release of 0.3.2 of ncbi-genome-download, made to get a Zenodo DOI generated. It is functionally identical to the existing 0.3.2 release.

Major changes of this release are:

  • Add support for the new format assembly info headers, fixing downloads.
  • Support for the translated-cds format (thanks @SwiftSeal)
  • Allow fuzzy searches for accessions
  • Cache the MD5SUMS files for a day as well, to make re-starting the download easier.

Detailed changes:

Kai Blin (9):
      core: Actually expose the --fuzzy-accessions logic built back in 2019 on the command line
      core: Improve the --refseq-categories tooltip
      core: Only re-download MD5SUMS if they're more than a day old
      core: Show progress bar both for downloading MD5SUMS and data files
      chore: update the python versions for the CI workflow
      chore: Remove old drone CI integration
      chore: Reformat README.md to fix markdownlint errors
      summary: Support the new format summary files
      Bump version to 0.3.2

Moray Smith (1):
      Update config.py

0.3.2

9 months ago

This is release 0.3.2 of ncbi-genome-download.

Major changes of this release are:

  • Add support for the new format assembly info headers, fixing downloads.
  • Support for the translated-cds format (thanks @SwiftSeal)
  • Allow fuzzy searches for accessions
  • Cache the MD5SUMS files for a day as well, to make re-starting the download easier.

Thanks also to @chasemc and @twelvesummer for submitting patches for the header format.
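
The one-day MD5SUMS cache mentioned above can be pictured with a small sketch. This is not the project's actual code: the helper name `needs_download` and the constant `ONE_DAY_SECONDS` are made up for illustration, and the sketch assumes the cache age is judged by the file's modification time.

```python
import os
import time

# Assumed cache lifetime of one day (hypothetical constant for this sketch).
ONE_DAY_SECONDS = 24 * 60 * 60


def needs_download(path, max_age=ONE_DAY_SECONDS, now=None):
    """Return True if `path` is missing or older than `max_age` seconds."""
    if not os.path.exists(path):
        return True
    if now is None:
        now = time.time()
    # Compare the file's last modification time against the allowed age.
    return (now - os.path.getmtime(path)) > max_age
```

With a check like this, restarting an interrupted run on the same day can reuse the checksum files already on disk instead of fetching them again.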

Detailed changes:

Kai Blin (9):
      core: Actually expose the --fuzzy-accessions logic built back in 2019 on the command line
      core: Improve the --refseq-categories tooltip
      core: Only re-download MD5SUMS if they're more than a day old
      core: Show progress bar both for downloading MD5SUMS and data files
      chore: update the python versions for the CI workflow
      chore: Remove old drone CI integration
      chore: Reformat README.md to fix markdownlint errors
      summary: Support the new format summary files
      Bump version to 0.3.2

Moray Smith (1):
      Update config.py

0.3.1

2 years ago

This is release 0.3.1 of ncbi-genome-download.

Main features of this release are:

  • support for progress bars (thanks to @444thLiao)
  • various bug fixes (thanks @peterjc and @jrjhealey)

Detailed changes:

Joe Healey (1):
      remove unused function and update email

Kai Blin (8):
      core: Change the progress bar shorthand to -P, default to no progress bar
      chore: Use pytest.fixture instead of deprecated pytest.yield_fixture
      core: Don't attempt to download metagenome info from refseq even if group is all
      chore: Ignore more IDE files
      chore: Fix all linter errors reported by flake8
      Makefile: switch linting to flake8 to match CI setup
      chore: Update mailmap
      Bump version number to 0.3.1

Peter Cock (1):
      Fixed typo in command line API help

Tianhua Liao (2):
      add progress bar
      fix repeat bars

0.3.0

3 years ago

This is release 0.3.0 of ncbi-genome-download.

This release breaks backwards compatibility a bit, hence the new minor release number. If you only use the command line tool, everything should still work, but note that some of the options have changed to their plural forms. If you use the API, you need to update your code to use the new plural forms of the option names.

This version also no longer supports Python 2.7.

In addition, this version contains some contributed features and bugfixes:

  • gimme_taxa.py now is installable, thanks Istvan (@ialbert)
  • We no longer break on FTP entries without an FTP path, thanks Paul (@openpaul)
  • We now raise an error if you try to download metagenomes from RefSeq. Thanks again, Paul (@openpaul)
  • Updated Chinese README file, thanks James (@jamesyangget)
  • We no longer leak pool workers when running parallel downloads, thanks Gerrit (@Wrzlprmft)
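
The pool worker fix relies on using the pool as a context manager. Below is a minimal sketch of that pattern, not the project's code: `fake_download` and `run_jobs` are hypothetical names, and a thread-based `ThreadPool` stands in for a process pool to keep the example self-contained.

```python
from multiprocessing.pool import ThreadPool


def fake_download(job):
    """Stand-in for a real download job; it only returns a label."""
    return "downloaded " + job


def run_jobs(jobs, workers=2):
    """Run all jobs in a pool that is shut down when the block exits."""
    # Leaving the with-block terminates the pool, even if map() raises,
    # so no workers are left running afterwards.
    with ThreadPool(workers) as pool:
        return pool.map(fake_download, jobs)
```

Calling `run_jobs(["a", "b"])` returns the results in input order, and the workers are gone by the time the function returns.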

Detailed changes:

Gerrit Ansmann (2):
      Using context manager for pool. This should at least partially fix Issue #120.
      Restructuring to avoid excessively long and complicated line.

Istvan Albert (1):
      made gimme_taxa.py an installable script

James Yang (3):
      fix and update translation
      Update README-CN.md
      Update README-CN.md

Kai Blin (18):
      core: Make get_name_and_checksum not skip the wrong files
      config: Init section before group
      main: Print nicer error messages on invalid arguments
      config: Add tests for new 'no metagenomes in refseq' check
      chore: Update contributor map
      chore: Drop python2 compatibility code
      chore: Style pass to make flake8 happy
      chore: Update supported Python versions in README
      chore: set up GitHub workflow for testing and publishing
      chore: Add vim config directory to gitignore
      core: Make acceptable --refseq-categories a list
      chore: Use py3 to test in drone as well
      core: Fix default for new list-based --refseq-category parameter
      core: Split strain parsing from strain label generation
      core: Also allow to filter by strain
      core: Also show strain in the dry-run listing
      core: Break the API so all list types now use the plural form.
      Bump version number to 0.3.0

Paul Saary (2):
      move na check into filter function
      raise warning if refseq metagenome is requested, as there is no such thing at the moment

0.2.12

4 years ago

This is release 0.2.12 of ncbi-genome-download.

Highlights of this release are:

  • Parallel downloads of checksum files (Thanks to Adelme Bazin (@axbazin))
  • New --flat-output option to dump all downloaded files into a single directory
  • We now have a Chinese translation of the README (Thanks James Yang (@jamesyangget))

Detailed changes:

Adelme Bazin (5):
      core : Checking MD5SUM in parallel when more than one process is allowed
      add integration test to check if metadata table was filled properly. Expected failure when using multiprocessing.
      fill metadata table in config_download instead of download_file_job to avoid problems with multiprocessing
      modify the metadata_fill test functions so that they follow the same logic than the current code
      fix the call to Pool with the 'with' statement for python2

Kai Blin (13):
      core: Allow keeping the downloaded files in a flat hierarchy
      chore: Break long help text lines
      core: Fix the --flat-output description
      README: Update install documentation
      core: Add tests for type material downloads
      chore: Add docstring and coverage skip to `downloadjob_creator_caller`
      config: Test `is_compatible_assembly_accession()` in fuzzy_accession mode
      core: Add a docstring to new fill_metadata function
      core: Acquire the metadata table object outside of the download loops
      config: Also support downloading metagenomes
      core: Check for exact match on genus name before trying to capitalise it
      chore: Update README to note that 0.2.12 is the last version to support Python 2
      Bump version number to 0.2.12

James Yang (1):
      translate README.md into Chinese (#97)

0.2.11

4 years ago

This is release 0.2.11 of ncbi-genome-download which fixes two logging issues.

Thanks to David Morgan (@Cptmorgan27) for providing a patch.

Detailed changes:

David Morgan (1):
      core: remove print statement for type material

Kai Blin (4):
      chore: Use a named logger instead of the root logger
      README: Make it clearer that more than just bacteria and viral groups are available
      chore: Remove landscape.io link, as that service seems dead
      Bump version number to 0.2.11

0.2.10

4 years ago

This is a bugfix release of ncbi-genome-download that also adds two convenience features.

Major changes are:

  • Use relative instead of absolute symlinks for human-readable output (thanks @chrisgulvik)
  • No longer crash on abnormal organism names (thanks to @andrewsanchez for the initial pull request)
  • Allow for fuzzy matching of both organism name and accessions
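
To illustrate the symlink change, here is a rough sketch of creating a relative rather than an absolute link. It is an assumption about the general approach, not the project's implementation, and the function name `create_relative_symlink` is invented for this example.

```python
import os


def create_relative_symlink(target, link_path):
    """Create `link_path` as a symlink that points at `target` relatively."""
    # Work out the directory the link will live in.
    link_dir = os.path.dirname(os.path.abspath(link_path))
    # Express the target relative to that directory instead of from the root.
    relative_target = os.path.relpath(os.path.abspath(target), start=link_dir)
    os.symlink(relative_target, link_path)
    return relative_target
```

Because the link stores a path relative to its own directory, the whole output tree can be moved or mounted elsewhere without breaking the links.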

Detailed changes:

Chris Gulvik (1):
      create_symlink func modified to create relative rather than absolute symbolic links; resolves #62

Kai Blin (5):
      core: Allow for fuzzy matching for specified organism names
      core: Allow for fuzzy matching of specified accessions
      chore: Fix two whitespace issues
      core: Deal with organism names that don't contain a species part
      Bump version number to 0.2.10

0.2.9

5 years ago

This release adds the "relation to type material" filter contributed by Jason Davis-Cooke. Thanks for that.

Detailed changes:

Jason Davis-Cooke (1):
      feat(core): add 'relation to type material' as as filtering option (#82)

Kai Blin (2):
      README: Document the type material filter option
      Bump version number to 0.2.9

0.2.8

5 years ago

This is mainly a bugfix release. It fixes a UnicodeEncodeError when writing to a --metadata-table file with non-ASCII entries, such as those in record GCF_000234725.1.

Thanks to @danudwary and @jananiravi for the error reports.

Also thanks to Tessa Pierce and Joe Healey for their contributions.
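
The kind of fix described above, opening the output file with an explicit UTF-8 encoding, can be sketched as follows. This is an illustration under assumptions rather than the project's code: `write_metadata_table` is a hypothetical helper, and a tab-separated layout is assumed for the table.

```python
import csv


def write_metadata_table(path, rows):
    """Write rows to a tab-separated file, always encoded as UTF-8."""
    # An explicit encoding avoids relying on the platform default, which
    # may be ASCII-only and raise UnicodeEncodeError on non-ASCII text.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerows(rows)
```

Any row containing accented or other non-ASCII characters is then written without error, whatever the locale of the machine running the download.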

Detailed changes:

Joe Healey (1):
      update readme with conda install

Kai Blin (3):
      config: Change a tab indent to spaces
      core: Open metatable file with utf-8 encoding
      Bump version number to 0.2.8

Tessa Pierce (1):
      add support for rm (repeat masked) eukaryotic genomes