Pachterlab Gget Versions

🧬 gget enables efficient querying of genomic reference databases

v0.28.4

3 months ago

Fix Windows bug in gget elm setup

v0.28.3

3 months ago
  • gget search and gget ref now also support fungi 🍄, protists 🌝, and invertebrate metazoa 🐝 🐜 🐌 🐙 (in addition to vertebrates and plants)
  • New module: gget cosmic
  • gget enrichr: Fix duplicate scatter dots in plot when pathway names are duplicated
  • gget elm:
    • Changed ortho results column name 'Ortholog_UniProt_ID' to 'Ortholog_UniProt_Acc' to correctly reflect the column contents, which are UniProt Accessions. 'UniProt ID' was changed to 'UniProt Acc' in the documentation for all gget modules.
    • Changed ortho results column name 'motif_in_query' to 'motif_inside_subject_query_overlap'.
    • Added interaction domain information to results (new columns: "InteractionDomainId", "InteractionDomainDescription", "InteractionDomainName").
    • The regex string used for regular expression matches is now wrapped as "(?=(regex))" (instead of passing the regex string "regex" directly). This captures all occurrences of a motif when the motif length is variable and the sequence contains repeats (https://regex101.com/r/HUWLlZ/1).
  • gget setup: Use the out argument to specify the directory into which the ELM database will be downloaded. Completes this feature request.
  • gget diamond: The DIAMOND command is now run with the --ignore-warnings flag, which allows niche sequences such as amino acid sequences that contain only nucleotide characters, as well as repeated sequences. This also applies to DIAMOND alignments performed within gget elm.
  • gget ref and gget search back-end change: the current Ensembl release is fetched from the new release file on the Ensembl FTP site to avoid errors during uploads of new releases.
  • gget search:
    • FTP link results (--ftp) are saved in txt file format instead of json.
    • Fix URL links to Ensembl gene summary for species with a subspecies name and invertebrates.
  • gget ref:
    • Back-end changes to increase speed
    • New argument: list_iv_species to list all available invertebrate species (can be combined with the release argument to fetch all species available from a specific Ensembl release)
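
The "(?=(regex))" wrapping noted under gget elm above can be illustrated with Python's built-in re module. This is a minimal sketch, not the gget elm implementation: the motif "A{2,3}" and the sequence "AAAA" are hypothetical toy values chosen only to show how a lookahead with a capture group also reports overlapping hits.

```python
import re

# Hypothetical variable-length motif and toy sequence (not taken from ELM).
motif = "A{2,3}"
sequence = "AAAA"

# Passing the regex directly: each match consumes its characters,
# so hits that overlap an earlier match are skipped.
plain_hits = [(m.start(), m.group(0)) for m in re.finditer(motif, sequence)]

# Wrapping the regex as "(?=(regex))": the lookahead is zero-width, so the
# scan advances one position at a time and group 1 holds the hit found there.
wrapped = f"(?=({motif}))"
lookahead_hits = [(m.start(), m.group(1)) for m in re.finditer(wrapped, sequence)]

print(plain_hits)      # [(0, 'AAA')]
print(lookahead_hits)  # [(0, 'AAA'), (1, 'AAA'), (2, 'AA')]
```

Because the lookahead match itself is empty, the matched text has to be read from the capture group (group 1) rather than from the whole match.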

v0.28.2

5 months ago
  • gget info: Return a logging error message when the NCBI server fails for a reason other than a fetch failure (this is an error on the server side rather than an error with gget)
  • Replace deprecated 'text' argument to find()-type methods whenever used with dependency BeautifulSoup
  • gget elm: Remove false positive and true negative instances from returned results
  • gget elm: Add expand argument

v0.28.0

6 months ago

Co-authored-by: @anhchi172

v0.27.9

9 months ago
  • gget enrichr: Add support for background genes
  • gget search: Expand results to include synonym hits

Resolves #90, resolves #9

Co-authored-by: @anhchi172

v0.27.8

10 months ago
  • Fixed bug in gget pdb
  • Added new release argument to gget search

Also see: https://pachterlab.github.io/gget/updates.html

Co-contributor: @anhchi172

v0.27.7

11 months ago

  • Moved dependencies for modules gget gpt and gget cellxgene from automatically installed requirements to gget setup.
  • Updated gget alphafold dependencies for compatibility with Python >= 3.10.
  • Added census_version argument to gget cellxgene.

v0.27.5

1 year ago

Updated gget search to function correctly with new Pandas version 2.0.0 (released on April 3rd, 2023) as well as older versions of Pandas

Updated gget info with new flags uniprot and ncbi which allow turning off results from these databases independently to save runtime (note: flag ensembl_only was deprecated)

All gget modules now feature a -q / --quiet (Python: verbose=False) flag to turn off progress information

Co-author of this release: @anhchi172

v0.27.4

1 year ago

v0.27.3

1 year ago