Biotite Versions

A comprehensive library for computational molecular biology

v0.40.0

1 month ago

Changelog

Additions

Refactored structure.superimpose() (#526)
- Multiple fixed models are allowed
- Increased performance for multiple models
Support for BinaryCIF file format (#531)
- Added 'bcif' format to database.rcsb.fetch()
- Added structure.io.pdbx.BinaryCIFFile to parse BinaryCIF files
- Added structure.io.pdbx.CIFFile to parse CIF files with analogous API to BinaryCIFFile
- High-level PDBx API (get_structure(), get_assembly(), etc.) supports these new file classes
- Added include_bonds parameter to structure.io.pdbx.get_structure() and structure.io.pdbx.get_assembly() to parse bond information from the file
Refactored structure.info subpackage (#540)
- Decreased initial loading time when package is imported
- The component dataset is now stored as compressed BinaryCIF, which decreases the Biotite package size
- The component dataset is updated to the current version, i.e. the latest chemical components from the wwPDB are included
- The project now contains the setup_ccd.py script, enabling the user to get an up-to-date version of the component dataset

Changes

Removed structure.info.bond_order() and structure.info.bond_dataset (#540)
structure.superimpose() now returns an AffineTransformation object instead of a transformation tuple (#526)
- superimpose_apply() is deprecated in favor of AffineTransformation.apply()
structure.io.pdbx.PDBxFile is deprecated and superseded by CIFFile (#531)
structure.io.mmtf is deprecated and superseded by BinaryCIFFile (#531)
- This follows the RCSB announcement to deprecate MMTF
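
To illustrate why a transformation object with an apply() method is more convenient than a bare tuple, here is a minimal pure-Python sketch. SimpleAffineTransformation and its three fields are hypothetical names chosen for this example and are not Biotite's AffineTransformation; the decomposition into "translate to origin, rotate, translate to target" is an assumption about how such a transformation can be organized.

```python
from dataclasses import dataclass
from typing import List, Sequence

Vector = Sequence[float]


@dataclass
class SimpleAffineTransformation:
    """Hypothetical stand-in for a transformation object (not Biotite's class)."""

    center_translation: Vector   # added first, e.g. to move coordinates to the origin
    rotation: Sequence[Vector]   # 3x3 rotation matrix given as three rows
    target_translation: Vector   # added last, e.g. to move coordinates onto the target

    def apply(self, coords: Sequence[Vector]) -> List[List[float]]:
        """Apply translation, rotation and translation to every point."""
        result = []
        for point in coords:
            centered = [p + t for p, t in zip(point, self.center_translation)]
            rotated = [
                sum(r * c for r, c in zip(row, centered)) for row in self.rotation
            ]
            result.append(
                [r + t for r, t in zip(rotated, self.target_translation)]
            )
        return result


# Example: shift by -1 along x, rotate 90 degrees about z, then lift by 5 along z
transformation = SimpleAffineTransformation(
    center_translation=[-1.0, 0.0, 0.0],
    rotation=[[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    target_translation=[0.0, 0.0, 5.0],
)
moved = transformation.apply([[2.0, 0.0, 0.0]])
print(moved)
```

Because the object carries all parameters, the same transformation can be reused on any other set of coordinates, which is the role the changelog assigns to AffineTransformation.apply().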

Fixes

Handle invalid CRYST1 records in PDB files correctly (#523)
Ensure that NumPy 1.x is used (#537)
- Support for 2.x will be added in the future

v0.39.0

4 months ago

Changelog

Additions

Add build for Python 3.12 (#513)
Added modern fast k-mer subsetting methods to sequence.align (#510)
- These include:
  - MinimizerSelector
  - SyncmerSelector
  - CachedSyncmerSelector
  - MincodeSelector
- The following k-mer ordering methods are available:
  - RandomPermutation
  - FrequencyPermutation
- Added BucketKmerTable to support indexing of long k-mers with reasonable memory consumption
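
As a rough illustration of what k-mer subsetting by minimizers means, the following pure-Python sketch selects, for every window of consecutive k-mers, the smallest k-mer. It does not use Biotite; select_minimizers() is a hypothetical helper, and plain lexicographic order stands in for the ordering methods listed above.

```python
from typing import List, Tuple


def select_minimizers(sequence: str, k: int, window: int) -> List[Tuple[int, str]]:
    """Return (position, k-mer) for the minimizer of each window.

    `window` is the number of consecutive k-mers per window.
    K-mers are compared lexicographically (an assumption of this sketch);
    ties are resolved in favor of the leftmost k-mer.
    """
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    selected: List[Tuple[int, str]] = []
    for start in range(len(kmers) - window + 1):
        # min() returns the first minimal element, i.e. the leftmost one
        pos = min(range(start, start + window), key=lambda i: kmers[i])
        # Adjacent windows often share their minimizer: report it only once
        if not selected or selected[-1][0] != pos:
            selected.append((pos, kmers[pos]))
    return selected


minimizers = select_minimizers("TCGACGTA", k=3, window=3)
print(minimizers)
```

Only two of the six k-mers are kept in this example, which shows how subsetting reduces the number of k-mers that need to be indexed.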
Support conversion of biotite.sequence.align.Alignment from/to CIGAR strings (#516)
- read_alignment_from_cigar()
- write_alignment_to_cigar()
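
For readers unfamiliar with CIGAR strings, the sketch below shows the idea behind such a conversion in pure Python. It is not the Biotite implementation: gapped_to_cigar() is a hypothetical helper that takes two gapped strings, and it only emits the M, I and D operations.

```python
from itertools import groupby


def gapped_to_cigar(reference: str, query: str) -> str:
    """Encode a pairwise alignment, given as two gapped strings, as CIGAR.

    M: aligned column (match or mismatch)
    I: insertion in the query (gap in the reference)
    D: deletion in the query (gap in the query)
    """
    if len(reference) != len(query):
        raise ValueError("Gapped sequences must have equal length")
    ops = []
    for ref_symbol, query_symbol in zip(reference, query):
        if ref_symbol == "-" and query_symbol == "-":
            raise ValueError("Alignment column contains only gaps")
        if ref_symbol == "-":
            ops.append("I")
        elif query_symbol == "-":
            ops.append("D")
        else:
            ops.append("M")
    # Run-length encode consecutive identical operations
    return "".join(f"{len(list(group))}{op}" for op, group in groupby(ops))


cigar = gapped_to_cigar("ACG-TTA", "ACGATT-")
print(cigar)
```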
Added sequence.graphics.plot_alignment_array() (#485)
Support new 5-character residue names in structures from PDB (#512)
Support NCBI API keys in database.entrez to increase download limits (#514)
Increased performance of application.sra (#504)
- prefetch is called before fasterq_dump, as suggested here
- Added FastaDumpApp, which decreases computation time by writing a FASTA file instead of a FASTQ file, omitting the scores

Changes

application.sra.FastaDumpApp.get_sequences() now returns only sequence strings and no longer returns scores (#504)
- Use get_sequences_and_scores() instead

Fixes

Fixed memory leak in sequence.align.KmerTable.from_tables() (#510)
Fixed problems of plotting functionalities with recent Matplotlib versions (#518)

v0.38.0

8 months ago

Changelog

Additions

Faster k-mer decomposition in sequence.align.KmerAlphabet.create_kmers() (#475)
Sequence type can be set when reading sequences and alignments using sequence.io.fasta (#478)

Fixes

Fixed error that appeared when indexing a sequence.AnnotatedSequence with a slice (#479)
Fixed reading MOL/SDF files with more than 100 bonds (#480)
Fixed compilation of Biotite with Cython 3.x (#493)
Fixed usage of box parameter in structure.rdf() (#494)

v0.37.0

1 year ago

Changelog

Additions

Added PubChem database interface with database.pubchem (#472)
- Analogous to the other database subpackages, it supports search() and fetch()
- fetch_property() can be used to quickly obtain a wide range of properties for a given list of compound IDs
- Automatic throttle control ensures that the PubChem usage control is obeyed
Extended functionality for database.rcsb.search() and database.rcsb.count() (#466):
- Added support for computational structures (e.g. from AlphaFold DB) via the content_types parameter
- Added support for grouping via the new group_by and return_groups parameters
  - The type of grouping is selected via Grouping subclasses
- Added support for ascending sorting with the Sorting class
database.entrez.search() now also accepts the common database name in addition to the E-utility database name (#471)
- This is now consistent with the behavior in database.entrez.fetch()
Added structure.io.pdb.PDBFile.get_b_factor() analogous to structure.io.pdb.PDBFile.get_coord() (#469)
Added structure.io.pdbx.get_component() and set_component() (#468)
- Allows getting/setting chemical components from/to PDBx files via their chem_comp group of categories instead of atom_site

Changes

Deprecate atom_mask parameter in structure.connect_via_residue_names() and structure.connect_via_distances() (#474)
- It has no effect anymore
In structure.BondList.merge() the BondList given as parameter takes precedence, if both BondLists contain the same bond with different BondType (#473)
- Previously it was the other way round
The BondList returned by structure.io.pdb.PDBFile.get_structure() (if include_bonds is True) gives appropriate BondTypes, if they can be determined using the CCD (#473)
- Otherwise the BondType is BondType.ANY
- Previously it was BondType.ANY for all bonds
Refactored structure.remove_pbc() (#460)
- PBC removal is conducted for each molecule separately
- Not the first atom but the centroid of a molecule is placed within the box
- The selection can only be a boolean matrix

Fixes

Fixed a bug in structure.connect_via_distances() and structure.connect_via_residue_names() that allowed unexpected bonds between polymer and non-polymer residues (#473)

v0.36.1

1 year ago

Changelog

Fixes

Fixed parsing of remarks < 100 in structure.io.PDBFile (#457)
Bonds can now be read and written using hybrid-36 encoding in structure.io.PDBFile (#456)
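
Hybrid-36 encoding extends fixed-width decimal columns, such as the 5-character atom serial in PDB files, by switching to base-36 once the decimal range is exhausted. The following pure-Python sketch illustrates the encoding direction only; encode_hybrid36() is a hypothetical helper and not Biotite's implementation, and negative numbers are not handled here.

```python
DIGITS_UPPER = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS_LOWER = DIGITS_UPPER.lower()


def _to_base36(value: int, digits: str) -> str:
    """Convert a non-negative integer into base-36 using the given digits."""
    if value == 0:
        return digits[0]
    chars = []
    while value > 0:
        value, remainder = divmod(value, 36)
        chars.append(digits[remainder])
    return "".join(reversed(chars))


def encode_hybrid36(value: int, width: int) -> str:
    """Encode a non-negative integer into a hybrid-36 field of `width` characters."""
    if value < 0:
        raise ValueError("Negative values are not covered by this sketch")
    decimal_limit = 10 ** width
    if value < decimal_limit:
        # Plain right-justified decimal number, as in classic PDB files
        return str(value).rjust(width)
    value -= decimal_limit
    block = 26 * 36 ** (width - 1)   # amount of numbers per letter case
    offset = 10 * 36 ** (width - 1)  # ensures the first character is a letter
    if value < block:
        return _to_base36(value + offset, DIGITS_UPPER)
    value -= block
    if value < block:
        return _to_base36(value + offset, DIGITS_LOWER)
    raise ValueError("Value is too large for the given width")


print(encode_hybrid36(99999, 5))
print(encode_hybrid36(100000, 5))
```

Since the first character of a base-36 field is always a letter, a reader can distinguish it from a plain decimal field without any additional flag.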

v0.36.0

1 year ago

Changelog

Additions

Added Python 3.11 build
Better support for macromolecular assemblies and symmetry mates (#450)
- biotite.structure.io.pdb and biotite.structure.io.mmtf now support parsing of assemblies via list_assemblies() and get_assembly()
- biotite.structure.io.pdb is able to parse all atoms within a single unit cell via get_symmetry_mates()
Added structure.rmspd() to compute the root-mean-square-pairwise-deviation
- This is a method to determine deviations between two models without the need for prior structure superimposition
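
The reason no superimposition is needed is that pairwise distances within a model do not change under rotation or translation. The pure-Python sketch below illustrates this. It is not Biotite's implementation: rmspd() here is a simplified stand-in, and the normalization by the squared number of atoms is an assumption of this sketch.

```python
import math
from typing import List, Sequence

Coord = Sequence[float]


def pairwise_distances(coords: Sequence[Coord]) -> List[List[float]]:
    """Compute the full matrix of Euclidean distances between all points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def rmspd(reference: Sequence[Coord], subject: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two pairwise distance matrices."""
    if len(reference) != len(subject):
        raise ValueError("Both models must have the same number of atoms")
    n = len(reference)
    if n == 0:
        raise ValueError("Models must contain at least one atom")
    ref_dist = pairwise_distances(reference)
    sub_dist = pairwise_distances(subject)
    squared_sum = sum(
        (ref_dist[i][j] - sub_dist[i][j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    # Normalization by n^2 is an assumption of this sketch
    return math.sqrt(squared_sum / n ** 2)


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
# The same triangle, rotated by 90 degrees about z and translated
rotated = [(5.0, 5.0, 5.0), (5.0, 6.0, 5.0), (4.0, 5.0, 5.0)]
print(rmspd(reference, rotated))
```

The rotated and translated copy yields a deviation of zero (up to floating-point error), whereas an RMSD without superimposition would report a large value for the same pair.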
Refactored structure.annotate_sse() (#448)
- Higher performance due to more vectorization
- Multiple chains can be processed at once
More granular macromolecule filters in structure subpackage (#436)
- Added filter_peptide_backbone() and filter_phosphate_backbone() to filter backbone atoms of proteins and nucleotides, respectively
- Added filter_linear_bond_continuity() that filters atoms that are within distance boundaries to the next atom
- Added filter_polymer() that filters biomacromolecules of the given type (peptide, nucleotide, carbohydrate) and minimum length
More integrity checks in structure subpackage (#436)
- check_linear_continuity() gives positions in a structure where atoms are not within distance boundaries to the next atom
- check_backbone_continuity() does the same exclusively for peptide/nucleotide backbone atoms
Added sequence.common_alphabet() to determine the Alphabet from a list of alphabets that extends all other alphabets from this list (#446)
sequence.phylo.Tree.to_newick() and sequence.phylo.TreeNode.to_newick() allow rounding of distance labels (#439)
application.TantanApp is able to process multiple sequences in a single call (#446)
- This significantly improves the performance especially for short sequences

Changes

structure.filter_backbone() is deprecated and replaced by filter_peptide_backbone() (#436)
structure.check_bond_continuity() is deprecated and replaced by check_backbone_continuity() (#436)
Deprecated chain_id parameter in structure.annotate_sse(), multiple chains can now be processed at once (#448)

Fixes

structure.CellList accepts empty query coordinates in get_atoms() and get_atoms_in_cells() (#448)
Fixed padding of CRYST1 records to 80 instead of 70 characters (#453)
Fixed an issue where application.dssp.DSSPApp did not give the correct number of secondary structure elements for multi-chain structures (#444)
Resolved MemoryError in structure.repeat_box() (#450)

v0.35.0

1 year ago

Changelog

Additions

Support stack-wise iteration over trajectory files (#420)
Support Path objects in File.read()
Improved filters for different types of residues in structure subpackage (#425)
- filter_amino_acids() now also filters for non-canonical amino acids
- filter_nucleotides() uses an updated list of nucleotides
- New filter_carbohydrates() filters for saccharides
- filter_canonical_amino_acids() and filter_canonical_nucleotides() filter the respective canonical residues
- New structure.info.carbohydrate_names() and structure.info.amino_acid_names() give a list of residue names considered as carbohydrates and amino acids, respectively
application.LocalApp now supports input to STDIN
Improved ViennaRNA interfaces (#435)
- Added application.viennarna.RNAalifoldApp interface to RNAalifold
- Secondary structure constraints can be given to application.viennarna.RNAfoldApp and application.viennarna.RNAalifoldApp

Changes

The residues that are recognized by structure.filter_amino_acids() have changed (see above)
Deprecated application.viennarna.RNAfoldApp.get_mfe() and replaced it by application.viennarna.RNAfoldApp.get_free_energy()

Fixes

Support PDB format dialect with inverted charge column (X+ instead of +X) in structure.io.PDBFile (#421)
Fixed erroneous atom parsing in structure.io.mmtf.MMTFFile, if an MMTF file has multiple different groupType entries for the same residue name and the same number of atoms (#426)
Fixed angle condition in structure.base_stacking() (#432)
Fixed TypeError in application.muscle.Muscle5App
Fixed bond_line_style parameter in structure.graphics.plot_secondary_structure()
Fixed error in pseudoknots() and base_pairs_from_dot_bracket() in cases where the secondary structure has no base pairs
Updated identification of error messages from server in database.entrez.fetch()

v0.34.1

1 year ago

Fixes

Support for new UniProt REST API (#409)
Preserve lower-case chain IDs when an AtomArray is read from PDB and PDBQT files (#413)
application.vina.VinaApp now supports docking of molecules containing certain metal elements

v0.34.0

1 year ago

Changelog

Additions

Support for new RCSB search API (#408)
- Added case_sensitive parameter in database.rcsb.FieldQuery
structure.info.mass() supports deuterium
structure.connect_via_distances() can connect atoms over periodic boundaries
Added more chain-level utilities consistent with residue-level utilities
- structure.apply_chain_wise()
- structure.spread_chain_wise()
- structure.get_chain_masks()
- structure.get_chain_starts_for()
- structure.get_chain_positions()
structure.superimpose() also supports pure coordinates

Changes

structure.hbond() uses an associated structure.BondList to find the hydrogen atoms bonded to potential hydrogen bond donors
Lines depicting bonds in structure.graphics.plot_atoms() and structure.graphics.plot_ball_and_stick_model() use rounded tips

Fixes

Fixed structure.io.pdbx.get_assembly() missing chains in some structures (#387)
Added a more meaningful error, if Matplotlib is required, but not installed (#302)
Added more descriptive error, if a structure.io.pdb.PDBFile has erroneous atom IDs (#379)
structure.io.pdb.PDBFile always pads lines to 80 characters
Allow empty attribute string in sequence.io.GFFFile
Fixed wrong similarity scores, if a sequence.align.SubstitutionMatrix with two different alphabets is read from string or file
Fixed application.mafft.MafftApp runs for more than 10 sequences

v0.33.0

2 years ago

Changelog

Additions

Added application.muscle.Muscle5App to support the changed CLI of Muscle 5
Added structure.orient_principal_components() to orient atom coordinates to the given axes
biotite.structure.io.pdbx.get_structure() uses label_xxx or auth_xxx field as fallback, if the respective other one is not available
Added default_bond_type parameter to biotite.structure.io.write_structure_to_ctab() and biotite.structure.connect_via_distances() to allow the user to change the BondType in the generated BondList

Fixes

sequence.io.gff.GFFFile.read() is now able to read GFF records with trailing tabs
Fixed DeprecationWarning in structure.align_vectors() (#295)
Fixed alignment in atom name column in structure.io.pdb.PDBFile.write()
Fixed error handling in structure.index_xxx() functions, if invalid input shape is given
Ensured quoted values in looped categories will not be truncated in structure.io.pdbx.PDBxFile.set_category()