MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.
- --profile_vsc parameter (together with --vsc_out and --vsc_breadth) enables the profiling of viral sequence clusters
- --subsampling now subsamples the FASTQ files rather than the mapping results
- --mapping_subsampling parameter enables the previous mapping subsampling behaviour
- --subsampling_output parameter enables saving the subsampled FASTQ file
- create_toy_database.py script enables custom filtering of the MetaPhlAn databases
- merge_metaphlan_profiles.py script
- -t rel_ab_w_read_stats now produces the read stats also at the SGB level
- "No markers were found for the clade" error while executing StrainPhlAn without providing the clade markers FASTA file

- --subsampling parameter allows subsampling of reads on the fly
- --subsampling_seed parameter enables deterministic or randomized subsampling of the reads
- --gtdb_profiles parameter of merge_metaphlan_profiles.py allows the merging of GTDB-based MetaPhlAn profiles
- --breadth_thres parameter allows StrainPhlAn to filter the consensus marker sequences after the execution of sample2markers.py
- --non_interactive parameter disables user interaction when running StrainPhlAn
- --abs_n_markers_thres and --abs_n_samples_thres parameters enable the sample/marker filtering thresholds to be specified in absolute numbers
- --treeshrink parameter enables StrainPhlAn to run TreeShrink for outlier removal in the tree
- VallesColomerM_2022_Jan21_thresholds.tsv for compatibility with the mpa_vJan21 database
- --clades parameter enables sample2markers.py to restrict the reconstruction of markers to the specified clades
- -c parameter of the extract_markers.py script now allows the specification of multiple clades
- --print_clades_only parameter now produces an output print_clades_only.tsv report
- strain_transmission.py script now uses the VallesColomerM_2022_Jan21_thresholds.tsv thresholds by default
- metaphlan2krona.py and hclust2 have been added to the bioconda recipe

Changes in version 4.0.1
MetaPhlAn 4 relies on ~5.1M unique clade-specific marker genes identified from ~1M microbial genomes (~236,600 references and 771,500 metagenomic assembled genomes) spanning 26,970 species-level genome bins (SGBs), 4,992 of them taxonomically unidentified at the species level.
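As a quick sanity check on the figures quoted above, the following Python sketch recomputes the total genome count and the share of SGBs without a species-level label. The variable names are illustrative and not part of MetaPhlAn; the inputs are the approximate values given in the text, so the results are approximate as well.

```python
# Approximate figures as quoted in the paragraph above.
reference_genomes = 236_600
metagenomic_assembled_genomes = 771_500
total_sgbs = 26_970
unidentified_sgbs = 4_992

# Total genomes used for marker identification (the text rounds this to ~1M).
total_genomes = reference_genomes + metagenomic_assembled_genomes

# SGBs that do carry a species-level taxonomic label.
identified_sgbs = total_sgbs - unidentified_sgbs

# Share of SGBs that are taxonomically unidentified at the species level.
unidentified_fraction = unidentified_sgbs / total_sgbs

print(f"Total genomes: {total_genomes:,}")
print(f"SGBs with a species-level label: {identified_sgbs:,}")
print(f"Unidentified SGBs: {unidentified_fraction:.1%}")
```

The sum comes to 1,008,100 genomes, consistent with the "~1M" in the text, and roughly 18.5% of the SGBs lack a species-level label.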
What's new in version 4
--unclassified_estimation
--mpa3
MetaPhlAn 4 relies on ~5.1M unique clade-specific marker genes identified from ~1M microbial genomes (~236,600 references and 771,500 metagenomic assembled genomes) spanning 26,970 species-level genome bins (SGBs, http://segatalab.cibio.unitn.it/data/Pasolli_et_al.html), 4,992 of them taxonomically unidentified at the species level.
What was changed in version 4:
MetaPhlAn 3.1 represents a moderate update to the bioBakery 3 software and databases. This update is based on improvements in our ability to align NCBI-sourced microbial genomes and their constituent genes to UniProt resources alongside additional removal of low-quality species (see PMID:33944776 for the definition of “low-quality species”).
What has changed in version 3.1: