HMM-based identification and categorization of iron genes and iron gene operons in genomes and metagenomes
Please see the Wiki page for an introduction and a tutorial on how to use this tool.
Garber AI, Nealson KH, Okamoto A, McAllister SM, Chan CS, Barco RA and Merino N (2020) FeGenie: A Comprehensive Tool for the Identification of Iron Genes and Iron Gene Neighborhoods in Genome and Metagenome Assemblies. Front. Microbiol. 11:37. doi: 10.3389/fmicb.2020.00037
Special thanks to Michael Lee for helping to put together the Conda environment for FeGenie. Thanks to Natasha Pavlovikj for creating the Conda recipe for FeGenie. Thanks to Michał Sitko for creating a Dockerfile for FeGenie.
conda create -n fegenie -c conda-forge -c bioconda -c defaults fegenie=1.0 --yes
conda activate fegenie
FeGenie.py -h
When you are done using FeGenie, deactivate its Conda environment:
conda deactivate
git clone https://github.com/Arkadiy-Garber/FeGenie.git
cd FeGenie
bash setup.sh
./FeGenie.py -h
FeGenie.py -bin_dir /directory/of/bins/ -bin_ext fasta -t 16
The argument for -bin_ext needs to represent the filename extension of the FASTA files in the selected directory that you would like analyzed (e.g. fa, fasta, fna, etc).
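If you are unsure which extension your bins use, a quick tally of the extensions present in the directory can help. The following is a minimal sketch and not part of FeGenie; `list_extensions` is a hypothetical helper name, and it assumes a bash shell with the standard `sort`, `uniq` and `awk` utilities.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with FeGenie): print each filename
# extension found in a directory, followed by the number of files using it.
list_extensions() {
    local dir="$1"
    local f name
    for f in "$dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        name="${f##*/}"
        case "$name" in
            # Only files that actually have an extension are counted.
            *.*) printf '%s\n' "${name##*.}" ;;
        esac
    done | sort | uniq -c | awk '{print $2, $1}'
}

# Example (path taken from the command above):
#   list_extensions /directory/of/bins/
```

The extension with the expected file count is the value to pass to -bin_ext, without the leading dot.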
./FeGenie.py -bin_dir /directory/of/bins/ -bin_ext fasta -t 16 -out output_fegenie
The hmms/iron directory can be found within FeGenie's main repository.
-t 16 means that 16 threads will be used for HMMER and BLAST. If fewer than 16 are available on your system, set this number lower (default = 1).
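One way to avoid requesting more threads than the machine has is to cap the value passed to -t at the detected CPU count. This is only a sketch under stated assumptions: `pick_threads` is a hypothetical helper, not a FeGenie feature, and CPU detection relies on `nproc` (GNU coreutils) with `getconf _NPROCESSORS_ONLN` as a fallback.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of FeGenie): print the smaller of the
# requested thread count and the number of CPUs available.
# Usage: pick_threads WANTED [AVAILABLE]
pick_threads() {
    local wanted="$1"
    # If AVAILABLE is not given, ask the system for the online CPU count.
    local available="${2:-$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)}"
    if [ "$wanted" -gt "$available" ]; then
        printf '%s\n' "$available"
    else
        printf '%s\n' "$wanted"
    fi
}

# Example, reusing the arguments from the command above:
#   FeGenie.py -bin_dir /directory/of/bins/ -bin_ext fasta -t "$(pick_threads 16)"
```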
FeGenie introductory slideshow:
FeGenie video tutorial:
To start the tutorial, hit the 'launch binder' button below, and follow the commands in 'Walkthrough'
(Initially forked from here. Thank you to the awesome binder team!)
Enter the main FeGenie directory
cd FeGenie
Print the FeGenie help menu
FeGenie.py -h
Run FeGenie on the test dataset
FeGenie.py -bin_dir genomes/ -bin_ext fna -out fegenie_out
Go into the output directory and check out the output files
cd fegenie_out
less FeGenie-geneSummary-clusters.csv
Run FeGenie on gene calls
FeGenie.py -bin_dir ORFs/ -bin_ext faa -out fegenie_out --orfs
Run FeGenie on gene calls, and use a reference database (RefSeq sub-sample) for cross-validation
FeGenie.py -bin_dir ORFs/ -bin_ext faa -out fegenie_out --orfs -ref refseq_db/refseq_nr.sample.faa
When running FeGenie with Docker, the only dependency you need to have installed is Docker itself (installation guide). With Docker installed, you can run FeGenie in the following way:
docker run -it -v $(pwd):/data --env iron_hmms=/data/hmms/iron --env rscripts=/data/rscripts note/fegenie-deps ./FeGenie.py -bin_dir /data/test_dataset -bin_ext txt -out fegenie_out -t $(nproc)
The ./FeGenie.py ... portion follows the normal, non-dockerized flow of arguments.
Beware that you need to mount the directories that contain the files FeGenie is supposed to read. If you are not familiar with Docker, run the docker run command from the directory into which you cloned the FeGenie repository. If all the files you pass to FeGenie are inside this directory and you use relative file paths (e.g. hmms/iron), everything will work just fine.
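Before launching the container, it can be worth confirming that every path you plan to pass really sits under the directory being mounted. The check below is a rough sketch, not something FeGenie or Docker provides; `in_mounted_tree` is a hypothetical helper, and it assumes the mount root is the current directory unless a second argument is given.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if PATH exists and resolves to a location
# inside ROOT (default: the current directory, i.e. what $(pwd) mounts).
# Usage: in_mounted_tree PATH [ROOT]
in_mounted_tree() {
    local path="$1" root="${2:-$PWD}"
    local abs_root abs_dir
    abs_root=$(cd "$root" 2>/dev/null && pwd -P) || return 1
    if [ -d "$path" ]; then
        abs_dir=$(cd "$path" 2>/dev/null && pwd -P) || return 1
    else
        # For a regular file, resolve the directory that contains it.
        [ -e "$path" ] || return 1
        abs_dir=$(cd "$(dirname "$path")" 2>/dev/null && pwd -P) || return 1
    fi
    # Everything is inside the filesystem root.
    [ "$abs_root" = "/" ] && return 0
    # Trailing slash prevents /data/repo from matching /data/repo2.
    case "$abs_dir/" in
        "$abs_root"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example, run from the cloned FeGenie directory:
#   in_mounted_tree hmms/iron && echo "hmms/iron will be visible in the container"
```

A path that fails this check would need its own -v mount, or should be copied into the repository directory first.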