Download Provided Dataset
Download PhyloFisher’s Provided Database
- Retrieve the provided starting database via wget:
wget https://ndownloader.figshare.com/files/29093409
- Uncompress the .tar.gz file:
tar -xzvf 29093409
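Because the download is saved under a numeric name, it can help to store it under a descriptive name and to look inside the archive before unpacking it. The following is a minimal sketch, not part of PhyloFisher itself: the filename `phylofisher_database.tar.gz` and the helper `list_top_level` are hypothetical names chosen for illustration.

```shell
# Hypothetical local filename; PhyloFisher does not require any particular name.
archive="phylofisher_database.tar.gz"

# list_top_level ARCHIVE
# Print the unique top-level entries of a gzipped tarball without extracting it.
list_top_level() {
    tar -tzf "$1" | cut -d/ -f1 | sort -u
}

# Typical use (commented out so that sourcing this sketch has no side effects):
#   wget -O "$archive" https://ndownloader.figshare.com/files/29093409
#   list_top_level "$archive"
#   tar -xzvf "$archive"
```

Listing the top-level entries first shows which directory the archive will create in the current working directory.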
The uncompressed database directory contains the subdirectories and files detailed below.
- database/
  - backups/
    - {Month}_{Day}_{Year}.tar.gz: a compressed file containing backups of the two database folders orthologs/ and paralogs/, the database file metadata.tsv, and tree_colors.tsv
  - datasetdb/
    - datasetdb.dmnd: a diamond database of the orthologs
    - datasetdb.fasta: a fasta file of the orthologs used to construct the diamond database
  - orthologs/
    - {gene_name}.fas: 240 fasta files of the orthologs
  - orthomcl/
    - bacterial: abbreviated names of bacterial species present in OrthoMCL
    - gene_og: a tsv file detailing the names of the 240 genes and their corresponding OrthoMCL orthogroup number(s)
    - orthomcl.diamonddb.dmnd: diamond database of OrthoMCL
  - paralogs/
    - {gene_name}_paralogs.fas: 240 fasta files of the paralogs
  - profiles/
    - {gene_name}.hmm: 240 profile hmm files of the orthologs
  - proteomes/
    - {Unique_ID}.faa.tar.gz: complete proteomes of all taxa in the database
  - metadata.tsv: a tsv file containing metadata for the species in the database. It is described extensively in the section “Detailed Explanation of the PhyloFisherDatabase_v1.0 Metadata File” of this manual
  - tree_colors.tsv: a comma separated file used to color taxa based on taxonomy for manual inspection of single gene trees
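After extraction, the layout described above can be checked with a few lines of shell. This is a sketch under the assumption that the directory matches the listing; `check_db` and `count_matching` are hypothetical helper names, not PhyloFisher commands.

```shell
# check_db DIR
# Return 0 if DIR contains every subdirectory and top-level file listed above;
# otherwise report what is missing on stderr and return 1.
check_db() {
    local db="$1" status=0 entry
    for entry in backups datasetdb orthologs orthomcl paralogs profiles proteomes; do
        if [ ! -d "$db/$entry" ]; then
            echo "missing directory: $entry" >&2
            status=1
        fi
    done
    for entry in metadata.tsv tree_colors.tsv; do
        if [ ! -f "$db/$entry" ]; then
            echo "missing file: $entry" >&2
            status=1
        fi
    done
    return "$status"
}

# count_matching DIR PATTERN
# Count the regular files directly inside DIR whose names match PATTERN.
count_matching() {
    find "$1" -maxdepth 1 -type f -name "$2" | wc -l | tr -d ' '
}

# Typical use (commented out so that sourcing this sketch has no side effects):
#   check_db database && echo "layout looks complete"
#   count_matching database/orthologs '*.fas'
#   count_matching database/paralogs  '*_paralogs.fas'
#   count_matching database/profiles  '*.hmm'
```

According to the listing, each of the three counts should be 240 for the provided starting database; a different number suggests an incomplete download or extraction.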