Bacterial isolates from the honey bee (Apis mellifera) gut were analyzed for the presence of prophages using a combination of phage analysis programs: VirSorter2 (v2.2.2), CheckV (v0.7.0), and VIBRANT (v1.2.1). Proteins were predicted using Prokka (v1.14.6) and initially functionally annotated using the PHROGs database (https://phrogs.lmge.uca.fr/index.php, downloaded from http://millardlab.org/2021/11/21/phage-annotation-with-phrogs/ on July 20, 2022). Proteins were then additionally annotated using hidden Markov model (HMM) profiles from the eggNOG v5 (http://eggnog5.embl.de) databases.
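The eggNOG annotation in this dataset keeps the best hmmsearch hit per protein at an E-value cutoff of 0.001 (see the description of hmm_cleaned_v2.csv). The following is a minimal sketch of that kind of best-hit selection, not the authors' code. It assumes the hits were written in HMMER's tabular (`--tblout`) layout, where the first, third, and fifth whitespace-separated fields are the protein, the profile, and the full-sequence E-value. The function name `best_hits` and the example rows are hypothetical, and the rows are truncated to the first few columns for brevity.

```python
import io


def best_hits(tblout_lines, max_evalue=1e-3):
    """Return {protein_id: (profile_name, evalue)}, keeping the lowest E-value hit."""
    best = {}
    for line in tblout_lines:
        # Skip blank lines and the '#' comment/header lines HMMER writes.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        protein, profile, evalue = fields[0], fields[2], float(fields[4])
        if evalue > max_evalue:
            continue  # fails the E-value cutoff
        if protein not in best or evalue < best[protein][1]:
            best[protein] = (profile, evalue)
    return best


# Hypothetical, truncated rows for illustration only (not taken from the dataset).
example = io.StringIO(
    "# target name  accession  query name  accession  E-value  score  bias\n"
    "protA  -  profile1  -  1e-20  70.1  0.1\n"
    "protA  -  profile2  -  1e-05  25.3  0.0\n"
    "protB  -  profile3  -  0.5    5.0   0.0\n"
)
hits = best_hits(example)
```

Here `protA` keeps only its lower E-value hit, and `protB` is dropped because its single hit does not pass the cutoff.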
Associated nucleotide/amino acid sequences and functional annotations of prophages predicted in bacterial isolates are included as files here.
Associated code is available on GitHub: https://github.com/ebueren/HBProphage_PeerJ
This data is associated with "Characterization of prophages in bacterial genomes from the honey bee (Apis mellifera) gut microbiome", currently in review at PeerJ.
Publisher
University Libraries, Virginia Tech
Corresponding Author Name
Emma Bueren
Files/Folders in Dataset and Description
[hmm_cleaned_v2.csv] - Complete functional annotation of all predicted prophage sequences from honey bee gut isolates. Combines annotation results from the PHROGs database (downloaded July 20, 2022), assigned using Prokka (e-value < 10^-6), with the best hit to hidden Markov profiles from the eggNOG 5.0 databases of bacteria, archaea, eukarya, and viruses, identified using hmmsearch (-E 0.001) from HMMER.
[hmm_cleaned_datadictionary_v2.csv] - Descriptions of each column in hmm_cleaned_v2.csv
[dc525_phrogs_prokka_prot.faa] - Amino acid sequences of proteins from predicted prophages of bacterial isolates associated with A. mellifera. Protein predictions were made using Prokka and the PHROGs database.
[dc525_rd3.fna] - Nucleotide sequences of predicted prophage regions from bacterial isolates associated with A. mellifera.
Prophage regions were predicted by analyzing bacterial genomes with VirSorter2. Putative regions were trimmed with CheckV. Regions were retained as high-confidence prophage sequences if VirSorter2 initially scored the sequence at 0.9 or above, or if the CheckV-trimmed region was additionally confirmed as viral by VIBRANT.
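The retention rule above can be written as a simple either/or check. This is a minimal sketch of that logic only, not the authors' pipeline code. The `Region` class, its field names, and the example values are hypothetical, and the sketch assumes the VIBRANT call on each CheckV-trimmed region has already been reduced to a true/false flag.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    virsorter2_score: float   # initial VirSorter2 score for the sequence
    vibrant_confirmed: bool   # True if VIBRANT called the CheckV-trimmed region viral


def is_high_confidence(region, min_score=0.9):
    """Keep a region if VirSorter2 scored it >= min_score, or VIBRANT confirmed it."""
    return region.virsorter2_score >= min_score or region.vibrant_confirmed


# Hypothetical regions for illustration only (not taken from the dataset).
regions = [
    Region("r1", 0.95, False),  # passes on VirSorter2 score alone
    Region("r2", 0.60, True),   # below 0.9, but confirmed by VIBRANT
    Region("r3", 0.60, False),  # fails both criteria
]
kept = [r.name for r in regions if is_high_confidence(r)]
```

With these example values, `r1` and `r2` are retained and `r3` is discarded. A score of exactly 0.9 passes, matching the "0.9 or above" wording.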