Virginia Tech

HBProphage PeerJ 2023

posted on 2023-03-03, 18:21 authored by Emma Bueren

Bacterial isolates from the honey bee (Apis mellifera) gut were analyzed for the presence of prophages using a combination of phage analysis programs: VirSorter2 (v2.2.2), CheckV (v0.7.0), and VIBRANT (v1.2.1). Proteins were predicted using Prokka (v1.14.6) and initially functionally annotated using the PHROGs database (downloaded on July 20, 2022). Proteins were then additionally annotated using hidden Markov profiles from the eggNOG v5 databases.

Associated nucleotide/amino acid sequences and functional annotations of prophage predicted in bacterial isolates are included as files here.

Associated code is available at github:

This data is associated with "Characterization of prophages in bacterial genomes from the honey bee (Apis mellifera) gut microbiome", currently in review at PeerJ. 


NSF MCB-1817736



University Libraries, Virginia Tech

Corresponding Author Name

Emma Bueren

Corresponding Author E-mail Address

Files/Folders in Dataset and Description

[hmm_cleaned_v2.csv] - Complete functional annotation of all predicted prophage sequences from honey bee gut isolates. Combines annotation results from the PHROGs database (downloaded July 20, 2022), assigned using Prokka (e-value < 10^-6), with the best hit to hidden Markov profiles from the eggNOG 5.0 databases of bacteria, archaea, eukarya, and viruses, found using hmmsearch (-E 0.001) from HMMER.

[hmm_cleaned_datadictionary_v2.csv] - Descriptions of each column in hmm_cleaned_v2.csv.

[dc525_phrogs_prokka_prot.faa] - Amino acid sequences of predicted prophages from bacterial isolates associated with A. mellifera. Protein predictions were made using Prokka and the PHROGs database.

[dc525_rd3.fna] - Nucleotide sequences of predicted prophage regions from bacterial isolates associated with A. mellifera. Prophage regions were predicted by analyzing bacterial genomes with VirSorter2. Putative regions were trimmed with CheckV. Regions were retained as high-confidence prophage sequences if VirSorter2 initially scored the sequence at 0.9 or above, or if the CheckV-trimmed regions were additionally confirmed as viral by VIBRANT.
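
The retention rule described for dc525_rd3.fna can be illustrated with a minimal Python sketch. It is not the authors' code: the Region class, its field names, and the example records are hypothetical and only restate the rule as written above (keep a region if its VirSorter2 score is 0.9 or above, or if its CheckV-trimmed form was confirmed as viral by VIBRANT). The actual output formats of VirSorter2, CheckV, and VIBRANT are not parsed here.

```python
from dataclasses import dataclass

# Threshold stated in the dataset description.
VS2_THRESHOLD = 0.9


@dataclass
class Region:
    """Hypothetical record for one putative prophage region."""
    name: str
    virsorter2_score: float   # initial VirSorter2 score for the sequence
    vibrant_confirmed: bool   # CheckV-trimmed region called viral by VIBRANT


def is_high_confidence(region: Region, threshold: float = VS2_THRESHOLD) -> bool:
    """Return True if the region passes either retention criterion."""
    return region.virsorter2_score >= threshold or region.vibrant_confirmed


# Made-up example regions, for illustration only.
regions = [
    Region("region_a", 0.95, False),  # passes on VirSorter2 score
    Region("region_b", 0.62, True),   # passes on VIBRANT confirmation
    Region("region_c", 0.62, False),  # fails both criteria
    Region("region_d", 0.90, False),  # boundary case: 0.9 counts as passing
]

kept = [r.name for r in regions if is_high_confidence(r)]
print(kept)
```

Treating a score of exactly 0.9 as passing follows the "0.9 or above" wording; the associated code repository should be consulted for the exact filtering used.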


    Biological Sciences


