Journal of Virology, July 2004, p. 7036-7051, Vol. 78, No. 13
0022-538X/04/$08.00+0 DOI: 10.1128/JVI.78.13.7036-7051.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.
Aissa E. Doumbouya, Jaw-Ching Liu,
Thomas M. Merritt,|| and Jennifer S. Lanoie
Entomology and Nematology Department, University of Florida, Gainesville, Florida 32611-0620
Received 16 February 2004/ Accepted 5 April 2004
To date, a total of 23 baculovirus genomes are available in GenBank, some of which are close variants. Of these, 22 infect insects from the order Lepidoptera and 1 infects mosquitoes from the order Diptera. The comparative study of these baculovirus genomes reflects their diversity: genome size ranges from 99,657 bp in Adoxophyes orana GV (AdorGV) (78) to 178,733 bp in Xestia c-nigrum GV (XcGV) (30); the number of open reading frames (ORFs) ranges from 109 in Culex nigripalpus NPV (CuniNPV) (1) to 181 in XcGV; and the number of homologous regions (hrs) ranges from 17 in Spodoptera litura NPV (SpltMNPV) (65) to none in Cydia pomonella GV (CpGV) and AdorGV (53, 78). The number of conserved genes reported among lepidopteran baculoviruses has varied as more baculovirus genomes have become available (31). However, when CuniNPV is included in the analysis, the number is reduced to 30 genes (32).
NeseNPV has been used to control the European pine sawfly, N. sertifer, one of the most harmful insects to the coniferous forests of the northern hemisphere (41). Sawfly larvae live gregariously and resemble caterpillars. Originally from Europe, N. sertifer was accidentally brought to North America around 1925. Before the concern of viral registration, NeseNPV was successfully used by foresters, Christmas tree growers, and private landowners. Later, the virus was registered as Neochek-S and became available in the United States for use against the European pine sawfly (34). Nucleopolyhedroviruses of sawflies have been reported in 25 species (34). These viruses are highly host specific and, in contrast to most lepidopteran NPVs, replicate only in the epithelial cells of the larval midgut (22). Infection by the virus results in a reduction of larval feeding and, eventually, complete suppression of appetite long before death.
Few genes of NeseNPV had been previously sequenced (Liu and Maruniak, unpublished data; GenBank accession no. AF121349). Rohrmann et al. (72) published a description of the N-terminal sequence of the NeseNPV polyhedrin protein. The complete polyhedrin amino acid sequence was later obtained (Harris and Possee, unpublished data) and used in phylogenetic studies that placed NeseNPV as the outgroup baculovirus (11, 81).
Insight into the origin and subsequent evolution of the family Baculoviridae can be obtained by comparing the similarities and differences in the genomic contents and gene distribution of distantly related viruses. The comparison may also elucidate some properties of the baculovirus morphogenesis, structure, or genomic information that are essential for effective propagation and infectivity. In this work, we describe the genome sequence and the genetic organization of the hymenopteran baculovirus Neodiprion sertifer NPV.
Multiplex PCR. Because no clonal isolate of NeseNPV was available, a multiplex PCR procedure was used to arrange the cloned fragments, to determine if the whole genome was cloned, and to detect genome variants. The physical map of the NeseNPV DNA was assembled with a modification of the optimized multiplex PCR method (74). The genomic viral DNA was used as a template for PCRs that would allow the correct order of the clones in the genome to be determined after sequencing the PCR products. Using the initial sequence data from all the HindIII clones, unique primers were synthesized within the first 300 bp that annealed with the DNA fragment toward the end of the fragment (end-out primers). The primers were first tested by sequencing the corresponding clone. The primers were then combined in groups of three per pool in such a way that primers from opposite ends of the same fragment were together in the same pool. All possible pairwise combinations of pools were used in PCRs with the genomic NeseNPV DNA as the template under the following conditions: 1.5 mM MgCl2, 0.5 mM deoxynucleoside triphosphates, 1 ng of genomic NeseNPV DNA, 10 pmol of each primer, and 1 U of Elongase (Invitrogen) in a final 25-µl reaction volume. Cycling conditions were those described by Moraes and Maruniak (59). Two microliters of each PCR product was electrophoresed on a 0.7% agarose gel. When no PCR products were obtained, none of the six primers from the two pools tested had amplified the DNA under those conditions; hence, their corresponding fragments were in the wrong orientation or too distantly separated in the genome to amplify the DNA. When one DNA band was obtained, the PCR product was sequenced using the pooled primers previously used for the PCR amplification. When one DNA band was seen in all the lanes containing a particular pool of primers, the common pool of primers was used to sequence that PCR product.
If more than one DNA band was obtained in a single lane, the pools involved were further divided in groups of two primers each and the PCR was repeated. The sequences obtained were compared to previous data from the HindIII library. Overlapping sequences permitted the location of the correct order of the HindIII fragments on the NeseNPV physical map. The order of the HindIII fragments was further confirmed by repeating the PCR amplification using the genomic DNA as template and only the primer pair from adjacent fragments identified by the multiplex method.
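The pairwise pooling scheme described above can be sketched in a few lines. This is only an illustration of how the set of reactions grows with the number of pools; the pool names and primer names are hypothetical placeholders, and the composition of each pool is taken as given rather than derived from fragment ends.

```python
from itertools import combinations


def pairwise_pool_reactions(pools):
    """Return one reaction per unordered pair of primer pools.

    pools maps a pool name to the list of primer names in that pool.
    Each reaction is (pool_a, pool_b, all primers present in the tube).
    """
    reactions = []
    for name_a, name_b in combinations(sorted(pools), 2):
        reactions.append((name_a, name_b, pools[name_a] + pools[name_b]))
    return reactions


# Hypothetical pools of three end-out primers each (names are placeholders).
pools = {
    "P1": ["p01", "p02", "p03"],
    "P2": ["p04", "p05", "p06"],
    "P3": ["p07", "p08", "p09"],
    "P4": ["p10", "p11", "p12"],
}
reactions = pairwise_pool_reactions(pools)
```

With four pools this gives six reactions, each containing six primers, which matches the "six primers from the two pools" tested per reaction in the text.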
DNA sequence analysis. The complete NeseNPV genome was assembled with the Sequencher 4.1 program (Gene Codes) using the sequence data from every individual clone and the multiplex PCR experiments. Every base of each DNA strand was sequenced a minimum of three times; however, due to the fact that the location of every sequence had been previously determined, either by transposon mapping or by primer walking, the assembly process was not random. The ORFs encoding 50 amino acids or more and showing minimum overlapping, as has been the criteria for baculoviruses (5), were found using the ORF Finder program (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) (77). BLASTP 2.2.6 (http://www.ncbi.nlm.nih.gov/BLAST) (3), conserved domain database (CDD) (http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml) search, run by default in parallel with BLASTP and PSI-BLAST (4) programs and Smith-Waterman similarity search (http://coe02.ucalgary.ca/algo-sw/sw_aa.shtml), were used to compare all of the possible ORFs to the amino acid sequence database. The following criteria were used to accept NeseNPV ORFs as baculovirus homologs: (i) During BLASTP comparison, an E value of 0.1 or less was accepted; (ii) when the BLAST list showed similarity with related ORFs from several baculoviruses, it was considered a NeseNPV homolog even if the E value of some members of the list were higher than 0.1; (iii) if an hr overlapped with an ORF, that ORF was not considered for the analysis. For NeseNPV ORFs that showed random matches with other baculovirus ORFs (represented by an E value over 0.1), forced alignments were performed to reveal if there was significant homology not evident by BLAST or Smith-Waterman similarity search (http://coe02.ucalgary.ca/). This was also done with NeseNPV ORFs that had signal peptides and transmembrane domains in order to search for an ld130 homolog candidate. CLUSTALW (http://www.ebi.ac.uk/clustalw/) was used for this purpose and for several other amino acid alignments. 
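As a rough illustration of the ORF size criterion (50 amino acids or more), the following sketch scans all six reading frames for ATG-to-stop ORFs. It is not the ORF Finder program used in the study; the function names, the choice to count the initiator methionine, and the strand-relative coordinates are assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_aa=50):
    """List ORFs (ATG to first in-frame stop) of at least min_aa codons.

    Returns tuples (strand, start, end, n_aa). Coordinates are 0-based,
    end-exclusive, and relative to the strand that was scanned (an
    assumption of this sketch); n_aa includes the initiator methionine
    and excludes the stop codon.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG since the previous stop
                elif start is not None and codon in STOP_CODONS:
                    n_aa = (i - start) // 3
                    if n_aa >= min_aa:
                        orfs.append((strand, start, i + 3, n_aa))
                    start = None
    return orfs
```

A real annotation pipeline would additionally filter overlapping ORFs, as described in the text; that step is left out here.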
Some ORF overlapping was permitted, especially when both ORFs were homologous to other baculoviruses. The putative coding regions were numbered as NeseNPV ORFs. The sequences 160 nucleotides upstream of each ORF start codon were analyzed for early and late transcriptional motifs. Late motifs contained the sequence (A/G/T)TAAG (71). Early motifs were defined as either TATA alone (designated "e" in Table 1) or TATA with the CA(T/G)T sequence 20 to 40 nucleotides downstream ("E" in Table 1). To find possible transmembrane domains in the NeseNPV putative proteins, the TMHMM server 2.0 was used (http://www.cbs.dtu.dk/services/TMHMM-2.0). To examine the NeseNPV ORFs for N-terminal signal peptides, the SignalP 2.0 prediction server (61) was used (http://www.cbs.dtu.dk/services/SignalP). Tandem Repeats Finder (http://c3.biomath.mssm.edu/trf.html) was used to locate the hr's (6). The alignment parameters used were 2, 7, and 7 for match, mismatch, and indels, respectively, a minimum alignment score of 50 to report repeats, and a maximum period size (repeat size) of 500. Several programs from the Genetics Computer Group (Wisconsin Package Version 10.2) (18) were used to locate the palindrome sequences of NeseNPV hr's and to determine the G+C ratio of each hr and of the entire NeseNPV genome.
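The promoter motif definitions above translate naturally into a pattern search. The sketch below is one possible reading, not the procedure used in the study: the function name is hypothetical, and the 20-to-40-nucleotide spacing is assumed to be measured from the end of TATA to the start of CA(T/G)T.

```python
import re

LATE_MOTIF = re.compile(r"[AGT]TAAG")   # (A/G/T)TAAG
TATA_START = re.compile(r"(?=TATA)")    # lookahead finds overlapping TATA sites
CAKT = re.compile(r"CA[TG]T")           # CA(T/G)T


def classify_upstream(upstream):
    """Classify the region 5' of a start codon (given 5'->3').

    Only the last 160 nt are examined. Returns a dict with
    'early' = "E" (TATA plus CA(T/G)T 20-40 nt downstream),
              "e" (TATA alone) or None,
    'late'  = True if (A/G/T)TAAG is present.
    """
    seq = upstream.upper()[-160:]
    late = LATE_MOTIF.search(seq) is not None
    early = None
    for match in TATA_START.finditer(seq):
        early = early or "e"
        tata_end = match.start() + 4
        # Assumed spacing: 20 to 40 nt between the end of TATA and CA(T/G)T.
        for start in range(tata_end + 20, tata_end + 41):
            if CAKT.fullmatch(seq[start:start + 4]):
                early = "E"
    return {"early": early, "late": late}
```

Changing the spacing convention only requires adjusting the two range bounds.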
TABLE 1. NeseNPV ORFsa
GeneParity analysis. The gene order of the NeseNPV genome was compared to those of six other baculovirus genomes: AcMNPV (representing group I lepidopteran NPVs), LdMNPV, MacoMNPV-A and HzSNPV (group II lepidopteran NPVs), CpGV (lepidopteran GV), and CuniNPV (dipteran NPV). For this, a modified version of the GeneParity plot method (33) was used. Each complete genome was compared individually to that of NeseNPV. Minor modifications were done to two of the genomes, relative to the original publication. For AcMNPV (5), the comparison was made starting with the polyhedrin gene. For CuniNPV (1), the order of lef-5, 38K, ac96, and helicase in the genome was used, not the ORF number. A gene cluster was defined as any group of genes (two or more) that appeared consecutively in NeseNPV and in the comparative genome. However, some viruses do not have all of the genes in a cluster because they may be present in another location of the genome or just absent. Therefore, relaxed clusters were also determined as those in which at least two genes were consecutive in both genomes compared, but a certain number of genes (a maximum of five in one case for CuniNPV) may have been inserted between genes that appear consecutively in NeseNPV.
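The strict and relaxed cluster definitions can be illustrated with a small sketch. It is not the GeneParity implementation: the function name is hypothetical, the gene names in the usage example are toy placeholders, and two choices are assumptions of this example, namely that a cluster may run in the same or the reverse order in the compared genome, and that the allowed number of intervening genes (`max_gap`) applies to both genomes.

```python
def gene_clusters(ref_order, other_order, max_gap=0):
    """Find runs of two or more shared genes that stay together in both genomes.

    ref_order, other_order: gene names in genome order (names assumed unique).
    max_gap=0 gives strict clusters; max_gap>0 allows up to that many
    intervening genes between neighbours in either genome (relaxed clusters).
    """
    other_pos = {gene: i for i, gene in enumerate(other_order)}
    shared = [(i, other_pos[g], g) for i, g in enumerate(ref_order) if g in other_pos]

    clusters, run, direction = [], [], 0
    for ref_i, oth_i, gene in shared:
        if run:
            prev_ref, prev_oth, _ = run[-1]
            step = oth_i - prev_oth
            sign = 1 if step > 0 else -1
            linked = (
                ref_i - prev_ref <= max_gap + 1
                and abs(step) <= max_gap + 1
                and direction in (0, sign)  # keep one orientation per run
            )
            if linked:
                direction = sign
            else:
                if len(run) >= 2:
                    clusters.append([g for _, _, g in run])
                run, direction = [], 0
        run.append((ref_i, oth_i, gene))
    if len(run) >= 2:
        clusters.append([g for _, _, g in run])
    return clusters


# Toy gene orders (placeholders, not real annotations).
ref = ["A", "B", "C", "D", "u", "E", "F"]
other = ["E", "v", "w", "D", "C", "B", "A", "F"]
strict = gene_clusters(ref, other)
relaxed = gene_clusters(ref, other, max_gap=2)
```

In the toy example, genes A to D form a strict cluster in inverted order, and gene E joins it only when intervening genes are tolerated.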
Nucleotide sequence accession number. The NeseNPV genome sequence has been deposited in GenBank under accession no. AY430810.
FIG. 1. Linear map of the HindIII sites of the NeseNPV genome. The number and transcriptional direction of each ORF are labeled as a black arrow. Known baculovirus homologs are labeled in bold by the NeseNPV ORF number. Nonbaculovirus homologs are indicated in parentheses. The homologous regions (hr) are shown as triangles, and the direct repeats (dr) are shown as circles.
TABLE 2. Characteristics of baculovirus genomesa
TABLE 3. Baculovirus genes present in NeseNPV genome
Regulatory genes. Gene expression in baculoviruses follows a temporal cascade. Four genes capable of transregulating early viral gene expression that occurs prior to viral DNA replication have been identified: ie-0, ie-1, ie-2, and pe38 (23). NeseNPV does not have significant homology to any of these early transregulating genes of baculoviruses. On the other hand, it had homology with various late transcription activator genes (50, 51), including the late expression factors lef-4 (nese62), lef-5 (nese58), lef-8 (nese81), lef-9 (nese40), lef-11 (nese23), p47 (nese49), and the very late factor vlf-1 (nese45). From these, the four genes encoding the viral RNA polymerase subunits (lef-4, lef-8, lef-9, and p47), forming the simplest eukaryotic DNA-directed RNA polymerase (27), are present in all the sequenced baculoviruses, including CuniNPV (1, 32). Another gene involved in regulation of late expression was nese5, homologous to ac69 recently named mtase1 for its methyltransferase activity (79). Although mtase1 has no demonstrable effect on viral replication, it has been shown to stimulate transient late gene expression (44). NeseNPV lacks lef-6, lef-10, lef-12, and 39K that are late regulatory genes (51). However, Lin and Blissard (47), using a lef-6 null AcMNPV virus, have shown that lef-6 is not essential for viral DNA replication nor for late gene transcription but probably is important in accelerating the infection cycle of AcMNPV. In support of this finding, NeseNPV does not have lef-6, indicating that it is not needed for replication of this and perhaps other viruses. The two other lef genes absent from NeseNPV, lef-10 and lef-12, are not conserved in all baculoviruses (32). The protein kinase (pk-1) gene homolog, involved in very late gene expression (20), was not found in NeseNPV.
Structural genes. Of the 15 conserved structural genes of lepidopteran baculoviruses (32), only 12 were found to have homology with NeseNPV (Table 3). The structural polyhedrin or granulin gene homologs are present in all baculoviruses with the exception of CuniNPV. NeseNPV polyhedrin (nese1) had the highest amino acid homology to the Neodiprion baculovirus NeabNPV polyhedrin with a 92% amino acid identity, but when compared to the lepidopteran polyhedrins/granulins, the homology ranged from 53% with SpltMNPV to 38% with CpGV (Table 1). The NeseNPV basic DNA binding protein p6.9 (nese36) had a 56% amino acid identity with AdorGV p6.9 but was not associated by BLAST with any other baculovirus p6.9 gene, even though it had a high arginine (29%) and serine (27%) concentration and the RRPGRPR conserved motif identified in other p6.9 proteins (76). The size of the predicted NeseNPV p6.9 protein was 86 amino acids (aa), larger than all other p6.9 proteins except for those of HaSNPV (76) and HzSNPV (14). Nese36, however, had a high amino acid identity with splicing factor proteins from organisms such as insects, mouse, and human. The occlusion-derived viral envelope odv-e18 (nese65), odv-e56 (nese38), and odv-ec27 (nese66) homologs in NeseNPV had amino acid identities of 34% (with XcGV), 39% (with LdMNPV) and 23% (with BmNPV), respectively (Table 1). The capsid-associated protein genes vp39 (nese89) and vp91 capsid (nese84) were also present. Although the identity of the NeseNPV vp39 homolog averaged only 25% with other vp39 of baculoviruses, it contained a total of six cysteines of the eight conserved in AcMNPV and OpMNPV (24). Of those, four were in conserved locations that could generate disulfide bonds important for the protein structure.
The vp1054 (nese85), a viral structural protein present in both occlusion-derived virus (ODV) and budded virus (BV) required for nucleocapsid assembly (63), and the glycoprotein gp41 (nese47), found only in ODV but required for the egress of nucleocapsids from the nucleus during the BV synthesis (64), were found in NeseNPV. The average identity with other vp1054 and gp41 proteins was 24% and 25%, respectively. The final NeseNPV structural gene found in all lepidopteran baculoviruses was p74 (nese50) with an average amino acid identity of 38%. This gene product, found associated with ODVs, is essential for the primary infection of midgut cells of insect larvae during oral infection (21). Another gene, associated with the ODV envelope of Spodoptera littoralis MNPV (SpliNPV), having a similar function as p74, is the ac119 homolog called the per os infectivity factor or pif gene (39). Nese79 had a 33% amino acid identity with ac119 and 34% with SpliNPV pif. A second pif gene (pif-2 = SeMNPV ORF 35), has recently been described for SeMNPV (70). Nese55 had a 35% amino acid identity with Se35 and 46% with its AcMNPV homolog, ac22. Both pif and pif-2 genes have been conserved in all the baculoviruses sequenced (32). Genes not identified in NeseNPV were p10, pk-1, odv-e66, odv-e25, and neither ld130 nor gp64. The p10 gene, responsible for the formation of fibrillar structures and associated with the occlusion body, is also absent in CpGV (53) and CuniNPV (1). CuniNPV also lacks odv-e66 and pk-1 homologs. A major difference between NeseNPV and other baculoviruses is the lack of a definitive gp64 or ld130 homolog that code for envelope fusion proteins. These functionally analogous proteins (37) present in BV envelopes are capable of inducing cell fusion and are involved in viral attachment (52, 68). All the baculoviruses sequenced to date have the ld130 homolog (32), which codes for the F protein, and the members of the group I NPVs also have gp64 (8, 9, 58). 
NeseNPV does not have either of these gene homologs. NeseNPV infections of sawfly larvae are restricted to the insect midgut epithelial cells, as happens with baculoviruses that infect other Hymenoptera, Diptera, Coleoptera, Thysanura and Trichoptera hosts (22). In NeseNPV, the function of viral transmission between the epithelial cells may lie in another gene. To find probable candidates for such role, all the NeseNPV ORFs were tested for transmembrane domains and N-terminal signal peptides. The following NeseNPV ORFs had both a signal peptide and a transmembrane domain: Nese26, nese55 (pif-2), nese60 (ac96 homolog), nese65 (odv-e18), nese67 (ac145 homolog), nese69 (ac115 homolog), nese71, nese79 (pif) and nese84 (vp91 capsid). These and other NeseNPV ORFs that had a transmembrane domain, an N-terminal signal peptide or a similar size to baculovirus ld130 homologs were aligned individually with the ld130 homologs from several baculoviruses. However in all searches, the amino acid identity was never higher than 15% and considered random amino acid matches.
Inhibitors of apoptosis. Apoptosis or programmed cell death is known to happen in vertebrates and invertebrate organisms as a response to a series of inductions, including pathogen attacks (15). Baculoviruses have genes capable of suppressing cellular apoptosis to maintain viral replication. The family of genes called inhibitor of apoptosis, iap, which was first identified in CpGV (17), is found in most baculoviruses sequenced to date with the exception of CuniNPV. NeseNPV contains an iap homolog (nese17) with an amino acid identity of 38% to CpGV iap-3. Several baculoviruses have more than one iap gene in their genomes, while NeseNPV has only one. Homologs to iap usually have two baculovirus iap repeats (BIR) and a C-terminal zinc finger (7). Nese17 has only one BIR motif and the zinc finger motif, similar to what is found in SpltMNPV (65) and LdMNPV (42), and also in some iap-2s such as MacoMNPV and RoMNPV (28, 46). NeseNPV iap had a higher homology with the iap-3 gene family proposed by Luque et al. (53) based on their iap phylogenetic analysis. Until biological activity is determined for nese17, it will simply be called iap. Other apoptosis inhibitor genes include p35, found originally in AcMNPV (16), and Slp49 of SpliNPV, which is a p35 homolog that blocks apoptosis induced by infection of Sf9 cells with an AcMNPV p35-deficient mutant (19, 82, 69). NeseNPV did not have any homology with either p35 or Slp49 genes, indicating that either its iap is the only gene responsible for any apoptotic suppressor activity or a very low amino acid homology has made it difficult to locate a p35 or a Slp49 homolog. However, since it has been shown that not all iap's are capable of preventing cell death (54), the biological activity of the NeseNPV iap homolog will have to be experimentally confirmed.
Other conserved genes. Other NeseNPV ORFs had homology with baculovirus genes, the biological functions of which have not yet been determined. These are listed in Table 3. Nese67 was homologous to ac145 (30%) and ac150 (25%).
Duplicated genes. Two NeseNPV ORFs (nese18 and nese19) were duplicates, positioned in opposite orientation with a homologous region (hr4) and a small direct repeat (dr2) separating them (Fig. 1). NeseNPV does not have a copy of the AcMNPV ORF 2 homolog, characterized as the baculovirus repeat ORF (bro) (38). Other baculoviruses usually have one or more copies of bro genes with the exception of SeMNPV (36), PxGV (29), and AdorGV (78), which, like NeseNPV, have no bro homologs.
Nonbaculovirus homologs. Six NeseNPV ORFs had high homology with known proteins not related to baculoviruses (Table 1). Amino acid identity as high as 40% was obtained between nese7 and the family of trypsin-like proteases from insects belonging to the orders Lepidoptera, Diptera, Coleoptera, and Siphonaptera. Homology was also obtained with the trypsin of other arthropods and mammals. Figure 2 shows the alignment of nese7 with trypsins of other insects and a mammal (cow). Conserved amino acid sequences are shown in boxes. The trypsin catalytic triad (histidine, aspartic acid, and serine) and the conserved cysteines from the trypsin-like proteins (73) were present in nese7 and are indicated in Fig. 2. The size of the predicted protein (258 aa) was comparable to that of other trypsins. The maximum-likelihood quartets method suggests that the trypsin-like gene of NeseNPV may be of invertebrate origin, and 35% of the time its sister taxon was the trypsin of a coleopteran, Diaprepes abbreviata (data not shown). This putative gene has not been reported in any DNA virus including baculoviruses. Neodiprion baculoviruses, however, seem to have it present, since the Neodiprion lecontei NPV (Basil Arif, personal communication) also has this gene. It will be interesting to determine if this gene was derived from the insect host, since no hymenopteran trypsin was found in the database search to make a comparison. Nese52 had four C2H2 zinc finger domains and significant homology with C2H2 zinc finger proteins from insects and mammals. Although on average, only the first half of this ORF presented the homology, this type of protein has not been previously described in baculoviruses. Two NeseNPV ORFs, nese72 and nese73, had homology with the regulator of chromosome condensation (RCC1) proteins from a wide group of organisms such as microsporidia, nematodes, fruit flies, frogs, rodents and humans.
Nese83 had up to 39% amino acid identity with the capsid protein of several insect densoviruses. Finally, nese90 was homologous to a large number of phosphotransferases.
FIG. 2. Alignment of NeseNPV ORF 7 (nese7) and trypsins from other organisms. Bos taurus, cow; Hypoderma lineatum, early cattle grub; Sarcophaga bullata, grey fleshfly; Drosophila erecta, a fruit fly; Aedes aegypti, yellow fever mosquito; Bombyx mori, silkworm. Regions highlighted as conserved amino acids by the conserved domain database (CDD) are boxed with a consensus sequence indicated on the top line. # represents the trypsin catalytic triad, histidine, aspartic acid, and serine. * indicates the conserved cysteines.
hrs. Six hrs were located within the NeseNPV genome (Fig. 3), each containing small (5- or 6-bp) perfect palindromes, which can be extended up to 19-bp imperfect palindromes, embedded within a series of direct repeats. The nucleotide sequence has been aligned to show the size of the repeat units, and when needed, gaps were introduced to optimize the alignment (Fig. 3A). The number of hrs in other baculoviruses ranges from four in CuniNPV, PxGV and MacoNPV (1, 29, 46) to 17 in SpltNPV (65). Baculoviruses reported without typical hrs are CpGV (53) and AdorGV (78). NeseNPV hrs varied in length from 396 (hr5) to 794 (hr2) nucleotides (Table 4), and the combined size of the six hrs was 3,669 bp or 4.3% of the genome. This percentage was significantly higher than those for the hrs from CuniNPV (0.8%) and AdhoNPV (1.1%) (1, 60) but lower than those for several other baculovirus hrs, including SpltMNPV (65), HzSNPV (14), HaSNPV (13), and PxGV (29), which comprise approximately 6% of their genomes. NeseNPV hrs consist of repeats of 45 bp (hr5 and hr6), 65 bp (hr2), or alternating 65 bp and 45 bp units (hr1, hr3, and hr4) (Fig. 3B). The number of repeat units in each hr ranged from 9 to 14 and the length and number of times they occur are presented in Table 4. The average G+C content of the hrs was 49.8%, significantly higher than the 34% for the complete NeseNPV genome.
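The G+C and palindrome measurements mentioned above are simple to compute for any candidate repeat. The sketch below is only an illustration with hypothetical function names; it is not the Genetics Computer Group software used in the study. The 70% default for imperfect palindromes follows the threshold given in the Fig. 3 legend. Note that with this definition only even-length sequences can be perfect palindromes.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_perfect_palindrome(seq):
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def palindrome_match_fraction(seq):
    """Fraction of positions that agree with the reverse complement."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    return sum(a == b for a, b in zip(seq, rc)) / len(seq)


def is_imperfect_palindrome(seq, min_fraction=0.70):
    """True if at least min_fraction of the bases are complementary."""
    return palindrome_match_fraction(seq) >= min_fraction
```

For example, `is_perfect_palindrome("GAATTC")` is true, because GAATTC reads the same on both strands.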
FIG. 3. (A) hrs of NeseNPV. The nucleotide sequences of three NeseNPV hrs are aligned to exemplify the direct repeats found in units of 65 bp (hr2), 45 bp (hr5), or alternating 65 and 45 bp (hr1). The consensus sequence of the hrs is shown in bold, with uppercase letters representing 75% or more conserved nucleotides (60). Gaps were inserted to optimize the alignment of the repeat units. The numbers on the left-hand side of each hr indicate the nucleotide position in the genome. For each hr, the nucleotides forming a perfect palindrome (10 to 12 nucleotides) are marked with a solid arrow. Imperfect palindromes, formed by 70% or more complementary bases, are indicated by dots that extend 31 to 40 bases. The late promoter motif ATAAG and the potential GATA factor-binding site are double underlined and boxed, respectively. (B) Schematic representation of the repeat units of either 65 bp (solid arrows) or 45 bp (double-line arrows) found in each hr. (C) Comparison of the consensus sequences of the six NeseNPV hrs with the perfect palindrome region underlined and the extended imperfect palindrome dotted. An asterisk in hr6 indicates that the repeat sequence is different from the other hrs.
TABLE 4. Structure of NeseNPV homologous regions (hr's) and direct repeats (dr's)
Baculovirus hrs have been characterized as enhancers of RNA polymerase II-mediated transcription of early genes (26, 48, 62). hrs have also been characterized as origins of DNA replication (66, 67) and are typically dispersed around the genome. This dispersion is suggested to speed the rate at which replication occurs by initiation at several sites of the genome (67). However, in NeseNPV the hrs were located predominantly in the first half of the circular genome, with hr1 at 8.6 to 9.1 kb (9.95 to 10.58 map units) and hr6 at 49.0 to 49.5 kb (56.7 to 57.2 map units). Furthermore, the distance between hr5 and hr6 was 26.9 kb, which reveals a further clustering of hrs in the first quarter of the genome (Fig. 1; Table 4). Clustering of the hrs has also been observed in CuniNPV, with its hrs concentrated in 35% of the genome.
A unique feature observed in NeseNPV hrs was the presence of multiple copies of the sequences ATAAG and GATA (Fig. 3A and 3C). Baculovirus late promoter motifs are (A/G/T)TAAG; however, TAAG is the essential element (51). Although the late promoter element TAAG is found in baculoviruses, it is generally not located inside of an hr. Interestingly, a GATA element that binds host transcription factors (40) is located close to, and in hr1 and hr2 overlapping, each of the TAAG late promoter motifs within the hrs. Multiple GATA elements have been found in the Hz-1 PAT1 persistence-associated transcript, which does not encode a protein (12). The sequence of PAT1 also displays abundant direct and inverted repeats. No ORFs are predicted for the NeseNPV hrs either, indicating that they could have a yet-uncharacterized regulatory function similar to that of the GATA elements of Hz-1 PAT1. hrs are a distinctive feature of the majority of baculovirus genomes and have been shown to be essential for regulatory function (26, 62, 66, 67). The function of the NeseNPV hrs as origins of replication or enhancers of transcription has yet to be determined experimentally.
GeneParity. The comparison of the gene order of NeseNPV and six other genomes showed that from the 43 NeseNPV baculovirus homolog genes, 42 were in AcMNPV, 41 in LdMNPV, 43 in MacoNPV-A, 43 in HzSNPV, 41 in CpGV, and 29 in CuniNPV. Eight clusters of two or more genes were found; however, not all of the genomes had all these clusters (Fig. 4). Clusters 1, 2, 3, and 5 were absent in CuniNPV, while clusters 6, 7, and 8 were present only in CuniNPV. Relaxed clusters were determined to try to recognize regions where genes may have been added or deleted between genes that were consecutive in a more ancestral lineage of baculovirus. For example, cluster 2 (p45, p40, and p6.9) was considered relaxed because in NeseNPV as well as in the other genomes there is an ORF between p45 and p40. Cluster 3 (vlf-1, ac78, gp41, and ac81) was considered relaxed only in AcMNPV because there is an extra ORF between ac78 and gp41 not found in any of the other baculovirus genomes analyzed. Cluster 4 (lef-5, 38K, ac96, helicase, and lef-4) was the most conserved cluster. It was considered relaxed to accommodate the addition of lef-4 to a cluster that has already been shown to be conserved (32). Five genes exist between helicase and lef-4 in CuniNPV, while the other baculoviruses have either three or four genes. However, not only was the order of the genes in cluster 4 conserved, but also their relative transcriptional direction. Cluster 5 (ac142, odv-e18, odv-ec27, and ac148) was relaxed because in NeseNPV there is an extra ORF between ac142 and odv-e18, which does not occur in the five other genomes.
FIG. 4. GeneParity plots comparing the NeseNPV gene order to six other baculoviruses: AcMNPV (A), LdMNPV (B), MacoMNPV-A (C), HzSNPV (D), CpGV (E), and CuniNPV (F). Eight clusters are underlined and/or boxed in the graphs, which include the following genes or AcMNPV homologs: 1, ac92, ac93; 2, p45, p40, p6.9; 3, vlf-1, ac78, gp41, ac81; 4, lef-5, 38K, ac96, helicase and lef-4; 5, ac142, odv-e18, odv-ec27*, ac148; 6, lef-9, ac68; 7, p47, p74; and 8, lef-1, ac115. Asterisks represent genes that are not part of a specific cluster in certain genomes. Straight lines mark clusters of two or more genes (underlined above) that are sequential in both NeseNPV and the genome being compared. Relaxed clusters that include genes that are not sequential in the compared genome are indicated with a box.
The presence of conserved gene clusters can result from a physical constraint preventing their separation, as suggested by Herniou et al. (32) about lef-5, 38K, ac96, and helicase (cluster 4 in the present analysis). Another explanation for the conservation of clusters could be gene overlapping. Cluster 3 had genes with overlapping coding regions. In genes transcribed on opposite strands and away from each other, such as the genes from clusters 1 and 4, the promoter region or start codon of one gene could be included in the continuous gene and so the physical separation of these two genes may cause their inactivation. Therefore, the continuity of the gene cluster would be maintained in the evolution of the baculovirus lineages.
Baculovirus phylogeny. The DNA polymerase was used previously to investigate the relationship among distantly related organisms and DNA viruses (10). Therefore, the DNA polymerase of the baculovirus was used in order to help establish the position of the root for a baculovirus tree by rooting it with 23 DNA polymerases from widely diverse phylogenetic origins. It was necessary to include these diverse taxa to determine whether there were different DNA polymerase lineages for baculoviruses and the Hz-1 virus. The results (Fig. 5a) indicated that 75% of the time the DNA polymerases did split into viral and nonviral. Moreover, the NeseNPV DNA polymerase was more related to those from the lepidopteran NPV and GV than the DNA polymerase of the CuniNPV from Diptera. The Hz-1 virus DNA polymerase possibly does not belong to a baculovirus monophyletic group including CuniNPV, NeseNPV, and the lepidopteran NPVs and GVs. These results suggested that possibly CuniNPV was the appropriate rooting outgroup for the monophyletic baculovirus group, which was also supported by the overall level of genomic similarity to NeseNPV and to the other NPVs and GVs (e.g., CuniNPV does not have a polyhedrin/granulin ortholog). The same adjacency pattern, indicating that the CuniNPV is the most ancient lineage, was obtained by midpoint rooting of the maximum-likelihood tree or by UPGMA clustering after excluding all nonbaculovirus DNA polymerases from the analysis (data not shown). Once the cladogenetic events were oriented in time by using the DNA polymerase gene, indicating the split of the CuniNPV lineage from the ancestral stem leading to the NeseNPV and the other lepidopteran NPVs and GVs, a maximum-likelihood tree was estimated for the proteons including 29 conserved genes for 24 baculoviruses and used the CuniNPV as outgroup. Figure 5b shows a tree for the proteons which agrees with the DNA polymerase placement of CuniNPV and NeseNPV. 
As for the DNA polymerase gene phylogeny, the same adjacency pattern was also obtained without an explicit choice of outgroup by midpoint rooting and UPGMA clustering. Additionally, the group I NPV, group II NPV, and GV clades are delineated using both proteons and the DNA polymerase in the phylogenetic analysis.
FIG. 5. DNA polymerase phylogeny (A) indicates distinct lineages when all sequenced baculoviruses and the Hz-1 DNA polymerases are included. The root position of the DNA polymerase gene suggested that the lepidopteran NPV and GV share a common ancestral lineage with the NeseNPV after its split from the one leading to the CuniNPV. (B) A global maximum-likelihood phylogeny for 29 conserved genes from 24 baculoviruses supports the DNA polymerase tree in its separation of NeseNPV and CuniNPV. The numbers near nodes indicate the percentage of time a partition was found by the quartet method and is considered the level of support for the tree. Only nodes appearing more than 50% of the time were resolved.
We thank Ronit Keisari for technical help, Basil Arif and Hilary Lauzon for sharing information about the Neodiprion lecontei NPV, and D. Boucias, Q. Li, and Pamela Howell for comments and corrections on the manuscript.
Florida Agricultural Experiment Station journal series no. R-09762.
Present address: LEMB, Instituto de Ciências Biomédicas, USP, CEP 05508-900, São Paulo, SP, Brazil.
Present address: Department of Molecular and Cellular Oncology, University of Texas, M. D. Anderson Cancer Center, Houston, TX 77054.
|| Present address: Department of Pediatrics, Health Science Center at Houston Graduate School of Biomedical Sciences, University of Texas, Houston, TX 77030.