Isolation and Characterization of the First Freshwater Cyanophage Infecting Pseudanabaena

This study presents the isolation of the very first freshwater cyanophage, PA-SR01, that infects Pseudanabaena, and fills an important knowledge gap on freshwater cyanophages as well as cyanophages infecting Pseudanabaena.

Cyanobacteria play important roles in primary production and trophic interactions. They are the dominant autotrophs in most aquatic environments, both freshwater and marine (1). Viruses infecting cyanobacteria are referred to as cyanophages and can play major roles in the dynamics, genetic diversity, and structure of cyanobacterial communities (2)(3)(4). Compared to marine cyanophages, which have been widely studied (5), freshwater cyanophages have received very little attention (6,7).
Cyanophage whole-genome sequences could provide a solid platform for elucidating the biological interactions and evolutionary relationships between cyanophages and their hosts (8)(9)(10)(11)(12). At the same time, as metagenomics becomes a more prevalent approach to monitoring environmental cyanophage diversity, genomic sequences of cultured cyanophages are needed for more precise annotation of viral metagenomes. Most viral sequences in metagenomic databases cannot be assigned putative functions, and there are still many viral contigs of unknown identity in metagenomes (13,14). This further strengthens the case [...]
[...] on host cell growth with or without addition of chloroform-treated MLA medium. Chloroform sensitivity serves as a first indication of a viral lipid component (20). Chloroform can dissolve lipids that may be structural components of infection mechanisms in lipid-containing phages (21). However, chloroform sensitivity alone does not prove the presence of lipid in viral particles. Further studies are needed to confirm the presence of lipids in PA-SR01. This figure also shows that the latent period of PA-SR01 is approximately 1 day, which is relatively short compared to other freshwater cyanophages such as S-LBS1 (latent period of 4 days), PaV-LD (latent period of 2 days), and S-EIV1 (latent period of 2 days) (9,10,12).
Genomic overview. The 137,012-bp genome of PA-SR01 is circularly permuted (Fig. 3) with a GC content of 39.5%. One hundred sixty-six ORFs were predicted, though most ORFs in PA-SR01 did not have homologous genes of known function. In total, 47 ORFs had significant similarity to other sequences, and more than 70% of the ORFs had no detectable homologs. Only 11 ORFs were similar to phage sequences and only 17 were similar to genes of known function (BLASTp; E value cutoff = 10⁻⁵). Three clustered tRNA genes, tRNA-Met, tRNA-Asp, and tRNA-Gly, were identified (Table 2). Genome annotation of PA-SR01 ORFs identified putative genes with functions associated with structural proteins, DNA metabolism, DNA packaging, and lysis (Table 3). HHpred was used to ascribe function to additional ORFs (Table 4). This resulted in the identification of genes encoding putative functions associated with DNA-binding domains (ORF3), Mu-like prophage I protein (ORF24), major capsid protein (ORF32), and PD-(D/E)XK endonuclease (ORF139). In total, 21 ORFs showing homology to genes of known function were obtained.
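The ORF counts reported above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python; the variable and function names are illustrative and are not part of the study's annotation pipeline, and the count of four HHpred-derived assignments is inferred from the four ORFs listed in the text.

```python
# Counts taken from the text above; all names are illustrative only.
TOTAL_ORFS = 166
ORFS_WITH_SIGNIFICANT_HIT = 47    # BLASTp hits at the stated E value cutoff
ORFS_KNOWN_FUNCTION_BLASTP = 17   # BLASTp hits to genes of known function
ORFS_ADDED_BY_HHPRED = 4          # assumption: ORF3, ORF24, ORF32, ORF139


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# ORFs with no significant similarity to any database sequence.
unannotated = TOTAL_ORFS - ORFS_WITH_SIGNIFICANT_HIT
unannotated_pct = percent(unannotated, TOTAL_ORFS)

# ORFs with a known-function homolog after combining BLASTp and HHpred.
known_function_total = ORFS_KNOWN_FUNCTION_BLASTP + ORFS_ADDED_BY_HHPRED

print(f"{unannotated} of {TOTAL_ORFS} ORFs ({unannotated_pct}%) have no homolog")
print(f"{known_function_total} ORFs have a homolog of known function")
```

Under these assumptions the tallies agree with the statements that more than 70% of ORFs lack homologs and that 21 ORFs have homologs of known function.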
PA-SR01 is morphologically distinct from known Caudovirales, and this is also reflected in the genes shared between them. Out of 166 ORFs predicted, merely 6 ORFs were homologous to genes from known Caudovirales. This suggested that PA-SR01 did not belong to the order Caudovirales, which was further supported by the observed tailless morphology of PA-SR01.
Structural genes. ORF41 is the only ORF in PA-SR01 encoding a tail tape measure protein (TMP), which is a tail-associated protein. The tail tape measure protein of tailed phages determines the tail length and enables DNA transit into the host cell during infection (22). Although its name suggests that it is specific to tailed phage genomes, a tail tape measure protein-encoding gene has also been observed in tailless phages (10). This indicates that tail tape measuring [...] (10,23). To obtain a more thorough understanding of structural proteins in PA-SR01, a mass spectrometry approach is needed.
Host-derived genes. Host-derived metabolic genes are commonly present in cyanophages and play important roles in interactions between cyanophages and their hosts (24). For example, a survey of 33 cyanophages revealed that psbA was found in 88% of the cyanophage genomes and 50% of the cyanophages contained both psbA and psbD genes (25). Besides photosynthetic genes, other host-derived genes have also been found that are responsible for phycobilisome degradation, carbon metabolism, phosphate uptake, and nucleotide biosynthesis (11,26,27,28). The only host-derived metabolic gene identified in the PA-SR01 genome is ribonucleotide-diphosphate reductase (RNR) (24). This suggests that PA-SR01 is evolutionarily distinct from known cyanophages and could have its own special metabolic genes that require further study.
In PA-SR01, ORF107 and ORF108 were found to be homologous to ribonucleotide-diphosphate reductase (RNR) subunits alpha and beta, respectively. The RNR gene product can reduce ribonucleotide diphosphate to deoxyribonucleotide diphosphate, which is a precursor of DNA (26). Cyanophage can thus make use of RNRs to degrade host DNA to provide building blocks for synthesizing genomes of phage progeny. RNR genes are considered essential for the rapid replication found in lytic phages (29), and this could be a contributing factor to the short latent period of PA-SR01.
Nucleotide metabolism. Besides RNR, several genes involved in nucleotide metabolism were identified. The PA-SR01 genome encodes a homolog (ORF103) [...] (31). In the PA-SR01 genome, another gene possibly involved in nucleotide metabolism is ORF116, encoding a homolog of deoxynucleotide monophosphate kinase, which may phosphorylate dGMP, dTMP, and 5-hydroxymethyl-dCMP for use in producing new viral DNA genomes (32). Both ThyX and deoxynucleotide monophosphate kinase might contribute to phage genome replication in PA-SR01. dTMP produced by ThyX could be phosphorylated to dTDP, which could be further phosphorylated by nucleoside-diphosphate kinase (NDPK) to form dTTP, a monomer that can be utilized by DNA polymerase (ORF114) to generate long-chain DNA molecules. However, no homologs of NDPK were found in the PA-SR01 genome, and thus further studies are needed to better understand the detailed nucleotide biosynthesis strategy of PA-SR01.
Insertion element. PA-SR01 has one ORF showing extremely high similarity to an ORF from the cyanobacterium Tolypothrix bouteillei. ORF81 has 90% amino acid sequence similarity to an IS200/IS605 family element transposase accessory protein. Such high sequence similarity is rare in phage genomes and suggests that this ORF originated from recent horizontal gene transfer. Similar insertion sequences (IS) have been observed in other phage genomes (11,33), and their functions remain unknown. IS elements are rare in phage genomes and are considered disadvantageous for bacteriophage propagation, as they could disrupt the efficiency of phage genome organization (33). This also supports the hypothesis that ORF81 came from recent horizontal gene transfer, as a phage genome with IS elements is less likely to propagate and pass on its genes over many generations than a phage without IS elements.
Lysis-associated genes. The lysozyme homolog is commonly found in cyanophages and is believed to be the functional gene for cell lysis (9,11,12). However, no homologs of lysozyme were found in the PA-SR01 genome; instead, ORF83 encodes a putative septal ring lytic transglycosylase RlpA family protein. Lytic transglycosylases represent a major class of enzymes capable of lysing bacterial cell walls with the same substrate specificity as lysozyme. Among the different families of lytic transglycosylases, family 4 has been shown to be involved in bacteriophage-induced lysis (34). Rare lipoprotein A (RlpA) was found to be a new lytic transglycosylase with a strong preference for naked glycan strands (35). ORF83, encoding a homolog of the RlpA family, could be the key gene responsible for cell lysis. This suggests that PA-SR01 adopts a different lysis strategy from known cyanophages and that PA-SR01 is likely to be evolutionarily distinct.
PA-SR01, a new evolutionary lineage of cyanophage. PA-SR01 represents a new evolutionary lineage of cyanophage based on its genomic content. There is a lack of structural gene similarity between the PA-SR01 genome and other phage genomes, with the exception of the major capsid protein (ORF32) and tail tape measure protein (ORF41). This is further supported by the morphological features of PA-SR01. To our knowledge, PA-SR01 is only the second tailless cyanophage discovered, whereas the vast majority of cultured cyanophages belong to the order Caudovirales. Besides this structural distinction, PA-SR01 adopts a different lysis strategy from other cyanophages, since a lytic transglycosylase instead of a lysozyme is found in the PA-SR01 genome.
Phylogenetic analysis of the terminase large subunit (terL) and major capsid protein shows that PA-SR01 is evolutionarily distinct from other cyanophage isolates. Although PA-SR01 terL is related to T7-like phages, it does not fall within the group of T7-like phages (Fig. 5). Furthermore, the amino acid sequence identity shared between PA-SR01 terL and S-CBS2 terL is merely 26%. The BLASTp result for PA-SR01 terL showed much greater similarity to noncyanophage terL sequences, indicating an evolutionary divergence of terL in PA-SR01.
The maximum likelihood amino acid tree of the major capsid protein provides further evidence that PA-SR01 is evolutionarily distinct from other cyanophage isolates (Fig. 6). A majority of the phages fall within the three main groups, Myoviridae, Siphoviridae, and Podoviridae. However, PA-SR01 does not fall within any of the clades and represents an independent branch, providing further support for the evolutionary divergence of PA-SR01 from other phages.
PA-SR01 sequence similarities in the environment. The widespread occurrence of viral sequences similar to PA-SR01 in the environment is shown by the recruitment of metagenomic reads onto the translated PA-SR01 genome. Both marine and freshwater environments were investigated in this analysis (Fig. 7A to C). A total of 146 ORFs were mapped to at least one freshwater metagenome and 106 ORFs were mapped to multiple freshwater metagenomes. Twenty-six ORFs were extensively mapped to freshwater metagenomes. Fifty-eight ORFs were mapped to at least one marine metagenome and twenty-two ORFs were mapped across several marine metagenomes. This indicates that PA-SR01-like phages are much more prominent in freshwater. Seven ORFs (ORF12, ORF29, ORF64, ORF103, ORF114, ORF121, and ORF134) were extensively mapped to marine metagenomes, and they were all extensively mapped to freshwater metagenomes as well. This suggests that phages adopting similar packaging strategies (e.g., ORF12 encoding a terminase large subunit) and similar DNA metabolism (e.g., ORF29 encoding a DEAD/DEAH box helicase, ORF114 encoding a DNA-directed DNA polymerase, and ORF103 encoding an FAD-dependent thymidylate synthase) are widespread in aquatic environments.
The FAD-dependent thymidylate synthase ThyX (ORF103) and a hypothetical protein (ORF134) have the most recruited sequences across both marine and freshwater metagenomes. ThyX is a key gene in double-stranded phage genome replication, suggesting that phages with similar DNA replication strategies are widespread in aquatic environments and that ThyX is an important part of phage DNA replication for both marine and freshwater phages. There are also a large number of reads recruited to the IS200/IS605 family element transposase accessory protein TnpB (ORF81), located around 80 kbp. In contrast to the marine environment, reads from 5 of the 10 freshwater metagenomes were recruited onto ORF81. As mentioned previously, TnpB is considered disadvantageous for bacteriophages and is not widely present in known cultured phages (11,33). The recruitment of reads from multiple metagenomes onto ORF81 suggests that TnpB might not be detrimental to phage propagation or that there could be extensive horizontal transfer of the TnpB gene from host bacteria to phages. It is clear in Fig. 7 that many more reads were recruited from freshwater viral metagenomes than from marine ones. The majority of ORFs across the genome of PA-SR01 have recruited sequences in the metagenome from urban freshwaters in Singapore. This was expected, since PA-SR01 was isolated from a water body in Singapore and it is likely that viral sequences similar to PA-SR01 are widely distributed locally. Surprisingly, metagenomes from Lake Michigan produced a comparable number of recruited reads on the PA-SR01 genome. This strongly suggests the widespread presence of viral sequences similar to PA-SR01 around the globe. Further evidence for the widespread presence of viral sequences similar to PA-SR01 in the environment is provided by the relative abundance of different cyanophages in the Lake Michigan metagenome (Fig. 8).
It is clear that viral sequences similar to PA-SR01 are prevalent in the Lake Michigan metagenome. Although several other phages have higher normalized recruited read counts than PA-SR01, the values are comparable. Furthermore, the number of recruited reads for PA-SR01 is much higher than that for the majority of other known cyanophages examined in this analysis. Admittedly, the occurrence of PA-SR01 relative to other cyanophages is overestimated because only the best BLAST hit (the hit with the lowest E value) was recruited for each read. For example, in the list of selected cyanophages there are several P-HM1-like phages, e.g., P-RSM4, P-SSM2, P-TIM68, and Syn1. Significant sequence similarity and core genes are shared among those phages, but only one phage genome would recruit each read, diluting the number of reads assigned to each P-HM1-like cyanophage. Nonetheless, the data indicate that viral sequences similar to PA-SR01 are relatively abundant in freshwater environments.
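The dilution effect of best-hit assignment described above can be illustrated with a toy example. This is a minimal sketch in Python using invented reads and E values; the function name `best_hit` and all numbers are hypothetical and do not come from the study's data.

```python
from collections import Counter


def best_hit(hits):
    """Return the genome whose hit has the lowest E value.

    `hits` is a list of (genome_name, e_value) tuples for one read.
    Ties go to the first hit listed, because min() keeps the first minimum.
    """
    return min(hits, key=lambda hit: hit[1])[0]


# Hypothetical reads. Three match only PA-SR01; six match a group of
# closely related phages with nearly identical E values.
reads = [
    [("PA-SR01", 1e-30)],
    [("PA-SR01", 1e-25)],
    [("PA-SR01", 1e-40)],
    [("P-RSM4", 1e-50), ("P-SSM2", 1e-49), ("P-TIM68", 1e-48)],
    [("P-RSM4", 1e-47), ("P-SSM2", 1e-49), ("P-TIM68", 1e-48)],
    [("P-RSM4", 1e-47), ("P-SSM2", 1e-46), ("P-TIM68", 1e-48)],
    [("P-RSM4", 1e-52), ("P-SSM2", 1e-51), ("P-TIM68", 1e-50)],
    [("P-RSM4", 1e-40), ("P-SSM2", 1e-42), ("P-TIM68", 1e-41)],
    [("P-RSM4", 1e-40), ("P-SSM2", 1e-41), ("P-TIM68", 1e-45)],
]

# Each read is counted once, for the single genome with the best hit.
counts = Counter(best_hit(hits) for hits in reads)
related_group_total = sum(counts[g] for g in ("P-RSM4", "P-SSM2", "P-TIM68"))

for genome, n in sorted(counts.items()):
    print(genome, n)
print("related group total:", related_group_total)
```

In this toy data set the related group recruits twice as many reads as PA-SR01 in total, yet each individual member of the group ends up with fewer reads than PA-SR01, which is the sense in which a phage without close cultured relatives can appear relatively more abundant.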
The relative abundance of viral sequences similar to PA-SR01 in the Pacific Ocean surface water (Fig. 9) is much lower than that in Lake Michigan and is among the least abundant, suggesting that viral sequences similar to PA-SR01 are more prevalent in freshwater environments. Since PA-SR01 was isolated from freshwater, it is more likely to have genes specific to freshwater environments.
In conclusion, this study describes the characteristics and genome of PA-SR01, a rare tailless cyanophage with a set of genes distinct from those of other known cyanophages. PA-SR01 infects a tropical isolate of Pseudanabaena sp. and represents a new evolutionary lineage of cyanophage. Comparative metagenomics data indicate the global prevalence of PA-SR01-like phages in both freshwater and marine environments. PA-SR01 and related viruses are likely to play major roles in controlling and shaping Pseudanabaena populations. Given the large number of genes without homologs in PA-SR01, more work is needed to characterize the phage-host interactions and ecological roles of PA-SR01.

MATERIALS AND METHODS
Host cells. The Pseudanabaena strain KCZY-C8 was isolated in February 2019 from a tropical eutrophic freshwater body (Singapore Serangoon Reservoir) at 1°23'26.2"N, 103°54'58.7"E, 15 cm below the water surface. The strain was isolated by micropipetting from a surface water sample into sterile MLA medium (36) at 25°C. The strain was identified to the genus level based on the morphological characteristics (cell shape, dimension, and organization within trichome) reported in Bergey's Manual of Systematics of Archaea and Bacteria (37) and other studies (38)(39)(40). Detailed traits of KCZY-C8 can be found in Fig. 10. We also used a partial bacterial 16S rRNA sequence to verify the strain identity (Table 6). The culture was then incubated and maintained in batch culture at 25°C under low irradiance (20 μmol photons m⁻² s⁻¹) with a 12-h/12-h light/dark cycle.
Cyanophage isolation. Cyanophage PA-SR01 was isolated from viral concentrates collected from surface water as described above. Briefly, 450 ml of water was filtered through 0.2-μm-pore-size filters (Nuclepore). The virus-sized particles in the filtrate were concentrated 100- to 200-fold with 100-kDa molecular weight (MW) cutoff ultrafiltration centrifugal tubes (Amicon Ultra-15 centrifugal filter units; Millipore). The viral concentrate was stored at 4°C in the dark until further processing. The viral concentrate was serially diluted up to 10⁷ times. PA-SR01 was isolated by adding the aliquots to an exponentially growing culture of Pseudanabaena strain KCZY-C8 in a 24-well microtiter plate and incubating at 25°C under low irradiance (20 μmol photons m⁻² s⁻¹) with a 12-h/12-h light/dark cycle for 14 days. Culture lysis was determined by a substantial decrease in optical density at 750 nm (OD750) compared with control cultures (41). A clonal viral isolate was obtained by three rounds of extinction dilution (42).
[Fig. 7 caption, partial: [...] line represents a read recruited from one of the following publicly available metagenomics data sets: (A) freshwater viral metagenomes: Lake Pavin, Lake Bourget, Lake Neagh, reclaimed water virus, and Singapore urban freshwater; (B) marine viral metagenomes: Baltic Sea, Papua New Guinea, Patagonia, Gulf of Mexico, San Pedro Channel, and Pacific Ocean surface; (C) freshwater viral metagenomes: Lake Baikal, Lake Limonopolar, Lake Michigan, Lake Soyang, and Han River.]
Transmission electron microscopy. Thirty milliliters of PA-SR01 lysate was centrifuged at 15,000 × g for 5 min and then filtered through a 0.22-μm syringe filter to remove the cellular debris. The filtered lysate was centrifuged at 5,000 × g in 100-kDa MW cutoff ultrafiltration centrifugal tubes (Amicon Ultra-15 centrifugal filter units; Millipore) to increase the phage particle concentration. For staining, 20 μl of gadolinium triacetate (1% wt/wt) was adsorbed to the surface of copper grids at room temperature for 1 min. Excess liquid was blotted from the side of the copper grids with clean filter paper. The grids were viewed and photographed on a JEOL JEM-2100F field emission gun transmission electron microscope at the National University of Singapore Faculty of Chemical and Biomolecular Engineering.
Host range. PA-SR01 infectivity was tested against local freshwater isolates of cyanobacteria strains, as well as cyanobacteria obtained from the Commonwealth Scientific and Industrial Research Organisation (CSIRO) culture collection. PA-SR01 phage lysate (1 ml) was added to cultures of exponentially growing cyanobacteria as listed in Table 1. Growth of cyanobacteria cultures without PA-SR01 addition was also monitored to serve as a control. Infectivity was determined by a reduction in OD reading compared to control.
Chloroform sensitivity. Chloroform sensitivity of the cyanophage was tested. Filtered lysate (1 ml) was mixed with 1 ml of chloroform and shaken manually for 10 min. Chloroform was removed by centrifugation at 4,100 × g for 5 min at room temperature. The aqueous phase was transferred to a 1.5-ml microcentrifuge tube and incubated for 6 h at room temperature to remove any remaining chloroform. One milliliter of chloroform was added to treat 1 ml of MLA medium to serve as the control. Chloroform-treated MLA, nontreated MLA, and treated and nontreated virus particles were added to exponentially growing Pseudanabaena strain KCZY-C8 cultures and the OD was measured over 6 days.
DNA extraction, purification, and sequencing. Pseudanabaena strain KCZY-C8 was grown in 300 ml of MLA medium at 25°C under low irradiance (20 μmol photons m⁻² s⁻¹) with a 12-h/12-h light/dark cycle until lysis. The lysates were centrifuged at 15,000 × g for 5 min to remove cellular debris. The supernatant containing the majority of viral particles was filtered through a 0.22-μm syringe filter (Minisart syringe filter; Sartorius) to remove remaining cellular debris. To remove free nucleic acid, the lysate was treated with DNase I. The treated lysate was concentrated with a 100-kDa MW cutoff ultrafiltration centrifugal tube (Amicon Ultra-15 centrifugal filter units; Millipore) at 5,000 × g to a final volume of 1 ml. A QIAamp DNA minikit was used to extract viral DNA, with 20 μl of RNase A added in the first step to remove RNA. The cyanophage genome was sequenced using an Illumina high-throughput sequencer, with a 150-bp paired-end library constructed using a New England Biolabs NEBNext Ultra DNA library prep kit.
Genome assembly. The sequencing data were trimmed using BBDuk (version 35.43) to remove adaptors and PhiX reads. Reads were de novo assembled into contigs with the metaSPAdes genome assembler (version 3.12.0) (43).
Genome annotation. The open reading frames (ORFs) were predicted using GeneMarkS (44) and Prodigal (45); where the predictions differed, the longer of the two was kept. Homology searching was performed with BLASTp against the NCBI nonredundant (nr) database (accessed in October 2019). Sequences with E values of <10⁻⁵ were considered to be homologs. HHpred searches against the Protein Data Bank (PDB) and Pfam databases were used to predict more distant homologs (46). The genome was analyzed for tRNA genes with tRNAscan-SE 2.0 (47) and for Rho-independent terminators using FindTerm (48), with the energy threshold set to −16 kcal. A genomic map was generated with CGView (49).
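The rule of keeping the longer of two differing ORF predictions can be sketched as follows. This is a minimal illustration in Python, not the authors' script. It assumes that two predictions describe the same gene when they share a strand and a stop position, which is a common convention but is not stated in the text; the `Orf` type and function names are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Orf(NamedTuple):
    start: int   # 1-based inclusive coordinate, start < end on both strands
    end: int
    strand: str  # "+" or "-"


def stop_key(orf: Orf) -> Tuple[str, int]:
    """Identify a gene by strand and stop position (assumed matching rule)."""
    return (orf.strand, orf.end if orf.strand == "+" else orf.start)


def merge_predictions(first: Iterable[Orf], second: Iterable[Orf]) -> List[Orf]:
    """Combine two ORF sets, keeping the longer call when both share a stop."""
    best = {}
    for orf in list(first) + list(second):
        key = stop_key(orf)
        current = best.get(key)
        if current is None or (orf.end - orf.start) > (current.end - current.start):
            best[key] = orf
    return sorted(best.values(), key=lambda o: (o.start, o.end))


# Hypothetical coordinates standing in for the two predictors' outputs.
genemark_calls = [Orf(100, 400, "+"), Orf(900, 1200, "-")]
prodigal_calls = [Orf(40, 400, "+"), Orf(900, 1100, "-"), Orf(1500, 1800, "+")]

merged = merge_predictions(genemark_calls, prodigal_calls)
for orf in merged:
    print(orf)
```

ORFs called by only one predictor are retained unchanged in this sketch; whether the study treated such cases the same way is not specified.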

SDS-PAGE analysis of structural proteins. Purified PA-SR01 was diluted in SDS buffer (5:1, vol/vol) and heated at 95°C for 5 min. The sample was then resolved by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) using a Mini-PROTEAN Tetra Cell (Bio-Rad Laboratories). The Mini-PROTEAN TGX stain-free precast gel was run in an SDS running buffer (pH 8.3) at 120 V for 1.5 h using a PageRuler unstained protein ladder (Thermo Fisher) for size calibration.
Phylogenetic analysis. The large terminase subunit (terL) and major capsid protein were compared phylogenetically with those from other cyanophages and bacteriophages (Table 7) using MEGA X software (version 10.1.6). ClustalX was used to align the inferred amino acid sequences with default parameters. Based on the multiple sequence alignment, the Jones-Taylor-Thornton (JTT) model was selected and a maximum likelihood tree was constructed with 100 bootstrap replicates.
Recruitment of metagenomic reads. The presence of viral sequences similar to PA-SR01 in aquatic environments was investigated by recruiting viral metagenomics data onto the genome of PA-SR01 (50). In total, 88 gigabytes of freshwater metagenome data and 173 gigabytes of marine metagenome data were used (Table 8). Briefly, metagenomic data were first made into a BLAST nucleotide database and queried with the predicted protein sequences of PA-SR01 using tBLASTn (E value of ≤10⁻⁵), which performs a six-frame translation of the subject nucleotide sequences into protein sequences (51). Metagenomic nucleotide reads with a BLAST hit to PA-SR01 were then extracted from each metagenome and used as queries in a BLASTx search (E value of ≤10⁻⁵; max_target_seqs = 1) against a viral protein database containing the predicted proteins of PA-SR01 and another 2,536 bacteriophage genomes from the NCBI Reference Sequence Database (RefSeq; accessed in January 2020). If the best hit was to PA-SR01 rather than to one of the other phages, the read was recruited as a viral sequence similar to PA-SR01 and mapped onto the genome of PA-SR01, based on percentage identity of the amino acid sequence, using ggplot2 (52). The number of BLAST hits to PA-SR01 was normalized by dividing by the total number of predicted ORFs and the size of the metagenome (in gigabytes), which provides a normalized measure for comparing recruitment across metagenomes of different sizes. Similar recruitment analysis was also performed for other phage genomes (Table 9).
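The recruitment criterion and normalization described above can be written out explicitly. The following is a minimal sketch in Python under stated assumptions: the function names, the `(genome, e_value)` tuple layout, and every hit count are hypothetical, and only the E value cutoff and the normalization formula (hits divided by the number of ORFs and by the metagenome size in gigabytes) are taken from the text.

```python
E_VALUE_CUTOFF = 1e-5  # threshold stated in the methods


def count_recruited(best_hits, target="PA-SR01"):
    """Count reads whose single best BLASTx hit is to the target phage.

    `best_hits` holds one (subject_genome, e_value) tuple per read, that is,
    the one hit retained per query when max_target_seqs is set to 1.
    """
    return sum(
        1
        for genome, e_value in best_hits
        if genome == target and e_value <= E_VALUE_CUTOFF
    )


def normalized_recruitment(hits, n_orfs, metagenome_gb):
    """Return hits divided by (number of ORFs x metagenome size in GB)."""
    if n_orfs <= 0 or metagenome_gb <= 0:
        raise ValueError("n_orfs and metagenome_gb must be positive")
    return hits / (n_orfs * metagenome_gb)


# Hypothetical best hits for four reads from a 0.5-GB metagenome.
example_hits = [
    ("PA-SR01", 1e-20),
    ("PA-SR01", 1e-3),   # fails the E value cutoff
    ("S-CBS2", 1e-30),   # best hit is to another phage
    ("PA-SR01", 1e-8),
]

recruited = count_recruited(example_hits)
score = normalized_recruitment(recruited, n_orfs=166, metagenome_gb=0.5)
print(recruited, score)
```

Dividing by both the ORF count and the data set size lets phage genomes of different gene content be compared across metagenomes of different sequencing depth, at the cost of treating file size as a proxy for the number of reads.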
Data availability. The whole-genome sequence of the phage is available in GenBank under accession number MT234670.