## ABSTRACT

Cytomegalovirus (CMV) is acquired by the oral route in children, and primary infection is associated with abundant mucosal replication, as well as the establishment of latency in myeloid cells that results in lifelong infection. The efficiency of primary CMV infection in humans following oral exposure, however, is unknown. We consistently detected self-limited, low-level oral CMV shedding events, which we termed transient CMV infections, in a prospective birth cohort of 30 highly exposed CMV-uninfected infants. We estimated the likelihood of transient oral CMV infections by comparing their observed frequency to that of established primary infections, characterized by persistent high-level shedding, viremia, and seroconversion. We developed mathematical models of viral dynamics upon initial oral CMV infection and validated them using clinical shedding data. Transient infections comprised 76 to 88% of oral CMV shedding events. For this high percentage of transient infections to occur, we identified two mathematical prerequisites: a very small number of initially infected oral cells (1 to 4) and low viral infectivity (<1.5 new cells infected/cell). These observations indicate that oral CMV infection in infants typically begins with a single virus that spreads inefficiently to neighboring cells. Thus, although the incidence of CMV infection is high during infancy, our data provide a mechanistic framework to explain why multiple CMV exposures are typically required before infection is successfully established. These findings imply that a sufficiently primed immune response could prevent CMV from establishing latent infection in humans and support the achievability of a prophylactic CMV vaccine.

**IMPORTANCE** CMV infects the majority of the world's population and is a major cause of birth defects. Developing a vaccine to prevent CMV infection would be extremely valuable but would be facilitated by a better understanding of how natural human CMV infection is acquired. We studied CMV acquisition in infants and found that infections are usually brief and self-limited and are successfully established relatively rarely. Thus, although most people eventually acquire CMV infection, it usually requires numerous exposures. Our analyses indicate that this is because the virus is surprisingly inefficient, barely replicating well enough to spread to neighboring cells in the mouth. Greater knowledge of why CMV infection usually fails may provide insight into how to prevent it from succeeding.

## INTRODUCTION

Human cytomegalovirus (CMV) infects most people worldwide and is an important cause of disease in congenitally infected or immunocompromised individuals (1). In addition, chronic CMV infection may have important indirect effects at the population level (2–4). Licensure of a prophylactic CMV vaccine is therefore a major public health priority (5). However, development of a CMV vaccine would benefit from a better understanding of the determinants of successful natural primary infection. The early events of CMV acquisition are difficult to observe because primary infection occurs unpredictably, beginning in early childhood, and is asymptomatic. Like all human herpesviruses (HHVs), CMV establishes lifelong latency following primary infection, with periodic lytic reactivation and viral shedding that allows transmission to new hosts. It is known that CMV is shed in the breast milk of nearly all infected women (6, 7); however, not all infants exposed to CMV in breast milk acquire infection (8, 9). The efficiency with which CMV infection is acquired following natural exposure is poorly understood. For example, to our knowledge, whether mucosal CMV infection may sometimes fail to establish latency has not been previously described.

A novel advance in studying the biology of HHV infection in infants is the use of frequent longitudinal oral sampling from birth to determine the exact time of HHV acquisition (10, 11). We used this approach to characterize primary CMV infection in a birth cohort of 30 highly exposed Ugandan infants (11). Viral expansion in the oral cavity during primary CMV infection was remarkably slow even in the absence of adaptive immune pressure, peaking only after several weeks, which is explained by inefficient viral spread to new cells (12). Here, we describe brief episodes of low-level oral CMV shedding that were detected in a high proportion of infants who did or did not subsequently go on to develop primary infection during the observation period. We term these episodes transient infections and show, using mathematical models, that the observed frequency of transient infections requires that, following exposure to CMV in the oral cavity, there are initially only very few infected cells and that they have low infectivity for other cells. These data indicate that the high incidence of established CMV infection among Ugandan infants is best explained by repeated low-efficiency exposures to CMV. Our findings support viral genomic analyses showing strong bottlenecks during transplacental transmission that result in small CMV founder populations (13). Thus, we speculate that a relatively small induction of immune pressure by vaccination may be sufficient to protect against CMV infection during infancy.

## RESULTS

Low-viral-copy-number oral shedding detected prior to primary infection is consistent with transient infection in Ugandan infants. We analyzed oral shedding in a cohort of 30 Ugandan infants who underwent weekly oral sampling for multiple HHVs by quantitative PCR. In addition to the 20 primary CMV infections (associated with persistent high-level oral shedding, viremia, and seroconversion [11]), we also observed 136 self-limited episodes of CMV DNA detection in oral swabs from infants who had not acquired primary infection. These episodes, termed transient infections, occurred commonly both in infants who subsequently developed primary infection within the observation period and in those who did not (Fig. 1A and B). Transient infections were brief; among the 102 transient infections where duration was not censored, 75% were limited to a single positive swab (potential range, 1 to 13 days), 15% were observed over two consecutive positive swabs (8 to 20 days), and the remaining 10% had three or four consecutive swabs (15 to 34 days) (Fig. 1C). Transient infections also were notable for low maximum log_{10} CMV DNA copy numbers (median, 3.5; interquartile range [IQR], 3.0 to 3.9; range, 2.3 to 5.5) (Fig. 1D). These features differed dramatically from those of primary infection (Fig. 1B), during which shedding was sustained throughout the remaining observation period in all 20 infants (median, 31 weeks; IQR, 9 to 47) with high peak log_{10} CMV DNA copy numbers (median, 7.5; IQR, 7.2 to 8.6; range, 4.3 to 8.9) (Fig. 1D).

Low-level CMV detection suggests viral replication in oral mucosal cells. Transient infections were observed in all other HHVs tested (69 for Epstein-Barr virus [EBV], 70 for herpes simplex virus [HSV], 6 for HHV-6, and 16 for HHV-8) (Table 1). Each of these viruses demonstrated a unique pattern of transient infection, which suggests that viral detection is determined by host-pathogen interactions rather than the presence of nonreplicating viral DNA in saliva. Specifically, while the median durations and maximum DNA copy numbers of swabs with positive PCR results prior to primary infection were similar for CMV, EBV, HSV, HHV-6, and HHV-8 (data not shown), the median proportion of swabs positive for viral DNA prior to establishment of primary infection was considerably higher for CMV (17.5%) than for EBV (4%), HSV (3%), HHV-6 (0%), and HHV-8 (0%) (Fig. 1E), despite high shedding rates of both EBV and HHV-6 in siblings and mothers and similar incidences of infant primary infection with CMV and HHV-6 (11). Substantial interparticipant variability was observed for CMV and EBV, with less variability noted for other HHVs (Fig. 1E).

The proportions of CMV-positive swabs were similar between infants who developed primary CMV infection (excluding time points after acquisition of primary infection) and infants who did not develop primary infection during the study period (median, 17.5% versus 18%, respectively) (Fig. 1F). This result suggests that brief periods of transient shedding prior to established primary infection are unlikely to represent early primary infection. The frequencies of positive swabs for other HHVs were also similar whether infants developed an established infection with the virus or not (Fig. 1F).

The oral swab data were inconsistent with detection of residual HHV DNA in breast milk. Transient CMV infections were commonly observed among infants who had not breastfed in the previous week (77 of 399 swabs; 19%), and transient infections with other HHVs occurred equally among infants who had and had not breastfed (4% versus 5%) during the previous week. Furthermore, transient HSV, HHV-6, and HHV-8 infections were detected (Fig. 1E) despite the fact that, unlike CMV and EBV, these HHVs are not readily detected in breast milk (14–16). Finally, transient low-level viral-DNA detection was not consistent with laboratory contamination, based on the consistent absence of HHV DNA detection in negative controls for all PCR runs in which these data were generated.

Most CMV transmissions to infants result in transient infection. We estimated the probability of transient CMV infection per transmission by analyzing the observed counts of oral shedding events prior to established infection using a Kaplan-Meier (KM) analysis (Table 1 and Fig. 2). In an analysis inclusive of all the infants, we estimated that 88% (95% confidence interval [CI], 80 to 92%) of CMV transmission events resulted in transient infection. When we included only infants who went on to develop primary CMV infection during the study period, we estimated that 76% (95% CI, 65 to 83%) of CMV transmissions resulted in transient infection (Fig. 2). These models suggest very similar proportions of transient infection for EBV and HSV and far lower rates for HHV-6 (Table 1).

The probability of transient CMV infection decreases as the basic reproductive number (*R*_{0}) and number of initially infected cells (*I*_{0}) increase. We next used mathematical models to identify the necessary conditions for frequent transient CMV infection. Our first model was a stochastic ordinary differential equation (ODE) model. Stochastic mathematical models incorporate equations that sequentially update the integer values of all key model variables, such as the number of infected cells and the viral load at narrow time intervals, by randomly drawing these numbers from distributions. They are ideally used for biological processes that start with a small number of events. Parameters of viral replication and spread and decay rates, as well as the infected-cell death rate, were obtained from prior deterministic mathematical model fitting to dozens of serial viral loads during sustained primary CMV infections among 14 infants in the cohort that had at least 6 months of oral samples following acquisition (12). Of note, our deterministic model did not include a mounting immune response during the early stages of infection because unimpeded exponential growth of CMV is observed during the first 1 to 3 months of sustained primary infection (12).

We previously calculated the *R*_{0} for oral primary CMV infection by fitting the data from each infant in the cohort. *R*_{0}, which is defined as the average number of cells infected by a single cell in the absence of target cell limitation, varied among infants but was generally low (median, 1.63; range, 1.09 to 3.1; IQR, 1.53 to 1.81), considering that an *R*_{0} of <1 is incompatible with sustained infection while an *R*_{0} of ≫1 usually leads to rapid exponential increase in the number of infected cells and viral load (12).

To determine the likelihood of transient CMV infection following a transmission event, we varied two parameters: (i) *R*_{0}, by adjusting the viral infectivity parameter value in our model, and (ii) the number of cells initially infected with replicating CMV (*I*_{0}). In simulations with only one initially infected cell (*I*_{0} = 1) where *R*_{0} was equal to 1.05, most transmission events resulted in transient infection (Fig. 3A). If we assumed 10 initially infected cells (*I*_{0} = 10) and that *R*_{0} was equal to 1.5, then primary CMV infection was established in the vast majority of simulations. With *I*_{0} equal to 10 and *R*_{0} equal to 1.05, we observed a mix of both transient and established CMV infections, though prolonged low-level shedding episodes were frequently observed. A similar trend was observed with *R*_{0} equal to 1.5 and *I*_{0} equal to 1 (Fig. 3A), though sustained low-level shedding was less common.

To further describe the relationship between *R*_{0}, *I*_{0}, and the probability of transient CMV infection, we simulated CMV transmission across a range of potential values for both parameters (Fig. 3B). In the infant cohort data, only a certain proportion of transient infections could be captured due to the weekly sampling scheme, the sensitivity of the assay, and the short duration of transient infection. To determine if a simulated transient infection was observed, we first performed weekly sampling from each simulation that mimicked the weekly sampling used in the Uganda cohort. We then retained only transient-infection episodes that were captured by weekly sampling that exceeded viral loads that would be detected by PCR testing (≥3 viral-genome copies/reaction). The results did not change substantially if we assumed all transient infections were observable (data not shown). In general, simulated transient CMV infections were most likely to occur at low values for *R*_{0} and *I*_{0} (Fig. 3B). The observed range for the estimated probabilities of transient CMV infections among Ugandan infants (0.76 to 0.88) (Table 1) suggests an *R*_{0} near 1.1 with an *I*_{0} of <10 (Fig. 3B).
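The observation filter described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: it assumes a simulated episode is available as a list of daily viral loads already expressed in copies/reaction, and the names `weekly_positive_swabs`, `is_observed`, and `first_sample_day` are hypothetical.

```python
from typing import Sequence

PCR_THRESHOLD = 3.0  # viral-genome copies/reaction, the detection cutoff in the text


def weekly_positive_swabs(daily_load: Sequence[float],
                          first_sample_day: int = 0,
                          interval_days: int = 7) -> int:
    """Count weekly samples whose load is at or above the PCR threshold.

    `daily_load` is a hypothetical simulated trajectory (one value per day);
    `first_sample_day` sets where the weekly schedule falls relative to it.
    """
    sampled = daily_load[first_sample_day::interval_days]
    return sum(1 for load in sampled if load >= PCR_THRESHOLD)


def is_observed(daily_load: Sequence[float], first_sample_day: int = 0) -> bool:
    """An episode is 'observed' if at least one weekly swab is positive."""
    return weekly_positive_swabs(daily_load, first_sample_day) > 0
```

In this sketch, a short episode can be missed entirely when the weekly schedule happens to fall on days with undetectable load, which is why the alignment of the sampling schedule is an explicit argument.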

Mathematical model validation based on the viral loads of transient CMV infections. Among Ugandan infants who underwent weekly sampling, the median CMV load of transient infections was 2.8 log_{10} CMV DNA copies (IQR, 2.5 to 3.2; range, 2.2 to 5.5) (Fig. 4A). In model simulations, we demonstrated that the observed viral load increases mostly as a function of *I*_{0}, even if *I*_{0} increases by only a few cells (Fig. 4A). CMV loads in simulated transient infections most closely resembled the clinical data when *I*_{0} was equal to 1, suggesting that most primary CMV infections initiate in a single cell. Increases in *R*_{0} slightly decreased the viral load in transient infections because established infections became more likely as viral loads exceeded 10^{3} DNA copies. The model generally reproduced the distribution of viral loads from the data for *R*_{0} values between 1.1 and 1.5.

Mathematical model validation based on the durations of transient CMV infections. The simulated durations of transient CMV infections also increased as *I*_{0} increased (Fig. 4B). In contrast, increases in *R*_{0} decreased the duration of transient infection because at higher *R*_{0} values, shedding beyond 1 week was more likely to progress to established infection rather than transient infection. The model, therefore, suggests that extremely prolonged transient CMV infections are only likely for *R*_{0} values near 1 and become more probable as *I*_{0} increases.

Most transient CMV infections observed in the Ugandan infants consisted of one or two consecutive positive swabs, implying that transient CMV episodes typically last less than 2 weeks (Fig. 1C and 4B). The durations in simulated CMV episodes most closely resembled the data when *I*_{0} was equal to 1 to 3 and *R*_{0} was approximately 1.2 to 1.5 (Fig. 4C), again suggesting that most CMV transmission events begin with viral replication in a few cells and that viral infectivity is low. Of note, we previously estimated a median *R*_{0} of 1.6 during sustained primary infection (12), suggesting the possibility of slight differences in parameters during infections that go on to become established.

Transient infection also requires a low *R*_{0} in a tissue-based mathematical model of infection. One biological oversimplification of our stochastic ODE model is that it assumes homogeneous mixing of viruses and target cells when in reality interaction of infected cells and viruses is spatially constrained in solid-tissue microenvironments. Infection within solid tissues may lead to rapid depletion of target cells and impact the factors that dictate transient versus sustained infection.

To address this issue, we constructed a spatially constrained agent-based model (ABM) that recapitulates the three-dimensional histology of stacked squamous epithelial cells in mucosal tissue. In ABM simulations, the CMV-infected cell is the key unit of infection, and viral production is not included for simplification purposes. Cells in our three-dimensional model randomly infect contacting neighbors, which are arranged in a lattice of rhombic dodecahedrons with a depth of 10 cells. Therefore, the maximum possible theoretical value for *R*_{0} in the model is 12. We simulated mean *R*_{0} values between 1 and 2 in parallel with findings from the ODE model. However, in contrast to the stochastic ODE, spatial constraints may limit CMV spread because infected cells cannot infect adjacent cells that are already infected or dead. A virus that enters a cell that is already infected, and infectious, is assumed to have no added biological effect. The effective reproductive number (*R*_{e}), or the average number of cells infected by an infected cell over the course of infection, may therefore decrease relative to *R*_{0} as target cells become limited. To add relevant biological variability to the model, we assigned each cell a unique duration of infectivity and therefore a unique *R*_{0} value. This allowed the model to account for expected differences in infectivity among individual cells due to cellular, viral, or other stochastic factors.
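To make the lattice geometry concrete, the following Python sketch builds the 12 contacting neighbors of a cell in a rhombic dodecahedral packing (equivalently, nearest neighbors on a face-centered cubic lattice) and runs a deliberately simplified infection process on it. It is an illustration under stated assumptions, not the ABM used in this study: the exponentially distributed infectious period, the attempt rate, the mapping of tissue depth to the z coordinate, the starting cell, and the `established_at` cutoff are all hypothetical choices.

```python
import itertools
import random

# 12 face-sharing neighbours: all distinct permutations of (+/-1, +/-1, 0).
NEIGHBOR_OFFSETS = sorted({
    perm
    for a, b in itertools.product((-1, 1), repeat=2)
    for perm in itertools.permutations((a, b, 0))
})


def simulate_abm(mean_r0, depth=10, established_at=500, seed=0):
    """Return (outcome, cells_ever_infected) for one simulated exposure.

    Assumptions (not from the source): each cell has its own infectious
    period drawn from Exp(1) and makes infection attempts at rate `mean_r0`
    on randomly chosen neighbours, so its individual R0 varies around
    `mean_r0`. Attempts on infected, dead, or out-of-tissue cells are wasted.
    Cells are processed one at a time, ignoring the exact timing of events.
    """
    rng = random.Random(seed)
    start = (0, 0, 4)                      # assumed mid-depth starting cell
    ever_infected = {start}                # infected or dead: never reinfected
    active = [start]
    while active and len(ever_infected) < established_at:
        x, y, z = active.pop()
        infectious_period = rng.expovariate(1.0)
        t = rng.expovariate(mean_r0)       # time of the first attempt
        while t < infectious_period:
            dx, dy, dz = rng.choice(NEIGHBOR_OFFSETS)
            target = (x + dx, y + dy, z + dz)
            if 0 <= target[2] < depth and target not in ever_infected:
                ever_infected.add(target)
                active.append(target)
            t += rng.expovariate(mean_r0)
    outcome = "sustained" if active else "transient"
    return outcome, len(ever_infected)
```

Because wasted attempts lower the realized number of secondary infections, the effective reproductive number in this sketch falls below `mean_r0` as neighboring cells become unavailable, which mirrors the qualitative behavior described above.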

Sustained infections in the ABM spread radially through the three-dimensional tissue matrix (see Movie S1 in the supplemental material). Assuming a single initial infected cell (*I*_{0} = 1), the probabilities of transient versus sustained infection were nearly equivalent between the stochastic ODE and ABMs, given equivalent *R*_{0} values, both approximated by 1 − 1/*R*_{0}. Two-dimensional representations of the ABM (to simplify visualization) demonstrated that transient infections are more common at low *R*_{0} values but still occur commonly when *R*_{0} is equal to 2 (see Movies S2, S3, and S4 in the supplemental material). Similar to the findings of the ODE model, the ABM movies demonstrate that low-level transient infections may occur at *R*_{0} values equal to 1.2, with the possibility of random extinction following many cycles of infection (see Movie S2 in the supplemental material). At higher *R*_{0} values, persistent high-level infection becomes inevitable after multiple sustained cycles of viral spread (see Movies S3 and S4 in the supplemental material).
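The 1 − 1/*R*_{0} approximation can be checked with a small Python sketch. It assumes a much simpler process than either model in this study: a linear birth-death process on infected cells only, in which the next event is a new infection with probability *R*_{0}/(1 + *R*_{0}) and a cell death otherwise. Under that assumption, and treating the *I*_{0} initial lineages as independent, the probability of extinction (transient infection) is (1/*R*_{0})^{*I*_{0}}. The function names and the `established_at` cutoff are hypothetical.

```python
import random


def transient_probability(r0: float, i0: int = 1) -> float:
    """Extinction probability of a linear birth-death process started from
    i0 infected cells: min(1, 1/R0) ** i0."""
    return min(1.0, 1.0 / r0) ** i0


def simulate_transient_fraction(r0: float, i0: int = 1, runs: int = 2000,
                                established_at: int = 200, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity.

    A run is 'transient' if the infected-cell count reaches 0 before it
    reaches `established_at` (an assumed cutoff for established infection).
    """
    rng = random.Random(seed)
    p_infect = r0 / (1.0 + r0)   # chance that the next event is a new infection
    transient = 0
    for _ in range(runs):
        cells = i0
        while 0 < cells < established_at:
            cells += 1 if rng.random() < p_infect else -1
        transient += cells == 0
    return transient / runs
```

For example, `transient_probability(1.05, 1)` is about 0.95 and `transient_probability(1.5, 10)` is below 0.02, in line with the qualitative behavior of the simulations at low and high values of *R*_{0} and *I*_{0}.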

We used the ABM to test whether the life span of infected cells relative to that of uninfected cells had any impact on the probability of transient infection, given the theoretical possibility that more rapid elimination of infected cells might increase target cell saturation. (During highly lytic viral infections, the average life span of an infected cell is less than that of an uninfected cell, whereas nonlytic infection is mathematically defined by equal life spans of uninfected and infected cells.) The infected-cell death rate had no impact on the probability of transient infection, which was determined entirely by *R*_{0}, within each microenvironment of infection (data not shown). Therefore, the ABM reinforces the fact that low values for *R*_{0} and *I*_{0} are prerequisites for transient infection even when spatial constraints of tissue, variable cell-to-cell infectivity, and viral cytolytic potential are considered.

## DISCUSSION

Oral CMV shedding after the establishment of primary infection in infants follows a stereotypical trajectory over more than a year, which is described by slow viral expansion, sustained steady-state shedding at high viral loads, and an extremely protracted clearance phase (12). However, the earliest stages of infection are far less predictable. We found that the majority of CMV transmissions terminate prior to establishment of primary infection and do not result in successful establishment of latent infection. As such, we used the term transient infection to describe these episodes.

While this study represents the first formal description and modeling of transient CMV infections based on intensive viral sampling, our findings are consistent with those of numerous studies of CMV and other viruses that can cause lifelong human infection. One study of CMV transmission through breast milk found that many infants had transiently detectable CMV in saliva using PCR testing every 3 months; oral contamination by breast milk was discounted because the infant saliva CMV load typically exceeded that of contemporaneously sampled milk from the mother (17). In multiple prospective studies of childcare facility outbreaks, oral HSV-1 was cultured from children who did not go on to seroconvert or have recurrent shedding (18–20). In addition, several case series describe HSV-2-specific (21) or HIV-1-specific (22, 23) T cell responses among seronegative individuals in whom virus could not be detected, suggesting infection may have been self-limited. In our cohort, transient infection was defined in part by the absence of seroconversion. Studies to evaluate whether CMV-specific cellular immune responses may be induced by transient infections would be informative.

The high proportion of transmission events resulting in transient CMV infection implies two distinct bottlenecks prior to sustained infection: a small number of initially infected cells and a low *R*_{0}. First, despite numerous repeated exposures to high-viral-load shedding in breast milk of nursing mothers and saliva and urine of household contacts (6, 7, 11, 24, 25), transmissions are likely to involve initial infection of one or only a few cells. The law of multiplicity of infection dictates that infection of a few cells, among thousands of susceptible targets, occurs due to the entry of a single virus into each cell. The finding that both transient and established CMV infections are typically initiated by a single virion indicates that a very small number of infectious viruses reach a susceptible target cell and/or that initial infection of those cells is inefficient. Additional studies to evaluate the replication competence and determinants of transmissibility of CMV in natural exposures would be valuable.

The idea of restricted founder virus populations is supported by genetic studies of congenital human CMV infection that identified strong transmission bottlenecks (13). Although the founder population size of congenital CMV infections was estimated to be tens to hundreds of viruses rather than <10, as in our study, this may reflect the difference between oral and transplacental routes of transmission. Following sexual exposure, HIV-1 acquisition is highly inefficient, and viral variants with low infectivity may undergo limited rounds of replication in the mucosa without establishing systemic infection (26, 27). Furthermore, when persistent HIV-1 infection does result from sexual transmission, it is usually established by a single founder virus (28). In contrast to mucosal transmission, a significantly larger founder virus population size is evident in people who acquire HIV-1 via the intravenous route (29). Together, these observations suggest that universal mechanisms may underlie transient infections with different viruses (30). Nevertheless, we found that the frequency of transient CMV infections was higher prior to established infection than for other HHVs, suggesting unique aspects of CMV transmission.

The second bottleneck occurs because the initial oral cell infected by CMV has a surprisingly low probability of infecting adjacent cells and stochastic extinction occurs frequently. Using an agent-based model, which captures microanatomic constraints on viral spread, we confirmed that this inefficiency is not due to early target cell limitation or a lytic effect on infected cells but rather reflects the inherent infectivity of CMV. The importance of a low *R*_{0} value early during infection has also been hypothesized for HIV-1 infection (26). To our knowledge, the *R*_{0} for CMV in the oral cavity or other mucosal sites during natural infection of healthy individuals has not been calculated previously. The *R*_{0} for CMV in the blood of transplant patients has been reported to be between approximately 1.5 and 15, depending on preexisting immunity and other factors (31, 32); these higher estimates likely reflect differences in patient populations and anatomical compartments.

Given that there is no evidence of intensifying immunologic pressure against oral CMV shedding in infants until many weeks after primary infection (12), we infer that this low value for *R*_{0} is an inherent property of the virus in oral mucosa. The virologic basis for a low *R*_{0} during oral CMV infection is unclear. It is possible that limited replication efficiency at the site of acquisition may confer a commensurate advantage on other aspects of infection, such as dissemination or immune evasion. Inefficient spread to uninfected cells may also be due in part to persistence of maternal antibodies, which would need to be addressed through studies of primary infection in older children or infants born to CMV-uninfected women. Innate immune pressure may also limit the efficiency of CMV spread. Important differences in routes of infection and hosts notwithstanding, murine CMV studies have revealed that macrophages and natural killer cells induce a bottleneck and restrict systemic spread from the draining lymph nodes after inoculation by footpad injection (33, 34). While vaccines that induce neutralizing antibodies may prevent infection of cells altogether, other vaccines, such as those that induce T-cell or antibody-dependent cell-mediated cytotoxicity responses, allow viral replication and antigen expression within at least one cell prior to virus elimination (35, 36). For these prevention strategies, understanding the determinants of transient versus established infection might be valuable for guiding the design of vaccine candidates and their evaluation.

A limitation of our approach is imprecise classification of transient infections. In contrast to abortive infection, which is defined as the absence of productive viral replication following host cell entry, our data suggest limited rounds of replication in oral mucosa. The half-life of viral DNA in mucosa is measured on a scale of hours (37–39); thus, low-level DNA dissipates rapidly in the absence of replication. While this evidence favors CMV replication, we did not prove that all episodes of CMV detection represent viral replication, and additional studies are required to define the contribution of productive viral replication to transient infections. As such, the frequency of transient infection may have been overestimated. Overestimation of transient infections may also occur as a result of undetected oral CMV shedding below the limit of detection that could link multiple transient infections into a single episode.

Alternatively, we might have underestimated the proportion of transient infections. The maximum viral load of some transient infections may be below the limit of detection with PCR, or the infections may have such short durations that they are difficult to detect with weekly sampling. Using our closest estimate of *R*_{0} in this analysis, we estimated that most transient CMV infections would be detected with weekly sampling, but this model prediction needs to be confirmed with more temporally granular sampling protocols.

Our previous model of primary infection predicted slightly higher estimates of *R*_{0} (median, 1.63; IQR, 1.53 to 1.81) (12), consistent with lower probabilities of transient episodes (roughly 50%). However, we posit that an *R*_{0} value calculated strictly from cases of primary infection is likely to be an overestimation, because it is conditioned on establishment of infection and ignores the information gained from evaluating stochastic extinction events. It is therefore possible that the best estimate of *R*_{0} lies between these calculations and may vary between different exposures, particularly if the fitness of virions is heterogeneous in natural CMV exposures. While we also are unable to capture the likely possibility that early *R*_{0} values may vary across infants and even exposures within the same infant, our ABM demonstrated robust results under the assumption of cell-to-cell variability in infectious potential (*R*_{0}).

In conclusion, we have demonstrated that, despite infecting most of the world's population during early childhood, CMV establishes primary infection inefficiently. This finding provides hope for the development of an effective prophylactic CMV vaccine.

## MATERIALS AND METHODS

Cohort description. The details of the cohort and Institutional Review Board (IRB) approvals obtained from the University of British Columbia, the Fred Hutchinson Cancer Research Center, and the Uganda National Council for Science and Technology have been described previously (11, 12). Briefly, 32 pregnant women were enrolled in Kampala, Uganda, and study visits were performed each week after delivery. Oral swabs were collected from mothers, newborns, and all other preschool age children in the home in a standardized manner at each visit. All the swabs were tested by PCR for CMV, EBV, HSV, HHV-6, and HHV-8 using published methods (40–45). Two infants with persistent CMV shedding beginning in the first week of life were classified as having congenital infections and were excluded from subsequent analyses (11).

Virologic definitions. We defined a positive swab as any value greater than or equal to 3 copies/reaction, or ∼150 copies/ml of swab buffer (11). We defined any transition from negative to positive HHV detection as a transmission event. Establishment of primary CMV infection was defined as sustained viral shedding following transmission, using criteria that were validated using serology in the cohort (11). A transient infection was defined as one or more consecutive positive swabs that resolved (had a subsequent negative swab) prior to the date of primary infection.
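These definitions can be expressed as a short Python sketch. It is illustrative only: the input is assumed to be one infant's swab results in chronological order, in copies/reaction, and `primary_index` is a hypothetical argument marking the first swab of established primary infection (the criteria for which are described elsewhere [11] and are not reproduced here). A positive first swab is treated as the start of an episode, which is also an assumption.

```python
from typing import List, Optional, Tuple

POSITIVE_THRESHOLD = 3.0  # copies/reaction, from the definition above


def transient_episodes(copies_per_reaction: List[float],
                       primary_index: Optional[int] = None) -> List[Tuple[int, int]]:
    """Return (first, last) swab indices of each transient infection.

    An episode is a run of consecutive positive swabs that is followed by a
    negative swab, using only swabs collected before `primary_index`.
    """
    end = len(copies_per_reaction) if primary_index is None else primary_index
    episodes = []
    start = None
    for i in range(end):
        positive = copies_per_reaction[i] >= POSITIVE_THRESHOLD
        if positive and start is None:
            start = i                        # a run of positive swabs begins
        elif not positive and start is not None:
            episodes.append((start, i - 1))  # run resolved by a negative swab
            start = None
    # A run still open at `end` never resolved, so it is not counted.
    return episodes
```

Each returned pair corresponds to one transmission event that resolved, and the number of swabs in the pair (`last - first + 1`) is the count of consecutive positive swabs used for the duration estimates.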

The number of consecutive swabs was used to estimate the duration ranges for transient infections by assuming that they started 0.5 to 6.5 days before the first positive result and ended 0.5 to 6.5 days after the last positive result. For example, a transient infection with a single positive swab had a duration between 1 day, starting 0.5 days before the first positive result and ending 0.5 days after the last positive result, and 13 days, starting 6.5 days before the first positive result and ending 6.5 days after the last positive result. This explains why the duration ranges shown in Fig. 1C overlap. For each additional consecutive positive swab, 1 week was added to the bounds. We assumed no negative swabs were missed between samples.
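The arithmetic behind these duration ranges can be written out as a small Python function. This is a sketch assuming exactly 7 days between consecutive swabs; the function name is hypothetical.

```python
from typing import Tuple


def duration_bounds_days(n_positive_swabs: int) -> Tuple[float, float]:
    """Minimum and maximum episode duration, in days, for a run of
    consecutive positive weekly swabs."""
    if n_positive_swabs < 1:
        raise ValueError("an episode needs at least one positive swab")
    between_swabs = 7 * (n_positive_swabs - 1)  # first to last positive swab
    shortest = between_swabs + 0.5 + 0.5        # starts/ends 0.5 days outside
    longest = between_swabs + 6.5 + 6.5         # starts/ends 6.5 days outside
    return shortest, longest


for n in (1, 2, 3, 4):
    print(n, duration_bounds_days(n))
```

This gives 1 to 13 days for one positive swab, 8 to 20 days for two, 15 to 27 days for three, and 22 to 34 days for four, so the combined range for three or four swabs is 15 to 34 days.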

Because not all samples were taken exactly 7 days apart, durations were calculated only from transient infections with samples collected within 5 to 9 days of each other, on average capturing weekly sampling but allowing for a 2-day window. For example, if a transient infection contained two consecutive positive swabs that were less than 5 days apart, they were not included in Fig. 1C and 4B and C because this could represent a misclassification of the range represented for 2 consecutive episodes. Similarly, if two consecutive positive swabs were sampled more than 9 days apart, misclassification might also occur. We assumed this censoring was the result of the sampling and was not related to the probability of observing an episode. These exclusion criteria, applied only to the duration results, excluded 34/136 total transient CMV infections: 5 were excluded because samples were less than 5 days apart, 20 were excluded because samples were more than 9 days apart, and 9 were excluded because the episode contained both consecutive samples less than 5 days apart and consecutive samples more than 9 days apart.
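A minimal Python sketch of this exclusion rule follows. It assumes each episode is represented by the collection days of the swabs considered for it, in chronological order; that representation and the names used are hypothetical.

```python
from typing import List

MIN_GAP_DAYS, MAX_GAP_DAYS = 5, 9  # "weekly" sampling window from the text


def classify_sampling(sample_days: List[int]) -> str:
    """Classify the spacing of consecutive swabs in one episode.

    Returns "ok" when every gap is 5 to 9 days, otherwise the reason the
    episode's duration would be excluded.
    """
    gaps = [later - earlier for earlier, later in zip(sample_days, sample_days[1:])]
    too_close = any(gap < MIN_GAP_DAYS for gap in gaps)
    too_far = any(gap > MAX_GAP_DAYS for gap in gaps)
    if too_close and too_far:
        return "both"
    if too_close:
        return "too_close"
    if too_far:
        return "too_far"
    return "ok"


# Counts reported above: 5 + 20 + 9 = 34 exclusions out of 136 episodes.
excluded = {"too_close": 5, "too_far": 20, "both": 9}
retained = 136 - sum(excluded.values())  # 102 episodes with usable durations
```

The three exclusion categories are mutually exclusive here, which is why their counts sum to the 34 excluded episodes and leave the 102 episodes used for the duration results.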

Estimating transient-infection probability. Using Kaplan-Meier estimation, we computed the cumulative distribution of primary infection as a function of transient infections. The Kaplan-Meier estimate incorporates transient infections among infants where primary infection was not observed during the study. We considered the total number of transient infections before the establishment of primary infection to be a geometric process, with termination occurring due to either infection or censoring, that is, each transmission resulting in infection had the same probability of becoming a transient infection. We then estimated the probability of primary infection, *P*, by fitting a cumulative geometric distribution [1 − (1 − *P*)^{n + 1}] to the estimated cumulative distribution across each transient-infection count, *n*, using nonlinear least squares. To derive the point estimate and confidence intervals, we performed 1,000 bootstraps. For each bootstrap estimate, the infants were sampled with replacement, the Kaplan-Meier procedure was used to estimate the cumulative distribution, and then the probability of primary infection was estimated using the cumulative geometric distribution model. The estimate and confidence interval were computed by taking the median and the 2.5th and 97.5th percentiles from the 1,000 bootstrap results. The nonlinear least-squares algorithm was conducted using the “port” method in the *nls* function in the R programming language (46). This procedure was conducted on all infants and on the subset of infants with primary infection observed during the study.
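The estimation procedure can be sketched in Python as follows. This is not the authors' R code: it substitutes `scipy.optimize.least_squares` for the *nls* "port" algorithm, uses made-up data, and assumes the standard Kaplan-Meier convention that an infant censored at count *n* is still in the risk set at *n*. The fitted curve is the probability of having acquired primary infection by transient-infection count *n*, 1 − (1 − *P*)^(n + 1), whose complement is (1 − *P*)^(n + 1). All function names are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares


def km_cumulative(counts, observed):
    """Kaplan-Meier cumulative probability of primary infection by count n."""
    counts = np.asarray(counts)
    observed = np.asarray(observed, dtype=bool)
    levels = np.arange(counts.max() + 1)
    surv, cum = 1.0, []
    for n in levels:
        at_risk = np.sum(counts >= n)                 # infants with at least n transients
        events = np.sum((counts == n) & observed)     # primary infection after exactly n
        surv *= 1.0 - events / at_risk
        cum.append(1.0 - surv)
    return levels, np.array(cum)


def fit_primary_probability(counts, observed):
    """Least-squares fit of 1 - (1 - P)^(n + 1) to the Kaplan-Meier estimate."""
    n, cum = km_cumulative(counts, observed)
    residuals = lambda p: 1.0 - (1.0 - p[0]) ** (n + 1) - cum
    return least_squares(residuals, x0=[0.2], bounds=(0.0, 1.0)).x[0]


def bootstrap_estimate(counts, observed, n_boot=1000, seed=0):
    """Median and 2.5th/97.5th percentiles from resampling infants with replacement."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    observed = np.asarray(observed, dtype=bool)
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(counts), len(counts))
        estimates.append(fit_primary_probability(counts[idx], observed[idx]))
    low, median, high = np.percentile(estimates, [2.5, 50, 97.5])
    return median, (low, high)


# Toy data only (not the cohort): 30 infants, some censored before primary infection.
rng = np.random.default_rng(42)
toy_counts = rng.geometric(0.2, size=30) - 1
toy_observed = rng.random(30) < 0.7
print(bootstrap_estimate(toy_counts, toy_observed, n_boot=200))
```

The toy call uses 200 resamples only to keep the example quick; the default of 1,000 matches the number of bootstraps described above.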

Stochastic ODE model structure. The underlying within-host ODE model for infection was as follows: *dL*/*dt* = β*SV* − α*L* − μ*L*; *dI*/*dt* = α*L* − δ*I*; *dV*/*dt* = *pI* − *cV*.

The *L* compartment represented latently infected (or preproductive) cells without replicating virus, the *I* compartment represented cells with replicating virus, and *V* was the viral population. Because the model focused on early infection dynamics, the total target cell population (*S*) was fixed, and only the dynamics of infected cells and viruses were considered. Parameter values were chosen to be consistent with a previous model of primary CMV infection (12) as follows: *S* = 4e8 (47), μ = 1/4.5 (47), α = 1 (31, 48, 49), δ = 0.77/day (31, 48), *p* = 1,600/cell/day (49), and *c* = 2/day (50). The infectivity term, β, was varied by simulation and determined the *R*_{0} value: *R*_{0} is equal to β*Sp*/*c*δ(1 − μ/α).

The model was simulated stochastically with integer output by using the differential equations as approximations for the transition probabilities at small time steps (*dt* = 0.01 day). Let Poisson(λ) be a random Poisson draw with rate λ; then, the transition steps are as follows: *L*(*t* + *dt*) = *L*(*t*) + Poisson[β*SV*(*t*) × *dt*] − Poisson(α*L* × *dt*) − Poisson(μ*L* × *dt*); *I*(*t* + *dt*) = *I*(*t*) + Poisson(α*L* × *dt*) − Poisson(δ*I* × *dt*); *V*(*t* + *dt*) = *V*(*t*) + Poisson(*pI* × *dt*) − Poisson(*cV* × *dt*).

Counts were absorbed at zero.
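A minimal Python sketch of these update steps is given below; it is not the authors' R implementation. Several choices are assumptions made for illustration: the initially infected cells start in the *I* compartment, the same Poisson draw is used for cells leaving *L* and entering *I*, *V* is compared directly with a cap of 10^9, and the value of β in the example call is arbitrary. The function names are hypothetical.

```python
import numpy as np


def tau_leap_step(L, I, V, beta, rng, S=4e8, mu=1 / 4.5, alpha=1.0,
                  delta=0.77, p=1600.0, c=2.0, dt=0.01):
    """Advance the (L, I, V) counts by one time step of length dt."""
    new_L = rng.poisson(beta * S * V * dt)    # newly infected (preproductive) cells
    L_to_I = rng.poisson(alpha * L * dt)      # cells starting to replicate virus
    L_death = rng.poisson(mu * L * dt)        # loss of preproductive cells
    I_death = rng.poisson(delta * I * dt)     # loss of productive cells
    new_V = rng.poisson(p * I * dt)           # virus production
    V_clear = rng.poisson(c * V * dt)         # virus clearance
    # Counts are absorbed at zero.
    L = max(L + new_L - L_to_I - L_death, 0)
    I = max(I + L_to_I - I_death, 0)
    V = max(V + new_V - V_clear, 0)
    return L, I, V


def run_until_extinct(beta, I0=1, max_days=500.0, v_cap=1e9, dt=0.01, seed=None):
    """Simulate until all counts reach zero, V exceeds v_cap, or max_days passes."""
    rng = np.random.default_rng(seed)
    L, I, V = 0, I0, 0          # assumption: initial cells are productively infected
    t, peak_V = 0.0, 0
    while t < max_days and (L + I + V) > 0 and V <= v_cap:
        L, I, V = tau_leap_step(L, I, V, beta, rng, dt=dt)
        peak_V = max(peak_V, V)
        t += dt
    return t, (L, I, V), peak_V


# Arbitrary illustrative beta, not a fitted value.
days, state, peak = run_until_extinct(beta=1e-12, I0=1, seed=0)
print(round(days, 2), state, peak)
```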

Stochastic ODE model simulations and transient-infection definitions. Using a stochastic ODE model, we conducted 10,000 model simulations across varying *R*_{0} values, the average number of cells infected by a single cell in the absence of target cell limitation, and *I*_{0} values, the number of initially infected cells. Each simulation had three stopping rules: the infection terminated when (i) the viral and infected-cell counts all reached zero, (ii) a viral load of >9 log units was reached, or (iii) the 500th simulation day was reached.

We used two conditions to determine if a transient infection was observable. The first was that the viral load had to exceed 150 copies/ml, the minimum observed value in the data, at some time during the simulation. The second was based on a weekly sampling scheme to match the data generated from the infant cohort: we randomly chose an initial sampling time within the first week of the simulation and then calculated how many weekly samples would have occurred before the simulation terminated. If the initial sample time occurred after the infection ceased (all populations reached zero), then that transient infection was not observed.
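The two observability conditions can be expressed as a small helper. The Python sketch below is an illustration under stated assumptions: the first sampling time is drawn uniformly from the first 7 days, samples falling exactly at the extinction time are not counted, and the function names are hypothetical.

```python
import math
import random


def weekly_samples_before(extinction_day, first_sample_day, interval=7.0):
    """Count weekly samples taken strictly before the infection ceased."""
    if first_sample_day >= extinction_day:
        return 0  # the first swab came after extinction: episode not observed
    return math.ceil((extinction_day - first_sample_day) / interval)


def is_observable(peak_viral_load, extinction_day, rng=random, threshold=150.0):
    """Apply both conditions; return (observable, number_of_weekly_samples)."""
    first_sample_day = rng.uniform(0.0, 7.0)   # random start within the first week
    n_samples = weekly_samples_before(extinction_day, first_sample_day)
    return peak_viral_load > threshold and n_samples > 0, n_samples


print(weekly_samples_before(extinction_day=10.0, first_sample_day=2.0))
print(is_observable(peak_viral_load=1000.0, extinction_day=3.0, rng=random.Random(0)))
```

Short episodes are therefore missed with a probability that depends on their duration: an infection lasting 3 days is sampled only if the first swab happens to fall within those 3 days.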

The results presented are from successfully observed transient infections. For each simulation, the maximum viral load and one random viral-load sample were recorded. The random viral-load sample was compared to the Ugandan cohort data. To analyze duration, distributions of total consecutive samples from the simulations were compared to observed total consecutive positive swabs from transient infections in the data. As a summary measure across the total consecutive swab counts, we computed a deviance score using the G test (51). The observed values were the total consecutive swab counts observed in the data in categories: 1, 2, 3, 4, and 5+. The expected values were calculated by multiplying the proportions estimated from the model simulations and the total durations observed in the data.
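For reference, the G statistic is 2 Σ O ln(O/E), summed over categories. The sketch below computes it in Python from observed counts and model proportions in the categories 1, 2, 3, 4, and 5+; the numbers in the example call are invented for illustration and are not study data, and the function name is hypothetical.

```python
import math


def g_statistic(observed_counts, model_proportions):
    """Deviance (G) between observed category counts and model proportions."""
    total = sum(observed_counts)
    g = 0.0
    for observed, proportion in zip(observed_counts, model_proportions):
        expected = proportion * total   # expected count implied by the model
        if observed > 0:                # categories with zero observations add nothing
            g += observed * math.log(observed / expected)
    return 2.0 * g


# Invented counts for swab categories 1, 2, 3, 4, and 5+ (illustration only).
toy_observed = [60, 20, 10, 6, 4]
toy_model = [0.55, 0.25, 0.10, 0.06, 0.04]
print(round(g_statistic(toy_observed, toy_model), 3))
```

A score of zero means the simulated proportions reproduce the observed counts exactly, and larger scores indicate a poorer match. The function fails with a division error if the model assigns zero probability to a category that was observed.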

The data for the heat maps with contours of aborted infection probabilities across *R*_{0} and *I*_{0} were constructed using scipy.interpolate.griddata in Python to interpolate probabilities between the input parameter values and to create a smoothed plot.
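A minimal example of this kind of interpolation is shown below. The parameter grid and the probability surface are toy values chosen for illustration, and the linear interpolation method is an assumption, since the method used for the published figures is not stated here.

```python
import numpy as np
from scipy.interpolate import griddata

# Coarse grid of simulated parameter values (toy values, not the study grid).
r0_values = np.array([0.5, 1.0, 1.5, 2.0])
i0_values = np.array([1.0, 2.0, 4.0, 8.0])
R0, I0 = np.meshgrid(r0_values, i0_values)
points = np.column_stack([R0.ravel(), I0.ravel()])

# Toy stand-in for the simulated probability at each (R0, I0) pair.
probability = np.exp(-0.5 * R0 * I0).ravel()

# Finer grid on which to evaluate the interpolated surface.
fine_r0, fine_i0 = np.meshgrid(np.linspace(0.5, 2.0, 50), np.linspace(1.0, 8.0, 50))
smooth = griddata(points, probability, (fine_r0, fine_i0), method="linear")

print(smooth.shape)
```

The resulting array can be passed to `matplotlib.pyplot.contourf(fine_r0, fine_i0, smooth)` to draw a filled contour map.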

Agent-based model and simulations. We used a stochastic agent-based mathematical model to simulate the spread of infection based on a given reproductive number (*R*_{0}) for cells arranged in a particular geometric layout. Each cell was given a life span that equated on average to one time unit during which it infected, on average, *R*_{0} other cells. All life spans were drawn from a Weibull distribution. The geometric arrangements of susceptible cells for infections were based on hexagonal packing of cells in 2 dimensions, allowing 6 contacts per cell. We also performed simulations in 3 dimensions with cell packing as rhombic dodecahedrons, each of which had 12 cellular contacts. The thickness of the epidermis in these 3-dimensional simulations was 10 cells.

The selection of a new infected cell among contacted cells was performed randomly and could include already infected cells; the latter event would lead to no new infected cells in the matrix and was intended to capture target cell saturation. To avoid target cell exhaustion (in both 2- and 3-dimensional simulations), we typically simulated groups of 10,000 cells and began infection at the center of the matrix. The model also included an average replenishment interval for susceptible cells, which we assumed to be 10 life span units in our simulations. The replenishment interval essentially allows dead cells to be replaced with susceptible ones at a given rate.

To perform a trial, the simulation was launched 10,000 times for each value of *R*_{0} and with different values of initially infected cells (*I*_{0}). Runs were tracked to see how far the infection spread and whether it went “extinct” in an allotted time period. For those runs that went extinct, we recorded the average number of life spans before extinction.
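The following Python sketch conveys the general idea of the 2-dimensional simulation; it is a simplified illustration, not a port of the authors' C++ program. Its assumptions are: infection attempts occur as a Poisson process at rate *R*_{0} per life span unit, the Weibull shape parameter is 2 (hypothetical; the scale is set so that the mean life span is 1), initially infected cells are placed in a row at the center, and replenishment of susceptible cells and the 3-dimensional layout are omitted. All names are hypothetical.

```python
import heapq
import math
import random

# Six contacts per cell on a hexagonal grid (axial coordinates).
NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def run_once(r0, i0=1, size=100, shape=2.0, max_time=50.0, rng=random):
    """Return (went_extinct, cells_ever_infected, time_of_last_cell_death)."""
    if r0 <= 0:
        raise ValueError("r0 must be positive")
    scale = 1.0 / math.gamma(1.0 + 1.0 / shape)   # Weibull scale giving mean life span 1
    infected_ever = set()
    attempts = []                                  # heap of (time, infecting cell)
    last_death = 0.0

    def infect(cell, t):
        nonlocal last_death
        infected_ever.add(cell)
        life = rng.weibullvariate(scale, shape)
        last_death = max(last_death, t + life)
        # Schedule infection attempts during the life span (on average r0 of them).
        s = t + rng.expovariate(r0)
        while s < t + life:
            heapq.heappush(attempts, (s, cell))
            s += rng.expovariate(r0)

    center = size // 2
    for k in range(i0):
        infect((center + k, center), 0.0)

    while attempts:
        t, (q, r) = heapq.heappop(attempts)
        if t > max_time:
            return False, len(infected_ever), max_time   # still spreading at the cutoff
        dq, dr = rng.choice(NEIGHBORS)
        target = (q + dq, r + dr)
        in_grid = 0 <= target[0] < size and 0 <= target[1] < size
        # Attempts that hit an already infected cell produce no new infection.
        if in_grid and target not in infected_ever:
            infect(target, t)
    return last_death <= max_time, len(infected_ever), last_death


rng = random.Random(0)
for r0 in (0.5, 1.0, 1.5):
    runs = [run_once(r0, size=50, rng=rng) for _ in range(100)]
    print(r0, sum(extinct for extinct, _, _ in runs) / len(runs))
```

Because a contacted cell may already be infected, the realized number of new infections per cell falls below *R*_{0} as neighbors are used up, which is how the lattice captures local target cell saturation.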

Code and data. The statistical analysis and stochastic ODE model simulations were conducted using the R programming language (CRAN). Heat map and contour figures were made using the matplotlib, scipy, and numpy modules in Python. The agent-based model was programmed in C++. The code and data are available at https://github.com/bryanmayer/CMV-Transient-Infections .

## ACKNOWLEDGMENTS

We are grateful to the families who participated in this study and the staff of the Uganda Cancer Institute, who assisted with the fieldwork.

The work was supported by the National Institutes of Health Roadmap KL2 Clinical Scholar Training Program (grant KL2-RR025015 to S.G.), the University of Washington Center for AIDS Research (New Investigator Awards P30-AI027757 to S.G.; P01-AI030731 to L.C., A.W., S.S., J.T.S., and M.-L.H.; R01-CA138165 to C.C.; and P30-CA015704 to C.C. and L.C.), a Canadian Institutes of Health Research Grant (MOP 136825 to S.G.), and a BC Children's Hospital Foundation Award (to S.G.). The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

A.W. received personal fees from Aicuris, Amgen, Eisai, UpToDate, and Admedus and support from Agenus, Genentech, Genocea, Gilead, and Vical. C.C. received grants, personal fees, and nonfinancial support from Janssen Pharmaceuticals and grants and nonfinancial support from GSK and TempTime. L.C. received personal fees from Immune Design and has a licensed patent. S.G. received personal fees from Omeros Corp. and grants from VBI Vaccines Inc.

## FOOTNOTES

- Received 6 March 2017.
- Accepted 27 March 2017.
- Accepted manuscript posted online 5 April 2017.
Supplemental material for this article may be found at https://doi.org/10.1128/JVI.00380-17 .

- Copyright © 2017 American Society for Microbiology.