Thursday Morning Session (July 24, 08:30-12:00)
Moderator: Bill Nelson

In the midst of an outbreak or sustained epidemic, reliable prediction of transmission risks and patterns of spread is critical to inform public health programs. Projections of transmission growth or decline among specific risk groups can aid in optimizing interventions, particularly when resources are limited. Phylogenetic trees have been widely used in the detection of transmission chains and high-risk populations. Moreover, tree topology and the incorporation of population parameters (phylodynamics) can be useful in reconstructing the evolutionary dynamics of an epidemic across space and time among individuals. However, we have recently demonstrated that existing phylogenetic resources are limited in their ability to detect risk groups within the population that exhibit distinct and/or dynamic transmission patterns (including transmission decline). We have, therefore, pursued phylogeny-based deep learning systems (DeepDynaForecast and DeepDynaTree) to learn the topological patterns predictive of these small group dynamics. Our DeepDyna series leverages a primal-dual graph learning structure with shortcut multi-layer aggregation, which has been applied to simulated viral and bacterial outbreak data and to empirical, large-scale data from the human immunodeficiency virus (HIV) epidemic in Florida between 2012 and 2020. Our frameworks have demonstrated remarkable accuracy in the face of transmission and sampling stochasticity and provided insight into age-structured transmission differences that can inform HIV outreach strategies.

Abstract to follow

Coffee Break

Abstract to follow

Phylodynamic analyses using BEAST 2 provide many techniques and results that can be useful for pathogen surveillance programs. However, BEAST 2 involves many manual steps, which render it less suitable for high-throughput research or operations, such as the needs of public health agencies. From generating an XML-format file for a model in the BEAUti GUI and selecting MCMC runs for convergence within the model’s parameter space, to finally producing figures and tables summarising results, these manual steps can require considerable training and cannot be run inside a lab’s automated genomics workflows. With the aim of automating and streamlining the manual steps involved in using BEAST 2, our team developed BEAST_pype, a Python-based pipeline. We will present how we piloted this pipeline to expedite production of phylodynamic surveillance results on emerging SARS-CoV-2 variant dynamics for use in PHAC’s routine COVID-19 forecast modelling. BEAST_pype’s modular nature means that new workflows can be generated for other uses, whilst its diagnostic notebook makes optimising convergence of MCMC runs easier. BEAST_pype uses template XMLs as an input for generating new XMLs, making it easily extendable to other models and ad hoc analyses. As a further example, we showcase a new BEAST_pype workflow for mpox outbreak analytics. We plan to use BEAST_pype’s modularity and expandability to produce workflows for other pathogens and more models.





Thursday Afternoon Session (July 24, 13:00-14:30)
Moderator: Bill Nelson

Recombination is the exchange of genetic material between individual genomes. Many rapidly evolving human viruses have relatively high rates of recombination, including HIV-1 and SARS-CoV-2. However, it is not common practice to screen for recombination before carrying out phylodynamic analysis. This is surprising because the premise of phylodynamics is that we can reconstruct epidemiological processes from the shape of a phylogenetic tree. It is also well known that recombination tends to alter the shapes of trees (making them more "star-like") when it is not accounted for in phylogenetic reconstruction. I will show recent findings from my group that recombination can cause the birth-death SIR (BDSIR) model to systematically overestimate the basic reproduction number (R0). We used BEAST2 to both simulate data under the BDSIR model and to re-estimate the model parameters from the simulations with varying amounts of recombination. Next, I describe some challenges in detecting recombination from virus genome data. I will present our recent work on using ancestral recombination graphs to infer reassortment events in avian influenza virus genomes. Reassortment is a form of recombination that involves the exchange of genomic segments instead of crossovers between homologous nucleic acids. This evolutionary mechanism has played a significant role in the emergence of new pandemic strains of influenza virus, such as H1N1 pdm09. Lastly, I will describe a dynamic stochastic blockmodeling approach that we have adapted from the field of network science to detect recombination in HIV-1 genomes, and our ongoing work on extending this method in a Bayesian framework.

Many disease outbreaks in humans are thought to have been enabled by pathogen recombination, the rate of which varies widely. One reason for this variation is that recombination (inclusive of reassortment) occurs during multiple infection of the same host. This means that processes which influence the incidence of disease in the host population affect the rate at which pathogen genetic material is shuffled. We develop eco-evolutionary models to investigate how ecological traits of a host (e.g., mean lifetime, immune investment) influence the rate at which pathogen genetic material is recombined, and the consequences of this recombination for the pathogen's emergence in a novel host. Using approximations from population genetics, we find support for the idea that pathogens of short-lived, acutely infected hosts (e.g., rodents) should recombine most frequently. This variation in recombination rate is explained by differences in the density of infections and, thus, the extent of co-infection at equilibrium. Using highly pathogenic avian influenza data, we test this prediction. We find that, in agreement with the predicted relationship, the extent of linkage disequilibrium between mutations on different segments of the flu genome increases with host body size.





Friday Morning Session (July 25, 08:30-12:00)
Moderator: Troy Day

Phylodynamics requires constructing a bridge between dynamic models and genome data. One route for this bridge lies through the genealogy that describes the patterns of shared ancestry among sampled genomes. A key problem in phylodynamics has been a mismatch between inference methodology and epidemiological models: the approximations that must be made to perform inference conflict with questions of great interest. I will describe new results in which we have obtained exact expressions for phylodynamic likelihoods associated with population models of (almost) arbitrary complexity. These results unify and strictly extend existing approaches and broaden the scope of phylodynamic inference methods. In particular, I will deduce an exact expression for the likelihood of an observed genealogy, as the solution to a well-defined filter equation. The most widely used existing approaches to phylodynamics are seen to be very special cases of these equations. Interestingly, the equations can be solved numerically using standard Monte Carlo techniques. I will conclude by highlighting the need for improved algorithms and indicating some open questions.

The latent viral reservoir (LVR) consists of transcriptionally-inactive HIV-1 proviruses within long-lived resting CD4+ T-cells, in which proviruses can persist even under fully-suppressive antiretroviral therapy (ART). The presence of this reservoir is the primary reason for viral resurgence upon ART interruption. Gaining a deeper understanding of proviral dynamics in the LVR is critical for the development of HIV cure strategies. We have developed a probabilistic framework to characterize clonal expansion as a key factor driving the expansion and maintenance of the LVR.

The common assumption that identical proviral sequences with unknown integration sites are clonal overlooks the possibility that they may represent independent integration events and remain identical by chance. Clonality is a pairwise attribute, not an intrinsic attribute of each provirus as implied by traditional measures of clonality. The probability that a pair of genetically identical proviruses (I) are clonal (C) can be written as P(C|I)=P(I|C)P(C)/P(I). Ignoring somatic mutation of integrated proviruses, we assume P(I|C)=1. P(I) is decomposed into P(C)+P(I|¬C)P(¬C), where P(C) and P(¬C) are determined by the coalescence - i.e., effective population size (Ne) and generation time - of cellular lineages in the LVR, and P(I|¬C) by the coalescence and mutation of viral lineages pre-ART. The relative contributions of these terms are determined by the proviral integration date, which we estimate by root-to-tip regression.
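
The decomposition above can be illustrated with a minimal Python sketch of Bayes' rule. It is not the authors' implementation: the function name and the numeric inputs are hypothetical, and in the framework described here P(C) and P(I|¬C) would come from coalescent and mutation models rather than being supplied by hand.

```python
def prob_clonal_given_identical(p_clonal, p_identical_given_not_clonal,
                                p_identical_given_clonal=1.0):
    """Return P(C|I) = P(I|C) P(C) / [P(I|C) P(C) + P(I|not C) P(not C)].

    The default P(I|C) = 1 reflects the stated assumption that somatic
    mutation of integrated proviruses is ignored.
    """
    inputs = (p_clonal, p_identical_given_not_clonal, p_identical_given_clonal)
    if any(not 0.0 <= p <= 1.0 for p in inputs):
        raise ValueError("all probabilities must lie in [0, 1]")

    numerator = p_identical_given_clonal * p_clonal
    # P(I) by the law of total probability over clonal / not clonal.
    p_identical = numerator + p_identical_given_not_clonal * (1.0 - p_clonal)
    if p_identical == 0.0:
        raise ValueError("P(I) is zero, so P(C|I) is undefined")
    return numerator / p_identical


# Hypothetical inputs, chosen only to show the behaviour of the formula.
for p_not_clonal_identical in (0.0, 0.1, 0.5, 1.0):
    posterior = prob_clonal_given_identical(0.2, p_not_clonal_identical)
    print(p_not_clonal_identical, round(posterior, 3))
```

When P(I|¬C) is zero, identity implies clonality and the posterior is 1; as P(I|¬C) approaches 1, identity carries no information and the posterior falls back to the prior P(C).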

Analytical results indicate that the conditional probability of clonality given identity, P(C|I), is substantially lower for proviruses integrated near the time of ART initiation. We observed that P(C|I) increases with higher viral mutation rates, longer sequence lengths, and larger effective population sizes (Ne). These findings suggest that the contribution of clonal expansion to the maintenance of the LVR may be overestimated. To explore this further, we applied our probabilistic framework to proviral sequence data from the RHSP and CAPRISA cohorts. The analysis supports the conclusion that current estimates may overstate the role of clonal expansion in sustaining the LVR. However, this framework has limitations, primarily due to uncertainty in key parameter estimates. Improved empirical characterization of these parameters is essential to refine model predictions and enhance the robustness of our conclusions.

Coffee Break

Recently developed particle filters enable simulation-based likelihood calculation for a general class of compartment models observed via dynamic tree-valued data. In many areas of machine learning and statistical inference, automatic differentiation and GPU computation have enabled the development of algorithms capable of using large amounts of data to fit complex models. For technical reasons, practical particle filter methodology has been slow to incorporate these advanced tools. We present a new approach to automatic differentiation of particle filters applicable to a broad class of partially observed dynamic systems. We discuss the application to phylodynamic inference.
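
For readers unfamiliar with simulation-based likelihood calculation, the following is a minimal sketch of a generic bootstrap particle filter in plain Python. It is not the method of this talk: the toy AR(1) state-space model, the function name, and all parameter values are assumptions made for illustration, and the data are scalar observations rather than trees. The discrete resampling step at the end of each iteration is one commonly cited obstacle to differentiating such filters.

```python
import math
import random


def bootstrap_filter_loglik(observations, n_particles=500, phi=0.9,
                            sigma_x=1.0, sigma_y=1.0, seed=0):
    """Estimate the log-likelihood of a toy model with a bootstrap filter.

    Assumed toy model (hypothetical, for illustration only):
        x_t = phi * x_{t-1} + Normal(0, sigma_x^2)   latent state
        y_t = x_t + Normal(0, sigma_y^2)             observation
    """
    rng = random.Random(seed)
    # Initialise particles from the stationary distribution of the AR(1) state.
    sd0 = sigma_x / math.sqrt(1.0 - phi ** 2)
    particles = [rng.gauss(0.0, sd0) for _ in range(n_particles)]
    log_norm = -0.5 * math.log(2.0 * math.pi * sigma_y ** 2)

    loglik = 0.0
    for y in observations:
        # Propagate each particle by simulating the latent dynamics.
        particles = [phi * x + rng.gauss(0.0, sigma_x) for x in particles]
        # Weight each particle by the observation density (log scale).
        log_w = [log_norm - 0.5 * ((y - x) / sigma_y) ** 2 for x in particles]
        max_log_w = max(log_w)
        weights = [math.exp(lw - max_log_w) for lw in log_w]
        # The mean weight estimates the conditional likelihood of y.
        loglik += max_log_w + math.log(sum(weights) / n_particles)
        # Multinomial resampling: a discrete, non-smooth operation.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return loglik


print(bootstrap_filter_loglik([0.3, -0.1, 0.8, 1.2, 0.5]))
```

The returned value is a Monte Carlo estimate, so it changes with the seed and the number of particles; increasing `n_particles` reduces its variance.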

Abstract to follow