Poisson Network SIR Epidemic Model

2024-12-30T23:46:00Z

We extend the classical Susceptible-Infected-Recovered (SIR) model to a network-based framework where the degree distribution of nodes follows a Poisson distribution. This extension incorporates an additional parameter representing the mean node degree, allowing for the inclusion of heterogeneity in contact patterns. Using this enhanced model, we analyze epidemic data from the 2018-20 Ebola outbreak in the Democratic Republic of the Congo, employing a survival approach combined with the Hamiltonian Monte Carlo method. Our results suggest that network-based models can capture the heterogeneity of epidemic dynamics more effectively than traditional compartmental models, without introducing an unduly complicated compartmental framework.
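
As a rough illustration of how a mean-degree parameter enters a network SIR model, the sketch below integrates a standard edge-based SIR formulation on a configuration-model network with Poisson degrees. This is an assumption made for illustration, not necessarily the formulation used in the paper. The function names, the per-edge transmission rate `beta`, the recovery rate `gamma`, the initial infected fraction `rho`, and all numerical values are hypothetical.

```python
import math


def simulate_poisson_network_sir(mean_degree, beta, gamma, rho=0.001,
                                 dt=0.01, t_max=100.0):
    """Forward-Euler integration of an edge-based SIR model on a Poisson network.

    theta is the probability that a random neighbour has not yet transmitted
    to a given node. For a Poisson degree distribution the generating function
    is psi(x) = exp(mean_degree * (x - 1)), so psi'(x) / psi'(1) = psi(x).
    A fraction rho of nodes, chosen at random, is infected at time zero.
    """
    theta, r = 1.0, 0.0
    s = 1.0 - rho
    i = rho
    trajectory = [(0.0, s, i, r)]
    for step in range(1, int(round(t_max / dt)) + 1):
        psi = math.exp(mean_degree * (theta - 1.0))
        # Edge-level dynamics: transmission, susceptible neighbours, recovery.
        d_theta = (-beta * theta
                   + beta * (1.0 - rho) * psi
                   + gamma * (1.0 - theta))
        d_r = gamma * i
        theta += dt * d_theta
        r += dt * d_r
        # Node-level fractions follow from theta and the recovered fraction.
        s = (1.0 - rho) * math.exp(mean_degree * (theta - 1.0))
        i = 1.0 - s - r
        trajectory.append((step * dt, s, i, r))
    return trajectory


def final_size(mean_degree, beta, gamma):
    """Recovered fraction at the end of the simulated time window."""
    return simulate_poisson_network_sir(mean_degree, beta, gamma)[-1][3]


if __name__ == "__main__":
    # Illustrative values only, not fitted to any outbreak data.
    for mean_degree in (1.0, 4.0):
        print(mean_degree, final_size(mean_degree, beta=0.5, gamma=0.25))
```

In this formulation an outbreak takes off only when `mean_degree * beta / (beta + gamma)` exceeds one, so the mean degree acts as a contact-pattern parameter alongside the usual transmission and recovery rates.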

Frequency-dependent returns in nonlinear public goods games

2024-12-30T20:21:04Z

When individuals interact in groups, the evolution of cooperation is traditionally modeled using the framework of public goods games. These models often assume that the return of the public good depends linearly on the fraction of contributors. In contrast, in real-life public goods interactions, the return can depend on the size of the investor pool as well. Here, we consider a model in which the multiplication factor (marginal per capita return) for the public good depends linearly on how many contribute, which results in a nonlinear model of public goods. This simple model breaks the curse of dominant defection found in linear public goods interactions and gives rise to richer dynamical outcomes in evolutionary settings. We provide an in-depth analysis of the more varied decisions by the classical rational player in nonlinear public goods interactions as well as a mechanistic, microscopic derivation of the evolutionary outcomes for the stochastic dynamics in finite populations and in the deterministic limit of infinite populations. This kind of nonlinearity provides a natural way to model public goods with diminishing returns as well as economies of scale.
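
The effect of a contributor-dependent multiplication factor on a rational player's decision can be sketched as follows. The parameterization is an assumption made for illustration: the factor is interpolated linearly from a hypothetical value `r_first` (one contributor) to `r_last` (all `n` group members contribute), each contributor pays `cost`, and the pot is shared equally among all group members. The paper's own parameterization may differ.

```python
def multiplication_factor(k, n, r_first, r_last):
    """Multiplication factor with k contributors in a group of n >= 2 players."""
    return r_first + (r_last - r_first) * (k - 1) / (n - 1)


def payoffs(k, n, r_first, r_last, cost=1.0):
    """Return (defector payoff, contributor payoff) when k players contribute."""
    if k == 0:
        share = 0.0
    else:
        share = multiplication_factor(k, n, r_first, r_last) * k * cost / n
    return share, share - cost


def cooperation_is_best_response(k_others, n, r_first, r_last, cost=1.0):
    """True if contributing pays more than defecting given k_others contributors."""
    defect_payoff, _ = payoffs(k_others, n, r_first, r_last, cost)
    _, contribute_payoff = payoffs(k_others + 1, n, r_first, r_last, cost)
    return contribute_payoff > defect_payoff


if __name__ == "__main__":
    n = 5
    # Constant factor (linear game): defection is the best response throughout.
    print([cooperation_is_best_response(k, n, 3.0, 3.0) for k in range(n)])
    # Increasing factor: contributing pays once enough others contribute.
    print([cooperation_is_best_response(k, n, 1.5, 4.5) for k in range(n)])
```

With a constant factor below the group size, the gain from switching to contribution is negative whatever the others do. Once the factor increases with the number of contributors, the best response depends on how many others contribute, which is the sense in which defection is no longer dominant.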

Derivations of Animal Movement Models with Explicit Memory

2024-12-29T20:38:21Z

Highly evolved animals continuously update their knowledge of social factors, refining movement decisions based on both historical and real-time observations. Despite its significance, research on the underlying mechanisms remains limited. In this study, we explore how the use of explicit memory shapes different mathematical models across various ecological dispersal scenarios. Specifically, we investigate three memory-based dispersal scenarios: gradient-based movement, where individuals respond to environmental gradients; environment matching, which promotes uniform distribution within a population; and location-based movement, where decisions rely solely on local suitability. These scenarios correspond to diffusion-advection, Fickian diffusion, and Fokker-Planck diffusion models, respectively. We focus on the derivation of these memory-based movement models using three approaches: spatial and temporal discretization, patch models in continuous time, and discrete-velocity jump processes. These derivations highlight how different ways of using memory lead to distinct mathematical models. Numerical simulations reveal that the three dispersal scenarios exhibit distinct behaviors under memory-induced repulsive and attractive conditions. The diffusion-advection and Fokker-Planck models display wiggle patterns and aggregation phenomena, while simulations of the Fickian diffusion model consistently stabilize to uniform constant states.

The asymptotic distribution of the $k$-Robinson-Foulds dissimilarity measure on labelled trees

2024-12-28T04:10:10Z

Motivated by applications in medical bioinformatics, Khayatian et al. (2024) introduced a family of metrics on Cayley trees (the $k$-RF distance, for $k=0, \ldots, n-2$) and explored their distribution on pairs of random Cayley trees via simulations. In this paper, we investigate this distribution mathematically, and derive exact asymptotic descriptions of the distribution of the $k$-RF metric for the extreme values $k=0$ and $k=n-2$, as $n$ becomes large. We show that a linear transform of the $0$-RF metric converges to a Poisson distribution (with mean 2) whereas a similar transform for the $(n-2)$-RF metric leads to a normal distribution (with mean $\sim ne^{-2}$). These results (together with the case $k=1$ which behaves quite differently, and $k=n-3$) shed light on the earlier simulation results, and the predictions made concerning them.

On the Estimation of the Time-Dependent Transmission Rate in Epidemiological Models

2024-12-27T21:33:50Z

The COVID-19 pandemic highlighted the need to improve the modeling, estimation, and prediction of how infectious diseases spread. SEIR-like models have been particularly successful in providing accurate short-term predictions. This study fills a notable literature gap by exploring the following question: Is it possible to incorporate a nonparametric susceptible-exposed-infected-removed (SEIR) COVID-19 model into the inverse-problem regularization framework when the transmission coefficient varies over time? Our positive response considers varying degrees of disease severity, vaccination, and other time-dependent parameters. In addition, we demonstrate the continuity, differentiability, and injectivity of the operator that links the transmission parameter to the observed infection numbers. By applying Tikhonov-type regularization to the corresponding inverse problem, we establish the existence and stability of regularized solutions. Numerical examples using both synthetic and real data illustrate the model's estimation accuracy and its ability to fit the data effectively.
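
The stabilizing role of Tikhonov-type regularization can be illustrated on a toy linear problem. The operator in the paper is nonlinear (it maps a transmission parameter to infection numbers), so the sketch below is only an analogy under that simplifying assumption. The matrix, the data, the perturbation, and the regularization weight `alpha` are all hypothetical.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[idx]] for idx, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row_idx: abs(aug[row_idx][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row_idx in range(col + 1, n):
            factor = aug[row_idx][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row_idx][c] -= factor * aug[col][c]
    x = [0.0] * n
    for row_idx in range(n - 1, -1, -1):
        tail = sum(aug[row_idx][c] * x[c] for c in range(row_idx + 1, n))
        x[row_idx] = (aug[row_idx][n] - tail) / aug[row_idx][row_idx]
    return x


def tikhonov_solve(a, b, alpha):
    """Minimise ||a x - b||^2 + alpha ||x||^2 via the regularised normal equations."""
    rows, cols = len(a), len(a[0])
    ata = [[sum(a[k][i] * a[k][j] for k in range(rows)) + (alpha if i == j else 0.0)
            for j in range(cols)] for i in range(cols)]
    atb = [sum(a[k][i] * b[k] for k in range(rows)) for i in range(cols)]
    return solve_linear_system(ata, atb)


def distance(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))


if __name__ == "__main__":
    a = [[1.0, 1.0], [1.0, 1.0001]]   # nearly singular forward operator
    x_true = [1.0, 1.0]               # exact data would be [2.0, 2.0001]
    b_noisy = [2.0001, 2.0]           # data perturbed by 1e-4 in each entry
    naive = solve_linear_system(a, b_noisy)
    regularised = tikhonov_solve(a, b_noisy, alpha=1e-3)
    print("naive error:      ", distance(naive, x_true))
    print("regularised error:", distance(regularised, x_true))
```

A data perturbation of order 1e-4 moves the naive solution far from the true one, while the regularized solution stays close. That is the kind of stability the abstract refers to, here in a much simpler setting.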

Bottom-up robust modeling for the foraging behavior of Physarum polycephalum

2024-12-27T18:44:14Z

The true slime mold \textit{Physarum polycephalum} has the remarkable capability to perform self-organized activities such as network formation among food sources. Although existing models reproduce the emergence of slime networks well, they are limited in their ability to identify the minimal mechanisms, at the microscopic scale, that ensure robust problem-solving capabilities at the macroscopic scale. To this end, we develop three progressively more complex multi-agent models to provide a flexible framework to understand the self-organized foraging and network formation behaviors of \textit{Physarum}. The hierarchy of models allows for a stepwise investigation of the minimal set of rules that allow bio-inspired computing agents to achieve the desired behaviors on nutrient-poor substrates. By introducing a quantitative measure of connectedness among food sources, we assess the sensitivity of the model to user-defined and bio-inspired parameters, as well as the robustness of the model to parameter heterogeneity across agents. We ultimately observe the robust emergence of pattern formation, in line with experimental evidence. Overall, our study sheds light on the basic mechanisms of self-organization and paves the way towards the development of decentralized strategies for network formation in engineered systems, focusing on trade-offs between biological fidelity and computational efficiency.

Global Prediction of COVID-19 Variant Emergence Using Dynamics-Informed Graph Neural Networks

2024-12-27T16:43:44Z

During the COVID-19 pandemic, a major driver of new surges has been the emergence of new variants. When a new variant emerges in one or more countries, other nations monitor its spread in preparation for its potential arrival. The impact of the new variant and the timings of epidemic peaks in a country highly depend on when the variant arrives. The current methods for predicting the spread of new variants rely on statistical modeling; however, these methods work only when the new variant has already arrived in the region of interest and has a significant prevalence. Can we predict when a variant existing elsewhere will arrive in a given region? To address this question, we propose a variant-dynamics-informed Graph Neural Network (GNN) approach. First, we derive the dynamics of variant prevalence across pairs of regions (countries) that apply to a large class of epidemic models. The dynamics motivate the introduction of certain features in the GNN. We demonstrate that our proposed dynamics-informed GNN outperforms all the baselines, including the currently pervasive framework of Physics-Informed Neural Networks (PINNs). To advance research in this area, we introduce a benchmarking tool to assess a user-defined model's prediction performance across 87 countries and 36 variants.

The impact of simultaneous infections on phage-host ecology

2024-12-27T16:16:24Z

Phages use bacterial host resources to replicate, intrinsically linking phage and host survival. To understand phage dynamics, it is essential to understand phage-host ecology. A key step in this ecology is infection of bacterial hosts. Previous work has explored single and multiple, sequential infections. Here we focus on the theory of simultaneous infections, where multiple phages simultaneously attach to and infect one bacterial host cell. Simultaneous infections are a relevant infection dynamic to consider, especially at high phage densities when many phages attach to a single host cell in a short time window. For high bacterial growth rates, simultaneous infection can result in bi-stability: depending on initial conditions phages go extinct or co-exist with hosts, either at stable densities or through periodic oscillations of a stable limit cycle. This bears important consequences for phage applications such as phage therapy: phages can persist even though they cannot invade. Consequently, through spikes in phage densities it is possible to infect a bacterial population even when the phage basic reproductive number is less than one. In the regime of stable limit cycles, if timed right, only small densities of phage may be necessary.

Predicting high dengue incidence in municipalities of Brazil using path signatures

2024-12-27T14:38:50Z

Predicting whether to expect a high incidence of infectious diseases is critical for health surveillance. In the epidemiology of dengue, environmental conditions can significantly impact the transmission of the virus. Utilizing epidemiological indicators alongside environmental variables can enhance predictions of dengue incidence risk. This study analyzed a dataset of weekly case numbers, temperature, and humidity across Brazilian municipalities to forecast the risk of high dengue incidence using data from 2014 to 2023. The framework involved constructing path signatures and applying lasso regression for binary outcomes. Sensitivity reached 75%, while specificity was extremely high, ranging from 75% to 100%. The best performance was observed with information gathered after 35 weeks of observations using data augmentation via embedding techniques. The use of path signatures effectively captures the stream of information given by epidemiological and climate variables that influence dengue transmission. This framework could be instrumental in optimizing resources to predict high dengue risk in municipalities in Brazil and, once country-specific patterns have been learned, in other countries.
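
To make the notion of a path signature concrete, the sketch below computes the first two signature levels of a piecewise-linear path by accumulating segment contributions with Chen's identity. The truncation at level two, the function name, and the toy path are assumptions made for illustration. The abstract does not state the truncation depth or the embedding used in the study, and the numbers below are not data from it.

```python
def signature_level_two(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    path is a list of points, each a list of d coordinates (for example
    weekly cases, temperature, humidity). Returns (level1, level2) where
    level1[i] is the total increment of coordinate i and level2[i][j] is
    the iterated integral of coordinate i against coordinate j.
    """
    dim = len(path[0])
    level1 = [0.0] * dim
    level2 = [[0.0] * dim for _ in range(dim)]
    for start, end in zip(path, path[1:]):
        delta = [e - s for s, e in zip(start, end)]
        # Chen's identity for appending one straight segment.
        for i in range(dim):
            for j in range(dim):
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(dim):
            level1[i] += delta[i]
    return level1, level2


if __name__ == "__main__":
    # Toy two-dimensional path: move right by one unit, then up by one unit.
    toy_path = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
    level1, level2 = signature_level_two(toy_path)
    print(level1)   # total increments of each coordinate
    print(level2)   # order-sensitive interaction terms
```

The level-two terms depend on the order in which the coordinates move, which is what lets signature features summarize how several time series evolve jointly. Flattened, such terms could serve as inputs to a sparse classifier such as the lasso regression mentioned in the abstract.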

Impact of phylogeny on the inference of functional sectors from protein sequence data

2024-12-27T10:50:33Z

Statistical analysis of multiple sequence alignments of homologous proteins has revealed groups of coevolving amino acids called sectors. These groups of amino-acid sites feature collective correlations in their amino-acid usage, and they are associated with functional properties. Modeling showed that nonlinear selection on an additive functional trait of a protein is generically expected to give rise to a functional sector. These modeling results motivated a principled method, called ICOD, which is designed to identify functional sectors, as well as mutational effects, from sequence data. However, a challenge for all methods aiming to identify sectors from multiple sequence alignments is that correlations in amino-acid usage can also arise from the mere fact that homologous sequences share common ancestry, i.e. from phylogeny. Here, we generate controlled synthetic data from a minimal model comprising both phylogeny and functional sectors. We use this data to dissect the impact of phylogeny on sector identification and on mutational effect inference by different methods. We find that ICOD is most robust to phylogeny, but that conservation is also quite robust. Next, we consider natural multiple sequence alignments of protein families for which deep mutational scan experimental data is available. We show that in this natural data, conservation and ICOD best identify sites with strong functional roles, in agreement with our results on synthetic data. Importantly, these two methods have different premises, since they respectively focus on conservation and on correlations. Thus, their joint use can reveal complementary information.

Estimation of System Parameters Including Repeated Cross-Sectional Data through Emulator-Informed Deep Generative Model

2024-12-27T08:19:23Z

Differential equations (DEs) are crucial for modeling the evolution of natural or engineered systems. Traditionally, the parameters in DEs are adjusted to fit data from system observations. However, in fields such as politics, economics, and biology, available data are often independently collected at distinct time points from different subjects (i.e., repeated cross-sectional (RCS) data). Conventional optimization techniques struggle to accurately estimate DE parameters when RCS data exhibit various heterogeneities, leading to a significant loss of information. To address this issue, we propose a new estimation method called the emulator-informed deep-generative model (EIDGM), designed to handle RCS data. Specifically, EIDGM integrates a physics-informed neural network-based emulator that immediately generates DE solutions and a Wasserstein generative adversarial network-based parameter generator that can effectively mimic the RCS data. We evaluated EIDGM on exponential growth, logistic population models, and the Lorenz system, demonstrating its superior ability to accurately capture parameter distributions. Additionally, we applied EIDGM to an experimental dataset of Amyloid beta 40 and beta 42, successfully capturing diverse parameter distribution shapes. This shows that EIDGM can be applied to model a wide range of systems and extended to uncover the operating principles of systems based on limited data.

The Accumulation of Beneficial Mutations and Convergence to a Poisson Process

2024-12-26T17:23:49Z

We consider a model of a population with fixed size $N$, which is subjected to an unlimited supply of beneficial mutations at a constant rate $μ_N$. Individuals with $k$ beneficial mutations have the fitness $(1+s_N)^k$. Each individual dies at rate 1 and is replaced by a random individual chosen with probability proportional to its fitness. We show that when $μ_N \ll 1/(N \log N)$ and $N^{-η} \ll s_N \ll 1$ for some $η< 1$, the fixation times of beneficial mutations, after a time scaling, converge to the times of a Poisson process, even though for some choices of $s_N$ and $μ_N$ satisfying these conditions, there will sometimes be multiple beneficial mutations with distinct origins in the population, competing against each other.
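
As a much simpler building block than the multi-mutation regime analyzed in the paper, the sketch below looks at a single beneficial mutation with fitness $1+s$ in a population of $N$ individuals under the replacement rule described above, and compares a Monte Carlo estimate of its fixation probability with the classical Moran-model formula. The function names, the seed, and the parameter values are hypothetical. The sketch ignores new mutations arising during the sweep, which is an assumption the paper does not make.

```python
import random


def fixation_probability(n, s):
    """Classical fixation probability of one mutant with fitness 1 + s."""
    if s == 0.0:
        return 1.0 / n
    ratio = 1.0 / (1.0 + s)
    return (1.0 - ratio) / (1.0 - ratio ** n)


def simulate_fixation(n, s, trials, seed=0):
    """Estimate the fixation probability by simulating the embedded jump chain.

    With i mutants, a wild type dies and is replaced by a mutant with
    probability proportional to (n - i) * (1 + s) * i, and a mutant dies and
    is replaced by a wild type with probability proportional to i * (n - i).
    Conditioned on a change, the count increases with probability
    (1 + s) / (2 + s).
    """
    rng = random.Random(seed)
    p_up = (1.0 + s) / (2.0 + s)
    fixed = 0
    for _ in range(trials):
        mutants = 1
        while 0 < mutants < n:
            mutants += 1 if rng.random() < p_up else -1
        fixed += mutants == n
    return fixed / trials


if __name__ == "__main__":
    n, s = 20, 0.5
    print("formula:   ", fixation_probability(n, s))
    print("simulation:", simulate_fixation(n, s, trials=5000))
```

The simulation tracks only the changes in the mutant count, not the waiting times between events, so it says nothing about the time scaling under which the paper obtains a Poisson process.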

Identifiability of the spatial SEIR-HCD model of COVID-19 propagation

2024-12-25T09:58:22Z

This paper investigates the identifiability of a spatial mathematical model of the spread of fast-moving epidemics based on the law of mass action and diffusion processes. The approach combines global Sobol sensitivity analysis with a Bayesian approach, which together made it possible to narrow the admissible ranges of the unknown parameters before solving the parameter identification problem from measurements of the numbers of detected, critical, and fatal cases. It is shown that identifying the diffusion coefficients, which govern the rate at which individuals move through space, requires additional information about the process.

PhyloGen: Language Model-Enhanced Phylogenetic Inference via Graph Structure Generation

2024-12-25T08:33:05Z

Phylogenetic trees elucidate evolutionary relationships among species, but phylogenetic inference remains challenging due to the complexity of combining continuous (branch lengths) and discrete parameters (tree topology). Traditional Markov Chain Monte Carlo methods face slow convergence and computational burdens. Existing Variational Inference methods, which require pre-generated topologies and typically treat tree structures and branch lengths independently, may overlook critical sequence features, limiting their accuracy and flexibility. We propose PhyloGen, a novel method leveraging a pre-trained genomic language model to generate and optimize phylogenetic trees without dependence on evolutionary models or aligned sequence constraints. PhyloGen views phylogenetic inference as a conditionally constrained tree structure generation problem, jointly optimizing tree topology and branch lengths through three core modules: (i) Feature Extraction, (ii) PhyloTree Construction, and (iii) PhyloTree Structure Modeling. Meanwhile, we introduce a Scoring Function to guide the model towards a more stable gradient descent. We demonstrate the effectiveness and robustness of PhyloGen on eight real-world benchmark datasets. Visualization results confirm PhyloGen provides deeper insights into phylogenetic relationships.

Metabolic scaling, life history, and the equal fitness paradigm

2024-12-25T02:29:45Z

Natural selection has produced an extraordinary diversity of life histories spanning many orders of magnitude in body size, vital rates, and biological times. In general, big and cold organisms grow and reproduce slowly and live long lives; small and warm organisms grow and reproduce quickly and live short lives. The Metabolic Theory of Ecology (MTE) predicts equal and opposite scaling exponents of mass-specific biological rates (e.g., respiration, growth, and reproduction) and times (e.g., development, lifespan, and generation) as a function of size. However, empirical support for these predictions varies depending on trait and taxon. Here I: 1) provide background and mixed support for the quarter-power scaling exponents for life history rates and times predicted by MTE, 2) discuss possible explanations, including effects of natural selection on taxonomic and functional groups, and inadequate data for life history traits, 3) briefly summarize the Equal Fitness Paradigm (EFP) as a unifying theory of bioenergetics, life history and demography that does not depend on any particular allometric scalings, and 4) discuss ramifications of the EFP for other biological phenomena, including physiological performance metrics and trophic energetics of ecosystems. I draw mostly from my knowledge of mammals, yet in many cases the mammalian examples can be generalized to other organisms. I end with prospects for further evaluating and extending the EFP.
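
The "equal and opposite" quarter-power prediction mentioned above can be written compactly as follows. The symbols are assumptions introduced here for illustration: $B$ is a whole-organism biological rate, $M$ is body mass, and $t$ is a biological time. Temperature dependence is omitted.

```latex
\frac{B}{M} \propto M^{-1/4}, \qquad
t \propto M^{1/4}, \qquad
\frac{B}{M}\, t \propto M^{-1/4} \cdot M^{1/4} = M^{0}
```

Under these exponents, the product of a mass-specific rate and a biological time is independent of body size.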