http://arxiv.org/abs/2601.09634v1Human Ancestries Simulation and Inference: a Review of Ancestral Recombination Graph Samplers2026-01-14T17:09:42ZThere is little debate about the importance of the ancestral recombination graph in population genetics. An important theoretical tool, the main obstacle to its widespread usage is the computational cost required to match the ever-increasing scale of the data being analyzed. Many of these difficulties have been overcome in the past two decades, which have consequently seen the development of increasingly sophisticated ARG simulation and inference software. Nonetheless, challenges remain, especially in the area of ancestry inference. This paper is a comprehensive review of ARG samplers that have emerged in the past three decades to meet the need for scalable and flexible ancestry simulation and inference solutions. It specifically focuses on their performance, usability, and the biological realism of the underlying algorithm, and aims primarily to provide a technical overview of the field for researchers seeking to write their own coalescent-with-recombination sampler. As a complement to this article, we have compiled links to software, source code and documentation and made them available at https://www.patrickfournier.ca/arg-samplers-review/graph.2026-01-14T17:09:42ZPatrick FournierFabrice Larribehttp://arxiv.org/abs/2411.06500v4Graph Neural Network Surrogates to leverage Mechanistic Expert Knowledge towards Reliable and Immediate Pandemic Response2026-01-14T15:26:51ZDuring the COVID-19 crisis, mechanistic models have guided evidence-based decision making. However, time-critical decisions in a dynamical environment limit the time available to gather supporting evidence. We address this bottleneck by developing a graph neural network (GNN) surrogate of an age-structured and spatially resolved mechanistic metapopulation simulation model. 
This combined approach complements classical modeling approaches which are mostly mechanistic and purely data-driven machine learning approaches which are often black box. Our design of experiments spans outbreak and persistent-threat regimes, up to three contact change points, and age-structured contact matrices on a spatial graph with 400 nodes representing German counties. We benchmark multiple GNN layers and identify an ARMAConv-based architecture that offers a strong accuracy-runtime trade-off. Across horizons of 30-90 day simulation and prediction, allowing up to three contact change points, the surrogate model attains 10-27 \% mean absolute percentage error (MAPE) while delivering (near) constant runtime with respect to the forecast horizon. Our approach accelerates evaluation by up to 28,670 times compared with the mechanistic model, allowing responsive decision support in time-critical scenarios and straightforward web integration. These results show how GNN surrogates can translate complex metapopulation models into immediate, reliable tools for pandemic response.2024-11-10T15:54:09Z20 pages, 9 figuresAgatha SchmidtHenrik ZunkerAlexander HeinleinMartin J. Kühnhttp://arxiv.org/abs/2601.09537v1Gene genealogies in haploid populations evolving according to sweepstakes reproduction2026-01-14T14:57:44ZSweepstakes reproduction may be generated by chance matching of reproduction with favorable environmental conditions. Gene genealogies generated by sweepstakes reproduction are in the domain of attraction of multiple-merger coalescents where a random number of lineages merges at such times. We consider population genetic models of sweepstakes reproduction for haploid panmictic populations of both constant ($N$), and varying population size, and evolving in a random environment. 
We construct our models so that we can recover the observed number of new mutations in a given sample without requiring strong assumptions regarding the population size or the mutation rate. Our main results are {\it (i)} continuous-time coalescents that are either the Kingman coalescent or specific families of Beta- or Poisson-Dirichlet coalescents; when combining the results the parameter $α$ of the Beta-coalescent ranges from 0 to 2, and the Beta-coalescents may be incomplete due to an upper bound on the number of potential offspring an arbitrary individual may produce; {\it (ii)} in large populations we measure time in units proportional to either $ N/\log N$ or $N$ generations; {\it (iii)} incorporating fluctuations in population size leads to time-changed multiple-merger coalescents where the time-change does not depend on $α$; {\it (iv)} using simulations we show that in some cases approximations of functionals of a given coalescent do not match the ones of the ancestral process in the domain of attraction of the given coalescent; {\it (v)} approximations of functionals obtained by conditioning on the population ancestry (the ancestral relations of all gene copies at all times) are broadly similar (for the models considered here) to the approximations obtained without conditioning on the population ancestry.2026-01-14T14:57:44Z52 pages, 9 figuresBjarki Eldonhttp://arxiv.org/abs/2510.24955v2geohabnet: An R package for mapping habitat connectivity for biosecurity and conservation2026-01-14T02:59:42ZMapping habitat suitability, based on factors like host availability and environmental suitability, is a common approach to determining which locations are important for the spread of a species. Mapping habitat connectivity takes geographic analyses a step further, evaluating the potential roles of locations in biological invasions, pandemics, or species conservation. 
Locations with high habitat suitability may play a minor role in species spread if they are geographically isolated. Yet, a location with lower habitat suitability may play a major role in a species' spread if it acts as a bridge between regions that would otherwise be physically fragmented.
Here we introduce the geohabnet R package, which evaluates the potential importance of locations for the spread of species through habitat landscapes. geohabnet incorporates key factors such as dispersal probabilities and habitat suitability in a network framework, for better understanding habitat connectivity for host-dependent species, such as pathogens, arthropod pests, or pollinators.
geohabnet uses publicly available or user-provided datasets, six network centrality metrics, and a user-selected geographic scale. We provide examples using geohabnet for surveillance prioritization of emerging plant pests in Africa and the Americas. These examples illustrate how users can apply geohabnet for their species of interest and generate maps of the estimated importance of geographic locations for species spread.
geohabnet provides a quick, open-source, and reproducible baseline to quantify a species' habitat connectivity across a wide range of geographic scales and evaluates potential scenarios for the expansion of a species through habitat landscapes. geohabnet supports biosecurity programs, invasion science, and conservation biology when prioritizing management efforts for transboundary pathogens, pests, or endangered species.2025-10-28T20:37:36ZAaron I. Plex SuláKrishna KeshavAshish AdhikariRomaric A. Mouafo-TchindaJacobo RobledoStavan Nikhilchandra ShahKaren A. Garretthttp://arxiv.org/abs/2601.08538v1Beta-coalescents when sample size is large2026-01-13T13:24:26ZSweepstakes reproduction refers to a highly skewed individual recruitment success without involving natural selection and may apply to individuals in broadcast spawning populations characterised by Type III survivorship. We consider an extension of the model of sweepstakes reproduction for a haploid panmictic population of constant size $N$; the extension also works as an alternative to the Wright-Fisher model. Our model incorporates an upper bound on the random number of potential offspring (juveniles) produced by a given individual. Depending on how the bound behaves relative to the total population size, we obtain the Kingman coalescent, an incomplete Beta-coalescent, or the (complete) Beta-coalescent. We argue that applying such an upper bound is biologically reasonable. Moreover, we estimate the error of the coalescent approximation. The error estimates reveal that convergence can be slow, and small sample size can be sufficient to invalidate convergence, for example if the stated bound is of the form $N/\log N$. We use simulations to investigate the effect of increasing sample size on the site-frequency spectrum. When the limit is a Beta-coalescent, the site frequency spectrum will be as predicted by the limiting tree even though the full coalescent tree may deviate from the limiting one. 
When in the domain of attraction of the Kingman coalescent the effect of increasing sample size depends on the effective population size as has been noted in the case of the Wright-Fisher model. Conditioning on the population ancestry (the random ancestral relations of the entire population at all times) may have little effect on the site-frequency spectrum for the models considered here (as evidenced by simulation results).2026-01-13T13:24:26Z84 pages; 4 figuresJonathan A Chetwynd-DiggleBjarki Eldonhttp://arxiv.org/abs/2601.08370v1Tara Polaris expeditions: Sustained decadal observations of the coupled Arctic system in rapid transition2026-01-13T09:33:11ZThe coupled Arctic system is in rapid transition and is set to undergo further dramatic changes over the coming decades. These changes will lead most likely to an ice-free ocean in summer, expected before mid-century. The Arctic will become more strongly influenced by atmospheric and oceanographic processes characteristic of mid-latitudes, increasing the prevalence of contaminants and new biological species. This ongoing transition of the Arctic to a new state necessitates systematic monitoring of all sentinels (variables that make an essential contribution to characterizing the Earth's state) to improve our understanding of the system, enhance forecasting and support knowledge-based decisions. Here, we describe a sustained multi-decadal observation program to be implemented on the Tara Polar Station between 2026 and 2046. The monitoring program is designed as a series of year-long drift expeditions, called Tara Polaris, in the central Arctic Ocean, covering all seasons. The multidisciplinary data will bridge ecological, geochemical, biological, and physical parameters and processes in the atmosphere, sea ice and ocean. In addition, data collected with consistent methodologies over a 20-year period will make it possible to distinguish long-term trends from seasonal and interannual variability. 
In this paper, we discuss specific measurement challenges in each compartment (i.e., atmosphere, sea ice and ocean) along key sentinels and the most pressing scientific questions to be addressed. The expected outcomes of the Tara Polaris program will enable us to understand and quantify the main feedbacks of the coupled Arctic system, with their seasonal and interannual trends and spatial variability.2026-01-13T09:33:11ZMathieu ArdynaIGEMarcel NicolausIGEMarie-Noëlle HoussaisIGEJean-Christophe RautIGEHélène AngotIGEKelsey BissonLECOB, LECOBKristina A BrownLECOB, LECOBJ Michel FloresLECOB, LECOBPierre E GalandLECOB, LECOBJean-François GhiglioneKathy S LawFrançois RavettaJulia SchmaleJeroen E SonkeMarcel BabinMaxime GeoffroyLars-Eric Heimbürger-BoavidaConnie LovejoySøren RysgaardNina SchubackMartin VancoppenolleJean-Eric TremblayChris BowlerLee Karp-BossRomain Troubléhttp://arxiv.org/abs/2601.08062v1Combinatorial comparison of general galled trees, time-consistent galled trees, and simplex time-consistent galled trees2026-01-12T23:08:03ZRooted binary phylogenetic networks are extensions of rooted binary trees, adding reticulation nodes that are designed to represent evolutionary processes that involve hybridization events. Enumerative combinatorics studies have counted leaf-labeled phylogenetic networks in a variety of classes, finding that when the number of reticulations is fixed, the time-consistent galled trees are asymptotically less numerous than each of several network classes that had been previously examined. Here we provide enumerative results on two additional network classes: general galled trees and simplex time-consistent galled trees. We show that for a fixed number of galls, as the number of leaves goes to infinity, the asymptotic count of general galled trees is identical to that of time-consistent galled trees, whereas the count of simplex time-consistent galled trees is smaller. 
If the number of galls is not restricted, then the asymptotic approximations all differ: simplex time-consistent galled trees are less numerous than time-consistent galled trees, which are in turn less numerous than general galled trees. We also report a variety of additional results: recursions to count the studied networks with small numbers of leaves and a fixed number of galls, as well as enumerative results for unlabeled networks in the classes that we investigate.2026-01-12T23:08:03ZLily Agranat-TamirMichael FuchsBernhard GittenbergerNoah A. RosenbergKarthik V. Seetharamanhttp://arxiv.org/abs/2511.04417v2The selective advantage of neighborhood-aware mutants in Moran process2026-01-12T16:35:02ZEvolution occurs in populations of reproducing individuals. In stochastic descriptions of evolutionary dynamics, such as the Moran process, individuals are chosen randomly for birth and for death. If the same type is chosen for both steps, then the reproductive event is wasted, because the composition of the population remains unchanged. Here we introduce a new phenotype, which we call a replacer. Replacers are efficient competitors. When a replacer is chosen for reproduction, the offspring will always replace an individual of another type (if available). We determine the selective advantage of replacers in well-mixed populations and on one-dimensional lattices. We find that being a replacer substantially boosts the fixation probability of neutral and deleterious mutants. In particular, fixation probability of a single neutral replacer who invades a well-mixed population of size $N$ is of the order of $1/\sqrt N$ rather than the standard $1/N$. Even more importantly, replacers are much better protected against invasions once they have reached fixation. 
Therefore, replacers dominate the mutation selection equilibrium even if the phenotype of being a replacer comes at a substantial cost: curiously, for large population size and small mutation rate the relative fitness of a successful replacer can be as low as $1/e$.2025-11-06T14:46:17ZMichal PechoJosef TkadlecMartin A. Nowakhttp://arxiv.org/abs/2601.07403v1Modeling and analysis of a novel two-strain dengue epidemics model considering secondary infections with increased mortality2026-01-12T10:39:19ZIn this study, we develop and analyze a deterministic two-strain host-vector model for dengue transmission that incorporates key immuno-epidemiological mechanisms, including temporary cross-immunity, antibody-dependent enhancement (ADE), disease-induced mortality during secondary infections, and explicit vector co-infection. The human population is divided into compartments for primary and secondary infections, while the mosquito population includes single- and co-infected classes. ADE is modeled through distinct primary ($α$) and secondary ($σ$) transmission rates. Using the next-generation matrix method, we derive the basic reproduction number $R_0$ and establish the local stability of the disease-free equilibrium for $R_0 < 1$. Analytical results show that one-strain endemic equilibria lose stability under ADE conditions ($σ> α$), allowing invasion by a heterologous strain. Employing center-manifold theory and numerical continuation (COCO), we demonstrate the occurrence of backward bifurcation, bistability between disease-free and endemic states, and Hopf-induced oscillations. Numerical simulations confirm transitions among disease-free, endemic, and periodic regimes as key parameters vary. 
The model highlights how ADE, waning cross-immunity, and vector co-infection interact to generate complex dengue dynamics and provides insights useful for designing effective control and vaccination strategies in dengue-endemic regions.2026-01-12T10:39:19ZBurcu GürbüzAytül GökçeJoseph Páez ChávezThomas Götzhttp://arxiv.org/abs/2410.02634v3When is local search both effective and efficient?2026-01-10T00:34:51ZCombinatorial optimization problems implicitly define fitness landscapes that combine the numeric structure of the 'fitness' function to be maximized with the combinatorial structure of which assignments are 'adjacent'. Local search starts at an assignment in this landscape and successively moves assignments until no further improvement is possible among the adjacent assignments. Classic analyses of local search algorithms have focused more on the question of effectiveness ("did we find a good solution?") and often implicitly assumed that there are no doubts about their efficiency ("did we find it quickly?"). But there are many reasons to doubt the efficiency of local search. Even if we focus on fitness landscapes on the hypercube that are single peaked on every subcube (i.e., semismooth fitness landscapes) where effectiveness is obvious, many local search algorithms are known to be inefficient. Since fitness landscapes are unwieldy exponentially large objects, we focus on their polynomial-sized representations by instances of valued constraint satisfaction problems (VCSP). We define a "direction" for valued constraints such that directed VCSPs generate semismooth fitness landscapes. We call VCSPs oriented if they do not have any pair of variables with arcs in both directions. Since recognizing if a VCSP-instance is directed or oriented is coNP-complete, we generalized oriented VCSPs as conditionally-smooth fitness landscapes that are recognizable in polynomial time for a VCSP-instance. 
We prove that many popular local search algorithms like random ascent, simulated annealing, history-based rules, jumping rules, and the Kernighan-Lin heuristic are very efficient on conditionally-smooth landscapes. But conditionally-smooth landscapes are still expressive enough so that algorithms like steepest ascent and random facet require a super-polynomial number of steps to find the fitness peak.2024-10-03T16:16:51Z34 pgs; to appear at STACS2026Artem KaznatcheevSofia Vazquez Alferezhttp://arxiv.org/abs/2601.06354v1Designing a Resilient Allee-Ornstein-Uhlenbeck model2026-01-09T23:19:20ZIn stochastic population dynamics, stochastic wandering can produce transition to an absorbing state. In particular, under Allee effects, low densities amplify the possibility of population collapse. We investigate this in an Allee-Ornstein-Uhlenbeck (Allee-OU) model, that couples a bistable Allee growth equation, with demographic noise, and environmental fluctuations modeled as an Ornstein-Uhlenbeck process. This process replaces the bifurcation parameter of the deterministic Allee effect equation. In the model, small noise may induce escape from the safe basin around the positive equilibrium toward extinction. We construct a stochastic control, altering the process to have a stationary distribution. We enable tractable control design, approximating the process by one with a stationary distribution. Two controlled models are developed, one acting directly on population size and another also modulating the environment. A threshold-based implementation minimizes the frequency of interventions while maximizing safe time. Simulations demonstrate that the control stabilizes fluctuations around the equilibrium.2026-01-09T23:19:20ZLuis F. GordilloPriscilla E. 
Greenwoodhttp://arxiv.org/abs/2310.09825v2Mathematical Analysis of the role of Information on the Dynamics of Typhoid Fever2026-01-08T23:50:13ZThis study presents a deterministic model to examine how information affects the spread of Typhoid Fever. The model's properties, including its stability and basic reproduction number, are analyzed. Simulations show that information can influence behavior in ways that may increase disease transmission. Notably, the rise in Typhoid cases is linked to poor adherence to health precautions. The findings highlight the critical role of public education in controlling the disease and emphasize the need to include information campaigns in prevention strategies.2023-10-15T13:15:53Zhttp://www.oalib.com/journal , 2025Nyanga Honda MasasilaRigobert Charles NgelejaOdeli John Kigodi10.4236/oalib.1110109http://arxiv.org/abs/2601.05367v1The rights and wrongs of rescaling in population genetics simulations2026-01-08T20:47:29ZComputer simulations of complex population genetic models are an essential tool for making sense of the large-scale datasets of multiple genome sequences from a single species that are becoming increasingly available. A widely used approach for reducing computing time is to simulate populations that are much smaller than the natural populations that they are intended to represent, by using parameters such as selection coefficients and mutation rates, whose products with the population size correspond to those of the natural populations. This approach has come to be known as rescaling, and is justified by the theory of the genetics of finite populations. Recently, however, there have been criticisms of this practice, which have brought to light situations in which it can lead to erroneous conclusions. This paper reviews the theoretical basis for rescaling, and relates it to current practice in population genetics simulations. It shows that some population genetic statistics are scaleable while others are not. 
Additionally, it shows that there are likely to be problems with rescaling when simulating large chromosomal regions, due to the non-linear relation between the physical distance between a pair of separate nucleotide sites and the frequency of recombination between them. Other difficulties with rescaling can arise in connection with simulations of selection on complex traits, and with populations that reproduce partly by self-fertilization or asexual reproduction. A number of recommendations are made for good practice in relation to rescaling.2026-01-08T20:47:29ZParul JohriFanny PouyetBrian Charlesworthhttp://arxiv.org/abs/2601.05222v1Oscillatory Regimes in a Game-Theoretic Model for Mosquito Population Dynamics under Breeding Site Control2026-01-08T18:46:17ZMosquito-borne diseases remain a major public-health threat, and the effective control of mosquito populations requires sustained household participation in removing breeding sites. While environmental drivers of mosquito oscillations have been extensively studied, the influence of spontaneous household decision-making on the dynamics of mosquito populations remains poorly understood. We introduce a game-theoretic model in which the fraction of households performing breeding site control evolves through imitation dynamics driven by perceived risks. Household behavior regulates the carrying capacity of the aquatic mosquito stage, creating a feedback between control actions and mosquito population growth. For a simplified model with constant payoffs, we characterize four locally stable equilibria, corresponding to full or no household control and the presence or absence of mosquito populations. When the perceived risk of not controlling breeding sites depends on mosquito prevalence, the system admits an additional equilibrium with partial household engagement. 
We derive conditions under which this equilibrium undergoes a Hopf bifurcation, yielding sustained oscillations arising solely from the interaction between mosquito abundance and household behavior. Numerical simulations and parameter explorations further describe the amplitude and phase properties of these oscillatory regimes.2026-01-08T18:46:17ZMohammad Rubayet RahmanChanaka KottegodaLucas M. Stolermanhttp://arxiv.org/abs/2601.05193v1Cell size control in bacteria is modulated through extrinsic noise, single-cell- and population-growth2026-01-08T18:17:17ZLiving cells maintain size homeostasis by actively compensating for size fluctuations. Here, we present two stochastic maps that unify phenomenological models by integrating fluctuating single-cell growth rates and size-dependent noise mechanisms with cell size control. One map is applicable to mother machine lineages and the other to lineage trees of exponentially-growing cell populations, which reveals that population dynamics alter size control measured in mother machine experiments. For example, an adder can become more sizer-like or more timer-like at the population level depending on the noise statistics. Our analysis of bacterial data identifies extrinsic noise as the dominant mechanism of size variability, characterized by a quadratic conditional variance-mean relationship for division size across growth conditions. This finding contradicts the reported independence of added size relative to birth size but is consistent with the adder property in terms of the independence of the mean added size. Finally, we derive a trade-off between population-growth-rate gain and division-size noise. Correlations between size control quantifiers and single-cell growth rates inferred from data indicate that bacteria prioritize a narrow division-size distribution over growth rate maximisation.2026-01-08T18:17:17ZArthur GenthonPhilipp Thomas
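
The "adder" rule named in the last abstract above can be illustrated with a minimal sketch. This is not the authors' stochastic maps: it is the textbook adder assumption only, in which a cell adds a roughly fixed size between birth and division and then divides symmetrically. The function name `simulate_adder`, its parameters, and the Gaussian noise on the added size are hypothetical choices made for illustration.

```python
import random


def simulate_adder(birth_size, mean_added, generations, noise_sd=0.0, seed=0):
    """Return the birth sizes along one lineage under a simple adder rule.

    Assumptions (illustrative, not taken from the abstract): the size added
    between birth and division is mean_added plus Gaussian noise, and every
    division is perfectly symmetric.
    """
    rng = random.Random(seed)
    sizes = [birth_size]
    for _ in range(generations):
        # Size added between birth and division, independent of birth size.
        added = mean_added + rng.gauss(0.0, noise_sd)
        division_size = sizes[-1] + added
        # Symmetric division: the daughter starts at half the division size.
        sizes.append(division_size / 2.0)
    return sizes


if __name__ == "__main__":
    lineage = simulate_adder(birth_size=3.0, mean_added=1.0, generations=10)
    print(lineage)
```

Without noise the map reduces to s' = (s + Δ)/2, so any deviation of the birth size from the mean added size Δ is halved each generation; this is the sense in which an adder restores size homeostasis.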