A stochastic population model with hierarchic size-structure

2024-07-12T13:50:05Z

We consider a hierarchically structured population in which the amount of resources available to an individual is affected by the individuals that are larger, and in which an individual's resource intake directly affects only its own growth rate. We formulate a deterministic model, which takes the form of a delay equation for the population birth rate. We also formulate an individual-based stochastic model and study the relationship between the two models. In particular, the stationary birth rate of the deterministic model is compared with the quasi-stationary birth rate of the stochastic model. Since the quasi-stationary birth rate cannot be obtained explicitly, we derive a formula to approximate it. We show that the stationary birth rate of the deterministic model can be obtained as the large-population limit of the quasi-stationary birth rate of the stochastic model. This relation suggests that the deterministic model is a good approximation of the stochastic model when the number of individuals is sufficiently large.

Fence decompositions and cherry covers in non-binary phylogenetic networks

2024-07-11T18:50:53Z

Reticulate evolution can be modelled using phylogenetic networks. Tree-based networks, one of the more general classes of phylogenetic networks, have recently gained prominence for their ability to represent evolutionary histories with an underlying tree structure. To better understand tree-based networks, numerous characterizations have been proposed, based on tree embeddings, matchings, and arc partitions. Here, we build a bridge between two arc partition characterizations, namely maximal fence decompositions and cherry covers. Results on cherry covers have been found for general phylogenetic networks. We first show that, for a given semi-binary network, the number of cherry covers equals the number of support trees (the underlying tree structures of tree-based networks). Maximal fence decompositions have so far been defined only for binary networks (networks with constraints on vertex degrees). We remedy this by generalizing fence decompositions to non-binary networks and, using this generalization, characterize semi-binary tree-based networks in terms of forbidden structures. Furthermore, we give an explicit enumeration of the cherry covers of semi-binary networks by studying their fence decompositions. Finally, we prove that it is possible to characterize semi-binary tree-child networks, a subclass of tree-based networks, in terms of the number of their cherry covers.

A general mathematical framework for understanding the behavior of heterogeneous stem cell regeneration

2024-07-11T13:26:37Z

Stem cell heterogeneity is essential for homeostasis in tissue development. This paper establishes a general formulation for understanding the dynamics of stem cell regeneration with cell heterogeneity and random transitions of epigenetic states. The model generalizes the classical G0 cell cycle model and incorporates the epigenetic states of stem cells, represented by a continuous multidimensional variable, together with the kinetic rates of cell behaviors (proliferation, differentiation, and apoptosis), which depend on these epigenetic states. Moreover, the random transition of epigenetic states is represented by an inheritance probability that can be described as a conditional beta distribution. This model can be extended to investigate gene mutation-induced tumor development. The proposed formulation is general and helps us to understand various dynamic processes of stem cell regeneration, including tissue development, degeneration, and abnormal growth.

Modeling and Analysis of a Coupled SIS Bi-Virus Model

2024-07-10T19:52:40Z

The paper deals with the setting where two viruses (say virus 1 and virus 2) coexist in a population, and they are not necessarily mutually exclusive, in the sense that infection due to one virus does not preclude the possibility of simultaneous infection due to the other. We develop a coupled bi-virus susceptible-infected-susceptible (SIS) model from a 4^n-state Markov chain model, where n is the number of agents (i.e., individuals or subpopulations) in the population. We identify a sufficient condition for both viruses to eventually die out, and a sufficient condition for the existence, uniqueness and asymptotic stability of the endemic equilibrium of each virus. We establish a sufficient condition and multiple necessary conditions for local exponential convergence to the boundary equilibrium (i.e., one virus persists, the other one dies out) of each virus. Under mild assumptions on the healing rate, we show that there cannot exist a coexisting equilibrium where, for each node, there is a nonzero fraction infected only by virus 1 and a nonzero fraction infected only by virus 2, but no fraction infected by both viruses 1 and 2. Likewise, assuming that healing rates are strictly positive, a coexisting equilibrium where, for each node, there is a nonzero fraction infected by both viruses 1 and 2, but no fraction infected only by virus 1 (resp. virus 2), does not exist. Further, we provide a necessary condition for the existence of certain other kinds of coexisting equilibria. We show that, unlike the competitive bi-virus model, the coupled bi-virus model is not monotone. Finally, we illustrate our theoretical findings using an extensive set of in-depth simulations.
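
As a rough illustration of where the size of the underlying Markov chain comes from, the sketch below enumerates joint configurations under the assumption that each agent is in exactly one of four health states (susceptible, infected by virus 1 only, by virus 2 only, or by both), which gives 4^n configurations for n agents. The state labels and the function names `joint_states` and `count_states` are hypothetical and chosen only for this example; they are not the paper's notation.

```python
from itertools import product

# Per-agent health states (illustrative labels, not the paper's notation):
# "S" = susceptible, "1" = virus 1 only, "2" = virus 2 only, "12" = both viruses.
AGENT_STATES = ("S", "1", "2", "12")


def joint_states(n):
    """Return every joint configuration of n agents as a tuple of labels."""
    return list(product(AGENT_STATES, repeat=n))


def count_states(n):
    """Number of joint configurations, computed without enumerating them."""
    return len(AGENT_STATES) ** n


if __name__ == "__main__":
    for n in (1, 2, 3):
        # The enumerated count and the closed-form count should agree.
        print(n, len(joint_states(n)), count_states(n))
```

The exponential growth of this count with n is one common motivation for working with a lower-dimensional approximation instead of the full chain.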

Orchard: building large cancer phylogenies using stochastic combinatorial search

2024-07-10T00:32:49Z

Phylogenies depicting the evolutionary history of genetically heterogeneous subpopulations of cells from the same cancer, i.e., cancer phylogenies, offer valuable insights about cancer development and guide treatment strategies. Many methods exist that reconstruct cancer phylogenies using point mutations detected with bulk DNA sequencing. However, these methods become inaccurate when reconstructing phylogenies with more than 30 mutations, or, in some cases, fail to recover a phylogeny altogether. Here, we introduce Orchard, a cancer phylogeny reconstruction algorithm that is fast and accurate using up to 1000 mutations. Orchard samples without replacement from a factorized approximation of the posterior distribution over phylogenies, a novel result derived in this paper. Each factor in this approximate posterior corresponds to a conditional distribution for adding a new mutation to a partially built phylogeny. Orchard optimizes each factor sequentially, generating a sequence of incrementally larger phylogenies that ultimately culminate in a complete tree containing all mutations. Our evaluations demonstrate that Orchard outperforms state-of-the-art cancer phylogeny reconstruction methods in reconstructing more plausible phylogenies across 90 simulated cancers and 14 B-progenitor acute lymphoblastic leukemias (B-ALLs). Remarkably, Orchard accurately reconstructs cancer phylogenies using up to 1,000 mutations. Additionally, we demonstrate that the large and accurate phylogenies reconstructed by Orchard are useful for identifying patterns of somatic mutations and genetic variations among distinct cancer cell subpopulations.

Quantifying "just-right" APC inactivation for colorectal cancer initiation

2024-07-09T12:24:01Z

Dysregulation of the tumour suppressor gene Adenomatous Polyposis Coli (APC) is a canonical step in colorectal cancer development. Curiously, most colorectal tumours carry biallelic mutations that result in only partial loss of APC function, suggesting that a "just-right" level of APC inactivation, and hence Wnt signalling, provides the optimal conditions for tumorigenesis. Mutational processes act variably across the APC gene, which could contribute to the bias against complete APC inactivation. Thus the selective consequences of partial APC loss are unclear. Here we propose a mathematical model to quantify the tumorigenic effect of biallelic APC genotypes, controlling for somatic mutational processes. Analysing sequence data from >2500 colorectal cancers, we find that APC genotypes resulting in partial protein function confer about 50 times higher probability of progressing to cancer compared to complete APC inactivation. The optimal inactivation level varies with anatomical location and additional mutations of Wnt pathway regulators. We use this context dependency to assess the regulatory effect of secondary Wnt drivers in combination with APC in vivo, and provide evidence that mutant AMER1 combines with APC genotypes that lead to relatively low Wnt. The fitness landscape of APC inactivation is consistent across microsatellite unstable and POLE-deficient colorectal cancers and tumours in patients with Familial Adenomatous Polyposis, suggesting a general "just-right" optimum, and pointing to Wnt hyperactivation as a potential cancer vulnerability.

Population dynamics under demographic and environmental stochasticity

2024-07-08T20:44:32Z

The present paper is devoted to the study of the long term dynamics of diffusion processes modelling a single species that experiences both demographic and environmental stochasticity. In our setting, the long term dynamics of the diffusion process in the absence of demographic stochasticity is determined by the sign of $Λ_0$, the external Lyapunov exponent, as follows: $Λ_0<0$ implies (asymptotic) extinction and $Λ_0>0$ implies convergence to a unique positive stationary distribution $μ_0$. If the system is of size $\frac{1}{ε^{2}}$ for small $ε>0$ (the intensity of demographic stochasticity), demographic effects will make the extinction time finite almost surely. This suggests that to understand the dynamics one should analyze the quasi-stationary distribution (QSD) $μ_ε$ of the system. The existence and uniqueness of the QSD is well-known under mild assumptions. We look at what happens when the population size is sent to infinity, i.e., when $ε\to 0$. We show that the external Lyapunov exponent still plays a key role: 1) If $Λ_0<0$, then $μ_ε\to δ_0$, the mean extinction time is of order $|\ln ε|$ and the extinction rate associated with the QSD $μ_ε$ has a lower bound of order $\frac{1}{|\lnε|}$; 2) If $Λ_0>0$, then $μ_ε\to μ_0$, the mean extinction time is polynomial in $\frac{1}{ε^{2}}$ and the extinction rate is polynomial in $ε^{2}$. Furthermore, when $Λ_0>0$ we are able to show that the system exhibits multiscale dynamics: at first the process quickly approaches the QSD $μ_ε$ and then, after spending a polynomially long time there, it relaxes to the extinction state. We give sharp asymptotics in $ε$ for the time spent close to $μ_ε$.

The Path-Label Reconciliation (PLR) Dissimilarity Measure for Gene Trees

2024-07-08T20:19:11Z

In this study, we investigate the problem of comparing gene trees reconciled with the same species tree using a novel semi-metric, called the Path-Label Reconciliation (PLR) dissimilarity measure. This approach not only quantifies differences in the topology of reconciled gene trees, but also considers discrepancies in predicted ancestral gene-species maps and speciation/duplication events, offering a refinement of existing metrics such as Robinson-Foulds (RF) and its labeled extensions LRF and ELRF. A tunable parameter α also allows users to adjust the balance between its species map and event labeling components. We show that PLR can be computed in linear time and that it is a semi-metric. We also discuss the diameters of reconciled gene tree measures, which are important in practice for normalization, and provide initial bounds on PLR, LRF, and ELRF. To validate PLR, we simulate reconciliations and perform comparisons with LRF and ELRF. The results show that PLR provides a more evenly distributed range of distances, making it less susceptible to overestimating differences in the presence of small topological changes, while at the same time being computationally efficient. Our findings suggest that the theoretical diameter is rarely reached in practice. The PLR measure advances phylogenetic reconciliation by combining theoretical rigor with practical applicability. Future research will refine its mathematical properties, explore its performance on different tree types, and integrate it with existing bioinformatics tools for large-scale evolutionary analyses. The open source code is available at: https://pypi.org/project/parle/.
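
For context on the baseline that PLR refines, here is a minimal sketch of the classical Robinson-Foulds idea in its rooted, clade-based form: the distance is the number of clades present in exactly one of the two trees. This is not the PLR measure and does not use the `parle` package; the nested-tuple tree encoding and the function names `clades` and `rf_distance` are assumptions made for this illustration.

```python
def clades(tree):
    """Return (leaf_set, clades) for a rooted tree encoded as nested tuples.

    A leaf is any non-tuple label; an internal node is a tuple of children.
    Each clade is the frozenset of leaf labels below one internal node.
    """
    if not isinstance(tree, tuple):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves = leaves | child_leaves
        found |= child_clades
    found.add(leaves)  # the clade rooted at this internal node
    return leaves, found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: size of the clade symmetric difference."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must be on the same leaf set")
    return len(clades_a ^ clades_b)


if __name__ == "__main__":
    balanced = (("a", "b"), ("c", "d"))
    swapped = (("a", "c"), ("b", "d"))
    print(rf_distance(balanced, balanced), rf_distance(balanced, swapped))
```

Because a single misplaced leaf can change many clades at once, purely topological counts of this kind can jump sharply under small changes, which is the sensitivity the abstract contrasts PLR against.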

Analysis of genetic diversity among some Iraqi durum wheat cultivars revealed by different molecular markers

2024-07-08T14:44:33Z

Durum wheat has been cultivated since the beginning of crop domestication and now ranks tenth among the world's most important cultivated crops. Despite this, the full extent of the crop's genetic diversity has not yet been incorporated into modern varieties through breeding programs. In this study, a total of 35 markers (11 RAPD, 12 ISSR, and 12 CDDP) were used to assess the genetic variability and population structure of sixteen cultivars of Iraqi durum wheat. Of the 294 bands obtained, 171 were identified as polymorphic: 47 polymorphic alleles from 98 RAPD bands, 53 from 89 ISSR bands, and 71 from 107 CDDP bands. The average number of observed alleles (Na), effective number of alleles (Ne), Shannon's information index (I), expected heterozygosity or gene diversity (He), unbiased expected heterozygosity (uHe), and polymorphic information content (PIC) were, respectively, 1.45, 1.38, 0.32, 0.22, 0.24, and 0.28 for the RAPD markers; 1.63, 1.45, 0.40, 0.27, 0.29, and 0.32 for the ISSR markers; and 1.35, 1.35, 0.31, 0.21, 0.23, and 0.30 for the CDDP markers. Based on the data from the three marker types, a dendrogram with two main clades (unweighted pair group method with arithmetic mean; UPGMA) was obtained, and structure analysis identified three populations. The analysis of molecular variance indicated 97.00%, 97.00%, and 90.00% variability within populations for the RAPD, ISSR, and CDDP markers, respectively. The highest diversity indices were found in population 2 under the RAPD and CDDP markers, whereas population 1 had the highest values of these indices according to the ISSR markers. The results improve knowledge of the genetic makeup of Iraqi durum wheat cultivars, which should facilitate future breeding programs for this crop.
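
The band counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below takes only the (polymorphic, total) counts stated in the abstract and derives the share of polymorphic bands per marker type and overall; the percentages are computed here and are not figures reported by the authors. The names `BANDS`, `polymorphism_percent`, and `summarize` are hypothetical.

```python
# (polymorphic, total) band counts per marker type, as quoted in the abstract.
BANDS = {"RAPD": (47, 98), "ISSR": (53, 89), "CDDP": (71, 107)}


def polymorphism_percent(polymorphic, total):
    """Share of polymorphic bands, in percent."""
    return 100.0 * polymorphic / total


def summarize(bands):
    """Return per-marker percentages plus the pooled totals."""
    percents = {name: polymorphism_percent(p, t) for name, (p, t) in bands.items()}
    total_polymorphic = sum(p for p, _ in bands.values())
    total_bands = sum(t for _, t in bands.values())
    percents["All"] = polymorphism_percent(total_polymorphic, total_bands)
    return percents, total_polymorphic, total_bands


if __name__ == "__main__":
    percents, total_polymorphic, total_bands = summarize(BANDS)
    print(total_polymorphic, total_bands)
    for name, value in percents.items():
        print(f"{name}: {value:.2f}%")
```

The pooled totals from this check agree with the 171 polymorphic bands out of 294 stated in the abstract.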

Epidemic Transmission Modeling with Fractional Derivatives and Environmental Pathogens

2024-07-08T05:44:37Z

This research presents a fractional-order compartmental model of COVID-19 transmission dynamics that accounts for the influence of environmental pathogens on disease spread. Extending the classical compartmental framework, the model incorporates the effects of the fractional derivative order and of environmental shedding mechanisms on the basic reproduction numbers. Through fractional calculus, the model captures the memory effect associated with disease spread. A mathematical analysis establishes the existence, uniqueness, and stability of the model's solutions. Numerical simulations calibrated with real COVID-19 case data show that the model reproduces observed transmission trends and demonstrate the role of environmental transmission vectors in shaping public health strategies. The study highlights the importance of environmental sanitation and targeted interventions in controlling the pandemic's spread, and offers insights for research and policy-making in infectious disease management.

A state-space catch-at-length assessment model for redfish on the Eastern Grand Bank of Newfoundland reveals large uncertainties in data and stock dynamics

2024-07-05T18:15:35Z

We developed a state-space age-structured catch-at-length (ACL) assessment model for redfish in NAFO Divisions 3LN. The model was developed to address limitations in the surplus production model that was previously used to assess this stock. The ACL model included temporal variations in recruitment, growth, and mortality rates, which were limitations identified for the surplus production model. Our ACL model revealed some important discrepancies in survey and fishery length compositions. Our model also required large population dynamics process errors to achieve good fits to survey indices and catch estimates, which also demonstrated that additional understanding of these data and other model assumptions is required. As such, we do not propose the ACL model to provide management advice for 3LN redfish, but we do provide research recommendations that should provide a better basis to model the 3LN redfish stock dynamics. Recommendations include implementing sampling programs to determine redfish species/ecotypes in commercial and research survey catches and improving biological sampling for maturity and age.

Error-induced extinction in a multi-type critical birth-death process

2024-07-05T14:42:36Z

Extreme mutation rates in microbes and cancer cells can result in error-induced extinction (EEX), where every descendant cell eventually acquires a lethal mutation. In this work, we investigate critical birth-death processes with $n$ distinct types as a birth-death model of EEX in a growing population. Each type-$i$ cell divides independently $(i)\to(i)+(i)$ or mutates $(i)\to(i+1)$ at the same rate. The total number of cells grows exponentially as a Yule process until a cell of type $n$ appears; cells of this type can only die, at rate one. This makes the whole process critical, and hence, after the exponentially growing phase, all cells eventually die with probability one. We present large-time asymptotic results for the general $n$-type critical birth-death process. We find that the mass function of the number of cells of type $k$ has an algebraic and stationary tail $(\text{size})^{-1-χ_k}$, with $χ_k=2^{1-k}$, for $k=2,\dots,n$, in sharp contrast to the exponential tail of the first type. The same exponents describe the tail of the asymptotic survival probability $(\text{time})^{-χ_n}$. We present applications of the results for studying extinction due to intolerable mutation rates in biological populations.
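
To make the exponents concrete, the sketch below simply evaluates $χ_k=2^{1-k}$ and the corresponding size-tail exponent $-1-χ_k$ as exact fractions for small $k$. It only tabulates the formula quoted in the abstract and does not simulate the process; the function names `chi` and `size_tail_exponent` are hypothetical.

```python
from fractions import Fraction


def chi(k):
    """Exponent chi_k = 2^(1-k), stated in the abstract for k = 2, ..., n."""
    if k < 2:
        raise ValueError("the algebraic tail is stated only for k >= 2")
    return Fraction(1, 2 ** (k - 1))


def size_tail_exponent(k):
    """Exponent of the mass-function tail (size)^(-1 - chi_k)."""
    return -1 - chi(k)


if __name__ == "__main__":
    for k in range(2, 6):
        # chi_k halves with each additional type, so the tails get heavier.
        print(k, chi(k), size_tail_exponent(k))
```

Since $χ_k$ halves with every additional type, the survival probability $(\text{time})^{-χ_n}$ decays more slowly as the number of types $n$ grows.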

Diversity in Evolutionary Dynamics

2024-07-05T09:20:47Z

We consider the dynamics imposed by natural selection on the populations of two competing, sexually reproducing, haploid species. In this setting, the fitness of any genome varies over time due to the changing population mix of the competing species; crucially, this fitness variation arises naturally from the model itself, without the need for imposing it exogenously as is typically the case. Previous work on this model [14] showed that, in the special case where each of the two species exhibits just two phenotypes, genetic diversity is maintained at all times. This finding supported the tenet that sexual reproduction is advantageous because it promotes diversity, which increases the survivability of a species. In the present paper we consider the more realistic case where there are more than two phenotypes available to each species. The conclusions about diversity in general turn out to be very different from the two-phenotype case. Our first result is negative: namely, we show that sexual reproduction does not guarantee the maintenance of diversity at all times, i.e., the result of [14] does not generalize. Our counterexample consists of two competing species with just three phenotypes each. We show that, for any time $t_0$ and any $\varepsilon>0$, there is a time $t\ge t_0$ at which the combined diversity of both species is smaller than $\varepsilon$. Our main result is a complementary positive statement, which says that in any non-degenerate example, diversity is maintained in a weaker, "infinitely often" sense. Thus, our results refute the supposition that sexual reproduction ensures diversity at all times, but affirm a weaker assertion that extended periods of high diversity are necessarily a recurrent event.

Statistics for Phylogenetic Trees in the Presence of Stickiness

2024-07-04T14:50:42Z

Samples of phylogenetic trees arise in a variety of evolutionary and biomedical applications, and the Fréchet mean in Billera-Holmes-Vogtmann tree space is a summary tree shown to have advantages over other mean or consensus trees. However, use of the Fréchet mean raises computational and statistical issues, which we explore in this paper. The Fréchet sample mean is known to often contain fewer internal edges than the trees in the sample, and in this circumstance calculating the mean by iterative schemes can be problematic due to slow convergence. We present new methods for identifying edges which must lie in the Fréchet sample mean and apply these to a data set of gene trees relating organisms from the Apicomplexa, which cause a variety of parasitic infections. When a sample of trees contains a significant level of heterogeneity in the branching patterns, or topologies, displayed by the trees, the Fréchet mean is often a star tree, lacking any internal edges. In this situation and in others, the population Fréchet mean is affected by a non-Euclidean phenomenon called stickiness, which affects asymptotics, and we examine two data sets for which the mean tree is a star tree. The first consists of trees representing the physical shape of artery structures in a sample of medical images of human brains in which the branching patterns are very diverse. The second consists of gene trees from a population of baboons in which there is evidence of substantial hybridization. We develop hypothesis tests which work in the presence of stickiness. The first is a test for the presence of a given edge in the Fréchet population mean; the second is a two-sample test for differences in two distributions which share the same sticky population mean.

Parameter estimation of epidemic spread in two-layer random graphs by classical and machine learning methods

2024-07-03T11:29:58Z

Our main goal in this paper is to quantitatively compare the performance of classical methods with that of XGBoost and convolutional neural networks in a parameter estimation problem for epidemic spread. As we use flexible two-layer random graphs as the underlying network, we can also study how much the structure of the graphs in the training set and the test set can differ while still yielding a reasonably good estimate. In addition, we examine whether additional information (such as the average degree of infected vertices) can help improve the results, compared to the case when we only know the time series consisting of the number of susceptible and infected individuals. Our simulation results also show which methods are most accurate in the different phases of the epidemic.