Artificial Empathy: AI based Mental Health

2025-10-31T05:23:32Z

Many people suffer from mental health problems but not everyone seeks professional help or has access to mental health care. AI chatbots have increasingly become a go-to for individuals who either have mental disorders or simply want someone to talk to. This paper presents a study on participants who have previously used chatbots and a scenario-based testing of large language model (LLM) chatbots. Our findings indicate that AI chatbots were primarily utilized as a "Five minute therapist" or as a non-judgmental companion. Participants appreciated the anonymity and lack of judgment from chatbots. However, there were concerns about privacy and the security of sensitive information. The scenario-based testing of LLM chatbots highlighted additional issues. Some chatbots were consistently reassuring, used emojis and names to add a personal touch, and were quick to suggest seeking professional help. However, there were limitations such as inconsistent tone, occasional inappropriate responses (e.g., casual or romantic), and a lack of crisis sensitivity, particularly in recognizing red flag language and escalating responses appropriately. These findings can inform both the technology and mental health care industries on how to better utilize AI chatbots to support individuals during challenging emotional periods.

Pharmacovigilance Analysis of Drug-Induced Rhabdomyolysis Based on the FDA Adverse Event Reporting System (FAERS)

2025-10-30T07:08:11Z

This study aimed to systematically identify and quantify risks for drug-induced rhabdomyolysis (DIR) using real-world data and to propose an evidence-based risk mitigation framework. We conducted a retrospective pharmacovigilance study using the FDA Adverse Event Reporting System (FAERS) database from Q1 2005 to Q1 2025. A two-stage analysis involved initial signal detection using the Reporting Odds Ratio (ROR), followed by a LASSO-optimized multivariate logistic regression to calculate adjusted odds ratios (aORs) for 54 target drugs while controlling for confounders. Our analysis confirmed potent DIR risks for known agents, such as gemfibrozil (aOR 173.67) and statins (lovastatin aOR 97.20, simvastatin aOR 85.12). Crucially, we identified strong, novel risk signals for drugs currently lacking warnings, most notably levetiracetam (aOR 11.02) and donepezil (aOR 8.90). A significant "labeling gap" was quantified: 61.1% of drugs with a statistically significant DIR risk lack a corresponding warning in U.S. drug labels. We subsequently developed a three-tiered risk stratification model. The proposed framework provides a data-driven foundation for developing tiered clinical decision support systems, enhancing prescribing safety, and guiding future regulatory action to bridge the identified evidence-to-labeling gap.
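
The signal-detection stage can be illustrated with a minimal sketch of the Reporting Odds Ratio computed from a 2x2 table of report counts. The function name, the 95% confidence interval from the usual log-normal approximation, and the counts in the usage note are illustrative assumptions, not code or figures from the study.

```python
import math


def reporting_odds_ratio(a, b, c, d):
    """Return (ROR, lower 95% bound, upper 95% bound) for a 2x2 table.

    a: reports with the target drug and the event of interest
    b: reports with the target drug and any other event
    c: reports with other drugs and the event of interest
    d: reports with other drugs and any other event
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    ror = (a * d) / (b * c)
    # Standard error of log(ROR) under the log-normal approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - 1.96 * se)
    upper = math.exp(math.log(ror) + 1.96 * se)
    return ror, lower, upper
```

With hypothetical counts, `reporting_odds_ratio(20, 80, 10, 890)` gives an ROR of 22.25. A common convention treats a lower bound above 1 as a signal, but the threshold used in the study is not stated here.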

Methylator: A Modular Framework for DNA Methylation Analysis in Mammals and Plants Using Galaxy

2025-10-27T19:14:00Z

DNA cytosine methylation is a critical epigenetic mark regulating gene expression and thus playing an important role in development and differentiation across eukaryotes. Existing tools for high-throughput methylation analysis often lack cross-species flexibility or require command-line expertise. We present Methylator, a novel, end-to-end DNA methylation analysis framework integrated into the Galaxy platform, enabling accessible DNA methylation analysis for mammals and plants. Methylator supports analyses from data obtained using diverse protocols like WGBS, RRBS, and PBAT and handles all contexts of DNA methylation (CpG, CHG, and CHH). The Methylator framework includes quality control, alignment, methylation calling, differential analysis, and functional analysis through reproducible, user-friendly workflows. Its unique Dirty-Harry alignment method enhances mapping efficiency, while a Shiny-based interface allows for interactive, publication-ready visualizations. Methylator is freely available, offering researchers a versatile, user-friendly solution for epigenomic studies.

Fast Voxel-Wise Kinetic Modeling in Dynamic PET using a Physics-Informed CycleGAN

2025-10-27T09:17:02Z

Tracer kinetic modeling serves a vital role in diagnosis, treatment planning, tracer development and oncology, but burdens practitioners with complex and invasive arterial input function (AIF) estimation. We adapt a physics-informed CycleGAN, which has shown promise in DCE-MRI quantification, to dynamic PET quantification. Our experiments demonstrate sound AIF predictions and parameter maps closely resembling the reference.

FedCVD++: Communication-Efficient Federated Learning for Cardiovascular Risk Prediction with Parametric and Non-Parametric Model Optimization

2025-10-25T10:38:41Z

Cardiovascular diseases (CVD) cause over 17 million deaths annually worldwide, highlighting the urgent need for privacy-preserving predictive systems. We introduce FedCVD++, an enhanced federated learning (FL) framework that integrates both parametric models (logistic regression, SVM, neural networks) and non-parametric models (Random Forest, XGBoost) for coronary heart disease risk prediction. To address key FL challenges, we propose: (1) tree-subset sampling that reduces Random Forest communication overhead by 70%, (2) XGBoost-based feature extraction enabling lightweight federated ensembles, and (3) federated SMOTE synchronization for resolving cross-institutional class imbalance. Evaluated on the Framingham dataset (4,238 records), FedCVD++ achieves state-of-the-art results: federated XGBoost (F1 = 0.80) surpasses its centralized counterpart (F1 = 0.78), and federated Random Forest (F1 = 0.81) matches non-federated performance. Additionally, our communication-efficient strategies reduce bandwidth consumption by 3.2X while preserving 95% accuracy. Compared to existing FL frameworks, FedCVD++ delivers up to 15% higher F1-scores and superior scalability for multi-institutional deployment. This work represents the first practical integration of non-parametric models into federated healthcare systems, providing a privacy-preserving solution validated under real-world clinical constraints.

Identification of Shared Genetic Biomarkers to Discover Candidate Drugs for Cervical and Endometrial Cancer by Using the Integrated Bioinformatics Approaches

2025-10-25T06:02:52Z

Cervical (CC) and endometrial cancers (EC) are two common types of gynecological tumors that threaten the health of females worldwide. Since their underlying mechanisms and associations remain unclear, computational bioinformatics analysis is required. In the present study, bioinformatics methods were used to screen for key candidate genes, their functions and pathways, and drug agents associated with CC and EC, aiming to reveal the possible molecular-level mechanisms. Four publicly available microarray datasets of CC and EC from the Gene Expression Omnibus database were downloaded, and 72 differentially expressed genes (DEGs) were selected through integrated analysis. Then, we performed the protein-protein interaction (PPI) analysis and identified 9 shared genetic biomarkers (SGBs). The GO functional and KEGG pathway enrichment analyses of these SGBs revealed some important functions and signaling pathways significantly associated with CC and EC. The interaction network analysis identified four transcription factors (TFs) and two miRNAs as key transcriptional and post-transcriptional regulators of SGBs. The expression of the AURKA, TOP2A, and UBE2C genes was higher in CC and EC tissues than in normal samples, and this gene expression was linked to disease progression. Furthermore, we performed docking analysis between 9 SGBs-based proteins and 145 meta-drugs, and identified the top-ranked 10 drugs as candidate drugs. Finally, we investigated the binding stability of the top-ranked three drugs (Sorafenib, Paclitaxel, Sunitinib) using 100 ns MD-based MM-PBSA simulations with UBE2C, AURKA, and TOP2A proteins, and observed their stable performance. Therefore, the proposed drugs might play a vital role in the treatment against CC and EC.

Transverse contributions to the longitudinal stiffness of the human foot

2025-10-24T20:23:23Z

Humans rely on foot stiffness to withstand the propulsive forces of walking and running. Skeletal adaptations that increase foot stiffness include the medial longitudinal arch (MLA) and the transverse tarsal arch (TTA). The TTA has been hypothesized to stiffen the foot through cross-axis coupling of transverse intermetatarsal stiffness with sagittal-plane midfoot stiffness, but this has been tested only in cadaveric specimens. In vivo testing is essential because muscle contraction substantially modulates MLA function and may similarly affect the TTA's cross-axis coupling. Here we provide in vivo evidence for the TTA's contribution to foot stiffness by externally increasing intermetatarsal stiffness and measuring its effects on midfoot elasticity during walking. As predicted by the cross-axis coupling hypothesis, increasing intermetatarsal stiffness with an elastic tape wrapped around the forefoot reduced the energy absorbed in midfoot flattening and increased sagittal-plane midfoot stiffness concomitantly (mean $\pm$ standard error of the mean (SEM): $13.9\% \pm 3\%$ and $16.8\% \pm 5.8\%$, respectively). However, taping did not change the curvature of the TTA, thereby isolating the effects of cross-axis coupling from morphological changes to the TTA. Thus, forefoot taping modulates midfoot stiffness through cross-axis coupling and could provide a non-invasive means to manage pathological foot flexibility or enhance athletic performance.

Challenges and Recommendations in Establishing National Human Diversity Genomic Projects

2025-10-22T05:20:05Z

Genomic approaches have revolutionized medical research, providing valuable insights into human physiology and disease. Despite major benefits from large collections of genomes, the lack of diversity in genomic data represents a significant challenge for advancing biomedical discovery and accessible health solutions worldwide. Establishing a national genomic project is not a one-size-fits-all endeavor, as each country presents distinct challenges and opportunities. We identify challenges in obtaining and publishing data from Whole Genome Sequencing (WGS) of people in various countries, discuss the progress made by some countries in their efforts to study their genetic diversity, and assess the most common issues. We recognize that a successful national genome database requires addressing several major issues: variable awareness of recent developments in genomics among government officials, healthcare administrators, and policymakers; the absence of regulations; ethical considerations; and the challenges of securing funding, establishing legal frameworks, and building the necessary infrastructure. By assembling a diverse team of experts across 19 countries, we aim to provide a balanced approach in our recommendations to establish national projects. Our study acknowledges and addresses major intricacies and nuances specific to various settings and regions while presenting diverse opinions of scientists from both high-resource and low-resource countries, contributing to a more inclusive and globally relevant framework for advancing genomic research and its applications.

Combinations of histone deacetylase inhibitors extend chronological lifespan in S. cerevisiae

2025-10-21T21:20:48Z

Aging is the primary risk factor for nearly all forms of human death, yet pharmaceutical interventions hold the potential to prevent it. Combinations of drugs have been shown to increase the lifespan of model organisms more than individual drugs, and geroprotective histone deacetylase (HDAC) inhibitors that have different molecular targets within the longevity regulation network show considerably higher drug synergy than many other compounds. In this study, four HDAC inhibitors (curcumin, quercetin, resveratrol, and berberine) were administered to yeast (S. cerevisiae) in pairwise, three-way, and four-way combinations, and the maximum chronological lifespan (CLS) of the yeast was measured. In five of the six pairwise combinations, the drugs exhibited synergy according to the Bliss Independence Model, on average extending maximum CLS 68% over the individual drugs. Three-way and four-way combinations further extended maximum CLS 49% and 107% over pairwise combinations, respectively. Since the targets of the HDAC inhibitors used in this study are evolutionarily conserved between yeast and humans, the results obtained have implications for human longevity.
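
The synergy criterion can be sketched under the textbook form of the Bliss Independence Model, in which each drug's effect is a fraction between 0 and 1 and the expected combined effect of two independently acting drugs is E_A + E_B - E_A * E_B. How the study maps lifespan extension onto fractional effects is not stated here, so the function names and the numbers in the usage note are assumptions for illustration only.

```python
def bliss_expected(effect_a, effect_b):
    """Expected combined fractional effect if the two drugs act independently."""
    for effect in (effect_a, effect_b):
        if not 0.0 <= effect <= 1.0:
            raise ValueError("effects must be fractions in [0, 1]")
    return effect_a + effect_b - effect_a * effect_b


def bliss_excess(observed, effect_a, effect_b):
    """Observed minus expected effect; a positive value suggests synergy."""
    return observed - bliss_expected(effect_a, effect_b)
```

For hypothetical single-drug effects of 0.2 and 0.3, the expected combined effect is 0.44, so an observed combined effect of 0.6 would exceed the Bliss expectation by 0.16.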

Relation between in vitro microbial fermentations and in vivo performance in pigs selected for their residual feed intake

2025-10-21T08:22:27Z

Bioinformatic analysis of microbiota revealed that certain metabolic pathways are associated with high and low residual feed intake (HRFI and LRFI, respectively), such as the amino-acid biosynthesis pathway and the tRNA-aminoacyl synthesis pathway. The latter is associated with increased propionate production. Yet, in vitro fermentation-profile analyses revealed that LRFI pigs, from the most efficient genetic line, produced more acetate (+15%) and propionate (+56%) from the insoluble fraction (IF) containing the insoluble dietary fibre recovered after simulation of upper gastrointestinal digestion. Valerate was also more frequently abundant in LRFI pigs (P < 0.01). 16S sequencing analysis of the microbes responsible for fermentation suggested that propionate obtained from the fraction of feed that is indigestible by the host is produced mainly by Prevotella and Lactobacillus. This production was strongly correlated with backfat thickness in LRFI pigs (Spearman's correlation = 0.80), while a moderate correlation existed between butyrate production and feed efficiency in HRFI pigs (Spearman's correlation = 0.44). These results revealed that propionate production is related to fat metabolism, suggesting that GPR43 receptor activation by propionate could play a physiological role in adipose cells in RFI pigs. These observations highlight significant functional differences between the microbiota of HRFI and LRFI pigs, as well as variability within more efficient pigs that could be exploited to improve performance.

Open and Sustainable AI: challenges, opportunities and the road ahead in the life sciences (October 2025 -- Version 2)

2025-10-14T08:23:04Z

Artificial intelligence (AI) has recently seen transformative breakthroughs in the life sciences, expanding possibilities for researchers to interpret biological information at an unprecedented capacity, with novel applications and advances being made almost daily. In order to maximise return on the growing investments in AI-based life science research and accelerate this progress, it has become urgent to address the exacerbation of long-standing research challenges arising from the rapid adoption of AI methods. We review the increased erosion of trust in AI research outputs, driven by the issues of poor reusability and reproducibility, and highlight their consequent impact on environmental sustainability. Furthermore, we discuss the fragmented components of the AI ecosystem and lack of guiding pathways to best support Open and Sustainable AI (OSAI) model development. In response, this perspective introduces a practical set of OSAI recommendations directly mapped to over 300 components of the AI ecosystem. Our work connects researchers with relevant AI resources, facilitating the implementation of sustainable, reusable and transparent AI. Built upon life science community consensus and aligned to existing efforts, the outputs of this perspective are designed to aid the future development of policy and structured pathways for guiding AI implementation.

Interplay of Fidelity and Diversity in the Evolution of the Genetic Code

2025-10-13T00:40:14Z

The origin and organizing principles of the genetic code remain fundamental puzzles in life science. The vanishingly low probability of the natural codon-to-amino acid mapping arising by chance has spurred the hypothesis that its structure is a solution optimized for robustness against mutations and translational errors. For the construction of effective molecular machines, the dictionary of encoded amino acids must also be diverse enough in physicochemical features. Here, we examine whether the standard genetic code can be understood as a near-optimal solution balancing these two objectives: minimizing error load and aligning codon assignments with the naturally occurring amino acid composition. Using simulated annealing, we explore this trade-off across a broad range of parameters. We find that the standard genetic code lies near local optima within the multidimensional parameter space. It is a highly effective solution that balances fidelity against resource availability constraints. These results suggest that the present genetic code reflects coevolution under conflicting pressures of fidelity and diversity, offering new insight into its emergence and evolution.

Isotropy and Geometry of Pretrained Protein LMs

2025-10-12T15:19:40Z

Large pretrained language models have transformed natural language processing, and their adaptation to protein sequences -- viewed as strings of amino acid characters -- has advanced protein analysis. However, the distinct properties of proteins, such as variable sequence lengths and lack of word-sentence analogs, necessitate a deeper understanding of protein language models (LMs). We investigate the isotropy of protein LM embedding spaces using average pairwise cosine similarity and the IsoScore method, revealing that models like ProtBERT and ProtXLNet are highly anisotropic, utilizing only 2--14 dimensions for global and local representations. In contrast, multi-modal training in ProteinBERT, which integrates sequence and gene ontology data, enhances isotropy, suggesting that diverse biological inputs improve representational efficiency. We also find that embedding distances weakly correlate with alignment-based similarity scores, particularly at low similarity.
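
One of the two isotropy measures named above, average pairwise cosine similarity, can be sketched in a few lines. This is an illustrative standard-library implementation with a hypothetical function name, not the authors' code, and it does not cover IsoScore. Values near 1 indicate embeddings crowded into a narrow cone (anisotropy); values near 0 or below indicate directions that are spread out.

```python
import math
from itertools import combinations


def average_pairwise_cosine(embeddings):
    """Mean cosine similarity over all unordered pairs of embedding vectors."""
    unit_vectors = []
    for vector in embeddings:
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0:
            raise ValueError("zero vector has no direction")
        unit_vectors.append([x / norm for x in vector])
    pairs = list(combinations(unit_vectors, 2))
    if not pairs:
        raise ValueError("need at least two embeddings")
    # For unit vectors the dot product equals the cosine similarity.
    total = sum(sum(a * b for a, b in zip(u, w)) for u, w in pairs)
    return total / len(pairs)
```

The pairwise loop is quadratic in the number of embeddings, so for large sets one would typically subsample pairs or use a vectorized library instead.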

Evaluation and Implementation of Machine Learning Algorithms to Predict Early Detection of Kidney and Heart Disease in Diabetic Patients

2025-10-12T13:28:26Z

Cardiovascular disease (CVD) and chronic kidney disease (CKD) are major complications of diabetes, leading to high morbidity and mortality. Early detection of these conditions is critical, yet traditional diagnostic markers often lack sensitivity in the initial stages. This study integrates conventional statistical methods with machine learning approaches to improve early diagnosis of CKD and CVD in diabetic patients. Descriptive and inferential statistics were computed in SPSS to explore associations between diseases and clinical or demographic factors. Patients were categorized into four groups: Group A (both CKD and CVD), Group B (CKD only), Group C (CVD only), and Group D (no disease). Statistical analysis revealed significant correlations: Serum Creatinine and Hypertension with CKD, and Cholesterol, Triglycerides, Myocardial Infarction, Stroke, and Hypertension with CVD. These results guided the selection of predictive features for machine learning models. Logistic Regression, Support Vector Machine, and Random Forest algorithms were implemented, with Random Forest showing the highest accuracy, particularly for CKD prediction. Ensemble models outperformed single classifiers in identifying high-risk diabetic patients. SPSS results further validated the significance of the key parameters integrated into the models. While challenges such as interpretability and class imbalance remain, this hybrid statistical and machine learning framework offers a promising advancement toward early detection and risk stratification of diabetic complications compared to conventional diagnostic approaches.

Thinned COE random matrix models for DNA replication

2025-10-11T21:05:33Z

This paper details an observation that for more primitive organisms, such as some yeasts, the statistical distribution of the origins of replication sometimes looks remarkably like the distribution of eigenvalues from the Circular Orthogonal Ensemble (COE) of random matrices. This does not hold for more complex organisms, but a uniform thinning of the COE eigenvalues (which interpolates between the COE and uncorrelated, Poisson statistics) gives a platform to investigate characteristics of replication origin distribution in other species where data is available.
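
The thinning operation itself is simple to sketch: each point is kept independently with a fixed probability, and the surviving spacings are rescaled to unit mean so that different retention probabilities can be compared. The sketch below assumes the eigenvalue angles are already available as a sorted list and does not sample the COE; the function names and the parameter `keep_prob` are illustrative.

```python
import random


def thin_points(points, keep_prob, rng=None):
    """Keep each point independently with probability keep_prob."""
    if not 0.0 <= keep_prob <= 1.0:
        raise ValueError("keep_prob must lie in [0, 1]")
    rng = rng or random.Random()
    # rng.random() is in [0, 1), so keep_prob=1 keeps everything
    # and keep_prob=0 keeps nothing.
    return [x for x in points if rng.random() < keep_prob]


def unit_mean_spacings(sorted_points):
    """Nearest-neighbour gaps rescaled so that their mean equals 1."""
    gaps = [b - a for a, b in zip(sorted_points, sorted_points[1:])]
    if not gaps:
        raise ValueError("need at least two points")
    mean_gap = sum(gaps) / len(gaps)
    if mean_gap == 0:
        raise ValueError("all points coincide")
    return [g / mean_gap for g in gaps]
```

With `keep_prob = 1` the original spectrum is returned unchanged, while small values of `keep_prob` remove most neighbours and so weaken the correlations between the points that remain.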