https://arxiv.org/api/4+skY1Bbf0aDYTQu5HO2KlWMZzc 2026-06-18T15:55:22Z 1596 330 15 http://arxiv.org/abs/2407.20534v1 BERT and LLMs-Based avGFP Brightness Prediction and Mutation Design 2024-07-30T04:27:21Z This study aims to utilize Transformer models and large language models (such as GPT and Claude) to predict the brightness of Aequorea victoria green fluorescent protein (avGFP) and design mutants with higher brightness. Considering the time and cost associated with traditional experimental screening methods, this study employs machine learning techniques to enhance research efficiency. We first read and preprocess a proprietary dataset containing approximately 140,000 protein sequences, including about 30,000 avGFP sequences. Subsequently, we constructed and trained a Transformer-based prediction model to screen and design new avGFP mutants that are expected to exhibit higher brightness. Our methodology consists of two primary stages: first, the construction of a scoring model using BERT, and second, the screening and generation of mutants using mutation site statistics and large language models. Through the analysis of predictive results, we designed and screened 10 new high-brightness avGFP sequences. This study not only demonstrates the potential of deep learning in protein design but also provides new perspectives and methodologies for future research by integrating prior knowledge from large language models. 2024-07-30T04:27:21Z X. Guo W. Che http://arxiv.org/abs/2407.20116v2 Modelling vitamin D food fortification among Aboriginal and Torres Strait Islander peoples in Australia 2024-07-30T02:07:41Z Background: Low vitamin D intake and high prevalence of vitamin D deficiency (serum 25-hydroxyvitamin D concentration < 50 nmol/L) among Aboriginal and Torres Strait Islander peoples highlight a need for public health strategies to improve vitamin D status. 
As few foods contain naturally occurring vitamin D, fortification strategies may be needed to improve vitamin D intake and status among Aboriginal and Torres Strait Islander peoples. Objective: We aimed to model vitamin D food fortification scenarios among Aboriginal and Torres Strait Islander peoples. Methods: We used nationally representative food consumption data (n=4,109) and vitamin D food composition data to model four food fortification scenarios. The modelling for Scenario 1 included foods and maximum vitamin D concentrations permitted for fortification in Australia: i) dairy products and alternatives, ii) butter/margarine/oil spreads, iii) formulated beverages, and iv) selected ready-to-eat breakfast cereals. The modelling for Scenarios 2a-c included some vitamin D concentrations higher than permitted in Australia; Scenario 2c included bread, which is not permitted for vitamin D fortification in Australia. Scenario 2a: i) dairy products and alternatives, ii) butter/margarine/oil spreads, iii) formulated beverages. Scenario 2b: as per Scenario 2a plus selected ready-to-eat breakfast cereals. Scenario 2c: as per Scenario 2b plus bread. Results: Vitamin D fortification of a range of staple foods could potentially increase vitamin D intake among Aboriginal and Torres Strait Islander peoples by ~ 3-6 μg/day. Scenario 2c showed the highest potential median vitamin D intake increase to ~ 8 μg/day. Across all modelled scenarios, none of the participants had vitamin D intake above the Australian upper level of intake of 80 μg/day. 2024-07-29T15:47:00Z Belinda Neo Noel Nannup Dale Tilbrook Eleanor Dunlop John Jacky Carol Michie Cindy Prior Brad Farrant Carrington C. J. Shepherd Lucinda J. Black http://arxiv.org/abs/2403.00815v3 RAM-EHR: Retrieval Augmentation Meets Clinical Predictions on Electronic Health Records 2024-07-26T23:24:39Z We present RAM-EHR, a Retrieval AugMentation pipeline to improve clinical predictions on Electronic Health Records (EHRs). 
RAM-EHR first collects multiple knowledge sources, converts them into text format, and uses dense retrieval to obtain information related to medical concepts. This strategy addresses the difficulties associated with complex names for the concepts. RAM-EHR then augments the local EHR predictive model co-trained with consistency regularization to capture complementary information from patient visits and summarized knowledge. Experiments on two EHR datasets show the efficacy of RAM-EHR over previous knowledge-enhanced baselines (3.4% gain in AUROC and 7.2% gain in AUPR), emphasizing the effectiveness of the summarized knowledge from RAM-EHR for clinical prediction tasks. The code will be published at https://github.com/ritaranx/RAM-EHR. 2024-02-25T23:10:20Z ACL 2024 (Oral) ACL 2024 Ran Xu Wenqi Shi Yue Yu Yuchen Zhuang Bowen Jin May D. Wang Joyce C. Ho Carl Yang http://arxiv.org/abs/2407.17601v1 When Life Gives You Lemons, Squeeze Your Way Through: Understanding Citrus Avoidance Behaviour by Free-Ranging Dogs in India 2024-07-24T19:16:07Z Palatability of food is driven by multiple factors like taste, smell, texture, freshness, etc. and can be very variable across species. There are classic examples of local adaptations leading to speciation, driven by food availability. Urbanization across the world is causing rapid decline of biodiversity, while also driving local adaptations in some species. Free-ranging dogs are an interesting example of adaptation to a human-dominated environment across varied habitats. They have co-existed with humans for centuries and are a perfect model system for studying local adaptations. We attempted to understand a specific aspect of their scavenging behaviour in India: citrus aversion. Pet dogs are known to avoid citrus fruits and food contaminated by them. In India, lemons are used widely in the cuisine, and discarded in the garbage.
Hence, free-ranging dogs, that typically are scavengers of human leftovers, are likely to encounter lemons and lemon-contaminated food on a regular basis. We carried out a population level experiment to test response of free-ranging dogs to chicken contaminated with various parts of lemon. The dogs avoided chicken contaminated with lemon juice the most. Further, when provided with chicken dipped in three different concentrations of lemon juice, the lowest concentration was most preferred. A survey confirmed that the local people use lemon in their diet extensively and also discard these with the leftovers. People avoided giving citrus contaminated food to their pets but did not follow the same caution for free-ranging dogs. This study revealed that free-ranging dogs in West Bengal, India, are well adapted to scavenging among citrus-contaminated garbage and have their own strategies to avoid the contamination as far as possible, while maximizing their preferred food intake. 2024-07-24T19:16:07Z Includes supplementary information Tuhin Subhra Pal Srijaya Nandi Rohan Sarkar Anindita Bhadra http://arxiv.org/abs/2407.13797v1 Quantifying vitamin D intake among Aboriginal and Torres Strait Islander peoples in Australia 2024-07-17T02:41:53Z Background/Objective: Vitamin D deficiency (serum 25-hydroxyvitamin D [25(OH)D] concentration <50 nmol/L) is prevalent among Aboriginal and Torres Strait Islander peoples in Australia. Alternative to sun exposure (the primary source of vitamin D), vitamin D can also be obtained from food (e.g., fish, eggs, and meat) and supplements. However, vitamin D intake among Aboriginal and Torres Strait Islander peoples is currently unknown. We aimed to provide the first quantification of vitamin D intake using nationally representative data from Aboriginal and Torres Strait Islander peoples. 
Methods: We used food consumption data collected in the 2012-2013 National Aboriginal and Torres Strait Islander Nutrition and Physical Activity Survey (n = 4,109) and vitamin D food composition data to quantify mean absolute vitamin D intake by sex, age group, and remoteness of location. Differences in mean vitamin D intake between sexes and between remoteness of location were assessed using the 95% confidence interval (95% CI). Results: The mean (standard deviation (SD)) vitamin D intake among Aboriginal and Torres Strait Islander peoples was 2.9 (3.0) μg/day. Males had a statistically significantly higher mean (SD) [95% CI] vitamin D intake (3.2 (3.1) [3.0-3.4] μg/day) than females (2.6 (2.7) [2.4-2.7] μg/day). There were no statistically significant differences between mean (SD) [95% CI] vitamin D intake in non-remote (2.9 (2.2) [2.7-3.1] μg/day) and remote areas (2.8 (4.8) [2.6-3.0] μg/day). Conclusions: Vitamin D intake among Aboriginal and Torres Strait Islander peoples is low. Food-based public health strategies could be developed to promote higher vitamin D intake among this population. 2024-07-17T02:41:53Z Belinda Neo Dale Tilbrook Noel Nannup Alison Daly Eleanor Dunlop John Jacky Carol Michie Cindy Prior Brad Farrant Carrington C. J. Shepherd Lucinda J. Black http://arxiv.org/abs/2407.06211v1 Synthetic data: How could it be used for infectious disease research? 2024-07-03T17:13:04Z Over the last three to five years, it has become possible to generate machine learning synthetic data for healthcare-related uses. However, concerns have been raised about potential negative factors associated with the possibilities of artificial dataset generation. These include the potential misuse of generative artificial intelligence (AI) in fields such as cybercrime, the use of deepfakes and fake news to deceive or manipulate, and displacement of human jobs across various market sectors. 
Here, we consider both current and future positive advances and possibilities with synthetic datasets. Synthetic data offers significant benefits, particularly in data privacy, research, in balancing datasets and reducing bias in machine learning models. Generative AI is an artificial intelligence genre capable of creating text, images, video or other data using generative models. The recent explosion of interest in GenAI was heralded by the invention and speedy move to use of large language models (LLM). These computational models are able to achieve general-purpose language generation and other natural language processing tasks and are based on transformer architectures, which made an evolutionary leap from previous neural network architectures. Fuelled by the advent of improved GenAI techniques and wide scale usage, this is surely the time to consider how synthetic data can be used to advance infectious disease research. In this commentary we aim to create an overview of the current and future position of synthetic data in infectious disease research. 2024-07-03T17:13:04Z Styliani-Christina Fragkouli Dhwani Solanki Leyla J Castro Fotis E Psomopoulos Núria Queralt-Rosinach Davide Cirillo Lisa C Crossman 10.1080/17460913.2024.2400853 http://arxiv.org/abs/2210.04627v4 Systematizing Cellular Complexity: A Hilbertian Approach To Biological Problems 2024-07-02T01:02:46Z Examining individual components of cellular systems has been successful in uncovering molecular reactions and interactions. However, the challenge lies in integrating these components into a comprehensive system-scale map. This difficulty arises due to factors such as missing links (unknown variables), overlooked nonlinearities in high-dimensional parameter space, downplayed natural noisiness and stochasticity, and a lack of focus on causal influence and temporal dynamics. 
Composite static and phenomenological descriptions, while appearing complicated, lack the essence of what makes the biological systems truly "complex". The formalization of system-level problems is therefore important in constructing a meta-theory of biology. Addressing fundamental aspects of cellular regulation, adaptability, and noise management is vital for understanding the robustness and functionality of biological systems. These aspects encapsulate the challenges that cells face in maintaining stability, responding to environmental changes, and harnessing noise for functionality. This work examines these key problems that cells must solve, serving as a template for such formalization and as a step towards the axiomatization of biological investigations. Through a detailed exploration of cellular mechanisms, particularly homeostatic configuration, ion channels and harnessing noise, this paper aims to illustrate complex concepts and theories in a tangible context, providing a bridge between abstract theoretical frameworks and concrete biological phenomena. 2022-10-05T18:15:04Z Nima Dehghani http://arxiv.org/abs/2408.01425v1 Comparative Evaluation of the Proximate and Cytogenotoxicity of Ash and Rice Chips Used as Mango Fruit Artificial Ripening Agents in Birnin Kebbi, Nigeria 2024-06-29T12:06:44Z The high demand for mango (Mangifera indica L.) fruits has led sellers to employ ripening agents. However, concerns are growing regarding the potential toxicities of induced ripening, emphasizing the need for scientific investigation. Samples of artificially and naturally ripened mangoes were analyzed for proximate composition using standard protocols. Cytogenotoxicity was then assessed using the Allium cepa L. toxicity test. Twenty (20) A. cepa (onion) bulbs were used, with 5 ripened naturally, 5 with wood ash, 5 with herbaceous ash, and 5 with rice chips, all grown over tap water for five days.
The root tips of the bulbs were assayed and examined for chromosomal aberrations. The results revealed a significant (P<0.05) increase in moisture, protein, and ash content of mangoes as ripening agents were introduced. Mangoes ripened with wood ash exhibited the highest moisture content (81%), while those ripened with rice chips had the highest protein (0.5%) and ash content (1.5%). Naturally ripened mangoes displayed the highest fat (0.0095%) and fiber (11.46%) contents. The A. cepa toxicity test indicated significant (p<0.05) differences in the root growth of mangoes ripened with various agents. Wood ash resulted in the highest root growth (2.62cm), while herbaceous ash had the least (2.18%). Chromosomal aberrations, including sticky, vagrant, and laggard abnormalities, were observed in all agents, with herbaceous ash exhibiting the highest and rice chips the least. The obtained results suggest that induced ripening of the fruits could induce toxicities, highlighting the necessity for public awareness regarding the potential dangers posed by these agents. 2024-06-29T12:06:44Z Accept CD Obadiah TO Yahaya AA Aliero M Abdulkareem http://arxiv.org/abs/2407.00420v1 Cement Dust Exposure and Risk of Hyperglycemia and Overweight among Artisans and Residents Close to a Cement Factory in Sokoto, Nigeria 2024-06-29T12:00:17Z The potential health risks of cement dust exposure are increasingly raising concern worldwide as the cement industry expands in response to rising cement demand. This necessitates the need to determine the nature of the risks in order to develop appropriate measures. This study determined the effects of cement dust exposure on the weight and blood glucose levels of people residing or working around a cement company in Sokoto, Nigeria. Demographic information was obtained using questionnaires from 72 participants, which included age, gender, educational level, exposure hours, occupation, and lifestyle. 
The blood glucose levels and body mass index (BMI) were measured using a Fine Test glucometer and a mechanical scale, respectively. The results showed that most of the people living or working around the cement company were middle-aged men (31-40; 42.06%) with a primary (33.33%) or secondary (45.83%) school education. It showed that 30 (41.69%) of the participants were overweight while 5 (6.94%) were obese. Additionally, 52.78% of the participants were diabetic while 31.94% were prediabetic. Participants that were exposed for long hours (> 15 hours per day) were the most diabetic (20% of the participants), followed by smokers (15%), and artisans (7%). It can be concluded that exposure to cement dust from the company increased the risk of overweight, obesity, and hyperglycemia among the participants. These health risks were worsened by daily long hours of exposure, smoking, and artisanal pollutant exposure. Human settlements and artisans should not be located near the cement company, and the company should minimize pollutant emissions. 2024-06-29T12:00:17Z accept T Yahaya KA Sani E Oladele E Yawa M Musa M Abubakar R Sulaiman M Bilyaminu 10.5281/zenodo.10852417 http://arxiv.org/abs/2407.00408v1 Rooting behavior of pomegranate (Punica granatum L.) hardwood cuttings in relation to genotype and irrigation frequency 2024-06-29T11:22:55Z The study was conducted to determine the best irrigation frequency for rooting hardwood cuttings of some pomegranate genotypes that are cultivated in Halabja province, Kurdistan Region, Iraq. The hardwood cuttings were collected from 11 genotypes, which were 'Salakhani Trsh' (G1), 'Salakhani Mekhosh' (G2), 'Amriki' (G3), 'Twekl Sury Trsh' (G4), 'Twekl Astury Naw Spy' (G5), 'Hanara Sherina' (G6), 'Kawa Hanary Sherin' (G7), 'Kawa Hanary Trsh' (G8), 'Malesay Twekl Asture' (G9), 'Malesay Twekl Tank' (G10), and 'Sura Hanary Trsh' (G11). The genotypes were subjected to irrigation applications by 1-day, 2-day, 7-day, or 10-day frequencies. 
Among pomegranates, G11, G6, and G7 produced 95, 90, and 83% rooting percentages, which were significantly higher than those of the other genotypes. The lowest rooting percentages (28, 36, 38, and 40%) were found in G1, G5, G3, and G10, respectively. The effect of irrigation frequencies on the genotypes confirmed that a 7-day frequency was the best irrigation frequency to achieve the maximum rooting percentages (93, 86, 80, 73, 53, and 40%) in G6, G9, G2, G4, G3, and G1, respectively. In contrast, the minimum rooting percentage (20%) was recorded in G3 with a 1-day frequency and in G1 with 10-day frequency. In this study, it was found that the cuttings of G11, G6, and G7 had the best ability to form roots, and irrigation with a 7-day frequency was the best for the cuttings of all the 11 pomegranate genotypes investigated. 2024-06-29T11:22:55Z Kocher Omer Salih Aram Akram Mohammed Jamal Mahmood Faraj Anwar Mohammed Raouf Nawroz Abdul-Razzak Tahir http://arxiv.org/abs/1210.7091v2 Development of Hydrogen Bonding Magnetic Reaction-based Gene Regulation through Cyclic Electromagnetic DNA Simulation in Double-Stranded DNA 2024-06-29T07:11:00Z The proton-magnetic reaction is commonly used in MRI machines with a strong magnetic field of over 1 T, while this study hypothesized that the electron magnetic reaction of hydrogen could affect the hydrogen bonds of double-stranded DNA (dsDNA) at a low magnetic field below 0.01 T. The goal is to develop a hydrogen bonding magnetic reaction-based gene regulation (HBMR-GR) system. The polarities of DNA base pairs are derived from the relative electrostatic charge between purines and pyrimidines, which become positively and negatively charged, respectively. The Pyu dsDNAs with pyrimidine(s)-purine(s) sequences, ds3T3A, ds3C3G, and ds3C3A, showed stronger DNA hybridization potential, increased infrared absorption at 3400-3200 cm⁻¹, and a unique DNA conformation in HPLC analysis compared to the corresponding Puy dsDNAs.
To target the three-dimensional structure of dsDNA based on the DNA base pair polarities, one can use cyclic electromagnetic DNA simulation (CEDS) with approximately 25% efficiency for randomly oriented dsDNAs. CEDS was found to induce sequence-specific hybridization of target oligo-dsDNAs in 0.005M NaCl solution and sequence-specific conformation of oligo-dsDNAs in 0.1M NaCl solution. It was found that the Pyu oligo-dsDNAs were more responsible for the hybridization and conformational changes by CEDS than the Puy oligo-dsDNAs. CEDS decreased ethidium bromide (EtBr) DNA intercalation and spermidine DNA condensation depending on CEDS time in the binding assay. The results also included that the Pyu oligo-dsDNAs were more responsible for CEDS by forming stable and unique conformation of oligo-dsDNA than the Puy oligo-dsDNAs. Therefore, it is postulated that the low-level HBMR-based CEDS can enhance the hybridization potential of oligo-dsDNAs and subsequently lead to the unique DNA conformation required for the initiation of various DNA functions. 2012-10-26T10:25:03Z Please, find two manuscripts "Development of Hydrogen Bonding Magnetic Reaction-based Gene Regulation through Cyclic Electromagnetic DNA Simulation in Double-Stranded DNA" and "Application of Cyclic Electromagnetic DNA Simulation to Target Oncogenesis-related miRNAs and DNA Motifs: Changes of Protein Signaling Pathway System in RAW 264.7 Cells" Yeon Sook Kim Dae Gwan Lee Suk Keun Lee http://arxiv.org/abs/2407.00332v1 Machine Learning Models for Dengue Forecasting in Singapore 2024-06-29T06:27:52Z With emerging prevalence beyond traditionally endemic regions, the global burden of dengue disease is forecasted to be one of the fastest growing. With limited direct treatment or vaccination currently available, prevention through vector control is widely believed to be the most effective form of managing outbreaks.
This study examines traditional state space models (moving average, autoregressive, ARIMA, SARIMA), supervised learning techniques (XGBoost, SVM, KNN) and deep networks (LSTM, CNN, ConvLSTM) for forecasting weekly dengue cases in Singapore. Meteorological data and search engine trends were included as features for ML techniques. Forecasts using CNNs yielded lowest RMSE in weekly cases in 2019. 2024-06-29T06:27:52Z 12 pages, 6 figures Zi Iun Lai Wai Kit Fung Enquan Chew http://arxiv.org/abs/2406.16364v3 The unpaved road towards efficient selective breeding in insects for food and feed 2024-06-26T06:37:45Z Insect production for food and feed presents a promising supplement to ensure food safety and address the adverse impacts of agriculture on climate and environment in the future. However, optimisation is required for insect production to realise its full potential. This can be by targeted improvement of traits of interest through selective breeding, an approach which has so far been underexplored and underutilised in insect farming. Here we present a comprehensive review of the selective breeding framework in the context of insect production. We systematically evaluate adjustments of selective breeding techniques to the realm of insects and highlight the essential components integral to the breeding process. The discussion covers every step of a conventional breeding scheme, such as formulation of breeding objectives, phenotyping, estimation of genetic parameters and breeding values, selection of appropriate breeding strategies, and mitigation of issues associated with genetic diversity depletion and inbreeding. This review combines knowledge from diverse disciplines, bridging the gap between animal breeding, quantitative genetics, evolutionary biology, and entomology, offering an integrated view of the insect breeding research area and uniting knowledge which has previously remained scattered across diverse fields of expertise. 
2024-06-24T07:14:48Z Laura Skrubbeltrang Hansen Stine Frey Laursen Simon Bahrndorff Jesper Givskov Sørensen Goutam Sahana Torsten Nygaard Kristensen Hanne Marie Nielsen 10.1111/eea.13526 http://arxiv.org/abs/2403.03335v2 From virtual patients to digital twins in immuno-oncology: lessons learned from mechanistic quantitative systems pharmacology modeling 2024-06-23T14:29:55Z Virtual patients and digital patients/twins are two similar concepts gaining increasing attention in health care with goals to accelerate drug development and improve patients' survival, but with their own limitations. Although methods have been proposed to generate virtual patient populations using mechanistic models, there are a limited number of applications in immuno-oncology research. Furthermore, due to the stricter requirements of digital twins, they are often generated in a study-specific manner with models customized to particular clinical settings (e.g., treatment, cancer, and data types). Here, we discuss the challenges for virtual patient generation in immuno-oncology with our most recent experiences, initiatives to develop digital twins, and how research on these two concepts can inform each other. 2024-03-05T21:38:05Z 21 pages, 1 figure NPJ Digit Med. 2024 Hanwen Wang Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA Theinmozhi Arulraj Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA Alberto Ippolito Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA Aleksander S.
Popel Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA Departments of Medicine and Oncology, and the Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA http://arxiv.org/abs/2406.15963v1 Effectiveness of ChatGPT in explaining complex medical reports to patients 2024-06-23T00:04:07Z Electronic health records contain detailed information about the medical condition of patients, but they are difficult for patients to understand even if they have access to them. We explore whether ChatGPT (GPT 4) can help explain multidisciplinary team (MDT) reports to colorectal and prostate cancer patients. These reports are written in dense medical language and assume clinical knowledge, so they are a good test of the ability of ChatGPT to explain complex medical reports to patients. We asked clinicians and lay people (not patients) to review explanations and responses of ChatGPT. We also ran three focus groups (including cancer patients, caregivers, computer scientists, and clinicians) to discuss output of ChatGPT. Our studies highlighted issues with inaccurate information, inappropriate language, limited personalization, AI distrust, and challenges integrating large language models (LLMs) into clinical workflow. These issues will need to be resolved before LLMs can be used to explain complex personal medical information to patients. 2024-06-23T00:04:07Z under review Mengxuan Sun Ehud Reiter Anne E Kiltie George Ramsay Lisa Duncan Peter Murchie Rosalind Adam