https://arxiv.org/api/4+skY1Bbf0aDYTQu5HO2KlWMZzc
2026-06-18T15:55:22Z
159633015

http://arxiv.org/abs/2407.20534v1
BERT and LLMs-Based avGFP Brightness Prediction and Mutation Design
2024-07-30T04:27:21Z
This study aims to utilize Transformer models and large language models (such as GPT and Claude) to predict the brightness of Aequorea victoria green fluorescent protein (avGFP) and design mutants with higher brightness. Considering the time and cost associated with traditional experimental screening methods, this study employs machine learning techniques to enhance research efficiency. We first read and preprocess a proprietary dataset containing approximately 140,000 protein sequences, including about 30,000 avGFP sequences. Subsequently, we constructed and trained a Transformer-based prediction model to screen and design new avGFP mutants that are expected to exhibit higher brightness.
Our methodology consists of two primary stages: first, the construction of a scoring model using BERT, and second, the screening and generation of mutants using mutation site statistics and large language models. Through the analysis of predictive results, we designed and screened 10 new high-brightness avGFP sequences. This study not only demonstrates the potential of deep learning in protein design but also provides new perspectives and methodologies for future research by integrating prior knowledge from large language models.
2024-07-30T04:27:21Z
X. Guo; W. Che

http://arxiv.org/abs/2407.20116v2
Modelling vitamin D food fortification among Aboriginal and Torres Strait Islander peoples in Australia
2024-07-30T02:07:41Z
Background: Low vitamin D intake and high prevalence of vitamin D deficiency (serum 25-hydroxyvitamin D concentration < 50 nmol/L) among Aboriginal and Torres Strait Islander peoples highlight a need for public health strategies to improve vitamin D status. As few foods contain naturally occurring vitamin D, fortification strategies may be needed to improve vitamin D intake and status among Aboriginal and Torres Strait Islander peoples. Objective: We aimed to model vitamin D food fortification scenarios among Aboriginal and Torres Strait Islander peoples. Methods: We used nationally representative food consumption data (n=4,109) and vitamin D food composition data to model four food fortification scenarios. The modelling for Scenario 1 included foods and maximum vitamin D concentrations permitted for fortification in Australia: i) dairy products and alternatives, ii) butter/margarine/oil spreads, iii) formulated beverages, and iv) selected ready-to-eat breakfast cereals. The modelling for Scenarios 2a-c included some vitamin D concentrations higher than permitted in Australia; Scenario 2c included bread, which is not permitted for vitamin D fortification in Australia.
Scenario 2a: i) dairy products and alternatives, ii) butter/margarine/oil spreads, iii) formulated beverages. Scenario 2b: as per Scenario 2a plus selected ready-to-eat breakfast cereals. Scenario 2c: as per Scenario 2b plus bread. Results: Vitamin D fortification of a range of staple foods could potentially increase vitamin D intake among Aboriginal and Torres Strait Islander peoples by ~ 3-6 μg/day. Scenario 2c showed the highest potential median vitamin D intake increase to ~ 8 μg/day. Across all modelled scenarios, none of the participants had vitamin D intake above the Australian upper level of intake of 80 μg/day.
2024-07-29T15:47:00Z
Belinda Neo; Noel Nannup; Dale Tilbrook; Eleanor Dunlop; John Jacky; Carol Michie; Cindy Prior; Brad Farrant; Carrington C. J. Shepherd; Lucinda J. Black

http://arxiv.org/abs/2403.00815v3
RAM-EHR: Retrieval Augmentation Meets Clinical Predictions on Electronic Health Records
2024-07-26T23:24:39Z
We present RAM-EHR, a Retrieval AugMentation pipeline to improve clinical predictions on Electronic Health Records (EHRs). RAM-EHR first collects multiple knowledge sources, converts them into text format, and uses dense retrieval to obtain information related to medical concepts. This strategy addresses the difficulties associated with complex names for the concepts. RAM-EHR then augments the local EHR predictive model co-trained with consistency regularization to capture complementary information from patient visits and summarized knowledge. Experiments on two EHR datasets show the efficacy of RAM-EHR over previous knowledge-enhanced baselines (3.4% gain in AUROC and 7.2% gain in AUPR), emphasizing the effectiveness of the summarized knowledge from RAM-EHR for clinical prediction tasks. The code will be published at https://github.com/ritaranx/RAM-EHR.
2024-02-25T23:10:20Z
ACL 2024 (Oral)
ACL 2024
Ran Xu; Wenqi Shi; Yue Yu; Yuchen Zhuang; Bowen Jin; May D. Wang; Joyce C.
Ho; Carl Yang

http://arxiv.org/abs/2407.17601v1
When Life Gives You Lemons, Squeeze Your Way Through: Understanding Citrus Avoidance Behaviour by Free-Ranging Dogs in India
2024-07-24T19:16:07Z
Palatability of food is driven by multiple factors like taste, smell, texture, freshness, etc. and can be very variable across species. There are classic examples of local adaptations leading to speciation, driven by food availability. Urbanization across the world is causing rapid decline of biodiversity, while also driving local adaptations in some species. Free-ranging dogs are an interesting example of adaptation to a human-dominated environment across varied habitats. They have co-existed with humans for centuries and are a perfect model system for studying local adaptations. We attempted to understand a specific aspect of their scavenging behaviour in India: citrus aversion. Pet dogs are known to avoid citrus fruits and food contaminated by them. In India, lemons are used widely in the cuisine, and discarded in the garbage. Hence, free-ranging dogs, that typically are scavengers of human leftovers, are likely to encounter lemons and lemon-contaminated food on a regular basis. We carried out a population level experiment to test response of free-ranging dogs to chicken contaminated with various parts of lemon. The dogs avoided chicken contaminated with lemon juice the most. Further, when provided with chicken dipped in three different concentrations of lemon juice, the lowest concentration was most preferred. A survey confirmed that the local people use lemon in their diet extensively and also discard these with the leftovers. People avoided giving citrus contaminated food to their pets but did not follow the same caution for free-ranging dogs.
This study revealed that free-ranging dogs in West Bengal, India, are well adapted to scavenging among citrus-contaminated garbage and have their own strategies to avoid the contamination as far as possible, while maximizing their preferred food intake.
2024-07-24T19:16:07Z
Includes supplementary information
Tuhin Subhra Pal; Srijaya Nandi; Rohan Sarkar; Anindita Bhadra

http://arxiv.org/abs/2407.13797v1
Quantifying vitamin D intake among Aboriginal and Torres Strait Islander peoples in Australia
2024-07-17T02:41:53Z
Background/Objective: Vitamin D deficiency (serum 25-hydroxyvitamin D [25(OH)D] concentration <50 nmol/L) is prevalent among Aboriginal and Torres Strait Islander peoples in Australia. Alternative to sun exposure (the primary source of vitamin D), vitamin D can also be obtained from food (e.g., fish, eggs, and meat) and supplements. However, vitamin D intake among Aboriginal and Torres Strait Islander peoples is currently unknown. We aimed to provide the first quantification of vitamin D intake using nationally representative data from Aboriginal and Torres Strait Islander peoples. Methods: We used food consumption data collected in the 2012-2013 National Aboriginal and Torres Strait Islander Nutrition and Physical Activity Survey (n = 4,109) and vitamin D food composition data to quantify mean absolute vitamin D intake by sex, age group, and remoteness of location. Differences in mean vitamin D intake between sexes and between remoteness of location were assessed using the 95% confidence interval (95% CI). Results: The mean (standard deviation (SD)) vitamin D intake among Aboriginal and Torres Strait Islander peoples was 2.9 (3.0) μg/day. Males had a statistically significantly higher mean (SD) [95% CI] vitamin D intake (3.2 (3.1) [3.0-3.4] μg/day) than females (2.6 (2.7) [2.4-2.7] μg/day).
There were no statistically significant differences between mean (SD) [95% CI] vitamin D intake in non-remote (2.9 (2.2) [2.7-3.1] μg/day) and remote areas (2.8 (4.8) [2.6-3.0] μg/day). Conclusions: Vitamin D intake among Aboriginal and Torres Strait Islander peoples is low. Food-based public health strategies could be developed to promote higher vitamin D intake among this population.
2024-07-17T02:41:53Z
Belinda Neo; Dale Tilbrook; Noel Nannup; Alison Daly; Eleanor Dunlop; John Jacky; Carol Michie; Cindy Prior; Brad Farrant; Carrington C. J. Shepherd; Lucinda J. Black

http://arxiv.org/abs/2407.06211v1
Synthetic data: How could it be used for infectious disease research?
2024-07-03T17:13:04Z
Over the last three to five years, it has become possible to generate machine learning synthetic data for healthcare-related uses. However, concerns have been raised about potential negative factors associated with the possibilities of artificial dataset generation. These include the potential misuse of generative artificial intelligence (AI) in fields such as cybercrime, the use of deepfakes and fake news to deceive or manipulate, and displacement of human jobs across various market sectors.
Here, we consider both current and future positive advances and possibilities with synthetic datasets. Synthetic data offers significant benefits, particularly in data privacy, research, balancing datasets, and reducing bias in machine learning models. Generative AI (GenAI) is a class of artificial intelligence capable of creating text, images, video or other data using generative models. The recent explosion of interest in GenAI was heralded by the invention and rapid adoption of large language models (LLMs). These computational models are able to achieve general-purpose language generation and other natural language processing tasks and are based on transformer architectures, which made an evolutionary leap from previous neural network architectures.
Fuelled by the advent of improved GenAI techniques and wide scale usage, this is surely the time to consider how synthetic data can be used to advance infectious disease research. In this commentary we aim to create an overview of the current and future position of synthetic data in infectious disease research.
2024-07-03T17:13:04Z
Styliani-Christina Fragkouli; Dhwani Solanki; Leyla J Castro; Fotis E Psomopoulos; Núria Queralt-Rosinach; Davide Cirillo; Lisa C Crossman
10.1080/17460913.2024.2400853

http://arxiv.org/abs/2210.04627v4
Systematizing Cellular Complexity: A Hilbertian Approach To Biological Problems
2024-07-02T01:02:46Z
Examining individual components of cellular systems has been successful in uncovering molecular reactions and interactions. However, the challenge lies in integrating these components into a comprehensive system-scale map. This difficulty arises due to factors such as missing links (unknown variables), overlooked nonlinearities in high-dimensional parameter space, downplayed natural noisiness and stochasticity, and a lack of focus on causal influence and temporal dynamics. Composite static and phenomenological descriptions, while appearing complicated, lack the essence of what makes the biological systems truly "complex".
The formalization of system-level problems is therefore important in constructing a meta-theory of biology. Addressing fundamental aspects of cellular regulation, adaptability, and noise management is vital for understanding the robustness and functionality of biological systems. These aspects encapsulate the challenges that cells face in maintaining stability, responding to environmental changes, and harnessing noise for functionality. This work examines these key problems that cells must solve, serving as a template for such formalization and as a step towards the axiomatization of biological investigations. Through a detailed exploration of cellular mechanisms, particularly homeostatic configuration, ion channels and harnessing noise, this paper aims to illustrate complex concepts and theories in a tangible context, providing a bridge between abstract theoretical frameworks and concrete biological phenomena.
2022-10-05T18:15:04Z
Nima Dehghani

http://arxiv.org/abs/2408.01425v1
Comparative Evaluation of the Proximate and Cytogenotoxicity of Ash and Rice Chips Used as Mango Fruit Artificial Ripening Agents in Birnin Kebbi, Nigeria
2024-06-29T12:06:44Z
The high demand for mango (Mangifera indica L.) fruits has led sellers to employ ripening agents. However, concerns are growing regarding the potential toxicities of induced ripening, emphasizing the need for scientific investigation. Samples of artificially and naturally ripened mangoes were analyzed for proximate composition using standard protocols. Cytogenotoxicity was then assessed using the Allium cepa L. toxicity test. Twenty (20) A. cepa (onion) bulbs were used, with 5 ripened naturally, 5 with wood ash, 5 with herbaceous ash, and 5 with rice chips, all grown over tap water for five days. The root tips of the bulbs were assayed and examined for chromosomal aberrations. The results revealed a significant (P<0.05) increase in moisture, protein, and ash content of mangoes as ripening agents were introduced.
Mangoes ripened with wood ash exhibited the highest moisture content (81%), while those ripened with rice chips had the highest protein (0.5%) and ash content (1.5%). Naturally ripened mangoes displayed the highest fat (0.0095%) and fiber (11.46%) contents. The A. cepa toxicity test indicated significant (p<0.05) differences in the root growth of mangoes ripened with various agents. Wood ash resulted in the highest root growth (2.62 cm), while herbaceous ash had the least (2.18%). Chromosomal aberrations, including sticky, vagrant, and laggard abnormalities, were observed in all agents, with herbaceous ash exhibiting the highest and rice chips the least. The obtained results suggest that induced ripening of the fruits could induce toxicities, highlighting the necessity for public awareness regarding the potential dangers posed by these agents.
2024-06-29T12:06:44Z
Accept
CD Obadiah; TO Yahaya; AA Aliero; M Abdulkareem

http://arxiv.org/abs/2407.00420v1
Cement Dust Exposure and Risk of Hyperglycemia and Overweight among Artisans and Residents Close to a Cement Factory in Sokoto, Nigeria
2024-06-29T12:00:17Z
The potential health risks of cement dust exposure are increasingly raising concern worldwide as the cement industry expands in response to rising cement demand. This necessitates the need to determine the nature of the risks in order to develop appropriate measures. This study determined the effects of cement dust exposure on the weight and blood glucose levels of people residing or working around a cement company in Sokoto, Nigeria. Demographic information was obtained using questionnaires from 72 participants, which included age, gender, educational level, exposure hours, occupation, and lifestyle. The blood glucose levels and body mass index (BMI) were measured using a Fine Test glucometer and a mechanical scale, respectively.
The results showed that most of the people living or working around the cement company were middle-aged men (31-40; 42.06%) with a primary (33.33%) or secondary (45.83%) school education. It showed that 30 (41.69%) of the participants were overweight while 5 (6.94%) were obese. Additionally, 52.78% of the participants were diabetic while 31.94% were prediabetic. Participants that were exposed for long hours (> 15 hours per day) were the most diabetic (20% of the participants), followed by smokers (15%), and artisans (7%). It can be concluded that exposure to cement dust from the company increased the risk of overweight, obesity, and hyperglycemia among the participants. These health risks were worsened by daily long hours of exposure, smoking, and artisanal pollutant exposure. Human settlements and artisans should not be located near the cement company, and the company should minimize pollutant emissions.
2024-06-29T12:00:17Z
accept
T Yahaya; KA Sani; E Oladele; E Yawa; M Musa; M Abubakar; R Sulaiman; M Bilyaminu
10.5281/zenodo.10852417

http://arxiv.org/abs/2407.00408v1
Rooting behavior of pomegranate (Punica granatum L.) hardwood cuttings in relation to genotype and irrigation frequency
2024-06-29T11:22:55Z
The study was conducted to determine the best irrigation frequency for rooting hardwood cuttings of some pomegranate genotypes that are cultivated in Halabja province, Kurdistan Region, Iraq. The hardwood cuttings were collected from 11 genotypes, which were 'Salakhani Trsh' (G1), 'Salakhani Mekhosh' (G2), 'Amriki' (G3), 'Twekl Sury Trsh' (G4), 'Twekl Astury Naw Spy' (G5), 'Hanara Sherina' (G6), 'Kawa Hanary Sherin' (G7), 'Kawa Hanary Trsh' (G8), 'Malesay Twekl Asture' (G9), 'Malesay Twekl Tank' (G10), and 'Sura Hanary Trsh' (G11). The genotypes were subjected to irrigation applications by 1-day, 2-day, 7-day, or 10-day frequencies. Among pomegranates, G11, G6, and G7 produced 95, 90, and 83% rooting percentages, which were significantly higher than the rest of other genotypes.
The lowest rooting percentages (28, 36, 38, and 40%) were found in G1, G5, G3, and G10, respectively. The effect of irrigation frequencies on the genotypes confirmed that a 7-day frequency was the best irrigation frequency to achieve the maximum rooting percentages (93, 86, 80, 73, 53, and 40%) in G6, G9, G2, G4, G3, and G1, respectively. In contrast, the minimum rooting percentage (20%) was recorded in G3 with a 1-day frequency and in G1 with a 10-day frequency. In this study, it was found that the cuttings of G11, G6, and G7 had the best ability to form roots, and irrigation with a 7-day frequency was the best for the cuttings of all the 11 pomegranate genotypes investigated.
2024-06-29T11:22:55Z
Kocher Omer Salih; Aram Akram Mohammed; Jamal Mahmood Faraj; Anwar Mohammed Raouf; Nawroz Abdul-Razzak Tahir

http://arxiv.org/abs/1210.7091v2
Development of Hydrogen Bonding Magnetic Reaction-based Gene Regulation through Cyclic Electromagnetic DNA Simulation in Double-Stranded DNA
2024-06-29T07:11:00Z
The proton-magnetic reaction is commonly used in MRI machines with a strong magnetic field of over 1 T, while this study hypothesized that the electron magnetic reaction of hydrogen could affect the hydrogen bonds of double-stranded DNA (dsDNA) at a low magnetic field below 0.01 T. The goal is to develop a hydrogen bonding magnetic reaction-based gene regulation (HBMR-GR) system. The polarities of DNA base pairs are derived from the relative electrostatic charge between purines and pyrimidines, which become positively and negatively charged, respectively. The Pyu dsDNAs with pyrimidine(s)-purine(s) sequences, ds3T3A, ds3C3G, and ds3C3A, showed stronger DNA hybridization potential, increased infrared absorption at 3400-3200 cm-1, and a unique DNA conformation in HPLC analysis compared to the corresponding Puy dsDNAs.
To target the three-dimensional structure of dsDNA based on the DNA base pair polarities, one can use cyclic electromagnetic DNA simulation (CEDS) with approximately 25% efficiency for randomly oriented dsDNAs. CEDS was found to induce sequence-specific hybridization of target oligo-dsDNAs in 0.005M NaCl solution and sequence-specific conformation of oligo-dsDNAs in 0.1M NaCl solution. It was found that the Pyu oligo-dsDNAs were more responsible for the hybridization and conformational changes by CEDS than the Puy oligo-dsDNAs. CEDS decreased ethidium bromide (EtBr) DNA intercalation and spermidine DNA condensation depending on CEDS time in the binding assay. The results also indicated that the Pyu oligo-dsDNAs were more responsible for CEDS by forming stable and unique conformation of oligo-dsDNA than the Puy oligo-dsDNAs. Therefore, it is postulated that the low-level HBMR-based CEDS can enhance the hybridization potential of oligo-dsDNAs and subsequently lead to the unique DNA conformation required for the initiation of various DNA functions.
2012-10-26T10:25:03Z
Please, find two manuscripts "Development of Hydrogen Bonding Magnetic Reaction-based Gene Regulation through Cyclic Electromagnetic DNA Simulation in Double-Stranded DNA" and "Application of Cyclic Electromagnetic DNA Simulation to Target Oncogenesis-related miRNAs and DNA Motifs: Changes of Protein Signaling Pathway System in RAW 264.7 Cells"
Yeon Sook Kim; Dae Gwan Lee; Suk Keun Lee

http://arxiv.org/abs/2407.00332v1
Machine Learning Models for Dengue Forecasting in Singapore
2024-06-29T06:27:52Z
With emerging prevalence beyond traditionally endemic regions, the global burden of dengue disease is forecasted to be one of the fastest growing. With limited direct treatment or vaccination currently available, prevention through vector control is widely believed to be the most effective form of managing outbreaks.
This study examines traditional state space models (moving average, autoregressive, ARIMA, SARIMA), supervised learning techniques (XGBoost, SVM, KNN) and deep networks (LSTM, CNN, ConvLSTM) for forecasting weekly dengue cases in Singapore. Meteorological data and search engine trends were included as features for ML techniques. Forecasts using CNNs yielded the lowest RMSE in weekly cases in 2019.
2024-06-29T06:27:52Z
12 pages, 6 figures
Zi Iun Lai; Wai Kit Fung; Enquan Chew

http://arxiv.org/abs/2406.16364v3
The unpaved road towards efficient selective breeding in insects for food and feed
2024-06-26T06:37:45Z
Insect production for food and feed presents a promising supplement to ensure food safety and address the adverse impacts of agriculture on climate and environment in the future. However, optimisation is required for insect production to realise its full potential. This can be by targeted improvement of traits of interest through selective breeding, an approach which has so far been underexplored and underutilised in insect farming. Here we present a comprehensive review of the selective breeding framework in the context of insect production. We systematically evaluate adjustments of selective breeding techniques to the realm of insects and highlight the essential components integral to the breeding process. The discussion covers every step of a conventional breeding scheme, such as formulation of breeding objectives, phenotyping, estimation of genetic parameters and breeding values, selection of appropriate breeding strategies, and mitigation of issues associated with genetic diversity depletion and inbreeding.
This review combines knowledge from diverse disciplines, bridging the gap between animal breeding, quantitative genetics, evolutionary biology, and entomology, offering an integrated view of the insect breeding research area and uniting knowledge which has previously remained scattered across diverse fields of expertise.
2024-06-24T07:14:48Z
Laura Skrubbeltrang Hansen; Stine Frey Laursen; Simon Bahrndorff; Jesper Givskov Sørensen; Goutam Sahana; Torsten Nygaard Kristensen; Hanne Marie Nielsen
10.1111/eea.13526

http://arxiv.org/abs/2403.03335v2
From virtual patients to digital twins in immuno-oncology: lessons learned from mechanistic quantitative systems pharmacology modeling
2024-06-23T14:29:55Z
Virtual patients and digital patients/twins are two similar concepts gaining increasing attention in health care with goals to accelerate drug development and improve patients' survival, but with their own limitations. Although methods have been proposed to generate virtual patient populations using mechanistic models, there are a limited number of applications in immuno-oncology research. Furthermore, due to the stricter requirements of digital twins, they are often generated in a study-specific manner with models customized to particular clinical settings (e.g., treatment, cancer, and data types). Here, we discuss the challenges for virtual patient generation in immuno-oncology with our most recent experiences, initiatives to develop digital twins, and how research on these two concepts can inform each other.
2024-03-05T21:38:05Z
21 pages, 1 figure
NPJ Digit Med. 2024
Hanwen Wang (Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA); Theinmozhi Arulraj (Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA); Alberto Ippolito (Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA); Aleksander S.
Popel (Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Departments of Medicine and Oncology, and the Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA)

http://arxiv.org/abs/2406.15963v1
Effectiveness of ChatGPT in explaining complex medical reports to patients
2024-06-23T00:04:07Z
Electronic health records contain detailed information about the medical condition of patients, but they are difficult for patients to understand even if they have access to them. We explore whether ChatGPT (GPT 4) can help explain multidisciplinary team (MDT) reports to colorectal and prostate cancer patients. These reports are written in dense medical language and assume clinical knowledge, so they are a good test of the ability of ChatGPT to explain complex medical reports to patients. We asked clinicians and lay people (not patients) to review explanations and responses of ChatGPT. We also ran three focus groups (including cancer patients, caregivers, computer scientists, and clinicians) to discuss output of ChatGPT. Our studies highlighted issues with inaccurate information, inappropriate language, limited personalization, AI distrust, and challenges integrating large language models (LLMs) into clinical workflow. These issues will need to be resolved before LLMs can be used to explain complex personal medical information to patients.
2024-06-23T00:04:07Z
under review
Mengxuan Sun; Ehud Reiter; Anne E Kiltie; George Ramsay; Lisa Duncan; Peter Murchie; Rosalind Adam