ABSTRACT
Microbes are found in nearly every habitat and organism on the planet, where they are critical to host health, fitness, and metabolism. In most organisms, few microbes are inherited at birth; instead, acquiring microbiomes generally involves complicated interactions between the environment, hosts, and symbionts. Despite the importance of microbiome acquisition, we know little about where hosts' microbes reside when not in or on hosts of interest. Because microbes span a continuum from generalists that associate with multiple hosts and habitats to specialists with narrower host ranges, identifying the potential sources of microbial diversity that contribute to the microbiomes of unrelated hosts remains a gap in our understanding of microbiome assembly. Microbial dispersal attenuates with distance, so identifying sources and sinks requires data from microbiomes that are contemporary and near enough for potential microbial transmission. Here, we characterize microbiomes across adjacent terrestrial and aquatic hosts and habitats throughout an entire watershed, showing that the most species-poor microbiomes are partial subsets of the most species-rich and that microbiomes of plants and animals are nested within those of their environments. Furthermore, we show that the host and habitat range of a microbe within a single ecosystem predicts its global distribution, a relationship with implications for global microbial assembly processes. Thus, the tendency for microbes to occupy multiple habitats and unrelated hosts enables persistent microbiomes, even when host populations are disjunct. Our whole-watershed census demonstrates how a nested distribution of microbes, following the trophic hierarchies of hosts, can shape microbial acquisition.
Subject(s)
Ecosystem , Microbiota , Plants , Animals , Bacteria , Plants/microbiology
ABSTRACT
BACKGROUND: Artificial intelligence (AI) large language models (LLMs) such as ChatGPT have demonstrated the ability to pass standardized exams. These models are not trained for a specific task, but are instead trained to predict sequences of text from large corpora of documents sourced from the internet. It has been shown that even models trained on this general task can pass exams in a variety of domain-specific fields, including the United States Medical Licensing Examination. We asked whether large language models would perform as well on a much narrower subdomain test designed for medical specialists. Furthermore, we wanted to better understand how successive generations of GPT (generative pre-trained transformer) models are evolving in the completeness and sophistication of their responses even while their training remains general. In this study, we evaluated the performance of two versions of GPT (GPT-3 and GPT-4) on their ability to pass the certification exam given to physicians who wish to work as osteoporosis specialists and become certified clinical densitometrists (CCDs). The CCD exam has a possible score range of 150 to 400; a score of 300 is required to pass. METHODS: A 100-question multiple-choice practice exam was obtained from a third-party exam preparation website that mimics the accredited certification tests given by the ISCD (International Society for Clinical Densitometry). The exam was administered to two versions of GPT, the free version (GPT Playground) and ChatGPT+, which are based on GPT-3 and GPT-4, respectively (OpenAI, San Francisco, CA). The systems were prompted with the exam questions verbatim. If a response was purely textual and did not specify which of the multiple-choice answers to select, the authors matched the text to the closest answer. Each exam was graded, and an estimated ISCD score was provided by the exam website. In addition, each response was evaluated by a CCD-certified rheumatologist and rated for accuracy on a 5-level scale. The two GPT versions were compared in terms of response accuracy and length. RESULTS: The average response length was 11.6 ± 19 words for GPT-3 and 50.0 ± 43.6 words for GPT-4. GPT-3 answered 62 questions correctly, resulting in a failing ISCD score of 289, whereas GPT-4 answered 82 questions correctly, with a passing score of 342. GPT-3 scored highest on the "Overview of Low Bone Mass and Osteoporosis" category (72% correct), while GPT-4 scored well above 80% accuracy on all categories except "Imaging Technology in Bone Health" (65% correct). Regarding subjective accuracy, GPT-3 answered 23 questions with nonsensical or totally wrong responses, whereas GPT-4 had no responses in that category. CONCLUSION: If this had been an actual certification exam, GPT-4 would now carry the CCD suffix after its name, despite having been trained only on general internet text. Clearly, more goes into physician training than can be captured by this exam. However, GPT algorithms may prove to be valuable physician aids in the diagnosis and monitoring of osteoporosis and other diseases.
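As a rough illustration of the protocol described above, the sketch below administers a multiple-choice exam to a GPT model through the OpenAI Python API and tallies correct answers. This is a hypothetical reconstruction, not the authors' procedure: the study used the GPT Playground and ChatGPT+ web interfaces, and the file name practice_exam.json, the model name, and the letter-matching fallback are illustrative assumptions.

# Hypothetical sketch: administer a multiple-choice practice exam to a GPT model
# via the OpenAI API and count correct answers. The study itself used the web
# interfaces; the exam file, model name, and matching fallback are assumptions.
import json
import re
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask(question_text, model="gpt-4"):
    # Prompt with the exam question verbatim, as in the study.
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question_text}],
    )
    return resp.choices[0].message.content

def extract_choice(answer_text, options):
    # Prefer an explicit letter (A-D); otherwise fall back to the option whose
    # text shares the most words with the free-text response (the authors did
    # this matching by hand).
    m = re.search(r"\b([A-D])\b", answer_text)
    if m:
        return m.group(1)
    overlap = {letter: len(set(text.lower().split()) & set(answer_text.lower().split()))
               for letter, text in options.items()}
    return max(overlap, key=overlap.get)

with open("practice_exam.json") as f:   # hypothetical file of 100 questions
    exam = json.load(f)                 # items: {"question", "options", "correct"}

correct = 0
for item in exam:
    prompt = item["question"] + "\n" + "\n".join(f"{k}. {v}" for k, v in item["options"].items())
    reply = ask(prompt)
    if extract_choice(reply, item["options"]) == item["correct"]:
        correct += 1

print(f"{correct} / {len(exam)} questions answered correctly")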
Subject(s)
Artificial Intelligence , Certification , Humans , Osteoporosis/diagnosis , Clinical Competence , Educational Measurement/methods , United States
ABSTRACT
Background The ability of deep learning (DL) models to classify women as at risk for either screening mammography-detected or interval cancer (not detected at mammography) has not yet been explored in the literature. Purpose To examine the ability of DL models to estimate the risk of interval and screening-detected breast cancers with and without clinical risk factors. Materials and Methods This study was performed on 25 096 digital screening mammograms obtained from January 2006 to December 2013. The mammograms were obtained in 6369 women without breast cancer at the time of imaging, 1609 of whom subsequently developed screening-detected breast cancer and 351 of whom developed interval invasive breast cancer. A DL model was trained on the negative mammograms to classify women into those who did not develop cancer and those who developed screening-detected cancer or interval invasive cancer. Model effectiveness was evaluated as a matched concordance statistic (C statistic) in a held-out 26% (1669 of 6369) test set of the mammograms. Results The C statistics and odds ratios for comparing patients with screening-detected cancer versus matched controls were 0.66 (95% CI: 0.63, 0.69) and 1.25 (95% CI: 1.17, 1.33), respectively, for the DL model, 0.62 (95% CI: 0.59, 0.65) and 2.14 (95% CI: 1.32, 3.45) for the clinical risk factors with the Breast Imaging Reporting and Data System (BI-RADS) density model, and 0.66 (95% CI: 0.63, 0.69) and 1.21 (95% CI: 1.13, 1.30) for the combined DL and clinical risk factors model. For comparing patients with interval cancer versus controls, the C statistics and odds ratios were 0.64 (95% CI: 0.58, 0.71) and 1.26 (95% CI: 1.10, 1.45), respectively, for the DL model, 0.71 (95% CI: 0.65, 0.77) and 7.25 (95% CI: 2.94, 17.9) for the risk factors with BI-RADS density (b rated vs non-b rated) model, and 0.72 (95% CI: 0.66, 0.78) and 1.10 (95% CI: 0.94, 1.29) for the combined DL and clinical risk factors model. The P values comparing the DL, BI-RADS, and combined models' ability to detect screening-detected versus interval cancer were .99, .002, and .03, respectively. Conclusion Compared with clinical risk factors including breast density, the deep learning model performed better at determining screening-detected cancer risk but worse at determining interval cancer risk. © RSNA, 2021. Online supplemental material is available for this article. See also the editorial by Bae and Kim in this issue.
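Under a 1:1 matched design, one common form of the matched concordance statistic reported above is simply the fraction of matched pairs in which the case receives a higher model score than its control, with ties counted as one half. The short sketch below computes that quantity; the matching scheme and the toy scores are assumptions, not the study's data or exact estimator.

# Hedged sketch of a matched concordance (C) statistic, assuming 1:1 matched
# case-control pairs: the fraction of pairs in which the case receives a higher
# risk score than its matched control, with ties counted as 1/2.
import numpy as np

def matched_c_statistic(case_scores, control_scores):
    case_scores = np.asarray(case_scores, dtype=float)
    control_scores = np.asarray(control_scores, dtype=float)
    wins = (case_scores > control_scores).sum()
    ties = (case_scores == control_scores).sum()
    return (wins + 0.5 * ties) / len(case_scores)

# Toy example with 5 matched pairs (illustrative values only).
print(matched_c_statistic([0.8, 0.6, 0.55, 0.9, 0.4],
                          [0.5, 0.6, 0.60, 0.3, 0.2]))   # 0.7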
Subject(s)
Breast Neoplasms/diagnostic imaging , Deep Learning/statistics & numerical data , Mammography/methods , Mass Screening/statistics & numerical data , Radiographic Image Interpretation, Computer-Assisted/methods , Breast/diagnostic imaging , Case-Control Studies , Female , Humans , Middle Aged , Predictive Value of Tests , Prospective Studies , Reproducibility of Results , United States
ABSTRACT
Random backpropagation (RBP) is a variant of the backpropagation algorithm for training neural networks, where the transposes of the forward matrices are replaced by fixed random matrices in the calculation of the weight updates. It is remarkable both because of its effectiveness, in spite of using random matrices to communicate error information, and because it completely removes the taxing requirement of maintaining symmetric weights in a physical neural system. To better understand random backpropagation, we first connect it to the notions of local learning and learning channels. Through this connection, we derive several alternatives to RBP, including skipped RBP (SRBP), adaptive RBP (ARBP), sparse RBP, and their combinations (e.g. ASRBP) and analyze their computational complexity. We then study their behavior through simulations using the MNIST and CIFAR-10 benchmark datasets. These simulations show that most of these variants work robustly, almost as well as backpropagation, and that multiplication by the derivatives of the activation functions is important. As a follow-up, we also study the low end of the number of bits required to communicate error information over the learning channel. We then provide partial intuitive explanations for some of the remarkable properties of RBP and its variations. Finally, we prove several mathematical results, including the convergence to fixed points of linear chains of arbitrary length, the convergence to fixed points of linear autoencoders with decorrelated data, the long-term existence of solutions for linear systems with a single hidden layer and convergence in special cases, and the convergence to fixed points of non-linear chains, when the derivative of the activation functions is included.
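The core substitution is easy to state in code. The following minimal numpy sketch trains a one-hidden-layer network in which the backward pass uses a fixed random matrix B in place of W2.T; the network sizes, learning rate, and toy data are illustrative assumptions, and skipped or adaptive variants would modify only how B is wired or updated.

# Minimal sketch of random backpropagation (RBP) on a one-hidden-layer network,
# assuming a logistic output with cross-entropy loss. The fixed random matrix B
# replaces W2.T when propagating the error to the hidden layer.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 20, 32, 1
W1 = rng.normal(scale=0.1, size=(n_hid, n_in))
W2 = rng.normal(scale=0.1, size=(n_out, n_hid))
B = rng.normal(scale=0.1, size=(n_hid, n_out))   # fixed random feedback matrix

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = rng.normal(size=(256, n_in))
y = (X[:, :2].sum(axis=1) > 0).astype(float).reshape(-1, 1)   # toy targets

lr = 0.1
for epoch in range(200):
    h = np.tanh(X @ W1.T)                 # hidden activations
    out = sigmoid(h @ W2.T)               # output probabilities
    e = out - y                           # output error (logistic + cross-entropy)
    # Backpropagation would use: delta_h = (e @ W2) * (1 - h**2)
    # RBP uses the fixed random matrix B instead (the derivative term is kept):
    delta_h = (e @ B.T) * (1 - h ** 2)
    W2 -= lr * e.T @ h / len(X)
    W1 -= lr * delta_h.T @ X / len(X)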
ABSTRACT
Both the American College of Rheumatology (ACR) and European League Against Rheumatism (EULAR) guidelines recommend the use of methotrexate (MTX) for the treatment of rheumatoid arthritis (RA) when there is no contraindication. While MTX is the foundation of RA therapy (Singh et al. in Arthritis Care Res 64:625-639, 2012), absorption saturation compromises its oral bioavailability (BA). Differences in the relative BA of oral versus subcutaneous (SC) MTX demonstrate the need for guidance on successful dose-conversion strategies. This study was designed to compare MTX pharmacokinetic (PK) profiles across three routes of administration: oral, and SC MTX administered via an auto-injector (MTXAI) into either the abdomen (MTXAIab) or the thigh (MTXAIth). In this paper, we establish a dose-conversion method based on the BA of MTX after oral and SC administration. SC administration provided higher exposure of MTX than the same dose given orally. Unlike the exposure limitations of oral MTX, dose-proportional exposure was seen with SC MTX.
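The quantity underlying such a dose-conversion exercise is the relative bioavailability of the oral form, which in standard PK terms can be written as below. This is a sketch only: the study's fitted values are not reproduced here, and treating SC exposure as dose-proportional while oral exposure saturates is the assumption that makes the conversion dose-dependent.

% Relative bioavailability of oral vs. SC MTX at a common dose D, and the conversion it implies:
F_{\mathrm{rel}}(D) = \frac{\mathrm{AUC}_{\mathrm{oral}}(D)/D}{\mathrm{AUC}_{\mathrm{SC}}(D)/D}
                    = \frac{\mathrm{AUC}_{\mathrm{oral}}(D)}{\mathrm{AUC}_{\mathrm{SC}}(D)},
\qquad
D_{\mathrm{SC}} \approx F_{\mathrm{rel}}(D_{\mathrm{oral}})\, D_{\mathrm{oral}}.
% Because SC exposure is dose-proportional while oral exposure saturates, F_rel lies below 1 and
% decreases with increasing oral dose, so matching a given oral exposure requires a smaller SC dose.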
Subject(s)
Antirheumatic Agents/administration & dosage , Arthritis, Rheumatoid/drug therapy , Methotrexate/administration & dosage , Administration, Oral , Aged , Antirheumatic Agents/therapeutic use , Cross-Over Studies , Dose-Response Relationship, Drug , Female , Humans , Injections, Subcutaneous , Male , Methotrexate/therapeutic use , Middle Aged , Treatment Outcome
ABSTRACT
MOTIVATION: Animal models are widely used in biomedical research for reasons ranging from practical to ethical. An important issue is whether rodent models are predictive of human biology. This has been addressed recently in the framework of a series of challenges designed by the systems biology verification for Industrial Methodology for Process Verification in Research (sbv IMPROVER) initiative. In particular, one of the sub-challenges was devoted to the prediction of protein phosphorylation responses in human bronchial epithelial cells, exposed to a number of different chemical stimuli, given the responses in rat bronchial epithelial cells. Participating teams were asked to make inter-species predictions on the basis of available training examples, comprising transcriptomics and phosphoproteomics data. RESULTS: Here, the two best performing teams present their data-driven approaches and computational methods. In addition, post hoc analyses of the datasets and challenge results were performed by the participants and challenge organizers. The challenge outcome indicates that successful prediction of protein phosphorylation status in human based on rat phosphorylation levels is feasible. However, within the limitations of the computational tools used, the inclusion of gene expression data does not improve the prediction quality. The post hoc analysis of time-specific measurements sheds light on the signaling pathways in both species. AVAILABILITY AND IMPLEMENTATION: A detailed description of the dataset, challenge design and outcome is available at www.sbvimprover.com. The code used by team IGB is provided under http://github.com/uci-igb/improver2013. Implementations of the algorithms applied by team AMG are available at http://bhanot.biomaps.rutgers.edu/wiki/AMG-sc2-code.zip. CONTACT: meikelbiehl@gmail.com.
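As a generic illustration of the task setup only (not the methods of either team, whose implementations are available at the repositories cited in the abstract above), a baseline for the inter-species sub-challenge might look like the following; the file names, feature choice, and random-forest classifier are all assumptions.

# Illustrative baseline for the inter-species sub-challenge: predict binary human
# phosphorylation status for each stimulus from the rat phosphoproteomics response.
# File names, features, and the random-forest choice are assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Rows = chemical stimuli; columns = rat phosphoprotein measurements (hypothetical files).
X_rat = pd.read_csv("rat_phospho_by_stimulus.csv", index_col=0)
# One binary column per human phosphoprotein, aligned on the same stimuli.
Y_human = pd.read_csv("human_phospho_status_by_stimulus.csv", index_col=0)

scores = {}
for protein in Y_human.columns:
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    # Leave-one-stimulus-out CV would be more faithful; 5-fold is a placeholder.
    scores[protein] = cross_val_score(clf, X_rat.values, Y_human[protein].values,
                                      cv=5, scoring="roc_auc").mean()

print(pd.Series(scores).sort_values(ascending=False).head())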
Subject(s)
Bronchi/metabolism , Epithelial Cells/metabolism , Gene Expression Profiling , Phosphoproteins/metabolism , Software , Systems Biology/methods , Algorithms , Animals , Bronchi/cytology , Cells, Cultured , Databases, Factual , Epithelial Cells/cytology , Gene Expression Regulation , Humans , Oligonucleotide Array Sequence Analysis , Phosphorylation , Rats , Species Specificity , Translational Research, Biomedical
ABSTRACT
Machine learning (ML) and quantum mechanical (QM) methods can be used in two-way synergy to build chemical reaction expert systems. The proposed ML approach identifies electron sources and sinks among reactants and then ranks all source-sink pairs. This addresses a bottleneck of QM calculations by providing a prioritized list of mechanistic reaction steps. QM modeling can then be used to compute the transition states and activation energies of the top-ranked reactions, providing additional or improved examples of ranked source-sink pairs. Retraining the ML model closes the loop, producing more accurate predictions from a larger training set. The approach is demonstrated in detail using a small set of organic radical reactions.
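A schematic of this closed loop, with the QM step stubbed out, is sketched below. The featurization, the gradient-boosting ranker, and the run_qm_validation placeholder are illustrative stand-ins; in practice the validation step would be a transition-state search and activation-energy calculation.

# Schematic of the ML <-> QM loop: an ML ranker proposes the most plausible electron
# source-sink pairs, QM validates the top-ranked ones, and the validated examples are
# fed back into the training set. All components here are illustrative stand-ins.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def featurize(pair):
    # Stand-in for real source/sink descriptors (charges, orbital energies, ...).
    return np.asarray(pair["features"])

def run_qm_validation(pair):
    # Stand-in for a transition-state search + activation-energy calculation.
    # Returns True if the proposed mechanistic step is confirmed as favorable.
    return pair["features"][0] > 0.5        # placeholder criterion

def rank_pairs(model, candidate_pairs):
    X = np.stack([featurize(p) for p in candidate_pairs])
    scores = model.predict_proba(X)[:, 1]
    order = np.argsort(scores)[::-1]
    return [candidate_pairs[i] for i in order]

# Toy candidate source-sink pairs and an initial labeled set.
rng = np.random.default_rng(0)
candidates = [{"features": rng.random(4)} for _ in range(200)]
X_train = rng.random((50, 4))
y_train = (X_train[:, 0] > 0.5).astype(int)

model = GradientBoostingClassifier().fit(X_train, y_train)
for iteration in range(3):                       # a few closed-loop rounds
    top = rank_pairs(model, candidates)[:20]     # prioritized list for QM
    new_X = np.stack([featurize(p) for p in top])
    new_y = np.array([int(run_qm_validation(p)) for p in top])
    X_train = np.vstack([X_train, new_X])        # grow the training set
    y_train = np.concatenate([y_train, new_y])
    model = GradientBoostingClassifier().fit(X_train, y_train)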
Subject(s)
Machine Learning , Quantum Theory , Models, Molecular , Molecular Conformation , Thermodynamics
ABSTRACT
Dropout is a recently introduced algorithm for training neural networks by randomly dropping units during training to prevent their co-adaptation. A mathematical analysis of some of the static and dynamic properties of dropout is provided using Bernoulli gating variables, general enough to accommodate dropout on units or connections, and with variable rates. The framework allows a complete analysis of the ensemble averaging properties of dropout in linear networks, which is useful to understand the non-linear case. The ensemble averaging properties of dropout in non-linear logistic networks result from three fundamental equations: (1) the approximation of the expectations of logistic functions by normalized geometric means, for which bounds and estimates are derived; (2) the algebraic equality of normalized geometric means of logistic functions with the logistic of the means, which mathematically characterizes logistic functions; and (3) the linearity of the means with respect to sums, as well as products, of independent variables. The results are also extended to other classes of transfer functions, including rectified linear functions. Approximation errors tend to cancel each other and do not accumulate. Dropout can also be connected to stochastic neurons and used to predict firing rates, and to backpropagation by viewing the backward propagation as ensemble averaging in a dropout linear network. Moreover, the convergence properties of dropout can be understood in terms of stochastic gradient descent. Finally, for the regularization properties of dropout, the expectation of the dropout gradient is the gradient of the corresponding approximation ensemble, regularized by an adaptive weight decay term with a propensity for self-consistent variance minimization and sparse representations.
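The identity referred to in point (2) can be written out explicitly. For the logistic function and weights p_i summing to one over the sub-networks of the dropout ensemble:

% For \sigma(x) = 1/(1 + e^{-x}) and weights p_i with \sum_i p_i = 1,
\mathrm{NWGM}\big(\sigma(x_1),\dots,\sigma(x_n)\big)
  = \frac{\prod_i \sigma(x_i)^{p_i}}{\prod_i \sigma(x_i)^{p_i} + \prod_i \big(1 - \sigma(x_i)\big)^{p_i}}
  = \sigma\!\Big(\sum_i p_i x_i\Big).
% The second equality follows by dividing numerator and denominator by \prod_i \sigma(x_i)^{p_i}
% and using (1 - \sigma(x))/\sigma(x) = e^{-x}.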
ABSTRACT
BACKGROUND: Body shape, an intuitive health indicator, is deterministically driven by body composition. We developed and validated a deep learning model that generates accurate dual-energy X-ray absorptiometry (DXA) scans from three-dimensional optical body scans (3DO), enabling compositional analysis of the whole body and specified subregions. Previous works on generative medical imaging models lack quantitative validation and only report quality metrics. METHODS: Our model was self-supervised pretrained on two large clinical DXA datasets and fine-tuned using the Shape Up! Adults study dataset. Model-predicted scans from a holdout test set were evaluated using clinical commercial DXA software for compositional accuracy. RESULTS: Predicted DXA scans achieve R2 of 0.73, 0.89, and 0.99 and RMSEs of 5.32, 6.56, and 4.15 kg for total fat mass (FM), fat-free mass (FFM), and total mass, respectively. Custom subregion analysis results in R2s of 0.70-0.89 for left and right thigh composition. We demonstrate the ability of models to produce quantitatively accurate visualizations of soft tissue and bone, confirming a strong relationship between body shape and composition. CONCLUSIONS: This work highlights the potential of generative models in medical imaging and reinforces the importance of quantitative validation for assessing their clinical utility.
Body composition, the measured quantities of muscle, fat, and bone, is typically assessed through dual-energy X-ray absorptiometry (DXA) scans, which require specialized equipment and trained technicians and involve exposure to radiation. Exterior body shape depends on body composition, and recent technological advances have made three-dimensional (3D) scanning for body shape accessible and virtually ubiquitous. We developed a model which uses 3D body surface scan inputs to generate DXA scans. When analyzed with commercial software that is used clinically, our model-generated images yielded accurate quantities of fat, lean, and bone. Our work highlights the strong relationship between exterior body shape and interior composition. Moreover, it suggests that with enhanced accuracy, such medical imaging models could be more widely adopted in clinical care, making the analysis of body composition more accessible and easier to obtain.
ABSTRACT
Predicting the 3D structures of small molecules is a common problem in chemoinformatics. Even the best methods are inaccurate for complex molecules, and there is a large gap in accuracy between proprietary and free algorithms. Previous work presented COSMOS, a novel data-driven algorithm that uses knowledge of known structures from the Cambridge Structural Database and demonstrated performance competitive with proprietary algorithms. However, dependence on the Cambridge Structural Database prevented its widespread use. Here, we present an updated version of the COSMOS structure predictor, complete with a free structure library derived from open data sources. We demonstrate that COSMOS performs better than other freely available methods, with a mean RMSD of 1.16 and 1.68 Å for organic and metal-organic structures, respectively, and a mean prediction time of 60 ms per molecule. This is a 17% and 20% reduction, respectively, in RMSD compared to the free predictor provided by Open Babel, and it is 10 times faster. The ChemDB Web portal provides a COSMOS prediction Web server, as well as downloadable copies of the COSMOS executable and library of molecular substructures.
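The RMSD figures quoted above presuppose an optimal rigid-body superposition of predicted and reference coordinates. The following self-contained numpy implementation of that standard calculation (the Kabsch algorithm) is a generic utility for reproducing such comparisons, not COSMOS code, and it assumes the two structures share the same atom ordering.

# Generic RMSD-after-superposition (Kabsch) calculation of the kind used to score
# predicted 3D structures against reference structures.
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between Nx3 coordinate arrays P and Q after optimal superposition."""
    P = P - P.mean(axis=0)                 # remove translation
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q                            # covariance matrix
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T)) # correct for possible reflection
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T                     # optimal rotation
    P_rot = P @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Q) ** 2, axis=1))))

# Toy check: a rotated copy of a structure should give RMSD ~ 0.
rng = np.random.default_rng(0)
coords = rng.random((12, 3))
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
print(kabsch_rmsd(coords @ Rz.T, coords))   # approximately 0.0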
Subject(s)
Heterocyclic Compounds/chemistry , Organometallic Compounds/chemistry , Small Molecule Libraries/chemistry , Software , Algorithms , Crystallography, X-Ray , Databases, Chemical , Databases, Factual , Molecular Conformation
ABSTRACT
Background: Mortality research has identified biomarkers predictive of all-cause mortality risk. Most of these markers, such as body mass index, are predictive cross-sectionally, while for others the longitudinal change has been shown to be predictive, for instance greater-than-average muscle and weight loss in older adults. And while markers are sometimes derived from imaging modalities such as DXA, full scans are rarely used. This study builds on that knowledge and tests two hypotheses to improve all-cause mortality prediction. The first hypothesis is that features derived from raw total-body DXA imaging using deep learning are predictive of all-cause mortality, with and without clinical risk factors; the second is that sequential total-body DXA scans and recurrent neural network models outperform comparable models using only one observation, with and without clinical risk factors. Methods: Multiple deep neural network architectures were designed to test these hypotheses. The models were trained and evaluated on data from the 16-year-long Health, Aging, and Body Composition Study, which includes over 15,000 scans from over 3000 older, multi-race male and female adults. This study further used explainable AI techniques to interpret the predictions and evaluate the contribution of different inputs. Results: The results demonstrate that longitudinal total-body DXA scans are predictive of all-cause mortality and improve the performance of traditional mortality prediction models. On a held-out test set, the strongest model achieves an area under the receiver operating characteristic curve of 0.79. Conclusion: This study demonstrates the efficacy of deep learning for the analysis of DXA medical imaging in both cross-sectional and longitudinal settings. By analyzing the trained deep learning models, this work also sheds light on what constitutes healthy aging in a diverse cohort.
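A minimal PyTorch sketch of the longitudinal arm of the second hypothesis is shown below: a recurrent network runs over a sequence of per-visit DXA-derived feature vectors and is fused with clinical risk factors to output a mortality probability. The feature dimensions, layer sizes, and fusion scheme are assumptions for illustration; the study's actual architectures operate on the raw total-body scans.

# Minimal sketch of an LSTM over per-visit DXA-derived feature vectors, fused with
# clinical risk factors, predicting all-cause mortality. Sizes and fusion are assumptions.
import torch
import torch.nn as nn

class LongitudinalMortalityModel(nn.Module):
    def __init__(self, n_scan_features=64, n_clinical=10, hidden=32):
        super().__init__()
        self.rnn = nn.LSTM(input_size=n_scan_features, hidden_size=hidden,
                           batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden + n_clinical, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
        )

    def forward(self, scan_seq, clinical):
        # scan_seq: (batch, n_visits, n_scan_features); clinical: (batch, n_clinical)
        _, (h_n, _) = self.rnn(scan_seq)
        fused = torch.cat([h_n[-1], clinical], dim=1)
        return torch.sigmoid(self.head(fused)).squeeze(1)   # mortality probability

# Toy forward pass with 8 participants, 5 visits each.
model = LongitudinalMortalityModel()
scans = torch.randn(8, 5, 64)
clin = torch.randn(8, 10)
print(model(scans, clin).shape)   # torch.Size([8])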
ABSTRACT
Background: While breast imaging modalities such as full-field digital mammography and digital breast tomosynthesis have helped to reduce breast cancer mortality, issues with low specificity exist, resulting in unnecessary biopsies. The fundamental information used in diagnostic decisions is primarily based on lesion morphology. We explore a dual-energy compositional breast imaging technique known as three-compartment breast (3CB) to show how the addition of compositional information improves malignancy detection. Methods: Women who presented with Breast Imaging-Reporting and Data System (BI-RADS) diagnostic categories 4 or 5 and who were scheduled for breast biopsies were consecutively recruited for both standard mammography and 3CB imaging. Computer-aided detection (CAD) software was used to assign a morphology-based prediction of malignancy for all biopsied lesions. Compositional signatures for all lesions were calculated using 3CB imaging, and a neural network combined the CAD predictions with composition to predict a new probability of malignancy. CAD and neural network predictions were compared against the biopsy pathology. Results: The addition of 3CB compositional information to CAD improves malignancy predictions, resulting in an area under the receiver operating characteristic curve (AUC) of 0.81 (confidence interval (CI) 0.74-0.88) on a held-out test set, while the CAD software alone achieves an AUC of 0.69 (CI 0.60-0.78). We also find that invasive breast cancers have a unique compositional signature characterized by reduced lipid content and increased water and protein content compared to surrounding tissues. Conclusion: Clinically, 3CB may provide increased accuracy in predicting malignancy and a feasible avenue to explore compositional breast imaging biomarkers.
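A compact sketch of the fusion step, combining a CAD morphology score with 3CB compositional features in a small classifier and scoring it by held-out AUC, is given below. The column names, network size, and preprocessing are illustrative assumptions rather than the study's implementation.

# Sketch of fusing a CAD morphology score with 3CB compositional features (e.g.
# lesion-vs-surround water, lipid, and protein measures) in a small classifier,
# evaluated by held-out AUC. Column names and model size are assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import roc_auc_score

df = pd.read_csv("lesions.csv")   # hypothetical table of biopsied lesions
features = ["cad_score", "water_delta", "lipid_delta", "protein_delta"]
X, y = df[features].values, df["malignant"].values

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y,
                                          random_state=0)
clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                                  random_state=0))
clf.fit(X_tr, y_tr)
print("held-out AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))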
ABSTRACT
In a physical neural system, learning rules must be local both in space and time. In order for learning to occur, non-local information must be communicated to the deep synapses through a communication channel, the deep learning channel. We identify several possible architectures for this learning channel (Bidirectional, Conjoined, Twin, Distinct) and six symmetry challenges: (1) symmetry of architectures; (2) symmetry of weights; (3) symmetry of neurons; (4) symmetry of derivatives; (5) symmetry of processing; and (6) symmetry of learning rules. Random backpropagation (RBP) addresses the second and third symmetries, and some of its variations, such as skipped RBP (SRBP), address the first and fourth symmetries. Here we address the last two desirable symmetries, showing through simulations that they can be achieved and that the learning channel is particularly robust to symmetry variations. Specifically, random backpropagation and its variations can be performed with the same non-linear neurons used in the main input-output forward channel, and the connections in the learning channel can be adapted using the same algorithm used in the forward channel, removing the need for any specialized hardware in the learning channel. Finally, we provide mathematical results in simple cases showing that the learning equations in the forward and backward channels converge to fixed points for almost any initial conditions. In symmetric architectures, if the weights in both channels are small at initialization, adaptation in both channels leads to weights that are essentially symmetric during and after learning. Biological connections are discussed.
Subject(s)
Machine Learning , Neural Networks, Computer
ABSTRACT
In a physical neural system, where storage and processing are intimately intertwined, the rules for adjusting the synaptic weights can only depend on variables that are available locally, such as the activity of the pre- and post-synaptic neurons, resulting in local learning rules. A systematic framework for studying the space of local learning rules is obtained by first specifying the nature of the local variables, and then the functional form that ties them together into each learning rule. Such a framework also enables the systematic discovery of new learning rules and the exploration of relationships between learning rules and group symmetries. We study polynomial local learning rules stratified by their degree and analyze their behavior and capabilities in both linear and non-linear units and networks. Stacking local learning rules in deep feedforward networks leads to deep local learning. While deep local learning can learn interesting representations, it cannot learn complex input-output functions, even when targets are available for the top layer. Learning complex input-output functions requires local deep learning, where target information is communicated to the deep layers through a backward learning channel. The nature of the communicated information about the targets and the structure of the learning channel partition the space of learning algorithms. For any learning algorithm, the capacity of the learning channel can be defined as the number of bits provided about the error gradient per weight, divided by the number of required operations per weight. We estimate the capacity associated with several learning algorithms and show that backpropagation outperforms them by simultaneously maximizing the information rate and minimizing the computational cost. This result is also shown to be true for recurrent networks, by unfolding them in time. The theory clarifies the concept of Hebbian learning, establishes the power and limitations of local learning rules, introduces the learning channel, which enables a formal analysis of the optimality of backpropagation, and explains the sparsity of the space of learning rules discovered so far.
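Two of the central notions here can be stated compactly. Writing O_j and O_i for the locally available pre- and post-synaptic activities and T for a target when it is locally available, polynomial local learning rules stratified by degree, and the capacity of a learning channel as defined verbally above, take the following form; the two degree-2 examples are standard illustrations, not an exhaustive list.

% Polynomial local learning rules in the locally available variables, stratified by degree:
\Delta w_{ij} = \eta\, F(O_i, O_j, T), \qquad
\text{e.g.}\quad \Delta w_{ij} = \eta\, O_i O_j \ (\text{degree 2, simple Hebb}), \qquad
\Delta w_{ij} = \eta\,(T - O_i)\, O_j \ (\text{degree 2, delta-rule style}).
% Capacity of a learning channel, restating the verbal definition in the abstract:
\mathcal{C} = \frac{\text{bits provided about the error gradient, per weight}}{\text{operations required, per weight}}.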