ABSTRACT
Background: Computational models on the basis of deep neural networks are increasingly used to analyze health care data. However, the efficacy of traditional computational models in radiology is a matter of debate. Purpose: To evaluate the accuracy and efficiency of a combined machine and deep learning approach for early breast cancer detection applied to a linked set of digital mammography images and electronic health records. Materials and Methods: In this retrospective study, 52 936 images were collected in 13 234 women who underwent at least one mammogram between 2013 and 2017, and who had health records for at least 1 year before undergoing mammography. The algorithm was trained on 9611 mammograms and health records of women to make two breast cancer predictions: to predict biopsy malignancy and to differentiate normal from abnormal screening examinations. The study estimated the association of features with outcomes by using t test and Fisher exact test. The model comparisons were performed with a 95% confidence interval (CI) or by using the DeLong test. Results: The resulting algorithm was validated in 1055 women and tested in 2548 women (mean age, 55 years ± 10 [standard deviation]). In the test set, the algorithm identified 34 of 71 (48%) false-negative findings on mammograms. For the malignancy prediction objective, the algorithm obtained an area under the receiver operating characteristic curve (AUC) of 0.91 (95% CI: 0.89, 0.93), with specificity of 77.3% (95% CI: 69.2%, 85.4%) at a sensitivity of 87%. When trained on clinical data alone, the model performed significantly better than the Gail model (AUC, 0.78 vs 0.54, respectively; P < .004). Conclusion: The algorithm, which combined machine-learning and deep-learning approaches, can be applied to assess breast cancer at a level comparable to radiologists and has the potential to substantially reduce missed diagnoses of breast cancer. © RSNA, 2019. Online supplemental material is available for this article.
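The two headline metrics in this abstract, AUC and specificity at a fixed sensitivity, can be illustrated with a minimal sketch. The scores below are invented toy values, not study data, and the function names (`auc`, `specificity_at_sensitivity`) are hypothetical helpers written for this illustration; the study's own evaluation code is not described in the abstract.

```python
def auc(scores_pos, scores_neg):
    """Mann-Whitney estimate of the AUC: the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative
    case (ties count as one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def specificity_at_sensitivity(scores_pos, scores_neg, target_sensitivity):
    """Best specificity over all thresholds whose sensitivity is at least
    the target. A case is called positive when its score >= threshold."""
    best = 0.0
    for threshold in sorted(set(scores_pos) | set(scores_neg)):
        sensitivity = sum(s >= threshold for s in scores_pos) / len(scores_pos)
        if sensitivity >= target_sensitivity:
            specificity = sum(s < threshold for s in scores_neg) / len(scores_neg)
            best = max(best, specificity)
    return best


# Toy model scores (assumed values for illustration only).
malignant_scores = [0.9, 0.8, 0.7, 0.4]
benign_scores = [0.6, 0.5, 0.3, 0.2, 0.1]

toy_auc = auc(malignant_scores, benign_scores)
toy_spec = specificity_at_sensitivity(malignant_scores, benign_scores, 1.0)
print(f"AUC = {toy_auc:.2f}, specificity at 100% sensitivity = {toy_spec:.2f}")
```

Fixing the sensitivity first and then reading off the specificity, as the abstract does at 87%, makes two models comparable at the same miss rate; the AUC summarizes performance over all thresholds instead.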
Subjects
Breast Neoplasms/diagnostic imaging; Deep Learning; Electronic Health Records; Mammography/methods; Radiographic Image Interpretation, Computer-Assisted/methods; Breast/diagnostic imaging; Female; Humans; Middle Aged; Predictive Value of Tests; Reproducibility of Results; Retrospective Studies; Sensitivity and Specificity
ABSTRACT
Intron positions upon the mRNA transcript are sometimes remarkably conserved even across distantly related eukaryotic species. This has made the comparison of intron-exon architectures across orthologous transcripts a very useful tool for studying various evolutionary processes. Moreover, the wide range of functions associated with introns may confer biological meaning to evolutionary changes in gene architectures. Yet, there is currently no database that offers such comparative information. Here, we present JuncDB (http://juncdb.carmelab.huji.ac.il/), an exon-exon junction database dedicated to the comparison of architectures between orthologous transcripts. It covers nearly 40,000 sets of orthologous transcripts spanning 88 eukaryotic species. JuncDB offers a user-friendly interface, access to detailed information, instructive graphical displays of the comparative data and easy ways to download data to a local computer. In addition, JuncDB allows the analysis to be carried out either on specific genes, or at a genome-wide level for any selected group of species.
Subjects
Databases, Nucleic Acid; Exons; Humans; Internet; Introns; RNA, Messenger/chemistry; Sequence Alignment
ABSTRACT
An appreciable fraction of introns is thought to have some function, but there is no obvious way to predict which specific intron is likely to be functional. We hypothesize that functional introns experience a different selection regime than non-functional ones and will therefore show distinct evolutionary histories. In particular, we expect functional introns to be more resistant to loss, and that this would be reflected in high conservation of their position with respect to the coding sequence. To test this hypothesis, we focused on introns whose function comes about from microRNAs and snoRNAs that are embedded within their sequence. We built a data set of orthologous genes across 28 eukaryotic species, reconstructed the evolutionary histories of their introns and compared functional introns with the rest of the introns. We found that, indeed, the position of microRNA- and snoRNA-bearing introns is significantly more conserved. In addition, we found that both families of RNA genes settled within introns early during metazoan evolution. We identified several easily computable intronic properties that can be used to detect functional introns in general, thereby suggesting a new strategy to pinpoint non-coding cellular functions.
Subjects
Introns; MicroRNAs/genetics; RNA, Small Nucleolar/genetics; Animals; Base Sequence; Computational Biology; Conserved Sequence; Evolution, Molecular; Genes; Humans
ABSTRACT
Epilepsy is a severe chronic neurological disease affecting 60 million people worldwide. Primary treatment is with anti-seizure medicines (ASMs), but many patients continue to experience seizures. We used retrospective insurance claims data on 280,587 patients with uncontrolled epilepsy (UE), defined as status epilepticus, need for a rescue medicine, or admission or emergency visit for an epilepsy code. We conducted a computational risk ratio analysis between pairs of ASMs using a causal inference method, in order to match 1034 clinical factors and simulate randomization. Data were extracted from the MarketScan insurance claims Research Database records from 2011 to 2015. The cohort consisted of individuals over 18 years old with a diagnosis of epilepsy who took one of eight ASMs and had more than a year of history prior to the filling of the drug prescription. Seven ASM exposures were analyzed: topiramate, phenytoin, levetiracetam, gabapentin, lamotrigine, valproate, and carbamazepine or oxcarbazepine (treated as the same exposure). We calculated the risk ratio of UE between pairs of ASMs after controlling for bias with inverse propensity weighting applied to 1034 factors, such as demographics, confounding illnesses, and non-epileptic conditions treated by ASMs. All ASMs exhibited a significant reduction in the prevalence of UE, but three drugs showed pair-wise differences compared to other ASMs. Topiramate was consistently associated with a lower risk of UE, with a mean risk ratio range of 0.68-0.93 (average 0.82, CI: 0.56-1.08). Phenytoin and levetiracetam were consistently associated with a higher risk of UE, with mean risk ratio ranges of 1.11 to 1.47 (average 1.13, CI 0.98-1.65) and 1.15 to 1.43 (average 1.2, CI 0.72-1.69), respectively.
Large-scale retrospective insurance claims data, combined with causal inference analysis, provide an opportunity to compare the effect of treatments in real-world data in populations 1,000-fold larger than those in typical randomized trials. Our causal analysis produced the clinically unexpected finding that topiramate was associated with a lower risk of UE, and that phenytoin and levetiracetam were associated with a higher risk of UE (compared to other studied drugs, not to baseline). However, we note that our data set for this study used only insurance claims events, which capture neither actual seizure frequencies nor a clear picture of side effects. Our results do not advocate for any change in practice but demonstrate that conclusions from large databases may differ from and supplement those of randomized trials and clinical practice and therefore may guide further investigation.
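To make the weighting step concrete, here is a minimal sketch of a pairwise risk ratio with and without inverse propensity weighting. It is an illustration under strong simplifying assumptions, not the study's method: the cohort is synthetic, there is a single binary confounder instead of 1034 factors, the propensity is taken as the observed treatment frequency within each confounder stratum rather than a fitted model, and the drug labels "A" and "B" and all function names are hypothetical.

```python
from collections import defaultdict


def build_cohort():
    """Synthetic rows of (drug, confounder, had_event). Within each
    confounder stratum both drugs have the same event risk, but drug B
    is prescribed far more often in the high-risk stratum."""
    cells = [("A", 0, 80, 8), ("B", 0, 20, 2), ("A", 1, 20, 8), ("B", 1, 80, 32)]
    rows = []
    for drug, confounder, n_patients, n_events in cells:
        rows += [(drug, confounder, i < n_events) for i in range(n_patients)]
    return rows


def risk_ratio(rows, weighted):
    """Risk of the event on drug A divided by the risk on drug B.
    With weighted=True each patient is weighted by 1 / P(drug | confounder),
    estimated from stratum frequencies (an assumption of this sketch)."""
    n_stratum = defaultdict(int)
    n_cell = defaultdict(int)
    for drug, confounder, _ in rows:
        n_stratum[confounder] += 1
        n_cell[(drug, confounder)] += 1

    total = defaultdict(float)
    events = defaultdict(float)
    for drug, confounder, had_event in rows:
        # Inverse propensity weight: stratum size / patients on this drug.
        weight = n_stratum[confounder] / n_cell[(drug, confounder)] if weighted else 1.0
        total[drug] += weight
        events[drug] += weight * had_event
    return (events["A"] / total["A"]) / (events["B"] / total["B"])


cohort = build_cohort()
crude_rr = risk_ratio(cohort, weighted=False)
ipw_rr = risk_ratio(cohort, weighted=True)
print(f"crude risk ratio = {crude_rr:.2f}, weighted risk ratio = {ipw_rr:.2f}")
```

In this toy cohort the crude ratio suggests drug A is protective only because drug B is given more often to high-risk patients; after weighting, the two arms have the same confounder mix and the apparent difference disappears.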
Subjects
Epilepsy; Insurance; Humans; Adolescent; Topiramate/therapeutic use; Levetiracetam/therapeutic use; Phenytoin/therapeutic use; Retrospective Studies; Epilepsy/drug therapy; Epilepsy/epidemiology; Epilepsy/chemically induced
ABSTRACT
Patients diagnosed with exudative neovascular age-related macular degeneration are commonly treated with anti-vascular endothelial growth factor (anti-VEGF) agents. However, response to treatment is heterogeneous, without a clinical explanation. Predicting suboptimal response at baseline will enable more efficient clinical trial designs for novel, future interventions and facilitate individualised therapies. In this multicentre study, we trained a multi-modal artificial intelligence (AI) system to identify suboptimal responders to the loading-phase of the anti-VEGF agent aflibercept from baseline characteristics. We collected clinical features and optical coherence tomography scans from 1720 eyes of 1612 patients between 2019 and 2021. We evaluated our AI system as a patient selection method by emulating hypothetical clinical trials of different sizes based on our test set. Our method detected up to 57.6% more suboptimal responders than random selection, and up to 24.2% more than any alternative selection criteria tested. Applying this method to the entry process of candidates into randomised controlled trials may contribute to the success of such trials and further inform personalised care.
ABSTRACT
Breast cancer (BC) risk models based on electronic health records (EHR) can assist physicians in estimating the probability that an individual with certain risk factors will develop BC in the future. In this retrospective study, we used clinical data combined with machine learning tools to assess the utility of a personalized BC risk model on 13,786 Israeli and 1,695 American women who underwent screening mammography in the years 2012-2018 and 2008-2018, respectively. Clinical features were extracted from EHR, personal questionnaires, and past radiologists' reports. Using a set of 1,547 features, the predictive ability for BC within 12 months was measured in both datasets and in sub-cohorts of interest. Our results highlight the improved performance of our model over previously established BC risk models, its ultimate potential for risk-based screening policies on first-time patients, and novel clinically relevant risk factors that can compensate for the absence of imaging history information.
Subjects
Breast Neoplasms; Humans; Female; Mammography; Retrospective Studies; Early Detection of Cancer; Breast; Risk Assessment
ABSTRACT
Many human introns carry out a function, in the sense that they are critical to maintain normal cellular activity. Their identification is fundamental to understanding cellular processes and disease. However, being noncoding elements, such functional introns are poorly predicted by traditional approaches based on sequence and structure conservation. Here, we generated a dataset of human functional introns that carry out different types of functions. We showed that functional introns share common characteristics, such as higher positional conservation along the coding sequence and reduced loss rates, regardless of their specific function. A unique property of the data is that an intron that is not known to be functional is not necessarily non-functional. We developed a probabilistic framework that explicitly accounts for this unique property, and predicts which specific human introns are functional. We show that we successfully predict function even when the algorithm is trained on introns with a different type of function. This ability has many implications in studying regulatory networks, gene regulation, the effect of mutations outside exons on human disease, and on our general understanding of intron evolution and their functional exaptation in mammals.
Subjects
Conserved Sequence/genetics; Introns/genetics; Animals; Base Sequence; Databases, Genetic; Discriminant Analysis; Genome; Humans; MicroRNAs/genetics; MicroRNAs/metabolism; Models, Statistical; Open Reading Frames; RNA Processing, Post-Transcriptional; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA, Small Nucleolar/genetics; RNA, Small Nucleolar/metabolism
ABSTRACT
Mathematical models of epidemics are a key tool for predicting the future course of disease in a population and analyzing the effects of possible intervention policies. Typically, models that produce deterministic outcomes are applied for making predictions and reaching decisions. Stochastic modeling methods present an alternative. Here, we demonstrate by example why it is important that stochastic modeling be used in population health decision support systems.
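The contrast between the two modeling styles can be sketched with a textbook SIR epidemic. This is an assumed illustration, not the example used in the paper: the population size, the rates `beta` and `gamma`, and both function names are hypothetical choices. The deterministic version integrates the SIR equations with Euler steps; the stochastic version simulates individual infection and recovery events of the corresponding Markov chain.

```python
import random


def deterministic_final_size(n, beta, gamma, i0=1, dt=0.1, steps=5000):
    """Euler integration of the SIR equations; returns the number of
    people ever infected (including the initial cases)."""
    s, i = float(n - i0), float(i0)
    for _ in range(steps):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
    return n - s


def stochastic_final_size(n, beta, gamma, rng, i0=1):
    """Event-by-event simulation of the stochastic SIR model. Only the
    order of events matters for the final size, so event times are not
    tracked: each event is an infection or a recovery, chosen with
    probability proportional to its rate."""
    s, i = n - i0, i0
    while i > 0:
        infection_rate = beta * s * i / n
        recovery_rate = gamma * i
        if rng.random() < infection_rate / (infection_rate + recovery_rate):
            s -= 1
            i += 1
        else:
            i -= 1
    return n - s


# Assumed parameters: 200 people, one initial case, R0 = beta / gamma = 2.
rng = random.Random(0)
deterministic_size = deterministic_final_size(200, beta=0.4, gamma=0.2)
stochastic_sizes = [stochastic_final_size(200, 0.4, 0.2, rng) for _ in range(200)]
print(f"deterministic final size: {deterministic_size:.0f}")
print(f"stochastic final sizes range from {min(stochastic_sizes)} to {max(stochastic_sizes)}")
```

The deterministic model returns a single outbreak size, whereas repeated stochastic runs with the same parameters include both outbreaks that die out after a handful of cases and large epidemics. A decision support system that reports only the deterministic value hides that spread of possible outcomes.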
Subjects
Disease Outbreaks/statistics & numerical data; Epidemiologic Methods; Models, Statistical; Decision Support Techniques; Stochastic Processes
ABSTRACT
Epidemiological models are key tools in assessing intervention policies for population health management. Statistical models, fitted with survey or health system data, can be combined with lab and field studies to provide reliable predictions of future population-level disease dynamics distributions and the effects of interventions. All too often, however, the end result of epidemiological modeling and cost-effectiveness studies is in the form of a report or journal paper. These are inherently limited in their coverage of locations, policy options, and derived outcome measures. Here, we describe a tool to support population health policy planning. The tool allows users to explore simulations of various policies and to view and compare interventions spanning multiple variables, time points, and locations. The design's modular architecture and data representation separate the modeling methods, the outcome-measure calculations, and the visualizations, making each component easily replaceable. These advantages make it extremely versatile and suitable for multiple uses.
Subjects
Health Policy; Models, Statistical; Public Policy; Cost-Benefit Analysis; Humans; Population Health
ABSTRACT
RNA-seq is becoming a preferred tool for genomics studies of model and non-model organisms. However, DNA-based analysis of organisms lacking sequenced genomes cannot rely on RNA-seq data alone to isolate most genes of interest, as DNA codes both exons and introns. With this in mind, we designed a novel tool, LEMONS, that exploits the evolutionary conservation of both exon/intron boundary positions and splice junction recognition signals to produce high-throughput splice-junction predictions in the absence of a reference genome. When tested on mRNA data from multiple annotated vertebrates, LEMONS accurately identified 87% (average) of the splice-junctions. LEMONS was then applied to our updated Mediterranean chameleon transcriptome, which lacks a reference genome, and predicted a total of 90,820 exon-exon junctions. We experimentally verified these splice-junction predictions by amplifying and sequencing twenty randomly selected genes from chameleon DNA templates. Exons and introns were detected in 19 of 20 of the positions predicted by LEMONS. To the best of our knowledge, LEMONS is currently the only experimentally verified tool that can accurately predict splice-junctions in organisms that lack a reference genome.
Subjects
Computational Biology/methods; Exons; Introns; RNA Splice Sites; RNA Splicing; Software; Transcriptome; Genomics/methods; Nucleotide Motifs; RNA, Messenger/genetics; Reproducibility of Results
ABSTRACT
The intron-exon architecture of many eukaryotic genes raises the intriguing question of whether this unique organization serves any function or is simply a result of the spread of functionless introns in eukaryotic genomes. In this review, we show that introns in contemporary species fulfill a broad spectrum of functions, and are involved in virtually every step of mRNA processing. We propose that this great diversity of intronic functions supports the notion that introns were indeed selfish elements in early eukaryotes, but then independently gained numerous functions in different eukaryotic lineages. We suggest a novel criterion of evolutionary conservation, dubbed intron positional conservation, which can identify functional introns.