Results 1 - 20 of 23
1.
Mol Psychiatry ; 29(2): 387-401, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38177352

ABSTRACT

Applications of machine learning in the biomedical sciences are growing rapidly. This growth has been spurred by diverse cross-institutional and interdisciplinary collaborations, public availability of large datasets, an increase in the accessibility of analytic routines, and the availability of powerful computing resources. With this increased access and exposure to machine learning comes a responsibility for education and a deeper understanding of its bases and bounds, borne equally by data scientists seeking to ply their analytic wares in medical research and by biomedical scientists seeking to harness such methods to glean knowledge from data. This article provides an accessible and critical review of machine learning for a biomedically informed audience, as well as its applications in psychiatry. The review covers definitions and expositions of commonly used machine learning methods, and historical trends of their use in psychiatry. We also provide a set of standards, namely Guidelines for REporting Machine Learning Investigations in Neuropsychiatry (GREMLIN), for designing and reporting studies that use machine learning as a primary data-analysis approach. Lastly, we propose the establishment of the Machine Learning in Psychiatry (MLPsych) Consortium, enumerate its objectives, and identify areas of opportunity for future applications of machine learning in biological psychiatry. This review serves as a cautiously optimistic primer on machine learning for those on the precipice as they prepare to dive into the field, either as methodological practitioners or well-informed consumers.


Subject(s)
Biological Psychiatry; Machine Learning; Humans; Biological Psychiatry/methods; Psychiatry/methods; Biomedical Research/methods
2.
Proc Natl Acad Sci U S A ; 118(34), 2021 08 24.
Article in English | MEDLINE | ID: mdl-34413191

ABSTRACT

Binary classification is one of the central problems in machine-learning research and, as such, investigations of its general statistical properties are of interest. We studied the ranking statistics of items in binary classification problems and observed that there is a formal and surprising relationship between the probability of a sample belonging to one of the two classes and the Fermi-Dirac distribution determining the probability that a fermion occupies a given single-particle quantum state in a physical system of noninteracting fermions. Using this equivalence, it is possible to compute a calibrated probabilistic output for binary classifiers. We show that the area under the receiver operating characteristics curve (AUC) in a classification problem is related to the temperature of an equivalent physical system. In a similar manner, the optimal decision threshold between the two classes is associated with the chemical potential of an equivalent physical system. Using our framework, we also derive a closed-form expression to calculate the variance for the AUC of a classifier. Finally, we introduce FiDEL (Fermi-Dirac-based ensemble learning), an ensemble learning algorithm that uses the calibrated nature of the classifier's output probability to combine possibly very different classifiers.
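
As a rough illustration of the analogy described above, the Fermi-Dirac occupation function can be evaluated with a sample's rank playing the role of the energy level. The mapping of rank to energy and the names `rank`, `mu`, and `temperature` are assumptions made for this sketch; they are not taken from the FiDEL implementation.

```python
import math


def fermi_dirac(rank, mu, temperature):
    """Fermi-Dirac occupation: 1 / (exp((rank - mu) / temperature) + 1).

    Here `rank` stands in for the energy level (assumption of this sketch),
    `mu` for the chemical potential, and `temperature` for k_B * T.
    """
    return 1.0 / (math.exp((rank - mu) / temperature) + 1.0)


# Hypothetical setting: samples ranked from 1 (most confident), mu = 30.
mu, temperature = 30.0, 5.0
for rank in (1, 30, 60):
    print(rank, fermi_dirac(rank, mu, temperature))
```

At `rank == mu` the function returns exactly 0.5, which is one way to see why the chemical potential behaves like a decision threshold. A lower `temperature` makes the transition around `mu` sharper, in line with the link the abstract draws between temperature and AUC.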

3.
PLoS Genet ; 17(6): e1009589, 2021 06.
Article in English | MEDLINE | ID: mdl-34166362

ABSTRACT

Cancer testis antigens (CTAs) are an extensive gene family with a unique expression pattern restricted to germ cells, but aberrantly reactivated in cancer tissues. Studies indicate that the expression (or re-expression) of CTAs within the MAGE-A family is common in hepatocellular carcinoma (HCC). However, no systematic characterization has yet been reported. The aim of this study is to comprehensively profile CTA de-regulation in HCC and experimentally evaluate the role of MAGEA3 as a driver of HCC progression. The transcriptomic analysis of 44 multi-regionally sampled HCCs from 12 patients identified high intra-tumor heterogeneity of CTAs. In addition, a subset of CTAs was significantly overexpressed in histologically poorly differentiated regions. Further analysis of CTAs in larger patient cohorts revealed high CTA expression related to worse overall survival and several other markers of poor prognosis. Functional analysis of MAGEA3 was performed in human HCC cell lines by gene silencing and in a genetic mouse model by overexpression of MAGEA3 in the liver. Knockdown of MAGEA3 decreased cell proliferation and colony formation and increased apoptosis. MAGEA3 overexpression was associated with more aggressive tumors in vivo. In conclusion, MAGEA3 enhances tumor progression and should be considered as a novel therapeutic target in HCC.


Subject(s)
Antigens, Neoplasm/genetics; Antigens, Neoplasm/immunology; Carcinoma, Hepatocellular/pathology; Liver Neoplasms/pathology; Neoplasm Proteins/genetics; Testis/immunology; Transcriptome; Apoptosis/genetics; Carcinoma, Hepatocellular/genetics; Carcinoma, Hepatocellular/immunology; Cell Proliferation/genetics; Disease Progression; Gene Expression Profiling; Humans; Liver Neoplasms/genetics; Liver Neoplasms/immunology; Male; Prognosis; Up-Regulation
4.
Bioinformatics ; 37(14): 2070-2072, 2021 08 04.
Article in English | MEDLINE | ID: mdl-33241320

ABSTRACT

SUMMARY: The advent of high-throughput technologies has provided researchers with measurements of thousands of molecular entities and enabled the investigation of the internal regulatory apparatus of the cell. However, network inference from high-throughput data is far from being a solved problem. While a plethora of different inference methods have been proposed, they often lead to non-overlapping predictions, and many of them lack user-friendly implementations to enable their broad utilization. Here, we present Consensus Interaction Network Inference Service (COSIFER), a package and a companion web-based platform to infer molecular networks from expression data using state-of-the-art consensus approaches. COSIFER includes a selection of state-of-the-art methodologies for network inference and different consensus strategies to integrate the predictions of individual methods and generate robust networks. AVAILABILITY AND IMPLEMENTATION: COSIFER Python source code is available at https://github.com/PhosphorylatedRabbits/cosifer. The web service is accessible at https://ibm.biz/cosifer-aas. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
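
To make the idea of a consensus strategy concrete, the sketch below averages the per-method ranks of candidate edges, which is one simple way to integrate predictions from several inference methods. It does not use the COSIFER API; the function name `mean_rank_consensus`, the edge keys, and the scores are all hypothetical.

```python
def mean_rank_consensus(method_scores):
    """Order edges by their average rank across inference methods.

    method_scores: list of dicts, one per method, each mapping an edge
    (gene_a, gene_b) to a score where higher means stronger support.
    All dicts are assumed to contain the same edges.
    """
    edges = list(method_scores[0])
    total_rank = {edge: 0.0 for edge in edges}
    for scores in method_scores:
        # Rank 1 goes to the edge this method supports most strongly.
        ordered = sorted(edges, key=lambda e: scores[e], reverse=True)
        for rank, edge in enumerate(ordered, start=1):
            total_rank[edge] += rank
    n_methods = len(method_scores)
    mean_rank = {edge: total_rank[edge] / n_methods for edge in edges}
    # Smallest mean rank first: strongest consensus support.
    return sorted(edges, key=lambda e: mean_rank[e])


# Hypothetical edge scores from three inference methods.
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 0.7, ("g1", "g3"): 0.1, ("g2", "g3"): 0.8}
method_c = {("g1", "g2"): 0.6, ("g1", "g3"): 0.3, ("g2", "g3"): 0.4}
consensus = mean_rank_consensus([method_a, method_b, method_c])
print(consensus)
```

Working on ranks rather than raw scores sidesteps the fact that different methods report scores on different scales.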


Subject(s)
Software; Consensus
5.
Gut ; 2021 Jul 28.
Article in English | MEDLINE | ID: mdl-34321221

ABSTRACT

OBJECTIVE: Surveillance tools for the early detection of cancer, including hepatocellular carcinoma (HCC), are suboptimal, and biomarkers are urgently needed. Extracellular vesicles (EVs) have gained increasing scientific interest due to their involvement in tumour initiation and metastasis; however, most extracellular RNA (exRNA) blood-based biomarker studies are limited to annotated genomic regions. DESIGN: EVs were isolated with differential ultracentrifugation and integrated nanoscale deterministic lateral displacement arrays (nanoDLD) and quality assessed by electron microscopy, immunoblotting, nanoparticle tracking and deconvolution analysis. Genome-wide sequencing of the largely unexplored small exRNA landscape, including unannotated transcripts, identified and reproducibly quantified small RNA clusters (smRCs). Their key genomic features were delineated across biospecimens and EV isolation techniques in prostate cancer and HCC. Three independent exRNA cancer datasets with a total of 479 samples from 375 patients, including longitudinal samples, were used for this study. RESULTS: ExRNA smRCs were dominated by uncharacterised, unannotated small RNA with a consensus sequence of 20 nt. An unannotated 3-smRC signature was significantly overexpressed in plasma exRNA of patients with HCC (p<0.01, n=157). An independent validation in a phase 2 biomarker case-control study revealed 86% sensitivity and 91% specificity for the detection of early HCC from controls at risk (n=209) (area under the receiver operating characteristic curve (AUC): 0.87). The 3-smRC signature was independent of alpha-fetoprotein (p<0.0001) and a composite model yielded an increased AUC of 0.93. CONCLUSION: These findings directly lead to the prospect of a minimally invasive, blood-only, operator-independent clinical tool for HCC surveillance, thus highlighting the potential of unannotated smRCs for biomarker research in cancer.

6.
J Med Syst ; 45(6): 64, 2021 May 04.
Article in English | MEDLINE | ID: mdl-33948743

ABSTRACT

Ongoing research efforts have been examining how to utilize artificial intelligence technology to help healthcare consumers make sense of their clinical data, such as diagnostic radiology reports. How to promote the acceptance of such novel technology is a heated research topic. Recent studies highlight the importance of providing local explanations about AI prediction and model performance to help users determine whether to trust AI's predictions. Despite some efforts, limited empirical research has been conducted to quantitatively measure how AI explanations impact healthcare consumers' perceptions of using patient-facing, AI-powered healthcare systems. The aim of this study is to evaluate the effects of different AI explanations on people's perceptions of an AI-powered healthcare system. In this work, we designed and deployed a large-scale experiment (N = 3,423) on Amazon Mechanical Turk (MTurk) to evaluate the effects of AI explanations on people's perceptions in the context of comprehending radiology reports. We created four groups based on two factors, the extent of explanations for the prediction (High vs. Low Transparency) and the model performance (Good vs. Weak AI Model), and randomly assigned participants to one of the four conditions. Participants were instructed to classify a radiology report as describing a normal or abnormal finding, followed by completing a post-study survey to indicate their perceptions of the AI tool. We found that revealing model performance information can promote people's trust and perceived usefulness of system outputs, while providing local explanations for the rationale of a prediction can promote understandability but not necessarily trust. We also found that when model performance is low, the more information the AI system discloses, the less people would trust the system. Lastly, whether a human agrees with AI predictions or not and whether the AI prediction is correct or not could also influence the effect of AI explanations. We conclude this paper by discussing implications for designing AI systems for healthcare consumers to interpret diagnostic reports.


Subject(s)
Artificial Intelligence; Radiology; Delivery of Health Care; Humans; Perception; Radiography
7.
BMC Genomics ; 18(Suppl 3): 233, 2017 03 27.
Article in English | MEDLINE | ID: mdl-28361685

ABSTRACT

BACKGROUND: Metastasis via pelvic and/or para-aortic lymph nodes is a major risk factor for endometrial cancer. Lymph-node resection ameliorates risk but is associated with significant co-morbidities. Incidence in patients with stage I disease is 4-22% but no mechanism exists to accurately predict it. Therefore, national guidelines for primary staging surgery include pelvic and para-aortic lymph node dissection for all patients whose tumor exceeds 2cm in diameter. We sought to identify a robust molecular signature that can accurately classify risk of lymph node metastasis in endometrial cancer patients. 86 tumors matched for age and race, and evenly distributed between lymph node-positive and lymph node-negative cases, were selected as a training cohort. Genomic micro-RNA expression was profiled for each sample to serve as the predictive feature matrix. An independent set of 28 tumor samples was collected and similarly characterized to serve as a test cohort. RESULTS: A feature selection algorithm was designed for applications where the number of samples is far smaller than the number of measured features per sample. A predictive miRNA expression signature was developed using this algorithm, which was then used to predict the metastatic status of the independent test cohort. A weighted classifier, using 18 micro-RNAs, achieved 100% accuracy on the training cohort. When applied to the testing cohort, the classifier correctly predicted 90% of node-positive cases, and 80% of node-negative cases (FDR = 6.25%). CONCLUSION: Results indicate that the evaluation of the quantitative sparse-feature classifier proposed here in clinical trials may lead to significant improvement in the prediction of lymphatic metastases in endometrial cancer patients.
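
The quantities reported above (fraction of positive and negative cases correctly predicted, and the false discovery rate) all derive from a confusion matrix. The sketch below shows the standard definitions on toy labels; the data are invented for illustration and are unrelated to the study's cohorts.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity and false discovery rate for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return {
        "sensitivity": tp / (tp + fn),  # positives correctly predicted
        "specificity": tn / (tn + fp),  # negatives correctly predicted
        "fdr": fp / (tp + fp),          # predicted positives that are wrong
    }


# Toy example: 4 positive and 4 negative cases (hypothetical).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = confusion_metrics(y_true, y_pred)
print(metrics)
```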


Subject(s)
Endometrial Neoplasms/diagnosis; Endometrial Neoplasms/genetics; Genomics/methods; Algorithms; Computational Biology/methods; Female; Gene Expression Profiling/methods; Humans; MicroRNAs/genetics; Neoplasm Metastasis; Neoplasm Staging; Prognosis
8.
Bioinform Adv ; 4(1): vbae093, 2024.
Article in English | MEDLINE | ID: mdl-39011276

ABSTRACT

Motivation: The integration of vast, complex biological data with computational models offers profound insights and predictive accuracy. Yet, such models face challenges: poor generalization and limited labeled data. Results: To overcome these difficulties in binary classification tasks, we developed the Method for Optimal Classification by Aggregation (MOCA) algorithm, which addresses the problem of generalization by virtue of being an ensemble learning method and can be used in problems with limited or no labeled data. We developed both an unsupervised (uMOCA) and a supervised (sMOCA) variant of MOCA. For uMOCA, we show how to infer the MOCA weights in an unsupervised way, which are optimal under the assumption of class-conditioned independent classifier predictions. When it is possible to use labels, sMOCA uses empirically computed MOCA weights. We demonstrate the performance of uMOCA and sMOCA using simulated data as well as actual data previously used in Dialogue on Reverse Engineering and Methods (DREAM) challenges. We also propose an application of sMOCA for transfer learning where we use pre-trained computational models from a domain where labeled data are abundant and apply them to a different domain with less abundant labeled data. Availability and implementation: GitHub repository, https://github.com/robert-vogel/moca.

9.
iScience ; 27(3): 108905, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38390492

ABSTRACT

Characterizing the effect of combination therapies is vital for treating diseases like cancer. We introduce correlated drug action (CDA), a baseline model for the study of drug combinations in both cell cultures and patient populations, which assumes that the efficacy of drugs in a combination may be correlated. We apply temporal CDA (tCDA) to clinical trial data, and demonstrate the utility of this approach in identifying possible synergistic combinations and others that can be explained in terms of monotherapies. Using MCF7 cell line data, we assess combinations with dose CDA (dCDA), a model that generalizes other proposed models (e.g., Bliss response-additivity, the dose equivalence principle), and introduce Excess over CDA (EOCDA), a new metric for identifying possible synergistic combinations in cell culture.
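
The abstract does not spell out how EOCDA is computed, so the sketch below only illustrates the general pattern of an "excess over baseline" metric: compare the observed combination effect with what a reference model predicts from the monotherapies. The reference used here is Bliss independence, a classical model for drug combinations; treating it as the baseline, and whether it matches what the abstract calls "Bliss response-additivity", are assumptions of this sketch, and all effect values are invented.

```python
def bliss_expected(effect_a, effect_b):
    """Expected fractional effect (0..1) if two drugs act independently."""
    return effect_a + effect_b - effect_a * effect_b


def excess_over_baseline(observed, effect_a, effect_b):
    """Observed combination effect minus the baseline expectation.

    Positive values suggest possible synergy; values near zero suggest the
    combination is explained by the monotherapies under this baseline.
    """
    return observed - bliss_expected(effect_a, effect_b)


# Hypothetical fractional inhibition values for two drugs and their combination.
expected = bliss_expected(0.4, 0.5)
excess = excess_over_baseline(0.85, 0.4, 0.5)
print(expected, excess)
```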

10.
iScience ; 27(1): 108770, 2024 Jan 19.
Article in English | MEDLINE | ID: mdl-38261919

ABSTRACT

The Centers for Disease Control and Prevention promoted the Test-to-Stay (TTS) program to facilitate in-person instruction in K-12 schools during COVID-19. This program delineates guidelines for schools to regularly test students and staff to minimize risks of infection transmission. TTS enrollment can be implemented via two different consent models: opt-in, in which students do not test regularly by default, and the opposite, opt-out model. We study the impacts of the two enrollment approaches on testing and positivity rates with data from 259 schools in Illinois. Our results indicate that after controlling for other covariates, schools following the opt-out model are associated with an 84% higher testing rate and a 30% lower positivity rate. If all schools adopted the opt-out model, 20% of the total lost school days could have been saved. The lower positivity rate among the opt-out group is largely explained by the higher testing rate in these schools, a manifestation of status quo bias.

11.
JAMA Netw Open ; 5(11): e2242343, 2022 11 01.
Article in English | MEDLINE | ID: mdl-36409497

ABSTRACT

Importance: With a shortfall in fellowship-trained breast radiologists, mammography screening programs are looking toward artificial intelligence (AI) to increase efficiency and diagnostic accuracy. External validation studies provide an initial assessment of how promising AI algorithms perform in different practice settings. Objective: To externally validate an ensemble deep-learning model using data from a high-volume, distributed screening program of an academic health system with a diverse patient population. Design, Setting, and Participants: In this diagnostic study, an ensemble learning method, which reweights outputs of the 11 highest-performing individual AI models from the Digital Mammography Dialogue on Reverse Engineering Assessment and Methods (DREAM) Mammography Challenge, was used to predict the cancer status of an individual using a standard set of screening mammography images. This study was conducted using retrospective patient data collected between 2010 and 2020 from women aged 40 years and older who underwent a routine breast screening examination and participated in the Athena Breast Health Network at the University of California, Los Angeles (UCLA). Main Outcomes and Measures: Performance of the challenge ensemble method (CEM) and the CEM combined with radiologist assessment (CEM+R) were compared with diagnosed ductal carcinoma in situ and invasive cancers within a year of the screening examination using performance metrics, such as sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC). Results: Evaluated on 37 317 examinations from 26 817 women (mean [SD] age, 58.4 [11.5] years), individual model AUROC estimates ranged from 0.77 (95% CI, 0.75-0.79) to 0.83 (95% CI, 0.81-0.85). The CEM model achieved an AUROC of 0.85 (95% CI, 0.84-0.87) in the UCLA cohort, lower than the performance achieved in the Kaiser Permanente Washington (AUROC, 0.90) and Karolinska Institute (AUROC, 0.92) cohorts. 
The CEM+R model achieved a sensitivity (0.813 [95% CI, 0.781-0.843] vs 0.826 [95% CI, 0.795-0.856]; P = .20) and specificity (0.925 [95% CI, 0.916-0.934] vs 0.930 [95% CI, 0.929-0.932]; P = .18) similar to the radiologist performance. The CEM+R model had significantly lower sensitivity (0.596 [95% CI, 0.466-0.717] vs 0.850 [95% CI, 0.766-0.923]; P < .001) and specificity (0.803 [95% CI, 0.734-0.861] vs 0.945 [95% CI, 0.936-0.954]; P < .001) than the radiologist in women with a prior history of breast cancer and Hispanic women (0.894 [95% CI, 0.873-0.910] vs 0.926 [95% CI, 0.919-0.933]; P = .004). Conclusions and Relevance: This study found that the high performance of an ensemble deep-learning model for automated screening mammography interpretation did not generalize to a more diverse screening cohort, suggesting that the model experienced underspecification. This study suggests the need for model transparency and fine-tuning of AI models for specific target populations prior to their clinical adoption.
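
The AUROC values quoted above can be read as the probability that a randomly chosen cancer-positive examination receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition; the labels and scores are invented and the function is not part of the study's code.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical model scores for six examinations (1 = cancer, 0 = no cancer).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.4, 0.2]
value = auroc(labels, scores)
print(value)
```

The pairwise loop is quadratic in the number of examinations, so it suits small illustrations; rank-based formulas give the same quantity more efficiently for large cohorts.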


Subject(s)
Breast Neoplasms; Mammography; Humans; Female; Adult; Middle Aged; Artificial Intelligence; Breast Neoplasms/diagnostic imaging; Retrospective Studies; Early Detection of Cancer
12.
EBioMedicine ; 66: 103275, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33745882

ABSTRACT

BACKGROUND: Assistive automatic seizure detection can empower human annotators to shorten patient monitoring data review times. We present a proof-of-concept for a seizure detection system that is sensitive, automated, patient-specific, and tunable to maximize sensitivity while minimizing human annotation times. The system uses custom data preparation methods, deep learning analytics and electroencephalography (EEG) data. METHODS: Scalp EEG data of 365 patients containing 171,745 s ictal and 2,185,864 s interictal samples obtained from clinical monitoring systems were analysed as part of a crowdsourced artificial intelligence (AI) challenge. Participants were tasked with developing an ictal/interictal classifier with high sensitivity and low false alarm rates. We built a challenge platform that prevented participants from downloading or directly accessing the data while allowing crowdsourced model development. FINDINGS: The automatic detection system achieved tunable sensitivities between 75.00% and 91.60%, allowing a reduction in the amount of raw EEG data to be reviewed by a human annotator by factors of 142x and 22x, respectively. The algorithm enables instantaneous reviewer-managed optimization of the balance between sensitivity and the amount of raw EEG data to be reviewed. INTERPRETATION: This study demonstrates the utility of deep learning for patient-specific seizure detection in EEG data. Furthermore, deep learning in combination with a human reviewer can provide the basis for an assistive data labelling system lowering the time of manual review while maintaining human expert annotation performance. FUNDING: IBM employed all IBM Research authors. Temple University employed all Temple University authors. The Icahn School of Medicine at Mount Sinai employed Eren Ahsen. The corresponding authors Stefan Harrer and Gustavo Stolovitzky declare that they had full access to all the data in the study and that they had final responsibility for the decision to submit for publication.
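
The trade-off reported in the findings, higher sensitivity at the cost of more EEG data left for human review, can be sketched with a single score threshold. The definition of the reduction factor as total segments divided by flagged segments is an assumption of this sketch, as are the function name and the toy scores; none of it comes from the challenge code.

```python
def review_tradeoff(is_ictal, scores, threshold):
    """Return (sensitivity, reduction factor) for a given detection threshold.

    Segments scoring at or above the threshold are flagged for human review.
    Reduction factor = all segments / flagged segments (assumed definition).
    """
    flagged = [s >= threshold for s in scores]
    n_flagged = sum(flagged)
    true_pos = sum(1 for f, y in zip(flagged, is_ictal) if f and y == 1)
    sensitivity = true_pos / sum(is_ictal)
    reduction = len(scores) / n_flagged if n_flagged else float("inf")
    return sensitivity, reduction


# Ten hypothetical EEG segments, two of them ictal.
is_ictal = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1, 0.2, 0.1, 0.3, 0.05, 0.2, 0.1]
strict = review_tradeoff(is_ictal, scores, threshold=0.5)
lenient = review_tradeoff(is_ictal, scores, threshold=0.3)
print(strict, lenient)
```

Lowering the threshold raises sensitivity but shrinks the reduction factor, which mirrors the direction of the two operating points quoted in the abstract.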


Subject(s)
Artificial Intelligence; Brain/physiopathology; Electroencephalography; Neurologists; Seizures/diagnosis; Algorithms; Data Analysis; Deep Learning; Electroencephalography/methods; Electroencephalography/standards; Epilepsy/diagnosis; Humans; Reproducibility of Results
13.
Cell Syst ; 12(8): 827-838.e5, 2021 08 18.
Article in English | MEDLINE | ID: mdl-34146471

ABSTRACT

The accurate identification and quantitation of RNA isoforms present in the cancer transcriptome is key for analyses ranging from the inference of the impacts of somatic variants to pathway analysis to biomarker development and subtype discovery. The ICGC-TCGA DREAM Somatic Mutation Calling in RNA (SMC-RNA) challenge was a crowd-sourced effort to benchmark methods for RNA isoform quantification and fusion detection from bulk cancer RNA sequencing (RNA-seq) data. It concluded in 2018 with a comparison of 77 fusion detection entries and 65 isoform quantification entries on 51 synthetic tumors and 32 cell lines with spiked-in fusion constructs. We report the entries used to build this benchmark, the leaderboard results, and the experimental features associated with the accurate prediction of RNA species. This challenge required submissions to be in the form of containerized workflows, meaning each of the entries described is easily reusable through CWL and Docker containers at https://github.com/SMC-RNA-challenge. A record of this paper's transparent peer review process is included in the supplemental information.


Subject(s)
Neoplasms; Humans; Neoplasms/genetics; Protein Isoforms/genetics; RNA/genetics; RNA-Seq; Sequence Analysis, RNA
14.
J Comput Biol ; 27(9): 1337-1340, 2020 09.
Article in English | MEDLINE | ID: mdl-31905016

ABSTRACT

The increasing availability of complex data in biology and medicine has promoted the use of machine learning in classification tasks to address important problems in translational and fundamental science. Two important obstacles, however, may limit the unraveling of the full potential of machine learning in these fields: the lack of generalization of the resulting models and the limited number of labeled data sets in some applications. To address these important problems, we developed an unsupervised ensemble algorithm called strategy for unsupervised multiple method aggregation (SUMMA). By virtue of being an ensemble method, SUMMA is more robust to generalization than the predictions it combines. By virtue of being unsupervised, SUMMA does not require labeled data. SUMMA receives as input predictions from a diversity of models and estimates their classification performance even when labeled data are unavailable. It then uses these performance estimates to combine these different predictions into an ensemble model. SUMMA can be applied to a variety of binary classification problems in bioinformatics including but not limited to gene network inference, cancer diagnostics, drug response prediction, somatic mutation, and differential expression calling. In this application note, we introduce the R/PY-SUMMA packages, available in R or Python, that implement the SUMMA algorithm.
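
One way performance can be estimated without labels, assuming the base classifiers err independently given the true class, is from the covariance of their rank predictions: the leading eigenvector of that covariance matrix then tracks relative performance. The sketch below is a simplified spectral illustration of this idea and not the R/PY-SUMMA API; every function name and score is hypothetical, ties in scores are not handled, and the diagonal of the covariance matrix is used as-is for brevity.

```python
def to_ranks(scores):
    """Rank samples so that the highest score gets rank 1 (no tie handling)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def covariance_matrix(rows):
    """Sample covariance between rows (one row of ranks per classifier)."""
    n = len(rows[0])
    centered = [[x - sum(row) / n for x in row] for row in rows]
    return [[sum(a * b for a, b in zip(ri, rj)) / (n - 1) for rj in centered]
            for ri in centered]


def leading_eigenvector(matrix, iterations=200):
    """Power iteration started from the all-ones vector."""
    vec = [1.0] * len(matrix)
    for _ in range(iterations):
        nxt = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    return vec


def spectral_ensemble(score_lists):
    """Weight each classifier by its eigenvector entry and combine ranks."""
    rank_rows = [to_ranks(s) for s in score_lists]
    weights = leading_eigenvector(covariance_matrix(rank_rows))
    n = len(rank_rows[0])
    mid = (n + 1) / 2  # mean rank; (mid - rank) is large for likely positives
    combined = [sum(w * (mid - row[k]) for w, row in zip(weights, rank_rows))
                for k in range(n)]
    return weights, combined


# Hypothetical scores from three classifiers on six unlabeled samples.
clf_1 = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
clf_2 = [0.8, 0.9, 0.6, 0.4, 0.1, 0.2]
clf_3 = [0.2, 0.9, 0.1, 0.8, 0.3, 0.7]
weights, combined = spectral_ensemble([clf_1, clf_2, clf_3])
print(weights, combined)
```

In this toy input the first two classifiers largely agree and receive equal positive weights, while the third, whose ranking is mostly unrelated to theirs, receives a weight close to zero.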


Subject(s)
Computational Biology/statistics & numerical data; Gene Regulatory Networks/genetics; Unsupervised Machine Learning/statistics & numerical data; Algorithms; Models, Statistical
15.
Elife ; 9, 2020 09 18.
Article in English | MEDLINE | ID: mdl-32945258

ABSTRACT

Our ability to discover effective drug combinations is limited, in part by insufficient understanding of how the transcriptional response of two monotherapies results in that of their combination. We analyzed matched time course RNAseq profiling of cells treated with single drugs and their combinations and found that the transcriptional signature of the synergistic combination was unique relative to that of either constituent monotherapy. The sequential activation of transcription factors in time in the gene regulatory network was implicated. The nature of this transcriptional cascade suggests that drug synergy may ensue when the transcriptional responses elicited by two unrelated individual drugs are correlated. We used these results as the basis of a simple prediction algorithm attaining an AUROC of 0.77 in the prediction of synergistic drug combinations in an independent dataset.


Subject(s)
Drug Combinations; Drug Synergism; Gene Expression; Gene Regulatory Networks/physiology; Transcriptome; Algorithms; Computational Biology; Humans; MCF-7 Cells; RNA-Seq; Transcription Factors/metabolism
16.
Life Sci Alliance ; 3(11), 2020 11.
Article in English | MEDLINE | ID: mdl-32972997

ABSTRACT

Single-cell RNA-sequencing (scRNAseq) technologies are rapidly evolving. Although very informative, in standard scRNAseq experiments, the spatial organization of the cells in the tissue of origin is lost. Conversely, spatial RNA-seq technologies designed to maintain cell localization have limited throughput and gene coverage. Mapping scRNAseq to genes with spatial information increases coverage while providing spatial location. However, methods to perform such mapping have not yet been benchmarked. To fill this gap, we organized the DREAM Single-Cell Transcriptomics challenge focused on the spatial reconstruction of cells from the Drosophila embryo from scRNAseq data, leveraging, as a silver standard, genes with in situ hybridization data from the Berkeley Drosophila Transcription Network Project reference atlas. The 34 participating teams used diverse algorithms for gene selection and location prediction, while being able to correctly localize clusters of cells. Selection of predictor genes was essential for this task. Predictor genes showed a relatively high expression entropy, high spatial clustering and included prominent developmental genes such as gap and pair-rule genes and tissue markers. Application of the top 10 methods to a zebrafish embryo dataset yielded performance and statistical properties of the selected genes similar to those in the Drosophila data. This suggests that methods developed in this challenge are able to extract generalizable properties of genes that are useful to accurately reconstruct the spatial arrangement of cells in tissues.


Subject(s)
Computational Biology/methods; Gene Expression Profiling/methods; Single-Cell Analysis/methods; Spatial Analysis; Algorithms; Animals; Databases, Genetic; Drosophila/genetics; Forecasting/methods; Gene Expression Regulation, Developmental/genetics; Gene Regulatory Networks/genetics; Sequence Analysis, RNA/methods; Transcriptome/genetics; Zebrafish/genetics
17.
JAMA Netw Open ; 3(3): e200265, 2020 03 02.
Article in English | MEDLINE | ID: mdl-32119094

ABSTRACT

Importance: Mammography screening currently relies on subjective human interpretation. Artificial intelligence (AI) advances could be used to increase mammography screening accuracy by reducing missed cancers and false positives. Objective: To evaluate whether AI can overcome human mammography interpretation limitations with a rigorous, unbiased evaluation of machine learning algorithms. Design, Setting, and Participants: In this diagnostic accuracy study conducted between September 2016 and November 2017, an international, crowdsourced challenge was hosted to foster AI algorithm development focused on interpreting screening mammography. More than 1100 participants comprising 126 teams from 44 countries participated. Analysis began November 18, 2016. Main Outcomes and Measurements: Algorithms used images alone (challenge 1) or combined images, previous examinations (if available), and clinical and demographic risk factor data (challenge 2) and output a score that translated to cancer yes/no within 12 months. Algorithm accuracy for breast cancer detection was evaluated using area under the curve and algorithm specificity compared with radiologists' specificity with radiologists' sensitivity set at 85.9% (United States) and 83.9% (Sweden). An ensemble method aggregating top-performing AI algorithms and radiologists' recall assessment was developed and evaluated. Results: Overall, 144 231 screening mammograms from 85 580 US women (952 cancer positive ≤12 months from screening) were used for algorithm training and validation. A second independent validation cohort included 166 578 examinations from 68 008 Swedish women (780 cancer positive). The top-performing algorithm achieved an area under the curve of 0.858 (United States) and 0.903 (Sweden) and 66.2% (United States) and 81.2% (Sweden) specificity at the radiologists' sensitivity, lower than community-practice radiologists' specificity of 90.5% (United States) and 98.5% (Sweden). 
Combining top-performing algorithms and US radiologist assessments resulted in a higher area under the curve of 0.942 and achieved a significantly improved specificity (92.0%) at the same sensitivity. Conclusions and Relevance: While no single AI algorithm outperformed radiologists, an ensemble of AI algorithms combined with radiologist assessment in a single-reader screening environment improved overall accuracy. This study underscores the potential of using machine learning methods for enhancing mammography screening interpretation.
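
Comparing an algorithm's specificity with radiologists' specificity "at the radiologists' sensitivity" means first choosing the score threshold at which the algorithm reaches that sensitivity, then reading off its specificity. The sketch below shows that procedure on invented data; the function name, the target value, and the scores are hypothetical and do not reproduce the challenge's evaluation code.

```python
def specificity_at_sensitivity(labels, scores, target_sensitivity):
    """Lower the threshold until sensitivity >= target, then report specificity.

    Samples with score >= threshold are called positive.
    Returns (threshold, specificity).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        if tp / n_pos >= target_sensitivity:
            tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
            return threshold, tn / n_neg
    return None  # only reached if the target exceeds 1.0


# Hypothetical scores: 4 cancer-positive and 6 cancer-negative examinations.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.05]
threshold, specificity = specificity_at_sensitivity(labels, scores, 0.75)
print(threshold, specificity)
```

Fixing sensitivity first puts the algorithm and the human reader at the same operating point, so the remaining difference in specificity is directly comparable.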


Subject(s)
Breast Neoplasms/diagnostic imaging; Deep Learning; Image Interpretation, Computer-Assisted/methods; Mammography/methods; Radiologists; Adult; Aged; Algorithms; Artificial Intelligence; Early Detection of Cancer; Female; Humans; Middle Aged; Radiology; Sensitivity and Specificity; Sweden; United States
18.
J Natl Cancer Inst ; 112(2): 179-190, 2020 02 01.
Article in English | MEDLINE | ID: mdl-31095341

ABSTRACT

BACKGROUND: A total of 10%-20% of patients develop long-term toxicity following radiotherapy for prostate cancer. Identification of common genetic variants associated with susceptibility to radiotoxicity might improve risk prediction and inform functional mechanistic studies. METHODS: We conducted an individual patient data meta-analysis of six genome-wide association studies (n = 3871) in men of European ancestry who underwent radiotherapy for prostate cancer. Radiotoxicities (increased urinary frequency, decreased urinary stream, hematuria, rectal bleeding) were graded prospectively. We used grouped relative risk models to test associations with approximately 6 million genotyped or imputed variants (time to first grade 2 or higher toxicity event). Variants with two-sided Pmeta less than 5 × 10⁻⁸ were considered statistically significant. Bayesian false discovery probability provided an additional measure of confidence. Statistically significant variants were evaluated in three Japanese cohorts (n = 962). All statistical tests were two-sided. RESULTS: Meta-analysis of the European ancestry cohorts identified three genomic signals: single nucleotide polymorphism rs17055178 with rectal bleeding (Pmeta = 6.2 × 10⁻¹⁰), rs10969913 with decreased urinary stream (Pmeta = 2.9 × 10⁻¹⁰), and rs11122573 with hematuria (Pmeta = 1.8 × 10⁻⁸). Fine-scale mapping of these three regions was used to identify another independent signal (rs147121532) associated with hematuria (Pconditional = 4.7 × 10⁻⁶). Credible causal variants at these four signals lie in gene-regulatory regions, some modulating expression of nearby genes. Previously identified variants showed consistent associations (rs17599026 with increased urinary frequency, rs7720298 with decreased urinary stream, rs1801516 with overall toxicity) in new cohorts. rs10969913 and rs17599026 had similar effects in the photon-treated Japanese cohorts.
CONCLUSIONS: This study increases the understanding of the architecture of common genetic variants affecting radiotoxicity, points to novel radio-pathogenic mechanisms, and develops risk models for testing in clinical studies. Further multinational radiogenomics studies in larger cohorts are worthwhile.

19.
Sci Rep ; 9(1): 12970, 2019 09 10.
Article in English | MEDLINE | ID: mdl-31506535

ABSTRACT

Biological and regulatory mechanisms underlying many multi-gene expression-based disease biomarkers are often not readily evident. We describe an innovative framework, NeTFactor, that combines network analyses with gene expression data to identify transcription factors (TFs) that significantly and maximally regulate such a biomarker. NeTFactor uses a computationally-inferred context-specific gene regulatory network and applies topological, statistical, and optimization methods to identify regulator TFs. Application of NeTFactor to a multi-gene expression-based asthma biomarker identified ETS translocation variant 4 (ETV4) and peroxisome proliferator-activated receptor gamma (PPARG) as the biomarker's most significant TF regulators. siRNA-based knock down of these TFs in an airway epithelial cell line model demonstrated significant reduction of cytokine expression relevant to asthma, validating NeTFactor's top-scoring findings. While PPARG has been associated with airway inflammation, ETV4 has not yet been implicated in asthma, thus indicating the possibility of novel, disease-relevant discovery by NeTFactor. We also show that NeTFactor's results are robust when the gene regulatory network and biomarker are derived from independent data. Additionally, our application of NeTFactor to a different disease biomarker identified TF regulators of interest. These results illustrate that the application of NeTFactor to multi-gene expression-based biomarkers could yield valuable insights into regulatory mechanisms and biological processes underlying disease.


Subject(s)
Algorithms , Asthma/genetics , Asthma/pathology , Biomarkers/analysis , Gene Expression Regulation , Gene Regulatory Networks , Case-Control Studies , Cohort Studies , Gene Expression Profiling , Humans , Signal Transduction , Transcription Factors/genetics , Transcription Factors/metabolism
20.
Nat Commun ; 10(1): 2674, 2019 06 17.
Article in English | MEDLINE | ID: mdl-31209238

ABSTRACT

The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca's large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and results of a DREAM Challenge to evaluate computational strategies for predicting synergistic drug pairs and biomarkers. 160 teams participated to provide a comprehensive methodological development and benchmarking. Winning methods incorporate prior knowledge of drug-target interactions. Synergy is predicted with an accuracy matching biological replicates for >60% of combinations. However, 20% of drug combinations are poorly predicted by all methods. Genomic rationale for synergy predictions are identified, including ADAM17 inhibitor antagonism when combined with PIK3CB/D inhibition contrasting to synergy when combined with other PI3K-pathway inhibitors in PIK3CA mutant cells.


Subject(s)
Antineoplastic Combined Chemotherapy Protocols/pharmacology , Computational Biology/methods , Neoplasms/drug therapy , Pharmacogenetics/methods , ADAM17 Protein/antagonists & inhibitors , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Benchmarking , Biomarkers, Tumor/genetics , Cell Line, Tumor , Computational Biology/standards , Datasets as Topic , Drug Antagonism , Drug Resistance, Neoplasm/drug effects , Drug Resistance, Neoplasm/genetics , Drug Synergism , Genomics/methods , Humans , Molecular Targeted Therapy/methods , Mutation , Neoplasms/genetics , Pharmacogenetics/standards , Phosphatidylinositol 3-Kinases/genetics , Phosphoinositide-3 Kinase Inhibitors , Treatment Outcome