Results 1 - 20 of 78,000
1.
Cell ; 184(9): 2372-2383.e9, 2021 04 29.
Article in English | MEDLINE | ID: mdl-33743213

ABSTRACT

Vaccination elicits immune responses capable of potently neutralizing SARS-CoV-2. However, ongoing surveillance has revealed the emergence of variants harboring mutations in spike, the main target of neutralizing antibodies. To understand the impact of these variants, we evaluated the neutralization potency of 99 individuals that received one or two doses of either BNT162b2 or mRNA-1273 vaccines against pseudoviruses representing 10 globally circulating strains of SARS-CoV-2. Five of the 10 pseudoviruses, harboring receptor-binding domain mutations, including K417N/T, E484K, and N501Y, were highly resistant to neutralization. Cross-neutralization of B.1.351 variants was comparable to SARS-CoV and bat-derived WIV1-CoV, suggesting that a relatively small number of mutations can mediate potent escape from vaccine responses. While the clinical impact of neutralization resistance remains uncertain, these results highlight the potential for variants to escape from neutralizing humoral immunity and emphasize the need to develop broadly protective interventions against the evolving pandemic.


Subject(s)
Antibodies, Neutralizing/immunology , Antibodies, Viral/immunology , COVID-19 Vaccines/immunology , Immunity, Humoral , SARS-CoV-2/immunology , BNT162 Vaccine , COVID-19/blood , COVID-19/immunology , COVID-19/virology , HEK293 Cells , Humans , Mutation/genetics , ROC Curve , SARS-CoV-2/genetics
2.
Cell ; 182(2): 317-328.e10, 2020 07 23.
Article in English | MEDLINE | ID: mdl-32526205

ABSTRACT

Hepatocellular carcinoma (HCC) is an aggressive malignancy with its global incidence and mortality rate continuing to rise, although early detection and surveillance are suboptimal. We performed serological profiling of the viral infection history in 899 individuals from an NCI-UMD case-control study using a synthetic human virome, VirScan. We developed a viral exposure signature and validated the results in a longitudinal cohort with 173 at-risk patients who had long-term follow-up for HCC development. Our viral exposure signature significantly associated with HCC status among at-risk individuals in the validation cohort (area under the curve: 0.91 [95% CI 0.87-0.96] at baseline and 0.98 [95% CI 0.97-1] at diagnosis). The signature identified cancer patients prior to a clinical diagnosis and was superior to alpha-fetoprotein. In summary, we established a viral exposure signature that can predict HCC among at-risk patients prior to a clinical diagnosis, which may be useful in HCC surveillance.


Subject(s)
Carcinoma, Hepatocellular/pathology , Liver Neoplasms/pathology , Virus Diseases/pathology , Adult , Aged , Area Under Curve , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/metabolism , Case-Control Studies , Cohort Studies , Databases, Genetic , Female , Genome-Wide Association Study , Humans , Linkage Disequilibrium , Liver Neoplasms/genetics , Liver Neoplasms/metabolism , Male , Middle Aged , Polymorphism, Single Nucleotide , ROC Curve , Risk Factors , Virus Diseases/complications , Young Adult , alpha-Fetoproteins/analysis
3.
Cell ; 178(2): 447-457.e5, 2019 07 11.
Article in English | MEDLINE | ID: mdl-31257030

ABSTRACT

Neurons in cortical circuits are often coactivated as ensembles, yet it is unclear whether ensembles play a functional role in behavior. Some ensemble neurons have pattern completion properties, triggering the entire ensemble when activated. Using two-photon holographic optogenetics in mouse primary visual cortex, we tested whether recalling ensembles by activating pattern completion neurons alters behavioral performance in a visual task. Disruption of behaviorally relevant ensembles by activation of non-selective neurons decreased performance, whereas activation of only two pattern completion neurons from behaviorally relevant ensembles improved performance, by reliably recalling the whole ensemble. Also, inappropriate behavioral choices were evoked by the mistaken activation of behaviorally relevant ensembles. Finally, in absence of visual stimuli, optogenetic activation of two pattern completion neurons could trigger behaviorally relevant ensembles and correct behavioral responses. Our results demonstrate a causal role of neuronal ensembles in a visually guided behavior and suggest that ensembles implement internal representations of perceptual states.


Subject(s)
Behavior, Animal , Visual Cortex/physiology , Animals , Area Under Curve , Calcium/metabolism , Holography , Image Processing, Computer-Assisted , Male , Mice , Mice, Inbred C57BL , Neurons/metabolism , Optogenetics/methods , Photic Stimulation , Photons , ROC Curve
4.
Cell ; 172(5): 1122-1131.e9, 2018 02 22.
Article in English | MEDLINE | ID: mdl-29474911

ABSTRACT

The implementation of clinical-decision support algorithms for medical imaging faces challenges with reliability and interpretability. Here, we establish a diagnostic tool based on a deep-learning framework for the screening of patients with common treatable blinding retinal diseases. Our framework utilizes transfer learning, which trains a neural network with a fraction of the data of conventional approaches. Applying this approach to a dataset of optical coherence tomography images, we demonstrate performance comparable to that of human experts in classifying age-related macular degeneration and diabetic macular edema. We also provide a more transparent and interpretable diagnosis by highlighting the regions recognized by the neural network. We further demonstrate the general applicability of our AI system for diagnosis of pediatric pneumonia using chest X-ray images. This tool may ultimately aid in expediting the diagnosis and referral of these treatable conditions, thereby facilitating earlier treatment, resulting in improved clinical outcomes. VIDEO ABSTRACT.


Subject(s)
Deep Learning , Diagnostic Imaging , Pneumonia/diagnosis , Child , Humans , Neural Networks, Computer , Pneumonia/diagnostic imaging , ROC Curve , Reproducibility of Results , Tomography, Optical Coherence
5.
Cell ; 174(6): 1361-1372.e10, 2018 09 06.
Article in English | MEDLINE | ID: mdl-30193110

ABSTRACT

A key aspect of genomic medicine is to make individualized clinical decisions from personal genomes. We developed a machine-learning framework to integrate personal genomes and electronic health record (EHR) data and used this framework to study abdominal aortic aneurysm (AAA), a prevalent irreversible cardiovascular disease with unclear etiology. Performing whole-genome sequencing on AAA patients and controls, we demonstrated its predictive precision solely from personal genomes. By modeling personal genomes with EHRs, this framework quantitatively assessed the effectiveness of adjusting personal lifestyles given personal genome baselines, demonstrating its utility as a personal health management tool. We showed that this new framework agnostically identified genetic components involved in AAA, which were subsequently validated in human aortic tissues and in murine models. Our study presents a new framework for disease genome analysis, which can be used for both health management and understanding the biological architecture of complex diseases. VIDEO ABSTRACT.


Subject(s)
Aortic Aneurysm, Abdominal/pathology , Genomics , Animals , Aortic Aneurysm, Abdominal/genetics , Area Under Curve , Disease Models, Animal , Gene Expression Regulation , Gene Regulatory Networks , Genome-Wide Association Study , Humans , Machine Learning , Mice , Polymorphism, Single Nucleotide , Protein Interaction Maps , ROC Curve , Whole Genome Sequencing
6.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38385875

ABSTRACT

Metabolomics and foodomics shed light on the molecular processes within living organisms and on complex food composition by leveraging sophisticated analytical techniques to systematically analyze the vast array of molecular features. The traditional feature-picking method often results in arbitrary selections of the model, feature ranking, and cut-off, which may lead to suboptimal results. Thus, a Multiple and Optimal Screening Subset (MOSS) approach was developed in this study to achieve a balance between a minimal number of predictors and high predictive accuracy during statistical model setup. The MOSS approach compares five commonly used models in the context of food matrix analysis, specifically bourbons. These models include Student's t-test, receiver operating characteristic curve, partial least squares-discriminant analysis (PLS-DA), random forests, and support vector machines. The approach employs cross-validation to identify promising subset feature candidates that contribute to food characteristic classification. It then determines the optimal subset size by comparing it to the corresponding top-ranked features. Finally, it selects the optimal feature subset by traversing all possible feature candidate combinations. By using the MOSS approach to analyze 1406 mass spectral features from a collection of 122 bourbon samples, we were able to generate a subset of features for bourbon age prediction with 88% accuracy. Additionally, MOSS increased the area under the curve performance of sweetness prediction to 0.898 with only four predictors, compared with 0.681 for the top-ranked four features based on the PLS-DA model. Overall, we demonstrated that MOSS provides an efficient and effective approach for selecting optimal features compared with other frequently utilized methods.


Subject(s)
Metabolomics , Research Design , Discriminant Analysis , Models, Statistical , ROC Curve
7.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38770720

ABSTRACT

The normalization of RNA sequencing data is a primary step for downstream analysis. The most popular methods used for normalization are the trimmed mean of M values (TMM) and DESeq. The TMM trims away extreme log fold changes of the data to normalize the raw read counts based on the remaining non-differentially expressed genes. However, the major problem with the TMM is that the values of the trimming factor M are heuristic. This paper estimates an adaptive value of M in TMM based on Jaeckel's Estimator, and each sample acts as a reference to find the scale factor of each sample. The presented approach is validated on SEQC, MAQC2, MAQC3, PICKRELL and two simulated datasets with two-group and three-group conditions by varying the percentage of differential expression and the number of replicates. The performance of the present approach is compared with various state-of-the-art methods, and it is better in terms of area under the receiver operating characteristic curve and differential expression.
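
To make the trimming idea concrete, the following is a minimal sketch of a fixed-trim TMM scale factor between one sample and one reference. It is not the adaptive method of this paper: the function name `tmm_factor`, the default trim fractions (30% on M, 5% on A), and the unweighted mean are illustrative assumptions, and the Jaeckel-based choice of trim is not reproduced.

```python
import numpy as np

def tmm_factor(sample, reference, trim_m=0.3, trim_a=0.05):
    """Simplified TMM scale factor (hypothetical helper, unweighted mean).

    sample, reference: raw read counts per gene, same gene order.
    trim_m, trim_a: fraction trimmed from EACH tail of the M and A values.
    """
    sample = np.asarray(sample, dtype=float)
    reference = np.asarray(reference, dtype=float)

    # Genes with a zero count in either library have undefined log ratios.
    keep = (sample > 0) & (reference > 0)
    p_s = sample[keep] / sample.sum()        # proportions in the sample
    p_r = reference[keep] / reference.sum()  # proportions in the reference

    m = np.log2(p_s / p_r)        # M: log fold change per gene
    a = 0.5 * np.log2(p_s * p_r)  # A: average log abundance per gene

    n = m.size
    cut_m = int(np.floor(n * trim_m))
    cut_a = int(np.floor(n * trim_a))

    # Rank each gene by M and by A, then drop both tails of each ranking.
    rank_m = np.argsort(np.argsort(m))
    rank_a = np.argsort(np.argsort(a))
    inside = ((rank_m >= cut_m) & (rank_m < n - cut_m)
              & (rank_a >= cut_a) & (rank_a < n - cut_a))

    # The mean of the surviving M values, back on the linear scale.
    return float(2.0 ** m[inside].mean())
```

In this sketch a sample that is simply sequenced twice as deeply as the reference gets a factor of 1, while a few strongly up-regulated genes are trimmed away so that the factor reflects only the composition shift they cause in the remaining genes.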


Subject(s)
RNA-Seq , RNA-Seq/methods , Humans , Algorithms , Sequence Analysis, RNA/methods , Computational Biology/methods , Gene Expression Profiling/methods , ROC Curve , Software
8.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38426324

ABSTRACT

Emerging clinical evidence suggests that sophisticated associations between circular ribonucleic acids (circRNAs) and microRNAs (miRNAs) are a critical regulatory factor of various pathological processes and play a critical role in most intricate human diseases. Nonetheless, establishing these correlations via wet experiments is error-prone and labor-intensive, and the underlying novel circRNA-miRNA association (CMA) has been validated by numerous existing computational methods that rely only on single correlation data. Considering the inadequacy of existing machine learning models, we propose a new model named BGF-CMAP, which combines the gradient boosting decision tree with natural language processing and graph embedding methods to infer associations between circRNAs and miRNAs. Specifically, BGF-CMAP extracts sequence attribute features and interaction behavior features by Word2vec and two homogeneous graph embedding algorithms, large-scale information network embedding and graph factorization, respectively. Comprehensive experimental analysis revealed that BGF-CMAP successfully predicted the complex relationship between circRNAs and miRNAs with an accuracy of 82.90% and an area under the receiver operating characteristic curve of 0.9075. Furthermore, 23 of the top 30 miRNA-associated circRNAs in the case studies were confirmed by relevant experiments, showing that the BGF-CMAP model is superior to others. BGF-CMAP can serve as a helpful model to provide a scientific theoretical basis for the study of CMA prediction.


Subject(s)
MicroRNAs , Humans , MicroRNAs/genetics , RNA, Circular/genetics , ROC Curve , Machine Learning , Algorithms , Computational Biology/methods
9.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39154195

ABSTRACT

MicroRNAs (miRNAs) play crucial roles in several biological processes. Detecting their subcellular localizations is essential for deeper insight into their functions and mechanisms. The traditional methods for determining miRNA subcellular localizations are expensive. Computational methods are an alternative way to quickly predict miRNA subcellular localizations. Although several computational methods have been proposed in this regard, the incomplete representations of miRNAs in these methods leave room for improvement. In this study, a novel computational method for predicting miRNA subcellular localizations, named PMiSLocMF, was developed. As many miRNAs have multiple subcellular localizations, this method is a multi-label classifier. Several properties of miRNAs, such as miRNA sequences, miRNA functional similarity, and miRNA-disease, miRNA-drug, and miRNA-mRNA associations, were adopted for generating informative miRNA features. To this end, powerful algorithms [node2vec and graph attention auto-encoder (GATE)] and one newly designed scheme were adopted to process the above properties, producing five feature types. All features were fed into self-attention and fully connected layers to make predictions. The cross-validation results indicated the high performance of PMiSLocMF, with accuracy higher than 0.83 and average area under the receiver operating characteristic curve (AUC) and area under the precision-recall curve (AUPR) exceeding 0.90 and 0.77, respectively. Such performance was better than all previous methods based on the same dataset. Further tests showed that using all feature types can improve the performance of PMiSLocMF, and that GATE and the self-attention layer can help enhance the performance. Finally, we analyzed in depth the influence of miRNA associations with diseases, drugs, and mRNAs on PMiSLocMF. The dataset and codes are available at https://github.com/Gu20201017/PMiSLocMF.


Subject(s)
Algorithms , Computational Biology , MicroRNAs , MicroRNAs/genetics , MicroRNAs/metabolism , Computational Biology/methods , Humans , Software , RNA, Messenger/genetics , RNA, Messenger/metabolism , ROC Curve
10.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39222060

ABSTRACT

Instruction-tuned large language models (LLMs) demonstrate exceptional ability to align with human intentions. We present an LLM-based model, instruction-tuned LLM for assessment of cancer (iLLMAC), that can detect cancer using cell-free deoxyribonucleic acid (cfDNA) end-motif profiles. Developed on plasma cfDNA sequencing data from 1135 cancer patients and 1106 controls across three datasets, iLLMAC achieved an area under the receiver operating characteristic curve (AUROC) of 0.866 [95% confidence interval (CI), 0.773-0.959] for cancer diagnosis and 0.924 (95% CI, 0.841-1.0) for hepatocellular carcinoma (HCC) detection using 16 end-motifs. Performance increased with more motifs, reaching 0.886 (95% CI, 0.794-0.977) and 0.956 (95% CI, 0.89-1.0) for cancer diagnosis and HCC detection, respectively, with 64 end-motifs. On an external-testing set, iLLMAC achieved an AUROC of 0.912 (95% CI, 0.849-0.976) for cancer diagnosis and 0.938 (95% CI, 0.885-0.992) for HCC detection with 64 end-motifs, significantly outperforming benchmarked methods. Furthermore, iLLMAC achieved high classification performance on datasets with bisulfite and 5-hydroxymethylcytosine sequencing. Our study highlights the effectiveness of LLM-based instruction-tuning for cfDNA-based cancer detection.
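
As a reading aid for the AUROC values reported in these records, here is a minimal sketch of how the statistic can be computed directly from scores and binary labels. It uses the standard interpretation of AUROC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The function name `auroc` and the toy data are illustrative assumptions, not taken from any of the listed studies.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of positives and negatives.

    scores: predicted scores, higher means "more likely positive".
    labels: 1 for a positive case, 0 for a negative case.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    # A positive that outranks a negative counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one negative (0.8) outranks one positive (0.3).
print(auroc([0.9, 0.3, 0.8, 0.1], [1, 1, 0, 0]))
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; rank-based formulas give the same value more efficiently on large cohorts.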


Subject(s)
Carcinoma, Hepatocellular , Cell-Free Nucleic Acids , Humans , Cell-Free Nucleic Acids/blood , Carcinoma, Hepatocellular/diagnosis , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/blood , Liver Neoplasms/diagnosis , Liver Neoplasms/genetics , Liver Neoplasms/blood , Neoplasms/diagnosis , Neoplasms/genetics , Neoplasms/blood , ROC Curve , Biomarkers, Tumor/genetics , Biomarkers, Tumor/blood , Nucleotide Motifs , DNA Methylation
11.
Nature ; 587(7834): 448-454, 2020 11.
Article in English | MEDLINE | ID: mdl-33149306

ABSTRACT

Low concordance between studies that examine the role of microbiota in human diseases is a pervasive challenge that limits the capacity to identify causal relationships between host-associated microorganisms and pathology. The risk of obtaining false positives is exacerbated by wide interindividual heterogeneity in microbiota composition, probably due to population-wide differences in human lifestyle and physiological variables that exert differential effects on the microbiota. Here we infer the greatest, generalized sources of heterogeneity in human gut microbiota profiles and also identify human lifestyle and physiological characteristics that, if not evenly matched between cases and controls, confound microbiota analyses to produce spurious microbial associations with human diseases. We identify alcohol consumption frequency and bowel movement quality as unexpectedly strong sources of gut microbiota variance that differ in distribution between healthy participants and participants with a disease and that can confound study designs. We demonstrate that for numerous prevalent, high-burden human diseases, matching cases and controls for confounding variables reduces observed differences in the microbiota and the incidence of spurious associations. On this basis, we present a list of host variables that we recommend should be captured in human microbiota studies for the purpose of matching comparison groups, which we anticipate will increase robustness and reproducibility in resolving the members of the gut microbiota that are truly associated with human disease.


Subject(s)
Confounding Factors, Epidemiologic , Data Analysis , Diet , Disease , Gastrointestinal Microbiome/physiology , Life Style , Machine Learning , Adult , Aged , Aged, 80 and over , Alcohol Drinking , Area Under Curve , Body Mass Index , Case-Control Studies , Diabetes Mellitus, Type 2 , Feces/microbiology , Female , Gastrointestinal Motility , Humans , Male , Middle Aged , RNA, Ribosomal, 16S/genetics , ROC Curve , Residence Characteristics , Young Adult
12.
Circulation ; 150(12): 911-922, 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-38881496

ABSTRACT

BACKGROUND: Artificial intelligence, particularly deep learning (DL), has immense potential to improve the interpretation of transthoracic echocardiography (TTE). Mitral regurgitation (MR) is the most common valvular heart disease and presents unique challenges for DL, including the integration of multiple video-level assessments into a final study-level classification. METHODS: A novel DL system was developed to intake complete TTEs, identify color MR Doppler videos, and determine MR severity on a 4-step ordinal scale (none/trace, mild, moderate, and severe) using the reading cardiologist as a reference standard. This DL system was tested in internal and external test sets with performance assessed by agreement with the reading cardiologist, weighted κ, and area under the receiver-operating characteristic curve for binary classification of both moderate or greater and severe MR. In addition to the primary 4-step model, a 6-step MR assessment model was studied with the addition of the intermediate MR classes of mild-moderate and moderate-severe with performance assessed by both exact agreement and ±1 step agreement with the clinical MR interpretation. RESULTS: A total of 61 689 TTEs were split into train (n=43 811), validation (n=8891), and internal test (n=8987) sets with an additional external test set of 8208 TTEs. The model had high performance in MR classification in internal (exact accuracy, 82%; κ=0.84; area under the receiver-operating characteristic curve, 0.98 for moderate or greater MR) and external test sets (exact accuracy, 79%; κ=0.80; area under the receiver-operating characteristic curve, 0.98 for moderate or greater MR). Most (63% internal and 66% external) misclassification disagreements were between none/trace and mild MR. MR classification accuracy was slightly higher using multiple TTE views (accuracy, 82%) than with only apical 4-chamber views (accuracy, 80%). In subset analyses, the model was accurate in the classification of both primary and secondary MR with slightly lower performance in cases of eccentric MR. In the analysis of the 6-step classification system, the exact accuracy was 80% and 76% with a ±1 step agreement of 99% and 98% in the internal and external test set, respectively. CONCLUSIONS: This end-to-end DL system can intake entire echocardiogram studies to accurately classify MR severity and may be useful in helping clinicians refine MR assessments.


Subject(s)
Deep Learning , Mitral Valve Insufficiency , Mitral Valve Insufficiency/diagnostic imaging , Mitral Valve Insufficiency/physiopathology , Mitral Valve Insufficiency/classification , Humans , Male , Female , Aged , Middle Aged , Echocardiography/methods , Severity of Illness Index , Mitral Valve/diagnostic imaging , ROC Curve
13.
Am J Hum Genet ; 109(2): 195-209, 2022 02 03.
Article in English | MEDLINE | ID: mdl-35032432

ABSTRACT

Whole-genome sequencing resolves many clinical cases where standard diagnostic methods have failed. However, at least half of these cases remain unresolved after whole-genome sequencing. Structural variants (SVs; genomic variants larger than 50 base pairs) of uncertain significance are the genetic cause of a portion of these unresolved cases. As sequencing methods using long or linked reads become more accessible and SV detection algorithms improve, clinicians and researchers are gaining access to thousands of reliable SVs of unknown disease relevance. Methods to predict the pathogenicity of these SVs are required to realize the full diagnostic potential of long-read sequencing. To address this emerging need, we developed StrVCTVRE to distinguish pathogenic SVs from benign SVs that overlap exons. In a random forest classifier, we integrated features that capture gene importance, coding region, conservation, expression, and exon structure. We found that features such as expression and conservation are important but are absent from SV classification guidelines. We leveraged multiple resources to construct a size-matched training set of rare, putatively benign and pathogenic SVs. StrVCTVRE performs accurately across a wide SV size range on independent test sets, which will allow clinicians and researchers to eliminate about half of SVs from consideration while retaining a 90% sensitivity. We anticipate clinicians and researchers will use StrVCTVRE to prioritize SVs in probands where no SV is immediately compelling, empowering deeper investigation into novel SVs to resolve cases and understand new mechanisms of disease. StrVCTVRE runs rapidly and is publicly available.
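
The trade-off described here (discarding a share of candidate SVs while retaining 90% sensitivity) can be illustrated with a small sketch. This is not StrVCTVRE's code or API: the helper names `threshold_at_sensitivity` and `fraction_eliminated`, the score scale, and the toy numbers are all assumptions made for illustration.

```python
import math

def threshold_at_sensitivity(pathogenic_scores, target=0.9):
    """Highest score cut-off that still keeps `target` of known pathogenic
    variants at or above it (hypothetical helper)."""
    if not pathogenic_scores:
        raise ValueError("need at least one pathogenic score")
    ranked = sorted(pathogenic_scores, reverse=True)
    needed = max(1, math.ceil(target * len(ranked)))
    return ranked[needed - 1]

def fraction_eliminated(candidate_scores, threshold):
    """Share of candidate variants scoring below the cut-off."""
    below = sum(1 for s in candidate_scores if s < threshold)
    return below / len(candidate_scores)


# Hypothetical scores: 10 known pathogenic SVs and 6 unclassified candidates.
pathogenic = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.2]
candidates = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
cutoff = threshold_at_sensitivity(pathogenic, target=0.9)
print(cutoff, fraction_eliminated(candidates, cutoff))
```

In practice the cut-off would be chosen on a held-out set of labeled variants, since a threshold tuned and evaluated on the same data overstates the sensitivity actually retained.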


Subject(s)
Algorithms , Genome, Human , Genomic Structural Variation , Software , Supervised Machine Learning , Datasets as Topic , Exons , Genomics/methods , Humans , ROC Curve , Whole Genome Sequencing/statistics & numerical data
14.
Gastroenterology ; 167(3): 591-603.e9, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38583724

ABSTRACT

BACKGROUND & AIMS: Benign ulcerative colorectal diseases (UCDs) such as ulcerative colitis, Crohn's disease, ischemic colitis, and intestinal tuberculosis share similar phenotypes with different etiologies and treatment strategies. To accurately diagnose closely related diseases like UCDs, we hypothesize that contextual learning is critical in enhancing the ability of the artificial intelligence models to differentiate the subtle differences in lesions amidst the vastly divergent spatial contexts. METHODS: White-light colonoscopy datasets of patients with confirmed UCDs and healthy controls were retrospectively collected. We developed a Multiclass Contextual Classification (MCC) model that can differentiate among the mentioned UCDs and healthy controls by incorporating the tissue object contexts surrounding the individual lesion region in a scene and spatial information from other endoscopic frames (video-level) into a unified framework. Internal and external datasets were used to validate the model's performance. RESULTS: Training datasets included 762 patients, and the internal and external testing cohorts included 257 patients and 293 patients, respectively. Our MCC model provided a rapid reference diagnosis on internal test sets with a high averaged area under the receiver operating characteristic curve (image-level: 0.950 and video-level: 0.973) and balanced accuracy (image-level: 76.1% and video-level: 80.8%), which was superior to junior endoscopists (accuracy: 71.8%, P < .0001) and similar to experts (accuracy: 79.7%, P = .732). The MCC model achieved an area under the receiver operating characteristic curve of 0.988 and balanced accuracy of 85.8% using external testing datasets. CONCLUSIONS: These results enable this model to fit in the routine endoscopic workflow, and the contextual framework to be adopted for diagnosing other closely related diseases.


Subject(s)
Artificial Intelligence , Colitis, Ulcerative , Colonoscopy , Humans , Colitis, Ulcerative/diagnosis , Retrospective Studies , Female , Male , Middle Aged , Adult , Image Interpretation, Computer-Assisted/methods , ROC Curve , Aged , Reproducibility of Results , Colon/pathology , Colon/diagnostic imaging , Predictive Value of Tests , Diagnosis, Differential , Video Recording , Machine Learning , Case-Control Studies
15.
Gastroenterology ; 167(2): 357-367.e9, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38513745

ABSTRACT

BACKGROUND & AIMS: There is an unmet need for noninvasive tests to improve case-finding and aid primary care professionals in referring patients at high risk of liver disease. METHODS: A metabolic dysfunction-associated fibrosis (MAF-5) score was developed and externally validated in a total of 21,797 individuals with metabolic dysfunction in population-based (National Health and Nutrition Examination Survey 2017-2020, National Health and Nutrition Examination Survey III, and Rotterdam Study) and hospital-based (from Antwerp and Bogota) cohorts. Fibrosis was defined as liver stiffness ≥8.0 kPa. Diagnostic accuracy was compared with FIB-4, nonalcoholic fatty liver disease fibrosis score (NFS), LiverRisk score and steatosis-associated fibrosis estimator (SAFE). MAF-5 was externally validated with liver stiffness measurement ≥8.0 kPa, with shear-wave elastography ≥7.5 kPa, and biopsy-proven steatotic liver disease according to Metavir and Nonalcoholic Steatohepatitis Clinical Research Network scores, and was tested for prognostic performance (all-cause mortality). RESULTS: The MAF-5 score comprised waist circumference, body mass index (calculated as kg/m²), diabetes, aspartate aminotransferase, and platelets. With this score, 60.9% of individuals were predicted to be at low, 14.1% at intermediate, and 24.9% at high risk of fibrosis. The observed prevalence was 3.3%, 7.9%, and 28.1%, respectively. The area under the receiver operating characteristic curve of MAF-5 (0.81) was significantly higher than that of FIB-4 (0.61), and MAF-5 outperformed FIB-4 among young people (negative predictive value [NPV], 99%; area under the curve [AUC], 0.86 vs NPV, 94%; AUC, 0.51) and older adults (NPV, 94%; AUC, 0.75 vs NPV, 88%; AUC, 0.55). MAF-5 showed excellent performance to detect liver stiffness measurement ≥12 kPa (AUC, 0.86 training; AUC, 0.85 validation) and good performance in detecting liver stiffness and biopsy-proven liver fibrosis among the external validation cohorts. MAF-5 score >1 was associated with increased risk of all-cause mortality in (un)adjusted models (adjusted hazard ratio, 1.59; 95% CI, 1.47-1.73). CONCLUSIONS: The MAF-5 score is a validated, age-independent, inexpensive referral tool to identify individuals at high risk of liver fibrosis and all-cause mortality in primary care populations, using simple variables.


Subject(s)
Elasticity Imaging Techniques , Liver Cirrhosis , Predictive Value of Tests , Humans , Male , Female , Liver Cirrhosis/diagnosis , Liver Cirrhosis/epidemiology , Liver Cirrhosis/pathology , Liver Cirrhosis/etiology , Middle Aged , Risk Assessment , Aged , Prognosis , Body Mass Index , Risk Factors , Waist Circumference , Nutrition Surveys , Non-alcoholic Fatty Liver Disease/epidemiology , Non-alcoholic Fatty Liver Disease/diagnosis , Non-alcoholic Fatty Liver Disease/pathology , Adult , Aspartate Aminotransferases/blood , Platelet Count , Liver/pathology , Liver/diagnostic imaging , Netherlands/epidemiology , Biopsy , ROC Curve , Reproducibility of Results
16.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37039696

ABSTRACT

The ability to identify B-cell epitopes is an essential step in vaccine design, immunodiagnostic tests and antibody production. Several computational approaches have been proposed to identify, from an antigen protein or peptide sequence, which residues are more likely to be part of an epitope, but have limited performance on relatively homogeneous data sets and lack interpretability, limiting biological insights that could otherwise be obtained. To address these limitations, we have developed epitope1D, an explainable machine learning method capable of accurately identifying linear B-cell epitopes, leveraging two new descriptors: a graph-based signature representation of protein sequences, based on our well-established Cutoff Scanning Matrix algorithm and Organism Ontology information. Our model achieved Areas Under the ROC curve of up to 0.935 on cross-validation and blind tests, demonstrating robust performance. A comprehensive comparison to alternative methods using distinct benchmark data sets was also employed, with our model outperforming state-of-the-art tools. epitope1D represents not only a significant advance in predictive performance, but also allows biologically meaningful features to be combined and used for model interpretation. epitope1D has been made available as a user-friendly web server interface and application programming interface at https://biosig.lab.uq.edu.au/epitope1d/.


Subject(s)
Algorithms , Epitopes, B-Lymphocyte , Amino Acid Sequence , ROC Curve
17.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37369639

ABSTRACT

DNA methylation plays a crucial role in transcriptional regulation. Reduced representation bisulfite sequencing (RRBS) is an increasingly used technique for analyzing genome-wide methylation profiles. Many computational tools, such as Metilene, MethylKit, BiSeq and DMRfinder, have been developed to use RRBS data for the detection of differentially methylated regions (DMRs) potentially involved in the epigenetic regulation of gene expression. For DMR detection tools, as for countless other medical applications, P-values and their adjustments are among the most standard reporting statistics used to assess the statistical significance of biological findings. However, P-values are coming under increasing criticism relating to their questionable accuracy and relatively high levels of false positive or negative indications. Here, we propose a method to calculate E-values, as likelihood ratios falling into the null hypothesis over the entire parameter space, for DMR detection in RRBS data. We also provide the R package 'metevalue' as a user-friendly interface to implement E-value calculations in various DMR detection tools. To evaluate the performance of E-values, we generated various RRBS benchmarking datasets using our simulator 'RRBSsim' with eight samples in each experimental group. Our comprehensive benchmarking analyses showed that using E-values not only significantly improved accuracy, area under the ROC curve and power over those of P-values or adjusted P-values, but also reduced false discovery rates and type I errors. In applications using real RRBS data from CRL rats and from a clinical trial on a low-salt diet, the use of E-values detected biologically more relevant DMRs and also improved the negative association between DNA methylation and gene expression.


Subject(s)
DNA Methylation , Animals , Rats , Sequence Analysis, DNA/methods , ROC Curve , CpG Islands
18.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37013942

ABSTRACT

Identifying protein-protein interaction (PPI) sites is an important step in understanding biological activity, elucidating pathological mechanisms and designing novel drugs. Reliable computational methods for predicting PPI sites can serve as screening tools that reduce the time and cost of conventional experiments, but improving their accuracy remains challenging. We propose a PPI site predictor, called Augmented Graph Attention Network Protein-Protein Interacting Site (AGAT-PPIS), based on AGAT with initial residual and identity mapping, in which eight AGAT layers are connected to learn deep node embedding representations. AGAT is our augmented version of the graph attention network, with added edge features. In addition, extra node features and edge features are introduced to provide more structural information and increase the translation and rotation invariance of the model. On the benchmark test set, AGAT-PPIS significantly surpasses the state-of-the-art method by 8% in Accuracy, 17.1% in Precision, 11.8% in F1-score, 15.1% in Matthews Correlation Coefficient (MCC), 8.1% in Area Under the Receiver Operating Characteristic curve (AUROC) and 14.5% in Area Under the Precision-Recall curve (AUPRC).


Subject(s)
Protein Interaction Mapping , Proton Pump Inhibitors , Protein Interaction Mapping/methods , Proteins/chemistry , Area Under Curve , ROC Curve
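
The threshold-based metrics named in the abstract of record 18 (Accuracy, Precision, F1-score and MCC) can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' evaluation code; the function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives. Undefined ratios fall back to 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # Matthews Correlation Coefficient: ranges from -1 to 1 and uses all
    # four counts, which makes it informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, not taken from any of the listed papers.
example = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Interaction sites are typically a minority of residues, which is one reason MCC is often reported alongside Accuracy for tasks like this one.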
19.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37328701

ABSTRACT

Circular RNA (circRNA) is closely associated with human diseases. Accordingly, identifying the associations between human diseases and circRNA can help in disease prevention, diagnosis and treatment. Traditional methods are time-consuming and laborious. Computational models can effectively predict potential circRNA-disease associations (CDAs), but they are restricted by limited data, which leads to high-dimensional and imbalanced data. In this study, we propose a model based on automatically selected meta-paths and contrastive learning, called the MPCLCDA model. First, the model constructs a new heterogeneous network based on circRNA similarity, disease similarity and known associations via automatically selected meta-paths, and obtains low-dimensional fusion features of nodes via graph convolutional networks. Then, contrastive learning is used to further optimize the fusion features and obtain node features that make the distinction between positive and negative samples more evident. Finally, circRNA-disease scores are predicted through a multilayer perceptron. The proposed method is compared with advanced methods on four datasets. The average area under the receiver operating characteristic curve, area under the precision-recall curve and F1 score under 5-fold cross-validation reached 0.9752, 0.9831 and 0.9745, respectively. In addition, case studies on human diseases further demonstrate the predictive ability and application value of this method.


Subject(s)
Neural Networks, Computer , RNA, Circular , Humans , RNA, Circular/genetics , ROC Curve , Computational Biology/methods , Algorithms
20.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36790856

ABSTRACT

Potential miRNA-disease associations (MDA) play an important role in the discovery of complex human disease etiology. Therefore, MDA prediction is an attractive research topic in the field of biomedical machine learning. Recently, several models have been proposed for this task, but their performance is limited by over-reliance on relevant network information with noisy graph structure connections. Moreover, the application of self-supervised graph structure learning to MDA tasks remains unexplored. Our study is the first to use multi-view self-supervised contrastive learning (MSGCL) for MDA prediction. Specifically, we generated a learner view without association labels of miRNAs and diseases as input, and utilized the known association network to generate an anchor view that provides guiding signals for the learner view. The graph structure was optimized by designing a contrastive loss to maximize the consistency between the anchor and learner views. Our model is similar to a pre-trained model that continuously optimizes upstream tasks for high-quality association graph topology, thereby enhancing the latent representation of association predictions. The experimental results show that our proposed method outperforms state-of-the-art methods by 2.79% and 3.20% in area under the receiver operating characteristic curve (AUC) and area under the precision/recall curve (AUPR), respectively.


Subject(s)
Machine Learning , MicroRNAs , Humans , Area Under Curve , MicroRNAs/genetics , ROC Curve
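
Several of the records above report the area under the ROC curve (AUC/AUROC) and the area under the precision-recall curve (AUPR/AUPRC). As a rough illustration of what these numbers measure, the sketch below computes AUROC as the probability that a randomly chosen positive outscores a randomly chosen negative (ties count one half), and estimates AUPR with average precision, which is one common estimator among several. The function names and the toy labels and scores are assumptions made for this example; none of the listed papers specifies this code.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    Equivalent to the normalized Mann-Whitney U statistic. Runs in
    O(n_pos * n_neg), which is fine for small illustrative inputs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of precision at the rank of each positive.

    Tied scores keep their input order here (stable sort), which is a
    simplification of how tie handling is usually done.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return precision_sum / hits


# Toy data, not taken from any of the listed papers.
toy_labels = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3]
toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
```

On heavily imbalanced association data, AUPR tends to be the more sensitive of the two measures, which may be why both are reported together in these abstracts.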