1.
Brief Bioinform ; 24(2), 2023 03 19.
Article in English | MEDLINE | ID: mdl-36907658

ABSTRACT

The adaptive immune response to foreign antigens is initiated by T-cell receptor (TCR) recognition of the antigens. Recent experimental advances have enabled the generation of a large amount of data on TCRs and their cognate antigenic targets, allowing machine learning models to predict the binding specificity of TCRs. In this work, we present TEINet, a deep learning framework that utilizes transfer learning to address this prediction problem. TEINet employs two separately pretrained encoders to transform TCR and epitope sequences into numerical vectors, which are subsequently fed into a fully connected neural network to predict their binding specificities. A major challenge for binding specificity prediction is the lack of a unified approach to sampling negative data. Here, we first assess the current negative sampling approaches comprehensively and suggest that the Unified Epitope is the most suitable one. Subsequently, we compare TEINet with three baseline methods and observe that TEINet achieves an average AUROC of 0.760, which outperforms baseline methods by 6.4-26%. Furthermore, we investigate the impacts of the pretraining step and notice that excessive pretraining may lower its transferability to the final prediction task. Our results and analysis show that TEINet can make an accurate prediction using only the TCR sequence (CDR3β) and the epitope sequence, providing novel insights into the interactions between TCRs and epitopes.


Subjects
Deep Learning , Epitopes, T-Lymphocyte , Receptors, Antigen, T-Cell , Protein Binding
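
The abstract above reports performance as AUROC. As a minimal sketch of what that number measures, the following computes AUROC as the fraction of (positive, negative) pairs in which the positive pair receives the higher score, with ties counted as one half. The function name `auroc` and the toy labels and scores are illustrative assumptions and are not taken from TEINet or its data.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random negative.

    labels: iterable of 0/1 (1 = binding pair, 0 = sampled negative pair)
    scores: iterable of predicted binding scores, same length as labels
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy predictions, for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
print(auroc(toy_labels, toy_scores))
```

The pairwise loop is quadratic in the number of examples, which keeps the definition visible; a rank-based implementation would be preferable on large test sets.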
2.
Brief Bioinform ; 22(6), 2021 11 05.
Article in English | MEDLINE | ID: mdl-34109382

ABSTRACT

Attention deficit hyperactivity disorder (ADHD) is a common neurodevelopmental disorder. Although genome-wide association studies (GWAS) identify ADHD-associated risk variants and genes with significant P-values, they may neglect the combined effect of multiple variants with insignificant P-values. Here, we proposed a convolutional neural network (CNN) to distinguish 1033 individuals diagnosed with ADHD from 950 healthy controls according to their genomic data. The model takes the single nucleotide polymorphism (SNP) loci with P-values $\le 1\times 10^{-3}$, i.e. 764 loci, as inputs, and achieved an accuracy of 0.9018, AUC of 0.9570, sensitivity of 0.8980 and specificity of 0.9055. By incorporating saliency analysis for the deep learning network, a total of 96 candidate genes were found, of which 14 genes have been reported in previous ADHD-related studies. Furthermore, joint Gene Ontology enrichment and expression Quantitative Trait Loci analysis identified a potential risk gene for ADHD, EPHA5, with the variant rs4860671. Overall, our CNN deep learning model exhibited high accuracy for ADHD classification and demonstrated that the deep learning model could capture the combined effect of variants with insignificant P-values, which GWAS fails to do. To the best of our knowledge, our model is the first deep learning method for the classification of ADHD with SNP data.


Subjects
Attention Deficit Disorder with Hyperactivity/genetics , Biomarkers , Deep Learning , Genetic Predisposition to Disease , Receptor, EphA5/genetics , Area Under Curve , Attention Deficit Disorder with Hyperactivity/diagnosis , Computational Biology/methods , Gene Ontology , Genome-Wide Association Study , Humans , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Quantitative Trait Loci , ROC Curve
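
The abstract above reports accuracy, sensitivity and specificity for a case/control classifier. The sketch below shows how those three quantities follow from confusion-matrix counts. The function name `binary_metrics` and the toy labels are illustrative assumptions; they do not reproduce the study's data or its evaluation protocol.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from 0/1 labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases called cases
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls called controls
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls called cases
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # true positive rate among cases
        "specificity": tn / (tn + fp),   # true negative rate among controls
    }


# Hypothetical toy labels: 4 cases followed by 6 controls.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(binary_metrics(toy_true, toy_pred))
```

Reporting sensitivity and specificity alongside accuracy matters when the two groups differ in size, since accuracy alone can hide poor performance on the smaller group.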
3.
Comput Struct Biotechnol J ; 20: 1389-1401, 2022.
Article in English | MEDLINE | ID: mdl-35342534

ABSTRACT

SARS-CoV-2 is a single-stranded RNA betacoronavirus with a high mutation rate. The rapidly emerging SARS-CoV-2 variants could increase transmissibility and diminish vaccine protection. However, whether coinfection with multiple SARS-CoV-2 variants exists remains controversial. This study collected 12,986 and 4,113 SARS-CoV-2 genomes from the GISAID database on May 11, 2020 (GISAID20May11), and Apr 1, 2021 (GISAID21Apr1), respectively. With single-nucleotide variant (SNV) and network clique analyses, we constructed single-nucleotide polymorphism (SNP) coexistence networks and discovered maximal SNP cliques of sizes 16 and 34 in the GISAID20May11 and GISAID21Apr1 datasets, respectively. Simulating the transmission routes and SNV accumulations, we discovered a linear relationship between the size of the maximal clique and the number of coinfected variants. We deduced that the COVID-19 cases in GISAID20May11 and GISAID21Apr1 were coinfections with 3.20 and 3.42 variants on average, respectively. Additionally, we performed Nanopore sequencing on 42 COVID-19 patients and discovered recurrent heterozygous SNPs in twenty of the patients, including loci 8,782 and 28,144, which were crucial for SARS-CoV-2 lineage divergence. In conclusion, our findings indicate coinfection with multiple SARS-CoV-2 variants in COVID-19 patients and an increasing number of coinfected variants.
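
The abstract above relies on finding maximal cliques in an SNP coexistence network. As a rough sketch of that kind of analysis, the code below links two SNPs when they occur together in at least `min_support` genomes and then enumerates maximal cliques with the Bron-Kerbosch algorithm with pivoting. The edge rule, the `min_support` parameter, the function names and the toy genomes are all assumptions made for illustration; the abstract does not state how the authors built their networks.

```python
from itertools import combinations


def coexistence_graph(genomes, min_support=1):
    """Adjacency sets linking SNPs that co-occur in >= min_support genomes (assumed rule)."""
    counts = {}
    for snps in genomes:
        for a, b in combinations(sorted(set(snps)), 2):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    adj = {}
    for (a, b), c in counts.items():
        if c >= min_support:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj):
    """Yield every maximal clique (Bron-Kerbosch with pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield r                      # r cannot be extended: it is maximal
            return
        # Pivot on the vertex with the most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}                  # v is fully explored
            x = x | {v}                  # exclude v from later cliques
    yield from expand(set(), set(adj), set())


# Hypothetical toy genomes, each given as the list of SNP labels it carries.
toy_genomes = [["s1", "s2", "s3"], ["s1", "s2"],
               ["s3", "s4", "s5", "s6"], ["s4", "s5", "s6"]]
graph = coexistence_graph(toy_genomes)
largest = max(maximal_cliques(graph), key=len)
print(sorted(largest), len(largest))
```

Clique enumeration is exponential in the worst case, so on networks built from thousands of genomes a tested library routine would likely be a better choice than this hand-written version.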
