1.
Nucleic Acids Res ; 52(W1): W533-W539, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38742639

ABSTRACT

Prediction of conformational B-cell epitopes is a crucial task in vaccine design and development. In this work, we have developed SEMA 2.0, a user-friendly web platform that enables the research community to tackle the B-cell epitope prediction problem using state-of-the-art protein language models. SEMA 2.0 offers comprehensive research tools for sequence- and structure-based conformational B-cell epitope prediction, accurate identification of N-glycosylation sites, and a distinctive module for comparing the structures of antigen B-cell epitopes, enhancing our ability to analyze and understand their immunogenic properties. The SEMA 2.0 website https://sema.airi.net is free and open to all users and there is no login requirement. Source code is available at https://github.com/AIRI-Institute/SEMAi.


Subjects
B-Lymphocyte Epitopes , Internet , Software , B-Lymphocyte Epitopes/immunology , B-Lymphocyte Epitopes/chemistry , Artificial Intelligence , Humans , Protein Conformation , Glycosylation
2.
Bioinformatics ; 39(11), 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37935419

ABSTRACT

MOTIVATION: Accurate prediction of change in protein stability due to point mutations is an attractive goal that remains unachieved. Despite the high interest in this area, little consideration has been given to the transformer architecture, which is dominant in many fields of machine learning. RESULTS: In this work, we introduce PROSTATA, a predictive model built in a knowledge-transfer fashion on a new curated dataset. PROSTATA demonstrates an advantage over existing solutions based on neural networks. We show that the large improvement margin is due to both the architecture of the model and the quality of the new training dataset. This work opens up opportunities to develop new lightweight and accurate models for protein stability assessment. AVAILABILITY AND IMPLEMENTATION: PROSTATA is available at https://github.com/AIRI-Institute/PROSTATA and https://prostata.airi.net.


Subjects
Machine Learning , Computer Neural Networks , Point Mutation , Protein Stability
3.
Nucleic Acids Res ; 49(D1): D1347-D1350, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33245779

ABSTRACT

Genome-wide association studies have provided a vast array of publicly available SNP × phenotype association results. However, they are often in disparate repositories and formats, making downstream analyses difficult and time consuming. PheLiGe (https://phelige.com) is a database that provides easy access to such results via a web interface. The underlying database currently stores >75 billion genotype-phenotype associations from 7347 genome-wide and 1.2 million region-wide (e.g. cis-eQTL) association scans. The web interface allows for investigation of regional genotype-phenotype associations across many phenotypes, giving insights into the biological function affected by the variant in question. Furthermore, PheLiGe can compare regional patterns of association between different traits. This analysis can ascertain whether a co-association is due to pleiotropy or linkage. Moreover, comparison of association patterns for a complex trait of interest and gene expression and protein levels can implicate causal genes.


Subjects
Genetic Databases , Disease/genetics , Genetic Association Studies , Genetic Predisposition to Disease , Single Nucleotide Polymorphism , Software , Genetic Linkage , Human Genome , Genome-Wide Association Study , Genotype , Humans , Internet , Phenotype , Heritable Quantitative Trait
4.
PLoS Genet ; 15(4): e1008110, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30998689

ABSTRACT

Varicose veins of lower extremities (VVs) are a common multifactorial vascular disease. Genetic factors underlying VVs development remain largely unknown. Here we report the first large-scale study of VVs performed on freely available genetic data of 408,455 European-ancestry individuals. We identified 12 reliably associated loci that explain 13% of the SNP-based heritability, and prioritized the most likely causal genes CASZ1, PIEZO1, PPP3R1, EBF1, STIM2, HFE, GATA2, NFATC2, and SOX9. VVs-associated variants within these loci exhibited pleiotropic effects on several phenotypes including blood pressure/hypertension and blood cell traits. Gene set enrichment analysis revealed gene categories related to abnormal vasculogenesis. Genetic correlation analysis confirmed known epidemiological associations between VVs and deep venous thrombosis, weight, rough labor, and standing job, and found a genetic overlap with multiple traits that have not been previously suspected to share common genetic background with VVs. These traits included educational attainment, fluid intelligence and prospective memory scores, walking pace (negative correlation with VVs), smoking, height, number of operations, pain, and gonarthrosis (positive correlation with VVs). Finally, Mendelian randomization analysis provided evidence for causal effects of plasma levels of MICB and CD209 proteins, and anthropometric traits such as waist and hip circumference, height, weight, and both fat and fat-free mass. Our results provide novel insight into both VVs genetics and etiology. The revealed genes and proteins can be considered as good candidates for follow-up functional studies and might be of interest as potential drug targets.


Subjects
Disease Susceptibility , Lower Extremity/blood supply , Lower Extremity/pathology , Varicose Veins/etiology , Varicose Veins/pathology , Biomarkers , Computational Biology/methods , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Single Nucleotide Polymorphism , Heritable Quantitative Trait
5.
Gigascience ; 12, 2023 Mar 20.
Article in English | MEDLINE | ID: mdl-36971292

ABSTRACT

Interpretation of noncoding genomic variants is one of the most important challenges in human genetics. Machine learning methods have emerged recently as a powerful tool to solve this problem. State-of-the-art approaches allow prediction of transcriptional and epigenetic effects caused by noncoding mutations. However, these approaches require specific experimental data for training and cannot generalize across cell types where required features were not experimentally measured. We show here that available epigenetic characteristics of human cell types are extremely sparse, limiting those approaches that rely on specific epigenetic input. We propose a new neural network architecture, DeepCT, which can learn complex interconnections of epigenetic features and infer unmeasured data from any available input. Furthermore, we show that DeepCT can learn cell type-specific properties, build biologically meaningful vector representations of cell types, and utilize these representations to generate cell type-specific predictions of the effects of noncoding variations in the human genome.


Subjects
Deep Learning , Humans , Computer Neural Networks , Machine Learning , Human Genome
6.
Front Immunol ; 13: 960985, 2022.
Article in English | MEDLINE | ID: mdl-36189325

ABSTRACT

One of the primary tasks in vaccine design and development of immunotherapeutic drugs is to predict conformational B-cell epitopes corresponding to primary antibody binding sites within the antigen tertiary structure. To date, multiple approaches have been developed to address this issue. However, for a wide range of antigens their accuracy is limited. In this paper, we applied the transfer learning approach using pretrained deep learning models to develop a model that predicts conformational B-cell epitopes based on the primary antigen sequence and tertiary structure. A pretrained protein language model, ESM-1v, and an inverse folding model, ESM-IF1, were fine-tuned to quantitatively predict antibody-antigen interaction features and distinguish between epitope and non-epitope residues. The resulting model called SEMA demonstrated the best performance on an independent test set with ROC AUC of 0.76 compared to peer-reviewed tools. We show that SEMA can quantitatively rank the immunodominant regions within the SARS-CoV-2 RBD domain. SEMA is available at https://github.com/AIRI-Institute/SEMAi and the web-interface http://sema.airi.net.
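
For readers unfamiliar with the metric quoted in this abstract, ROC AUC can be read as the probability that a randomly chosen epitope residue receives a higher score than a randomly chosen non-epitope residue. The following is a minimal Python sketch of that definition only; it is not the SEMA evaluation code, and the function name `roc_auc` and the toy labels and scores are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """Rank-based ROC AUC: chance that a positive outscores a negative.

    labels: iterable of 0/1 (1 = epitope residue, in this illustration)
    scores: iterable of predicted scores, same length as labels
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two epitope residues, three non-epitope residues.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.35, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the reported 0.76.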


Subjects
COVID-19 , Vaccines , Antigens , B-Lymphocyte Epitopes , Humans , Immunodominant Epitopes , Machine Learning , SARS-CoV-2
7.
Eur J Hum Genet ; 29(7): 1082-1091, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33664501

ABSTRACT

Adult height inspired the first biometrical and quantitative genetic studies and is a test-case trait for understanding heritability. The studies of height led to the formulation of the classical polygenic model, which has a profound influence on the way we view and analyse complex traits. An essential part of the classical model is an assumption of additivity of effects and normality of the distribution of the residuals. However, it may be expected that the normal approximation will become insufficient in bigger studies. Here, we demonstrate that when the height of hundreds of thousands of individuals is analysed, the model complexity needs to be increased to include non-additive interactions between sex, environment and genes. Alternatively, the use of log-normal approximation allowed us to still use the additive effects model. These findings are important for future genetic and methodologic studies that make use of adult height as an exemplar trait.


Subjects
Body Height , Heritable Quantitative Trait , Reference Values , Adult , Algorithms , Biological Specimen Banks , Body Height/genetics , Female , Genome-Wide Association Study , Humans , Male , Genetic Models , Multifactorial Inheritance , Population Surveillance , United Kingdom
8.
Transl Anim Sci ; 4(1): 264-274, 2020 Jan.
Article in English | MEDLINE | ID: mdl-32704985

ABSTRACT

Genomic selection is routinely used worldwide in agricultural breeding. However, in Russia, it is still not used to its full potential, partially due to high genotyping costs. The use of genotypes imputed from low-density chips (LD-chips) provides a valuable opportunity for reducing the genotyping costs. Pork production in Russia is based on the conventional 3-tier pyramid involving 3 breeds; therefore, the best option would be the development of a single LD-chip that could be used for all of them. Here, for the first time, we have analyzed genomic variability in 3 breeds of Russian pigs, namely Landrace, Duroc, and Large White, and generated an LD-chip that can be used in pig breeding with negligible loss in genotyping quality. We have demonstrated that out of the 3 methods commonly used for LD-chip construction, the block method shows the best results. The imputation quality depends strongly on the presence of close ancestors in the reference population. We have demonstrated that for animals with both parents genotyped using high-density panels, high-quality genotypes (allelic discordance rate < 0.05) could be obtained using a 300 single nucleotide polymorphism (SNP) chip, while in the absence of genotyped ancestors at least 2,000 SNP markers are required. We have shown that imputation quality varies between chromosomes, and it is lower near the chromosome ends and drops with the increase in minor allele frequency. Imputation quality of the individual SNPs correlated well across breeds. Using the same LD-chip, we were able to obtain comparable imputation quality in all 3 breeds, so it may be suggested that a single chip could be used for all of them. Our findings also suggest that the presence of markers with extremely low imputation quality is likely to be explained by incorrect mapping of the markers to the chromosomal positions.
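
The abstract does not define the allelic discordance rate it reports. A minimal Python sketch follows, assuming one common definition: the fraction of allele copies that differ between true and imputed genotypes, with each genotype coded as 0, 1, or 2 copies of the alternate allele. The function name and toy genotypes are hypothetical, and the authors' exact computation may differ.

```python
def allelic_discordance_rate(true_genotypes, imputed_genotypes):
    """Fraction of discordant alleles between true and imputed genotypes.

    Assumption: genotypes are coded 0, 1, or 2 (copies of the alternate
    allele), so each genotype contributes 2 alleles and the absolute
    difference between codes counts the mismatched alleles.
    """
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must have the same length")
    if not true_genotypes:
        raise ValueError("at least one genotype is required")
    mismatched = sum(abs(t - i) for t, i in zip(true_genotypes, imputed_genotypes))
    return mismatched / (2 * len(true_genotypes))


# Toy data (hypothetical): one of eight alleles is imputed incorrectly.
toy_true = [0, 1, 2, 1]
toy_imputed = [0, 1, 1, 1]
toy_rate = allelic_discordance_rate(toy_true, toy_imputed)
```

Under this assumed definition, the quality threshold quoted in the abstract would correspond to fewer than 5 mismatched alleles per 100.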

9.
Sci Rep ; 10(1): 10486, 2020 Jun 26.
Article in English | MEDLINE | ID: mdl-32591598

ABSTRACT

Genome-wide association studies have led to significant progress in the identification of genomic loci affecting coronary artery disease (CAD) risk. However, revealing the causal genes responsible for the observed associations is challenging. In the present study, we aimed to prioritize CAD-relevant genes based on cumulative evidence from the published studies and our own study of colocalization between eQTLs and loci associated with CAD using the SMR/HEIDI approach. Prior knowledge of candidate genes was extracted from both experimental and in silico studies, employing different prioritization algorithms. Our review systematized information for a total of 51 CAD-associated loci. We pinpointed 37 genes in 36 loci. We infer that 27 of these genes are causal for CAD and judge 10 further genes most likely causal. Colocalization analysis showed that for 18 out of these loci, association with CAD can be explained by changes in gene expression in one or more CAD-relevant tissues. Furthermore, for 8 out of 36 loci, existing evidence suggested additional CAD-associated genes. For the remaining 15 loci, we concluded that evidence for gene prioritization remains inconsistent, insufficient, or absent. Our results provide deeper insights into the genetic etiology of CAD and demonstrate knowledge gaps where further research is warranted.


Subjects
Coronary Artery Disease/genetics , Genetic Predisposition to Disease/genetics , Computer Simulation , Genome-Wide Association Study/methods , Genomics/methods , Humans , Single Nucleotide Polymorphism/genetics , Quantitative Trait Loci/genetics , Risk Factors
10.
Nutrients ; 10(5), 2018 May 08.
Article in English | MEDLINE | ID: mdl-29738477

ABSTRACT

Personalized nutrition is of increasing interest to individuals actively monitoring their health. The relations between the duration of diet intervention and the effects on gut microbiota have yet to be elucidated. Here we examined the associations of short-term dietary changes, long-term dietary habits and lifestyle with gut microbiota. Stool samples from 248 citizen-science volunteers were collected before and after a self-reported 2-week personalized diet intervention, then analyzed using 16S rRNA sequencing. Considerable correlations between long-term dietary habits and gut community structure were detected. A higher intake of vegetables and fruits was associated with increased levels of butyrate-producing Clostridiales and higher community richness. A paired comparison of the metagenomes before and after the 2-week intervention showed that even a brief, uncontrolled intervention produced profound changes in community structure, resulting in decreased levels of Bacteroidaceae, Porphyromonadaceae and Rikenellaceae families and decreased alpha-diversity, coupled with an increase of Methanobrevibacter, Bifidobacterium, Clostridium and butyrate-producing Lachnospiraceae, as well as the prevalence of a permatype (a bootstrapping-based variation of enterotype) associated with a higher diversity of diet. The response of microbiota to the intervention was dependent on the initial microbiota state. These findings pave the way for the development of an individualized diet.


Subjects
Diet , Gastrointestinal Microbiome , Bacteroidetes/genetics , Bacteroidetes/isolation & purification , Bifidobacterium/genetics , Bifidobacterium/isolation & purification , Clostridium/genetics , Clostridium/isolation & purification , Cluster Analysis , Feces/chemistry , Feces/microbiology , Humans , Metagenome , Methanobrevibacter/genetics , Methanobrevibacter/isolation & purification , 16S Ribosomal RNA/genetics , Sample Size , DNA Sequence Analysis