1.
J Allergy Clin Immunol ; 153(6): 1655-1667, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38154666

ABSTRACT

BACKGROUND: Functional T-cell responses are essential for virus clearance and long-term protection after severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, whereas certain clinical factors, such as older age and immunocompromise, are associated with worse outcomes. OBJECTIVE: We sought to study the breadth and magnitude of T-cell responses in patients with coronavirus disease 2019 (COVID-19) and in individuals with inborn errors of immunity (IEIs) who had received COVID-19 mRNA vaccine. METHODS: Using high-throughput sequencing and bioinformatics tools to characterize the T-cell receptor β repertoire signatures in 540 individuals after SARS-CoV-2 infection, 31 IEI recipients of COVID-19 mRNA vaccine, and healthy controls, we quantified HLA class I- and class II-restricted SARS-CoV-2-specific responses and also identified several HLA allele-clonotype motif associations in patients with COVID-19, including a subcohort of anti-type I interferon (IFN-I)-positive patients. RESULTS: Our analysis revealed that elderly patients with COVID-19 with critical disease manifested lower SARS-CoV-2 T-cell clonotype diversity as well as T-cell responses with reduced magnitude, whereas the SARS-CoV-2-specific clonotypes targeted a broad range of HLA class I- and class II-restricted epitopes across the viral proteome. The presence of anti-IFN-I antibodies was associated with certain HLA alleles. Finally, COVID-19 mRNA immunization induced an increase in the breadth of SARS-CoV-2-specific clonotypes in patients with IEIs, including those who had failed to seroconvert. CONCLUSIONS: Elderly individuals have impaired capacity to develop broad and sustained T-cell responses after SARS-CoV-2 infection. Genetic factors may play a role in the production of anti-IFN-I antibodies. COVID-19 mRNA vaccines are effective in inducing T-cell responses in patients with IEIs.
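
The abstract does not define its repertoire metrics. As a rough illustration only, the sketch below computes two commonly used summaries from a table of clonotype counts: Shannon diversity, and "breadth" taken here as the fraction of distinct clonotypes flagged as SARS-CoV-2-specific. The function names, the input layout, the example CDR3 strings and this definition of breadth are all assumptions made for illustration, not the authors' pipeline.

```python
import math
from collections import Counter

def shannon_diversity(counts):
    """Shannon entropy (natural log) of a clonotype count distribution."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)

def breadth(counts, specific):
    """Fraction of distinct clonotypes that belong to the `specific` set."""
    if not counts:
        return 0.0
    return sum(1 for clone in counts if clone in specific) / len(counts)

# Toy repertoire: made-up CDR3 amino-acid sequences and their read counts.
repertoire = Counter({"CASSLGQAYEQYF": 5, "CASSPGTGGYEQYF": 3, "CASRDRGNTEAFF": 2})
# Hypothetical set of clonotypes annotated as SARS-CoV-2-associated.
specific = {"CASSLGQAYEQYF"}

diversity = shannon_diversity(repertoire)
frac_specific = breadth(repertoire, specific)
```

A lower `diversity` value means the repertoire is dominated by fewer clonotypes. In this toy table one of three distinct clonotypes is flagged, so `frac_specific` is one third.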


Subjects
COVID-19 , Immunocompromised Host , SARS-CoV-2 , Humans , COVID-19/immunology , SARS-CoV-2/immunology , Male , Middle Aged , Female , Immunocompromised Host/immunology , Adult , Aged , T-Lymphocytes/immunology , COVID-19 Vaccines/immunology , Immunocompetence/immunology
2.
Bioinformatics ; 37(13): 1884-1890, 2021 Jul 27.
Article in English | MEDLINE | ID: mdl-33471061

ABSTRACT

MOTIVATION: Automatic phenotype concept recognition from unstructured text remains a challenging task in biomedical text mining research. Previous works that address the task typically use dictionary-based matching methods, which can achieve high precision but suffer from lower recall. Recently, machine learning-based methods have been proposed to identify biomedical concepts, which can recognize more unseen concept synonyms by automatic feature learning. However, most methods require large corpora of manually annotated data for model training, which are difficult to obtain due to the high cost of human annotation. RESULTS: In this article, we propose PhenoTagger, a hybrid method that combines both dictionary and machine learning-based methods to recognize Human Phenotype Ontology (HPO) concepts in unstructured biomedical text. We first use all concepts and synonyms in HPO to construct a dictionary, which is then used to automatically build a distantly supervised training dataset for machine learning. Next, a cutting-edge deep learning model is trained to classify each candidate phrase (n-gram from input sentence) into a corresponding concept label. Finally, the dictionary and machine learning-based prediction results are combined for improved performance. Our method is validated with two HPO corpora, and the results show that PhenoTagger compares favorably to previous methods. In addition, to demonstrate the generalizability of our method, we retrained PhenoTagger using the disease ontology MEDIC for disease concept recognition to investigate the effect of training on different ontologies. Experimental results on the NCBI disease corpus show that PhenoTagger, without requiring manually annotated training data, achieves competitive performance compared with state-of-the-art supervised methods. AVAILABILITY AND IMPLEMENTATION: The source code, API information and data for PhenoTagger are freely available at https://github.com/ncbi-nlp/PhenoTagger.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
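
The dictionary half of the hybrid approach described above can be sketched in a few lines. The example below is a minimal illustration, not PhenoTagger's code: it enumerates n-gram candidates from a sentence, looks them up in a tiny hand-written dictionary, and shows one simple way such a dictionary could label n-grams for distant supervision. The function names, the `max_n` parameter, the "NONE" label and the four-entry dictionary are assumptions; a real dictionary would be built from every HPO concept and synonym, and the trained classifier is omitted entirely.

```python
import re

# Toy dictionary mapping lower-cased surface forms to HPO identifiers.
# A real dictionary would be built from every HPO term and synonym.
DICTIONARY = {
    "seizure": "HP:0001250",
    "seizures": "HP:0001250",
    "microcephaly": "HP:0000252",
    "global developmental delay": "HP:0001263",
}

def candidate_ngrams(sentence, max_n=3):
    """Yield (start, end, phrase) for every n-gram of up to max_n tokens."""
    tokens = re.findall(r"\w+", sentence.lower())
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield i, i + n, " ".join(tokens[i:i + n])

def dictionary_tag(sentence, max_n=3):
    """Return dictionary hits as (phrase, concept_id) pairs."""
    return [(phrase, DICTIONARY[phrase])
            for _, _, phrase in candidate_ngrams(sentence, max_n)
            if phrase in DICTIONARY]

def distant_supervision_examples(sentences):
    """Label every n-gram with a concept ID if it is in the dictionary, else 'NONE'."""
    return [(phrase, DICTIONARY.get(phrase, "NONE"))
            for sentence in sentences
            for _, _, phrase in candidate_ngrams(sentence)]

hits = dictionary_tag("The patient had seizures and global developmental delay.")
```

Exact lookup of this kind is precise but misses unseen synonyms, which is the recall gap the abstract says the learned classifier is meant to close.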

3.
J Biomed Inform ; 129: 104059, 2022 May.
Article in English | MEDLINE | ID: mdl-35351638

ABSTRACT

The study aims to develop a neural network model to improve the performance of Human Phenotype Ontology (HPO) concept recognition tools. We used the terms, definitions, and comments about the phenotypic concepts in the HPO database to train our model. The document to be analyzed is first split into sentences and annotated with a base method to generate candidate concepts. The sentences, along with the candidate concepts, are then fed into the pre-trained model for re-ranking. Our model comprises the pre-trained BlueBERT and a feature selection module, followed by a contrastive loss. We re-ranked the results generated by three robust HPO annotation tools and compared the performance against most of the existing approaches. The experimental results show that our model can improve the performance of the existing methods. Notably, it improved the F1 score by 3.0% and 5.6% on the two evaluated datasets compared with the base methods. It removed more than 80% of the false positives predicted by the base methods, resulting in up to 18% improvement in precision. Our model utilizes the descriptive data in the ontology and the contextual information in the sentences for re-ranking. The results indicate that the additional information and the re-ranking model can significantly enhance the precision of HPO concept recognition compared with the base method.
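
The re-ranking step can be pictured as scoring each candidate concept against its sentence and discarding low scorers. The sketch below is only a stand-in for that idea: it replaces the BlueBERT-based scorer with a plain Jaccard token overlap between the sentence and each concept's description. The function names, the `threshold` parameter, the placeholder concept IDs and the descriptions are all hypothetical.

```python
import re

def tokens(text):
    """Lower-cased set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rerank(sentence, candidates, descriptions, threshold=0.1):
    """Score candidates against the sentence; keep those at or above threshold, best first."""
    sent = tokens(sentence)
    scored = [(jaccard(sent, tokens(descriptions[c])), c) for c in candidates]
    kept = [(score, c) for score, c in scored if score >= threshold]
    return [c for score, c in sorted(kept, key=lambda sc: sc[0], reverse=True)]

# Placeholder concept IDs and made-up descriptions, for illustration only.
descriptions = {
    "CONCEPT_A": "seizures that recur over time",
    "CONCEPT_B": "abnormal shape of the skull",
}
ranked = rerank("Recurrent seizures began in infancy",
                ["CONCEPT_A", "CONCEPT_B"], descriptions)
```

Here `CONCEPT_B` shares no tokens with the sentence and is dropped as a likely false positive, which mirrors how a re-ranker can raise precision without adding new candidates.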


Subjects
Language , Neural Networks, Computer , Databases, Factual , Humans , Phenotype
4.
Nucleic Acids Res ; 45(D1): D499-D506, 2017 Jan 04.
Article in English | MEDLINE | ID: mdl-28053164

ABSTRACT

The Papillomavirus Episteme (PaVE) is a database of curated papillomavirus genomic sequences, accompanied by web-based sequence analysis tools. This update describes the addition of major new features. The papillomavirus genomes within PaVE have been further annotated and now include the major spliced mRNA transcripts. Viral genes and transcripts can be visualized on both linear and circular genome browsers. Evolutionary relationships among PaVE reference protein sequences can be analysed using multiple sequence alignments and phylogenetic trees. To assist in viral discovery, PaVE offers a typing tool: a simplified algorithm to determine whether a newly sequenced virus is novel. PaVE also now contains an image library of gross clinical and histopathological images of papillomavirus-infected lesions. Database URL: https://pave.niaid.nih.gov/.


Subjects
Databases, Nucleic Acid , Genome, Viral , Genomics/methods , Papillomaviridae/genetics , Phylogeny , Computational Biology/methods , Molecular Sequence Annotation , Papillomaviridae/classification , Web Browser
5.
Nucleic Acids Res ; 41(Database issue): D571-8, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23093593

ABSTRACT

The goal of the Papillomavirus Episteme (PaVE) is to provide an integrated resource for the analysis of papillomavirus (PV) genome sequences and related information. The PaVE is a freely accessible, web-based tool (http://pave.niaid.nih.gov) created around a relational database, which enables storage, analysis and exchange of sequence information. From a design perspective, the PaVE adopts an Open Source software approach and stresses the integration and reuse of existing tools. Reference PV genome sequences have been extracted from publicly available databases and reannotated using a custom-created tool. To date, the PaVE contains 241 annotated PV genomes, 2245 genes and regions, 2004 protein sequences and 47 protein structures, which users can explore, analyze or download. The PaVE provides scientists with the data and tools needed to accelerate scientific progress for the study and treatment of diseases caused by PVs.


Subjects
Databases, Genetic , Papillomaviridae/genetics , Genome, Viral , Genomics , Internet , Molecular Sequence Annotation , Sequence Analysis , User-Computer Interface , Viral Proteins/chemistry , Viral Proteins/genetics
6.
bioRxiv ; 2022 Feb 02.
Article in English | MEDLINE | ID: mdl-35132409

ABSTRACT

The human immunoglobulin heavy chain (IGH) locus on chromosome 14 includes more than 40 functional copies of the variable gene (IGHV), which, together with the joining genes (IGHJ), diversity genes (IGHD), constant genes (IGHC) and immunoglobulin light chains, code for antibodies that identify and neutralize pathogenic invaders as a part of the adaptive immune system. Because of its highly repetitive sequence composition, the IGH locus has been particularly difficult to assemble or genotype through the use of standard short read sequencing technologies. Here we introduce ImmunoTyper-SR, an algorithmic method for genotype and CNV analysis of the germline IGHV genes using Illumina whole genome sequencing (WGS) data. ImmunoTyper-SR is based on a novel combinatorial optimization formulation that aims to minimize the total edit distance between reads and their assigned IGHV alleles from a given database, with constraints on the number and distribution of reads across each called allele. We have validated ImmunoTyper-SR on 12 individuals with Illumina WGS data from the 1000 Genomes Project, whose IGHV allele composition has been studied extensively through the use of long read and targeted sequencing platforms, as well as nine individuals from the NIAID COVID Consortium who have been subjected to WGS twice. We have then applied ImmunoTyper-SR to 585 samples from the NIAID COVID Consortium to investigate associations between distinct IGHV alleles and anti-type I IFN autoantibodies, which have been linked to COVID-19 severity.
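
The objective named above, the total edit distance between reads and their assigned alleles, can be illustrated with a much simpler greedy stand-in. The sketch below assigns each read independently to its nearest allele and sums the distances; it ignores the constraints on read number and distribution that the abstract describes, so it is not the ImmunoTyper-SR formulation. The function names, allele names and sequences are made up, and each toy read is compared against a whole allele of equal length purely to keep the example short.

```python
def edit_distance(a, b):
    """Levenshtein distance using a rolling one-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]

def assign_reads(reads, alleles):
    """Assign each read to its nearest allele; return (assignment, total cost)."""
    assignment, total = {}, 0
    for read in reads:
        best = min(alleles, key=lambda name: edit_distance(read, alleles[name]))
        assignment[read] = best
        total += edit_distance(read, alleles[best])
    return assignment, total

# Made-up allele sequences and reads, for illustration only.
alleles = {"allele_A": "ACGTACGT", "allele_B": "ACGTTCGA"}
reads = ["ACGTACGT", "ACGTTCGG"]
assignment, total = assign_reads(reads, alleles)
```

When a read is equally close to two alleles, this greedy rule simply picks the first one it sees. Resolving such ties jointly across all reads is what a constrained optimization is for.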

7.
Cell Syst ; 13(10): 808-816.e5, 2022 Oct 19.
Article in English | MEDLINE | ID: mdl-36265467

ABSTRACT

The human immunoglobulin heavy chain (IGH) locus on chromosome 14 includes more than 40 functional copies of the variable gene (IGHV), which are critical for the structure of antibodies that identify and neutralize pathogenic invaders as a part of the adaptive immune system. Because of its highly repetitive sequence composition, the IGH locus has been particularly difficult to assemble or genotype when using standard short-read sequencing technologies. Here, we introduce ImmunoTyper-SR, an algorithmic tool for the genotyping and CNV analysis of the germline IGHV genes on Illumina whole-genome sequencing (WGS) data using a combinatorial optimization formulation that resolves ambiguous read mappings. We validated ImmunoTyper-SR on 12 individuals whose IGHV allele composition had been independently validated, and assessed concordance between WGS replicates from nine individuals. We then applied ImmunoTyper-SR to 585 COVID patients to investigate the associations between IGHV alleles and anti-type I IFN autoantibodies, which were previously associated with COVID-19 severity.


Subjects
COVID-19 , Immunoglobulin Variable Region , Humans , Immunoglobulin Variable Region/genetics , Genotype , COVID-19/genetics , High-Throughput Nucleotide Sequencing , Immunoglobulin Heavy Chains/genetics , Autoantibodies/genetics
8.
Nucleic Acids Res ; 36(Database issue): D892-900, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17962311

ABSTRACT

CEBS (Chemical Effects in Biological Systems) is an integrated public repository for toxicogenomics data, including the study design and timeline, clinical chemistry and histopathology findings and microarray and proteomics data. CEBS contains data derived from studies of chemicals and of genetic alterations, and is compatible with clinical and environmental studies. CEBS is designed to permit the user to query the data using the study conditions and the subject responses and then, having identified an appropriate set of subjects, to move to the microarray module of CEBS to carry out gene signature and pathway analysis. Scope of CEBS: CEBS currently holds 22 studies of rats, four studies of mice and one study of Caenorhabditis elegans. CEBS can also accommodate data from studies of human subjects. Toxicogenomics studies currently in CEBS comprise over 4000 microarray hybridizations, and 75 2D gel images annotated with protein identification performed by MALDI and MS/MS. CEBS contains raw microarray data collected in accordance with MIAME guidelines and provides tools for data selection, pre-processing and analysis resulting in annotated lists of genes of interest. Additionally, clinical chemistry and histopathology findings from over 1500 animals are included in CEBS. CEBS/BID: The BID (Biomedical Investigation Database) is another component of the CEBS system. BID is a relational database used to load and curate study data prior to export to CEBS, in addition to capturing and displaying novel data types such as PCR data, or additional fields of interest, including those defined by the HESI Toxicogenomics Committee (in preparation). BID has been shared with Health Canada and the US Environmental Protection Agency. CEBS is available at http://cebs.niehs.nih.gov. BID can be accessed via the user interface from https://dir-apps.niehs.nih.gov/arc/. Requests for a copy of BID and for depositing data into CEBS or BID can be submitted at http://www.niehs.nih.gov/cebs-df/.


Subjects
Databases, Genetic , Oligonucleotide Array Sequence Analysis , Proteomics , Toxicogenetics , Animals , Humans , Internet , Mice , Rats , Systems Integration , User-Computer Interface
9.
Bioinformatics ; 22(7): 874-82, 2006 Apr 01.
Article in English | MEDLINE | ID: mdl-16410321

ABSTRACT

MOTIVATION: The CEBS data repository is being developed to promote a systems biology approach to understand the biological effects of environmental stressors. CEBS will house data from multiple gene expression platforms (transcriptomics), protein expression and protein-protein interaction (proteomics), and changes in low molecular weight metabolite levels (metabolomics) aligned by their detailed toxicological context. The system will accommodate extensive complex querying in a user-friendly manner. CEBS will store toxicological contexts including the study design details, treatment protocols, animal characteristics and conventional toxicological endpoints such as histopathology findings and clinical chemistry measures. All of these data types can be integrated in a seamless fashion to enable data query and analysis in a biologically meaningful manner. RESULTS: An object model, the SysBio-OM (Xirasagar et al., 2004) has been designed to facilitate the integration of microarray gene expression, proteomics and metabolomics data in the CEBS database system. We now report SysTox-OM as an open source systems toxicology model designed to integrate toxicological context into gene expression experiments. The SysTox-OM model is comprehensive and leverages other open source efforts, namely, the Standard for Exchange of Nonclinical Data (http://www.cdisc.org/models/send/v2/index.html) which is a data standard for capturing toxicological information for animal studies and Clinical Data Interchange Standards Consortium (http://www.cdisc.org/models/sdtm/index.html) that serves as a standard for the exchange of clinical data. Such standardization increases the accuracy of data mining, interpretation and exchange. The open source SysTox-OM model, which can be implemented on various software platforms, is presented here. 
AVAILABILITY: A Unified Modeling Language (UML) depiction of the entire SysTox-OM is available at http://cebs.niehs.nih.gov and the Rational Rose object model package is distributed under an open source license that permits unrestricted academic and commercial use and is available at http://cebs.niehs.nih.gov/cebsdownloads. Currently, the public toxicological data in CEBS can be queried via a web application based on the SysTox-OM at http://cebs.niehs.nih.gov. CONTACT: xirasagars@saic.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology , Database Management Systems , Information Storage and Retrieval/methods , Software Design , Toxicogenetics/methods , Models, Biological , Programming Languages , Proteomics
10.
Bioinformatics ; 20(13): 2004-15, 2004 Sep 01.
Article in English | MEDLINE | ID: mdl-15044233

ABSTRACT

MOTIVATION: To promote a systems biology approach to understanding the biological effects of environmental stressors, the Chemical Effects in Biological Systems (CEBS) knowledge base is being developed to house data from multiple complex data streams in a systems-friendly manner that will accommodate extensive querying from users. Unified data representation via a single object model will greatly aid in integrating data storage and management, and facilitate reuse of software to analyze and display data resulting from diverse differential expression or differential profile technologies. Data streams include, but are not limited to, gene expression analysis (transcriptomics), protein expression and protein-protein interaction analysis (proteomics) and changes in low molecular weight metabolite levels (metabolomics). RESULTS: To enable the integration of microarray gene expression, proteomics and metabolomics data in the CEBS system, we designed an object model, Systems Biology Object Model (SysBio-OM). The model is comprehensive and leverages other open source efforts, namely the MicroArray Gene Expression Object Model (MAGE-OM) and the Proteomics Experiment Data Repository (PEDRo) object model. SysBio-OM is designed by extending MAGE-OM to represent protein expression data elements (including those from PEDRo), protein-protein interaction and metabolomics data. SysBio-OM promotes the standardization of data representation and data quality by facilitating the capture of the minimum annotation required for an experiment. Such standardization refines the accuracy of data mining and interpretation. The open source SysBio-OM model, which can be implemented on varied computing platforms, is presented here. AVAILABILITY: A Unified Modeling Language (UML) depiction of the entire SysBio-OM is available at http://cebs.niehs.nih.gov/SysBioOM/.
The Rational Rose object model package is distributed under an open source license that permits unrestricted academic and commercial use and is available at http://cebs.niehs.nih.gov/cebsdownloads. The database and interface are being built to implement the model and will be available for public use at http://cebs.niehs.nih.gov.


Subjects
Database Management Systems , Databases, Factual , Gene Expression Profiling/methods , Information Storage and Retrieval/methods , Metabolism/physiology , Protein Interaction Mapping/methods , Systems Biology/methods , Models, Biological , Proteomics/methods