Results 1 - 5 of 5
1.
J Bioinform Comput Biol ; 3(3): 743-70, 2005 Jun.
Article in English | MEDLINE | ID: mdl-16108092

ABSTRACT

Researchers, hindered by a lack of standard gene and protein-naming conventions, endure long, sometimes fruitless, literature searches. A system that is able to automatically assign gene names to their LocusLink ID (LLID) in previously unseen MEDLINE abstracts is described. The system is based on supervised learning and builds a model for each LLID. The training sets for all LLIDs are extracted automatically from MEDLINE references in the LocusLink and SwissProt databases. A validation was done of the performance for all 20,546 human genes with LLIDs. Of these, 7344 produced good quality models (F-measure >0.7, nearly 60% of which were >0.9) and 13,202 did not, mainly due to insufficient numbers of known document references. A hand validation of MEDLINE documents for a set of 66 genes agreed well with the system's internal accuracy assessment. It is concluded that it is possible to achieve high quality gene disambiguation using scaleable automated techniques.


Subjects
Algorithms , Genes , MEDLINE , Natural Language Processing , Proteins/classification , Software , Terminology as Topic , Databases, Protein , Humans , Vocabulary, Controlled
2.
Nat Genet ; 47(1): 73-7, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25420144

ABSTRACT

Temple-Baraitser syndrome (TBS) is a multisystem developmental disorder characterized by intellectual disability, epilepsy, and hypoplasia or aplasia of the nails of the thumb and great toe. Here we report damaging de novo mutations in KCNH1 (encoding a protein called ether à go-go, EAG1 or KV10.1), a voltage-gated potassium channel that is predominantly expressed in the central nervous system (CNS), in six individuals with TBS. Characterization of the mutant channels in both Xenopus laevis oocytes and human HEK293T cells showed a decreased threshold of activation and delayed deactivation, demonstrating that TBS-associated KCNH1 mutations lead to deleterious gain of function. Consistent with this result, we find that two mothers of children with TBS, who have epilepsy but are otherwise healthy, are low-level (10% and 27%) mosaic carriers of pathogenic KCNH1 mutations. Consistent with recent reports, this finding demonstrates that the etiology of many unresolved CNS disorders, including epilepsies, might be explained by pathogenic mosaic mutations.


Subjects
Epilepsy/genetics , Ether-A-Go-Go Potassium Channels/genetics , Hallux/abnormalities , Intellectual Disability/genetics , Mutation, Missense , Nails, Malformed/genetics , Thumb/abnormalities , Amino Acid Sequence , Animals , Child , Child, Preschool , Conserved Sequence , Ether-A-Go-Go Potassium Channels/chemistry , Ether-A-Go-Go Potassium Channels/physiology , Exons/genetics , Female , HEK293 Cells , Humans , Infant , Male , Molecular Sequence Data , Mosaicism , Oocytes , Protein Conformation , Recombinant Fusion Proteins/metabolism , Sequence Homology, Amino Acid , Xenopus laevis
3.
J Comput Biol ; 21(6): 405-19, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24874280

ABSTRACT

The analysis of whole-genome or exome sequencing data from trios and pedigrees has been successfully applied to the identification of disease-causing mutations. However, most methods used to identify and genotype genetic variants from next-generation sequencing data ignore the relationships between samples, resulting in significant Mendelian errors, false positives and negatives. Here we present a Bayesian network framework that jointly analyzes data from all members of a pedigree simultaneously using Mendelian segregation priors, yet providing the ability to detect de novo mutations in offspring, and is scalable to large pedigrees. We evaluated our method by simulations and analysis of whole-genome sequencing (WGS) data from a 17-individual, 3-generation CEPH pedigree sequenced to 50× average depth. Compared with singleton calling, our family caller produced more high-quality variants and eliminated spurious calls as judged by common quality metrics such as Ti/Tv, Het/Hom ratios, and dbSNP/SNP array data concordance, and by comparing to ground truth variant sets available for this sample. We identify all previously validated de novo mutations in NA12878, concurrent with a 7× precision improvement. Our results show that our method is scalable to large genomics and human disease studies.


Subjects
Genome, Human , High-Throughput Nucleotide Sequencing , Mutation , Pedigree , DNA Mutational Analysis/methods , Humans
5.
Article in English | MEDLINE | ID: mdl-16448034

ABSTRACT

Researchers, hindered by a lack of standard gene and protein-naming conventions, endure long, sometimes fruitless, literature searches. A system is described which is able to automatically assign gene names to their LocusLink ID (LLID) in previously unseen MEDLINE abstracts. The system is based on supervised learning and builds a model for each LLID. The training sets for all LLIDs are extracted automatically from MEDLINE references in the LocusLink and SwissProt databases. A validation was done of the performance for all 20,546 human genes with LLIDs. Of these, 7,344 produced good quality models (F-measure > 0.7, nearly 60% of which were > 0.9) and 13,202 did not, mainly due to insufficient numbers of known document references. A hand validation of MEDLINE documents for a set of 66 genes agreed well with the system's internal accuracy assessment. It is concluded that it is possible to achieve high quality gene disambiguation using scaleable automated techniques.


Subjects
Databases, Protein , Information Storage and Retrieval/methods , MEDLINE , Natural Language Processing , Proteins/classification , Software , Terminology as Topic , Genes , User-Computer Interface , Vocabulary, Controlled