Results 1 - 14 of 14
1.
Nat Neurosci ; 26(1): 150-162, 2023 01.
Article in English | MEDLINE | ID: mdl-36482247

ABSTRACT

Amyotrophic lateral sclerosis (ALS) is a progressively fatal neurodegenerative disease affecting motor neurons in the brain and spinal cord. In this study, we investigated gene expression changes in ALS via RNA sequencing in 380 postmortem samples from cervical, thoracic and lumbar spinal cord segments from 154 individuals with ALS and 49 control individuals. We observed an increase in microglia and astrocyte gene expression, accompanied by a decrease in oligodendrocyte gene expression. By creating a gene co-expression network in the ALS samples, we identified several activated microglia modules that negatively correlate with retrospective disease duration. We mapped molecular quantitative trait loci and found several potential ALS risk loci that may act through gene expression or splicing in the spinal cord and assign putative cell types for FNBP1, ACSL5, SH3RF1 and NFASC. Finally, we outline how common genetic variants associated with splicing of C9orf72 act as proxies for the well-known repeat expansion, and we use the same mechanism to suggest ATXN3 as a putative risk gene.


Subjects
Amyotrophic Lateral Sclerosis , Neurodegenerative Diseases , Humans , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/metabolism , Neurodegenerative Diseases/metabolism , Retrospective Studies , Transcriptome , Spinal Cord/metabolism
2.
Cell Genom ; 2(5)2022 May.
Article in English | MEDLINE | ID: mdl-36452119

ABSTRACT

Genome in a Bottle benchmarks are widely used to help validate clinical sequencing pipelines and develop variant calling and sequencing methods. Here we use accurate linked and long reads to expand benchmarks in 7 samples to include difficult-to-map regions and segmental duplications that are challenging for short reads. These benchmarks add more than 300,000 SNVs and 50,000 insertions or deletions (indels) and include 16% more exonic variants, many in challenging, clinically relevant genes not covered previously, such as PMS2. For HG002, we include 92% of the autosomal GRCh38 assembly while excluding regions problematic for benchmarking small variants, such as copy number variants, that should not have been in the previous version, which included 85% of GRCh38. It identifies eight times more false negatives in a short read variant call set relative to our previous benchmark. We demonstrate that this benchmark reliably identifies false positives and false negatives across technologies, enabling ongoing methods development.

3.
Cell ; 185(18): 3426-3440.e19, 2022 09 01.
Article in English | MEDLINE | ID: mdl-36055201

ABSTRACT

The 1000 Genomes Project (1kGP) is the largest fully open resource of whole-genome sequencing (WGS) data consented for public distribution without access or use restrictions. The final, phase 3 release of the 1kGP included 2,504 unrelated samples from 26 populations and was based primarily on low-coverage WGS. Here, we present a high-coverage 3,202-sample WGS 1kGP resource, which now includes 602 complete trios, sequenced to a depth of 30X using Illumina. We performed single-nucleotide variant (SNV) and short insertion and deletion (INDEL) discovery and generated a comprehensive set of structural variants (SVs) by integrating multiple analytic methods through a machine learning model. We show gains in sensitivity and precision of variant calls compared to phase 3, especially among rare SNVs as well as INDELs and SVs spanning frequency spectrum. We also generated an improved reference imputation panel, making variants discovered here accessible for association studies.


Subjects
Human Genome , Whole Genome Sequencing , Female , High-Throughput Nucleotide Sequencing/methods , Humans , INDEL Mutation , Male , Single Nucleotide Polymorphism
4.
Nat Genet ; 53(8): 1125-1134, 2021 08.
Article in English | MEDLINE | ID: mdl-34312540

ABSTRACT

Autism is a highly heritable complex disorder in which de novo mutation (DNM) variation contributes significantly to risk. Using whole-genome sequencing data from 3,474 families, we investigate another source of large-effect risk variation, ultra-rare variants. We report and replicate a transmission disequilibrium of private, likely gene-disruptive (LGD) variants in probands but find that 95% of this burden resides outside of known DNM-enriched genes. This variant class more strongly affects multiplex family probands and supports a multi-hit model for autism. Candidate genes with private LGD variants preferentially transmitted to probands converge on the E3 ubiquitin-protein ligase complex, intracellular transport and Erb signaling protein networks. We estimate that these variants are approximately 2.5 generations old and significantly younger than other variants of similar type and frequency in siblings. Overall, private LGD variants are under strong purifying selection and appear to act on a distinct set of genes not yet associated with autism.


Subjects
Autism Spectrum Disorder/genetics , Genetic Predisposition to Disease , Proteins/genetics , Autistic Disorder/genetics , Molecular Evolution , Gene Dosage , Haplotypes , Humans , Linkage Disequilibrium , Genetic Models , Mutation , Pedigree , Single Nucleotide Polymorphism , Protein Interaction Maps/genetics , Siblings , Whole Genome Sequencing
5.
Science ; 372(6537)2021 04 02.
Article in English | MEDLINE | ID: mdl-33632895

ABSTRACT

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.


Subjects
Genetic Variation , Human Genome , Haplotypes , Female , Genotype , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Interspersed Repetitive Sequences , Male , Population Groups/genetics , Quantitative Trait Loci , Retroelements , DNA Sequence Analysis , Sequence Inversion , Whole Genome Sequencing
6.
BMC Genomics ; 16: 143, 2015 Feb 28.
Article in English | MEDLINE | ID: mdl-25765891

ABSTRACT

BACKGROUND: Identifying insertion/deletion polymorphisms (INDELs) with high confidence has been intrinsically challenging in short-read sequencing data. Here we report our approach for improving INDEL calling accuracy by using a machine learning algorithm to combine call sets generated with three independent methods, and by leveraging the strengths of each individual pipeline. Utilizing this approach, we generated a consensus exome INDEL call set from a large dataset generated by the 1000 Genomes Project (1000G), maximizing both the sensitivity and the specificity of the calls. RESULTS: This consensus exome INDEL call set features 7,210 INDELs, from 1,128 individuals across 13 populations included in the 1000 Genomes Phase 1 dataset, with a false discovery rate (FDR) of about 7.0%. CONCLUSIONS: In our study we further characterize the patterns and distributions of these exonic INDELs with respect to density, allele length, and site frequency spectrum, as well as the potential mutagenic mechanisms of coding INDELs in humans.


Subjects
Exome/genetics , INDEL Mutation/genetics , Mutagenesis , Computational Biology , Human Genome , High-Throughput Nucleotide Sequencing , Human Genome Project , Humans , Machine Learning
7.
Science ; 342(6154): 1235587, 2013 Oct 04.
Article in English | MEDLINE | ID: mdl-24092746

ABSTRACT

Interpreting variants, especially noncoding ones, in the increasing number of personal genomes is challenging. We used patterns of polymorphisms in functionally annotated regions in 1092 humans to identify deleterious variants; then we experimentally validated candidates. We analyzed both coding and noncoding regions, with the former corroborating the latter. We found regions particularly sensitive to mutations ("ultrasensitive") and variants that are disruptive because of mechanistic effects on transcription-factor binding (that is, "motif-breakers"). We also found variants in regions with higher network centrality tend to be deleterious. Insertions and deletions followed a similar pattern to single-nucleotide variants, with some notable exceptions (e.g., certain deletions and enhancers). On the basis of these patterns, we developed a computational tool (FunSeq), whose application to ~90 cancer genomes reveals nearly a hundred candidate noncoding drivers.


Subjects
Genetic Variation , Molecular Sequence Annotation/methods , Neoplasms/genetics , Binding Sites/genetics , Human Genome , Genomics , Humans , Kruppel-Like Transcription Factors/metabolism , Mutation , Single Nucleotide Polymorphism , Population/genetics , Untranslated RNA/genetics , Genetic Selection
8.
BMC Bioinformatics ; 14: 53, 2013 Feb 14.
Article in English | MEDLINE | ID: mdl-23409969

ABSTRACT

BACKGROUND: Gene Ontology (GO) enrichment analysis remains one of the most common methods for hypothesis generation from high throughput datasets. However, we believe that researchers strive to test other hypotheses that fall outside of GO. Here, we developed and evaluated a tool for hypothesis generation from gene or protein lists using ontological concepts present in manually curated text that describes those genes and proteins. RESULTS: As a consequence we have developed the method Statistical Tracking of Ontological Phrases (STOP) that expands the realm of testable hypotheses in gene set enrichment analyses by integrating automated annotations of genes to terms from over 200 biomedical ontologies. While not as precise as manually curated terms, we find that the additional enriched concepts have value when coupled with traditional enrichment analyses using curated terms. CONCLUSION: Multiple ontologies have been developed for gene and protein annotation, by using a dataset of both manually curated GO terms and automatically recognized concepts from curated text we can expand the realm of hypotheses that can be discovered. The web application STOP is available at http://mooneygroup.org/stop/.


Subjects
Genes , Molecular Sequence Annotation , Proteins , Software , Controlled Vocabulary , Humans , Huntington Disease/genetics , Huntington Disease/metabolism , Internet , Parkinson Disease/genetics , Parkinson Disease/metabolism , Protein Interaction Mapping
9.
BMC Genomics ; 13 Suppl 6: S19, 2012.
Article in English | MEDLINE | ID: mdl-23134663

ABSTRACT

BACKGROUND: Until recently, sequencing has primarily been carried out in large genome centers which have invested heavily in developing the computational infrastructure that enables genomic sequence analysis. The recent advancements in next generation sequencing (NGS) have led to a wide dissemination of sequencing technologies and data, to highly diverse research groups. It is expected that clinical sequencing will become part of diagnostic routines shortly. However, limited accessibility to computational infrastructure and high quality bioinformatic tools, and the demand for personnel skilled in data analysis and interpretation remains a serious bottleneck. To this end, the cloud computing and Software-as-a-Service (SaaS) technologies can help address these issues. RESULTS: We successfully enabled the Atlas2 Cloud pipeline for personal genome analysis on two different cloud service platforms: a community cloud via the Genboree Workbench, and a commercial cloud via the Amazon Web Services using Software-as-a-Service model. We report a case study of personal genome analysis using our Atlas2 Genboree pipeline. We also outline a detailed cost structure for running Atlas2 Amazon on whole exome capture data, providing cost projections in terms of storage, compute and I/O when running Atlas2 Amazon on a large data set. CONCLUSIONS: We find that providing a web interface and an optimized pipeline clearly facilitates usage of cloud computing for personal genome analysis, but for it to be routinely used for large scale projects there needs to be a paradigm shift in the way we develop tools, in standard operating procedures, and in funding mechanisms.


Subjects
Human Genome , Software , Genetic Databases , Humans , Internet , User-Computer Interface
10.
Mol Biol Cell ; 23(24): 4679-88, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23097491

ABSTRACT

Accumulation of insoluble protein in cells is associated with aging and aging-related diseases; however, the roles of insoluble protein in these processes are uncertain. The nature and impact of changes to protein solubility during normal aging are less well understood. Using quantitative mass spectrometry, we identify 480 proteins that become insoluble during postmitotic aging in Saccharomyces cerevisiae and show that this ensemble of insoluble proteins is similar to those that accumulate in aging nematodes. SDS-insoluble protein is present exclusively in a nonquiescent subpopulation of postmitotic cells, indicating an asymmetrical distribution of this protein. In addition, we show that nitrogen starvation of young cells is sufficient to cause accumulation of a similar group of insoluble proteins. Although many of the insoluble proteins identified are known to be autophagic substrates, induction of macroautophagy is not required for insoluble protein formation. However, genetic or chemical inhibition of the Tor1 kinase is sufficient to promote accumulation of insoluble protein. We conclude that target of rapamycin complex 1 regulates accumulation of insoluble proteins via mechanisms acting upstream of macroautophagy. Our data indicate that the accumulation of proteins in an SDS-insoluble state in postmitotic cells represents a novel autophagic cargo preparation process that is regulated by the Tor1 kinase.


Subjects
Autophagy , Nitrogen/metabolism , Phosphatidylinositol 3-Kinases/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Autophagy-Related Protein 7 , Autophagy-Related Proteins , Polyacrylamide Gel Electrophoresis , Mass Spectrometry , Mechanistic Target of Rapamycin Complex 1 , Mitosis , Multiprotein Complexes/metabolism , Mutation , Phosphatidylinositol 3-Kinases/genetics , Phosphorylation , Protein Kinases/genetics , Protein Kinases/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/growth & development , Saccharomyces cerevisiae Proteins/genetics , Sodium Dodecyl Sulfate/chemistry , Solubility , TOR Serine-Threonine Kinases/metabolism , Time Factors
11.
BMC Bioinformatics ; 13: 8, 2012 Jan 12.
Article in English | MEDLINE | ID: mdl-22239737

ABSTRACT

BACKGROUND: Whole exome capture sequencing allows researchers to cost-effectively sequence the coding regions of the genome. Although the exome capture sequencing methods have become routine and well established, there is currently a lack of tools specialized for variant calling in this type of data. RESULTS: Using statistical models trained on validated whole-exome capture sequencing data, the Atlas2 Suite is an integrative variant analysis pipeline optimized for variant discovery on all three of the widely used next generation sequencing platforms (SOLiD, Illumina, and Roche 454). The suite employs logistic regression models in conjunction with user-adjustable cutoffs to accurately separate true SNPs and INDELs from sequencing and mapping errors with high sensitivity (96.7%). CONCLUSION: We have implemented the Atlas2 Suite and applied it to 92 whole exome samples from the 1000 Genomes Project. The Atlas2 Suite is available for download at http://sourceforge.net/projects/atlas2/. In addition to a command line version, the suite has been integrated into the Genboree Workbench, allowing biomedical scientists with minimal informatics expertise to remotely call, view, and further analyze variants through a simple web interface. The existing genomic databases displayed via the Genboree browser also streamline the process from variant discovery to functional genomics analysis, resulting in an off-the-shelf toolkit for the broader community.


Subjects
Exome , Human Genome , Software , Genomics , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Open Reading Frames , Single Nucleotide Polymorphism
12.
Aging Cell ; 11(1): 120-7, 2012 Feb.
Article in English | MEDLINE | ID: mdl-22103665

ABSTRACT

While it is generally recognized that misfolding of specific proteins can cause late-onset disease, the contribution of protein aggregation to the normal aging process is less well understood. To address this issue, a mass spectrometry-based proteomic analysis was performed to identify proteins that adopt sodium dodecyl sulfate (SDS)-insoluble conformations during aging in Caenorhabditis elegans. SDS-insoluble proteins extracted from young and aged C. elegans were chemically labeled by isobaric tagging for relative and absolute quantification (iTRAQ) and identified by liquid chromatography and mass spectrometry. Two hundred and three proteins were identified as being significantly enriched in an SDS-insoluble fraction in aged nematodes and were largely absent from a similar protein fraction in young nematodes. The SDS-insoluble fraction in aged animals contains a diverse range of proteins including a large number of ribosomal proteins. Gene ontology analysis revealed highly significant enrichments for energy production and translation functions. Expression of genes encoding insoluble proteins observed in aged nematodes was knocked down using RNAi, and effects on lifespan were measured. 41% of genes tested were shown to extend lifespan after RNAi treatment, compared with 18% in a control group of genes. These data indicate that genes encoding proteins that become insoluble with age are enriched for modifiers of lifespan. This demonstrates that proteomic approaches can be used to identify genes that modify lifespan. Finally, these observations indicate that the accumulation of insoluble proteins with diverse functions may be a general feature of aging.


Subjects
Aging/genetics , Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans/genetics , Gene Expression , Longevity/genetics , Animals , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/metabolism , Gene Expression Profiling , Humans , Mass Spectrometry , Proteomics , RNA Interference , Ribosomal Proteins/genetics , Ribosomal Proteins/metabolism , Sodium Dodecyl Sulfate , Solubility , Staining and Labeling
13.
Methods Mol Biol ; 628: 307-19, 2010.
Article in English | MEDLINE | ID: mdl-20238089

ABSTRACT

As databases of genome data continue to grow, our understanding of the functional elements of the genome grows as well. Many genetic changes in the genome have now been discovered and characterized, including both disease-causing mutations and neutral polymorphisms. In addition to experimental approaches to characterize specific variants, over the past decade, there has been intense bioinformatic research to understand the molecular effects of these genetic changes. In addition to genomic experimental assays, the bioinformatic efforts have focused on two general areas. First, researchers have annotated genetic variation data with molecular features that are likely to affect function. Second, statistical methods have been developed to predict mutations that are likely to have a molecular effect. In this protocol manuscript, methods for understanding the molecular functions of single nucleotide polymorphisms (SNPs) and mutations are reviewed and described. The intent of this chapter is to provide an introduction to the online tools that are both easy to use and useful.


Subjects
Single Nucleotide Polymorphism , Amino Acid Substitution , Computational Biology , Genetic Predisposition to Disease , Humans , Mutation , RNA Splicing
14.
Hum Mutat ; 31(3): 335-46, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20052762

ABSTRACT

An important challenge in translational bioinformatics is to understand how genetic variation gives rise to molecular changes at the protein level that can precipitate both monogenic and complex disease. To this end, we compiled datasets of human disease-associated amino acid substitutions (AAS) in the contexts of inherited monogenic disease, complex disease, functional polymorphisms with no known disease association, and somatic mutations in cancer, and compared them with respect to predicted functional sites in proteins. Using the sequence homology-based tool SIFT to estimate the proportion of deleterious AAS in each dataset, only complex disease AAS were found to be indistinguishable from neutral polymorphic AAS. Investigation of monogenic disease AAS predicted to be nondeleterious by SIFT were characterized by a significant enrichment for inherited AAS within solvent accessible residues, regions of intrinsic protein disorder, and an association with the loss or gain of various posttranslational modifications. Sites of structural and/or functional interest were therefore surmised to constitute useful additional features with which to identify the molecular disruptions caused by deleterious AAS. A range of bioinformatic tools, designed to predict structural and functional sites in protein sequences, were then employed to demonstrate that intrinsic biases exist in terms of the distribution of different types of human AAS with respect to specific structural, functional and pathological features. Our Web tool, designed to potentiate the functional profiling of novel AAS, has been made available at http://profile.mutdb.org/.


Subjects
Computational Biology/methods , Neoplastic Gene Expression Regulation , Neoplasms/genetics , Genetic Polymorphism , Alleles , Amino Acids/chemistry , Amino Acids/genetics , Gene Expression Profiling , Gene Expression Regulation , Genetic Variation , Glycosylation , Humans , Internet , Missense Mutation , Phosphorylation , Protein Sequence Analysis