Results 1 - 20 of 81
1.
Hum Genomics ; 18(1): 20, 2024 Feb 23.
Article in English | MEDLINE | ID: mdl-38395944

ABSTRACT

BACKGROUND: De novo mutations (DNMs) are variants that occur anew in the offspring of noncarrier parents. They are not inherited from either parent but rather result from endogenous mutational processes involving errors of DNA repair/replication. These spontaneous errors play a significant role in the causation of genetic disorders, and their importance in the context of molecular diagnostic medicine has become steadily more apparent as more DNMs have been reported in the literature. In this study, we examined 46,489 disease-associated DNMs annotated by the Human Gene Mutation Database (HGMD) to ascertain their distribution across gene and disease categories. RESULTS: Most disease-associated DNMs reported to date are found to be associated with developmental and psychiatric disorders, a reflection of the focus of sequencing efforts over the last decade. Of the 13,277 human genes in which DNMs have so far been found, the top-10 genes with the highest proportions of DNM relative to gene size were H3-3A, DDX3X, CSNK2B, PURA, ZC4H2, STXBP1, SCN1A, SATB2, H3-3B and TUBA1A. The distribution of CADD and REVEL scores for both disease-associated DNMs and those mutations not reported to be de novo revealed a trend towards higher deleteriousness for DNMs, consistent with the likely lower selection pressure impacting them. This contrasts with the non-DNMs, which are presumed to have been subject to continuous negative selection over multiple generations. CONCLUSION: This meta-analysis provides important information on the occurrence and distribution of disease-associated DNMs in association with heritable disease and should make a significant contribution to our understanding of this major type of mutation.


Subjects
Germ Cells; Parents; Humans; Mutation
2.
Nature ; 571(7766): 505-509, 2019 07.
Article in English | MEDLINE | ID: mdl-31243369

ABSTRACT

The evolution of gene expression in mammalian organ development remains largely uncharacterized. Here we report the transcriptomes of seven organs (cerebrum, cerebellum, heart, kidney, liver, ovary and testis) across developmental time points from early organogenesis to adulthood for human, rhesus macaque, mouse, rat, rabbit, opossum and chicken. Comparisons of gene expression patterns identified correspondences of developmental stages across species, and differences in the timing of key events during the development of the gonads. We found that the breadth of gene expression and the extent of purifying selection gradually decrease during development, whereas the amount of positive selection and expression of new genes increase. We identified differences in the temporal trajectories of expression of individual genes across species, with brain tissues showing the smallest percentage of trajectory changes, and the liver and testis showing the largest. Our work provides a resource of developmental transcriptomes of seven organs across seven species, and comparative analyses that characterize the development and evolution of mammalian organs.


Subjects
Gene Expression Regulation, Developmental; Organogenesis/genetics; Transcriptome/genetics; Animals; Biological Evolution; Chickens/genetics; Female; Humans; Macaca mulatta/genetics; Male; Mice; Opossums/genetics; Rabbits; Rats
3.
Genome Res ; 31(2): 327-336, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33468550

ABSTRACT

Recent evidence from proteomics and deep massively parallel sequencing studies has revealed that eukaryotic genomes contain substantial numbers of as-yet-uncharacterized open reading frames (ORFs). We define these uncharacterized ORFs as novel ORFs (nORFs). nORFs in humans are mostly under 100 codons and are found in diverse regions of the genome, including in long noncoding RNAs, pseudogenes, 3' UTRs, 5' UTRs, and alternative reading frames of canonical protein coding exons. There is therefore a pressing need to evaluate the potential functional importance of these unannotated transcripts and proteins in biological pathways and human disease on a larger scale, rather than one at a time. In this study, we outline the creation of a valuable nORFs data set with experimental evidence of translation for the community, use measures of heritability and selection that reveal signals for functional importance, and show the potential implications for functional interpretation of genetic variants in nORFs. Our results indicate that some variants that were previously classified as being benign or of uncertain significance may have to be reinterpreted.

4.
Hum Genet ; 142(2): 245-274, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36344696

ABSTRACT

Whilst DNA repeat expansions cause numerous heritable human disorders, their origins and underlying pathological mechanisms are often unclear. We collated a dataset comprising 224 human repeat expansions encompassing 203 different genes, and performed a systematic analysis with respect to key topological features at the DNA, RNA and protein levels. Comparison with controls without known pathogenicity and genomic regions lacking repeats allowed the construction of the first tool to discriminate repeat regions harboring pathogenic repeat expansions (DPREx). At the DNA level, pathogenic repeat expansions exhibited stronger signals for DNA regulatory factors (e.g. H3K4me3, transcription factor-binding sites) in exons, promoters, 5'UTRs and 5'genes but were not significantly different from controls in introns, 3'UTRs and 3'genes. Additionally, pathogenic repeat expansions were also found to be enriched in non-B DNA structures. At the RNA level, pathogenic repeat expansions were characterized by lower free energy for forming RNA secondary structure and were closer to splice sites in introns, exons, promoters and 5'genes than controls. At the protein level, pathogenic repeat expansions exhibited a preference to form coil rather than other types of secondary structure, and tended to encode surface-located protein domains. Guided by these features, DPREx ( http://biomed.nscc-gz.cn/zhaolab/geneprediction/# ) achieved an Area Under the Curve (AUC) value of 0.88 in a test on an independent dataset. Pathogenic repeat expansions are thus located such that they exert a synergistic influence on the gene expression pathway involving inter-molecular connections at the DNA, RNA and protein levels.


Subjects
DNA Repeat Expansion; DNA; Humans; Introns/genetics; RNA; Trinucleotide Repeat Expansion
5.
Nucleic Acids Res ; 49(1): 221-243, 2021 01 11.
Article in English | MEDLINE | ID: mdl-33300026

ABSTRACT

Human genome stability requires efficient repair of oxidized bases, which is initiated via damage recognition and excision by NEIL1 and other base excision repair (BER) pathway DNA glycosylases (DGs). However, the biological mechanisms underlying detection of damaged bases among the million-fold excess of undamaged bases remain enigmatic. Indeed, mutation rates vary greatly within individual genomes, and lesion recognition by purified DGs in the chromatin context is inefficient. Employing super-resolution microscopy and co-immunoprecipitation assays, we find that acetylated NEIL1 (AcNEIL1), but not its non-acetylated form, is predominantly localized in the nucleus in association with epigenetic marks of uncondensed chromatin. Furthermore, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) revealed non-random AcNEIL1 binding near transcription start sites of weakly transcribed genes and along highly transcribed chromatin domains. Bioinformatic analyses revealed a striking correspondence between AcNEIL1 occupancy along the genome and mutation rates, with AcNEIL1-occupied sites exhibiting fewer mutations compared to AcNEIL1-free domains, both in cancer genomes and in population variation. Intriguingly, from the evolutionarily conserved unstructured domain that targets NEIL1 to open chromatin, its damage surveillance of highly oxidation-susceptible sites to preserve essential gene function and to limit instability and cancer likely originated ∼500 million years ago during the buildup of free atmospheric oxygen.


Subjects
Chromatin/physiology; DNA Glycosylases/metabolism; DNA Repair; Protein Processing, Post-Translational; Acetylation; Animals; Cell Line, Tumor; Cell Nucleus/metabolism; Chromatin/ultrastructure; DNA Glycosylases/chemistry; DNA Glycosylases/physiology; DNA Repair/genetics; Datasets as Topic; Evolution, Molecular; Genes, Helminth; Genes, Homeobox; HEK293 Cells; Helminth Proteins/genetics; Humans; Invertebrates/genetics; Invertebrates/metabolism; Lysine/chemistry; Mutation; Neoplasm Proteins/metabolism; Neoplasms/genetics; Neoplasms/metabolism; Neoplasms/mortality; Oxidation-Reduction; Proteome; Sequence Alignment; Sequence Homology, Amino Acid; Transcription Initiation Site; Vertebrates/genetics; Vertebrates/metabolism
6.
Hum Genet ; 139(10): 1197-1207, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32596782

ABSTRACT

The Human Gene Mutation Database (HGMD®) constitutes a comprehensive collection of published germline mutations in nuclear genes that are thought to underlie, or are closely associated with, human inherited disease. At the time of writing (June 2020), the database contains in excess of 289,000 different gene lesions identified in over 11,100 genes manually curated from 72,987 articles published in over 3100 peer-reviewed journals. There are two main groups of users who utilise HGMD on a regular basis: research scientists and clinical diagnosticians. This review aims to highlight how to make the most out of HGMD data in each setting.


Subjects
Databases, Genetic; Genome, Human; Germ-Line Mutation; Polymorphism, Genetic; Bibliometrics; Biomedical Research/methods; Genetic Predisposition to Disease; Humans; Public-Private Sector Partnerships
7.
PLoS Comput Biol ; 15(6): e1007112, 2019 06.
Article in English | MEDLINE | ID: mdl-31199787

ABSTRACT

Differentiation between phenotypically neutral and disease-causing genetic variation remains an open and relevant problem. Among different types of variation, non-frameshifting insertions and deletions (indels) represent an understudied group with widespread phenotypic consequences. To address this challenge, we present a machine learning method, MutPred-Indel, that predicts pathogenicity and identifies types of functional residues impacted by non-frameshifting insertion/deletion variation. The model shows good predictive performance as well as the ability to identify impacted structural and functional residues including secondary structure, intrinsic disorder, metal and macromolecular binding, post-translational modifications, allosteric sites, and catalytic residues. We identify structural and functional mechanisms impacted preferentially by germline variation from the Human Gene Mutation Database, recurrent somatic variation from COSMIC in the context of different cancers, as well as de novo variants from families with autism spectrum disorder. Further, the distributions of pathogenicity prediction scores generated by MutPred-Indel are shown to differentiate highly recurrent from non-recurrent somatic variation. Collectively, we present a framework to facilitate the interrogation of both pathogenicity and the functional effects of non-frameshifting insertion/deletion variants. The MutPred-Indel webserver is available at http://mutpred.mutdb.org/.


Subjects
Genetic Predisposition to Disease/genetics; Genome, Human; INDEL Mutation; Autism Spectrum Disorder/genetics; Autism Spectrum Disorder/physiopathology; Computational Biology; Databases, Genetic; Genome, Human/genetics; Genome, Human/physiology; Humans; INDEL Mutation/genetics; INDEL Mutation/physiology; Machine Learning; ROC Curve
8.
Hum Mutat ; 40(10): 1856-1873, 2019 10.
Article in English | MEDLINE | ID: mdl-31131953

ABSTRACT

It has long been known that canonical 5' splice site (5'SS) GT>GC variants may be compatible with normal splicing. However, to date, the actual scale of canonical 5'SSs capable of generating wild-type transcripts in the case of GT>GC substitutions remains unknown. Herein, combining data derived from a meta-analysis of 45 human disease-causing 5'SS GT>GC variants and a cell culture-based full-length gene splicing assay of 103 5'SS GT>GC substitutions, we estimate that ~15-18% of canonical GT 5'SSs retain their capacity to generate between 1% and 84% normal transcripts when GT is substituted by GC. We further demonstrate that the canonical 5'SSs in which substitution of GT by GC generated normal transcripts exhibit stronger complementarity to the 5' end of U1 snRNA than those sites whose substitutions of GT by GC did not lead to the generation of normal transcripts. We also observed a correlation between the generation of wild-type transcripts and a milder than expected clinical phenotype but found that none of the available splicing prediction tools were capable of reliably distinguishing 5'SS GT>GC variants that generated wild-type transcripts from those that did not. Our findings imply that 5'SS GT>GC variants in human disease genes may not invariably be pathogenic.


Subjects
Alternative Splicing; Base Sequence; Gene Expression Regulation; Genetic Variation; RNA Splice Sites; Cells, Cultured; Computational Biology/methods; Databases, Nucleic Acid; Exons; Gene Expression Profiling; High-Throughput Nucleotide Sequencing; Humans; Introns; Nucleotide Motifs; Position-Specific Scoring Matrices; Sequence Analysis, DNA
9.
Bioinformatics ; 34(3): 511-513, 2018 02 01.
Article in English | MEDLINE | ID: mdl-28968714

ABSTRACT

Summary: We present FATHMM-XF, a method for predicting pathogenic point mutations in the human genome. Drawing on an extensive feature set, FATHMM-XF outperforms competitors on benchmark tests, particularly in non-coding regions where the majority of pathogenic mutations are likely to be found. Availability and implementation: The FATHMM-XF web server is available at http://fathmm.biocompute.org.uk/fathmm-xf/, and as tracks on the Genome Tolerance Browser: http://gtb.biocompute.org.uk. Predictions are provided for human genome version GRCh37/hg19. The data used for this project can be downloaded from: http://fathmm.biocompute.org.uk/fathmm-xf/. Contact: mark.rogers@bristol.ac.uk or c.campbell@bristol.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Genomics/methods; Point Mutation; Sequence Analysis, DNA/methods; Software; Genome, Human; Humans
10.
Nucleic Acids Res ; 45(3): e13, 2017 02 17.
Article in English | MEDLINE | ID: mdl-28180317

ABSTRACT

The in silico prediction of the functional consequences of mutations is an important goal of human pathogenetics. However, bioinformatic tools that classify mutations according to their functionality employ different algorithms so that predictions may vary markedly between tools. We therefore integrated nine popular prediction tools (PolyPhen-2, SNPs&GO, MutPred, SIFT, MutationTaster2, Mutation Assessor and FATHMM as well as conservation-based Grantham Score and PhyloP) into a single predictor. The optimal combination of these tools was selected by means of a wide range of statistical modeling techniques, drawing upon 10 029 disease-causing single nucleotide variants (SNVs) from Human Gene Mutation Database and 10 002 putatively 'benign' non-synonymous SNVs from UCSC. Predictive performance was found to be markedly improved by model-based integration, whilst maximum predictive capability was obtained with either random forest, decision tree or logistic regression analysis. A combination of PolyPhen-2, SNPs&GO, MutPred, MutationTaster2 and FATHMM was found to perform as well as all tools combined. Comparison of our approach with other integrative approaches such as Condel, CoVEC, CAROL, CADD, MetaSVM and MetaLR using an independent validation dataset, revealed the superiority of our newly proposed integrative approach. An online implementation of this approach, IMHOTEP ('Integrating Molecular Heuristics and Other Tools for Effect Prediction'), is provided at http://www.uni-kiel.de/medinfo/cgi-bin/predictor/.


Subjects
Genetic Variation; Software; Algorithms; Computational Biology/methods; Computer Simulation; Humans; Mutation; Polymorphism, Single Nucleotide
11.
Hum Mutat ; 39(2): 292-301, 2018 02.
Article in English | MEDLINE | ID: mdl-29044887

ABSTRACT

Many genetic diseases exhibit considerable epidemiological comorbidity and common symptoms, which provokes debate about the extent of their etiological overlap. The rapid growth in the number of known disease-causing mutations in the Human Gene Mutation Database (HGMD) has allowed us to characterize genetic similarities between diseases by ascertaining the extent to which identical genetic mutations are shared between diseases. Using this approach, we show that 41.6% of disease pairs in all possible pairs (42,083) exhibit a significant sharing of mutations (P value < 0.05). These mutation-related disease pairs are in agreement with heritability-based disease-disease relations in 48 neurological and psychiatric disease pairs (Spearman's correlation coefficient = 0.50; P value = 3.4 × 10⁻⁵), and share over-expressed genes significantly more often than unrelated disease pairs (1.5-1.8-fold higher; P value ≤ 1.6 × 10⁻⁴). The usefulness of mutation-related disease pairs was further demonstrated for predicting novel mutations and identifying individuals susceptible to Crohn disease. Moreover, the mutation-based disease network concurs closely with that based on phenotypes.


Subjects
Mutation/genetics; Genetic Predisposition to Disease/genetics; Humans; Phenotype; RNA, Messenger/genetics
12.
Proteins ; 86 Suppl 1: 374-386, 2018 03.
Article in English | MEDLINE | ID: mdl-28975675

ABSTRACT

Our goal is to answer the question: compared with experimental structures, how useful are predicted models for functional annotation? We assessed the functional utility of predicted models by comparing the performances of a suite of methods for functional characterization on the predictions and the experimental structures. We identified 28 sites in 25 protein targets to perform functional assessment. These 28 sites included nine sites with known ligand binding (holo-sites), nine sites that are expected or suggested by experimental authors for small molecule binding (apo-sites), and ten sites containing important motifs, loops, or key residues with important disease-associated mutations. We evaluated the utility of the predictions by comparing their microenvironments to the experimental structures. Overall structural quality correlates with functional utility. However, the best-ranked predictions (global) may not have the best functional quality (local). Our assessment provides an ability to discriminate between predictions with high structural quality. When assessing ligand-binding sites, most prediction methods have higher performance on apo-sites than holo-sites. Some servers show consistently high performance for certain types of functional sites. Finally, many functional sites are associated with protein-protein interaction. We also analyzed biologically relevant features from the protein assemblies of two targets where the active site spanned the protein-protein interface. For the assembly targets, we find that the features in the models are mainly determined by the choice of template.


Subjects
Biological Products/metabolism; Computational Biology/methods; Models, Molecular; Models, Statistical; Protein Conformation; Proteins/chemistry; Proteins/metabolism; Binding Sites; Catalytic Domain; Humans; Ligands; Protein Binding
13.
Bioinformatics ; 33(14): i389-i398, 2017 Jul 15.
Article in English | MEDLINE | ID: mdl-28882004

ABSTRACT

MOTIVATION: Loss-of-function genetic variants are frequently associated with severe clinical phenotypes, yet many are present in the genomes of healthy individuals. The available methods to assess the impact of these variants rely primarily upon evolutionary conservation with little to no consideration of the structural and functional implications for the protein. They further do not provide information to the user regarding specific molecular alterations potentially causative of disease. RESULTS: To address this, we investigate protein features underlying loss-of-function genetic variation and develop a machine learning method, MutPred-LOF, for the discrimination of pathogenic and tolerated variants that can also generate hypotheses on specific molecular events disrupted by the variant. We investigate a large set of human variants derived from the Human Gene Mutation Database, ClinVar and the Exome Aggregation Consortium. Our prediction method shows an area under the Receiver Operating Characteristic curve of 0.85 for all loss-of-function variants and 0.75 for proteins in which both pathogenic and neutral variants have been observed. We applied MutPred-LOF to a set of 1142 de novo variants from neurodevelopmental disorders and find enrichment of pathogenic variants in affected individuals. Overall, our results highlight the potential of computational tools to elucidate causal mechanisms underlying loss of protein function in loss-of-function variants. AVAILABILITY AND IMPLEMENTATION: http://mutpred.mutdb.org. CONTACT: predrag@indiana.edu.


Subjects
Loss of Function Mutation; Machine Learning; Proteins/genetics; Sequence Analysis, Protein/methods; Software; Computational Biology/methods; Humans; Protein Conformation; Proteins/metabolism; Proteins/physiology
14.
BMC Med Genet ; 19(1): 183, 2018 10 11.
Article in English | MEDLINE | ID: mdl-30305043

ABSTRACT

BACKGROUND: Mucopolysaccharidosis-IVA (Morquio A disease) is a lysosomal disorder in which the abnormal accumulation of keratan sulfate and chondroitin-6-sulfate is consequent to mutations in the galactosamine-6-sulfatase (GALNS) gene. Since standard DNA sequencing analysis fails to detect about 16% of GALNS mutant alleles, gross DNA rearrangement screening and uniparental disomy evaluation are required to complete the molecular diagnosis. Despite this, the second pathogenic GALNS allele generally remains unidentified in ~ 5% of Morquio-A disease patients. METHODS: In an attempt to bridge the residual gap between clinical and molecular diagnosis, we performed an mRNA-based evaluation of three Morquio-A disease patients in whom the second mutant GALNS allele had not been identified. We also performed sequence analysis of the entire GALNS gene in two patients. RESULTS: Different aberrant GALNS mRNA transcripts were characterized in each patient. Analysis of these transcripts then allowed the identification, in one patient, of a disease-causing deep intronic GALNS mutation. The aberrant mRNA products identified in the other two individuals resulted in partial exon loss. Despite sequencing the entire GALNS gene region in these patients, the identity of a single underlying pathological lesion could not be unequivocally determined. We postulate that a combination of multiple variants, acting in cis, may synergise in terms of their impact on the splicing machinery. CONCLUSIONS: We have identified GALNS variants located within deep intronic regions that have the potential to impact splicing. These findings have prompted us to incorporate mRNA analysis into our diagnostic flow procedure for the molecular analysis of Morquio A disease.


Subjects
Chondroitinsulfatases/genetics; Mucopolysaccharidosis IV/genetics; Mutation; RNA Splicing; RNA, Messenger/genetics; Adolescent; Base Sequence; Chondroitinsulfatases/metabolism; DNA Mutational Analysis; Decision Trees; Exons; Female; Genotype; Humans; Introns; Male; Mucopolysaccharidosis IV/diagnosis; Mucopolysaccharidosis IV/metabolism; Mucopolysaccharidosis IV/physiopathology; RNA, Messenger/metabolism
15.
Gastrointest Endosc ; 88(4): 665-673, 2018 10.
Article in English | MEDLINE | ID: mdl-29702101

ABSTRACT

BACKGROUND AND AIMS: Duodenal polyposis and cancer have become a key issue for patients with familial adenomatous polyposis (FAP) and MUTYH-associated polyposis (MAP). Almost all patients with FAP will develop duodenal adenomas, and 5% will develop cancer. The incidence of duodenal adenomas in MAP appears to be lower than in FAP, but the limited available data suggest a comparable increase in the relative risk and lifetime risk of duodenal cancer. Current surveillance recommendations, however, are the same for FAP and MAP, using the Spigelman score (incorporating polyp number, size, dysplasia, and histology) for risk stratification and determination of surveillance intervals. Previous studies have demonstrated a benefit of enhanced detection rates of adenomas by use of chromoendoscopy both in sporadic colorectal disease and in groups at high risk of colorectal cancer. We aimed to assess the effect of chromoendoscopy on duodenal adenoma detection, to determine the impact on Spigelman stage and to compare this in individuals with known pathogenic mutations in order to determine the difference in duodenal involvement between MAP and FAP. METHODS: A prospective study examined the impact of chromoendoscopy on the assessment of the duodenum in 51 consecutive patients with MAP and FAP in 2 academic centers in the United Kingdom (University Hospital Llandough, Cardiff, and St Mark's Hospital, London) from 2011 to 2014. RESULTS: Enhanced adenoma detection of 3 times the number of adenomas after chromoendoscopy was demonstrated in both MAP (P = .013) and FAP (P = .002), but did not affect adenoma size. In both conditions, there was a significant increase in Spigelman stage after chromoendoscopy compared with endoscopy without dye spray. Spigelman scores and overall adenoma detection were significantly lower in MAP compared with FAP. CONCLUSIONS: Chromoendoscopy improved the diagnostic yield of adenomas in MAP and FAP 3-fold, and in both MAP and FAP this resulted in a clinically significant upstaging in Spigelman score. Further studies are required to determine the impact of improved adenoma detection on the management and outcome of duodenal polyposis.


Subjects
Adenomatous Polyposis Coli/diagnostic imaging; Duodenal Neoplasms/diagnostic imaging; Endoscopy, Gastrointestinal/methods; Population Surveillance/methods; Adenomatous Polyposis Coli/genetics; Adenomatous Polyposis Coli/pathology; Adult; Aged; Aged, 80 and over; Coloring Agents; DNA Glycosylases/genetics; Duodenal Neoplasms/genetics; Duodenal Neoplasms/pathology; Female; Humans; Indigo Carmine; Male; Middle Aged; Neoplasm Staging; Prospective Studies; Tumor Burden
16.
Nature ; 483(7388): 169-75, 2012 Mar 07.
Article in English | MEDLINE | ID: mdl-22398555

ABSTRACT

Gorillas are humans' closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.


Subjects
Evolution, Molecular; Genetic Speciation; Genome/genetics; Gorilla gorilla/genetics; Animals; Female; Gene Expression Regulation; Genetic Variation/genetics; Genomics; Humans; Macaca mulatta/genetics; Molecular Sequence Data; Pan troglodytes/genetics; Phylogeny; Pongo/genetics; Proteins/genetics; Sequence Alignment; Species Specificity; Transcription, Genetic
17.
BMC Bioinformatics ; 18(1): 442, 2017 Oct 06.
Article in English | MEDLINE | ID: mdl-28985712

ABSTRACT

BACKGROUND: Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. RESULTS: We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. CONCLUSIONS: FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome.


Subjects
Computational Biology/methods; DNA, Intergenic/genetics; Genome, Human; INDEL Mutation/genetics; Genetics, Population; Humans; Phenotype; ROC Curve; Reproducibility of Results; Software
18.
Hum Mutat ; 38(10): 1336-1347, 2017 10.
Article in English | MEDLINE | ID: mdl-28649752

ABSTRACT

Synonymous single-nucleotide variants (SNVs), although they do not alter the encoded protein sequences, have been implicated in many genetic diseases. Experimental studies indicate that synonymous SNVs can lead to changes in the secondary and tertiary structures of DNA and RNA, thereby affecting translational efficiency, cotranslational protein folding as well as the binding of DNA-/RNA-binding proteins. However, the importance of these various features in disease phenotypes is not clearly understood. Here, we have built a support vector machine (SVM) model (termed DDIG-SN) as a means to discriminate disease-causing synonymous variants. The model was trained and evaluated on nearly 900 disease-causing variants. The method achieves robust performance with the area under the receiver operating characteristic curve of 0.84 and 0.85 for protein-stratified 10-fold cross-validation and independent testing, respectively. We were able to show that the disease-causing effects in the immediate proximity to exon-intron junctions (1-3 bp) are driven by the loss of splicing motif strength, whereas the gain of splicing motif strength is the primary cause in regions further away from the splice site (4-69 bp). The method is available as a part of the DDIG server at http://sparks-lab.org/ddig.


Subjects
DNA-Binding Proteins/genetics; DNA/genetics; Proteins/genetics; Silent Mutation/genetics; DNA/chemistry; DNA-Binding Proteins/chemistry; Genetic Predisposition to Disease; Humans; Nucleic Acid Conformation; Polymorphism, Single Nucleotide/genetics; Protein Folding; Proteins/chemistry; RNA/chemistry; RNA/genetics
19.
Hum Mutat ; 38(1): 16-24, 2017 01.
Article in English | MEDLINE | ID: mdl-27604408

ABSTRACT

Alternative splicing (AS) is a closely regulated process that allows a single gene to encode multiple protein isoforms, thereby contributing to the diversity of the proteome. Dysregulation of the splicing process has been found to be associated with many inherited diseases. However, among the pathogenic AS events, there are numerous "passenger" events whose inclusion or exclusion does not lead to significant changes with respect to protein function. In this study, we evaluate the secondary and tertiary structural features of proteins associated with disease-causing and neutral AS events, and show that several structural features are strongly associated with the pathological impact of exon inclusion. We further develop a machine-learning-based computational model, ExonImpact, for prioritizing and evaluating the functional consequences of hitherto uncharacterized AS events. We evaluated our model using several strategies including cross-validation, and data from the Gene-Tissue Expression (GTEx) and ClinVar databases. ExonImpact is freely available at http://watson.compbio.iupui.edu/ExonImpact.


Subjects
Alternative Splicing; Computational Biology/methods; Exons; Genetic Association Studies/methods; Software; Algorithms; Brain/metabolism; Databases, Nucleic Acid; Genetic Predisposition to Disease; Humans; Machine Learning; Protein Domains; Protein Isoforms/chemistry; Protein Isoforms/genetics; Protein Isoforms/metabolism; Structure-Activity Relationship; Web Browser
20.
Hum Genet ; 136(6): 665-677, 2017 06.
Article in English | MEDLINE | ID: mdl-28349240

ABSTRACT

The Human Gene Mutation Database (HGMD®) constitutes a comprehensive collection of published germline mutations in nuclear genes that underlie, or are closely associated with human inherited disease. At the time of writing (March 2017), the database contained in excess of 203,000 different gene lesions identified in over 8000 genes manually curated from over 2600 journals. With new mutation entries currently accumulating at a rate exceeding 17,000 per annum, HGMD represents de facto the central unified gene/disease-oriented repository of heritable mutations causing human genetic disease used worldwide by researchers, clinicians, diagnostic laboratories and genetic counsellors, and is an essential tool for the annotation of next-generation sequencing data. The public version of HGMD ( http://www.hgmd.org ) is freely available to registered users from academic institutions and non-profit organisations whilst the subscription version (HGMD Professional) is available to academic, clinical and commercial users under license via QIAGEN Inc.


Subjects
Databases, Genetic; Mutation; Humans; Molecular Diagnostic Techniques