Results 1 - 20 of 24
1.
Hum Genet ; 142(2): 245-274, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36344696

ABSTRACT

Whilst DNA repeat expansions cause numerous heritable human disorders, their origins and underlying pathological mechanisms are often unclear. We collated a dataset comprising 224 human repeat expansions encompassing 203 different genes, and performed a systematic analysis with respect to key topological features at the DNA, RNA and protein levels. Comparison with controls without known pathogenicity and genomic regions lacking repeats allowed the construction of the first tool to discriminate repeat regions harboring pathogenic repeat expansions (DPREx). At the DNA level, pathogenic repeat expansions exhibited stronger signals for DNA regulatory factors (e.g. H3K4me3, transcription factor-binding sites) in exons, promoters, 5'UTRs and 5'genes but were not significantly different from controls in introns, 3'UTRs and 3'genes. Additionally, pathogenic repeat expansions were also found to be enriched in non-B DNA structures. At the RNA level, pathogenic repeat expansions were characterized by lower free energy for forming RNA secondary structure and were closer to splice sites in introns, exons, promoters and 5'genes than controls. At the protein level, pathogenic repeat expansions exhibited a preference to form coil rather than other types of secondary structure, and tended to encode surface-located protein domains. Guided by these features, DPREx (http://biomed.nscc-gz.cn/zhaolab/geneprediction/#) achieved an Area Under the Curve (AUC) value of 0.88 in a test on an independent dataset. Pathogenic repeat expansions are thus located such that they exert a synergistic influence on the gene expression pathway involving inter-molecular connections at the DNA, RNA and protein levels.


Subjects
DNA Repeat Expansion, DNA, Humans, Introns/genetics, RNA, Trinucleotide Repeat Expansion
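Note on the AUC figure quoted in the abstract above: an AUC can be read as the probability that a randomly chosen pathogenic region scores higher than a randomly chosen control region. The Python sketch below computes it from that definition. The function name `auc` and the example scores are hypothetical illustrations; they are not DPREx code or DPREx output.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count as half)."""
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores, for illustration only.
pathogenic = [0.9, 0.8, 0.4]
control = [0.7, 0.3, 0.2]
print(auc(pathogenic, control))
```

The pairwise form is quadratic in the number of scores, which is fine for a small illustration; a rank-based computation would be the usual choice for large datasets.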
2.
Hum Mutat ; 43(3): 328-346, 2022 Mar.
Article in English | MEDLINE | ID: mdl-34918412

ABSTRACT

Microdeletions and gross deletions are important causes (~20%) of human inherited disease and their genomic locations are strongly influenced by the local DNA sequence environment. This notwithstanding, no study has systematically examined their underlying generative mechanisms. Here, we obtained 42,098 pathogenic microdeletions and gross deletions from the Human Gene Mutation Database (HGMD) that together form a continuum of germline deletions ranging in size from 1 to 28,394,429 bp. We analyzed the DNA sequence within 1 kb of the breakpoint junctions and found that the frequencies of non-B DNA-forming repeats, GC-content, and the presence of seven of 78 specific sequence motifs in the vicinity of pathogenic deletions correlated with deletion length for deletions of length ≤30 bp. Further, we found that the presence of DR, GQ, and STR repeats is important for the formation of longer deletions (>30 bp) but not for the formation of shorter deletions (≤30 bp) while significantly (χ², p < 2 × 10⁻¹⁶) more microhomologies were identified flanking short deletions than long deletions (length >30 bp). We provide evidence to support a functional distinction between microdeletions and gross deletions. Finally, we propose that a deletion length cut-off of 25-30 bp may serve as an objective means to functionally distinguish microdeletions from gross deletions.


Subjects
DNA, Human Genome, Base Composition, Base Sequence, DNA/genetics, Human Genome/genetics, Humans, Mutation, Sequence Deletion
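Note on the length cut-off proposed in the abstract above: the Python sketch below labels a deletion by length and computes the GC content of a flanking sequence. The helper names are hypothetical, and the default of 30 bp is an assumed value taken from the upper end of the proposed 25-30 bp range. This is not the authors' pipeline.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def classify_deletion(length_bp: int, cutoff_bp: int = 30) -> str:
    """Label a deletion by length.

    The 30 bp default is an assumption: the abstract proposes a cut-off
    somewhere in the 25-30 bp range, not a single value.
    """
    if length_bp < 1:
        raise ValueError("deletion length must be at least 1 bp")
    return "microdeletion" if length_bp <= cutoff_bp else "gross deletion"


print(classify_deletion(12), classify_deletion(4500))
print(gc_content("GGCCAATT"))
```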
3.
Hum Genet ; 139(10): 1197-1207, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32596782

ABSTRACT

The Human Gene Mutation Database (HGMD®) constitutes a comprehensive collection of published germline mutations in nuclear genes that are thought to underlie, or are closely associated with, human inherited disease. At the time of writing (June 2020), the database contains in excess of 289,000 different gene lesions identified in over 11,100 genes manually curated from 72,987 articles published in over 3100 peer-reviewed journals. Two main groups of users utilise HGMD on a regular basis: research scientists and clinical diagnosticians. This review aims to highlight how to make the most out of HGMD data in each setting.


Subjects
Genetic Databases, Human Genome, Germ-Line Mutation, Genetic Polymorphism, Bibliometrics, Biomedical Research/methods, Genetic Predisposition to Disease, Humans, Public-Private Sector Partnerships
4.
Nature ; 483(7388): 169-75, 2012 Mar 07.
Article in English | MEDLINE | ID: mdl-22398555

ABSTRACT

Gorillas are humans' closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.


Subjects
Molecular Evolution, Genetic Speciation, Genome/genetics, Gorilla gorilla/genetics, Animals, Female, Gene Expression Regulation, Genetic Variation/genetics, Genomics, Humans, Macaca mulatta/genetics, Molecular Sequence Data, Pan troglodytes/genetics, Phylogeny, Pongo/genetics, Proteins/genetics, Sequence Alignment, Species Specificity, Genetic Transcription
5.
Hum Genet ; 136(6): 665-677, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28349240

ABSTRACT

The Human Gene Mutation Database (HGMD®) constitutes a comprehensive collection of published germline mutations in nuclear genes that underlie, or are closely associated with human inherited disease. At the time of writing (March 2017), the database contained in excess of 203,000 different gene lesions identified in over 8000 genes manually curated from over 2600 journals. With new mutation entries currently accumulating at a rate exceeding 17,000 per annum, HGMD represents de facto the central unified gene/disease-oriented repository of heritable mutations causing human genetic disease used worldwide by researchers, clinicians, diagnostic laboratories and genetic counsellors, and is an essential tool for the annotation of next-generation sequencing data. The public version of HGMD ( http://www.hgmd.org ) is freely available to registered users from academic institutions and non-profit organisations whilst the subscription version (HGMD Professional) is available to academic, clinical and commercial users under license via QIAGEN Inc.


Subjects
Genetic Databases, Mutation, Humans, Molecular Diagnostic Techniques
6.
PLoS Genet ; 9(9): e1003816, 2013.
Article in English | MEDLINE | ID: mdl-24086153

ABSTRACT

Single base substitutions constitute the most frequent type of human gene mutation and are a leading cause of cancer and inherited disease. These alterations occur non-randomly in DNA, being strongly influenced by the local nucleotide sequence context. However, the molecular mechanisms underlying such sequence context-dependent mutagenesis are not fully understood. Using bioinformatics, computational and molecular modeling analyses, we have determined the frequencies of mutation at G • C bp in the context of all 64 5'-NGNN-3' motifs that contain the mutation at the second position. Twenty-four datasets were employed, comprising >530,000 somatic single base substitutions from 21 cancer genomes, >77,000 germline single-base substitutions causing or associated with human inherited disease and 16.7 million benign germline single-nucleotide variants. In several cancer types, the number of mutated motifs correlated both with the free energies of base stacking and the energies required for abstracting an electron from the target guanines (ionization potentials). Similar correlations were also evident for the pathological missense and nonsense germline mutations, but only when the target guanines were located on the non-transcribed DNA strand. Likewise, pathogenic splicing mutations predominantly affected positions in which a purine was located on the non-transcribed DNA strand. Novel candidate driver mutations and tissue-specific mutational patterns were also identified in the cancer datasets. We conclude that electron transfer reactions within the DNA molecule contribute to sequence context-dependent mutagenesis, involving both somatic driver and passenger mutations in cancer, as well as germline alterations causing or associated with inherited disease.


Subjects
Amino Acid Substitution/genetics, Inborn Genetic Diseases/genetics, Guanine, Neoplasms/genetics, Computational Biology, Neoplasm DNA/genetics, Inborn Genetic Diseases/pathology, Germ-Line Mutation, Humans, Molecular Models, Neoplasms/pathology, Nucleotide Motifs/genetics
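Note on the "64 5'-NGNN-3' motifs" mentioned in the abstract above: fixing G at the second position leaves three free positions, giving 4 × 4 × 4 = 64 motifs. The Python sketch below enumerates them and counts their occurrences in a sequence. The names `NGNN_MOTIFS` and `count_ngnn` are hypothetical, and only the given strand is scanned; a full analysis of G·C base pairs would also scan the reverse complement.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

# All 4-mers with G fixed at the second position: 4 * 4 * 4 = 64 motifs.
NGNN_MOTIFS = [f"{a}G{b}{c}" for a, b, c in product(BASES, repeat=3)]


def count_ngnn(seq: str) -> Counter:
    """Count overlapping 4-mers on the given strand that have G at position 2."""
    seq = seq.upper()
    counts: Counter = Counter()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        # Skip windows containing ambiguous bases such as N.
        if kmer[1] == "G" and set(kmer) <= set(BASES):
            counts[kmer] += 1
    return counts


print(len(NGNN_MOTIFS))
print(count_ngnn("AGCTAGCT"))
```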
7.
Am J Hum Genet ; 91(6): 1022-32, 2012 Dec 07.
Article in English | MEDLINE | ID: mdl-23217326

ABSTRACT

We have assessed the numbers of potentially deleterious variants in the genomes of apparently healthy humans by using (1) low-coverage whole-genome sequence data from 179 individuals in the 1000 Genomes Pilot Project and (2) current predictions and databases of deleterious variants. Each individual carried 281-515 missense substitutions, 40-85 of which were homozygous, predicted to be highly damaging. They also carried 40-110 variants classified by the Human Gene Mutation Database (HGMD) as disease-causing mutations (DMs), 3-24 variants in the homozygous state, and many polymorphisms putatively associated with disease. Whereas many of these DMs are likely to represent disease-allele-annotation errors, between 0 and 8 DMs (0-1 homozygous) per individual are predicted to be highly damaging, and some of them provide information of medical relevance. These analyses emphasize the need for improved annotation of disease alleles both in mutation databases and in the primary literature; some HGMD mutation data have been recategorized on the basis of the present findings, an iterative process that is both necessary and ongoing. Our estimates of deleterious-allele numbers are likely to be subject to both overcounting and undercounting. However, our current best mean estimates of ~400 damaging variants and ~2 bona fide disease mutations per individual are likely to increase rather than decrease as sequencing studies ascertain rare variants more effectively and as additional disease alleles are discovered.


Subjects
Alleles, Mutation Rate, Nucleic Acid Databases, Human Genome, Genome-Wide Association Study, Humans, Missense Mutation, Prevalence
8.
Hum Genet ; 133(1): 1-9, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24077912

ABSTRACT

The Human Gene Mutation Database (HGMD®) is a comprehensive collection of germline mutations in nuclear genes that underlie, or are associated with, human inherited disease. By June 2013, the database contained over 141,000 different lesions detected in over 5,700 different genes, with new mutation entries currently accumulating at a rate exceeding 10,000 per annum. HGMD was originally established in 1996 for the scientific study of mutational mechanisms in human genes. However, it has since acquired a much broader utility as a central unified disease-oriented mutation repository utilized by human molecular geneticists, genome scientists, molecular biologists, clinicians and genetic counsellors as well as by those specializing in biopharmaceuticals, bioinformatics and personalized genomics. The public version of HGMD (http://www.hgmd.org) is freely available to registered users from academic institutions/non-profit organizations whilst the subscription version (HGMD Professional) is available to academic, clinical and commercial users under license via BIOBASE GmbH.


Subjects
Genetic Databases, Human Genome, Germ-Line Mutation, Cell Nucleus/genetics, Computational Biology, DNA Copy Number Variations, Genetic Predisposition to Disease, Genetic Testing, Genomics, Humans, Genetic Polymorphism, Precision Medicine
9.
Hum Genomics ; 5(5): 453-84, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21807602

ABSTRACT

The recent publication of the draft genome sequences of the Neanderthal and a ∼50,000-year-old archaic hominin from Denisova Cave in southern Siberia has ushered in a new age in molecular archaeology. We previously cross-compared the human, chimpanzee and Neanderthal genome sequences with respect to a set of disease-causing/disease-associated missense and regulatory mutations (Human Gene Mutation Database) and succeeded in identifying genetic variants which, although apparently pathogenic in humans, may represent a 'compensated' wild-type state in at least one of the other two species. Here, in an attempt to identify further 'potentially compensated mutations' (PCMs) of interest, we have compared our dataset of disease-causing/disease-associated mutations with their corresponding nucleotide positions in the Denisovan hominin, Neanderthal and chimpanzee genomes. Of the 15 human putatively disease-causing mutations that were found to be compensated in chimpanzee, Denisovan or Neanderthal, only a solitary F5 variant (Val1736Met) was specific to the Denisovan. In humans, this missense mutation is associated with activated protein C resistance and an increased risk of thromboembolism and recurrent miscarriage. It is unclear at this juncture whether this variant was indeed a PCM in the Denisovan or whether it could instead have been associated with disease in this ancient hominin.


Subjects
Genome, Hominidae/genetics, Mutation, Pan troglodytes/genetics, Animals, Genetic Databases, Human Genome, Humans, Neanderthals/genetics
10.
Hum Mutat ; 32(10): 1137-43, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21681852

ABSTRACT

A total of 405 unique single base-pair substitutions, located within the ATG translation initiation codons (TICs) of 255 different genes, and reported to cause human genetic disease, were retrieved from the Human Gene Mutation Database (HGMD). Although these lesions comprised only 0.7% of coding sequence mutations in HGMD, they nevertheless were 3.4-fold overrepresented as compared to other missense mutations. The distance between a TIC and the next downstream in-frame ATG codon was significantly greater for genes harboring TIC mutations than for the remainder of genes in HGMD (control genes). This suggests that the absence of an alternative ATG codon in the vicinity of a TIC increases the likelihood that a given TIC mutation will come to clinical attention. An additional 42 single base-pair substitutions in 37 different genes were identified in the vicinity of TICs (positions -6 to +4, comprising the so-called "Kozak consensus sequence"). These substitutions were not evenly distributed, being significantly more abundant at position +4. Finally, contrary to our initial expectation, the match between the original TIC and the Kozak consensus sequence was significantly better (rather than worse) for genes harboring TIC mutations than for the HGMD control genes.


Subjects
Initiator Codon, Inborn Genetic Diseases/genetics, Translational Peptide Chain Initiation/genetics, Point Mutation, 5' Flanking Region, Consensus Sequence, Genetic Databases, Humans, Mutation Rate, Open Reading Frames
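Note on the distance measure in the abstract above: the distance between a translation initiation codon (TIC) and the next downstream in-frame ATG can be found by stepping through the coding sequence one codon at a time. The Python sketch below illustrates the idea. The function name is hypothetical and the input is assumed to be a coding sequence that starts at the TIC; this is not the authors' code.

```python
from typing import Optional


def next_inframe_atg_distance(cds: str) -> Optional[int]:
    """Distance in nucleotides from the initiation ATG (offset 0) to the
    next downstream in-frame ATG, or None if there is none."""
    seq = cds.upper()
    if not seq.startswith("ATG"):
        raise ValueError("sequence is assumed to start at the ATG initiation codon")
    # Step codon by codon so that out-of-frame ATGs are ignored.
    for pos in range(3, len(seq) - 2, 3):
        if seq[pos:pos + 3] == "ATG":
            return pos
    return None


print(next_inframe_atg_distance("ATGAAACCCATGTTT"))
print(next_inframe_atg_distance("ATGAATGCC"))
```

In the second call the only other ATG starts at offset 4, which is out of frame, so no alternative start codon is reported.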
11.
Hum Genomics ; 4(6): 406-10, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20846930

ABSTRACT

The cytosine-guanine (CpG) dinucleotide has long been known to be a hotspot for pathological mutation in the human genome. This hypermutability is related to its role as the major site of cytosine methylation with the attendant risk of spontaneous deamination of 5-methylcytosine (5mC) to yield thymine. Cytosine methylation, however, also occurs in the context of CpNpG sites in the human genome, an unsurprising finding since the intrinsic symmetry of CpNpG renders it capable of supporting a semi-conservative model of replication of the methylation pattern. Recently, it has become clear that significant DNA methylation occurs in a CpHpG context (where H = A, C or T) in a variety of human somatic tissues. If we assume that CpHpG methylation also occurs in the germline, and that 5mC deamination can occur within a CpHpG context, then we might surmise that methylated CpHpG sites could also constitute mutation hotspots causing human genetic disease. To test this postulate, 54,625 missense and nonsense mutations from 2,113 genes causing inherited disease were retrieved from the Human Gene Mutation Database (http://www.hgmd.org). Some 18.2 per cent of these pathological lesions were found to be C → T and G → A transitions located in CpG dinucleotides (compatible with a model of methylation-mediated deamination of 5mC), an approximately ten-fold higher proportion than would have been expected by chance alone. The corresponding proportion for the CpHpG trinucleotide was 9.9 per cent, an approximately two-fold higher proportion than would have been expected by chance. We therefore estimate that ∼5 per cent of missense/nonsense mutations causing human inherited disease may be attributable to methylation-mediated deamination of 5mC within a CpHpG context.


Subjects
5-Methylcytosine/metabolism, DNA Methylation/genetics, Dinucleoside Phosphates/genetics, Inborn Genetic Diseases/genetics, Mutation/genetics, Trinucleotide Repeats/genetics, Nucleic Acid Databases, Deamination, Humans
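Note on the sequence contexts in the abstract above: a cytosine is in a CpG context when the next base is G, and in a CpHpG context when the next base is A, C or T and the base after that is G. The Python sketch below classifies one cytosine on the given strand. The function name is hypothetical; a G → A transition would be classified by first taking the reverse complement, which is not shown here.

```python
def cytosine_context(seq: str, i: int) -> str:
    """Classify the cytosine at index i as 'CpG', 'CpHpG' or 'other'."""
    seq = seq.upper()
    if not 0 <= i < len(seq) or seq[i] != "C":
        raise ValueError("index i must point at a cytosine")
    nxt = seq[i + 1:i + 2]   # empty string if i is the last base
    nxt2 = seq[i + 2:i + 3]
    if nxt == "G":
        return "CpG"
    if nxt in ("A", "C", "T") and nxt2 == "G":
        return "CpHpG"  # H = A, C or T
    return "other"


print(cytosine_context("ACGT", 1))
print(cytosine_context("ACAGT", 1))
```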
12.
Hum Mutat ; 31(12): 1286-93, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21064102

ABSTRACT

Triangulation of the human, chimpanzee, and Neanderthal genome sequences with respect to 44,348 disease-causing or disease-associated missense mutations and 1,712 putative regulatory mutations listed in the Human Gene Mutation Database was employed to identify genetic variants that are apparently pathogenic in humans but which may represent a "compensated" wild-type state in at least one of the other two species. Of 122 such "potentially compensated mutations" (PCMs) identified, 88 were deemed "ancestral" on the basis that the reported wild-type Neanderthal nucleotide was identical to that of the chimpanzee. Another 33 PCMs were deemed to be "derived" in that the Neanderthal wild-type nucleotide matched the human but not the chimpanzee wild-type. For the remaining PCM, all three wild-type states were found to differ. Whereas a derived PCM would require compensation only in the chimpanzee, ancestral PCMs are useful as a means to identify sites of possible adaptive differences between modern humans on the one hand, and Neanderthals and chimpanzees on the other. Ancestral PCMs considered to be disease-causing in humans were identified in two Neanderthal genes (DUOX2, MAMLD1). Because the underlying mutations are known to give rise to recessive conditions in humans, it is possible that they may also have been of pathological significance in Neanderthals.


Subjects
Genome/genetics, Hominidae/genetics, Mutation/genetics, Pan troglodytes/genetics, Amino Acid Sequence, Animals, Base Sequence, DNA-Binding Proteins/genetics, Genetic Databases, Disease/genetics, Population Genetics, Haplotypes/genetics, Humans, Molecular Sequence Data
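Note on the "ancestral" and "derived" categories in the abstract above: the rule compares the wild-type nucleotide of the three species at one site. The Python sketch below encodes that rule. The function name and the "unclassified" label are hypothetical; the abstract does not name a category for a Neanderthal base that differs from identical human and chimpanzee bases.

```python
def classify_pcm(human: str, chimp: str, neanderthal: str) -> str:
    """Classify a potentially compensated mutation (PCM) site from the
    wild-type nucleotides of the three species."""
    h, c, n = human.upper(), chimp.upper(), neanderthal.upper()
    if n == c:
        return "ancestral"      # Neanderthal identical to chimpanzee
    if n == h:
        return "derived"        # Neanderthal matches human, not chimpanzee
    if h != c:
        return "all differ"     # three distinct wild-type states
    return "unclassified"       # assumption: case not described in the abstract


# The counts reported in the abstract add up to the stated total.
assert 88 + 33 + 1 == 122
print(classify_pcm("A", "G", "G"), classify_pcm("A", "G", "A"))
```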
13.
Hum Mutat ; 31(6): 631-55, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20506564

ABSTRACT

The number of reported germline mutations in human nuclear genes, either underlying or associated with inherited disease, has now exceeded 100,000 in more than 3,700 different genes. The availability of these data has both revolutionized the study of the morbid anatomy of the human genome and facilitated "personalized genomics." With approximately 300 new "inherited disease genes" (and approximately 10,000 new mutations) being identified annually, it is pertinent to ask how many "inherited disease genes" there are in the human genome, how many mutations reside within them, and where such lesions are likely to be located? To address these questions, it is necessary not only to reconsider how we define human genes but also to explore notions of gene "essentiality" and "dispensability." Answers to these questions are now emerging from recent novel insights into genome structure and function and through complete genome sequence information derived from multiple individual human genomes. However, a change in focus toward screening functional genomic elements as opposed to genes sensu stricto will be required if we are to capitalize fully on recent technical and conceptual advances and identify new types of disease-associated mutation within noncoding regions remote from the genes whose function they disrupt.


Subjects
Inborn Genetic Diseases/genetics, Genetic Predisposition to Disease/genetics, Human Genome/genetics, Mutation, Genomics/methods, Genomics/statistics & numerical data, Genomics/trends, Humans, Open Reading Frames/genetics, Genetic Polymorphism, Nucleic Acid Regulatory Sequences/genetics
14.
Hum Mutat ; 26(3): 205-13, 2005 Sep.
Article in English | MEDLINE | ID: mdl-16086312

ABSTRACT

In the Human Gene Mutation Database (www.hgmd.org), microdeletions and microinsertions causing inherited disease (both defined as involving ≤20 bp of DNA) account for 8,399 (17%) and 3,345 (7%) logged mutations, in 940 and 668 genes, respectively. A positive correlation was noted between the microdeletion and microinsertion frequencies for 564 genes for which both microdeletions and microinsertions are reported in HGMD, consistent with the view that the propensity of a given gene/sequence to undergo microdeletion is related to its propensity to undergo microinsertion. While microdeletions and microinsertions of 1 bp constitute respectively 48% and 66% of the corresponding totals, the relative frequency of the remaining lesions correlates negatively with the length of the DNA sequence deleted or inserted. Many of the microdeletions and microinsertions of more than 1 bp are potentially explicable in terms of slippage mutagenesis, involving the addition or removal of one copy of a mono-, di-, or trinucleotide tandem repeat. The frequency of in-frame 3-bp and 6-bp microinsertions and microdeletions was, however, found to be significantly lower than that of mutations of other lengths, suggesting that some of these in-frame lesions may not have come to clinical attention. Various sequence motifs were found to be over-represented in the vicinity of both microinsertions and microdeletions, including the heptanucleotide CCCCCTG that shares homology with the complement of the 8-bp human minisatellite conserved sequence/chi-like element (GCWGGWGG). The previously reported indel hotspot GTAAGT and its complement ACTTAC were also found to be overrepresented in the vicinity of both microinsertions and microdeletions, thereby providing a first example of a mutational hotspot that is common to different types of gene lesion. Other motifs overrepresented in the vicinity of microdeletions and microinsertions included DNA polymerase pause sites and topoisomerase cleavage sites. Several novel microdeletion/microinsertion hotspots were noted and some of these exhibited sufficient similarity to one another to justify terming them "super-hotspot" motifs. Analysis of sequence complexity also demonstrated that a combination of slipped mispairing mediated by direct repeats, and secondary structure formation promoted by symmetric elements, can account for the majority of microdeletions and microinsertions. Thus, microinsertions and microdeletions exhibit strong similarities in terms of the characteristics of their flanking DNA sequences, implying that they are generated by very similar underlying mechanisms.


Subjects
Inborn Genetic Diseases/genetics, Mutagenesis, DNA Sequence Analysis/methods, Computational Biology/methods, DNA-Directed DNA Polymerase/genetics, Genetic Databases, Gene Deletion, Genetic Variation, Humans, Mutation, Nucleic Acid Repetitive Sequences
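Note on the hotspot motif in the abstract above: ACTTAC is the reverse complement of GTAAGT, so the two motifs are the same site read from opposite strands. The Python sketch below checks this and also tests whether a lesion length is in-frame (a multiple of 3 bp). The helper names are hypothetical and are not taken from the paper.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_in_frame(length_bp: int) -> bool:
    """True if an insertion or deletion of this length preserves the reading frame."""
    return length_bp % 3 == 0


print(reverse_complement("GTAAGT"))
print([n for n in range(1, 7) if is_in_frame(n)])
```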
16.
Hum Mutat ; 22(3): 229-44, 2003 Sep.
Article in English | MEDLINE | ID: mdl-12938088

ABSTRACT

Translocations and gross deletions are important causes of both cancer and inherited disease. Such gene rearrangements are nonrandomly distributed in the human genome as a consequence of selection for growth advantage and/or the inherent potential of some DNA sequences to be frequently involved in breakage and recombination. Using the Gross Rearrangement Breakpoint Database [GRaBD; www.uwcm.ac.uk/uwcm/mg/grabd/grabd.html] (containing 397 germ-line and somatic DNA breakpoint junction sequences derived from 219 different rearrangements underlying human inherited disease and cancer), we have analyzed the sequence context of translocation and deletion breakpoints in a search for general characteristics that might have rendered these sequences prone to rearrangement. The oligonucleotide composition of breakpoint junctions and a set of reference sequences, matched for length and genomic location, were compared with respect to their nucleotide composition. Deletion breakpoints were found to be AT-rich whereas by comparison, translocation breakpoints were GC-rich. Alternating purine-pyrimidine sequences were found to be significantly over-represented in the vicinity of deletion breakpoints while polypyrimidine tracts were over-represented at translocation breakpoints. A number of recombination-associated motifs were found to be over-represented at translocation breakpoints (including DNA polymerase pause sites/frameshift hotspots, immunoglobulin heavy chain class switch sites, heptamer/nonamer V(D)J recombination signal sequences, translin binding sites, and the chi element) but, with the exception of the translin-binding site and immunoglobulin heavy chain class switch sites, none of these motifs were over-represented at deletion breakpoints. Alu sequences were found to span both breakpoints in seven cases of gross deletion that may thus be inferred to have arisen by homologous recombination. Our results are therefore consistent with a role for homologous unequal recombination in deletion mutagenesis and a role for nonhomologous recombination in the generation of translocations.


Subjects
Chromosome Breakage/genetics, Chromosome Deletion, Inborn Genetic Diseases/genetics, Neoplasms/genetics, Genetic Recombination/genetics, Genetic Translocation/genetics, Alu Elements/genetics, Base Composition, Computational Biology/methods, Genetic Databases, Human Genome, Humans, Immunoglobulin Joining Region/genetics, Immunoglobulin Variable Region/genetics, Internet, Nucleic Acid Repetitive Sequences, Nucleic Acid Sequence Homology
17.
Hum Mutat ; 21(1): 28-44, 2003 Jan.
Article in English | MEDLINE | ID: mdl-12497629

ABSTRACT

A relatively rare type of mutation causing human genetic disease is the indel, a complex lesion that appears to represent a combination of micro-deletion and micro-insertion. In the absence of meta-analytical studies of indels, the mutational mechanisms underlying indel formation remain unclear. Data from the Human Gene Mutation Database (HGMD) were therefore used to compare and contrast 211 different indels underlying genetic disease in an attempt to deduce the processes responsible for their genesis. Each indel was treated as if it were the result of a two-step insertion/deletion process and was assessed in the context of 10 base-pairs DNA sequence flanking the lesion on either side. Several indel hotspots were noted and a GTAAGT motif was found to be significantly over-represented in the vicinity of the indels studied. Previously postulated mechanisms underlying micro-deletions and micro-insertions were initially explored in terms of local DNA sequence regularity as measured by its complexity. The change in complexity consequent to a mutation was found to be indicative of the type of repeat sequence involved in mediating the event, thereby providing clues as to the underlying mutational mechanism. Complexity analysis was then employed to examine the possible intermediates through which each indel could have occurred and to propose likely mechanisms and pathways for indel generation on an individual basis. Manual analysis served to confirm that the majority of indels (>90%) are explicable in terms of a two-step process involving established mutational mechanisms. Indels equivalent to double base-pair substitutions (22% of the total) were found to be mechanistically indistinguishable from the remainder and may therefore be regarded as a special type of indel. The observed correspondence between changes in local DNA sequence complexity and the involvement of specific mutational mechanisms in the insertion/deletion process, and the ability of generated models to account for both the number and identity of the bases deleted and/or inserted, makes this approach invaluable not only for the analysis of indel formation, but also for the study of other types of complex lesion.


Subjects
Inborn Genetic Diseases/genetics, Genetic Predisposition to Disease, Mutagenesis, Mutation, Base Sequence, DNA Mutational Analysis, Humans, Genetic Models, Insertional Mutagenesis, Nucleic Acid Conformation, Nucleic Acid Repetitive Sequences, Sequence Deletion
18.
Hum Mutat ; 21(6): 577-81, 2003 Jun.
Article in English | MEDLINE | ID: mdl-12754702

ABSTRACT

The Human Gene Mutation Database (HGMD) constitutes a comprehensive core collection of data on germ-line mutations in nuclear genes underlying or associated with human inherited disease (www.hgmd.org). Data catalogued includes: single base-pair substitutions in coding, regulatory and splicing-relevant regions; micro-deletions and micro-insertions; indels; triplet repeat expansions as well as gross deletions; insertions; duplications; and complex rearrangements. Each mutation is entered into HGMD only once in order to avoid confusion between recurrent and identical-by-descent lesions. By March 2003, the database contained in excess of 39,415 different lesions detected in 1,516 different nuclear genes, with new entries currently accumulating at a rate exceeding 5,000 per annum. Since its inception, HGMD has been expanded to include cDNA reference sequences for more than 87% of listed genes, splice junction sequences, disease-associated and functional polymorphisms, as well as links to data present in publicly available online locus-specific mutation databases. Although HGMD has recently entered into a licensing agreement with Celera Genomics (Rockville, MD), mutation data will continue to be made freely available via the Internet.


Subjects
Genetic Databases, Genes/genetics, Mutation/genetics, Human Genome, Genomics, Humans, Internet, Genetic Polymorphism/genetics, Time Factors
19.
Genome Biol ; 15(1): R19, 2014 Jan 13.
Article in English | MEDLINE | ID: mdl-24451234

ABSTRACT

We have developed a novel machine-learning approach, MutPred Splice, for the identification of coding region substitutions that disrupt pre-mRNA splicing. Applying MutPred Splice to human disease-causing exonic mutations suggests that 16% of mutations causing inherited disease and 10 to 14% of somatic mutations in cancer may disrupt pre-mRNA splicing. For inherited disease, the main mechanism responsible for the splicing defect is splice site loss, whereas for cancer the predominant mechanism of splicing disruption is predicted to be exon skipping via loss of exonic splicing enhancers or gain of exonic splicing silencer elements. MutPred Splice is available at http://mutdb.org/mutpredsplice.


Subjects
Alternative Splicing/genetics, Exons, Genetic Variation, Machine Learning, Tumor Suppressor Genes, Humans, Introns, Mutation, Missense Mutation, Neoplasms/genetics, Single Nucleotide Polymorphism, RNA Precursors/genetics, RNA Splice Sites/genetics, Transcriptional Silencer Elements/genetics
20.
Curr Protoc Bioinformatics ; Chapter 1: 1.13.1-1.13.20, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22948725

ABSTRACT

The Human Gene Mutation Database (HGMD) constitutes a comprehensive core collection of data on germ-line mutations in nuclear genes underlying or associated with human inherited disease (http://www.hgmd.org). Data cataloged include single-base-pair substitutions in coding, regulatory, and splicing-relevant regions, micro-deletions and micro-insertions, indels, and triplet repeat expansions, as well as gross gene deletions, insertions, duplications, and complex rearrangements. Each mutation is entered into HGMD only once, in order to avoid confusion between recurrent and identical-by-descent lesions. By March 2012, the database contained in excess of 123,600 different lesions (HGMD Professional release 2012.1) detected in 4,514 different nuclear genes, with new entries currently accumulating at a rate in excess of 10,000 per annum. ∼6,000 of these entries constitute disease-associated and functional polymorphisms. HGMD also includes cDNA reference sequences for more than 98% of the listed genes.


Subjects
Molecular Evolution, Genomics/methods, Mutation, Factual Databases, Human Genome, Humans