1.
Hum Mutat ; 2019 Jul 13.
Article in English | MEDLINE | ID: mdl-31301154

ABSTRACT

Precision medicine and sequence-based clinical diagnostics seek to predict disease risk or to identify causative variants from sequencing data. The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype-phenotype prediction challenges; participants build models, undergo assessment, and share key findings. In the past, few CAGI challenges have addressed the impact of sequence variants on splicing. In CAGI5, two challenges (Vex-seq and MaPSY) involved prediction of the effect of variants, primarily single-nucleotide changes, on splicing. Although there are significant differences between these two challenges, both involved prediction of results from high-throughput exon inclusion assays. Here, we discuss the methods used to predict the impact of these variants on splicing, their performance, strengths, and weaknesses, and prospects for predicting the impact of sequence variation on splicing and disease phenotypes.

2.
J Mol Diagn ; 21(2): 318-329, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30610921

ABSTRACT

Orthogonal confirmation of next-generation sequencing (NGS)-detected germline variants is standard practice, although published studies have suggested that confirmation of the highest-quality calls may not always be necessary. The key question is how laboratories can establish criteria that consistently identify those NGS calls that require confirmation. Most prior studies addressing this question have had limitations: they have been generally of small scale, omitted statistical justification, and explored limited aspects of underlying data. The rigorous definition of criteria that separate high-accuracy NGS calls from those that may or may not be true remains a crucial issue. We analyzed five reference samples and over 80,000 patient specimens from two laboratories. Quality metrics were examined for approximately 200,000 NGS calls with orthogonal data, including 1662 false positives. A classification algorithm used these data to identify a battery of criteria that flag 100% of false positives as requiring confirmation (CI lower bound, 98.5% to 99.8%, depending on variant type) while minimizing the number of flagged true positives. These criteria identify false positives that the previously published criteria miss. Sampling analysis showed that smaller data sets resulted in less effective criteria. Our methodology for determining test- and laboratory-specific criteria can be generalized into a practical approach that can be used by laboratories to reduce the cost and time burdens of confirmation without affecting clinical accuracy.
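The lower confidence bounds quoted in the abstract above can be related to sample size with a short calculation. The sketch below is an illustration only, not the authors' method: it assumes an exact (Clopper-Pearson) one-sided bound at the 95% level, neither of which the abstract states, and the function name is hypothetical. Under that assumption, when every one of n observed false positives is flagged, the lower bound on the true flagging rate reduces to alpha^(1/n).

```python
def lower_bound_all_flagged(n: int, alpha: float = 0.05) -> float:
    """Exact one-sided lower confidence bound on a proportion when n of n trials succeed.

    Assumption: Clopper-Pearson bound at level 1 - alpha; with n successes
    out of n trials it simplifies to alpha ** (1 / n).
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return alpha ** (1.0 / n)


if __name__ == "__main__":
    # 1662 is the total false-positive count given in the abstract;
    # 200 is a hypothetical smaller subset used only for comparison.
    for n in (200, 1662):
        print(f"n = {n}: lower bound = {lower_bound_all_flagged(n):.4f}")
```

The comparison shows why smaller data sets give weaker guarantees: with fewer observed false positives, the same 100% flagging rate supports a lower confidence bound.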

3.
J Pers Med ; 6(1)2016 Feb 27.
Article in English | MEDLINE | ID: mdl-26927186

ABSTRACT

Effective implementation of precision medicine will be enhanced by a thorough understanding of each patient's genetic composition to better treat his or her presenting symptoms or mitigate the onset of disease. This ideally includes the sequence information of a complete genome for each individual. At Partners HealthCare Personalized Medicine, we have developed a clinical process for whole genome sequencing (WGS) with application in both healthy individuals and those with disease. In this manuscript, we will describe our bioinformatics strategy to efficiently process and deliver genomic data to geneticists for clinical interpretation. We describe the handling of data from FASTQ to the final variant list for clinical review for the final report. We will also discuss our methodology for validating this workflow and the cost implications of running WGS.

4.
Alzheimers Dement ; 12(3): 233-43, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26092349

ABSTRACT

INTRODUCTION: African-American (AA) individuals have a higher risk for late-onset Alzheimer's disease (LOAD) than Americans of primarily European ancestry (EA). Recently, the largest genome-wide association study in AAs to date confirmed that six of the Alzheimer's disease (AD)-related genetic variants originally discovered in EA cohorts are also risk variants in AA; however, the risk attributable to many of the loci (e.g., APOE, ABCA7) differed substantially from previous studies in EA. There likely are risk variants of higher frequency in AAs that have not been discovered. METHODS: We performed a comprehensive analysis of genetically determined local and global ancestry in AAs with regard to LOAD status. RESULTS: Compared to controls, LOAD cases showed higher levels of African ancestry, both globally and at several LOAD relevant loci, which explained risk for AD beyond global differences. DISCUSSION: Exploratory post hoc analyses highlight regions with greatest differences in ancestry as potential candidate regions for future genetic analyses.


Subjects
Alzheimer Disease/ethnology; Alzheimer Disease/genetics; Genetic Predisposition to Disease/genetics; ATP-Binding Cassette Transporters/genetics; African Americans/genetics; Aged; Aged, 80 and over; Alzheimer Disease/epidemiology; Apolipoproteins E/genetics; Chi-Square Distribution; Chromosome Aberrations; Cohort Studies; Female; Genetic Association Studies; Genotype; Humans; Male; Polymorphism, Single Nucleotide/genetics; Sialic Acid Binding Ig-like Lectin 3/genetics
5.
JAMA Neurol ; 72(11): 1313-23, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26366463

ABSTRACT

IMPORTANCE: Mutations in known causal Alzheimer disease (AD) genes account for only 1% to 3% of patients and almost all are dominantly inherited. Recessive inheritance of complex phenotypes can be linked to long (>1-megabase [Mb]) runs of homozygosity (ROHs) detectable by single-nucleotide polymorphism (SNP) arrays. OBJECTIVE: To evaluate the association between ROHs and AD in an African American population known to have a risk for AD up to 3 times higher than white individuals. DESIGN, SETTING, AND PARTICIPANTS: Case-control study of a large African American data set previously genotyped on different genome-wide SNP arrays conducted from December 2013 to January 2015. Global and locus-based ROH measurements were analyzed using raw or imputed genotype data. We studied the raw genotypes from 2 case-control subsets grouped based on SNP array: Alzheimer's Disease Genetics Consortium data set (871 cases and 1620 control individuals) and Chicago Health and Aging Project-Indianapolis Ibadan Dementia Study data set (279 cases and 1367 control individuals). We then examined the entire data set using imputed genotypes from 1917 cases and 3858 control individuals. MAIN OUTCOMES AND MEASURES: The ROHs larger than 1 Mb, 2 Mb, or 3 Mb were investigated separately for global burden evaluation, consensus regions, and gene-based analyses. RESULTS: The African American cohort had a low degree of inbreeding (F ~ 0.006). In the Alzheimer's Disease Genetics Consortium data set, we detected a significantly higher proportion of cases with ROHs greater than 2 Mb (P = .004) or greater than 3 Mb (P = .02), as well as a significant 114-kilobase consensus region on chr4q31.3 (empirical P value 2 = .04; ROHs >2 Mb). 
In the Chicago Health and Aging Project-Indianapolis Ibadan Dementia Study data set, we identified a significant 202-kilobase consensus region on Chr15q24.1 (empirical P value 2 = .02; ROHs >1 Mb) and a cluster of 13 significant genes on Chr3p21.31 (empirical P value 2 = .03; ROHs >3 Mb). A total of 43 of 49 nominally significant genes common for both data sets also mapped to Chr3p21.31. Analyses of imputed SNP data from the entire data set confirmed the association of AD with global ROH measurements (12.38 ROHs >1 Mb in cases vs 12.11 in controls; 2.986 Mb average size of ROHs >2 Mb in cases vs 2.889 Mb in controls; and 22% of cases with ROHs >3 Mb vs 19% of controls) and a gene-cluster on Chr3p21.31 (empirical P value 2 = .006-.04; ROHs >3 Mb). Also, we detected a significant association between AD and CLDN17 (empirical P value 2 = .01; ROHs >1 Mb), encoding a protein from the Claudin family, members of which were previously suggested as AD biomarkers. CONCLUSIONS AND RELEVANCE: To our knowledge, we discovered the first evidence of increased burden of ROHs among patients with AD from an outbred African American population, which could reflect either the cumulative effect of multiple ROHs to AD or the contribution of specific loci harboring recessive mutations and risk haplotypes in a subset of patients. Sequencing is required to uncover AD variants in these individuals.


Subjects
African Americans/ethnology; Alzheimer Disease/genetics; Homozygote; Polymorphism, Single Nucleotide/genetics; Aged; Case-Control Studies; Chicago/ethnology; Genes, Recessive; Genome-Wide Association Study; Humans; Indiana/ethnology
6.
Bioinformatics ; 31(8): 1290-2, 2015 Apr 15.
Article in English | MEDLINE | ID: mdl-25480377

ABSTRACT

UNLABELLED: We implemented a high-throughput identification pipeline for promoter interacting enhancer element to streamline the workflow from mapping raw Hi-C reads, identifying DNA-DNA interacting fragments with high confidence and quality control, detecting histone modifications and DNase hypersensitive enrichments in putative enhancer elements, to ultimately extracting possible intra- and inter-chromosomal enhancer-target gene relationships. AVAILABILITY AND IMPLEMENTATION: This software package is designed to run on high-performance computing clusters with Oracle Grid Engine. The source code is freely available under the MIT license for academic and nonprofit use. The source code and instructions are available at the Wang lab website (http://wanglab.pcbi.upenn.edu/hippie/). It is also provided as an Amazon Machine Image to be used directly on Amazon Cloud with minimal installation. CONTACT: lswang@mail.med.upenn.edu or bdgregor@sas.upenn.edu SUPPLEMENTARY INFORMATION: Supplementary Material is available at Bioinformatics online.


Subjects
DNA/genetics; DNA/metabolism; Enhancer Elements, Genetic/genetics; Promoter Regions, Genetic/genetics; Sequence Analysis, DNA/methods; Humans; Programming Languages
7.
JAMA Neurol ; 72(2): 209-16, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25531812

ABSTRACT

IMPORTANCE: Recently, a rare variant in the amyloid precursor protein gene (APP) was described in a population from Iceland. This variant, in which alanine is replaced by threonine at position 673 (A673T), appears to protect against late-onset Alzheimer disease (AD). We evaluated the frequency of this variant in AD cases and cognitively normal controls to determine whether this variant will significantly contribute to risk assessment in individuals in the United States. OBJECTIVE: To determine the frequency of the APP A673T variant in a large group of elderly cognitively normal controls and AD cases from the United States and in 2 case-control cohorts from Sweden. DESIGN, SETTING, AND PARTICIPANTS: Case-control association analysis of variant APP A673T in US and Swedish white individuals comparing AD cases with cognitively intact elderly controls. Participants were ascertained at multiple university-associated medical centers and clinics across the United States and Sweden by study-specific sampling methods. They were from case-control studies, community-based prospective cohort studies, and studies that ascertained multiplex families from multiple sources. MAIN OUTCOMES AND MEASURES: Genotypes for the APP A673T variant were determined using the Infinium HumanExome V1 Beadchip (Illumina, Inc) and by TaqMan genotyping (Life Technologies). RESULTS: The A673T variant genotypes were evaluated in 8943 US AD cases, 10 480 US cognitively normal controls, 862 Swedish AD cases, and 707 Swedish cognitively normal controls. We identified 3 US individuals heterozygous for A673T, including 1 AD case (age at onset, 89 years) and 2 controls (age at last examination, 82 and 77 years). The remaining US samples were homozygous for the alanine (A673) allele. In the Swedish samples, 3 controls were heterozygous for A673T and all AD cases were homozygous for the A673 allele. 
We also genotyped a US family previously reported to harbor the A673T variant and found a mother-daughter pair, both cognitively normal at ages 72 and 84 years, respectively, who were both heterozygous for A673T; however, all individuals with AD in the family were homozygous for A673. CONCLUSIONS AND RELEVANCE: The A673T variant is extremely rare in US cohorts and does not play a substantial role in risk for AD in this population. This variant may be primarily restricted to Icelandic and Scandinavian populations.


Subjects
Alzheimer Disease/genetics; Amyloid beta-Protein Precursor/genetics; Aged; Aged, 80 and over; Alzheimer Disease/epidemiology; Case-Control Studies; Female; Genotype; Humans; Male; Pedigree; Protective Factors; Sweden/epidemiology; United States/epidemiology
8.
Nature ; 515(7526): 209-15, 2014 Nov 13.
Article in English | MEDLINE | ID: mdl-25363760

ABSTRACT

The genetic architecture of autism spectrum disorder involves the interplay of common and rare variants and their impact on hundreds of genes. Using exome sequencing, here we show that analysis of rare coding variation in 3,871 autism cases and 9,937 ancestry-matched or parental controls implicates 22 autosomal genes at a false discovery rate (FDR) < 0.05, plus a set of 107 autosomal genes strongly enriched for those likely to affect risk (FDR < 0.30). These 107 genes, which show unusual evolutionary constraint against mutations, incur de novo loss-of-function mutations in over 5% of autistic subjects. Many of the genes implicated encode proteins for synaptic formation, transcriptional regulation and chromatin-remodelling pathways. These include voltage-gated ion channels regulating the propagation of action potentials, pacemaking and excitability-transcription coupling, as well as histone-modifying enzymes and chromatin remodellers, most prominently those that mediate post-translational lysine methylation/demethylation modifications of histones.


Subjects
Child Development Disorders, Pervasive/genetics; Chromatin/genetics; Genetic Predisposition to Disease/genetics; Mutation/genetics; Synapses/metabolism; Transcription, Genetic/genetics; Amino Acid Sequence; Child Development Disorders, Pervasive/pathology; Chromatin/metabolism; Chromatin Assembly and Disassembly; Exome/genetics; Female; Germ-Line Mutation/genetics; Humans; Male; Molecular Sequence Data; Mutation, Missense/genetics; Nerve Net/metabolism; Odds Ratio
9.
PLoS One ; 9(6): e94661, 2014.
Article in English | MEDLINE | ID: mdl-24922517

ABSTRACT

BACKGROUND: Alzheimer's disease is a common debilitating dementia with known heritability, for which 20 late-onset susceptibility loci have been identified, but more remain to be discovered. This study sought to identify new susceptibility genes, using an alternative gene-wide analytical approach which tests for patterns of association within genes, in the powerful genome-wide association dataset of the International Genomics of Alzheimer's Project Consortium, comprising over 7 million genotypes from 25,580 Alzheimer's cases and 48,466 controls. PRINCIPAL FINDINGS: In addition to earlier reported genes, we detected genome-wide significant loci on chromosomes 8 (TP53INP1, p = 1.4 × 10(-6)) and 14 (IGHV1-67, p = 7.9 × 10(-8)) which indexed novel susceptibility loci. SIGNIFICANCE: The additional genes identified in this study have an array of functions previously implicated in Alzheimer's disease, including aspects of energy metabolism, protein degradation and the immune system, and add further weight to these pathways as potential therapeutic targets in Alzheimer's disease.


Subjects
Alzheimer Disease/genetics; Carrier Proteins/genetics; Heat-Shock Proteins/genetics; Case-Control Studies; Genome-Wide Association Study; Humans; Polymorphism, Single Nucleotide; Receptors, Antigen, B-Cell/genetics
10.
Bioinformatics ; 29(19): 2498-500, 2013 Oct 01.
Article in English | MEDLINE | ID: mdl-23943636

ABSTRACT

SUMMARY: We report our new DRAW+SneakPeek software for DNA-seq analysis. DNA resequencing analysis workflow (DRAW) automates the workflow of processing raw sequence reads including quality control, read alignment and variant calling on high-performance computing facilities such as Amazon elastic compute cloud. SneakPeek provides an effective interface for reviewing dozens of quality metrics reported by DRAW, so users can assess the quality of data and diagnose problems in their sequencing procedures. Both DRAW and SneakPeek are freely available under the MIT license, and are available as Amazon machine images to be used directly on Amazon cloud with minimal installation. AVAILABILITY: DRAW+SneakPeek is released under the MIT license and is available for academic and nonprofit use for free. The information about source code, Amazon machine images and instructions on how to install and run DRAW+SneakPeek locally and on Amazon elastic compute cloud is available at the National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (http://www.niagads.org/) and Wang lab Web site (http://wanglab.pcbi.upenn.edu/).


Subjects
Biometry/methods; DNA/analysis; Sequence Analysis, DNA/methods; Software Design; Internet; Programming Languages
11.
JAMA ; 309(14): 1483-92, 2013 Apr 10.
Article in English | MEDLINE | ID: mdl-23571587

ABSTRACT

IMPORTANCE: Genetic variants associated with susceptibility to late-onset Alzheimer disease are known for individuals of European ancestry, but whether the same or different variants account for the genetic risk of Alzheimer disease in African American individuals is unknown. Identification of disease-associated variants helps identify targets for genetic testing, prevention, and treatment. OBJECTIVE: To identify genetic loci associated with late-onset Alzheimer disease in African Americans. DESIGN, SETTING, AND PARTICIPANTS: The Alzheimer Disease Genetics Consortium (ADGC) assembled multiple data sets representing a total of 5896 African Americans (1968 case participants, 3928 control participants) 60 years or older that were collected between 1989 and 2011 at multiple sites. The association of Alzheimer disease with genotyped and imputed single-nucleotide polymorphisms (SNPs) was assessed in case-control and in family-based data sets. Results from individual data sets were combined to perform an inverse variance-weighted meta-analysis, first with genome-wide analyses and subsequently with gene-based tests for previously reported loci. MAIN OUTCOMES AND MEASURES: Presence of Alzheimer disease according to standardized criteria. RESULTS: Genome-wide significance in fully adjusted models (sex, age, APOE genotype, population stratification) was observed for a SNP in ABCA7 (rs115550680, allele = G; frequency, 0.09 cases and 0.06 controls; odds ratio [OR], 1.79 [95% CI, 1.47-2.12]; P = 2.2 × 10(-9)), which is in linkage disequilibrium with SNPs previously associated with Alzheimer disease in Europeans (0.8 < D' < 0.9). The effect size for the SNP in ABCA7 was comparable with that of the APOE ϵ4-determining SNP rs429358 (allele = C; frequency, 0.30 cases and 0.18 controls; OR, 2.31 [95% CI, 2.19-2.42]; P = 5.5 × 10(-47)). 
Several loci previously associated with Alzheimer disease but not reaching significance in genome-wide analyses were replicated in gene-based analyses accounting for linkage disequilibrium between markers and correcting for number of tests performed per gene (CR1, BIN1, EPHA1, CD33; 0.0005 < empirical P < .001). CONCLUSIONS AND RELEVANCE: In this meta-analysis of data from African American participants, Alzheimer disease was significantly associated with variants in ABCA7 and with other genes that have been associated with Alzheimer disease in individuals of European ancestry. Replication and functional validation of this finding is needed before this information is used in clinical settings.


Subjects
ATP-Binding Cassette Transporters/genetics; African Americans/genetics; Alzheimer Disease/ethnology; Alzheimer Disease/genetics; Apolipoprotein E4/genetics; Genome-Wide Association Study; Age of Onset; Aged; Case-Control Studies; Genetic Predisposition to Disease; Genetic Variation; Genotype; Humans; Linkage Disequilibrium; Middle Aged; Polymorphism, Single Nucleotide; Risk
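The inverse variance-weighted meta-analysis mentioned in the abstract of entry 11 can be illustrated with a minimal fixed-effect sketch. This is not the consortium's pipeline: the function name is hypothetical, the effect sizes and standard errors are invented for illustration, and the fixed-effect form is an assumption. Each data set's log odds ratio is weighted by the inverse of its variance, so more precise data sets contribute more to the pooled estimate.

```python
import math
from typing import Sequence, Tuple


def inverse_variance_meta(betas: Sequence[float], ses: Sequence[float]) -> Tuple[float, float]:
    """Fixed-effect inverse variance-weighted pooling.

    betas: per-data-set effect estimates (e.g. log odds ratios)
    ses:   their standard errors
    Returns the pooled estimate and its standard error.
    """
    if len(betas) != len(ses) or len(betas) == 0:
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / se ** 2 for se in ses]          # w_i = 1 / se_i^2
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    return pooled, math.sqrt(1.0 / total)            # se_pooled = sqrt(1 / sum w_i)


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors, not taken from the study.
    betas = [0.50, 0.70, 0.55]
    ses = [0.20, 0.25, 0.15]
    beta, se = inverse_variance_meta(betas, ses)
    low, high = beta - 1.96 * se, beta + 1.96 * se
    print(f"pooled OR = {math.exp(beta):.2f} "
          f"(95% CI {math.exp(low):.2f}-{math.exp(high):.2f})")
```

Pooling on the log scale and exponentiating at the end keeps the confidence interval symmetric around the log odds ratio, which is the usual convention for reporting odds ratios.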
12.
Neuron ; 77(2): 235-42, 2013 Jan 23.
Article in English | MEDLINE | ID: mdl-23352160

ABSTRACT

To characterize the role of rare complete human knockouts in autism spectrum disorders (ASDs), we identify genes with homozygous or compound heterozygous loss-of-function (LoF) variants (defined as nonsense and essential splice sites) from exome sequencing of 933 cases and 869 controls. We identify a 2-fold increase in complete knockouts of autosomal genes with low rates of LoF variation (≤ 5% frequency) in cases and estimate a 3% contribution to ASD risk by these events, confirming this observation in an independent set of 563 probands and 4,605 controls. Outside the pseudoautosomal regions on the X chromosome, we similarly observe a significant 1.5-fold increase in rare hemizygous knockouts in males, contributing to another 2% of ASDs in males. Taken together, these results provide compelling evidence that rare autosomal and X chromosome complete gene knockouts are important inherited risk factors for ASD.


Subjects
Child Development Disorders, Pervasive/diagnosis; Child Development Disorders, Pervasive/genetics; Demography/methods; Gene Deletion; Loss of Heterozygosity/genetics; Case-Control Studies; Child Development Disorders, Pervasive/epidemiology; Child, Preschool; Chromosomes, Human, X/genetics; Female; Genetic Variation/genetics; Homozygote; Humans; Linkage Disequilibrium/genetics; Male; Risk Factors
13.
Curr Protoc Hum Genet ; 79: Unit 1.27., 2013 Oct 18.
Article in English | MEDLINE | ID: mdl-24510649

ABSTRACT

High-density SNP genotyping technology provides a low-cost, effective tool for conducting Genome Wide Association (GWA) studies. The wide adoption of GWA studies has indeed led to discoveries of disease- or trait-associated SNPs, some of which were subsequently shown to be causal. However, the nearly universal shortcoming of GWA studies, missing heritability, has prompted great interest in searching for other types of genetic variation, such as copy number variation (CNV). Certain CNVs have been reported to alter disease susceptibility. Algorithms and tools have been developed to identify CNVs using SNP array hybridization intensity data. Such an approach provides an additional source of data at almost no extra cost. In this unit, we demonstrate the steps for calling CNVs from Illumina SNP array data using PennCNV and performing association analysis using R and PLINK.


Subjects
DNA Copy Number Variations; Genotyping Techniques/methods; High-Throughput Nucleotide Sequencing/methods; Oligonucleotide Array Sequence Analysis/methods; Algorithms; Gene Frequency; Genome-Wide Association Study/methods; Humans; Polymorphism, Single Nucleotide
14.
Nature ; 485(7397): 242-5, 2012 Apr 04.
Article in English | MEDLINE | ID: mdl-22495311

ABSTRACT

Autism spectrum disorders (ASD) are believed to have genetic and environmental origins, yet in only a modest fraction of individuals can specific causes be identified. To identify further genetic risk factors, here we assess the role of de novo mutations in ASD by sequencing the exomes of ASD cases and their parents (n = 175 trios). Fewer than half of the cases (46.3%) carry a missense or nonsense de novo variant, and the overall rate of mutation is only modestly higher than the expected rate. In contrast, the proteins encoded by genes that harboured de novo missense or nonsense mutations showed a higher degree of connectivity among themselves and to previous ASD genes as indexed by protein-protein interaction screens. The small increase in the rate of de novo events, when taken together with the protein interaction results, is consistent with an important but limited role for de novo point mutations in ASD, similar to that documented for de novo copy number variants. Genetic models incorporating these data indicate that most of the observed de novo events are unconnected to ASD; those that do confer risk are distributed across many genes and are incompletely penetrant (that is, not necessarily sufficient for disease). Our results support polygenic models in which spontaneous coding mutations in any of a large number of genes increase risk by 5- to 20-fold. Despite the challenge posed by such models, results from de novo events and a large parallel case-control study provide strong evidence in favour of CHD8 and KATNAL2 as genuine autism risk factors.


Subjects
Autistic Disorder/genetics; DNA-Binding Proteins/genetics; Exons/genetics; Genetic Predisposition to Disease/genetics; Mutation/genetics; Transcription Factors/genetics; Case-Control Studies; Exome/genetics; Family Health; Humans; Models, Genetic; Multifactorial Inheritance/genetics; Phenotype; Poisson Distribution; Protein Interaction Maps
15.
Int J Biol Sci ; 8(3): 344-52, 2012.
Article in English | MEDLINE | ID: mdl-22393306

ABSTRACT

Most eukaryotic genes are interrupted by introns that need to be removed from pre-mRNAs before they can perform their function. This is done by a complex machinery called the spliceosome. Many eukaryotes possess two separate spliceosomal systems that process separate sets of introns. The major (U2) spliceosome removes the majority of introns, while a minute fraction of the intron repertoire is processed by the minor (U12) spliceosome. These two populations of introns are called U2-type and U12-type, respectively. The latter fall into two subtypes based on the terminal dinucleotides. The minor spliceosomal system has been lost independently in some lineages, while in others a few U12-type introns persist. We investigated twenty insect genomes in order to better understand the evolutionary dynamics of U12-type introns. Our work confirms a dramatic drop in U12-type introns in Diptera, leaving these genomes with just a handful of cases. This is mostly the result of intron deletion, but in a number of dipteran cases minor-type introns were also switched to the major type. Insect genes that harbor U12-type introns belong to several functional categories, among which proteins binding ions and nucleic acids are enriched, and these few categories are also overrepresented among the genes that preserved minor-type introns in Diptera.


Subjects
Diptera/genetics; Evolution, Molecular; Genes, Insect/genetics; Introns/genetics; RNA, Small Nuclear/genetics; Spliceosomes/genetics; Animals; Bees/genetics; Bombyx/genetics; Culicidae/genetics; Drosophila/genetics; Insect Proteins/chemistry; Insect Proteins/genetics; Pediculus/genetics; Phylogeny; Sequence Homology, Amino Acid; Species Specificity; Tribolium/genetics
16.
BMC Evol Biol ; 10: 47, 2010 Feb 17.
Article in English | MEDLINE | ID: mdl-20163699

ABSTRACT

BACKGROUND: Many multicellular eukaryotes have two types of spliceosomes for the removal of introns from messenger RNA precursors. The major (U2) spliceosome processes the vast majority of introns, referred to as U2-type introns, while the minor (U12) spliceosome removes a small fraction (less than 0.5%) of introns, referred to as U12-type introns. U12-type introns have distinct sequence elements and usually occur together in genes with U2-type introns. A phylogenetic distribution of U12-type introns shows that the minor splicing pathway appeared very early in eukaryotic evolution and has been lost repeatedly. RESULTS: We have investigated the evolution of U12-type introns among eighteen metazoan genomes by analyzing orthologous U12-type intron clusters. Examination of gain, loss, and type switching shows that intron type is remarkably conserved among vertebrates. Among 180 intron clusters, only eight show intron loss in any vertebrate species and only five show conversion between the U12 and the U2-type. Although there are only nineteen U12-type introns in Drosophila melanogaster, we found one case of U2 to U12-type conversion, apparently mediated by the activation of cryptic U12 splice sites early in the dipteran lineage. Overall, loss of U12-type introns is more common than conversion to U2-type and the U12 to U2 conversion occurs more frequently among introns of the GT-AG subtype than among introns of the AT-AC subtype. We also found support for natural U12-type introns with non-canonical terminal dinucleotides (CT-AC, GG-AG, and GA-AG) that have not been previously reported. CONCLUSIONS: Although complete loss of the U12-type spliceosome has occurred repeatedly, U12 introns are extremely stable in some taxa, including eutheria. Loss of U12 introns or the genes containing them is more common than conversion to the U2-type. The degeneracy of U12-type terminal dinucleotides among natural U12-type introns is higher than previously thought.


Subjects
RNA, Small Nuclear/genetics; Spliceosomes/genetics; Animals; Arabidopsis/genetics; Evolution, Molecular; Humans; Introns
17.
BMC Evol Biol ; 7: 193, 2007 Oct 16.
Article in English | MEDLINE | ID: mdl-17939861

ABSTRACT

BACKGROUND: Between five and fourteen per cent of genes in vertebrate genomes overlap, sharing some intronic and/or exonic sequence. It has been observed that the majority of these overlaps are not conserved among vertebrate lineages. Although several mechanisms have been proposed to explain the origin of gene overlaps, the evolutionary basis of this phenomenon is still not well understood. Here, we present the results of a comparative analysis of several vertebrate genomes. The purpose of this study was to examine overlapping genes in the context of their evolution and the mechanisms leading to their origin. RESULTS: Based on the presence and arrangement of orthologs of human overlapping genes in rodent and fish genomes, we developed 15 theoretical scenarios for the evolution of overlapping genes. Analysis of these theoretical scenarios and close examination of genomic sequences revealed new mechanisms leading to the evolution of overlaps and confirmed that many of the vertebrate gene overlaps are not conserved. This study also demonstrates that repetitive elements contribute to the origin of overlapping genes and, for the first time, that evolutionary events could lead to the loss of an ancient overlap. CONCLUSION: The birth, and most probably also the death, of gene overlaps occurred over the entire span of vertebrate evolution, and there was no rapid origin or 'big bang' in the course of overlapping gene evolution. The major forces in the origin of gene overlaps are transposition and exaptation. Our results also imply that the origin of overlapping genes is not a matter of saving space and contracting genome size.

18.
RNA ; 13(1): 5-14, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17095541

ABSTRACT

The removal of introns from the primary transcripts of protein-coding genes is accomplished by the spliceosome, a large macromolecular complex of which small nuclear RNAs (snRNAs) are crucial components. Following the recent sequencing of the honeybee (Apis mellifera) genome, we used various computational methods, ranging from sequence similarity search to RNA secondary structure prediction, to search for putative snRNA genes (including their promoters) and to examine their pattern of conservation among 11 available insect genomes (A. mellifera, Tribolium castaneum, Bombyx mori, Anopheles gambiae, Aedes aegypti, and six Drosophila species). We identified candidates for all nine spliceosomal snRNA genes in all the analyzed genomes. All the species contain a similar number of snRNA genes, with the exception of A. aegypti, whose genome contains more U1, U2, and U5 genes, and A. mellifera, whose genome contains fewer U2 and U5 genes. We found that snRNA genes are generally more closely related to homologs within the same genus than to those in other genera. Promoter regions for all spliceosomal snRNA genes within each insect species share similar sequence motifs that are likely to correspond to the PSEA (proximal sequence element A), the binding site for snRNA activating protein complex, but these promoter elements vary in sequence among the five insect families surveyed here. In contrast to the other insect species investigated, Dipteran genomes are characterized by a rapid evolution (or loss) of components of the U12 spliceosome and a striking loss of U12-type introns.


Subjects
Computational Biology , Diptera/genetics , Insect Genes/genetics , Insect Genome/genetics , RNA Splicing , Small Nuclear RNA/chemistry , Animals , Base Sequence , Bees/genetics , Diptera/classification , Molecular Evolution , Molecular Sequence Data , Nucleic Acid Conformation , Phylogeny , Genetic Promoter Regions , Small Nuclear RNA/genetics , RNA Sequence Analysis , Spliceosomes/metabolism
19.
Genetics ; 172(3): 1711-26, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16387879

ABSTRACT

Although mutation, genetic drift, and natural selection are well established as determinants of genome evolution, the importance (frequency and magnitude) of parameter fluctuations in molecular evolution is less understood. DNA sequence comparisons among closely related species allow specific substitutions to be assigned to lineages on a phylogenetic tree. In this study, we compare patterns of codon usage and protein evolution in 22 genes (>11,000 codons) among Drosophila melanogaster and five relatives within the D. melanogaster subgroup. We assign changes to eight lineages using a maximum-likelihood approach to infer ancestral states. Uncertainty in ancestral reconstructions is taken into account, at least to some extent, by weighting reconstructions by their posterior probabilities. Four of the eight lineages show potentially genomewide departures from equilibrium synonymous codon usage; three are decreasing and one is increasing in major codon usage. Several of these departures are consistent with lineage-specific changes in selection intensity (selection coefficients scaled to effective population size) at silent sites. Intron base composition and rates and patterns of protein evolution are also heterogeneous among these lineages. The magnitude of forces governing silent, intron, and protein evolution appears to have varied frequently, and in a lineage-specific manner, within the D. melanogaster subgroup.


Subjects
Drosophila melanogaster/genetics , Molecular Evolution , Phylogeny , Amino Acid Substitution/genetics , Animals , Base Composition , Codon , Gene Frequency , Introns , Molecular Sequence Data , Genetic Polymorphism , Species Specificity
20.
Comput Biol Chem ; 29(1): 1-12, 2005 Feb.
Article in English | MEDLINE | ID: mdl-15680581

ABSTRACT

Overlapping genes in mammalian genomes are an unexpected phenomenon, even though hundreds of pairs of overlapping protein-coding genes have been reported so far. Overlapping genes can be divided into different categories based on the direction of transcription as well as on the sequence segments shared between overlapping coding regions. The biological functions of natural antisense transcripts and their involvement in physiological processes and gene regulation in living organisms are not fully understood. A number of documented examples indicate that they may exert control at various levels of gene expression, such as transcription, mRNA processing, splicing, stability, transport, and translation. Similarly, the evolutionary origin of such genes is not known; existing hypotheses can explain only selected cases of mammalian gene overlaps, which could originate as a result of rearrangements, overprinting, and/or adoption of signals in the neighboring gene locus.


Subjects
Gene Homology , Genome , Antisense RNA/genetics , Animals , Disease/etiology , Molecular Evolution , Humans , Mammals , Genetic Transcription