Results 1 - 20 of 27
1.
FEBS J ; 288(7): 2311-2331, 2021 04.
Article in English | MEDLINE | ID: mdl-33006196

ABSTRACT

The fetal inflammatory response (FIR) increases the risk of perinatal brain injury, particularly in extremely low gestational age newborns (ELGANs, < 28 weeks of gestation). One of the mechanisms contributing to such a risk is a postnatal intermittent or sustained systemic inflammation (ISSI) following FIR. The link between prenatal and postnatal systemic inflammation is supported by the presence of well-established inflammatory biomarkers in the umbilical cord and peripheral blood. However, the extent of molecular changes contributing to this association is unknown. Using RNA sequencing and mass spectrometry proteomics, we profiled the transcriptome and proteome of archived neonatal dried blood spot (DBS) specimens from 21 ELGANs. Comparing FIR-affected and unaffected ELGANs, we identified 782 gene and 27 protein expression changes of 50% magnitude or more, and an experiment-wide significance level below 5% false discovery rate. These expression changes confirm the robust postnatal activation of the innate immune system in FIR-affected ELGANs and reveal for the first time an impairment of their adaptive immunity. In turn, the altered pathways provide clues about the molecular mechanisms triggering ISSI after FIR, and the onset of perinatal brain injury. DATABASES: EGAS00001003635 (EGA); PXD011626 (PRIDE).


Subjects
Fetus/metabolism , Inflammation/genetics , Proteome/genetics , Transcriptome/genetics , Biomarkers/metabolism , Dried Blood Spot Testing , Female , Gene Expression Regulation/genetics , Genome, Human/genetics , Gestational Age , Humans , Immune System/metabolism , Infant, Newborn , Inflammation/immunology , Mass Spectrometry , Pregnancy , Sequence Analysis, RNA
2.
J Med Genet ; 56(7): 481-490, 2019 07.
Article in English | MEDLINE | ID: mdl-30894412

ABSTRACT

BACKGROUND: Mapping the genetic component of molecular mechanisms responsible for the reduced penetrance (RP) of rare disorders constitutes one of the most challenging problems in human genetics. Heritable pulmonary arterial hypertension (PAH) is one such disorder characterised by rare mutations mostly occurring in the bone morphogenetic protein receptor type 2 (BMPR2) gene and a wide heterogeneity of penetrance modifier mechanisms. Here, we analyse 32 genotyped individuals from a large Iberian family of 65 members, including 22 carriers of the pathogenic BMPR2 mutation c.1472G>A (p.Arg491Gln), 8 of them diagnosed with PAH by right-heart catheterisation, leading to an RP rate of 36.4%. METHODS: We performed a linkage analysis on the genotyping data to search for genetic modifiers of penetrance. Using functional genomics data, we characterised the candidate region identified by linkage analysis. We also predicted the haplotype segregation within the family. RESULTS: We identified a candidate chromosome region in 2q24.3, 38 Mb upstream from BMPR2, with significant linkage (LOD=4.09) under a PAH susceptibility model. This region contains common variants associated with vascular aetiology and shows functional evidence that the putative genetic modifier is located in the upstream distal promoter of the fidgetin (FIGN) gene. CONCLUSION: Our results suggest that the genetic modifier acts through FIGN transcriptional regulation, whose expression variability would contribute to modulating heritable PAH. This finding may help to advance our understanding of RP in PAH across families sharing the p.Arg491Gln pathogenic mutation in BMPR2.
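The penetrance figure quoted in this abstract follows from the two counts it reports (8 diagnosed individuals among 22 mutation carriers). A minimal Python sketch of that arithmetic is given here; the function names are illustrative and do not come from the paper. It also converts the reported LOD score to odds, assuming only the standard definition of a LOD score as the base-10 logarithm of the odds in favour of linkage.

```python
def penetrance(affected_carriers: int, carriers: int) -> float:
    """Fraction of mutation carriers who show the phenotype."""
    if carriers <= 0:
        raise ValueError("carriers must be positive")
    return affected_carriers / carriers


def lod_to_odds(lod: float) -> float:
    """A LOD score is log10 of the odds in favour of linkage."""
    return 10.0 ** lod


# Counts and LOD score taken from the abstract above.
rate = penetrance(8, 22)
print(f"penetrance among carriers: {rate:.1%}")
print(f"odds implied by LOD 4.09: about {lod_to_odds(4.09):,.0f} to 1")
```

The first line of output reproduces the 36.4% reported in the abstract.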


Subjects
ATPases Associated with Diverse Cellular Activities/genetics , Familial Primary Pulmonary Hypertension/diagnosis , Familial Primary Pulmonary Hypertension/genetics , Genetic Linkage , Genetic Predisposition to Disease , Microtubule-Associated Proteins/genetics , Penetrance , Alleles , Amino Acid Substitution , Blood Pressure , Chromosomes, Human, Pair 2 , Family , Genetic Association Studies , Genome-Wide Association Study , Genotype , Hemodynamics , Heterozygote , Humans , Linkage Disequilibrium , Mutation , Pedigree , Phenotype , Polymorphism, Single Nucleotide
3.
Arch. bronconeumol. (Ed. impr.) ; 55(2): 93-99, Feb. 2019. graph
Article in Spanish | IBECS | ID: ibc-177337

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is an entity with a heterogeneous presentation. For this reason, attempts have been made to characterize different phenotypes and endotypes to enable a more individualized approach. The aim of the Biomarkers in COPD (BIOMEPOC) project is to identify useful biomarkers in blood to improve the characterization of patients. Clinical data and blood samples from a group of patients and healthy controls will be analyzed. The project will consist of an exploration phase and a validation phase. Analytical parameters in blood will be determined using standard techniques and certain ‘omics’ (transcriptomics, proteomics, and metabolomics). The former will be hypothesis-driven, whereas the latter will be exploratory. Finally, a multilevel analysis will be conducted. Currently, 269 patients and 83 controls have been recruited, and sample processing is beginning. Our hope is to use the results to identify new biomarkers that, alone or combined, will allow a better characterization of patients.


Subjects
Humans , Male , Female , Middle Aged , Aged , Pulmonary Disease, Chronic Obstructive/diagnosis , Biomarkers/analysis , Emphysema/diagnostic imaging , Multilevel Analysis , Prospective Studies , Healthy Volunteers , Helsinki Declaration
4.
Arch Bronconeumol (Engl Ed) ; 55(2): 93-99, 2019 Feb.
Article in English, Spanish | MEDLINE | ID: mdl-30343952

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is an entity with a heterogeneous presentation. For this reason, attempts have been made to characterize different phenotypes and endotypes to enable a more individualized approach. The aim of the Biomarkers in COPD (BIOMEPOC) project is to identify useful biomarkers in blood to improve the characterization of patients. Clinical data and blood samples from a group of patients and healthy controls will be analyzed. The project will consist of an exploration phase and a validation phase. Analytical parameters in blood will be determined using standard techniques and certain 'omics' (transcriptomics, proteomics, and metabolomics). The former will be hypothesis-driven, whereas the latter will be exploratory. Finally, a multilevel analysis will be conducted. Currently, 269 patients and 83 controls have been recruited, and sample processing is beginning. Our hope is to use the results to identify new biomarkers that, alone or combined, will allow a better characterization of patients.


Subjects
Biomarkers/blood , Metabolomics , Proteomics , Pulmonary Disease, Chronic Obstructive/blood , Transcriptome , Aged , Case-Control Studies , Cross-Sectional Studies , Female , Humans , Male , Prospective Studies
5.
Bioinformatics ; 34(18): 3208-3210, 2018 09 15.
Article in English | MEDLINE | ID: mdl-29718111

ABSTRACT

Summary: Genomewide position-specific scores, such as those estimating conservation, constraint, fitness or mutation tolerance, are ubiquitous in current genome analyses. The diversity of sources and formats of these scores, as well as their size, increase the burden to use them. We present GenomicScores, a Bioconductor package that provides efficient storage and seamless access of genomewide position-specific scores from R, facilitating their use in genome analysis workflows. Availability and implementation: GenomicScores is implemented in R and available at https://bioconductor.org/packages/GenomicScores under the open source 'Artistic-2.0' license. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Genome , Software , Genomics
6.
Pediatr Res ; 79(3): 473-81, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26539667

ABSTRACT

BACKGROUND: The fetal inflammatory response (FIR) in placental membranes to an intrauterine infection often precedes premature birth, raising neonatal mortality and morbidity. However, the precise molecular events behind FIR still remain largely unknown, and little has been investigated at the gene expression level. METHODS: We collected publicly available microarray expression data profiling umbilical cord (UC) tissue derived from the cohort of extremely low gestational age newborns (ELGANs) and interrogated them for differentially expressed (DE) genes between FIR-affected and non-FIR-affected ELGANs. RESULTS: We found a broad and complex FIR UC gene expression signature, changing up to 19% (3,896/20,155) of all human genes at 1% false discovery rate. Significant changes of a minimum 50% magnitude (1,097/3,896) affect the upregulation of many inflammatory pathways and molecules, such as cytokines, toll-like receptors, and calgranulins. Remarkably, they also include the downregulation of neurodevelopmental pathways and genes, such as Fragile-X mental retardation 1 (FMR1), contactin 1 (CNTN1), and adenomatous polyposis coli (APC). CONCLUSION: The FIR expression signature in UC tissue contains molecular clues about signaling pathways that trigger FIR, and it is consistent with an acute inflammatory response by fetal innate and adaptive immune systems, which participate in the pathogenesis of neonatal brain damage.
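The proportions in this abstract, and the two-step filter it describes (a false discovery rate cutoff followed by a minimum magnitude of change), can be illustrated with a short Python sketch. Two assumptions are made here and are not confirmed by the abstract: that the FDR is controlled with the Benjamini-Hochberg procedure, and that "50% magnitude" means a fold change of at least 1.5 in either direction. The toy p-values and log2 fold changes are invented for illustration only.

```python
import math


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_de(log2_fc, adj_p, fdr=0.01, min_fold=1.5):
    """Assumed filter: adjusted p below the FDR and fold change >= min_fold."""
    return adj_p < fdr and abs(log2_fc) >= math.log2(min_fold)


# Proportions computed from the counts reported in the abstract.
print(f"genes changing at 1% FDR: {3896 / 20155:.1%}")
print(f"of those, changing by 50% or more: {1097 / 3896:.1%}")

# Hypothetical toy data, not taken from the study.
toy_p = [0.0001, 0.004, 0.03, 0.2]
toy_log2_fc = [1.2, -0.3, 0.9, 2.0]
toy_adj = bh_adjust(toy_p)
toy_calls = [is_de(fc, q) for fc, q in zip(toy_log2_fc, toy_adj)]
print(toy_calls)
```

In the toy data only the first gene passes both filters: the second is significant but changes too little, and the last two change strongly but are not significant after adjustment.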


Subjects
Gene Expression Profiling , Gene Expression Regulation , Infant, Extremely Premature , Inflammation/metabolism , Umbilical Cord/metabolism , Adenomatous Polyposis Coli Protein/genetics , Cohort Studies , Contactin 1/genetics , False Positive Reactions , Female , Fragile X Mental Retardation Protein/genetics , Gestational Age , Humans , Immune System , Infant, Newborn , Male , Oligonucleotide Array Sequence Analysis , Signal Transduction
7.
Brief Bioinform ; 17(4): 603-15, 2016 07.
Article in English | MEDLINE | ID: mdl-26463000

ABSTRACT

Molecular interrogation of a biological sample through DNA sequencing, RNA and microRNA profiling, proteomics and other assays, has the potential to provide a systems level approach to predicting treatment response and disease progression, and to developing precision therapies. Large publicly funded projects have generated extensive and freely available multi-assay data resources; however, bioinformatic and statistical methods for the analysis of such experiments are still nascent. We review multi-assay genomic data resources in the areas of clinical oncology, pharmacogenomics and other perturbation experiments, population genomics and regulatory genomics and other areas, and tools for data acquisition. Finally, we review bioinformatic tools that are explicitly geared toward integrative genomic data visualization and analysis. This review provides starting points for accessing publicly available data and tools to support development of needed integrative methods.


Subjects
Genomics , Computational Biology , MicroRNAs , Sequence Analysis, DNA
8.
Nat Commun ; 6: 10105, 2015 Dec 16.
Article in English | MEDLINE | ID: mdl-26670742

ABSTRACT

What happens to gene expression when you add new links to a gene regulatory network? To answer this question, we profile 85 network rewirings in E. coli. Here we report that concerted patterns of differential expression propagate from reconnected hub genes. The rewirings link promoter regions to different transcription factor and σ-factor genes, resulting in perturbations that span four orders of magnitude, changing up to ~70% of the transcriptome. Importantly, factor connectivity and promoter activity both associate with perturbation size. Perturbations from related rewirings have more similar transcription profiles and a statistical analysis reveals ~20 underlying states of the system, associating particular gene groups with rewiring constructs. We examine two large clusters (ribosomal and flagellar genes) in detail. These represent alternative global outcomes from different rewirings because of antagonism between these major cell states. This data set of systematically related perturbations enables reverse engineering and discovery of underlying network interactions.


Subjects
Escherichia coli Proteins/genetics , Escherichia coli/genetics , Gene Regulatory Networks , Escherichia coli/metabolism , Escherichia coli Proteins/metabolism , Gene Expression Regulation, Bacterial , Promoter Regions, Genetic
9.
Islets ; 7(4): e1118195, 2015.
Article in English | MEDLINE | ID: mdl-26742564

ABSTRACT

The disease mechanisms underlying type 2 diabetes (T2D) remain poorly defined. Here we aimed to explore the pathophysiology of T2D by analyzing gene co-expression networks in human islets. Using partial correlation networks we identified a group of co-expressed genes ('module') including F2RL2 that was associated with glycated hemoglobin. F2RL2 encodes protease-activated receptor-3 (PAR3), a G-protein-coupled receptor (GPCR). PAR3 is cleaved by thrombin, which exposes a 6-amino acid sequence that acts as a 'tethered ligand' to regulate cellular signaling. We have characterized the effect of PAR3 activation on insulin secretion by static insulin secretion measurements, capacitance measurements, studies of diabetic animal models and patient samples. We demonstrate that thrombin stimulates insulin secretion, an effect that was prevented by an antibody that blocks the thrombin cleavage site of PAR3. Treatment with a peptide corresponding to the PAR3 tethered ligand stimulated islet insulin secretion and single β-cell exocytosis by a mechanism that involves activation of phospholipase C and Ca²⁺ release from intracellular stores. Moreover, we observed that the expression of tissue factor, which regulates thrombin generation, was increased in human islets from T2D donors and associated with enhanced β-cell exocytosis. Finally, we demonstrate that thrombin generation potential in patients with T2D was associated with increased fasting insulin and insulinogenic index. The findings provide a previously unrecognized link between hypercoagulability and hyperinsulinemia and suggest that reducing thrombin activity or blocking PAR3 cleavage could potentially counteract the exaggerated insulin secretion that drives insulin resistance and β-cell exhaustion in T2D.


Subjects
Insulin-Secreting Cells/drug effects , Insulin-Secreting Cells/metabolism , Insulin/metabolism , Receptor, PAR-1/physiology , Thrombin/pharmacology , Cells, Cultured , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 2/pathology , Exocytosis/drug effects , Exocytosis/genetics , Gene Expression Profiling , Humans , Insulin Secretion , Insulin-Secreting Cells/pathology , Microarray Analysis , Receptor, PAR-1/metabolism , Up-Regulation/drug effects
10.
Genetics ; 198(4): 1377-93, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25271303

ABSTRACT

Expression quantitative trait loci (eQTL) mapping constitutes a challenging problem due to, among other reasons, the high-dimensional multivariate nature of gene-expression traits. Next to the expression heterogeneity produced by confounding factors and other sources of unwanted variation, indirect effects spread throughout genes as a result of genetic, molecular, and environmental perturbations. From a multivariate perspective one would like to adjust for the effect of all of these factors to end up with a network of direct associations connecting the path from genotype to phenotype. In this article we approach this challenge with mixed graphical Markov models, higher-order conditional independences, and q-order correlation graphs. These models show that additive genetic effects propagate through the network as a function of gene-gene correlations. Our estimation of the eQTL network underlying a well-studied yeast data set leads to a sparse structure with more direct genetic and regulatory associations that enable a straightforward comparison of the genetic control of gene expression across chromosomes. Interestingly, it also reveals that eQTLs explain most of the expression variability of network hub genes.


Subjects
Chromosome Mapping , Gene Regulatory Networks , Markov Chains , Models, Genetic , Quantitative Trait Loci , Algorithms , Crosses, Genetic , Gene Expression Regulation, Fungal , Genomics/methods , Reproducibility of Results , Software , Yeasts/genetics
11.
BMC Bioinformatics ; 14: 254, 2013 Aug 21.
Article in English | MEDLINE | ID: mdl-23965047

ABSTRACT

BACKGROUND: High-throughput RNA sequencing (RNA-seq) offers unprecedented power to capture the real dynamics of gene expression. Experimental designs with extensive biological replication present a unique opportunity to exploit this feature and distinguish expression profiles with higher resolution. RNA-seq data analysis methods so far have been mostly applied to data sets with few replicates and their default settings try to provide the best performance under this constraint. These methods are based on two well-known count data distributions: the Poisson and the negative binomial. The way to properly calibrate them with large RNA-seq data sets is not trivial for the non-expert bioinformatics user. RESULTS: Here we show that expression profiles produced by extensively-replicated RNA-seq experiments lead to a rich diversity of count data distributions beyond the Poisson and the negative binomial, such as Poisson-Inverse Gaussian or Pólya-Aeppli, which can be captured by a more general family of count data distributions called the Poisson-Tweedie. The flexibility of the Poisson-Tweedie family enables a direct fitting of emerging features of large expression profiles, such as heavy-tails or zero-inflation, without the need to alter a single configuration parameter. We provide a software package for R called tweeDEseq implementing a new test for differential expression based on the Poisson-Tweedie family. Using simulations on synthetic and real RNA-seq data we show that tweeDEseq yields P-values that are equally or more accurate than competing methods under different configuration parameters. By surveying the tiny fraction of sex-specific gene expression changes in human lymphoblastoid cell lines, we also show that tweeDEseq accurately detects differentially expressed genes in a real large RNA-seq data set with improved performance and reproducibility over the previously compared methodologies. Finally, we compared the results with those obtained from microarrays in order to check for reproducibility. CONCLUSIONS: RNA-seq data with many replicates leads to a handful of count data distributions which can be accurately estimated with the statistical model illustrated in this paper. This method provides a better fit to the underlying biological variability; this may be critical when comparing groups of RNA-seq samples with markedly different count data distributions. The tweeDEseq package forms part of the Bioconductor project and it is available for download at http://www.bioconductor.org.
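The contrast this abstract draws between the Poisson and the negative binomial can be made concrete with a small simulation. The Python sketch below is not the tweeDEseq method and fits no Poisson-Tweedie model; it only shows, under assumed parameter values (mean 10, size 2, 5000 replicates, all chosen for illustration), how the variance-to-mean ratio separates Poisson counts from overdispersed counts drawn as a gamma-mixed Poisson, which is one standard construction of the negative binomial.

```python
import math
import random
import statistics


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_gamma_poisson(rng, mean, size):
    """Negative binomial count drawn as a gamma-mixed Poisson."""
    lam = rng.gammavariate(size, mean / size)  # arguments: shape, scale
    return sample_poisson(rng, lam)


def dispersion_index(counts):
    """Variance-to-mean ratio; close to 1 for Poisson counts."""
    return statistics.variance(counts) / statistics.mean(counts)


rng = random.Random(1)  # fixed seed so the run is repeatable
n, mean = 5000, 10.0    # illustrative values, not from the paper
poisson_counts = [sample_poisson(rng, mean) for _ in range(n)]
nb_counts = [sample_gamma_poisson(rng, mean, size=2.0) for _ in range(n)]

print(f"Poisson dispersion index: {dispersion_index(poisson_counts):.2f}")
print(f"gamma-Poisson dispersion index: {dispersion_index(nb_counts):.2f}")
```

With these settings the theoretical ratio is 1 for the Poisson and 1 + mean/size = 6 for the gamma-Poisson mixture, so a single-parameter Poisson model would badly understate the variability of the second sample.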


Subjects
Computational Biology/methods , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , High-Throughput Nucleotide Sequencing , Humans , Models, Statistical , Reproducibility of Results , Software
12.
BMC Bioinformatics ; 14: 7, 2013 Jan 16.
Article in English | MEDLINE | ID: mdl-23323831

ABSTRACT

BACKGROUND: Gene set enrichment (GSE) analysis is a popular framework for condensing information from gene expression profiles into a pathway or signature summary. The strengths of this approach over single gene analysis include noise and dimension reduction, as well as greater biological interpretability. As molecular profiling experiments move beyond simple case-control studies, robust and flexible GSE methodologies are needed that can model pathway activity within highly heterogeneous data sets. RESULTS: To address this challenge, we introduce Gene Set Variation Analysis (GSVA), a GSE method that estimates variation of pathway activity over a sample population in an unsupervised manner. We demonstrate the robustness of GSVA in a comparison with current state of the art sample-wise enrichment methods. Further, we provide examples of its utility in differential pathway activity and survival analysis. Lastly, we show how GSVA works analogously with data from both microarray and RNA-seq experiments. CONCLUSIONS: GSVA provides increased power to detect subtle pathway activity changes over a sample population in comparison to corresponding methods. While GSE methods are generally regarded as end points of a bioinformatic analysis, GSVA constitutes a starting point to build pathway-centric models of biology. Moreover, GSVA contributes to the current need of GSE methods for RNA-seq data. GSVA is an open source software package for R which forms part of the Bioconductor project and can be downloaded at http://www.bioconductor.org.


Subjects
Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, RNA/methods , Software , Analysis of Variance , Female , Genetic Variation , Humans , Leukemia, Biphenotypic, Acute/genetics , Leukemia, Biphenotypic, Acute/metabolism , Ovarian Neoplasms/genetics , Ovarian Neoplasms/metabolism , Ovarian Neoplasms/mortality , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Precursor Cell Lymphoblastic Leukemia-Lymphoma/metabolism , Statistics, Nonparametric , Survival Analysis
13.
Methods Mol Biol ; 802: 215-33, 2012.
Article in English | MEDLINE | ID: mdl-22130883

ABSTRACT

Regulatory networks inferred from microarray data sets provide an estimated blueprint of the functional interactions taking place under the assayed experimental conditions. In each of these experiments, the gene expression pathway exerts a finely tuned control simultaneously over all genes relevant to the cellular state. This renders most pairs of those genes significantly correlated, and therefore, the challenge faced by every method that aims at inferring a molecular regulatory network from microarray data, lies in distinguishing direct from indirect interactions. A straightforward solution to this problem would be to move directly from bivariate to multivariate statistical approaches. However, the daunting dimension of typical microarray data sets, with a number of genes p several orders of magnitude larger than the number of samples n, precludes the application of standard multivariate techniques and confronts the biologist with sophisticated procedures that address this situation. We have introduced a new way to approach this problem in an intuitive manner, based on limited-order partial correlations, and in this chapter we illustrate this method through the R package qpgraph, which forms part of the Bioconductor project and is available at its Web site.
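The distinction between direct and indirect interactions that this abstract describes can be sketched with the simplest limited-order case, a first-order partial correlation. The Python example below is not the qpgraph algorithm and does not use its API; it is a toy simulation under assumed values (a chain x → z → y with unit-variance Gaussian noise, 2000 samples) showing that x and y are clearly correlated, while their correlation conditional on the intermediate variable z is close to zero.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000                # illustrative sample size
x = [rng.gauss(0, 1) for _ in range(n)]
z = [xi + rng.gauss(0, 1) for xi in x]  # z depends directly on x
y = [zi + rng.gauss(0, 1) for zi in z]  # y depends on x only through z

print(f"marginal correlation of x and y: {pearson(x, y):.2f}")
print(f"partial correlation given z:     {partial_corr(x, y, z):.2f}")
```

A bivariate analysis would draw an edge between x and y here, whereas conditioning on z removes it, which is the kind of indirect association the chapter aims to filter out.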


Subjects
Gene Expression Profiling/methods , Gene Regulatory Networks , Oligonucleotide Array Sequence Analysis/methods , Software , Escherichia coli/genetics , Internet , Molecular Sequence Annotation/methods , Reproducibility of Results
14.
RNA ; 17(3): 453-68, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21233220

ABSTRACT

In Drosophila melanogaster, female-specific expression of Sex-lethal (SXL) and Transformer (TRA) proteins controls sex-specific alternative splicing and/or translation of a handful of regulatory genes responsible for sexual differentiation and behavior. Recent findings in 2009 by Telonis-Scott et al. document widespread sex-biased alternative splicing in fruitflies, including instances of tissue-restricted sex-specific splicing. Here we report results arguing that some of these novel sex-specific splicing events are regulated by mechanisms distinct from those established by female-specific expression of SXL and TRA. Bioinformatic analysis of SXL/TRA binding sites, experimental analysis of sex-specific splicing in S2 and Kc cell lines and of the effects of SXL knockdown in Kc cells indicate that SXL-dependent and SXL-independent regulatory mechanisms coexist within the same cell. Additional determinants of sex-specific splicing can be provided by sex-specific differences in the expression of RNA binding proteins, including Hrp40/Squid. We report that sex-specific alternative splicing of the gene hrp40/squid leads to sex-specific differences in the levels of this hnRNP protein. The significant overlap between sex-regulated alternative splicing changes and those induced by knockdown of hrp40/squid and the presence of related sequence motifs enriched near subsets of Hrp40/Squid-regulated and sex-regulated splice sites indicate that this protein contributes to sex-specific splicing regulation. A significant fraction of sex-specific splicing differences are absent in germline-less tudor mutant flies. Intriguingly, these include alternative splicing events that are differentially spliced in tissues distant from the germline. Collectively, our results reveal that distinct genetic programs control widespread sex-specific splicing in Drosophila melanogaster.


Subjects
Alternative Splicing , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Nuclear Proteins/genetics , RNA-Binding Proteins/genetics , Animals , Biomarkers/metabolism , Blotting, Western , Drosophila Proteins/metabolism , Drosophila melanogaster/metabolism , Female , Gene Expression Profiling , Genes, Regulator , Male , Nuclear Proteins/metabolism , Oligonucleotide Array Sequence Analysis , RNA, Messenger/genetics , RNA-Binding Proteins/metabolism , Reverse Transcriptase Polymerase Chain Reaction , Sex Factors
15.
Biochem Soc Trans ; 37(Pt 4): 778-82, 2009 Aug.
Article in English | MEDLINE | ID: mdl-19614593

ABSTRACT

Genomes contain a large number of genes that do not have recognizable homologues in other species. These genes, found in only one or a few closely related species, are known as orphan genes. Their limited distribution implies that many of them are probably involved in lineage-specific adaptive processes. One important question that has remained elusive to date is how orphan genes originate. It has been proposed that they might have arisen by gene duplication followed by a period of very rapid sequence divergence, which would have erased any traces of similarity to other evolutionarily related genes. However, this explanation does not seem plausible for genes lacking homologues in very closely related species. In the present article, we review recent efforts to identify the mechanisms of formation of primate orphan genes. These studies reveal an unexpected important role of transposable elements in the formation of novel protein-coding genes in the genomes of primates.


Subjects
Evolution, Molecular , Genome/genetics , Primates/genetics , Proteins/genetics , Animals , Gene Duplication
16.
J Comput Biol ; 16(2): 213-27, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19178140

ABSTRACT

Reverse engineering bioinformatic procedures applied to high-throughput experimental data have become instrumental in generating new hypotheses about molecular regulatory mechanisms. This has been particularly the case for gene expression microarray data, where a large number of statistical and computational methodologies have been developed in order to assist in building network models of transcriptional regulation. A major challenge faced by every different procedure is that the number of available samples n for estimating the network model is much smaller than the number of genes p forming the system under study. This compromises many of the assumptions on which the statistics of the methods rely, often leading to unstable performance figures. In this work, we apply a recently developed novel methodology based on the so-called q-order limited partial correlation graphs, qp-graphs, which is specifically tailored towards molecular network discovery from microarray expression data with p >> n. Using experimental and functional annotation data from Escherichia coli, here we show how qp-graphs yield more stable performance figures than other state-of-the-art methods when the ratio of genes to experiments exceeds one order of magnitude. More importantly, we also show that the better performance of the qp-graph method on such a gene-to-sample ratio has a decisive impact on the functional coherence of the reverse-engineered transcriptional regulatory modules and becomes crucial in such a challenging situation in order to enable the discovery of a network of reasonable confidence that includes a substantial number of genes relevant to the assayed conditions. An R package called qpgraph, implementing this method, is part of the Bioconductor project and can be downloaded from (www.bioconductor.org). A parallel standalone version for the most computationally expensive calculations is available from (http://functionalgenomics.upf.edu/qpgraph).


Subjects
Computer Simulation , Gene Regulatory Networks , Models, Biological , Oligonucleotide Array Sequence Analysis/methods , Algorithms , Computational Biology/methods , Escherichia coli/genetics , Escherichia coli/metabolism , Metabolic Networks and Pathways , Software
17.
Genome Biol ; 10(1): R11, 2009.
Article in English | MEDLINE | ID: mdl-19178699

ABSTRACT

BACKGROUND: Despite the prevalence and biological relevance of both signaling pathways and alternative pre-mRNA splicing, our knowledge of how intracellular signaling impacts on alternative splicing regulation remains fragmentary. We report a genome-wide analysis using splicing-sensitive microarrays of changes in alternative splicing induced by activation of two distinct signaling pathways, insulin and wingless, in Drosophila cells in culture. RESULTS: Alternative splicing changes induced by insulin affect more than 150 genes and more than 50 genes are regulated by wingless activation. About 40% of the genes showing changes in alternative splicing also show regulation of mRNA levels, suggesting distinct but also significantly overlapping programs of transcriptional and post-transcriptional regulation. Distinct functional sets of genes are regulated by each pathway and, remarkably, a significant overlap is observed between functional categories of genes regulated transcriptionally and at the level of alternative splicing. Functions related to carbohydrate metabolism and cellular signaling are enriched among genes regulated by insulin and wingless, respectively. Computational searches identify pathway-specific sequence motifs enriched near regulated 5' splice sites. CONCLUSIONS: Taken together, our data indicate that signaling cascades trigger pathway-specific and biologically coherent regulatory programs of alternative splicing regulation. They also reveal that alternative splicing can provide a novel molecular mechanism for crosstalk between different signaling pathways.


Subjects
Alternative Splicing , Drosophila Proteins/metabolism , Genomics , Insulin/metabolism , Signal Transduction/genetics , Wnt1 Protein/metabolism , Animals , Carbohydrate Metabolism/genetics , Cell Line , Drosophila , Gene Expression Regulation , Genes, Insect , Oligonucleotide Array Sequence Analysis , Receptor Cross-Talk , Transcription, Genetic
18.
Mol Biol Evol ; 26(3): 603-12, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19064677

ABSTRACT

Genomes contain a large number of genes that do not have recognizable homologues in other species and that are likely to be involved in important species-specific adaptive processes. The origin of many such "orphan" genes remains unknown. Here we present the first systematic study of the characteristics and mechanisms of formation of primate-specific orphan genes. We determine that codon usage values for most orphan genes fall within the bulk of the codon usage distribution of bona fide human proteins, supporting their current protein-coding annotation. We also show that primate orphan genes display distinctive features in relation to genes of wider phylogenetic distribution: higher tissue specificity, more rapid evolution, and shorter peptide size. We estimate that around 24% are highly divergent members of mammalian protein families. Interestingly, around 53% of the orphan genes contain sequences derived from transposable elements (TEs) and are mostly located in primate-specific genomic regions. This indicates frequent recruitment of TEs as part of novel genes. Finally, we also obtain evidence that a small fraction of primate orphan genes, around 5.5%, might have originated de novo from mammalian noncoding genomic regions.


Subjects
Genomics/methods , Primates/genetics , Animals , Codon , DNA Transposable Elements , Evolution, Molecular , Genome , Humans , Peptides , Tissue Distribution
19.
Genome Res ; 17(11): 1690-6, 2007 Nov.
Article in English | MEDLINE | ID: mdl-17895424

ABSTRACT

The goals of the human genome project did not include sequencing of the heterochromatic regions. We describe here an initial sequence of 1.1 Mb of the short arm of human chromosome 21 (HSA21p), estimated to be 10% of 21p. This region contains extensive euchromatic-like sequence and includes on average one transcript every 100 kb. These transcripts show multiple inter- and intrachromosomal copies, and extensive copy number and sequence variability. The sequencing of the "heterochromatic" regions of the human genome is likely to reveal many additional functional elements and provide important evolutionary information.


Subjects
Chromosomes, Human, Pair 21 , Euchromatin/genetics , Polymorphism, Genetic , Contig Mapping , Genome, Human , Humans , In Situ Hybridization, Fluorescence
20.
Genome Res ; 17(6): 746-59, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17567994

ABSTRACT

This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA Elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds, for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequence away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minor splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations.


Subjects
Chromosome Mapping , Exons , Genome, Human , Promoter Regions, Genetic , Quantitative Trait Loci , Transcription, Genetic/physiology , DNA, Complementary/genetics , Human Genome Project , Humans , Open Reading Frames