Results 1 - 20 of 25
1.
Cell ; 153(5): 1134-48, 2013 May 23.
Article in English | MEDLINE | ID: mdl-23664764

ABSTRACT

Epigenetic mechanisms have been proposed to play crucial roles in mammalian development, but their precise functions are only partially understood. To investigate epigenetic regulation of embryonic development, we differentiated human embryonic stem cells into mesendoderm, neural progenitor cells, trophoblast-like cells, and mesenchymal stem cells and systematically characterized DNA methylation, chromatin modifications, and the transcriptome in each lineage. We found that promoters that are active in early developmental stages tend to be CG rich and mainly engage H3K27me3 upon silencing in nonexpressing lineages. By contrast, promoters for genes expressed preferentially at later stages are often CG poor and primarily employ DNA methylation upon repression. Interestingly, the early developmental regulatory genes are often located in large genomic domains that are generally devoid of DNA methylation in most lineages, which we termed DNA methylation valleys (DMVs). Our results suggest that distinct epigenetic mechanisms regulate early and late stages of ES cell differentiation.


Subjects
DNA Methylation , Embryonic Stem Cells/metabolism , Epigenomics , Gene Expression Regulation, Developmental , Animals , Cell Differentiation , Chromatin/metabolism , CpG Islands , Embryonic Stem Cells/cytology , Histones/metabolism , Humans , Methylation , Neoplasms/genetics , Promoter Regions, Genetic , Zebrafish/embryology
2.
Thorax ; 78(3): 225-232, 2023 03.
Article in English | MEDLINE | ID: mdl-35710744

ABSTRACT

BACKGROUND: Adult asthma is phenotypically heterogeneous with unclear aetiology. We aimed to evaluate the potential contribution of environmental exposure and its ensuing response to asthma and its heterogeneity. METHODS: Environmental risk was evaluated by assessing the records of National Health Insurance Research Database (NHIRD) and residence-based air pollution (particulate matter with diameter less than 2.5 micrometers (PM2.5) and PM2.5-bound polycyclic aromatic hydrocarbons (PAHs)), integrating biomonitoring analysis of environmental pollutants, inflammatory markers and sphingolipid metabolites in case-control populations with mass spectrometry and ELISA. Phenotypic clustering was evaluated by t-distributed stochastic neighbor embedding (t-SNE) integrating 18 clinical and demographic variables. FINDINGS: In the NHIRD dataset, modest increase in the relative risk with time-lag effect for emergency (N=209 837) and outpatient visits (N=638 538) was observed with increasing levels of PM2.5 and PAHs. Biomonitoring analysis revealed a panel of metals and organic pollutants, particularly metal Ni and PAH, posing a significant risk for current asthma (ORs=1.28-3.48) and its severity, correlating with the level of oxidative stress markers, notably Nε-(hexanoyl)-lysine (r=0.108-0.311, p<0.05), but not with the accumulated levels of PM2.5 exposure. Further, levels of circulating sphingosine-1-phosphate and ceramide-1-phosphate were found to discriminate asthma (p<0.001 and p<0.05, respectively), correlating with the levels of PAH (r=0.196, p<0.01) and metal exposure (r=0.202-0.323, p<0.05), respectively, and both correlating with circulating inflammatory markers (r=0.186-0.427, p<0.01). Analysis of six phenotypic clusters and those cases with comorbid type 2 diabetes mellitus (T2DM) revealed cluster-selective environmental risks and biosignatures. 
INTERPRETATION: These results suggest the potential contribution of environmental factors from multiple sources, their ensuing oxidative stress and sphingolipid remodeling to adult asthma and its phenotypic heterogeneity.


Subjects
Air Pollutants , Air Pollution , Asthma , Diabetes Mellitus, Type 2 , Polycyclic Aromatic Hydrocarbons , Adult , Humans , Air Pollutants/toxicity , Air Pollutants/analysis , Sphingolipids , Air Pollution/adverse effects , Air Pollution/analysis , Particulate Matter/toxicity , Particulate Matter/analysis , Polycyclic Aromatic Hydrocarbons/analysis , Environmental Monitoring/methods
3.
J Transl Med ; 20(1): 324, 2022 07 21.
Article in English | MEDLINE | ID: mdl-35864526

ABSTRACT

Kidney transplantation is a lifesaving option for patients with end-stage kidney disease. In Taiwan, urothelial carcinoma (UC) is the most common de novo cancer after kidney transplantation (KT). UC has a greater degree of molecular heterogeneity than do other solid tumors. Few studies have explored genomic alterations in UC after KT. We performed whole-exome sequencing to compare the genetic alterations in UC developed after kidney transplantation (UCKT) and in UC in patients on hemodialysis (UCHD). After mapping and variant calling, 18,733 and 11,093 variants were identified in patients with UCKT and UCHD, respectively. We excluded known single-nucleotide polymorphisms (SNPs) and retained genes that were annotated in the Catalogue of Somatic Mutations in Cancer (COSMIC), in the Integrative Onco Genomic cancer mutations browser (IntOGen), and in the Cancer Genome Atlas (TCGA) database of genes associated with bladder cancer. A total of 14 UCKT-specific genes with SNPs identified in more than two patients were included in further analyses. The single-base substitution (SBS) profile and signatures showed a relatively high T > A pattern compared to COSMIC UC mutations. Ingenuity pathway analysis was used to explore the connections among these genes. GNAQ, IKZF1, and NTRK3 were identified as potentially involved in the signaling network of UCKT. The genetic analysis of posttransplant malignancies may elucidate a fundamental aspect of the molecular pathogenesis of UCKT.


Subjects
Carcinoma, Transitional Cell , Kidney Transplantation , Urinary Bladder Neoplasms , Humans , Mutation/genetics , Urinary Bladder Neoplasms/pathology , Exome Sequencing
5.
Nucleic Acids Res ; 42(5): 3009-16, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24343027

ABSTRACT

DNA methylation is an important defense and regulatory mechanism. In mammals, most DNA methylation occurs at CpG sites, and asymmetric non-CpG methylation has only been detected at appreciable levels in a few cell types. We are the first to systematically study the strand-specific distribution of non-CpG methylation. With the divide-and-compare strategy, we show that CHG and CHH methylation are not intrinsically different in human embryonic stem cells (ESCs) and induced pluripotent stem cells (iPSCs). We also find that non-CpG methylation is skewed between the two strands in introns, especially at intron boundaries and in highly expressed genes. Controlling for the proximal sequences of non-CpG sites, we show that the skew of non-CpG methylation in introns is mainly guided by sequence skew. By studying subgroups of transposable elements, we also found that non-CpG methylation is distributed in a strand-specific manner in both short interspersed nuclear elements (SINE) and long interspersed nuclear elements (LINE), but not in long terminal repeats (LTR). Finally, we show that on the antisense strand of Alus, a non-CpG site just downstream of the A-box is highly methylated. Together, the divide-and-compare strategy leads us to identify regions with strand-specific distributions of non-CpG methylation in humans.


Subjects
DNA Methylation , Pluripotent Stem Cells/metabolism , Cell Line , CpG Islands , Humans , Introns , Long Interspersed Nucleotide Elements , Sequence Analysis, DNA , Short Interspersed Nucleotide Elements , Terminal Repeat Sequences , Transcription, Genetic
6.
Vision Res ; 222: 108447, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38906036

ABSTRACT

Among tetrapod (terrestrial) vertebrates, amphibians remain more closely tied to an amphibious lifestyle than amniotes, and their visual opsin genes may be adapted to this lifestyle. Previous studies have discussed physiological, morphological, and molecular changes in the evolution of amphibian vision. We predicted the locations of the visual opsin genes, their neighboring genes, and the tuning sites of the visual opsins, in 39 amphibian genomes. We found that all of the examined genomes lacked the Rh2 gene. The caecilian genomes have further lost the SWS1 and SWS2 genes; only the Rh1 and LWS genes were retained. The loss of the SWS1 and SWS2 genes in caecilians may be correlated with their cryptic lifestyles. The opsin gene syntenies were predicted to be highly similar to those of other bony vertebrates. Moreover, dual syntenies were identified in allotetraploid Xenopus laevis and X. borealis. Tuning site analysis showed that only some Caudata species might have UV vision. In addition, the S164A substitution that occurred several times in LWS evolution might either functionally compensate for the Rh2 gene loss or fine-tune visual adaptation. Our study provides the first genomic evidence for a caecilian LWS gene and a genomic viewpoint of visual opsin genes by reviewing the gains and losses of visual opsin genes, the rearrangement of syntenies, and the alteration of spectral tuning in the course of amphibians' evolution.


Subjects
Amphibians , Evolution, Molecular , Animals , Amphibians/genetics , Phylogeny , Opsins/genetics , Rod Opsins/genetics , Genome
7.
IEEE Trans Nanobioscience ; 23(3): 499-506, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38687648

ABSTRACT

Given an undirected, unweighted graph with n vertices and m edges, the maximum cut problem is to find a partition of the n vertices into disjoint subsets V1 and V2 such that the number of edges between them is as large as possible. Classically, it is an NP-complete problem, which has potential applications ranging from circuit layout design, statistical physics, computer vision, machine learning and network science to clustering. In this paper, we propose a biomolecular and a quantum algorithm to solve the maximum cut problem for any graph G. The quantum algorithm is inspired by the biomolecular algorithm and has a quadratic speedup over its classical counterparts, where the temporal and spatial complexities are reduced to, respectively, [Formula: see text] and [Formula: see text]. With respect to oracle-related quantum algorithms for NP-complete problems, we identify our algorithm as optimal. Furthermore, to justify the feasibility of the proposed algorithm, we successfully solve a typical maximum cut problem for a graph with three vertices and two edges by carrying out experiments on IBM's quantum simulator.
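As a point of reference for the problem definition above, the following is a minimal classical brute-force sketch, not the biomolecular or quantum algorithm from the paper. It enumerates all 2^n partitions and counts crossing edges. The function name `max_cut` and the vertex labelling 0..n-1 are assumptions made for illustration; the example graph is a path, the only simple graph with three vertices and two edges.

```python
from itertools import product


def max_cut(n, edges):
    """Brute-force maximum cut over all 2^n assignments of vertices 0..n-1.

    Returns (cut_size, side), where side[v] is 0 or 1 and indicates
    which of the two subsets vertex v belongs to.
    """
    best_size, best_side = -1, None
    for side in product((0, 1), repeat=n):
        # An edge is cut when its endpoints lie in different subsets.
        size = sum(1 for u, v in edges if side[u] != side[v])
        if size > best_size:
            best_size, best_side = size, side
    return best_size, best_side


# Hypothetical labelling of a three-vertex, two-edge graph: 0 - 1 - 2
size, side = max_cut(3, [(0, 1), (1, 2)])
print(size, side)  # prints: 2 (0, 1, 0)
```

The exhaustive search takes time proportional to 2^n, which is the classical baseline that a quadratic speedup would be measured against.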


Subjects
Algorithms , Quantum Theory , Computational Biology/methods , Computer Simulation
8.
Kaohsiung J Med Sci ; 40(5): 445-455, 2024 May.
Article in English | MEDLINE | ID: mdl-38593276

ABSTRACT

Neurotrophic receptor tyrosine kinase 3 (NTRK3) has pleiotropic functions: it acts not only as an oncogene in breast and gastric cancers but also as a dependence receptor in tumor suppressor genes in colon cancer and neuroblastomas. However, the role of NTRK3 in upper tract urothelial carcinoma (UTUC) is not well documented. This study investigated the association between NTRK3 expression and outcomes in UTUC patients and validated the results in tests on UTUC cell lines. A total of 118 UTUC cancer tissue samples were examined to evaluate the expression of NTRK3. Survival curves were generated using Kaplan-Meier estimates, and Cox regression models were used for investigating survival outcomes. Higher NTRK3 expression was correlated with worse progression-free survival, cancer-specific survival, and overall survival. Moreover, the results of an Ingenuity Pathway Analysis suggested that NTRK3 may interact with the PI3K-AKT-mTOR signaling pathway to promote cancer. NTRK3 downregulation in BFTC909 cells through shRNA reduced cellular migration, invasion, and activity in the AKT-mTOR pathway. Furthermore, the overexpression of NTRK3 in UM-UC-14 cells promoted AKT-mTOR pathway activity, cellular migration, and cell invasion. From these observations, we concluded that NTRK3 may contribute to aggressive behaviors in UTUC by facilitating cell migration and invasion through its interaction with the AKT-mTOR pathway and the expression of NTRK3 is a potential predictor of clinical outcomes in cases of UTUC.


Subjects
Cell Movement , Receptor, trkC , Urologic Neoplasms , Female , Humans , Male , Cell Line, Tumor , Gene Expression Regulation, Neoplastic , Kaplan-Meier Estimate , Phosphatidylinositol 3-Kinases/metabolism , Phosphatidylinositol 3-Kinases/genetics , Proto-Oncogene Proteins c-akt/metabolism , Proto-Oncogene Proteins c-akt/genetics , Receptor, trkC/metabolism , Receptor, trkC/genetics , Signal Transduction , TOR Serine-Threonine Kinases/metabolism , TOR Serine-Threonine Kinases/genetics , Urologic Neoplasms/genetics , Urologic Neoplasms/metabolism , Urologic Neoplasms/pathology
9.
Sci Rep ; 13(1): 4205, 2023 Mar 14.
Article in English | MEDLINE | ID: mdl-36918570

ABSTRACT

A dominating set of a graph [Formula: see text] is a subset U of its vertices V, such that any vertex of G is either in U, or has a neighbor in U. The dominating-set problem is to find a minimum dominating set in G. Dominating sets are of critical importance for various types of networks/graphs, and therefore find potential applications in many fields. In particular, in the area of communication, dominating sets are prominently used in the efficient organization of large-scale wireless ad hoc and sensor networks. However, the dominating set problem is also a hard optimization problem and thus is currently not efficiently solvable on classical computers. Here, we propose a biomolecular and a quantum algorithm for this problem, where the quantum algorithm provides a quadratic speedup over any classical algorithm. We show that the dominating set problem can be solved in [Formula: see text] queries by our proposed quantum algorithm, where n is the number of vertices in G. We also demonstrate that our quantum algorithm is the best known procedure to date for this problem. We confirm the correctness of our algorithm by executing it on IBM Quantum's qasm simulator and the Brooklyn superconducting quantum device. Lastly, we show that molecular solutions obtained from solving the dominating set problem are represented in terms of a unit vector in a finite-dimensional Hilbert space.
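The definition above can be made concrete with a small classical sketch. This is an exhaustive search written for illustration only, not the biomolecular or quantum procedure described in the abstract; the function name `min_dominating_set` and the vertex labelling 0..n-1 are assumptions.

```python
from itertools import combinations


def min_dominating_set(n, edges):
    """Return one minimum dominating set of a graph on vertices 0..n-1.

    Candidate subsets are tried in order of increasing size, so the
    first subset that dominates every vertex has minimum size.
    """
    # Closed neighbourhood: each vertex together with its neighbours.
    closed = {v: {v} for v in range(n)}
    for u, v in edges:
        closed[u].add(v)
        closed[v].add(u)

    for k in range(1, n + 1):
        for candidate in combinations(range(n), k):
            covered = set().union(*(closed[v] for v in candidate))
            if len(covered) == n:
                return set(candidate)
    return set()  # only reached for the empty graph (n == 0)


# Star graph: centre 0 joined to leaves 1, 2, 3
print(min_dominating_set(4, [(0, 1), (0, 2), (0, 3)]))  # prints: {0}
```

In the worst case this examines all 2^n subsets, which is why a quadratic reduction in the number of queries is of interest.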

10.
Genome Res ; 19(11): 2144-53, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19819906

ABSTRACT

How many species inhabit our immediate surroundings? A straightforward collection technique suitable for answering this question is known to anyone who has ever driven a car at highway speeds. The windshield of a moving vehicle is subjected to numerous insect strikes and can be used as a collection device for representative sampling. Unfortunately the analysis of biological material collected in that manner, as with most metagenomic studies, proves to be rather demanding due to the large number of required tools and considerable computational infrastructure. In this study, we use organic matter collected by a moving vehicle to design and test a comprehensive pipeline for phylogenetic profiling of metagenomic samples that includes all steps from processing and quality control of data generated by next-generation sequencing technologies to statistical analyses and data visualization. To the best of our knowledge, this is also the first publication that features a live online supplement providing access to exact analyses and workflows used in the article.


Subjects
Algorithms , Computational Biology/methods , DNA/isolation & purification , Metagenomics/methods , Animals , Automobiles , Bacteria/classification , Bacteria/genetics , DNA/chemistry , DNA, Bacterial/chemistry , DNA, Bacterial/isolation & purification , Databases, Nucleic Acid , Humans , Phylogeny , Reproducibility of Results , Sequence Analysis, DNA/methods , Software
11.
Proc Natl Acad Sci U S A ; 106(31): 12741-6, 2009 Aug 04.
Article in English | MEDLINE | ID: mdl-19617558

ABSTRACT

Brain structure and function experience dramatic changes from embryonic to postnatal development. Microarray analyses have detected differential gene expression at different stages and in disease models, but gene expression information during early brain development is limited. We have generated >27 million reads to identify mRNAs from the mouse cortex for >16,000 genes at either embryonic day 18 (E18) or postnatal day 7 (P7), a period of significant synaptogenesis for neural circuit formation. In addition, we devised strategies to detect alternative splice forms and uncovered more splice variants. We observed differential expression of 3,758 genes between the 2 stages, many with known functions or predicted to be important for neural development. Neurogenesis-related genes, such as those encoding Sox4, Sox11, and zinc-finger proteins, were more highly expressed at E18 than at P7. In contrast, the genes encoding synaptic proteins such as synaptotagmin, complexin 2, and syntaxin were up-regulated from E18 to P7. We also found that several neurological disorder-related genes were highly expressed at E18. Our transcriptome analysis may serve as a blueprint for gene expression pattern and provide functional clues of previously unknown genes and disease-related genes during early brain development.


Subjects
Cerebral Cortex/embryology , Cerebral Cortex/metabolism , Gene Expression Profiling , Sequence Analysis, RNA , Animals , Animals, Newborn , Apoptosis , Autophagy , Brain Diseases/genetics , Female , Gene Expression Regulation, Developmental , Mice , Mice, Inbred C57BL , Pregnancy , Synapses/physiology , Transcription Factors/genetics
12.
IEEE Trans Nanobioscience ; 21(2): 286-293, 2022 04.
Article in English | MEDLINE | ID: mdl-34822331

ABSTRACT

In this paper, we propose a bio-molecular algorithm with $O(n^2)$ biological operations, $O(2^{n-1})$ DNA strands, $O(n)$ tubes and a longest DNA strand of length $O(n)$, for inferring the value of a bit from the only output satisfying any given condition in an unsorted database with $2^n$ items of $n$ bits. We show that the value of each bit of the outcome is determined by executing our bio-molecular algorithm $n$ times. Then, we show how to view a bio-molecular solution space with $2^{n-1}$ DNA strands as an eigenvector and how to find the corresponding unitary operator and eigenvalues for inferring the value of a bit in the output. We also show that an extension of the quantum phase estimation and quantum counting algorithms computes its unitary operator and eigenvalues from a bio-molecular solution space with $2^{n-1}$ DNA strands. Next, we demonstrate that the value of each bit of the output solution can be determined by executing the proposed extended quantum algorithms $n$ times. To verify our theorem, we find the maximum-sized clique of a graph with two vertices and one edge and the solution $b$ that satisfies $b^2 \equiv 1 \pmod{15}$, using IBM Quantum's backend.
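For orientation, the congruence mentioned at the end of the abstract can be checked classically by scanning every residue. The sketch below is only such a check, not the bio-molecular or quantum procedure; the function name `square_roots_of_one` is an assumption, and which of the listed residues the paper treats as "the solution" is not stated in the abstract.

```python
def square_roots_of_one(modulus):
    """List every residue b in [0, modulus) with b*b congruent to 1."""
    return [b for b in range(modulus) if (b * b) % modulus == 1]


print(square_roots_of_one(15))  # prints: [1, 4, 11, 14]
```

An unstructured scan like this inspects all candidates in the search space, which is the cost that search-based quantum routines aim to reduce.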


Subjects
Algorithms , Computers , DNA/chemistry , Databases, Factual
13.
IEEE Trans Nanobioscience ; 20(3): 354-376, 2021 07.
Article in English | MEDLINE | ID: mdl-33900920

ABSTRACT

In this paper, we propose a bio-molecular algorithm with $O(n^2 + m)$ biological operations, $O(2^n)$ DNA strands, $O(n)$ tubes and a longest DNA strand of length $O(n)$, for solving the independent-set problem for any graph G with m edges and n vertices. Next, we show that a new kind of straightforward Boolean circuit derived from the bio-molecular solutions, with m NAND gates, $(m + n \times (n + 1))$ AND gates and $((n \times (n + 1))/2)$ NOT gates, can find the maximal independent set(s) for the independent-set problem for any graph G with m edges and n vertices. We show that the proposed quantum-molecular algorithm can find the maximal independent set(s) with a lower bound of $\Omega(2^{n/2})$ queries and an upper bound of $O(2^{n/2})$ queries. This work offers clear evidence that, to solve the independent-set problem in any graph G with m edges and n vertices, bio-molecular computers are able to generate a new kind of straightforward Boolean circuit such that, by implementing it, quantum computers can give a quadratic speed-up. This work also offers clear evidence that quantum computers can significantly accelerate the speed and enhance the scalability of bio-molecular computers. Next, the element distinctness problem with input of n bits is to determine whether the given $2^n$ real numbers are distinct or not. The quantum lower bound of solving the element distinctness problem is $\Omega(2^{n \times (2/3)})$ queries in the case of a quantum walk algorithm. We further show that the proposed quantum-molecular algorithm reduces the quantum lower bound to $\Omega$($2^{n/2}$/[Formula: see text]) queries. Furthermore, to justify the feasibility of the proposed quantum-molecular algorithm, we successfully solve a typical independent set problem for a graph G with two vertices and one edge by carrying out experiments on the backend ibmqx4 with five quantum bits and the backend simulator with 32 quantum bits on IBM's quantum computer.
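The independent-set problem described above can be illustrated with a short classical brute-force sketch. It is not the bio-molecular, Boolean-circuit, or quantum construction from the paper. The function name `maximum_independent_sets` is an assumption, as is the reading of "maximal independent set(s)" as the independent sets of largest size.

```python
from itertools import combinations


def maximum_independent_sets(n, edges):
    """Return all largest independent sets of a graph on vertices 0..n-1.

    A vertex subset is independent when no edge joins two of its members.
    Subset sizes are tried from n downwards, so the first size that yields
    any independent subset is the maximum.
    """
    edge_set = {frozenset(e) for e in edges}
    for k in range(n, 0, -1):
        found = [
            set(c)
            for c in combinations(range(n), k)
            if not any(frozenset(p) in edge_set for p in combinations(c, 2))
        ]
        if found:
            return found
    return [set()]  # graph with no vertices


# The two-vertex, one-edge graph used as the test case in the abstract
print(maximum_independent_sets(2, [(0, 1)]))  # prints: [{0}, {1}]
```

The search inspects up to 2^n subsets; a quadratic speed-up corresponds to the order of 2^(n/2) queries quoted in the abstract.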


Subjects
Algorithms , Computers, Molecular , Computers , DNA
14.
J Asthma Allergy ; 14: 81-90, 2021.
Article in English | MEDLINE | ID: mdl-33542635

ABSTRACT

PURPOSE: Exposure to polycyclic aromatic hydrocarbons (PAHs) associated with ambient air particulate matter (PM) poses significant health concerns. Increased acute exacerbation (AE) frequency in asthmatic patients has been associated with ambient PAHs, but which subgroup of patients is particularly susceptible to ambient PAHs is uncertain. We developed a new model to simulate grid-scale PM2.5-PAH levels in order to evaluate whether the severity of asthma as measured by the Global Initiative of Asthma (GINA) levels of treatment is related to cumulative exposure to ambient PAHs. METHODS: Patients with asthma residing in northern Taiwan were reviewed retrospectively from 2014 to 2017. PM2.5 was sampled and analysed for PAHs twice a month over a 72-hour period, in addition to collecting the routinely monitored air pollutant data from an established air quality monitoring network. In combination with correlation analysis and principal component analysis, multivariate linear regression models were used to simulate hourly grid-scale PM2.5-PAH concentrations (ng/m³). A geographic information system mapping approach with the ordinary kriging interpolation method was used to calculate the annual exposure to PAHs (ng/m³). RESULTS: Among the 387 patients with asthma aged 18 to 93 (median 62), 97 subjects were treated as GINA step 5 (24%). Asthmatics in the GINA 5 subgroup with high annual PAHs exposure were likely to have a higher annual frequency of any AE (1 (0-12), p<0.0001). Annual PAHs exposure was correlated with the annual frequency of any exacerbation (r=0.11, p=0.02). This was more significant in the GINA 5 subgroup (r=0.29, p=0.005) and in the GINA 5 subgroup with severe acute exacerbations (r=0.51, p=0.002). Annual PAHs exposure, severe acute exacerbation and GINA steps were independent variables that predict annual frequency of any exacerbation.
CONCLUSION: Asthmatic patients in the GINA 5 subgroup with acute exacerbations were more susceptible to the effect of environmental PAHs on their exacerbation frequency. Reducing environmental levels of PAHs will have the greatest impact on the more severe asthma patients.

15.
Bioinformatics ; 25(21): 2841-2, 2009 Nov 01.
Article in English | MEDLINE | ID: mdl-19736251

ABSTRACT

SUMMARY: We report on a major new version of the RMAP software for mapping reads from short-read sequencing technology. General improvements to accuracy and space requirements are included, along with novel functionality. Included in the RMAP software package are tools for mapping paired-end reads, mapping using more sophisticated use of quality scores, collecting ambiguous mapping locations and mapping bisulfite-treated reads. AVAILABILITY: The applications described in this note are available for download at http://www.cmb.usc.edu/people/andrewds/rmap and are distributed as Open Source software under the GPLv3.0. The software has been tested on Linux and OS X platforms. CONTACT: andrewds@usc.edu; mzhang@cshl.edu


Subjects
Computational Biology/methods , Sequence Analysis, DNA/methods , Software , Algorithms , Sequence Alignment , Sequence Analysis, RNA/methods
16.
Chemosphere ; 246: 125722, 2020 May.
Article in English | MEDLINE | ID: mdl-31891849

ABSTRACT

Modeling approaches have been utilized to simulate ambient pollutant concentrations, but very limited efforts have been made to estimate volatile organic compounds in the atmosphere. For this reason, an hourly grid-scale simulation model was developed to determine ambient air concentrations of benzene, toluene, ethylbenzene, and xylene (BTEX). BTEX data were collected over a one-year time frame from the database of the Taiwan Environmental Protection Administration's photochemical assessment monitoring stations. Multivariate linear regression models were used along with correlation analysis to simulate hourly grid-scale BTEX concentrations, using criteria pollutants and selected meteorological variables as predictors. The simulation model was validated in the southern Taiwan area via a portable micro gas chromatography system (n = 121) with significant correlation (r = 0.566**, ** indicated p < 0.01). Moreover, the grid-scale model was applied to areas covering about 72% of the population in Taiwan. A geographic information system (GIS) was used to visualize the spatial distribution of BTEX concentrations from the modeling results. This new grid-scale modeling strategy, which incorporated the GIS output of the simulated data, provides a useful alternative tool for personal exposure analysis and health risk assessment of ambient air BTEX.


Subjects
Environmental Monitoring/methods , Models, Chemical , Air Pollutants/analysis , Atmosphere/analysis , Benzene/analysis , Benzene Derivatives/analysis , Geographic Information Systems , Humans , Linear Models , Taiwan , Toluene/analysis , Volatile Organic Compounds/analysis , Xylenes/analysis
17.
PLoS Comput Biol ; 3(5): e91, 2007 May.
Article in English | MEDLINE | ID: mdl-17511511

ABSTRACT

Coding of multiple proteins by overlapping reading frames is not a feature one would associate with eukaryotic genes. Indeed, codependency between codons of overlapping protein-coding regions imposes a unique set of evolutionary constraints, making it a costly arrangement. Yet in cases of tightly coexpressed interacting proteins, dual coding may be advantageous. Here we show that although dual coding is nearly impossible by chance, a number of human transcripts contain overlapping coding regions. Using newly developed statistical techniques, we identified 40 candidate genes with evolutionarily conserved overlapping coding regions. Because our approach is conservative, we expect mammals to possess more dual-coding genes. Our results emphasize that the skepticism surrounding eukaryotic dual coding is unwarranted: rather than being artifacts, overlapping reading frames are often hallmarks of fascinating biology.


Subjects
Chromosome Mapping/methods , Mammals/genetics , Multigene Family/genetics , Open Reading Frames/genetics , RNA Splice Sites/genetics , Sequence Analysis, DNA/methods , Animals , Base Sequence , Computer Simulation , Humans , Models, Genetic , Molecular Sequence Data
18.
Trends Genet ; 19(6): 306-10, 2003 Jun.
Article in English | MEDLINE | ID: mdl-12801721

ABSTRACT

We developed a new evolutionary method for identifying exons from genomic sequences and found 19000 potential coding exons that are absent from all existing annotations of the human genome. Of these, 13700 satisfied very stringent criteria and can with confidence be considered as novel exons. Evidently, a large number of new human genes can be identified using evolutionary approaches.


Subjects
Evolution, Molecular , Exons , Genome, Human , Sequence Analysis, DNA/methods , Animals , Conserved Sequence , Humans , Mice
19.
BMC Bioinformatics ; 7: 46, 2006 Jan 27.
Article in English | MEDLINE | ID: mdl-16441884

ABSTRACT

BACKGROUND: While gene duplication is known to be one of the most common mechanisms of genome evolution, the fates of genes after duplication are still being debated. In particular, it is presently unknown whether most duplicate genes preserve (or subdivide) the functions of the parental gene or acquire new functions. One aspect of gene function, namely the expression profile in a gene coexpression network, has been largely unexplored for duplicate genes. RESULTS: Here we build a human gene coexpression network using human tissue-specific microarray data and investigate the divergence of duplicate genes in it. The topology of this network is scale-free. Interestingly, our analysis indicates that duplicate genes rapidly lose shared coexpressed partners: approximately 50 million years after duplication, the two duplicate genes in a pair have only a slightly higher number of shared partners than two random singletons. We also show that duplicate gene pairs quickly acquire new coexpressed partners: the average number of partners for a duplicate gene pair is significantly greater than that for a singleton (the latter number can be used as a proxy for the number of partners of a parental singleton gene before duplication). The divergence in gene expression between two duplicates in a pair occurs asymmetrically: one gene usually has more partners than the other one. The network is resilient to both random and degree-based in silico removal of either singletons or duplicate genes. In contrast, the network is especially vulnerable to the removal of highly connected genes when duplicate genes and singletons are considered together. CONCLUSION: Duplicate genes rapidly diverge in their expression profiles in the network and play a similar role in maintaining network robustness as compared with singletons.


Subjects
Chromosome Mapping/methods , Gene Expression Regulation/genetics , Genes, Duplicate/genetics , Genome, Human/genetics , Models, Genetic , Multigene Family/genetics , Transcription Factors/genetics , Evolution, Molecular , Genetic Variation/genetics , Humans , Signal Transduction/genetics
20.
Nucleic Acids Res ; 31(13): 3564-7, 2003 Jul 01.
Article in English | MEDLINE | ID: mdl-12824366

ABSTRACT

Since a large number of computationally predicted exons are not supported by existing sequence (e.g. ESTs) or experimental (e.g. expression analysis) data they need to be validated by other methods. ETOPE is designed to test computational predictions by using signals that have not been included in any current computational prediction method. The test is based on the ratio of non-synonymous to synonymous substitution rates between sequences from different genomes. It has been previously shown, by empirical data and computer simulation, to be a powerful criterion for identifying protein-coding regions. The ETOPE is available at http://nekrut.uchicago.edu/etope/.


Subjects
Exons , Sequence Analysis, DNA/methods , Software , Amino Acid Substitution , Animals , Evolution, Molecular , Genomics/methods , Humans , Internet , Mice , Mutation , Sequence Alignment , User-Computer Interface