Results 1 - 20 of 24
1.
Article in English | MEDLINE | ID: mdl-38687648

ABSTRACT

Given an undirected, unweighted graph with n vertices and m edges, the maximum cut problem is to find a partition of the n vertices into disjoint subsets V1 and V2 such that the number of edges between them is as large as possible. Classically, it is an NP-complete problem, which has potential applications ranging from circuit layout design, statistical physics, computer vision, machine learning and network science to clustering. In this paper, we propose a biomolecular and a quantum algorithm to solve the maximum cut problem for any graph G. The quantum algorithm is inspired by the biomolecular algorithm and has a quadratic speedup over its classical counterparts, where the temporal and spatial complexities are reduced to, respectively, O(√(2^n/r)) and O(m^2). With respect to oracle-related quantum algorithms for NP-complete problems, we identify our algorithm as optimal. Furthermore, to justify the feasibility of the proposed algorithm, we successfully solve a typical maximum cut problem for a graph with three vertices and two edges by carrying out experiments on IBM's quantum simulator.
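
For orientation, the maximum cut problem defined in this abstract can be sketched with a classical exhaustive search. This is a minimal illustration, not the biomolecular or quantum algorithm of the paper. The function name `max_cut` and the edge list (a path 0-1-2) are assumptions, since the abstract states only that the test graph has three vertices and two edges.

```python
from itertools import product


def max_cut(n, edges):
    """Check all 2^n vertex partitions and return (largest cut size, partitions).

    Each partition is a tuple of n bits: bit i is 0 if vertex i is in V1
    and 1 if it is in V2. An edge is cut when its endpoints differ.
    """
    best_size = -1
    best = []
    for bits in product((0, 1), repeat=n):
        size = sum(1 for u, v in edges if bits[u] != bits[v])
        if size > best_size:
            best_size, best = size, [bits]
        elif size == best_size:
            best.append(bits)
    return best_size, best


# Hypothetical instance: three vertices, two edges (the path 0-1-2).
edges = [(0, 1), (1, 2)]
size, assignments = max_cut(3, edges)
print(size, assignments)
```

The loop visits all 2^n assignments, which is the classical cost that a quadratic quantum speedup would reduce to roughly its square root.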

2.
Kaohsiung J Med Sci ; 40(5): 445-455, 2024 May.
Article in English | MEDLINE | ID: mdl-38593276

ABSTRACT

Neurotrophic receptor tyrosine kinase 3 (NTRK3) has pleiotropic functions: it acts not only as an oncogene in breast and gastric cancers but also as a dependence receptor in tumor suppressor genes in colon cancer and neuroblastomas. However, the role of NTRK3 in upper tract urothelial carcinoma (UTUC) is not well documented. This study investigated the association between NTRK3 expression and outcomes in UTUC patients and validated the results in tests on UTUC cell lines. A total of 118 UTUC cancer tissue samples were examined to evaluate the expression of NTRK3. Survival curves were generated using Kaplan-Meier estimates, and Cox regression models were used for investigating survival outcomes. Higher NTRK3 expression was correlated with worse progression-free survival, cancer-specific survival, and overall survival. Moreover, the results of an Ingenuity Pathway Analysis suggested that NTRK3 may interact with the PI3K-AKT-mTOR signaling pathway to promote cancer. NTRK3 downregulation in BFTC909 cells through shRNA reduced cellular migration, invasion, and activity in the AKT-mTOR pathway. Furthermore, the overexpression of NTRK3 in UM-UC-14 cells promoted AKT-mTOR pathway activity, cellular migration, and cell invasion. From these observations, we concluded that NTRK3 may contribute to aggressive behaviors in UTUC by facilitating cell migration and invasion through its interaction with the AKT-mTOR pathway, and that the expression of NTRK3 is a potential predictor of clinical outcomes in cases of UTUC.


Subject(s)
Cell Movement , Proto-Oncogene Proteins c-akt , Receptor, trkC , Signal Transduction , Humans , Receptor, trkC/metabolism , Receptor, trkC/genetics , Female , Cell Line, Tumor , Male , Proto-Oncogene Proteins c-akt/metabolism , Proto-Oncogene Proteins c-akt/genetics , Middle Aged , Aged , TOR Serine-Threonine Kinases/metabolism , TOR Serine-Threonine Kinases/genetics , Gene Expression Regulation, Neoplastic , Phosphatidylinositol 3-Kinases/metabolism , Phosphatidylinositol 3-Kinases/genetics , Kaplan-Meier Estimate , Urologic Neoplasms/genetics , Urologic Neoplasms/metabolism , Urologic Neoplasms/pathology
3.
Sci Rep ; 13(1): 4205, 2023 Mar 14.
Article in English | MEDLINE | ID: mdl-36918570

ABSTRACT

A dominating set of a graph [Formula: see text] is a subset U of its vertices V, such that any vertex of G is either in U, or has a neighbor in U. The dominating-set problem is to find a minimum dominating set in G. Dominating sets are of critical importance for various types of networks/graphs, and find therefore potential applications in many fields. Particularly, in the area of communication, dominating sets are prominently used in the efficient organization of large-scale wireless ad hoc and sensor networks. However, the dominating set problem is also a hard optimization problem and thus currently is not efficiently solvable on classical computers. Here, we propose a biomolecular and a quantum algorithm for this problem, where the quantum algorithm provides a quadratic speedup over any classical algorithm. We show that the dominating set problem can be solved in [Formula: see text] queries by our proposed quantum algorithm, where n is the number of vertices in G. We also demonstrate that our quantum algorithm is the best known procedure to date for this problem. We confirm the correctness of our algorithm by executing it on IBM Quantum's qasm simulator and the Brooklyn superconducting quantum device. And lastly, we show that molecular solutions obtained from solving the dominating set problem are represented in terms of a unit vector in a finite-dimensional Hilbert space.
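
The definition of a dominating set given in this abstract can be made concrete with a small classical search. This is a hedged sketch, not the authors' biomolecular or quantum procedure. The helper names `is_dominating` and `minimum_dominating_set` and the four-vertex path graph are assumptions chosen for illustration.

```python
from itertools import combinations


def is_dominating(adj, subset):
    """Return True if every vertex is in `subset` or has a neighbor in it."""
    covered = set(subset)
    for u in subset:
        covered.update(adj[u])
    return len(covered) == len(adj)


def minimum_dominating_set(adj):
    """Try subsets in order of increasing size and return the first that dominates."""
    vertices = sorted(adj)
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            if is_dominating(adj, subset):
                return set(subset)
    return set()  # only reached for a graph with no vertices


# Hypothetical instance: the path 0-1-2-3 as an adjacency list.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(minimum_dominating_set(adj))
```

In the worst case the search examines every subset of the n vertices, which is why the problem is treated as hard for classical computers.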

5.
Thorax ; 78(3): 225-232, 2023 03.
Article in English | MEDLINE | ID: mdl-35710744

ABSTRACT

BACKGROUND: Adult asthma is phenotypically heterogeneous with unclear aetiology. We aimed to evaluate the potential contribution of environmental exposure and its ensuing response to asthma and its heterogeneity. METHODS: Environmental risk was evaluated by assessing the records of National Health Insurance Research Database (NHIRD) and residence-based air pollution (particulate matter with diameter less than 2.5 micrometers (PM2.5) and PM2.5-bound polycyclic aromatic hydrocarbons (PAHs)), integrating biomonitoring analysis of environmental pollutants, inflammatory markers and sphingolipid metabolites in case-control populations with mass spectrometry and ELISA. Phenotypic clustering was evaluated by t-distributed stochastic neighbor embedding (t-SNE) integrating 18 clinical and demographic variables. FINDINGS: In the NHIRD dataset, modest increase in the relative risk with time-lag effect for emergency (N=209 837) and outpatient visits (N=638 538) was observed with increasing levels of PM2.5 and PAHs. Biomonitoring analysis revealed a panel of metals and organic pollutants, particularly metal Ni and PAH, posing a significant risk for current asthma (ORs=1.28-3.48) and its severity, correlating with the level of oxidative stress markers, notably Nε-(hexanoyl)-lysine (r=0.108-0.311, p<0.05), but not with the accumulated levels of PM2.5 exposure. Further, levels of circulating sphingosine-1-phosphate and ceramide-1-phosphate were found to discriminate asthma (p<0.001 and p<0.05, respectively), correlating with the levels of PAH (r=0.196, p<0.01) and metal exposure (r=0.202-0.323, p<0.05), respectively, and both correlating with circulating inflammatory markers (r=0.186-0.427, p<0.01). Analysis of six phenotypic clusters and those cases with comorbid type 2 diabetes mellitus (T2DM) revealed cluster-selective environmental risks and biosignatures. 
INTERPRETATION: These results suggest the potential contribution of environmental factors from multiple sources, their ensuing oxidative stress and sphingolipid remodeling to adult asthma and its phenotypic heterogeneity.


Subject(s)
Air Pollutants , Air Pollution , Asthma , Diabetes Mellitus, Type 2 , Polycyclic Aromatic Hydrocarbons , Adult , Humans , Air Pollutants/toxicity , Air Pollutants/analysis , Sphingolipids , Air Pollution/adverse effects , Air Pollution/analysis , Particulate Matter/toxicity , Particulate Matter/analysis , Polycyclic Aromatic Hydrocarbons/analysis , Environmental Monitoring/methods
6.
J Transl Med ; 20(1): 324, 2022 07 21.
Article in English | MEDLINE | ID: mdl-35864526

ABSTRACT

Kidney transplantation is a lifesaving option for patients with end-stage kidney disease. In Taiwan, urothelial carcinoma (UC) is the most common de novo cancer after kidney transplantation (KT). UC has a greater degree of molecular heterogeneity than do other solid tumors. Few studies have explored genomic alterations in UC after KT. We performed whole-exome sequencing to compare the genetic alterations in UC developed after kidney transplantation (UCKT) and in UC in patients on hemodialysis (UCHD). After mapping and variant calling, 18,733 and 11,093 variants were identified in patients with UCKT and UCHD, respectively. We excluded known single-nucleotide polymorphisms (SNPs) and retained genes that were annotated in the Catalogue of Somatic Mutations in Cancer (COSMIC), in the Integrative Onco Genomic cancer mutations browser (IntOGen), and in the Cancer Genome Atlas (TCGA) database of genes associated with bladder cancer. A total of 14 UCKT-specific genes with SNPs identified in more than two patients were included in further analyses. The single-base substitution (SBS) profile and signatures showed a relatively high T > A pattern compared to COSMIC UC mutations. Ingenuity pathway analysis was used to explore the connections among these genes. GNAQ, IKZF1, and NTRK3 were identified as potentially involved in the signaling network of UCKT. The genetic analysis of posttransplant malignancies may elucidate a fundamental aspect of the molecular pathogenesis of UCKT.


Subject(s)
Carcinoma, Transitional Cell , Kidney Transplantation , Urinary Bladder Neoplasms , Humans , Mutation/genetics , Urinary Bladder Neoplasms/pathology , Exome Sequencing
7.
IEEE Trans Nanobioscience ; 21(2): 286-293, 2022 04.
Article in English | MEDLINE | ID: mdl-34822331

ABSTRACT

In this paper, we propose a bio-molecular algorithm with O(n^2) biological operations, O(2^(n-1)) DNA strands, O(n) tubes and the longest DNA strand, O(n), for inferring the value of a bit from the only output satisfying any given condition in an unsorted database with 2^n items of n bits. We show that the value of each bit of the outcome is determined by executing our bio-molecular algorithm n times. Then, we show how to view a bio-molecular solution space with 2^(n-1) DNA strands as an eigenvector and how to find the corresponding unitary operator and eigenvalues for inferring the value of a bit in the output. We also show that an extension of the quantum phase estimation and quantum counting algorithms computes its unitary operator and eigenvalues from the bio-molecular solution space with 2^(n-1) DNA strands. Next, we demonstrate that the value of each bit of the output solution can be determined by executing the proposed extended quantum algorithms n times. To verify our theorem, we find the maximum-sized clique of a graph with two vertices and one edge and the solution b that satisfies b^2 ≡ 1 (mod 15), using IBM Quantum's backend.
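
The verification instance mentioned at the end of this abstract, read here as the congruence b^2 ≡ 1 (mod 15), is small enough to check classically. The sketch below is only an illustration of that congruence and does not reproduce the phase estimation or counting steps of the paper. The function name `square_roots_of_one` is an assumption.

```python
def square_roots_of_one(modulus):
    """Return every residue b in [0, modulus) with b*b congruent to 1 mod modulus."""
    return [b for b in range(modulus) if (b * b) % modulus == 1]


print(square_roots_of_one(15))
```

The search scans every residue, the unsorted-database setting in which the abstract claims a quantum speedup.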


Subject(s)
Algorithms , Computers , DNA/chemistry , Databases, Factual
8.
IEEE Trans Nanobioscience ; 20(3): 354-376, 2021 07.
Article in English | MEDLINE | ID: mdl-33900920

ABSTRACT

In this paper, we propose a bio-molecular algorithm with O(n^2 + m) biological operations, O(2^n) DNA strands, O(n) tubes and the longest DNA strand, O(n), for solving the independent-set problem for any graph G with m edges and n vertices. Next, we show that a new kind of straightforward Boolean circuit derived from the bio-molecular solutions, with m NAND gates, (m + n × (n + 1)) AND gates and ((n × (n + 1))/2) NOT gates, can find the maximal independent set(s) for the independent-set problem for any graph G with m edges and n vertices. We show that the proposed quantum-molecular algorithm can find the maximal independent set(s) with the lower bound Ω(2^(n/2)) queries and the upper bound O(2^(n/2)) queries. This work offers clear evidence that, to solve the independent-set problem in any graph G with m edges and n vertices, bio-molecular computers are able to generate a new kind of straightforward Boolean circuit such that, by implementing it, quantum computers can give a quadratic speed-up. This work also offers clear evidence that quantum computers can significantly accelerate the speed and enhance the scalability of bio-molecular computers. Next, the element distinctness problem with input of n bits is to determine whether the given 2^n real numbers are distinct or not. The quantum lower bound of solving the element distinctness problem is Ω(2^(n×(2/3))) queries in the case of a quantum walk algorithm. We further show that the proposed quantum-molecular algorithm reduces the quantum lower bound to Ω((2^(n/2))/([Formula: see text])) queries. Furthermore, to justify the feasibility of the proposed quantum-molecular algorithm, we successfully solve a typical independent set problem for a graph G with two vertices and one edge by carrying out experiments on the backend ibmqx4 with five quantum bits and the backend simulator with 32 quantum bits on IBM's quantum computer.
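
The test instance named in this abstract, a graph with two vertices and one edge, can be solved by hand or with a few lines of classical code. The sketch below assumes that "maximal independent set(s)" means the independent sets of largest size. The function name `largest_independent_sets` is hypothetical, and the code is not the Boolean circuit or quantum-molecular algorithm of the paper.

```python
from itertools import combinations


def largest_independent_sets(n, edges):
    """Return all independent sets of the largest possible size.

    A vertex subset is independent when no two of its vertices share an edge.
    Subset sizes are tried from n downward, so the first non-empty result wins.
    """
    edge_set = {frozenset(e) for e in edges}
    for k in range(n, 0, -1):
        found = [
            set(c)
            for c in combinations(range(n), k)
            if not any(frozenset(pair) in edge_set for pair in combinations(c, 2))
        ]
        if found:
            return found
    return [set()]  # only reached when the graph has no vertices


# The instance from the abstract: two vertices joined by one edge.
print(largest_independent_sets(2, [(0, 1)]))
```

For this instance each single vertex is a largest independent set, since taking both vertices would include the edge.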


Subject(s)
Algorithms , Computers, Molecular , Computers , DNA
9.
J Asthma Allergy ; 14: 81-90, 2021.
Article in English | MEDLINE | ID: mdl-33542635

ABSTRACT

PURPOSE: Exposure to polycyclic aromatic hydrocarbons (PAHs) associated with ambient air particulate matter (PM) poses significant health concerns. Increased acute exacerbation (AE) frequency in asthmatic patients has been associated with ambient PAHs, but which subgroup of patients is particularly susceptible to ambient PAHs is uncertain. We developed a new model to simulate grid-scale PM2.5-PAH levels in order to evaluate whether the severity of asthma as measured by the Global Initiative of Asthma (GINA) levels of treatment is related to cumulative exposure to ambient PAHs. METHODS: Patients with asthma residing in northern Taiwan were reviewed retrospectively from 2014 to 2017. PM2.5 were sampled and analysed for PAHs twice a month over a 72-hour period, in addition to collecting the routinely monitored air pollutant data from an established air quality monitoring network. In combination with correlation analysis and principal component analysis, multivariate linear regression models were performed to simulate hourly grid-scale PM2.5-PAH concentrations (ng/m^3). A geographic information system mapping approach with ordinary kriging interpolation method was used to calculate the annual exposure of PAHs (ng/m). RESULTS: Among the 387 patients with asthma aged 18 to 93 (median 62), 97 subjects were treated as GINA step 5 (24%). Asthmatics in GINA 5 subgroup with high annual PAHs exposure were likely to have a higher annual frequency of any AE (1 (0-12), p<0.0001). Annual PAHs exposure was correlated with the annual frequency of any exacerbation (r=0.11, p=0.02). This was more significant in the GINA 5 subgroup (r=0.29, p=0.005) and in the GINA 5 subgroup with severe acute exacerbations (r=0.51, p=0.002). Annual PAHs exposure, severe acute exacerbation and GINA steps were independent variables that predict annual frequency of any exacerbation.
CONCLUSION: Asthmatic patients in the GINA 5 subgroup with acute exacerbations were more susceptible to the effect of environmental PAHs on their exacerbation frequency. Reducing environmental levels of PAHs will have the greatest impact on the more severe asthma patients.

10.
Chemosphere ; 246: 125722, 2020 May.
Article in English | MEDLINE | ID: mdl-31891849

ABSTRACT

Modeling approaches have been utilized to simulate ambient pollutant concentrations, but very limited efforts have been made to estimate volatile organic compounds in the atmosphere. For this reason, an hourly grid-scale simulation model was developed to determine ambient air concentrations of benzene, toluene, ethylbenzene, and xylene (BTEX). BTEX data were collected over a one-year time frame from the database of the Taiwan Environmental Protection Administration's photochemical assessment monitoring stations. Multivariate linear regression models were used along with correlation analysis to simulate hourly grid-scale BTEX concentrations, using criteria pollutants and selected meteorological variables as predictors. The simulation model was validated in the southern Taiwan area via a portable micro gas chromatography system (n = 121) with significant correlation (r = 0.566**, ** indicated p < 0.01). Moreover, the grid-scale model was applied to areas covering about 72% of the population in Taiwan. A geographic information system (GIS) was used to visualize the spatial distribution of BTEX concentrations from the modeling results. This new grid-scale modeling strategy, which incorporated the GIS output of the simulated data, provides a useful alternative tool for personal exposure analysis and health risk assessment of ambient air BTEX.


Subject(s)
Environmental Monitoring/methods , Models, Chemical , Air Pollutants/analysis , Atmosphere/analysis , Benzene/analysis , Benzene Derivatives/analysis , Geographic Information Systems , Humans , Linear Models , Taiwan , Toluene/analysis , Volatile Organic Compounds/analysis , Xylenes/analysis
11.
J Hazard Mater ; 314: 286-294, 2016 08 15.
Article in English | MEDLINE | ID: mdl-27136734

ABSTRACT

Exposure to polycyclic aromatic hydrocarbons (PAHs) associated with ambient air particulate matter (PM) poses significant health concerns. Several modeling approaches have been developed for simulating ambient PAHs, but no hourly intra-urban spatial data are currently available. The aim of this study is to develop a new modeling strategy in simulating, on an hourly basis, grid-scale PM2.5-PAH levels. PM and PAHs were collected over a one-year time frame through an established air quality monitoring network within a metropolitan area of Taiwan. Multivariate linear regression models, in combination with correlation analysis and PAH source identification by principal component analysis (PCA), were performed to simulate hourly grid-scale PM2.5-PAH concentrations, taking criteria pollutants and meteorological variables selected as possible predictors. The simulated levels of 72-h personal exposure were found to be significantly (R=0.729**, p<0.01) correlated with those analyzed from portable personal monitors. A geographic information system (GIS) was used to visualize spatially distributed PM2.5-PAH concentrations of the modeling results. This new grid-scale modeling strategy, incorporating the output of simulated data by GIS, provides a useful and versatile tool in personal exposure analysis and in the health risk assessment of air pollution.


Subject(s)
Air Pollutants/analysis , Environmental Exposure/analysis , Environmental Monitoring/methods , Polycyclic Aromatic Hydrocarbons/analysis , Humans , Linear Models , Multivariate Analysis , Particulate Matter/analysis , Principal Component Analysis , Spatio-Temporal Analysis , Taiwan
12.
Nucleic Acids Res ; 42(5): 3009-16, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24343027

ABSTRACT

DNA methylation is an important defense and regulatory mechanism. In mammals, most DNA methylation occurs at CpG sites, and asymmetric non-CpG methylation has only been detected at appreciable levels in a few cell types. We are the first to systematically study the strand-specific distribution of non-CpG methylation. With the divide-and-compare strategy, we show that CHG and CHH methylation are not intrinsically different in human embryonic stem cells (ESCs) and induced pluripotent stem cells (iPSCs). We also find that non-CpG methylation is skewed between the two strands in introns, especially at intron boundaries and in highly expressed genes. Controlling for the proximal sequences of non-CpG sites, we show that the skew of non-CpG methylation in introns is mainly guided by sequence skew. By studying subgroups of transposable elements, we also found that non-CpG methylation is distributed in a strand-specific manner in both short interspersed nuclear elements (SINE) and long interspersed nuclear elements (LINE), but not in long terminal repeats (LTR). Finally, we show that on the antisense strand of Alus, a non-CpG site just downstream of the A-box is highly methylated. Together, the divide-and-compare strategy leads us to identify regions with strand-specific distributions of non-CpG methylation in humans.


Subject(s)
DNA Methylation , Pluripotent Stem Cells/metabolism , Cell Line , CpG Islands , Humans , Introns , Long Interspersed Nucleotide Elements , Sequence Analysis, DNA , Short Interspersed Nucleotide Elements , Terminal Repeat Sequences , Transcription, Genetic
13.
Cell ; 153(5): 1134-48, 2013 May 23.
Article in English | MEDLINE | ID: mdl-23664764

ABSTRACT

Epigenetic mechanisms have been proposed to play crucial roles in mammalian development, but their precise functions are only partially understood. To investigate epigenetic regulation of embryonic development, we differentiated human embryonic stem cells into mesendoderm, neural progenitor cells, trophoblast-like cells, and mesenchymal stem cells and systematically characterized DNA methylation, chromatin modifications, and the transcriptome in each lineage. We found that promoters that are active in early developmental stages tend to be CG rich and mainly engage H3K27me3 upon silencing in nonexpressing lineages. By contrast, promoters for genes expressed preferentially at later stages are often CG poor and primarily employ DNA methylation upon repression. Interestingly, the early developmental regulatory genes are often located in large genomic domains that are generally devoid of DNA methylation in most lineages, which we termed DNA methylation valleys (DMVs). Our results suggest that distinct epigenetic mechanisms regulate early and late stages of ES cell differentiation.


Subject(s)
DNA Methylation , Embryonic Stem Cells/metabolism , Epigenomics , Gene Expression Regulation, Developmental , Animals , Cell Differentiation , Chromatin/metabolism , CpG Islands , Embryonic Stem Cells/cytology , Histones/metabolism , Humans , Methylation , Neoplasms/genetics , Promoter Regions, Genetic , Zebrafish/embryology
14.
Nat Biotechnol ; 28(10): 1097-105, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20852635

ABSTRACT

Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylated DNA immunoprecipitation sequencing (MeDIP-seq) and methylated DNA binding domain sequencing (MBD-seq). We applied all four methods to biological replicates of human embryonic stem cells to assess their genome-wide CpG coverage, resolution, cost, concordance and the influence of CpG density and genomic context. The methylation levels assessed by the two bisulfite methods were concordant (their difference did not exceed a given threshold) for 82% of CpGs and 99% of the non-CpG cytosines. Using binary methylation calls, the two enrichment methods were 99% concordant and regions assessed by all four methods were 97% concordant. We combined MeDIP-seq with methylation-sensitive restriction enzyme (MRE-seq) sequencing for comprehensive methylome coverage at lower cost. This, along with RNA-seq and ChIP-seq of the ES cells, enabled us to detect regions with allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression.


Subject(s)
Alleles , DNA Methylation/genetics , Epigenesis, Genetic , Sequence Analysis, DNA/methods , Cell Line , CpG Islands/genetics , Cytosine/metabolism , Embryonic Stem Cells/metabolism , Gene Expression Regulation , Humans , Sulfites/metabolism
15.
Surg Neurol ; 72 Suppl 2: S66-73; discussion S73-4, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19818476

ABSTRACT

BACKGROUND: Severe TBIs are major causes of disability and death in accidents. The Brain Trauma Foundation supported the first edition of the Guidelines for the Management of Severe Traumatic Brain Injury in 1995 and revised it in 2000. The recommendations in these guidelines are well accepted in the world. There are still some different views on trauma mechanisms, pathogenesis, and managements in different areas. Individualized guidelines for different countries would be necessary, and Taiwan is no exception. METHODS: In November 2005, we organized the severe TBI guidelines committee and selected 9 topics, including ER treatment, ICP monitoring, CPP, fluid therapy, use of sedatives, nutrition, intracranial hypertension, seizure prophylaxis, and second-tier therapy. We have since searched key questions in these topics on Medline. References are classified into 8 levels of evidence: 1++, 1+, 1-, 2++, 2+, 2-, 3, and 4 based on the criteria of the SIGN. RESULTS: Recommendations are formed and graded as A, B, C, and D. Grade A means that at least one piece of evidence is rated as 1++, whereas grade B means inclusion of studies rated as 2++. Grade C means inclusion of references rated as 2+, and grade D means levels of evidence rated as 3 or 4. Overall, 42 recommendations are formed. Three of these are rated as grade A, 13 as grade B, 21 as grade C, and 5 as grade D. CONCLUSIONS: We have completed the first evidence-based, clinical practice guidelines for severe TBIs. It is hoped that the guidelines will provide concepts and recommendations to promote the quality of care for severe TBIs in Taiwan.


Subject(s)
Brain Injuries/diagnosis , Brain Injuries/therapy , Emergency Medical Services/standards , Brain Edema/diagnosis , Brain Edema/therapy , Coma/chemically induced , Evidence-Based Medicine , Humans , Hyperventilation , Hypothermia, Induced/standards , Intracranial Hypertension/diagnosis , Intracranial Hypertension/therapy , Intracranial Pressure/physiology , Monitoring, Physiologic/standards , Steroids/therapeutic use , Taiwan
16.
Genome Res ; 19(11): 2144-53, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19819906

ABSTRACT

How many species inhabit our immediate surroundings? A straightforward collection technique suitable for answering this question is known to anyone who has ever driven a car at highway speeds. The windshield of a moving vehicle is subjected to numerous insect strikes and can be used as a collection device for representative sampling. Unfortunately the analysis of biological material collected in that manner, as with most metagenomic studies, proves to be rather demanding due to the large number of required tools and considerable computational infrastructure. In this study, we use organic matter collected by a moving vehicle to design and test a comprehensive pipeline for phylogenetic profiling of metagenomic samples that includes all steps from processing and quality control of data generated by next-generation sequencing technologies to statistical analyses and data visualization. To the best of our knowledge, this is also the first publication that features a live online supplement providing access to exact analyses and workflows used in the article.


Subject(s)
Algorithms , Computational Biology/methods , DNA/isolation & purification , Metagenomics/methods , Animals , Automobiles , Bacteria/classification , Bacteria/genetics , DNA/chemistry , DNA, Bacterial/chemistry , DNA, Bacterial/isolation & purification , Databases, Nucleic Acid , Humans , Phylogeny , Reproducibility of Results , Sequence Analysis, DNA/methods , Software
17.
Bioinformatics ; 25(21): 2841-2, 2009 Nov 01.
Article in English | MEDLINE | ID: mdl-19736251

ABSTRACT

SUMMARY: We report on a major new version of the RMAP software for mapping reads from short-read sequencing technology. General improvements to accuracy and space requirements are included, along with novel functionality. Included in the RMAP software package are tools for mapping paired-end reads, mapping using more sophisticated use of quality scores, collecting ambiguous mapping locations and mapping bisulfite-treated reads. AVAILABILITY: The applications described in this note are available for download at http://www.cmb.usc.edu/people/andrewds/rmap and are distributed as Open Source software under the GPLv3.0. The software has been tested on Linux and OS X platforms. CONTACT: andrewds@usc.edu; mzhang@cshl.edu


Subject(s)
Computational Biology/methods , Sequence Analysis, DNA/methods , Software , Algorithms , Sequence Alignment , Sequence Analysis, RNA/methods
18.
Proc Natl Acad Sci U S A ; 106(31): 12741-6, 2009 Aug 04.
Article in English | MEDLINE | ID: mdl-19617558

ABSTRACT

Brain structure and function experience dramatic changes from embryonic to postnatal development. Microarray analyses have detected differential gene expression at different stages and in disease models, but gene expression information during early brain development is limited. We have generated >27 million reads to identify mRNAs from the mouse cortex for >16,000 genes at either embryonic day 18 (E18) or postnatal day 7 (P7), a period of significant synaptogenesis for neural circuit formation. In addition, we devised strategies to detect alternative splice forms and uncovered more splice variants. We observed differential expression of 3,758 genes between the 2 stages, many with known functions or predicted to be important for neural development. Neurogenesis-related genes, such as those encoding Sox4, Sox11, and zinc-finger proteins, were more highly expressed at E18 than at P7. In contrast, the genes encoding synaptic proteins such as synaptotagmin, complexin 2, and syntaxin were up-regulated from E18 to P7. We also found that several neurological disorder-related genes were highly expressed at E18. Our transcriptome analysis may serve as a blueprint for gene expression pattern and provide functional clues of previously unknown genes and disease-related genes during early brain development.


Subject(s)
Cerebral Cortex/embryology , Cerebral Cortex/metabolism , Gene Expression Profiling , Sequence Analysis, RNA , Animals , Animals, Newborn , Apoptosis , Autophagy , Brain Diseases/genetics , Female , Gene Expression Regulation, Developmental , Mice , Mice, Inbred C57BL , Pregnancy , Synapses/physiology , Transcription Factors/genetics
19.
PLoS Comput Biol ; 3(5): e91, 2007 May.
Article in English | MEDLINE | ID: mdl-17511511

ABSTRACT

Coding of multiple proteins by overlapping reading frames is not a feature one would associate with eukaryotic genes. Indeed, codependency between codons of overlapping protein-coding regions imposes a unique set of evolutionary constraints, making it a costly arrangement. Yet in cases of tightly coexpressed interacting proteins, dual coding may be advantageous. Here we show that although dual coding is nearly impossible by chance, a number of human transcripts contain overlapping coding regions. Using newly developed statistical techniques, we identified 40 candidate genes with evolutionarily conserved overlapping coding regions. Because our approach is conservative, we expect mammals to possess more dual-coding genes. Our results emphasize that the skepticism surrounding eukaryotic dual coding is unwarranted: rather than being artifacts, overlapping reading frames are often hallmarks of fascinating biology.


Subject(s)
Chromosome Mapping/methods , Mammals/genetics , Multigene Family/genetics , Open Reading Frames/genetics , RNA Splice Sites/genetics , Sequence Analysis, DNA/methods , Animals , Base Sequence , Computer Simulation , Humans , Models, Genetic , Molecular Sequence Data
20.
BMC Bioinformatics ; 7: 46, 2006 Jan 27.
Article in English | MEDLINE | ID: mdl-16441884

ABSTRACT

BACKGROUND: While gene duplication is known to be one of the most common mechanisms of genome evolution, the fates of genes after duplication are still being debated. In particular, it is presently unknown whether most duplicate genes preserve (or subdivide) the functions of the parental gene or acquire new functions. One aspect of gene function, that is the expression profile in gene coexpression network, has been largely unexplored for duplicate genes. RESULTS: Here we build a human gene coexpression network using human tissue-specific microarray data and investigate the divergence of duplicate genes in it. The topology of this network is scale-free. Interestingly, our analysis indicates that duplicate genes rapidly lose shared coexpressed partners: after approximately 50 million years since duplication, the two duplicate genes in a pair have only slightly higher number of shared partners as compared with two random singletons. We also show that duplicate gene pairs quickly acquire new coexpressed partners: the average number of partners for a duplicate gene pair is significantly greater than that for a singleton (the latter number can be used as a proxy of the number of partners for a parental singleton gene before duplication). The divergence in gene expression between two duplicates in a pair occurs asymmetrically: one gene usually has more partners than the other one. The network is resilient to both random and degree-based in silico removal of either singletons or duplicate genes. In contrast, the network is especially vulnerable to the removal of highly connected genes when duplicate genes and singletons are considered together. CONCLUSION: Duplicate genes rapidly diverge in their expression profiles in the network and play similar role in maintaining the network robustness as compared with singletons.


Subject(s)
Chromosome Mapping/methods , Gene Expression Regulation/genetics , Genes, Duplicate/genetics , Genome, Human/genetics , Models, Genetic , Multigene Family/genetics , Transcription Factors/genetics , Evolution, Molecular , Genetic Variation/genetics , Humans , Signal Transduction/genetics