1.
Mol Cell Proteomics ; 21(8): 100261, 2022 08.
Article in English | MEDLINE | ID: mdl-35738554

ABSTRACT

Brain development and function are governed by precisely regulated protein expression in different regions. To date, multiregional brain proteomes have been systematically analyzed only for adult human and mouse brains. To understand the underpinnings of brain development and function, we generated proteomes from six regions of the postnatal brain at three developmental stages of domestic dogs (Canis familiaris), which are special among animals in terms of their remarkable human-like social cognitive abilities. Quantitative analysis of the spatiotemporal proteomes identified region-enriched synapse types at different developmental stages and differential myelination progression in different brain regions. Through integrative analysis of inter-regional expression patterns of orthologous proteins and genome-wide cis-regulatory element frequencies, we found that proteins related to myelination and hippocampus were highly correlated between dog and human but not between mouse and human, although mouse is phylogenetically closer to human. Moreover, the global expression patterns of neurodegenerative disease and autism spectrum disorder-associated proteins in dog brain resemble those in human brain more than those in mouse brain do. The high similarity of myelination and hippocampus-related pathways in dog and human at both proteomic and genetic levels may contribute to their shared social cognitive abilities. The inter-regional expression patterns of disease-associated proteins in the brains of different species provide important information to guide mechanistic and translational studies using appropriate animal models.


Subjects
Autism Spectrum Disorder; Neurodegenerative Diseases; Adult; Animals; Brain; Dogs; Humans; Mice; Proteome; Proteomics
2.
BMC Bioinformatics ; 24(1): 249, 2023 Jun 13.
Article in English | MEDLINE | ID: mdl-37312038

ABSTRACT

BACKGROUND: Closing gaps in draft genomes leads to more complete and continuous genome assemblies. The ubiquitous genomic repeats challenge the existing gap-closing methods, which are based on either the k-mer representation by the de Bruijn graph or the overlap-layout-consensus paradigm. In addition, chimeric reads cause erroneous k-mers in the former and false overlaps of reads in the latter. RESULTS: We propose a novel local assembly approach to gap closing, called RegCloser. It represents read coordinates and their overlaps respectively by parameters and observations in a linear regression model. The optimal overlap is searched only in the restricted range consistent with insert sizes. Under this linear regression framework, the local DNA assembly becomes a robust parameter estimation problem. We solved the problem by a customized robust regression procedure that resists the influence of false overlaps by optimizing a convex global Huber loss function. The global optimum is obtained by iteratively solving the sparse system of linear equations. On both simulated and real datasets, RegCloser outperformed other popular methods in accurately resolving the copy number of tandem repeats, and achieved superior completeness and contiguity. Applying RegCloser to a plateau zokor draft genome that had been improved by long reads further increased the contig N50 threefold. We also tested the robust regression approach on layout generation of long reads. CONCLUSIONS: RegCloser is a competitive gap-closing tool. The software is available at https://github.com/csh3/RegCloser . The robust regression approach has the potential to be incorporated into the layout module of long read assemblers.
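
Illustrative sketch (not the RegCloser code): the abstract describes read coordinates as regression parameters, overlaps as observations, and a Huber loss that limits the influence of false overlaps. The toy below assumes each overlap is given as a triple (i, j, d) meaning "x[j] - x[i] is about d", pins read 0 at coordinate 0, and minimizes the Huber loss by iteratively reweighted least squares with Gauss-Seidel sweeps. The function names, the threshold `delta`, and the iteration counts are hypothetical choices made for this example.

```python
def huber_weight(residual, delta):
    """IRLS weight for the Huber loss: 1 inside the band, delta/|r| outside."""
    a = abs(residual)
    return 1.0 if a <= delta else delta / a


def estimate_positions(n_reads, overlaps, delta=5.0, outer=30, inner=100):
    """Estimate read start coordinates from pairwise offsets.

    overlaps: list of (i, j, d), each read as "x[j] - x[i] is about d".
    Read 0 is pinned at coordinate 0 to remove the translation ambiguity.
    """
    x = [0.0] * n_reads
    weights = [1.0] * len(overlaps)
    for _ in range(outer):
        # Solve the current weighted least-squares problem by Gauss-Seidel
        # sweeps: each coordinate becomes the weighted mean of its targets.
        for _ in range(inner):
            for k in range(1, n_reads):
                num = den = 0.0
                for (i, j, d), w in zip(overlaps, weights):
                    if j == k:
                        num += w * (x[i] + d)
                        den += w
                    elif i == k:
                        num += w * (x[j] - d)
                        den += w
                if den > 0:
                    x[k] = num / den
        # Down-weight observations whose residual exceeds delta.
        weights = [huber_weight(x[j] - x[i] - d, delta) for i, j, d in overlaps]
    return x


# Four reads spaced 100 bases apart, plus one false overlap (0, 3, 50).
overlaps = [(0, 1, 100), (1, 2, 100), (2, 3, 100),
            (0, 2, 200), (1, 3, 200), (0, 3, 300),
            (0, 3, 50)]
positions = estimate_positions(4, overlaps)
```

With ordinary least squares the false overlap would drag the last coordinate far from 300; under the Huber weights its pull is bounded by `delta`, so the estimate stays within a few bases of the consistent layout.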


Subjects
Genomics; Software; Consensus; Linear Models; Tandem Repeat Sequences
3.
Bioinformatics ; 38(10): 2675-2682, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35561180

ABSTRACT

MOTIVATION: Crucial to the correctness of a genome assembly is the accuracy of the underlying scaffolds that specify the orders and orientations of contigs together with the gap distances between contigs. The current methods construct scaffolds based on the alignments of 'linking' reads against contigs. We found that some 'optimal' alignments are mistaken due to factors such as the contig boundary effect, particularly in the presence of repeats. Occasionally, the incorrect alignments can even overwhelm the correct ones. Detecting the incorrect linking information is challenging for any existing method. RESULTS: In this study, we present a novel scaffolding method, RegScaf. It first examines the distribution of distances between contigs from read alignment using kernel density estimation. When a density shows multiple modes, orientation-supported links are grouped into clusters, each of which defines a linking distance corresponding to a mode. The linear model parameterizes contigs by their positions on the genome; then each linking distance between a pair of contigs is taken as an observation on the difference of their positions. The parameters are estimated by minimizing a global loss function, which is a version of the trimmed sum of squares. The least trimmed squares estimate has such a high breakdown value that it can automatically remove the mistaken linking distances. The results on both synthetic and real datasets demonstrate that RegScaf outperforms some popular scaffolders, especially in the accuracy of gap estimates by substantially reducing extremely abnormal errors. Its strength in resolving repeat regions is exemplified by a real case. Its adaptability to large genomes and TGS long reads is validated as well. AVAILABILITY AND IMPLEMENTATION: RegScaf is publicly available at https://github.com/lemontealala/RegScaf.git. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
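
Illustrative sketch (not the RegScaf code): the least trimmed squares idea in the abstract can be shown in its simplest special case, a single distance between one pair of contigs estimated from several linking observations. For a one-dimensional location problem, the best subset of size h is a run of h consecutive values in sorted order, so an exact search is a sliding window. The function name, the data, and the choice h = 5 are assumptions made for this example; the full method fits all contig positions jointly.

```python
def lts_location(values, h):
    """Least trimmed squares estimate of a single location parameter.

    Keeps the h observations with the smallest sum of squared deviations
    from their own mean and returns that mean. For 1-D data the optimal
    subset is contiguous in sorted order, so a sliding window is exact.
    """
    if not 0 < h <= len(values):
        raise ValueError("h must be between 1 and len(values)")
    xs = sorted(values)
    best_ss, best_mean = None, None
    for start in range(len(xs) - h + 1):
        window = xs[start:start + h]
        mean = sum(window) / h
        ss = sum((v - mean) ** 2 for v in window)
        if best_ss is None or ss < best_ss:
            best_ss, best_mean = ss, mean
    return best_mean


# Five consistent linking distances near 500 and two mistaken ones near 3000.
links = [498, 502, 500, 505, 495, 3000, 2990]
gap = lts_location(links, h=5)
plain_mean = sum(links) / len(links)
```

The trimmed estimate ignores the two mistaken links entirely, whereas the plain mean is pulled far above 500 by them.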


Subjects
Algorithms; Software; Contig Mapping/methods; Genome; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods
4.
BMC Infect Dis ; 23(1): 679, 2023 Oct 11.
Article in English | MEDLINE | ID: mdl-37821841

ABSTRACT

BACKGROUND: The emergence of new COVID-19 variants over the past three years posed a serious challenge to public health. Cities in China implemented mass daily RT-PCR tests by pooling strategies. However, a random delay exists between an infection and its first positive RT-PCR test. It is valuable for disease control to know the delay pattern and daily infection incidences reconstructed from RT-PCR test observations. METHODS: We formulated the convolution model between daily incidences and positive RT-PCR test counts as a linear inverse problem with positivity restrictions. Consequently, the Richardson-Lucy deconvolution algorithm was used to reconstruct COVID-19 incidences from daily PCR tests. A real-time deconvolution was further developed based on the same mathematical principle. The method was applied to an Omicron epidemic data set of a bar outbreak in Beijing and another in Wuxi in June 2022. We estimated the delay function by maximizing likelihood via an E-M algorithm. RESULTS: The delay function of the bar outbreak in 2022 differs from that reported in 2020. Its mode was shortened by one day, to 4 days. A 95% confidence interval of the mean delay is [4.43,5.55] as evaluated by bootstrap. In addition, the deconvolved infection incidences successfully detected two associated infection events after the bar was closed. The application of the real-time deconvolution to the Wuxi data identified all explosive incidence increases. The results revealed the progression of the two COVID-19 outbreaks and provided new insights for prevention and control strategies, especially for the role of mass daily RT-PCR testing. CONCLUSIONS: The proposed deconvolution method is generally applicable to other infectious diseases if the delay model can be assumed to be approximately valid. To ensure a fair reconstruction of daily infection incidences, the delay function should be estimated in a similar context in terms of virus variant and test protocol. Both the delay estimate from the E-M algorithm and the incidences resulting from deconvolution are valuable for epidemic prevention and control. The real-time feedback is particularly useful during the epidemic's acute phase because it can help the local disease control authorities modify the control measures more promptly and precisely.
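
Illustrative sketch (not the authors' code): the convolution model and the Richardson-Lucy iteration named in the abstract can be written in a few lines. The sketch assumes that `delay[s]` is the probability that an infection is first detected s days later, that the observation window has the same length as the incidence series, and that infections near the end of the window are only partly observed, which the `norm` term corrects for. The delay values and the toy incidence series are invented for illustration.

```python
def convolve(incidence, delay):
    """Expected positive counts: y[t] = sum_j delay[t - j] * incidence[j]."""
    n = len(incidence)
    return [sum(delay[t - j] * incidence[j]
                for j in range(max(0, t - len(delay) + 1), t + 1))
            for t in range(n)]


def richardson_lucy(observed, delay, iterations=200):
    """Multiplicative Richardson-Lucy updates; estimates stay non-negative."""
    n = len(observed)
    # Probability that an infection on day j is detected inside the window.
    norm = [sum(delay[:n - j]) for j in range(n)]
    x = [max(sum(observed) / n, 1e-12)] * n  # flat positive starting guess
    for _ in range(iterations):
        expected = convolve(x, delay)
        ratio = [o / e if e > 0 else 0.0 for o, e in zip(observed, expected)]
        x = [x[j] * sum(delay[t - j] * ratio[t]
                        for t in range(j, min(n, j + len(delay)))) / norm[j]
             if norm[j] > 0 else 0.0
             for j in range(n)]
    return x


delay = [0.1, 0.2, 0.4, 0.3]                  # hypothetical delay distribution
true_incidence = [0, 0, 20, 0, 0, 8, 0, 0, 0, 0]
observed = convolve(true_incidence, delay)    # noise-free toy observations
estimate = richardson_lucy(observed, delay)
refit = convolve(estimate, delay)
```

Each update multiplies the current estimate by a back-projected ratio of observed to expected counts, so positivity is preserved automatically and the total of the refitted counts matches the total of the observations.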


Subjects
COVID-19; Humans; COVID-19/diagnosis; COVID-19/epidemiology; SARS-CoV-2/genetics; Incidence; Reverse Transcriptase Polymerase Chain Reaction; COVID-19 Testing
5.
BMC Bioinformatics ; 22(1): 386, 2021 Jul 28.
Article in English | MEDLINE | ID: mdl-34320923

ABSTRACT

BACKGROUND: Normalization of RNA-seq data aims at identifying biological expression differentiation between samples by removing the effects of unwanted confounding factors. Explicitly or implicitly, the justification of normalization requires a set of housekeeping genes. However, the existence of housekeeping genes common for a very large collection of samples, especially under a wide range of conditions, is questionable. RESULTS: We propose to carry out pairwise normalization with respect to multiple references, selected from representative samples. Then the pairwise intermediates are integrated based on a linear model that adjusts the reference effects. Motivated by the notion of housekeeping genes and their statistical counterparts, we adopt the robust least trimmed squares regression in pairwise normalization. The proposed method (MUREN) is compared with other existing tools on some standard data sets. The goodness of normalization emphasizes preserving possible asymmetric differentiation, whose biological significance is exemplified by a single-cell data set of the cell cycle. MUREN is implemented as an R package. The code under license GPL-3 is available on the GitHub platform: github.com/hippo-yf/MUREN and on the conda platform: anaconda.org/hippo-yf/r-muren. CONCLUSIONS: MUREN performs the RNA-seq normalization using a two-step statistical regression induced from a general principle. We propose that the densities of pairwise differentiations be used to evaluate the goodness of normalization. MUREN adjusts the mode of differentiation toward zero while preserving the skewness due to biological asymmetric differentiation. Moreover, by robustly integrating pre-normalized counts with respect to multiple references, MUREN is immune to individual outlier samples.


Subjects
Gene Expression Profiling; Genes, Essential; RNA-Seq; Sequence Analysis, RNA; Exome Sequencing
6.
Mol Biol Evol ; 37(6): 1679-1693, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32068872

ABSTRACT

To understand the genomic basis accounting for the phenotypic differences between human and apes, we compare the matrices consisting of the cis-element frequencies in the proximal regulatory regions of their genomes. One such frequency matrix is represented by a robust singular value decomposition. For each singular value, the negative and positive ends of the sorted motif eigenvector correspond to the dual ends of the sorted gene eigenvector, respectively, comprising a dual eigen-module defined by cis-regulatory element frequencies (CREF). The CREF eigen-modules at levels 1, 2, 3, and 6 are highly conserved across humans, chimpanzees, and orangutans. The key biological processes embedded in the top three CREF eigen-modules are reproduction versus embryogenesis, fetal maturation versus immune system, and stress responses versus mitosis. Although the divergence at the nucleotide level between the chimpanzee and human genome was small, their cis-element frequency matrices crossed a singularity point, at which the fourth and fifth singular values were identical. The CREF eigen-modules corresponding to the fourth and fifth singular values were reorganized along the evolution from apes to human. Interestingly, the fourth sorted gene eigenvector encodes the phenotypes unique to human such as long-term memory, language development, and social behavior. The number of motifs present on Alu elements increases substantially at the fourth level. The motif analysis together with the cases of human-specific Alu insertions suggests that mutations related to Alu elements play a critical role in the evolution of the human-phenotypic gene eigenvector.


Subjects
Alu Elements; Biological Evolution; Genome, Human; Hominidae/genetics; Regulatory Elements, Transcriptional; Animals; Cell Cycle Proteins/genetics; Cognition; Embryonic Development/genetics; Humans; Language Development; Memory, Long-Term; Phenotype; Social Behavior
7.
BMC Bioinformatics ; 20(Suppl 7): 201, 2019 May 01.
Article in English | MEDLINE | ID: mdl-31074378

ABSTRACT

BACKGROUND: A key problem in systems biology is the determination of the regulatory mechanism corresponding to a phenotype. An empirical approach in this regard is to compare the expression profiles of cells under two conditions or tissues from two phenotypes and to unravel the underlying transcriptional regulation. We have proposed the method BASE to statistically infer the effective regulatory factors that are responsible for the gene expression differentiation with the help of the binding data between factors and genes. Usually the protein-DNA binding data are obtained by ChIP-seq experiments, which can be costly and are condition-specific. RESULTS: Here we report a definition of binding strength based on a probability model. Using this condition-free definition, the BASE method needs only the frequencies of cis-motifs in regulatory regions, so the inferences can be carried out in silico. The directional regulation can be inferred by considering down- and up-regulation separately. We showed the effectiveness of the approach by one case study. In the study of the effects of polyunsaturated fatty acids (PUFA), namely, docosahexaenoic (DHA) and eicosapentaenoic (EPA) diets on mouse small intestine cells, the inferences of regulations are consistent with those reported in the literature, including PPARα and NFκB, respectively corresponding to enhanced adipogenesis and reduced inflammation. Moreover, we discovered enhanced RORA regulation of circadian rhythm, and reduced ETS1 regulation of angiogenesis. CONCLUSIONS: With the probabilistic definition of cis-trans binding affinity, the BASE method could obtain the significances of TF regulation changes corresponding to a gene expression differentiation profile between treatment and control samples. The landscape of the inferred cis-trans regulations is helpful for revealing the underlying molecular mechanisms. In particular, we reported a more comprehensive picture of the regulation induced by the EPA and DHA diet.


Subjects
Angiogenesis Inducing Agents/administration & dosage; Docosahexaenoic Acids/administration & dosage; Eicosapentaenoic Acid/administration & dosage; Gene Expression Regulation; Hyperlipidemias/genetics; Nucleotide Motifs; Transcription, Genetic; Adipogenesis/drug effects; Animals; Hyperlipidemias/drug therapy; Intestine, Small/metabolism; Mice; Promoter Regions, Genetic
8.
Bioinformatics ; 34(12): 2019-2028, 2018 06 15.
Article in English | MEDLINE | ID: mdl-29346504

ABSTRACT

Motivation: It is highly desirable to assemble genomes of high continuity and consistency at low cost. The current bottleneck of draft genome continuity using the second generation sequencing (SGS) reads is primarily caused by uncertainty among repetitive sequences. Even though the single-molecule real-time sequencing technology is very promising for overcoming the uncertainty issue, its relatively high cost and error rate add to the budgetary and computational burden. Many long-read assemblers adopt the overlap-layout-consensus (OLC) paradigm, which is less sensitive to sequencing errors, heterozygosity and variability of coverage. However, current assemblers of SGS data do not sufficiently take advantage of the OLC approach. Results: Aiming at minimizing uncertainty, the proposed method, BAUM, breaks the whole genome into regions by adaptive unique mapping; then the local OLC is used to assemble each region in parallel. BAUM can (i) perform reference-assisted assembly based on the genome of a closely related species or (ii) improve the results of existing assemblies that are obtained based on short or long sequencing reads. The tests on two eukaryote genomes, a wild rice Oryza longistaminata and a parrot Melopsittacus undulatus, show that BAUM achieved substantial improvement in genome size and continuity. In addition, BAUM reconstructed a considerable number of repetitive regions that existing short read assemblers failed to assemble. We also propose statistical approaches to control the uncertainty in different steps of BAUM. Availability and implementation: http://www.zhanyuwang.xin/wordpress/index.php/2017/07/21/baum. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
High-Throughput Nucleotide Sequencing/methods; Repetitive Sequences, Nucleic Acid; Sequence Analysis, DNA/methods; Software; Genomics/methods; Uncertainty
9.
BMC Bioinformatics ; 18(1): 335, 2017 Jul 11.
Article in English | MEDLINE | ID: mdl-28697757

ABSTRACT

BACKGROUND: Phred quality scores are essential for downstream DNA analysis such as SNP detection and DNA assembly. Thus a valid model to define them is indispensable for any base-calling software. Recently, we developed the base-caller 3Dec for Illumina sequencing platforms, which reduces base-calling errors by 44-69% compared to the existing ones. However, the model to predict its quality scores has not been fully investigated yet. RESULTS: In this study, we used logistic regression models to evaluate quality scores from predictive features, which include different aspects of the sequencing signals as well as local DNA contents. Sparse models were further obtained by three methods: the backward deletion with either AIC or BIC and the L1 regularization learning method. The L1-regularized one was then compared with the Illumina scoring method. CONCLUSIONS: The L1-regularized logistic regression improves the empirical discrimination power by as much as 14% and 25% respectively for two kinds of preprocessed sequencing signals, compared to the Illumina scoring method. Namely, the L1 method identifies more base calls of high fidelity. Computationally, the L1 method can handle large datasets and is efficient enough for daily sequencing. Meanwhile, the logistic model resulting from BIC is more interpretable. The modeling suggested that the most prominent quenching pattern in the current chemistry of Illumina occurred at the dinucleotide "GT". In addition, nucleotides were more likely to be miscalled as the previous bases if the preceding ones were not "G". This suggests that the phasing effect of bases after "G" was somewhat different from those after other nucleotide types.
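
Illustrative sketch (not the 3Dec scoring model): a logistic model maps predictive features to an error probability, and the standard Phred definition Q = -10 log10(p) turns that probability into a quality score. The feature values, weights, and bias below are placeholders invented for this example; a zero weight stands in for a feature that L1 regularization would have dropped.

```python
import math


def error_probability(features, weights, bias):
    """Logistic model for the probability that a base call is wrong."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def phred(p_error):
    """Phred quality score: Q = -10 * log10(probability of error)."""
    return -10.0 * math.log10(p_error)


# Hypothetical sparse model: the second feature has weight 0 and is ignored.
weights = [-2.0, 0.0, 1.5]
bias = -4.0
features = [1.2, 0.7, 0.1]   # placeholder signal and sequence-context features

p = error_probability(features, weights, bias)
q = phred(p)
```

An error probability of 0.001 corresponds to Q30, and every tenfold drop in the predicted error probability adds 10 to the score.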


Subjects
High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Software; Logistic Models
10.
Methods ; 67(3): 394-406, 2014 Jun 01.
Article in English | MEDLINE | ID: mdl-24440483

ABSTRACT

The nanoparticle gadolinium endohedral metallofullerenol [Gd@C82(OH)22]n is a new candidate for cancer treatment with low toxicity. However, its anti-cancer mechanisms remain mostly unknown. In this study, we took a systems biology view of the gene expression profiles of human breast cancer cells (MCF-7) and human umbilical vein endothelial cells (ECV304) treated with and without [Gd@C82(OH)22]n, respectively, measured by the Agilent Gene Chip G4112F. To properly analyze these data, we modified a suite of statistical methods we developed. For the first time we applied the sub-sub normalization to Agilent two-color microarrays. Instead of a simple linear regression, we proposed to use a one-knot SPLINE model in the sub-sub normalization to account for nonlinear spatial effects. The parameters estimated by least trimmed squares- and S-estimators show similar normalization results. We made several kinds of inferences by integrating the expression profiles with the bioinformatic knowledge in KEGG pathways, Gene Ontology, JASPAR, and TRANSFAC. In the transcriptional inference, we proposed the BASE2.0 method to infer a transcription factor's up-regulation and down-regulation activities separately. Overall, [Gd@C82(OH)22]n induces more differentiation in MCF-7 cells than in ECV304 cells, particularly in the reduction of protein processing such as protein glucosylation, folding, targeting, exporting, and transporting. Among the KEGG pathways, the ErbB signaling pathway is up-regulated, whereas protein processing in endoplasmic reticulum (ER) is down-regulated. CHOP, a key pro-apoptotic gene downstream of the ER stress pathway, increases ninefold in MCF-7 cells after treatment. These findings indicate that ER stress may be one important factor that induces apoptosis in MCF-7 cells after [Gd@C82(OH)22]n treatment. The expression profiles of genes associated with ER stress and apoptosis are statistically consistent with other profiles reported in the literature, such as those of HEK293T and MCF-7 cells induced by the miR-23a∼27a∼24-2 cluster. Furthermore, one of the inferred regulatory mechanisms comprises the apoptosis network centered around TP53, whose effective regulation of apoptosis is somehow reestablished after [Gd@C82(OH)22]n treatment. These results elucidate the application and development of [Gd@C82(OH)22]n and other fullerene derivatives.


Subjects
Apoptosis/drug effects; Endoplasmic Reticulum/drug effects; Systems Biology/methods; Cell Proliferation/drug effects; Fullerenes/chemistry; Fullerenes/therapeutic use; Gadolinium/chemistry; Gadolinium/therapeutic use; Gene Regulatory Networks; Humans; MCF-7 Cells; Nanoparticles/chemistry; Nanoparticles/therapeutic use; Oligonucleotide Array Sequence Analysis; Stress, Physiological; Transcriptome; Tumor Suppressor Protein p53/genetics; Tumor Suppressor Protein p53/metabolism; Tumor Suppressor Protein p53/physiology
11.
Nucleic Acids Res ; 39(Web Server issue): W557-61, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21576217

ABSTRACT

The massively parallel sequencing technologies have recently flourished and dramatically cut the cost of sequencing personal human genomes. Haplotype assembly from personal genomes sequenced using the massively parallel sequencing technologies is becoming a cost-effective and promising tool for human disease study. Computational assembly of haplotypes has proven to be very accurate, but the assembled haplotypes still contain errors. Here we present a tool, HapEdit, to assess the accuracy of assembled haplotypes and edit them manually. Using this tool, a user can break erroneous haplotype segments into smaller segments, or concatenate haplotype segments if the concatenated haplotype segments are sufficiently supported. A user can also edit bases with low-quality scores. HapEdit displays haplotype assemblies so that a user can easily navigate and pinpoint a region of interest. As inputs, HapEdit currently takes reads from the Polonator, Illumina, SOLiD, 454 and Sanger sequencing technologies.


Subjects
Haplotypes; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA; Software; Internet
12.
PLoS One ; 18(10): e0292579, 2023.
Article in English | MEDLINE | ID: mdl-37816033

ABSTRACT

Pancreatic islet failure is a key characteristic of type 2 diabetes besides insulin resistance. To get molecular insights into the pathology of islets in type 2 diabetes, we developed a computational approach to integrating expression profiles of Goto-Kakizaki and Wistar rat islets from a designed experiment with those of the human islets from an observational study. A principal gene-eigenvector in the expression profiles, characterized by up-regulated angiogenesis and down-regulated oxidative phosphorylation, was identified as conserved across the two species. In the case of Goto-Kakizaki versus Wistar islets, such alteration in gene expression can be verified directly by the treatment-control tests over time, and corresponds to the alteration of α/β-cell distribution obtained by quantifying the islet micrographs. Furthermore, the correspondence between the dual sample- and gene-eigenvectors unveils more delicate structures. In the case of rats, the up- and down-trend of insulin mRNA levels before and after week 8 correspond respectively to the top two principal eigenvectors. In the case of humans, the top two principal eigenvectors correspond respectively to the late and early stages of diabetes. According to the aggregated expression signature, a large portion of genes involved in the hypoxia-inducible factor signaling pathway, which activates transcription of angiogenesis, were significantly up-regulated. Furthermore, top-ranked anti-angiogenic genes THBS1 and PEDF indicate the existence of a counteractive mechanism that is in line with thickened and fragmented capillaries found in the deteriorated islets. Overall, the integrative analysis unravels the principal transcriptional alterations underlying the islet deterioration of morphology and insulin secretion along type 2 diabetes progression.


Subjects
Diabetes Mellitus, Type 2; Insulin-Secreting Cells; Islets of Langerhans; Rats; Humans; Animals; Diabetes Mellitus, Type 2/pathology; Rats, Wistar; Islets of Langerhans/metabolism; Insulin-Secreting Cells/metabolism; Insulin Secretion; Insulin/genetics; Insulin/metabolism
13.
Nucleic Acids Res ; 38(1): 143-58, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19880387

ABSTRACT

In an attempt to elucidate the underlying longevity-promoting mechanisms of mutants lacking SCH9, which live three times as long as wild type chronologically, we measured their time-course gene expression profiles. We interpreted their expression time differences by statistical inferences based on prior biological knowledge, and identified the following significant changes: (i) between 12 and 24 h, stress response genes were up-regulated by larger fold changes and ribosomal RNA (rRNA) processing genes were down-regulated more dramatically; (ii) mitochondrial ribosomal protein genes were not up-regulated between 12 and 60 h as wild type were; (iii) electron transport, oxidative phosphorylation and TCA genes were down-regulated early; (iv) the up-regulation of TCA and electron transport was accompanied by deep down-regulation of rRNA processing over time; and (v) rRNA processing genes were more volatile over time, and three associated cis-regulatory elements [rRNA processing element (rRPE), polymerase A and C (PAC) and glucose response element (GRE)] were identified. Deletion of AZF1, which encodes the transcriptional factor that binds to the GRE element, reversed the lifespan extension of sch9Delta. The significant alterations in these time-dependent expression profiles imply that the lack of SCH9 turns on the longevity programme that extends the lifespan through changes in metabolic pathways and protection mechanisms, particularly, the regulation of aerobic respiration and rRNA processing.


Subjects
Gene Expression Regulation, Fungal; Protein Serine-Threonine Kinases/genetics; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae/genetics; Citric Acid Cycle/genetics; Electron Transport/genetics; Gene Expression Profiling; Kinetics; Mitochondrial Proteins/genetics; Mitochondrial Proteins/metabolism; Mutation; Oligonucleotide Array Sequence Analysis; Oxidative Phosphorylation; Promoter Regions, Genetic; RNA Processing, Post-Transcriptional; RNA, Ribosomal/metabolism; Response Elements; Ribosomal Proteins/genetics; Ribosomal Proteins/metabolism; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/metabolism; Schizosaccharomyces/genetics; Schizosaccharomyces/metabolism; Stress, Physiological/genetics; Transcription Factors/metabolism
14.
PLoS Genet ; 5(5): e1000467, 2009 May.
Article in English | MEDLINE | ID: mdl-19424415

ABSTRACT

The effect of calorie restriction (CR) on life span extension, demonstrated in organisms ranging from yeast to mice, may involve the down-regulation of pathways, including Tor, Akt, and Ras. Here, we present data suggesting that yeast Tor1 and Sch9 (a homolog of the mammalian kinases Akt and S6K) are central components of a network that controls a common set of genes implicated in a metabolic switch from the TCA cycle and respiration to glycolysis and glycerol biosynthesis. During chronological survival, mutants lacking SCH9 depleted extracellular ethanol and reduced stored lipids, but synthesized and released glycerol. Deletion of the glycerol biosynthesis genes GPD1, GPD2, or RHR2, among the most up-regulated in long-lived sch9Delta, tor1Delta, and ras2Delta mutants, was sufficient to reverse chronological life span extension in sch9Delta mutants, suggesting that glycerol production, in addition to the regulation of stress resistance systems, optimizes life span extension. Glycerol, unlike glucose or ethanol, did not adversely affect the life span extension induced by calorie restriction or starvation, suggesting that carbon source substitution may represent an alternative to calorie restriction as a strategy to delay aging.


Subjects
Phosphatidylinositol 3-Kinases/genetics; Phosphatidylinositol 3-Kinases/metabolism; Protein Serine-Threonine Kinases/genetics; Protein Serine-Threonine Kinases/metabolism; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/metabolism; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Animals; Caloric Restriction; Carbon/metabolism; Cell Respiration; Citric Acid Cycle; Culture Media; Ethanol/metabolism; Gene Expression Profiling; Genes, Fungal; Glycerol/metabolism; Glycolysis; Longevity; Models, Biological; Mutation; Signal Transduction; ras Proteins/genetics; ras Proteins/metabolism
15.
Bioinformatics ; 25(18): 2430-1, 2009 Sep 15.
Article in English | MEDLINE | ID: mdl-19561337

ABSTRACT

SUMMARY: Haplotype assembly is becoming a very important tool in genome sequencing of human and other organisms. Although haplotypes were previously inferred from genome assemblies, there has never been a comparative haplotype browser that depicts a global picture of whole-genome alignments among haplotypes of different organisms. We introduce a whole-genome HAPLotype brOWSER (HAPLOWSER), providing evolutionary perspectives from multiple aligned haplotypes and functional annotations. Haplowser enables the comparison of haplotypes from metagenomes, and associates conserved regions or the bases at the conserved regions with functional annotations and custom tracks. The associations are quantified for further analysis and presented as pie charts. Functional annotations and custom tracks that are projected onto haplotypes are saved as multiple files in FASTA format. Haplowser provides a user-friendly interface, and can display alignments of haplotypes with functional annotations at any resolution. AVAILABILITY: Haplowser, written in Java, supports multiple platforms including Windows and Linux. Haplowser is publicly available at http://embio.yonsei.ac.kr/haplowser .


Subjects
Computational Biology/methods; Genome; Haplotypes; Metagenome; Software; Databases, Genetic; Genomics; Internet
16.
BMC Genomics ; 10: 225, 2009 May 15.
Article in English | MEDLINE | ID: mdl-19442316

ABSTRACT

BACKGROUND: Aberrant activation or expression of transcription factors has been implicated in the tumorigenesis of various types of cancer. In spite of the prevalent application of microarray experiments for profiling gene expression in cancer samples, they provide limited information regarding the activities of transcription factors. However, the association between transcription factors and cancers is largely dependent on the transcription regulatory activities rather than mRNA expression levels. RESULTS: In this paper, we propose a computational approach that integrates microarray expression data with the transcription factor binding site information to systematically identify transcription factors associated with patient survival given a specific cancer type. This approach was applied to two gene expression data sets for breast cancer and acute myeloid leukemia. We found that two transcription factor families, the steroid nuclear receptor family and the ATF/CREB family, are significantly correlated with the survival of patients with breast cancer; and that a transcription factor named T-cell acute lymphocytic leukemia 1 is significantly correlated with acute myeloid leukemia patient survival. CONCLUSION: Our analysis identifies transcription factors associating with patient survival and provides insight into the regulatory mechanism underlying the breast cancer and leukemia. The transcription factors identified by our method are biologically meaningful and consistent with prior knowledge. As an insightful tool, this approach can also be applied to other microarray cancer data sets to help researchers better understand the intricate relationship between transcription factors and diseases.


Subjects
Breast Neoplasms/genetics , Gene Expression Profiling/methods , Leukemia, Myeloid, Acute/genetics , Oligonucleotide Array Sequence Analysis/methods , Transcription Factors/genetics , Basic Helix-Loop-Helix Transcription Factors/genetics , Humans , Logistic Models , Proportional Hazards Models , Proto-Oncogene Proteins/genetics , Receptors, Steroid/genetics , Survival Rate , T-Cell Acute Lymphocytic Leukemia Protein 1
17.
BMC Bioinformatics ; 9: 194, 2008 Apr 14.
Article in English | MEDLINE | ID: mdl-18410691

ABSTRACT

BACKGROUND: Microarray pre-processing usually consists of normalization and summarization. Normalization aims to remove non-biological variations across different arrays. Normalization algorithms generally require the specification of reference and target arrays, but the issue of reference selection has not been fully addressed. Summarization aims to estimate the transcript abundance from normalized intensities. In this paper, we consider normalization and summarization jointly through a new strategy of reference selection. RESULTS: We propose a Probe-Treatment-Reference (PTR) model to streamline normalization and summarization by allowing multiple references. We estimate the parameters of the model by the Least Absolute Deviations (LAD) approach and implement the computation by median polishing. We show that the LAD estimator is robust in the sense that it has bounded influence in the three-factor PTR model. This model fitting implicitly defines an "optimal reference" for each probe-set. We evaluate the effectiveness of the PTR method on two Affymetrix spike-in data sets. Our method reduces the variations of non-differentially expressed genes and thereby increases the detection power for differentially expressed genes. CONCLUSION: Our results indicate that the reference effect is important and should be considered in microarray pre-processing. The proposed PTR method is a general framework for dealing with the issue of reference selection and can readily be applied to existing normalization algorithms such as the invariant-set, sub-array and quantile methods.
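The median polishing step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' PTR implementation: it fits only a two-way additive model (rows and columns, standing in hypothetically for probes and arrays) rather than the three-factor model described, and the function name `median_polish` and its parameters are assumptions made for illustration.

```python
from statistics import median


def median_polish(table, max_iter=10, tol=1e-9):
    """Two-way median polish (Tukey).

    Decomposes table[i][j] into overall + row[i] + col[j] + resid[i][j]
    by alternately sweeping out row medians and column medians.
    Illustrative only: the abstract describes a three-factor model.
    """
    n_rows, n_cols = len(table), len(table[0])
    resid = [list(map(float, r)) for r in table]
    overall = 0.0
    row = [0.0] * n_rows
    col = [0.0] * n_cols

    for _ in range(max_iter):
        change = 0.0

        # Sweep the median of each row into the row effects.
        for i in range(n_rows):
            m = median(resid[i])
            row[i] += m
            resid[i] = [v - m for v in resid[i]]
            change += abs(m)
        # Move the median of the column effects into the overall term.
        m = median(col)
        overall += m
        col = [c - m for c in col]

        # Sweep the median of each column into the column effects.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
            change += abs(m)
        # Move the median of the row effects into the overall term.
        m = median(row)
        overall += m
        row = [r - m for r in row]

        if change < tol:  # nothing left to sweep out
            break

    return overall, row, col, resid
```

Because medians rather than means are swept out, a single aberrant cell tends to end up in the residuals instead of distorting the row and column effects, which is the kind of robustness the abstract attributes to the LAD estimator.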


Subjects
DNA Probes/genetics , Gene Expression Profiling/methods , Models, Genetic , Oligonucleotide Array Sequence Analysis/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Algorithms , Base Sequence , Computer Simulation , DNA Probes/standards , Gene Expression Profiling/standards , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis/standards , Reference Values
18.
BMC Genomics ; 9: 116, 2008 Mar 03.
Article in English | MEDLINE | ID: mdl-18315882

ABSTRACT

BACKGROUND: The cell cycle has long been an important model for studying genome-wide transcriptional regulation. Although several methods have been introduced to identify cell cycle regulated genes from microarray data, they cannot be used directly to investigate cell cycle regulated transcription factors (CCRTFs), because for many transcription factors (TFs) it is their activities rather than their expression levels that are periodically regulated across the cell cycle. To overcome this problem, it is useful to infer TF activities across the cell cycle by integrating microarray expression data with ChIP-chip data, and then to examine the periodicity of the inferred activities. For most species, however, large-scale ChIP-chip data are still not available. RESULTS: We propose a two-step method to identify CCRTFs by integrating microarray cell cycle data with ChIP-chip data or motif discovery data. In S. cerevisiae, we identify 42 CCRTFs, among which 23 have been verified experimentally. The cell cycle related behaviors predicted by our method (e.g. the cell cycle phase at which a TF achieves its highest activity) are consistent with well-established knowledge about these TFs. We also find that the periodic activity fluctuations of some TFs can be perturbed by the cell synchronization treatment. Moreover, by integrating expression data with in-silico motif discovery data, we identify 8 cell cycle associated regulatory motifs, among which 7 are binding sites for well-known cell cycle related TFs. CONCLUSION: Our method is effective at identifying CCRTFs by integrating microarray cell cycle data with TF-gene binding information. In S. cerevisiae, the TF-gene binding information is provided by systematic ChIP-chip experiments. In other species where systematic ChIP-chip data are not available, in-silico motif discovery and analysis provide an alternative. Therefore, our method is ready to be applied to microarray cell cycle data sets from different species. The C++ program for AC score calculation is available for download from URL http://leili-lab.cmb.usc.edu/yeastaging/projects/project-base/.
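The second step described in this abstract, examining the periodicity of an inferred activity profile, can be sketched generically as follows. This is an assumption-laden illustration and not the authors' AC score or their C++ program: it simply measures how much of a time series' variance is captured by a single sinusoid of a given period. The function name `periodicity_score`, the time units, and the choice of period are all hypothetical.

```python
import math


def periodicity_score(values, times, period):
    """Share of variance explained by one sinusoid with the given period.

    For evenly spaced samples covering a whole number of periods the
    score lies between 0 (no component at that period) and 1 (a pure
    sinusoid of that period). Generic sketch, not the paper's statistic.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    total = sum(c * c for c in centered)
    if total == 0:  # a flat profile has no periodic component
        return 0.0

    w = 2 * math.pi / period
    # Project the centered series onto cosine and sine at the target frequency.
    a = sum(c * math.cos(w * t) for c, t in zip(centered, times))
    b = sum(c * math.sin(w * t) for c, t in zip(centered, times))
    return (a * a + b * b) / (n / 2 * total)
```

In practice one would call this on each TF's inferred activity profile with the period set to the estimated cell cycle length, and assess significance separately, for example by shuffling time points.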


Subjects
Cell Cycle Proteins/genetics , Cell Cycle/genetics , Oligonucleotide Array Sequence Analysis/methods , Transcription Factors/genetics , Algorithms , Gene Expression Profiling , Saccharomyces cerevisiae/genetics
19.
BMC Bioinformatics ; 8: 452, 2007 Nov 16.
Article in English | MEDLINE | ID: mdl-18021409

ABSTRACT

BACKGROUND: The identification of transcription factors (TFs) associated with a biological process is fundamental to understanding its regulatory mechanisms. From microarray data, however, the activity changes of TFs often cannot be directly observed due to their relatively low expression levels, post-transcriptional modifications, and other complications. Several approaches have been proposed to infer TF activity changes from microarray data. Some models assume a linear relationship between gene expression and TF-gene binding strength. In other models, the target genes of a TF are first determined by applying a significance cutoff to binding affinity scores, and then expression differentiation is checked between the target genes and the other genes. RESULTS: We propose a novel method, referred to as BASE (binding association with sorted expression), to infer TF activity changes from microarray expression profiles with the help of binding affinity data. It searches for the maximum association between the binding affinity profile of a TF and the expression change profile along the direction of sorted differentiation. The method does not make a hard selection of target genes; rather, the significance of TF activity changes is evaluated at the end by permutation tests of binding association. To show the effectiveness of this method, we apply it to three typical examples using different kinds of binding affinity data, namely ChIP-chip data, motif discovery data, and position weight matrix scanning data. The implications obtained from all three examples are consistent with established biological results. Moreover, the inferences suggest new and biologically meaningful hypotheses for further investigation. CONCLUSION: The proposed method makes transcription inference from profiles of expression and binding affinity. The same machinery can be used to deal with various kinds of binding affinity data. The method does not require a linear assumption, and has the desirable property of scale-invariance with respect to TF-specific binding affinity. This method is easy to implement and can be routinely applied for transcriptional inferences in microarray studies.
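The idea of scoring binding association along sorted expression changes and then testing it by permutation can be sketched as below. This is a simplified illustration under stated assumptions, not the published BASE implementation: the running-sum statistic, the function names `max_association` and `permutation_p_value`, the use of non-negative affinities, and the two-sided permutation scheme are all choices made here for the example.

```python
import random


def max_association(expr_change, affinity):
    """Largest signed gap between two running sums along sorted genes.

    Genes are sorted by expression change (most up-regulated first).
    The foreground sum weights each gene by |change| * affinity, the
    background sum by |change| alone; both are normalized to end at 1.
    Assumes affinities are non-negative. Illustrative statistic only.
    """
    order = sorted(range(len(expr_change)),
                   key=lambda g: expr_change[g], reverse=True)
    w_total = sum(abs(expr_change[g]) * affinity[g] for g in order)
    e_total = sum(abs(expr_change[g]) for g in order)
    if w_total == 0 or e_total == 0:
        return 0.0

    fg = bg = best = 0.0
    for g in order:
        fg += abs(expr_change[g]) * affinity[g] / w_total
        bg += abs(expr_change[g]) / e_total
        if abs(fg - bg) > abs(best):
            best = fg - bg
    return best


def permutation_p_value(expr_change, affinity, n_perm=1000, seed=0):
    """Two-sided permutation test: shuffle affinities across genes."""
    rng = random.Random(seed)
    observed = max_association(expr_change, affinity)
    shuffled = list(affinity)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(max_association(expr_change, shuffled)) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

Because the foreground sum is divided by its own total, multiplying every affinity of a TF by a positive constant leaves the score unchanged, which mirrors the scale-invariance property the abstract highlights. A positive score suggests that high-affinity genes concentrate among up-regulated genes, a negative score among down-regulated ones.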


Subjects
Algorithms , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA/methods , Transcription Factors/chemistry , Transcription Factors/genetics , Base Sequence , Binding Sites , Molecular Sequence Data , Protein Binding , Sequence Alignment , Structure-Activity Relationship
20.
BMC Genomics ; 8: 219, 2007 Jul 06.
Article in English | MEDLINE | ID: mdl-17617911

ABSTRACT

BACKGROUND: Three kinases, Sch9, PKA and TOR, are suggested to be involved in both replicative and chronological ageing in yeast. They function in pathways whose down-regulation leads to life span extension. Several stress response proteins, including the two transcription factors Msn2 and Msn4, mediate the longevity extension phenotype associated with decreased activity of Sch9, PKA, or TOR. However, the mechanisms of longevity, especially the underlying transcription program, are not fully understood. RESULTS: We measured the gene expression profiles in wild type yeast and three long-lived mutants: sch9Delta, ras2Delta, and tor1Delta. To elucidate the transcription program that may account for the longevity extension, we identified the transcription factors that are systematically and significantly associated with the expression differentiation in these mutants relative to wild type by integrating microarray expression data with motif data and with ChIP-chip data. Our analysis suggests that three stress response transcription factors, Msn2, Msn4 and Gis1, are activated in all three mutants. We also identify other transcription factors, such as Fhl1 and Hsf1, that may be involved in the transcriptional modification in the long-lived mutants. CONCLUSION: Combining microarray expression data with other data sources such as motif and ChIP-chip data provides biological insights into the transcription modification that leads to life span extension. In the chronologically long-lived mutants sch9Delta, ras2Delta, and tor1Delta, several common stress response transcription factors are activated compared with the wild type according to our systematic transcription inference.


Subjects
Gene Expression Profiling , Gene Expression Regulation, Fungal , Longevity/genetics , Saccharomyces cerevisiae/genetics , Transcription, Genetic , Amino Acid Motifs , Base Sequence , Chromatin Immunoprecipitation , Gene Regulatory Networks , Genes, Fungal , Molecular Sequence Data , Saccharomyces cerevisiae Proteins/genetics , Sequence Analysis, DNA , Transcription Factors/genetics