Results 1 - 19 of 19
1.
BMC Med Inform Decis Mak ; 23(1): 153, 2023 08 08.
Article in English | MEDLINE | ID: mdl-37553569

ABSTRACT

BACKGROUND: Recent advances in biotechnology and computer science have led to an ever-increasing availability of public biomedical data distributed in large databases worldwide. However, these data collections are far from being standardized enough to be harmonized or integrated, which makes it impossible to fully exploit the latest machine learning technologies for their analysis. Hence, handling this huge flow of biomedical data is a challenging task for researchers and clinicians because of its complexity and high heterogeneity. This is the case for neurodegenerative diseases, and for Alzheimer's Disease (AD) in particular, in whose context specialized data collections such as the one by the Alzheimer's Disease Neuroimaging Initiative (ADNI) are maintained. METHODS: Ontologies are controlled vocabularies that allow the semantics of data and their relationships in a given domain to be represented. They are often exploited to aid knowledge and data management in healthcare research. Computational ontologies are the result of the combination of data management systems and traditional ontologies. Our approach is i) to define a computational ontology representing a logic-based formal conceptual model of the ADNI data collection and ii) to provide a means for populating the ontology with the actual ADNI data. These two components make it possible to semantically query the ADNI database in order to support data extraction in a more intuitive manner. RESULTS: We developed: i) a detailed computational ontology for clinical multimodal datasets from the ADNI repository in order to simplify access to these data; ii) a means for populating this ontology with the actual ADNI data. This computational ontology makes it possible to run complex queries on the ADNI files and to obtain new diagnostic knowledge about Alzheimer's disease.
CONCLUSIONS: The proposed ontology will improve the access to the ADNI dataset, allowing queries to extract multivariate datasets to perform multidimensional and longitudinal statistical analyses. Moreover, the proposed ontology can be a candidate for supporting the design and implementation of new information systems for the collection and management of AD data and metadata, and for being a reference point for harmonizing or integrating data residing in different sources.


Subjects
Alzheimer Disease , Humans , Alzheimer Disease/diagnostic imaging , Semantics , Data Management
2.
BMC Bioinformatics ; 19(Suppl 10): 354, 2018 Oct 15.
Article in English | MEDLINE | ID: mdl-30367574

ABSTRACT

BACKGROUND: The rapid growth of Next Generation Sequencing data demands new knowledge extraction methods. In particular, the RNA sequencing gene expression experimental technique stands out for case-control studies on cancer, which can be addressed with supervised machine learning techniques able to extract human-interpretable models composed of genes and their relation to the investigated disease. State-of-the-art rule-based classifiers are designed to extract a single classification model, possibly composed of a few relevant genes. Conversely, we aim to create a large knowledge base composed of many rule-based models, and thus determine which genes could potentially be involved in the analyzed tumor. This comprehensive and open access knowledge base is required to disseminate novel insights about cancer. RESULTS: We propose CamurWeb, a new method and web-based software that is able to extract multiple and equivalent classification models in the form of logic formulas ("if then" rules) and to create a knowledge base of these rules that can be queried and analyzed. The method is based on an iterative classification procedure and an adaptive feature elimination technique that enables the computation of many rule-based models related to the cancer under study. Additionally, CamurWeb includes a user-friendly interface for running the software, querying the results, and managing the performed experiments. The user can create her profile, upload her gene expression data, run the classification analyses, and interpret the results with predefined queries. In order to validate the software we apply it to all publicly available RNA sequencing datasets from The Cancer Genome Atlas database, obtaining a large open access knowledge base about cancer. CamurWeb is available at http://bioinformatics.iasi.cnr.it/camurweb .
CONCLUSIONS: The experiments prove the validity of CamurWeb, obtaining many classification models and thus several genes that are associated with 21 different cancer types. Finally, the comprehensive knowledge base about cancer and the software tool are released online; interested researchers have free access to them for further studies and to design biological experiments in cancer research.


Subjects
Gene Expression Regulation, Neoplastic , Knowledge Bases , Neoplasms/genetics , Software , Base Sequence , Genes, Neoplasm , Genome, Human , Humans , Sequence Analysis, RNA
3.
BMC Med Inform Decis Mak ; 18(1): 35, 2018 05 31.
Article in English | MEDLINE | ID: mdl-29855305

ABSTRACT

BACKGROUND: Alzheimer's Disease (AD) is a neurodegenerative disorder characterized by progressive dementia, for which no cure is currently known. Early detection of patients affected by AD can be obtained by analyzing their electroencephalography (EEG) signals, which show a reduction of complexity, a perturbation of synchrony, and a slowing down of the rhythms. METHODS: In this work, we apply a procedure that exploits feature extraction and classification techniques to EEG signals, with the aim of distinguishing patients affected by AD from those affected by Mild Cognitive Impairment (MCI) and from healthy control (HC) samples. Specifically, we perform a time-frequency analysis by applying both the Fourier and Wavelet Transforms on 109 samples belonging to the AD, MCI, and HC classes. The classification procedure is designed with the following steps: (i) preprocessing of EEG signals; (ii) feature extraction by means of the Discrete Fourier and Wavelet Transforms; and (iii) classification with tree-based supervised methods. RESULTS: By applying our procedure, we are able to extract reliable human-interpretable classification models that automatically assign patients to their class. In particular, by exploiting Wavelet feature extraction we achieve 83%, 92%, and 79% accuracy on the HC vs AD, HC vs MCI, and MCI vs AD classification problems, respectively. CONCLUSIONS: Finally, by comparing the classification performance of the two feature extraction methods, we find that Wavelet analysis outperforms Fourier analysis. Hence, we suggest it in combination with supervised methods for automatic patient classification based on EEG signals, to aid the medical diagnosis of dementia.
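As a rough illustration of step (ii) in the abstract above, the following sketch derives Fourier band-power features from one EEG channel. It is a minimal example under stated assumptions, not the authors' pipeline: the band limits in `BANDS`, the function names, and the use of summed spectral power as the feature are all hypothetical choices, and the Wavelet variant is not shown.

```python
import cmath
import math

# Assumed frequency bands in Hz; the abstract does not specify them.
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}

def dft_power(signal):
    """Squared magnitude of the Discrete Fourier Transform, bins 0..n/2."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))) ** 2
            for k in range(n // 2 + 1)]

def band_powers(signal, sampling_rate):
    """Sum the spectral power falling inside each assumed band."""
    n = len(signal)
    features = {name: 0.0 for name in BANDS}
    for k, power in enumerate(dft_power(signal)):
        freq = k * sampling_rate / n  # frequency of bin k in Hz
        for name, (low, high) in BANDS.items():
            if low <= freq < high:
                features[name] += power
    return features

# One second of a synthetic 10 Hz oscillation sampled at 128 Hz.
fs = 128
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
features = band_powers(signal, fs)
```

The resulting dictionary could serve as one feature vector for a tree-based classifier; the direct DFT used here is quadratic in the signal length and is chosen only to keep the sketch dependency-free.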


Subjects
Alzheimer Disease/diagnosis , Classification/methods , Cognitive Dysfunction/diagnosis , Electroencephalography/methods , Signal Processing, Computer-Assisted , Aged , Aged, 80 and over , Alzheimer Disease/physiopathology , Cognitive Dysfunction/physiopathology , Female , Humans , Male , Middle Aged
4.
Bioinformatics ; 32(5): 697-704, 2016 03 01.
Article in English | MEDLINE | ID: mdl-26519501

ABSTRACT

MOTIVATION: Knowledge extraction methods for Next Generation Sequencing data are in high demand. In this work, we focus on RNA-seq gene expression analysis and specifically on case-control studies with rule-based supervised classification algorithms that build a model able to discriminate cases from controls. State-of-the-art algorithms compute a single classification model that contains few features (genes). In contrast, our goal is to elicit more knowledge by computing many classification models, and therefore to identify most of the genes related to the predicted class. RESULTS: We propose CAMUR, a new method that extracts multiple and equivalent classification models. CAMUR iteratively computes a rule-based classification model, calculates the power set of the genes present in the rules, iteratively eliminates those combinations from the data set, and repeats the classification procedure until a stopping criterion is met. CAMUR includes an ad-hoc knowledge repository (database) and a querying tool. We analyze three different types of RNA-seq data sets (Breast, Head and Neck, and Stomach Cancer) from The Cancer Genome Atlas (TCGA) and we also validate CAMUR and its models on non-TCGA data. Our experimental results show the efficacy of CAMUR: we obtain several reliable equivalent classification models, from which the most frequent genes, their relationships, and the relation with a particular cancer are deduced. AVAILABILITY AND IMPLEMENTATION: dmb.iasi.cnr.it/camur.php CONTACT: emanuel@iasi.cnr.it SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
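The iterative procedure described in the abstract above (learn a rule-based model, take the power set of the genes it uses, hide each combination, learn again) can be sketched as follows. This is only an illustration under assumptions: `learn` is a hypothetical callback standing in for a real rule learner, the accuracy threshold and model cap are assumed stopping criteria, and `toy_learn` with its `SCORES` table is invented purely to make the example runnable.

```python
from itertools import combinations

def power_set(genes):
    """Return every non-empty subset of `genes`, smallest subsets first."""
    ordered = sorted(genes)
    return [frozenset(c) for r in range(1, len(ordered) + 1)
            for c in combinations(ordered, r)]

def extract_models(all_genes, learn, min_accuracy=0.8, max_models=100):
    """Collect several alternative models by hiding genes and re-learning.

    `learn(available_genes)` is a hypothetical callback returning
    (genes_used_by_the_rules, accuracy), or None if no model can be built.
    """
    models = []
    pending = [frozenset()]          # gene sets to hide from the learner
    seen = {frozenset()}
    while pending and len(models) < max_models:
        hidden = pending.pop(0)
        result = learn(frozenset(all_genes) - hidden)
        if result is None:
            continue
        used, accuracy = result
        if accuracy < min_accuracy:  # assumed stopping test for this branch
            continue
        models.append((frozenset(used), accuracy))
        # Hide each combination of the genes just used and try again.
        for subset in power_set(used):
            candidate = hidden | subset
            if candidate not in seen:
                seen.add(candidate)
                pending.append(candidate)
    return models

# Toy learner (hypothetical): the "rule" uses the two best-scoring genes left.
SCORES = {"g1": 0.95, "g2": 0.90, "g3": 0.85, "g4": 0.50}

def toy_learn(available):
    if not available:
        return None
    best = sorted(available, key=lambda g: -SCORES[g])[:2]
    return set(best), sum(SCORES[g] for g in best) / len(best)

models = extract_models(SCORES, toy_learn)
```

With the toy learner, three models pass the threshold, each built on a different pair of the three informative genes. The power set grows exponentially with the number of genes per model, so a practical implementation would need to bound it.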


Subjects
Neoplasms , Algorithms , High-Throughput Nucleotide Sequencing , Humans , RNA , Sequence Analysis, RNA
5.
BMC Neurosci ; 16: 28, 2015 Apr 29.
Article in English | MEDLINE | ID: mdl-25925689

ABSTRACT

BACKGROUND: Many approaches exist to integrate protein-protein interaction data with other sources of information, most notably with gene co-expression data, to obtain information on network dynamics. It is of interest to look at groups of interacting gene products that form a protein complex. We were interested in applying new tools to the characterization of pathogenesis and dynamic events of an Alzheimer's-like neurodegenerative model, the AD11 mice, expressing an anti-NGF monoclonal antibody. The goal was to quantify the impact of neurodegeneration on protein complexes, by measuring the correlation between gene expression data with different metrics. RESULTS: Data were extracted from the gene expression profile of AD11 brain, obtained by Agilent microarray, at 1, 3, 6, and 15 months of age. For genes coding proteins in complexes, the correlation matrix of pairwise expression was computed. The dynamics between correlation matrices at different time points were evaluated with two measures: a paired t-test between average correlation levels and a normalized Euclidean distance with z-score. We unveiled a differential wiring of interactions in a set of complexes, whose network structure discriminates between transgenic and control mice. Furthermore, we analyzed the dynamics of gene expression values by looking at changes in gene-to-gene correlation over time, and identified those complexes that exhibit a different time-dependent behaviour between transgenic and control mice. The most significant changes in correlation dynamics are concentrated in the early stage of disease, with higher correlation in AD11 mice compared to controls. Many complexes go through dynamic changes over time, showing the role of the dysfunctional immunoproteasome as an early neurodegenerative disease event.
Furthermore, this analysis shows key events in the neurodegeneration process of the AD11 model, by identifying significant differences in co-expression values of other complexes, such as the parvulin complex, with a role in protein misfolding and proteostasis, and of complexes involved in transcriptional mechanisms. CONCLUSIONS: We have proposed a novel approach to analyze the network structure of protein complexes, using two different measures to evaluate the dynamics of gene-gene correlation matrices from gene expression profiles. The methodology was able to investigate the re-organization of interactions within protein complexes in the AD11 model of neurodegeneration.
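The comparison of correlation matrices described in this abstract can be illustrated with a small sketch. It assumes Pearson correlation as the pairwise measure and normalizes the Euclidean distance by the number of gene pairs; both are assumptions, since the abstract does not give the exact formulas, and the z-score step and the paired t-test are omitted. All function names and the two-gene data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant profile as uncorrelated
    return cov / sqrt(var_x * var_y)

def correlation_matrix(expression):
    """Map each unordered gene pair of a complex to its correlation.

    `expression` maps a gene name to its expression values across samples.
    """
    genes = sorted(expression)
    return {(g, h): pearson(expression[g], expression[h])
            for i, g in enumerate(genes) for h in genes[i + 1:]}

def matrix_distance(corr_a, corr_b):
    """Euclidean distance over shared gene pairs, scaled by the pair count."""
    pairs = corr_a.keys() & corr_b.keys()
    if not pairs:
        return 0.0
    return sqrt(sum((corr_a[p] - corr_b[p]) ** 2 for p in pairs) / len(pairs))

# Made-up two-gene complex at two time points.
early = {"A": [1, 2, 3], "B": [2, 4, 6]}   # perfectly co-expressed
late = {"A": [1, 2, 3], "B": [3, 2, 1]}    # perfectly anti-correlated
distance = matrix_distance(correlation_matrix(early), correlation_matrix(late))
```

A large distance between time points flags a complex whose internal co-expression pattern has been rewired; here the pair flips from correlation 1 to -1, which is the largest possible change for a single pair.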


Subjects
Alzheimer Disease/metabolism , Brain/metabolism , Aging/metabolism , Animals , Databases, Protein , Disease Models, Animal , Female , Gene Expression , Gene Expression Profiling/methods , Mice, Transgenic , Microarray Analysis , Time Factors
6.
Virol J ; 9: 58, 2012 Mar 02.
Article in English | MEDLINE | ID: mdl-22385517

ABSTRACT

BACKGROUND: Differences in genomic sequences are crucial for the classification of viruses into different species. In this work, viral DNA sequences belonging to the human polyomaviruses BKPyV, JCPyV, KIPyV, WUPyV, and MCPyV are analyzed using a logic data mining method in order to identify the nucleotides that distinguish the five human polyomaviruses. RESULTS: The approach presented in this work is successful, as it discovers several logic rules that effectively characterize the five polyomaviruses studied. The identified logic rules are able to separate each viral type precisely from the others and to assign an unknown DNA sequence to one of the five analyzed polyomaviruses. CONCLUSIONS: The data mining analysis is performed by considering the complete sequences of the viruses and the sequences of the different gene regions separately, obtaining in both cases extremely high correct recognition rates.


Subjects
Computational Biology/methods , DNA, Viral/chemistry , Data Mining , Polyomavirus/classification , Polyomavirus/genetics , Base Sequence , Humans
7.
BMC Bioinformatics ; 11: 488, 2010 Sep 29.
Article in English | MEDLINE | ID: mdl-20920230

ABSTRACT

BACKGROUND: A relevant problem in drug design is the comparison and recognition of protein binding sites. Binding site recognition is generally based on geometry, often combined with physico-chemical properties of the site, since the conformation, size and chemical composition of the protein surface are all relevant for the interaction with a specific ligand. Several matching strategies have been designed for the recognition of protein-ligand binding sites and of protein-protein interfaces, but the problem cannot be considered solved. RESULTS: In this paper we propose a new method for local structural alignment of protein surfaces based on continuous global optimization techniques. Given the three-dimensional structures of two proteins, the method finds the isometric transformation (rotation plus translation) that best superimposes active regions of the two structures. We draw our inspiration from the well-known Iterative Closest Point (ICP) method for three-dimensional (3D) shape registration. Our main contribution is the adoption of a controlled random search as a more efficient global optimization approach, along with a new dissimilarity measure. The reported computational experience and comparison show the viability of the proposed approach. CONCLUSIONS: Our method performs well in detecting similarity in binding sites when it in fact exists. In the future we plan to do a more comprehensive evaluation of the method by considering large datasets of non-redundant proteins and applying a clustering technique to the results of all comparisons to classify binding sites.


Subjects
Algorithms , Computational Biology/methods , Proteins/chemistry , Binding Sites , Databases, Protein , Drug Design , Protein Conformation , Proteins/metabolism , Surface Properties
8.
BMC Bioinformatics ; 10 Suppl 14: S7, 2009 Nov 10.
Article in English | MEDLINE | ID: mdl-19900303

ABSTRACT

BACKGROUND: According to many field experts, specimen classification based on morphological keys needs to be supported with automated techniques based on the analysis of DNA fragments. The most successful results in this area are those obtained from a particular fragment of mitochondrial DNA, the gene cytochrome c oxidase I (COI) (the "barcode"). Since 2004 the Consortium for the Barcode of Life (CBOL) has promoted the collection of barcode specimens and the development of methods to analyze the barcode for several tasks, among which is the identification of rules to correctly classify an individual into its species by reading its barcode. RESULTS: We adopt a Logic Mining method based on two optimization models and present the results obtained on two datasets where a number of COI fragments are used to describe the individuals that belong to different species. The proposed method exhibits high correct recognition rates on a training-testing split of the available data using a small proportion of the information available (e.g., correct recognition of approx. 97% when only 20 of the 648 available sites are used). The method is able to provide compact formulas on the values (A, C, G, T) at the selected sites that summarize the characteristics of each species, which is relevant information for taxonomists. CONCLUSION: We have presented a Logic Mining technique designed to analyze barcode data and to provide detailed output of interest to taxonomists and the barcode community represented in the CBOL Consortium. The method has proven to be effective, efficient and precise.


Subjects
Classification/methods , Computational Biology/methods , Electronic Data Processing , Sequence Analysis, DNA/methods , Animals , Humans
9.
Oncotarget ; 8(61): 103340-103363, 2017 Nov 28.
Article in English | MEDLINE | ID: mdl-29262566

ABSTRACT

Increasing evidence points to a key role played by epithelial-mesenchymal transition (EMT) in cancer progression and drug resistance. In this study, we used wet-lab and in silico approaches to investigate whether EMT phenotypes are associated with resistance to targeted therapy in a non-small cell lung cancer model system harboring activating mutations of the epidermal growth factor receptor. The combination of different analysis techniques allowed us to describe intermediate/hybrid and complete EMT phenotypes in HCC827- and HCC4006-derived drug-resistant human cancer cell lines, respectively. Interestingly, intermediate/hybrid EMT phenotypes, collective cell migration and increased stem-like ability are associated with resistance to the epidermal growth factor receptor inhibitor erlotinib in HCC827-derived cell lines. Moreover, the use of three complementary approaches for gene expression analysis supported the identification of a small EMT-related gene list, which may otherwise have been overlooked by standard stand-alone methods for gene expression analysis.

10.
BMC Syst Biol ; 10: 25, 2016 Mar 02.
Article in English | MEDLINE | ID: mdl-26935435

ABSTRACT

BACKGROUND: Recent advances in the analysis of large datasets offer new insights to modern biology, allowing system-level investigation of pathologies. Here we describe a novel computational method that exploits the ever-growing amount of "omics" data to shed light on Alzheimer's and Parkinson's diseases. Neurological disorders exhibit a huge number of molecular alterations due to a complex interplay between genetic and environmental factors. Classical reductionist approaches are focused on a few elements, providing a narrow overview of the etiopathogenic complexity of multifactorial diseases. On the other hand, high-throughput technologies allow the evaluation of many components of biological systems and their behaviors. Analysis of Parkinson's Disease (PD) and Alzheimer's Disease (AD) from a network perspective can highlight proteins or pathways that are common to the two pathological conditions but differently represented, and that can therefore discriminate between them, thus highlighting similarities and differences. RESULTS: In this work we propose a strategy that exploits network community structure identified with a state-of-the-art network community discovery algorithm called InfoMap, which takes advantage of information theory principles. We used two similarity measurements to quantify functional and topological similarities between the two pathologies. We built a Similarity Matrix to highlight similar communities, and we analyzed statistically significant GO terms found in clustered areas of the matrix and in network communities. Our strategy allowed us to identify common known and unknown processes, including DNA repair, RNA metabolism and glucose metabolism, that were not detected with simple GO enrichment analysis. In particular, we were able to capture the connection between mitochondrial dysfunction and metabolism (glucose and glutamate/glutamine). CONCLUSIONS: This approach allows the identification of communities present in both pathologies, which highlight common biological processes.
Conversely, the identification of communities without any counterpart can be used to investigate processes that are characteristic of only one of the two pathologies. In general, the same strategy can be applied to compare any pair of biological networks.


Subjects
Alzheimer Disease/metabolism , Alzheimer Disease/pathology , Computer Graphics , Parkinson Disease/metabolism , Parkinson Disease/pathology , Systems Biology/methods , Alzheimer Disease/genetics , DNA Repair , Glucose/metabolism , Humans , Mitochondria/metabolism , Parkinson Disease/genetics , RNA/metabolism , Signal Transduction
11.
BioData Min ; 9: 38, 2016.
Article in English | MEDLINE | ID: mdl-27980679

ABSTRACT

BACKGROUND: Continuous improvements in next generation sequencing technologies have led to ever-increasing collections of genomic sequences, which have not been easily characterized by biologists, and whose analysis requires huge computational effort. The classification of species has emerged as one of the main applications of DNA analysis and has been addressed with several approaches, e.g., multiple alignment-, phylogenetic tree-, statistical- and character-based methods. RESULTS: We propose a supervised method based on a genetic algorithm to identify small genomic subsequences that discriminate among different species. The method identifies multiple subsequences of bounded length with the same information power in a given genomic region. The algorithm has been successfully evaluated through its integration into a rule-based classification framework and applied to three different biological data sets: Influenza, Polyoma, and Rhino virus sequences. CONCLUSIONS: We discover a large number of small subsequences that can be used to identify each virus type with high accuracy and low computational time, and that moreover help to characterize different genomic regions. Bounding their length to 20, our method found 1164 characterizing subsequences for all the Influenza virus subtypes, 194 for all the Polyoma viruses, and 11 for Rhino viruses. The abundance of small separating subsequences extracted for each genomic region may be an important support for quick and robust virus identification. Finally, useful biological information can be derived from the relative location and abundance of such subsequences along the different regions.

12.
BMC Res Notes ; 7: 869, 2014 Dec 03.
Article in English | MEDLINE | ID: mdl-25465386

ABSTRACT

BACKGROUND: Next Generation Sequencing (NGS) machines extract a large number of short DNA fragments (reads) from a biological sample. These reads are then used for several applications, e.g., sequence reconstruction, DNA assembly, gene expression profiling, and mutation analysis. METHODS: We propose a method to evaluate the similarity between reads. This method does not rely on the alignment of the reads; it is based on the distance between the frequencies of their substrings of fixed length (k-mers). We compare this alignment-free distance with the similarity measures derived from two alignment methods: Needleman-Wunsch and Blast. The comparison is based on a simple assumption: the most accurate distance is obtained when the reference sequence is known in advance. Therefore, we first align the reads on the original DNA sequence, compute the overlap between the aligned reads, and use this overlap as an ideal distance. We then verify how well the alignment-free and the alignment-based distances reproduce this ideal distance. The ability to correctly reproduce the ideal distance is evaluated over samples of read pairs from Saccharomyces cerevisiae, Escherichia coli, and Homo sapiens. The comparison is based on the correctness of threshold predictors cross-validated over different samples. RESULTS: We present experimental evidence that the proposed alignment-free distance is a potentially useful read-to-read distance measure and performs better than the more time-consuming distances based on alignment. CONCLUSIONS: Alignment-free distances may be used effectively for read comparison, and may provide a significant speed-up in several processes based on NGS sequencing (e.g., DNA assembly, read classification).
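A minimal sketch of an alignment-free read distance of the kind this abstract describes is given below. It compares the relative k-mer frequencies of two reads with a Euclidean distance; the abstract does not state which distance between frequency profiles is used, so the Euclidean choice, the default `k = 3`, and the function names are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

def kmer_frequencies(read, k):
    """Relative frequency of every length-k substring (k-mer) of a read."""
    total = len(read) - k + 1
    if total <= 0:
        return {}
    counts = Counter(read[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}

def kmer_distance(read_a, read_b, k=3):
    """Euclidean distance between the k-mer frequency profiles of two reads.

    The Euclidean metric is an assumed choice, not taken from the paper.
    """
    freq_a = kmer_frequencies(read_a, k)
    freq_b = kmer_frequencies(read_b, k)
    kmers = set(freq_a) | set(freq_b)
    return sqrt(sum((freq_a.get(m, 0.0) - freq_b.get(m, 0.0)) ** 2
                    for m in kmers))

same = kmer_distance("ACGTACGT", "ACGTACGT")
disjoint = kmer_distance("AAAA", "CCCC", k=2)
```

Because each read is scanned once, the cost is linear in the read length, which is the source of the speed-up over alignment-based measures that the abstract reports.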


Subjects
Algorithms , DNA, Bacterial/genetics , DNA, Fungal/genetics , Sequence Alignment/statistics & numerical data , Sequence Analysis, DNA/statistics & numerical data , DNA, Bacterial/chemistry , DNA, Fungal/chemistry , Escherichia coli/genetics , High-Throughput Nucleotide Sequencing , Humans , Saccharomyces cerevisiae/genetics , Sequence Alignment/methods , Sequence Analysis, DNA/methods
13.
OMICS ; 18(2): 155-65, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24404838

ABSTRACT

Experimental co-expression data and protein-protein interaction networks are frequently used to analyze the interactions among genes or proteins. Recent studies have investigated methods to integrate these two sources of information. We propose a new method to integrate co-expression data obtained through DNA microarray analysis (MA) and protein-protein interaction (PPI) network data, and apply it to Arabidopsis thaliana. The proposed method identifies small subsets of highly interacting proteins. Based on the analysis of co-localization and mRNA developmental expression, we show that these groups provide important biological insights; additionally, these subsets are significantly enriched with respect to KEGG Pathways and can be used to predict successfully whether proteins belong to known pathways. Thus, the method is able to provide relevant biological information and support the functional identification of complex genetic traits of economic value in plant agrigenomics research. The method has been implemented in a prototype software tool named CLAIM (CLuster Analysis Integration Method), which can be downloaded from http://bio.cs.put.poznan.pl/research_fields . CLAIM is based on the separate clustering of MA and PPI data; the clusters are merged in a special graph; cliques of this graph are subsets of strongly connected proteins. The proposed method compared favorably with existing methods. CLAIM appears to be a useful semi-automated tool for protein functional analysis and warrants further evaluation in agrigenomics research.


Subjects
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Gene Expression Regulation, Plant , Gene Regulatory Networks , Genome, Plant , Software , Algorithms , Arabidopsis/metabolism , Arabidopsis Proteins/metabolism , Molecular Sequence Annotation , Multigene Family , Pattern Recognition, Automated , Protein Interaction Mapping/methods
14.
Biotechnol Adv ; 31(2): 274-86, 2013.
Article in English | MEDLINE | ID: mdl-23228981

ABSTRACT

A number of interesting issues have been addressed on biological networks about their global and local properties. The connection between the topological properties of proteins in Protein-Protein Interaction (PPI) networks and their biological relevance has been investigated focusing on hubs, i.e. proteins with a large number of interacting partners. We will survey the literature trying to answer the following questions: Do hub proteins have special biological properties? Do they tend to be more essential than non-hub proteins? Are they more evolutionarily conserved? Do they play a central role in modular organization of the protein interaction network? Are there structural properties that characterize hub proteins?


Subjects
Protein Interaction Maps , Proteins/chemistry , Proteins/metabolism , Databases, Protein , Evolution, Molecular , Protein Interaction Mapping , Protein Structure, Tertiary
15.
Mol Ecol Resour ; 13(6): 1043-6, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23350601

ABSTRACT

BLOG (Barcoding with LOGic) is a diagnostic and character-based DNA Barcode analysis method. Its aim is to classify specimens to species based on DNA Barcode sequences and on a supervised machine learning approach, using classification rules that compactly characterize species in terms of DNA Barcode locations of key diagnostic nucleotides. The BLOG 2.0 software, its fundamental modules, online/offline user interfaces and recent improvements are described. These improvements affect both methodology and software design, and lead to the availability of different releases on the website http://dmb.iasi.cnr.it/blog-downloads.php. Previous and new experimental tests show that BLOG 2.0 outperforms previous versions as well as other DNA Barcode analysis methods.


Subjects
DNA Barcoding, Taxonomic , Software , Classification/methods , Species Specificity , User-Computer Interface
16.
Biotechnol Adv ; 30(1): 185-201, 2012.
Article in English | MEDLINE | ID: mdl-21964263

ABSTRACT

The FAR1 gene encodes an 830-residue bifunctional protein, whose major function is inhibition of cyclin-dependent kinase complexes involved in the G1/S transition. FAR1 transcription is maximal between mitosis and early G1 phase. Enhanced FAR1 transcription is necessary but not sufficient for the pheromone-induced G1 arrest, since FAR1 overexpression itself does not trigger cell cycle arrest. Besides its well-established role in the response to pheromone, recent evidence suggests that Far1 may also regulate mitotic cell cycle progression: in particular, it has been proposed that Far1, together with the G1 cyclin Cln3, may be part of a cell sizer mechanism that controls entry into S phase. Far1 is an unstable protein throughout the cell cycle except during G1 phase. Far1 levels peak in newborn cells as a consequence of a burst of synthetic activity at the end of the previous cycle, and the amounts per cell remain roughly constant during the G1 phase. Phosphorylation (at serine 87) by Cdk1-Cln complexes primes Far1 for ubiquitin-mediated proteolysis. By coupling a genome-wide transcriptional analysis of FAR1-overexpressing and far1Δ cells grown in ethanol- or glucose-supplemented minimal media with a range of phenotypic analyses, we show that FAR1 overexpression not only coordinately increases RNA and protein accumulation, but also induces strong transcriptional remodeling, with metabolism being the most affected cellular property. This suggests that the Far1/Cln3 sizer regulates cell growth either directly or indirectly by affecting metabolism and pathways known to modulate ribosome biogenesis. A crucial role in mediating the effect of Far1 overexpression is played by the Sfp1 protein, a key transcriptional regulator of ribosome biogenesis, whose presence is mandatory to allow a coordinated increase in both RNA and protein levels in ethanol-grown cells.


Subjects
Cyclin-Dependent Kinase Inhibitor Proteins/metabolism , DNA-Binding Proteins/metabolism , Gene Expression Regulation, Fungal , RNA, Fungal/biosynthesis , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/genetics , Cell Cycle/genetics , Cell Size , Cluster Analysis , Computational Biology , Cyclin-Dependent Kinase Inhibitor Proteins/genetics , DNA-Binding Proteins/genetics , Ethanol/metabolism , Gene Expression Profiling , Gene Regulatory Networks/genetics , Glucose/metabolism , Phenotype , Protein Serine-Threonine Kinases/genetics , Protein Serine-Threonine Kinases/metabolism , RNA, Fungal/genetics , RNA, Fungal/metabolism , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics , Signal Transduction , Transcription, Genetic , Up-Regulation
17.
Front Physiol ; 3: 362, 2012.
Article in English | MEDLINE | ID: mdl-22988443

ABSTRACT

Systems Biology holds that complex cellular functions are generated as system-level properties endowed with robustness, each involving large networks of molecular determinants, generally identified by "omics" analyses. In this paper we describe four basic cancer cell properties that can easily be investigated in vitro: enhanced proliferation, evasion of apoptosis, genomic instability, and inability to undergo oncogene-induced senescence. Focusing our analysis on a K-ras-dependent transformation system, we show that enhanced proliferation and evasion of apoptosis are closely linked, and we present findings that indicate how a large metabolic remodeling sustains the enhanced growth ability. Network analysis of transcriptional profiling gives the first indication of this remodeling, which is further supported by biochemical investigations and metabolic flux analysis (MFA). The hallmarks of enhanced proliferation are enhanced glycolysis, down-regulation of the TCA cycle, and decoupling of glucose and glutamine utilization, with increased reductive carboxylation of glutamine, so as to yield a sustained production of growth building blocks and glutathione. Low glucose availability specifically induces cell death in K-ras-transformed cells, while PKA activation reverts this effect, possibly through at least two mitochondrial targets. Finally, the central role of mitochondria in determining the two investigated cancer cell properties is discussed. Taken together, the findings reported herein indicate that a system-level property is sustained by a cascade of interconnected biochemical pathways that behave differently in normal and in transformed cells.

18.
J Alzheimers Dis ; 24(4): 721-38, 2011.
Article in English | MEDLINE | ID: mdl-21321390

ABSTRACT

The identification of early and stage-specific biomarkers for Alzheimer's disease (AD) is critical, as the development of disease-modifying therapies may depend on the discovery and validation of such markers. The identification of early, reliable biomarkers depends on the development of new diagnostic algorithms to computationally exploit the information in large biological datasets. To identify potential biomarkers from mRNA expression profile data, we used the Logic Mining method for the unbiased analysis of a large microarray expression dataset from the anti-NGF AD11 transgenic mouse model. The gene expression profile of AD11 brain regions was investigated at different neurodegeneration stages by whole-genome microarrays. A new implementation of the Logic Mining method was applied to both early-stage (1-3 months) and late-stage (6-15 months) expression data, coupled with standard statistical methods. A small number of "fingerprinting" formulas were isolated, encompassing mRNAs whose expression levels were able to discriminate between diseased and control mice. We selected three differential "signature" genes specific for the early stage (Nudt19, Arl16, Aph1b), five common to both groups (Slc15a2, Agpat5, Sox2ot, 2210015D19Rik, Wdfy1), and seven specific for the late stage (D14Ertd449, Tia1, Txnl4, 1810014B01Rik, Snhg3, Actl6a, Rnf25). We suggest these genes as potential biomarkers for the early and late stages of AD-like neurodegeneration in this model and conclude that Logic Mining is a powerful and reliable approach for large-scale expression data analysis. Its application to large expression datasets from brain or peripheral human samples may facilitate the discovery of early and stage-specific AD biomarkers.


Subjects
Alzheimer Disease/genetics; Brain Chemistry/genetics; Data Mining/methods; Disease Models, Animal; Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/methods; Alzheimer Disease/metabolism; Alzheimer Disease/pathology; Animals; Female; Genetic Markers/genetics; Mice; Mice, Inbred C57BL; Mice, Transgenic
19.
Article in English | MEDLINE | ID: mdl-20671321

ABSTRACT

Haplotype data play an important role in several kinds of genetic studies, e.g., mapping of complex disease genes, drug design, and evolutionary studies on populations. However, the experimental determination of haplotypes is expensive and time-consuming. This motivates the increasing interest in techniques for inferring haplotype data from genotypes, which can instead be obtained quickly and economically. Several such techniques are based on the maximum parsimony principle, which has been justified by both experimental results and theoretical arguments. However, the problem of haplotype inference by parsimony was shown to be NP-hard, which limits the applicability of exact parsimony-based techniques to relatively small data sets. In this paper, we introduce the collapse rule, a generalization of the well-known Clark's rule, and describe a new heuristic algorithm for haplotype inference (implemented in a program called CollHaps), based on parsimony and the iterative application of collapse rules. The performance of CollHaps is tested on several data sets. The experiments show that CollHaps enables the user to process large data sets and obtain very "parsimonious" solutions in short processing times. They also show a correlation, especially for large data sets, between parsimony and correct reconstruction, supporting the validity of the parsimony principle for producing accurate solutions.
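
As background for the abstract above, the following is a minimal sketch of the classic Clark's rule that the collapse rule is said to generalize. It is not the CollHaps algorithm or the collapse rule itself, whose details the abstract does not give. The genotype encoding (0 and 1 for homozygous sites, 2 for heterozygous sites) and all function names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Haplotype = Tuple[int, ...]  # each site is 0 or 1
Genotype = Tuple[int, ...]   # assumed encoding: 0/1 homozygous, 2 heterozygous


def is_compatible(h: Haplotype, g: Genotype) -> bool:
    """A haplotype can explain a genotype if it agrees at every homozygous site."""
    return len(h) == len(g) and all(gs == 2 or gs == hs for hs, gs in zip(h, g))


def complement(h: Haplotype, g: Genotype) -> Haplotype:
    """Return the haplotype that, paired with h, yields genotype g."""
    return tuple(1 - hs if gs == 2 else gs for hs, gs in zip(h, g))


def clark_inference(genotypes: List[Genotype]):
    """Resolve genotypes by repeatedly applying Clark's rule (illustrative only)."""
    known = set()
    resolved: Dict[int, Tuple[Haplotype, Haplotype]] = {}
    pending: List[int] = []

    # Seed step: genotypes with at most one heterozygous site are unambiguous.
    for idx, g in enumerate(genotypes):
        if sum(1 for s in g if s == 2) <= 1:
            h1 = tuple(0 if s == 2 else s for s in g)
            h2 = complement(h1, g)
            known.update((h1, h2))
            resolved[idx] = (h1, h2)
        else:
            pending.append(idx)

    # Clark's rule: if a known haplotype is compatible with an unresolved
    # genotype, infer its complement and add it to the known set.
    progress = True
    while pending and progress:
        progress = False
        for idx in list(pending):
            g = genotypes[idx]
            match = next((h for h in sorted(known) if is_compatible(h, g)), None)
            if match is not None:
                other = complement(match, g)
                known.add(other)
                resolved[idx] = (match, other)
                pending.remove(idx)
                progress = True

    # Genotypes left in `pending` could not be resolved by this rule.
    return resolved, pending
```

The result of Clark's rule depends on the order in which genotypes and known haplotypes are tried; this sketch tries known haplotypes in sorted order purely to make the outcome deterministic. Genotypes that no known haplotype can explain remain unresolved, which is one reason heuristics built on more general rules are of interest.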


Subjects
Algorithms; Computational Biology/methods; Haplotypes/genetics; Software; Genotype; Humans; Polymorphism, Single Nucleotide/genetics; Sequence Analysis, DNA/methods