1.
BMC Biol ; 19(1): 73, 2021 04 13.
Article in English | MEDLINE | ID: mdl-33849527

ABSTRACT

BACKGROUND: Dinoflagellates in the family Symbiodiniaceae are important photosynthetic symbionts in cnidarians (such as corals) and other coral reef organisms. Breakdown of the coral-dinoflagellate symbiosis due to environmental stress (i.e. coral bleaching) can lead to coral death and the potential collapse of reef ecosystems. However, evolution of Symbiodiniaceae genomes, and its implications for the coral, is little understood. Genome sequences of Symbiodiniaceae remain scarce due in part to their large genome sizes (1-5 Gbp) and idiosyncratic genome features. RESULTS: Here, we present de novo genome assemblies of seven members of the genus Symbiodinium, of which two are free-living, one is an opportunistic symbiont, and the remainder are mutualistic symbionts. Integrating other available data, we compare 15 dinoflagellate genomes revealing high sequence and structural divergence. Divergence among some Symbiodinium isolates is comparable to that among distinct genera of Symbiodiniaceae. We also recovered hundreds of gene families specific to each lineage, many of which encode unknown functions. An in-depth comparison between the genomes of the symbiotic Symbiodinium tridacnidorum (isolated from a coral) and the free-living Symbiodinium natans reveals a greater prevalence of transposable elements, genetic duplication, structural rearrangements, and pseudogenisation in the symbiotic species. CONCLUSIONS: Our results underscore the potential impact of lifestyle on lineage-specific gene-function innovation, genome divergence, and the diversification of Symbiodinium and Symbiodiniaceae. The divergent features we report, and their putative causes, may also apply to other microbial eukaryotes that have undergone symbiotic phases in their evolutionary history.


Subjects
Anthozoa, Dinoflagellida, Animals, Anthozoa/genetics, Coral Reefs, Dinoflagellida/genetics, Ecosystem, Genetic Variation, Genome/genetics
2.
Brief Bioinform ; 20(2): 426-435, 2019 03 22.
Article in English | MEDLINE | ID: mdl-28673025

ABSTRACT

We are amidst an ongoing flood of sequence data arising from the application of high-throughput technologies, and a concomitant fundamental revision in our understanding of how genomes evolve individually and within the biosphere. Workflows for phylogenomic inference must accommodate data that are not only much larger than before, but often more error prone and perhaps misassembled, or not assembled in the first place. Moreover, genomes of microbes, viruses and plasmids evolve not only by tree-like descent with modification but also by incorporating stretches of exogenous DNA. Thus, next-generation phylogenomics must address computational scalability while rethinking the nature of orthogroups, the alignment of multiple sequences and the inference and comparison of trees. New phylogenomic workflows have begun to take shape based on so-called alignment-free (AF) approaches. Here, we review the conceptual foundations of AF phylogenetics for the hierarchical (vertical) and reticulate (lateral) components of genome evolution, focusing on methods based on k-mers. We reflect on what seems to be successful, and on where further development is needed.


Subjects
Molecular Evolution, Genome, Phylogeny, Algorithms, Animals, Humans, Microbiota/genetics, Genetic Models, Sequence Alignment, DNA Sequence Analysis, Viruses/genetics
3.
PLoS Pathog ; 15(1): e1007513, 2019 01.
Article in English | MEDLINE | ID: mdl-30673782

ABSTRACT

Mesenteric infection by the parasitic blood fluke Schistosoma bovis is a common veterinary problem in Africa and the Middle East and occasionally in the Mediterranean Region. The species also has the ability to form interspecific hybrids with the human parasite S. haematobium with natural hybridisation observed in West Africa, presenting possible zoonotic transmission. Additionally, this exchange of alleles between species may dramatically influence disease dynamics and parasite evolution. We have generated a 374 Mb assembly of the S. bovis genome using Illumina and PacBio-based technologies. Despite infecting different hosts and organs, the genome sequences of S. bovis and S. haematobium appeared strikingly similar with 97% sequence identity. The two species share 98% of protein-coding genes, with an average sequence identity of 97.3% at the amino acid level. Genome comparison identified large continuous parts of the genome (up to several 100 kb) showing almost 100% sequence identity between S. bovis and S. haematobium. It is unlikely that this is a result of genome conservation and provides further evidence of natural interspecific hybridization between S. bovis and S. haematobium. Our results suggest that foreign DNA obtained by interspecific hybridization was maintained in the population through multiple meiosis cycles and that hybrids were sexually reproductive, producing viable offspring. The S. bovis genome assembly forms a highly valuable resource for studying schistosome evolution and exploring genetic regions that are associated with species-specific phenotypic traits.


Subjects
Genetic Hybridization/genetics, Schistosoma/genetics, Africa, Western Africa, Animals, Base Sequence/genetics, Cattle, Chromosome Mapping/methods, DNA/genetics, Genome/genetics, Mitochondrial Genome/genetics, Genetic Hybridization/physiology, Middle East, Phylogeny, Proteome/genetics, Species Specificity, Trematoda/genetics, Whole Genome Sequencing/methods
4.
BMC Biol ; 18(1): 56, 2020 05 24.
Article in English | MEDLINE | ID: mdl-32448240

ABSTRACT

BACKGROUND: Dinoflagellates are taxonomically diverse and ecologically important phytoplankton that are ubiquitously present in marine and freshwater environments. Mostly photosynthetic, dinoflagellates provide the basis of aquatic primary production; most taxa are free-living, while some can form symbiotic and parasitic associations with other organisms. However, knowledge of the molecular mechanisms that underpin the adaptation of these organisms to diverse ecological niches is limited by the scarce availability of genomic data, partly due to their large genome sizes estimated up to 250 Gbp. Currently available dinoflagellate genome data are restricted to Symbiodiniaceae (particularly symbionts of reef-building corals) and parasitic lineages, from taxa that have smaller genome size ranges, while genomic information from more diverse free-living species is still lacking. RESULTS: Here, we present two draft diploid genome assemblies of the free-living dinoflagellate Polarella glacialis, isolated from the Arctic and Antarctica. We found that about 68% of the genomes are composed of repetitive sequence, with long terminal repeats likely contributing to intra-species structural divergence and distinct genome sizes (3.0 and 2.7 Gbp). For each genome, guided using full-length transcriptome data, we predicted > 50,000 high-quality protein-coding genes, of which ~40% are in unidirectional gene clusters and ~25% comprise single exons. Multi-genome comparison unveiled genes specific to P. glacialis and a common, putatively bacterial origin of ice-binding domains in cold-adapted dinoflagellates. CONCLUSIONS: Our results elucidate how selection acts within the context of a complex genome structure to facilitate local adaptation. Because most dinoflagellate genes are constitutively expressed, Polarella glacialis has enhanced transcriptional responses via unidirectional, tandem duplication of single-exon genes that encode functions critical to survival in cold, low-light polar environments. These genomes provide a foundational reference for future research on dinoflagellate evolution.


Subjects
Dinoflagellida/genetics, Exons, Protozoan Genome, Tandem Repeat Sequences, Transcriptome, Biological Adaptation, Protozoan Genes
5.
Brief Bioinform ; 16(3): 461-74, 2015 May.
Article in English | MEDLINE | ID: mdl-24950687

ABSTRACT

Breast cancer was traditionally perceived as a single disease; however, recent advances in gene expression and genomic profiling have revealed that breast cancer is in fact a collection of diseases exhibiting distinct anatomical features, responses to treatment and survival outcomes. Consequently, a number of schemes have been proposed for subtyping of breast cancer to bring out the biological and clinically relevant characteristics of the subtypes. Although some of these schemes capture underlying molecular differences, others predict variations in response to treatment and survival patterns. However, despite this diversity in the approaches, it is clear that molecular mechanisms drive clinical outcomes, and therefore an effective scheme should integrate molecular as well as clinical parameters to enable deeper understanding of cancer mechanisms and allow better decision making in the clinic. Here, using a large cohort of ∼550 breast tumours from The Cancer Genome Atlas, we systematically evaluate a number of expression-based schemes including at least eight molecular pathways implicated in breast cancer and three prognostic signatures, across a variety of classification scenarios covering molecular characteristics, biomarker status, tumour stages and survival patterns. We observe that a careful combination of these schemes yields better classification results compared with using them individually, thus confirming that molecular mechanisms and clinical outcomes are related and that an effective scheme should therefore integrate both these parameters to enable a deeper understanding of the cancer.


Subjects
Tumor Biomarkers/metabolism, Breast Neoplasms/diagnosis, Breast Neoplasms/metabolism, Gene Expression Profiling/methods, Molecular Diagnostic Techniques/methods, Neoplasm Proteins/metabolism, Breast Neoplasms/classification, Female, Humans, Prognosis, Protein Interaction Mapping/methods, Reproducibility of Results, Risk Assessment/methods, Sensitivity and Specificity
6.
Environ Microbiol ; 18(5): 1338-51, 2016 05.
Article in English | MEDLINE | ID: mdl-26032777

ABSTRACT

Diazotrophic bacteria potentially supply substantial amounts of biologically fixed nitrogen to crops, but their occurrence may be suppressed by high nitrogen fertilizer application. Here, we explored the impact of high nitrogen fertilizer rates on the presence of diazotrophs in field-grown sugarcane with industry-standard or reduced nitrogen fertilizer application. Despite large differences in soil microbial communities between test sites, a core sugarcane root microbiome was identified. The sugarcane root-enriched core taxa overlap with those of Arabidopsis thaliana raising the possibility that certain bacterial families have had long association with plants. Reduced nitrogen fertilizer application had remarkably little effect on the core root microbiome and did not increase the relative abundance of root-associated diazotrophs or nif gene counts. Correspondingly, low nitrogen fertilizer crops had lower biomass and nitrogen content, reflecting a lack of major input of biologically fixed nitrogen, indicating that manipulating nitrogen fertilizer rates does not improve sugarcane yields by enriching diazotrophic populations under the test conditions. Standard nitrogen fertilizer crops had improved biomass and nitrogen content, and corresponding soils had higher abundances of nitrification and denitrification genes. These findings highlight that achieving a balance in maximizing crop yields and minimizing nutrient pollution associated with nitrogen fertilizer application requires understanding of how microbial communities respond to fertilizer use.


Subjects
Fertilizers, Microbiota, Nitrogen, Plant Roots/microbiology, Saccharum/microbiology, Bacteria/isolation & purification, Bacteria/metabolism, Biomass, Agricultural Crops, Nitrogen Fixation, Soil, Soil Microbiology
7.
Brief Bioinform ; 15(2): 195-211, 2014 Mar.
Article in English | MEDLINE | ID: mdl-23698722

ABSTRACT

Inference of gene regulatory networks from expression data is a challenging task. Many methods have been developed for this purpose, but a comprehensive evaluation that covers unsupervised, semi-supervised and supervised methods, and provides guidelines for their practical application, is lacking. We performed an extensive evaluation of inference methods on simulated and experimental expression data. The results reveal low prediction accuracies for unsupervised techniques, with the notable exception of the Z-SCORE method on knockout data. In all other cases, the supervised approach achieved the highest accuracies and, even in a semi-supervised setting with small numbers of only positive samples, outperformed the unsupervised techniques.


Subjects
Computational Biology/methods, Gene Regulatory Networks, Algorithms, Artificial Intelligence, Computer Simulation, Genetic Databases/statistics & numerical data, Escherichia coli/genetics, Gene Expression Profiling/statistics & numerical data, Bacterial Genes, Fungal Genes, Saccharomyces cerevisiae/genetics, Software, Support Vector Machine, Systems Biology
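
The abstract of this record singles out a z-score method on knockout data but does not give its formula. One common formulation, assumed here for illustration, scores a putative edge from gene i to gene j by how far gene j's expression in the knockout of gene i deviates from gene j's mean across knockout experiments, in units of its standard deviation. The sketch below follows that assumption; the function name `zscore_edges`, the matrix layout, and the choice to leave a gene's own knockout out of its mean and standard deviation are hypothetical and not taken from the paper.

```python
from statistics import mean, pstdev


def zscore_edges(knockout_expr):
    """Score candidate edges i -> j from a square knockout matrix.

    knockout_expr[i][j] is the expression of gene j when gene i is knocked out.
    scores[i][j] = |x_ij - mean_j| / sd_j, where mean_j and sd_j are taken over
    all knockouts except that of gene j itself (an assumption of this sketch).
    """
    n = len(knockout_expr)
    scores = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Expression of gene j in every experiment where another gene is removed.
        column = [knockout_expr[i][j] for i in range(n) if i != j]
        mu, sd = mean(column), pstdev(column)
        if sd == 0:
            continue  # gene j never changes, so no edge into j can be scored
        for i in range(n):
            if i != j:
                scores[i][j] = abs(knockout_expr[i][j] - mu) / sd
    return scores


# Toy data (invented): knocking out gene 0 lowers gene 1; nothing else reacts.
toy = [
    [0.0, 0.2, 1.0, 1.0],
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 0.0],
]
toy_scores = zscore_edges(toy)
```

Ranking the entries of `toy_scores` from highest to lowest gives a list of candidate regulatory edges; on the toy matrix the edge from gene 0 to gene 1 ranks first.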
8.
Brief Bioinform ; 15(6): 973-83, 2014 Nov.
Article in English | MEDLINE | ID: mdl-23946492

ABSTRACT

Large quantities of information describing the mechanisms of biological pathways continue to be collected in publicly available databases. At the same time, experiments have increased in scale, and biologists increasingly use pathways defined in online databases to interpret the results of experiments and generate hypotheses. Emerging computational techniques that exploit the rich biological information captured in reaction systems require formal standardized descriptions of pathways to extract these reaction networks and avoid the alternative: time-consuming and largely manual literature-based network reconstruction. Here, we systematically evaluate the effects of commonly used knowledge representations on the seemingly simple task of extracting a reaction network describing signal transduction from a pathway database. We show that this process is in fact surprisingly difficult, and the pathway representations adopted by various knowledge bases have dramatic consequences for reaction network extraction, connectivity, capture of pathway crosstalk and in the modelling of cell-cell interactions. Researchers constructing computational models built from automatically extracted reaction networks must therefore consider the issues we outline in this review to maximize the value of existing pathway knowledge.


Subjects
Factual Databases/statistics & numerical data, Biological Models, Signal Transduction, Cell Communication, Computational Biology, Factual Databases/standards, Humans, Knowledge Bases, MAP Kinase Signaling System, Systems Biology
9.
Nucleic Acids Res ; 42(10): 6106-27, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24792170

ABSTRACT

DNA-damage response machinery is crucial to maintain the genomic integrity of cells, by enabling effective repair of even highly lethal lesions such as DNA double-strand breaks (DSBs). Defects in specific genes acquired through mutations, copy-number alterations or epigenetic changes can alter the balance of these pathways, triggering cancerous potential in cells. Selective killing of cancer cells by sensitizing them to further DNA damage, especially by induction of DSBs, therefore requires careful modulation of DSB-repair pathways. Here, we review the latest knowledge on the two DSB-repair pathways, homologous recombination and non-homologous end joining in human, describing in detail the functions of their components and the key mechanisms contributing to the repair. Such an in-depth characterization of these pathways enables a more mechanistic understanding of how cells respond to therapies, and suggests molecules and processes that can be explored as potential therapeutic targets. One such avenue that has shown immense promise is via the exploitation of synthetic lethal relationships, for which the BRCA1-PARP1 relationship is particularly notable. Here, we describe how this relationship functions and the manner in which cancer cells acquire therapy resistance by restoring their DSB repair potential.


Subjects
Breast Neoplasms/therapy, Double-Stranded DNA Breaks, DNA End-Joining Repair, Recombinational DNA Repair, Breast Neoplasms/genetics, Breast Neoplasms/metabolism, Carcinogenesis/genetics, Carcinogenesis/metabolism, Female, Humans
10.
Bioinformatics ; 30(9): 1273-9, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24407221

ABSTRACT

MOTIVATION: Cancer is a heterogeneous progressive disease caused by perturbations of the underlying gene regulatory network that can be described by dynamic models. These dynamics are commonly modeled as Boolean networks or as ordinary differential equations. Their inference from data is computationally challenging, and at least partial knowledge of the regulatory network and its kinetic parameters is usually required to construct predictive models. RESULTS: Here, we construct Hopfield networks from static gene-expression data and demonstrate that cancer subtypes can be characterized by different attractors of the Hopfield network. We evaluate the clustering performance of the network and find that it is comparable with traditional methods but offers additional advantages including a dynamic model of the energy landscape and a unification of clustering, feature selection and network inference. We visualize the Hopfield attractor landscape and propose a pruning method to generate sparse networks for feature selection and improved understanding of feature relationships.


Subjects
Gene Expression Profiling/methods, Neoplastic Gene Expression Regulation, Gene Regulatory Networks, Neoplasms/genetics, Algorithms, Cluster Analysis, Humans, Kinetics, Software
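
The abstract of this record describes characterizing subtypes as attractors of a Hopfield network built from static expression data. The general mechanism can be sketched with a textbook Hopfield network: stored patterns define Hebbian weights, and a noisy input relaxes to the nearest stored pattern while the network energy decreases. This is a minimal sketch, not the authors' implementation; it assumes expression profiles have already been reduced to +1/-1 values (for example, above or below a per-gene threshold), and the function names and toy patterns are invented for illustration.

```python
def train_hopfield(patterns):
    """Hebbian weights from +1/-1 patterns; the diagonal is kept at zero."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j] / n
    return weights


def energy(weights, state):
    """Hopfield energy E = -1/2 * sum_ij w_ij * s_i * s_j."""
    n = len(state)
    return -0.5 * sum(weights[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))


def recall(weights, state, max_sweeps=100):
    """Update units one at a time, in index order, until nothing changes."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break  # fixed point reached: the state is an attractor
    return state


# Two invented "subtype" profiles over eight binarised genes.
subtype_a = [1, 1, 1, 1, -1, -1, -1, -1]
subtype_b = [1, -1, 1, -1, 1, -1, 1, -1]
w = train_hopfield([subtype_a, subtype_b])

# A sample resembling subtype A with one gene flipped.
noisy_sample = [-1, 1, 1, 1, -1, -1, -1, -1]
assigned = recall(w, noisy_sample)
```

In this toy setting, assigning each sample to the attractor it converges to plays the role of clustering, and comparing `energy(w, noisy_sample)` with `energy(w, assigned)` shows the descent on the energy landscape.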
11.
Bioinformatics ; 29(12): 1553-61, 2013 Jun 15.
Article in English | MEDLINE | ID: mdl-23613489

ABSTRACT

MOTIVATION: Deciphering the modus operandi of dysregulated cellular mechanisms in cancer is critical to implicate novel cancer genes and develop effective anti-cancer therapies. Fundamental to this is meticulous tracking of the behavior of core modules, including complexes and pathways across specific conditions in cancer. RESULTS: Here, we performed a straightforward yet systematic identification and comparison of modules across pancreatic normal and cancer tissue conditions by integrating PPI, gene-expression and mutation data. Our analysis revealed interesting change-patterns in gene composition and expression correlation particularly affecting modules responsible for genome stability. Although in most cases these changes indicated impairment of essential functions (e.g., of DNA damage repair), in several other cases we noticed strengthening of modules possibly abetting cancer. Some of these compensatory modules showed switches in transcription regulation and recruitment of tumor inducers (e.g., SOX2 through overexpression). In-depth analysis revealed novel genes in pancreatic cancer, which showed susceptibility to copy-number alterations (e.g., for USP15 in 17 of 67 cases), supported by literature evidence for their involvement in other tumors (e.g., USP15 in glioblastoma). Two of the identified genes, YWHAE and DISC1, further supported the nexus between neural genes and pancreatic carcinogenesis. Extension of this assessment to BRCA1 and BRCA2 breast tumors showed specific differences even across the two sub-types and revealed novel genes involved therein (e.g., TRIM5 and NCOA6). AVAILABILITY: Our software CONTOURv1 is available at: http://bioinformatics.org.au/tools-data/.


Subjects
Neoplastic Gene Expression Regulation, Neoplasm Genes, BRCA2 Protein/genetics, Breast Neoplasms/genetics, Female, Gene Expression, BRCA1 Genes, BRCA2 Genes, Humans, Mutation, Neoplasms/genetics, Pancreatic Neoplasms/genetics, Pancreatic Neoplasms/metabolism, Protein Interaction Mapping, Saccharomyces cerevisiae Proteins/metabolism
12.
RNA Biol ; 11(3): 176-85, 2014.
Article in English | MEDLINE | ID: mdl-24572375

ABSTRACT

From 1971 to 1985, Carl Woese and colleagues generated oligonucleotide catalogs of 16S/18S rRNAs from more than 400 organisms. Using these incomplete and imperfect data, Carl and his colleagues developed unprecedented insights into the structure, function, and evolution of the large RNA components of the translational apparatus. They recognized a third domain of life, revealed the phylogenetic backbone of bacteria (and its limitations), delineated taxa, and explored the tempo and mode of microbial evolution. For these discoveries to have stood the test of time, oligonucleotide catalogs must carry significant phylogenetic signal; they thus bear re-examination in view of the current interest in alignment-free phylogenetics based on k-mers. Here we consider the aims, successes, and limitations of this early phase of molecular phylogenetics. We computationally generate oligonucleotide sets (e-catalogs) from 16S/18S rRNA sequences, calculate pairwise distances between them based on D2 statistics, compute distance trees, and compare their performance against alignment-based and k-mer trees. Although the catalogs themselves were superseded by full-length sequences, this stage in the development of computational molecular biology remains instructive for us today.


Subjects
Computational Biology/methods, Oligonucleotides, Phylogeny, Ribosomal RNA/genetics, Archaea/classification, Archaea/genetics, Bacteria/classification, Bacteria/genetics, Genetic Databases, Molecular Evolution
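
The abstract of this record computes pairwise distances from D2 statistics. In its basic form, D2 counts k-mer matches between two sequences: it is the sum, over every word of length k, of the product of that word's counts in each sequence. The following is a minimal sketch of that basic statistic only; the function names and example sequences are illustrative assumptions, and the paper may use a normalised variant or a specific transformation from D2 to a distance that is not reproduced here.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer (word of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a: str, seq_b: str, k: int) -> int:
    """Basic D2 statistic: sum over k-mers of count_in_a * count_in_b."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Only k-mers present in both sequences contribute a non-zero product.
    return sum(n * counts_b[word] for word, n in counts_a.items())


# Invented example sequences; identical sequences share all three 2-mers.
shared = d2("ACGT", "ACGT", 2)   # AC, CG, GT each match once -> 3
none = d2("ACGT", "TTTT", 2)     # no 2-mer in common -> 0
```

Because raw D2 grows with sequence length and composition, practical pipelines usually normalise it before building distance trees; that step is left out of this sketch.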
13.
BMC Bioinformatics ; 14: 120, 2013 Apr 08.
Article in English | MEDLINE | ID: mdl-23566217

ABSTRACT

BACKGROUND: Clustering sequences into groups of putative homologs (families) is a critical first step in many areas of comparative biology and bioinformatics. The performance of clustering approaches in delineating biologically meaningful families depends strongly on characteristics of the data, including content bias and degree of divergence. New, highly scalable methods have recently been introduced to cluster the very large datasets being generated by next-generation sequencing technologies. However, there has been little systematic investigation of how characteristics of the data impact the performance of these approaches. RESULTS: Using clusters from a manually curated dataset as reference, we examined the performance of a widely used graph-based Markov clustering algorithm (MCL) and a greedy heuristic approach (UCLUST) in delineating protein families coded by three sets of bacterial genomes of different G+C content. Both MCL and UCLUST generated clusters that are comparable to the reference sets at specific parameter settings, although UCLUST tends to under-cluster compositionally biased sequences (G+C content 33% and 66%). Using simulated data, we sought to assess the individual effects of sequence divergence, rate heterogeneity, and underlying G+C content. Performance decreased with increasing sequence divergence, decreasing among-site rate variation, and increasing G+C bias. Two MCL-based methods recovered the simulated families more accurately than did UCLUST. MCL using local alignment distances is more robust across the investigated range of sequence features than are greedy heuristics using distances based on global alignment. CONCLUSIONS: Our results demonstrate that sequence divergence, rate heterogeneity and content bias can individually and in combination affect the accuracy with which MCL and UCLUST can recover homologous protein families. For application to data that are more divergent, and exhibit higher among-site rate variation and/or content bias, MCL may often be the better choice, especially if computational resources are not limiting.


Subjects
Proteins/classification, Amino Acid Sequence Homology, Algorithms, Bacterial Proteins/chemistry, Bacterial Proteins/classification, Bacterial Proteins/genetics, Base Composition, Cluster Analysis, Bacterial DNA/chemistry, Molecular Evolution, Bacterial Genome, Markov Chains, Protein Sequence Analysis/methods
14.
BMC Bioinformatics ; 14 Suppl 16: S14, 2013.
Article in English | MEDLINE | ID: mdl-24564496

ABSTRACT

BACKGROUND: Cell survival and development are orchestrated by complex interlocking programs of gene activation and repression. Understanding how this gene regulatory network (GRN) functions in normal states, and is altered in cancer subtypes, offers fundamental insight into oncogenesis and disease progression, and holds great promise for guiding clinical decisions. Inferring a GRN from empirical microarray gene expression data is a challenging task in cancer systems biology. In recent years, module-based approaches for GRN inference have been proposed to address this challenge. Despite the demonstrated success of module-based approaches in uncovering biologically meaningful regulatory interactions, their application remains limited to a single condition, without supporting the comparison of multiple disease subtypes/conditions. Also, their use remains unnecessarily restricted to computational biologists, as accurate inference of modules and their regulators requires integration of diverse tools and heterogeneous data sources, which in turn requires scripting skills, data infrastructure and powerful computational facilities. New analytical frameworks are required to make the module-based GRN inference approach more generally useful to the research community. RESULTS: We present the RMaNI (Regulatory Module Network Inference) framework, which supports cancer subtype-specific or condition-specific GRN inference and differential network analysis. It combines transcriptomic and genomic data sources, and integrates heterogeneous knowledge resources and a set of complementary bioinformatic methods for automated inference of modules and their condition-specific regulators, and facilitates downstream network analyses and data visualization. To demonstrate its utility, we applied RMaNI to a hepatocellular microarray dataset containing normal and three disease conditions. We demonstrate how RMaNI can be employed to understand the genetic architecture underlying three disease conditions. RMaNI is freely available at http://inspect.braembl.org.au/bi/inspect/rmani CONCLUSION: RMaNI makes available a workflow with a comprehensive set of tools that would otherwise be challenging for non-expert users to install and apply. The framework presented in this paper is flexible and can be easily extended to analyse any dataset with multiple disease conditions.


Subjects
Hepatocellular Carcinoma/genetics, Computational Biology/methods, Gene Regulatory Networks, Liver Neoplasms/genetics, Cluster Analysis, Gene Expression, Humans, Internet, Systems Biology/methods
15.
J Mol Evol ; 77(1-2): 1-2, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23877343

ABSTRACT

A recent editorial in Journal of Molecular Evolution highlights opportunities and challenges facing molecular evolution in the era of next-generation sequencing. Abundant sequence data should allow more-complex models to be fit at higher confidence, making phylogenetic inference more reliable and improving our understanding of evolution at the molecular level. However, concern that approaches based on multiple sequence alignment may be computationally infeasible for large datasets is driving the development of so-called alignment-free methods for sequence comparison and phylogenetic inference. The recent editorial characterized these approaches as model-free, not based on the concept of homology, and lacking in biological intuition. We argue here that alignment-free methods have not abandoned models or homology, and can be biologically intuitive.


Subjects
Molecular Evolution, Genetic Models, Phylogeny, Animals, Humans
16.
Nat Methods ; 7(3 Suppl): S26-41, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20195255

ABSTRACT

Advances in imaging techniques and high-throughput technologies are providing scientists with unprecedented possibilities to visualize internal structures of cells, organs and organisms and to collect systematic image data characterizing genes and proteins on a large scale. To make the best use of these increasingly complex and large image data resources, the scientific community must be provided with methods to query, analyze and crosslink these resources to give an intuitive visual representation of the data. This review gives an overview of existing methods and tools for this purpose and highlights some of their limitations and challenges.


Subjects
Computer-Assisted Image Processing, Magnetic Resonance Imaging, Microscopy/methods
17.
Bioinformatics ; 28(6): 851-7, 2012 Mar 15.
Article in English | MEDLINE | ID: mdl-22219205

ABSTRACT

MOTIVATION: Phylogenetic profiling methods can achieve good accuracy in predicting protein-protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly available, identifying the most-informative RT is becoming increasingly difficult. Previous studies on the selection of RT have provided guidelines for manual taxon selection, and for eliminating closely related taxa. However, no general strategy for automatic selection of RT is currently available. RESULTS: We present three novel methods for automating the selection of RT, using machine learning based on known protein-protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting phylogenetic profiles often require very different RT sets to support high prediction accuracy.


Subjects
Archaea/genetics, Artificial Intelligence, Bacteria/genetics, Eukaryota/genetics, Phylogeny, Protein Interaction Maps, Proteins/genetics, Archaea/classification, Archaea/metabolism, Bacteria/classification, Bacteria/metabolism, Eukaryota/classification, Eukaryota/metabolism, Proteins/chemistry, Proteins/metabolism
18.
Bioinformatics ; 28(1): 69-75, 2012 Jan 01.
Article in English | MEDLINE | ID: mdl-22057159

ABSTRACT

MOTIVATION: Protein-protein interactions (PPIs) are pivotal for many biological processes and similarity in Gene Ontology (GO) annotation has been found to be one of the strongest indicators for PPI. Most GO-driven algorithms for PPI inference combine machine learning and semantic similarity techniques. We introduce the concept of inducers as a method to integrate both approaches more effectively, leading to superior prediction accuracies. RESULTS: An inducer (ULCA) in combination with a Random Forest classifier compares favorably to several sequence-based methods, semantic similarity measures and multi-kernel approaches. On a newly created set of high-quality interaction data, the proposed method achieves high cross-species prediction accuracies (Area under the ROC curve ≤ 0.88), rendering it a valuable companion to sequence-based methods. AVAILABILITY: Software and datasets are available at http://bioinformatics.org.au/go2ppi/ CONTACT: m.ragan@uq.edu.au.


Subjects
Algorithms, Molecular Sequence Annotation, Proteins/genetics, Software, Controlled Vocabulary, Protein Databases, Humans, Protein Interaction Maps, ROC Curve, Yeasts/genetics, Yeasts/metabolism
19.
BMC Evol Biol ; 12: 140, 2012 Aug 07.
Article in English | MEDLINE | ID: mdl-22871040

ABSTRACT

BACKGROUND: Proteins of the mammalian PYHIN (IFI200/HIN-200) family are involved in defence against infection through recognition of foreign DNA. The family member absent in melanoma 2 (AIM2) binds cytosolic DNA via its HIN domain and initiates inflammasome formation via its pyrin domain. AIM2 lies within a cluster of related genes, many of which are uncharacterised in mouse. To better understand the evolution, orthology and function of these genes, we have documented the range of PYHIN genes present in representative mammalian species, and undertaken phylogenetic and expression analyses. RESULTS: No PYHIN genes are evident in non-mammals or monotremes, with a single member found in each of three marsupial genomes. Placental mammals show variable family expansions, from one gene in cow to four in human and 14 in mouse. A single HIN domain appears to have evolved in the common ancestor of marsupials and placental mammals, and duplicated to give rise to three distinct forms (HIN-A, -B and -C) in the placental mammal ancestor. Phylogenetic analyses showed that AIM2 HIN-C and pyrin domains clearly diverge from the rest of the family, and it is the only PYHIN protein with orthology across many species. Interestingly, although AIM2 is important in defence against some bacteria and viruses in mice, AIM2 is a pseudogene in cow, sheep, llama, dolphin, dog and elephant. The other 13 mouse genes have arisen by duplication and rearrangement within the lineage, which has allowed some diversification in expression patterns. CONCLUSIONS: The role of AIM2 in forming the inflammasome is relatively well understood, but molecular interactions of other PYHIN proteins involved in defence against foreign DNA remain to be defined. The non-AIM2 PYHIN protein sequences are very distinct from AIM2, suggesting they vary in effector mechanism in response to foreign DNA, and may bind different DNA structures. 
The PYHIN family has highly varied gene composition between mammalian species due to lineage-specific duplication and loss, which probably indicates different adaptations for fighting infectious disease. Non-genomic DNA can indicate infection, or a mutagenic threat. We hypothesise that defence of the genome against endogenous retroelements has been an additional evolutionary driver for PYHIN proteins.


Subjects
Molecular Evolution , Mammals/genetics , Nuclear Proteins/genetics , Animals , Bayes Theorem , DNA-Binding Proteins , Humans , Inflammasomes/metabolism , Mice , Inbred C57BL Mice , Nuclear Proteins/chemistry , Nuclear Proteins/immunology , Phylogeny , Rats , Transcriptome
20.
RNA ; 16(9): 1760-8, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20651029

ABSTRACT

The heterogeneous nuclear ribonucleoproteins (hnRNPs) A/B are a family of RNA-binding proteins that participate in various aspects of nucleic acid metabolism, including mRNA trafficking, telomere maintenance, and splicing. They are both regulators and targets of alternative splicing, and the patterns of alternative splicing of their transcripts have diverged between paralogs and between orthologs in different species. Surprisingly, the extent of this splicing variation and its implications for post-transcriptional regulation have remained largely unexplored. Here, we conducted a detailed analysis of hnRNP A/B sequences and expression patterns across six vertebrates. Alternative exons emerged via the introduction of new splice sites, changes in the strengths of existing splice sites, and the accumulation of auxiliary splicing regulatory motifs. Observed isoform expression patterns could be attributed to the frequency and strength of cis-elements. We found a trend toward increased splicing variation in mammals and identified novel alternatively spliced isoforms in human and chicken. Pulldown and translational assays demonstrated that the inclusion of alternative exons altered the affinity of hnRNP A/B proteins for their cognate nucleic acids and modified protein expression levels. As the hnRNPs A/B regulate several key steps in mRNA processing, the involvement of diverse hnRNP isoforms in multiple cellular contexts and species implies concomitant differences in the transcriptional output of these systems. We conclude that the emergence of alternative splicing in the hnRNPs A/B has contributed to the diversification of their roles in the regulation of alternative splicing and has thus added an unexpected layer of regulatory complexity to transcription in vertebrates.


Subjects
Alternative Splicing , Heterogeneous-Nuclear Ribonucleoprotein Group A-B/metabolism , Animals , Molecular Evolution , HeLa Cells , Heterogeneous-Nuclear Ribonucleoprotein Group A-B/genetics , Humans , Mice , RNA Splice Sites , Messenger RNA/metabolism , Rats , Ribonucleic Acid Regulatory Sequences