Results 1 - 20 of 22
1.
Sensors (Basel) ; 23(6), 2023 Mar 09.
Article in English | MEDLINE | ID: mdl-36991682

ABSTRACT

Electroencephalogram (EEG) interpretation plays a critical role in the clinical assessment of neurological conditions, most notably epilepsy. However, EEG recordings are typically analyzed manually by highly specialized and heavily trained personnel. Moreover, the low rate of capturing abnormal events during the procedure makes interpretation time-consuming, resource-hungry, and overall an expensive process. Automatic detection offers the potential to improve the quality of patient care by shortening the time to diagnosis, managing big data and optimizing the allocation of human resources towards precision medicine. Here, we present MindReader, a novel unsupervised machine-learning method composed of three interacting parts: an autoencoder network, a hidden Markov model (HMM), and a generative component. After dividing the signal into overlapping frames and performing a fast Fourier transform, MindReader trains an autoencoder neural network for dimensionality reduction and compact representation of different frequency patterns for each frame. Next, it processes the temporal patterns using an HMM, while a third, generative component hypothesizes and characterizes the different phases, which are then fed back to the HMM. MindReader then automatically generates labels that the physician can interpret as pathological and non-pathological phases, thus effectively reducing the search space for trained personnel. We evaluated MindReader's predictive performance on 686 recordings, encompassing more than 980 h from the publicly available Physionet database. Compared to manual annotations, MindReader identified 197 of 198 epileptic events (99.45%), and is, as such, a highly sensitive method, which is a prerequisite for clinical use.


Subjects
Electroencephalography; Epilepsy; Humans; Electroencephalography/methods; Epilepsy/diagnosis; Neural Networks, Computer; Fourier Analysis; Unsupervised Machine Learning
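
The first step named in the MindReader abstract, dividing a signal into overlapping frames and taking a Fourier transform of each, can be sketched as follows. This is a minimal illustration and not the authors' code: the frame length, hop size, and toy sine signal are assumptions, and a naive discrete Fourier transform stands in for the fast Fourier transform so that the example needs only the standard library.

```python
import cmath
import math


def overlapping_frames(signal, frame_len, hop):
    """Split a 1-D signal into frames of frame_len samples, advancing by hop."""
    if not 0 < hop <= frame_len:
        raise ValueError("hop must be between 1 and frame_len")
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for the non-negative frequency bins of one frame."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) / n
            for k in range(n // 2 + 1)]


# Toy input (assumed): a sine wave with exactly 4 cycles per 32-sample frame.
frame_len, hop = 32, 16  # 50% overlap between consecutive frames
signal = [math.sin(2 * math.pi * 4 * t / frame_len) for t in range(128)]

frames = overlapping_frames(signal, frame_len, hop)
spectra = [magnitude_spectrum(f) for f in frames]

# Index of the strongest frequency bin in each frame.
peak_bins = [max(range(len(s)), key=s.__getitem__) for s in spectra]
```

Each entry of `spectra` is the kind of per-frame frequency vector that a downstream model, such as the autoencoder described in the abstract, could take as input.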
2.
BMC Bioinformatics ; 23(1): 443, 2022 Oct 25.
Article in English | MEDLINE | ID: mdl-36284273

ABSTRACT

BACKGROUND: Generating polygenic risk scores for diseases and complex traits requires high quality GWAS summary statistic files. Often, these files are difficult to acquire because the data are either unshared or incomplete. To date, few bioinformatics tools focus on restoring missing columns containing identification and association data, although such restoration has the potential to increase the number of usable GWAS summary statistics files. RESULTS: SumStatsRehab was able to restore rsID, effect/other alleles, chromosome, base pair position, effect allele frequencies, beta, standard error, and p-values to a better extent than any other currently available tool, with minimal loss. CONCLUSIONS: SumStatsRehab offers a unique tool utilizing both functional programming and pipeline-like architecture, allowing users to generate accurate data restorations for incomplete summary statistics files. This, in turn, increases the number of usable GWAS summary statistics files, which may be invaluable for less researched health traits.


Subjects
Genome-Wide Association Study; Polymorphism, Single Nucleotide; Multifactorial Inheritance; Phenotype; Algorithms
3.
PLoS One ; 17(5): e0259327, 2022.
Article in English | MEDLINE | ID: mdl-35533190

ABSTRACT

The vast majority of human traits, including many disease phenotypes, are affected by alleles at numerous genomic loci. With a continually increasing set of variants with published clinical disease or biomarker associations, an easy-to-use tool for non-programmers to rapidly screen VCF files for risk alleles is needed. We have developed EZTraits as a tool to quickly evaluate genotype data against a set of rules defined by the user. These rules can be defined directly in the scripting language Lua, for genotype calls using variant ID (RS number) or chromosomal position. Alternatively, EZTraits can parse simple and intuitive text including concepts like 'any' or 'all'. Thus, EZTraits is designed to support rapid genetic analysis and hypothesis-testing by researchers, regardless of programming experience or technical background. The software is implemented in C++ and compiles and runs on Linux and MacOS. The source code is available under the MIT license from https://github.com/selfdecode/rd-eztraits.


Subjects
Genomics; Software; Alleles; Genotype; Phenotype
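
The idea of screening genotype calls against user-defined rules that combine conditions with 'any' or 'all', as the EZTraits abstract describes, can be illustrated with a small sketch. The rule representation, function names, and variant IDs below are hypothetical and do not reproduce EZTraits' own rule syntax or its Lua interface; Python is used here purely for illustration.

```python
def carries(genotypes, rsid, allele):
    """True if the genotype call recorded for rsid contains the given allele."""
    return allele in genotypes.get(rsid, "")


def evaluate_rule(genotypes, mode, conditions):
    """Combine (rsid, allele) conditions with 'any' or 'all'."""
    hits = (carries(genotypes, rsid, allele) for rsid, allele in conditions)
    if mode == "any":
        return any(hits)
    if mode == "all":
        return all(hits)
    raise ValueError(f"unknown mode: {mode!r}")


# Placeholder genotype calls keyed by variant ID (not real associations).
sample = {"rs0000001": "AG", "rs0000002": "CC"}
conditions = [("rs0000001", "G"), ("rs0000002", "T")]

any_result = evaluate_rule(sample, "any", conditions)  # one condition holds
all_result = evaluate_rule(sample, "all", conditions)  # second condition fails
```

A variant missing from the sample is treated as not carrying the allele, which is one possible design choice; a real screening tool might instead report it as a no-call.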
4.
BMC Genomics ; 19(1): 964, 2018 Dec 27.
Article in English | MEDLINE | ID: mdl-30587115

ABSTRACT

BACKGROUND: Studies that aim at explaining phenotypes or disease susceptibility by genetic or epigenetic variants often rely on clustering methods to stratify individuals or samples. While statistical associations may point at increased risk for certain parts of the population, the ultimate goal is to make precise predictions for each individual. This necessitates tools that allow for the rapid inspection of each data point, in particular to find explanations for outliers. RESULTS: ACES is an integrative cluster- and phenotype-browser, which implements standard clustering methods, as well as multiple visualization methods in which all sample information can be displayed quickly. In addition, ACES can automatically mine a list of phenotypes for cluster enrichment, whereby the number of clusters and their boundaries are estimated by a novel method. For visual data browsing, ACES provides a 2D or 3D PCA or Heat Map view. ACES is implemented in Java, with a focus on a user-friendly, interactive, graphical interface. CONCLUSIONS: ACES has proven an invaluable tool for analyzing large, pre-filtered DNA methylation data sets and RNA-Sequencing data, owing to the ease with which it links molecular markers to complex phenotypes. The source code is available from https://github.com/GrabherrGroup/ACES.


Subjects
User-Computer Interface; Cluster Analysis; DNA Methylation; Diabetes Mellitus, Type 1/genetics; Diabetes Mellitus, Type 1/pathology; Humans; Internet Access; Principal Component Analysis; RNA/chemistry; RNA/metabolism
5.
Genes (Basel) ; 9(10), 2018 Oct 18.
Article in English | MEDLINE | ID: mdl-30340386

ABSTRACT

The massive increase in computational power over recent years and wider applications of machine learning methods, coincidental or not, were paralleled by remarkable advances in high-throughput DNA sequencing technologies.[...].

6.
BioData Min ; 10: 30, 2017.
Article in English | MEDLINE | ID: mdl-28878825

ABSTRACT

BACKGROUND: Measuring how gene expression changes in the course of an experiment assesses how an organism responds on a molecular level. Sequencing of RNA molecules, and their subsequent quantification, aims to assess global gene expression changes on the RNA level (transcriptome). While advances in high-throughput RNA-sequencing (RNA-seq) technologies allow for inexpensive data generation, accurate post-processing and normalization across samples is required to eliminate any systematic noise introduced by the biochemical and/or technical processes. Existing methods thus either normalize on selected known reference genes that are invariant in expression across the experiment, assume that the majority of genes are invariant, or that the effects of up- and down-regulated genes cancel each other out during the normalization. RESULTS: Here, we present a novel method, moose2, which predicts invariant genes in silico through a dynamic programming (DP) scheme and applies a quadratic normalization based on this subset. The method allows for specifying a set of known or experimentally validated invariant genes, which guides the DP. We experimentally verified the predictions of this method in the bacterium Escherichia coli, and show how moose2 is able to (i) estimate the expression value distances between RNA-seq samples, (ii) reduce the variation of expression values across all samples, and (iii) subsequently reveal new functional groups of genes during the late stages of DNA damage. We further applied the method to three eukaryotic data sets, on which its performance compares favourably to other methods. The software is implemented in C++ and is publicly available from http://grabherr.github.io/moose2/. CONCLUSIONS: The proposed RNA-seq normalization method, moose2, is a valuable alternative to existing methods, with two major advantages: (i) in silico prediction of invariant genes provides a list of potential reference genes for downstream analyses, and (ii) non-linear artefacts in RNA-seq data are handled adequately to minimize variations between replicates.
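
The notion of a quadratic normalization fitted on a subset of invariant genes can be sketched as a least-squares fit of a second-degree polynomial. This is an assumption-laden illustration rather than moose2's implementation: the invariant set is simply given here (moose2 predicts it with dynamic programming), and the function names and toy values are hypothetical.

```python
def solve3(m, v):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_quadratic(xs, ys):
    """Least-squares coefficients (c0, c1, c2) of y = c0 + c1*x + c2*x^2."""
    s = [sum(x ** p for x in xs) for p in range(5)]          # sums of x^p
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]
    normal_matrix = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve3(normal_matrix, t)


def apply_quadratic(coeffs, x):
    """Map one expression value through the fitted quadratic."""
    c0, c1, c2 = coeffs
    return c0 + c1 * x + c2 * x * x


# Toy data (assumed): expression of invariant genes in a sample (xs) and in a
# reference (ys), related by a known quadratic distortion.
invariant_sample = [0.0, 1.0, 2.0, 3.0, 4.0]
invariant_reference = [1 + 2 * x + 0.5 * x * x for x in invariant_sample]

coeffs = fit_quadratic(invariant_sample, invariant_reference)
# Normalize every gene in the sample, not only the invariant ones.
normalized = [apply_quadratic(coeffs, x) for x in [0.5, 2.5, 5.0]]
```

Fitting on the invariant subset only means that genuinely up- or down-regulated genes do not pull the correction curve towards themselves.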

7.
BMC Bioinformatics ; 17(1): 393, 2016 Sep 23.
Article in English | MEDLINE | ID: mdl-27663458

ABSTRACT

BACKGROUND: DNA methylation plays a key role in developmental processes, which is reflected in changing methylation patterns at specific CpG sites over the lifetime of an individual. The underlying mechanisms are complex and possibly affect multiple genes or entire pathways. RESULTS: We applied a multivariate approach to identify combinations of CpG sites that undergo modifications when transitioning between developmental stages. Monte Carlo feature selection produced a list of ranked and statistically significant CpG sites, while rule-based models allowed for identifying particular methylation changes in these sites. Our rule-based classifier reports combinations of CpG sites, together with changes in their methylation status in the form of easy-to-read IF-THEN rules, which allows for identification of the genes associated with the underlying sites. CONCLUSION: We utilized machine learning and statistical methods to discretize decision class (age) values to get a general pattern of methylation changes over the lifespan. The CpG sites present in the significant rules were annotated to genes involved in brain formation, general development, as well as genes linked to cancer and Alzheimer's disease.

8.
PLoS One ; 10(9): e0139080, 2015.
Article in English | MEDLINE | ID: mdl-26413905

ABSTRACT

After performing de novo transcript assembly of >1 billion RNA-Sequencing reads obtained from 22 samples of different Norway spruce (Picea abies) tissues that were not surface sterilized, we found that assembled sequences captured a mix of plant, lichen, and fungal transcripts. The latter were likely expressed by endophytic and epiphytic symbionts, indicating that these organisms were present, alive, and metabolically active. Here, we show that these serendipitously sequenced transcripts need not be considered merely as contamination, as is common, but that they provide insight into the plant's phyllosphere. Notably, we could classify these transcripts as originating predominantly from Dothideomycetes and Leotiomycetes species, with functional annotation of gene families indicating active growth and metabolism, particularly with regard to glucose intake and processing, as well as gene regulation.


Subjects
Fungi/genetics; Picea/genetics; Picea/microbiology; Transcriptome/genetics; Base Composition/genetics; Gene Expression Regulation, Fungal; Gene Expression Regulation, Plant; RNA, Messenger/genetics; RNA, Messenger/metabolism
9.
Bioinformatics ; 31(12): 2054-5, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-25661541

ABSTRACT

Whiteboard is a class library implemented in C++ that enables visualization to be tightly coupled with computation when analyzing large and complex datasets. AVAILABILITY AND IMPLEMENTATION: The C++ source code, coding samples and documentation are freely available under the Lesser General Public License from http://whiteboard-class.sourceforge.net/.


Subjects
Computational Biology/methods; Computer Graphics; Programming Languages; Software; Databases, Factual; Humans; Information Storage and Retrieval
10.
BMC Bioinformatics ; 15: 227, 2014 Jun 30.
Article in English | MEDLINE | ID: mdl-24976580

ABSTRACT

BACKGROUND: Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. The orthology relationship between copies across multiple genomes can be resolved unambiguously by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, the number of which grows as N^2 with the number of available genomes, N. RESULTS: Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. We last applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes. Not limited to protein coding genes, Kraken also detects thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of expression levels across species. CONCLUSIONS: Kraken is a computational genome coordinate translator that facilitates cross-species comparisons, distinguishes orthologs from paralogs, and does not require costly all-to-all whole genome mappings. Kraken is freely available under LGPL from http://github.com/nedaz/kraken.


Subjects
Genomics/methods; Software; Animals; Chromosome Mapping; Drosophila melanogaster/genetics; Evolution, Molecular; Genome/genetics; Humans; Mice; Molecular Sequence Annotation; Rats; Synteny/genetics; Transcription, Genetic
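
The core idea in the Kraken abstract, translating coordinates between two genomes by chaining pairwise maps instead of aligning every genome against every other, can be sketched with a breadth-first traversal. This is a strong simplification under stated assumptions: the maps below are exact position-to-position lookups with made-up coordinates, whereas the real software works on synteny blocks and re-computes orthology along the way.

```python
from collections import deque


def translate(maps, source, target, position):
    """Translate a position from source to target by chaining pairwise maps.

    maps: {(genome_a, genome_b): {position_in_a: position_in_b}}
    Returns the translated position, or None if no chain of maps carries it.
    """
    queue = deque([(source, position)])
    seen = {source}
    while queue:
        genome, pos = queue.popleft()
        if genome == target:
            return pos
        for (a, b), mapping in maps.items():
            if a == genome and b not in seen and pos in mapping:
                seen.add(b)
                queue.append((b, mapping[pos]))
    return None


# Hypothetical pairwise maps: only human-mouse and mouse-rat are available,
# so human-rat translation must go through mouse.
pairwise_maps = {
    ("human", "mouse"): {100: 250},
    ("mouse", "rat"): {250: 40},
}

rat_position = translate(pairwise_maps, "human", "rat", 100)
unmapped = translate(pairwise_maps, "human", "rat", 101)
```

With N genomes connected in such a graph, N - 1 pairwise maps are enough to reach every genome, which is the intuition behind avoiding the quadratic all-to-all requirement.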
11.
Source Code Biol Med ; 9: 12, 2014.
Article in English | MEDLINE | ID: mdl-24976859

ABSTRACT

BACKGROUND: The fundamental challenge in optimally aligning homologous sequences is to define a scoring scheme that best reflects the underlying biological processes. Maximising the overall number of matches in the alignment does not always reflect the patterns by which nucleotides mutate. Efficiently implemented algorithms that can be parameterised to accommodate more complex non-linear scoring schemes are thus desirable. RESULTS: We present Cola, alignment software that implements different optimal alignment algorithms, also allowing for scoring contiguous matches of nucleotides in a nonlinear manner. The latter places more emphasis on short, highly conserved motifs, and less on the surrounding nucleotides, which can be more diverged. To illustrate the differences, we report results from aligning 14,100 sequences from 3' untranslated regions of human genes to 25 of their mammalian counterparts, where we found that a nonlinear scoring scheme is more consistent than a linear scheme in detecting short, conserved motifs. CONCLUSIONS: Cola is freely available under LGPL from https://github.com/nedaz/cola.
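
The difference between counting matches linearly and rewarding contiguous runs of matches nonlinearly can be shown on two ungapped toy alignments. The squared-run-length score used here is an assumption chosen for illustration; the abstract does not state which nonlinear function Cola applies, and the sequences are invented.

```python
def match_runs(a, b):
    """Lengths of maximal runs of matching positions in two aligned sequences."""
    runs, current = [], 0
    for x, y in zip(a, b):
        if x == y:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def linear_score(a, b):
    """Total number of matching positions."""
    return sum(match_runs(a, b))


def nonlinear_score(a, b, power=2):
    """Sum of run lengths raised to a power, favouring contiguous matches."""
    return sum(run ** power for run in match_runs(a, b))


reference = "ACGTACGT"
contiguous = "ACGTTGCA"  # one run of four matches
scattered = "AGGAAGGA"   # four isolated matches

scores = {
    "contiguous": (linear_score(reference, contiguous),
                   nonlinear_score(reference, contiguous)),
    "scattered": (linear_score(reference, scattered),
                  nonlinear_score(reference, scattered)),
}
```

Both alignments have the same number of matches, yet only the nonlinear score separates the conserved four-base motif from the scattered matches.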

12.
PLoS One ; 9(3): e91172, 2014.
Article in English | MEDLINE | ID: mdl-24625832

ABSTRACT

The domestic dog, Canis familiaris, is a well-established model system for mapping trait and disease loci. While the original draft sequence was of good quality, gaps were abundant particularly in promoter regions of the genome, negatively impacting the annotation and study of candidate genes. Here, we present an improved genome build, canFam3.1, which includes 85 MB of novel sequence and now covers 99.8% of the euchromatic portion of the genome. We also present multiple RNA-Sequencing data sets from 10 different canine tissues to catalog ∼175,000 expressed loci. While about 90% of the coding genes previously annotated by EnsEMBL have measurable expression in at least one sample, the number of transcript isoforms detected by our data expands the EnsEMBL annotations by a factor of four. Syntenic comparison with the human genome revealed an additional ∼3,000 loci that are characterized as protein coding in human and were also expressed in the dog, suggesting that those were previously not annotated in the EnsEMBL canine gene set. In addition to ∼20,700 high-confidence protein coding loci, we found ∼4,600 antisense transcripts overlapping exons of protein coding genes, ∼7,200 intergenic multi-exon transcripts without coding potential, likely candidates for long intergenic non-coding RNAs (lincRNAs), and ∼11,000 transcripts that were reported by two different library construction methods but did not fit any of the above categories. Of the lincRNAs, about 6,000 have no annotated orthologs in human or mouse. Functional analysis of two novel transcripts with shRNA in a mouse kidney cell line altered cell morphology and motility. All in all, we provide a much-improved annotation of the canine genome and suggest regulatory functions for several of the novel non-coding transcripts.


Subjects
Dogs/genetics; Genome; Polymorphism, Single Nucleotide; Animals; Cell Line; Exons; Gene Expression Profiling; Humans; Mice; Nerve Tissue Proteins/metabolism; Oligonucleotides, Antisense/chemistry; Podocytes/cytology; RNA, Messenger/metabolism; RNA, Small Interfering/metabolism; RNA, Untranslated; Sequence Analysis, RNA
13.
BMC Genomics ; 14: 347, 2013 May 24.
Article in English | MEDLINE | ID: mdl-23706020

ABSTRACT

BACKGROUND: Phenomena such as incomplete lineage sorting, horizontal gene transfer, gene duplication and subsequent sub- and neo-functionalisation can result in distinct local phylogenetic relationships that are discordant with species phylogeny. In order to assess the possible biological roles for these subdivisions, they must first be identified and characterised, preferably on a large scale and in an automated fashion. RESULTS: We developed Saguaro, a combination of a Hidden Markov Model (HMM) and a Self Organising Map (SOM), to characterise local phylogenetic relationships among aligned sequences using cacti, matrices of pair-wise distance measures. While the HMM determines the genomic boundaries from aligned sequences, the SOM hypothesises new cacti in an unsupervised and iterative fashion based on the regions that were modelled least well by existing cacti. After testing the software on simulated data, we demonstrate the utility of Saguaro by testing two different data sets: (i) 181 Dengue virus strains, and (ii) 5 primate genomes. Saguaro identifies regions under lineage-specific constraint for the first set, and genomic segments that we attribute to incomplete lineage sorting in the second dataset. Intriguingly for the primate data, Saguaro also classified an additional ~3% of the genome as most incompatible with the expected species phylogeny. A substantial fraction of these regions was found to overlap genes associated with both the innate and adaptive immune systems. CONCLUSIONS: Saguaro detects distinct cacti describing local phylogenetic relationships without requiring any a priori hypotheses. 
We have successfully demonstrated Saguaro's utility with two contrasting data sets, one containing many members with short sequences (Dengue viral strains: n = 181, genome size = 10,700 nt), and the other with few members but complex genomes (related primate species: n = 5, genome size = 3 Gb), suggesting that the software is applicable to a wide variety of experimental populations. Saguaro is written in C++, runs on the Linux operating system, and can be downloaded from http://saguarogw.sourceforge.net/.


Subjects
Genomics/methods; Algorithms; Animals; Dengue Virus/genetics; Disease Outbreaks; Humans; Immunity/genetics; Markov Chains; Models, Genetic; Phylogeny; Primates/genetics; Primates/immunology; Software; Species Specificity
14.
Nature ; 484(7392): 55-61, 2012 Apr 04.
Article in English | MEDLINE | ID: mdl-22481358

ABSTRACT

Marine stickleback fish have colonized and adapted to thousands of streams and lakes formed since the last ice age, providing an exceptional opportunity to characterize genomic mechanisms underlying repeated ecological adaptation in nature. Here we develop a high-quality reference genome assembly for threespine sticklebacks. By sequencing the genomes of twenty additional individuals from a global set of marine and freshwater populations, we identify a genome-wide set of loci that are consistently associated with marine-freshwater divergence. Our results indicate that reuse of globally shared standing genetic variation, including chromosomal inversions, has an important role in repeated evolution of distinct marine and freshwater sticklebacks, and in the maintenance of divergent ecotypes during early stages of reproductive isolation. Both coding and regulatory changes occur in the set of loci underlying marine-freshwater evolution, but regulatory changes appear to predominate in this well known example of repeated adaptive evolution in nature.


Subjects
Adaptation, Physiological/genetics; Biological Evolution; Genome/genetics; Smegmamorpha/genetics; Alaska; Animals; Aquatic Organisms/genetics; Chromosome Inversion/genetics; Chromosomes/genetics; Conserved Sequence/genetics; Ecotype; Female; Fresh Water; Genetic Variation/genetics; Genomics; Molecular Sequence Data; Seawater; Sequence Analysis, DNA
15.
Bioeng Bugs ; 3(2): 120-3, 2012.
Article in English | MEDLINE | ID: mdl-22095054

ABSTRACT

The promoter is a key element in gene transcription and regulation. We previously reported that artificial sequences rich in the dinucleotide CpG are sufficient to drive expression in vitro in mammalian cell lines, without requiring canonical binding sites for transcription factor proteins. Here, we report that introducing a promoter organization that alternates in CpGs and regions rich in A and T further increases expression strength, as well as how insertion of specific binding sites makes such sequences respond to induced levels of the transcription factor NFκB. Our findings further contribute to the mechanistic understanding of promoters, as well as how these sequences might be shaped by evolutionary pressure in living organisms.


Subjects
CpG Islands; NF-kappa B/metabolism; Promoter Regions, Genetic; Base Sequence; Binding Sites/genetics; Cell Line; Dinucleoside Phosphates/genetics; Dinucleoside Phosphates/metabolism; Gene Expression Regulation; HEK293 Cells; Humans; Molecular Sequence Data; NF-kappa B/genetics
16.
PLoS One ; 6(5): e20136, 2011.
Article in English | MEDLINE | ID: mdl-21625601

ABSTRACT

The choice of promoter is a critical step in optimizing the efficiency and stability of recombinant protein production in mammalian cell lines. Artificial promoters that provide stable expression across cell lines and can be designed to the desired strength constitute an alternative to the use of viral promoters. Here, we show how the nucleotide characteristics of highly active human promoters can be modelled via the genome-wide frequency distribution of short motifs: by overlapping motifs that occur infrequently in the genome, we constructed contiguous sequence that is rich in GC and CpGs, both features of known promoters, but lacking homology to real promoters. We show that snippets from this sequence, at 100 base pairs or longer, drive gene expression in vitro in a number of mammalian cells, and are thus candidates for use in protein production. We further show that expression is driven by the general transcription factors TFIIB and TFIID, both being ubiquitously present across cell types, which results in less tissue- and species-specific regulation compared to the viral promoter SV40. We lastly found that the strength of a promoter can be tuned up and down by modulating the counts of GC and CpGs in localized regions. These results constitute a "proof-of-concept" for custom-designing promoters that are suitable for biotechnological and medical applications.


Subjects
Genetic Engineering; Nucleotides/chemistry; Promoter Regions, Genetic
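
The two sequence features that the promoter abstract ties to expression strength, GC content and the count of CpG dinucleotides in localized regions, are simple to compute. The sketch below assumes a toy sequence and an arbitrary window size; it illustrates the measurements only and not the authors' motif-frequency design procedure.

```python
def gc_fraction(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def cpg_count(seq):
    """Number of CG dinucleotides; occurrences of 'CG' cannot overlap."""
    return seq.upper().count("CG")


def windowed_profile(seq, window, step):
    """(start, GC fraction, CpG count) for each window along the sequence."""
    return [(i, gc_fraction(seq[i:i + window]), cpg_count(seq[i:i + window]))
            for i in range(0, len(seq) - window + 1, step)]


# Toy sequence (assumed): CpG-rich flanks around an AT-rich core.
sequence = "CGCGATATCGCG"
overall_gc = gc_fraction(sequence)
overall_cpg = cpg_count(sequence)
profile = windowed_profile(sequence, window=4, step=4)
```

A windowed profile of this kind makes the localized GC and CpG levels visible, which is the quantity the abstract describes modulating to tune promoter strength.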
17.
Nat Biotechnol ; 29(7): 644-52, 2011 May 15.
Article in English | MEDLINE | ID: mdl-21572440

ABSTRACT

Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.


Subjects
Algorithms; Gene Expression Profiling/methods; RNA/chemistry; RNA/genetics; Sequence Analysis, RNA/methods; Base Sequence; Molecular Sequence Data; Reference Values; Sequence Analysis, RNA/standards; Transcriptome
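
The Trinity abstract refers to constructing and analyzing de Bruijn graphs. A minimal sketch of that data structure is given below: reads are broken into k-mers, each k-mer becomes an edge between its (k-1)-mer prefix and suffix, and an unbranched path is spelled back into a sequence. The reads, the value of k, and the function names are assumptions for illustration; Trinity itself handles errors, coverage, and alternative isoforms, none of which is modelled here.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in a k-mer."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unbranched(graph, start):
    """Extend a sequence from start while each node has exactly one successor."""
    path, node, visited = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (successor,) = graph[node]
        if successor in visited:  # stop rather than loop around a cycle
            break
        path += successor[-1]
        visited.add(successor)
        node = successor
    return path


# Toy overlapping reads (assumed) drawn from the sequence ATGGCGTACC.
reads = ["ATGGCG", "GGCGTA", "CGTACC"]
graph = de_bruijn(reads, k=4)
contig = walk_unbranched(graph, "ATG")
```

Nodes with more than one successor are where a real assembler has to decide between sequencing errors, paralogs, and alternative splicing; this sketch simply stops at such branches.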
18.
Methods Mol Biol ; 722: 1-9, 2011.
Article in English | MEDLINE | ID: mdl-21590409

ABSTRACT

Decoding the genome sequence is becoming a fundamental tool for molecular, genetic, and genomic studies. This chapter reviews the history of DNA sequencing and the technical principles of different sequencing platforms, and compares the strengths and weaknesses of different techniques for high-throughput genome sequencing applications. It also briefly describes genome assembly and its validation.


Subjects
Chromosome Mapping/methods; Computational Biology/methods; Genome, Human/genetics; Sequence Analysis, DNA/methods; Genetic Variation/genetics; Genomics/methods; Haplotypes/genetics; Humans
19.
Nat Methods ; 8(6): 469-77, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21623353

ABSTRACT

High-throughput RNA sequencing (RNA-seq) promises a comprehensive picture of the transcriptome, allowing for the complete annotation and quantification of all genes and their isoforms across samples. Realizing this promise requires increasingly complex computational methods. These computational challenges fall into three main categories: (i) read mapping, (ii) transcriptome reconstruction and (iii) expression quantification. Here we explain the major conceptual and practical challenges, and the general classes of solutions for each category. Finally, we highlight the interdependence between these categories and discuss the benefits for different biological applications.


Subjects
Gene Expression Profiling/statistics & numerical data; High-Throughput Nucleotide Sequencing/statistics & numerical data; Sequence Analysis, RNA/statistics & numerical data; Animals; Computational Biology/methods; Genomics/statistics & numerical data; Humans; Sequence Alignment/statistics & numerical data
20.
Proc Natl Acad Sci U S A ; 108(22): 9166-71, 2011 May 31.
Article in English | MEDLINE | ID: mdl-21536894

ABSTRACT

Rust fungi are some of the most devastating pathogens of crop plants. They are obligate biotrophs, which extract nutrients only from living plant tissues and cannot grow apart from their hosts. Their lifestyle has slowed the dissection of molecular mechanisms underlying host invasion and avoidance or suppression of plant innate immunity. We sequenced the 101-Mb genome of Melampsora larici-populina, the causal agent of poplar leaf rust, and the 89-Mb genome of Puccinia graminis f. sp. tritici, the causal agent of wheat and barley stem rust. We then compared the 16,399 predicted proteins of M. larici-populina with the 17,773 predicted proteins of P. graminis f. sp. tritici. Genomic features related to their obligate biotrophic lifestyle include expanded lineage-specific gene families, a large repertoire of effector-like small secreted proteins, impaired nitrogen and sulfur assimilation pathways, and expanded families of amino acid and oligopeptide membrane transporters. The dramatic up-regulation of transcripts coding for small secreted proteins, secreted hydrolytic enzymes, and transporters in planta suggests that they play a role in host infection and nutrient acquisition. Some of these genomic hallmarks are mirrored in the genomes of other microbial eukaryotes that have independently evolved to infect plants, indicating convergent adaptation to a biotrophic existence inside plant cells.


Subjects
Basidiomycota/genetics; Fungi/genetics; Triticum/microbiology; Gene Expression Profiling; Genes, Fungal; Genome; Genome, Fungal; Models, Genetic; Nitrates/chemistry; Oligonucleotide Array Sequence Analysis; Phylogeny; Plant Diseases/microbiology; Plant Leaves/microbiology; Sequence Analysis, DNA; Sulfates/chemistry