Results 1 - 7 of 7
1.
BMC Genomics ; 18(1): 326, 2017 04 26.
Article in English | MEDLINE | ID: mdl-28441938

ABSTRACT

BACKGROUND: Mitochondrial dysfunction is linked to numerous pathological states, in particular those related to metabolism, brain health and ageing. Nuclear-encoded gene polymorphisms implicated in mitochondrial functions can be analyzed in the context of classical genome-wide association studies. By contrast, mitochondrial DNA (mtDNA) variants are more challenging to identify and analyze for several reasons. First, contrary to the diploid nuclear genome, each cell carries several hundred copies of the circular mitochondrial genome. Mutations can therefore be present in only a subset of the mtDNA molecules, resulting in a heterogeneous pool of mtDNA, a situation referred to as heteroplasmy. Consequently, detection and quantification of variants require extremely accurate tools, especially when the variant proportion is small. Additionally, the mitochondrial genome has pseudogenized into numerous copies within the nuclear genome over the course of evolution. These nuclear pseudogenes, named NUMTs, must be distinguished from genuine mtDNA sequences and excluded from the analysis.

RESULTS: Here we describe a novel method, named MitoRS, in which the entire mitochondrial genome is amplified in a single reaction using rolling circle amplification. This approach is easier to set up and of higher throughput than classical PCR amplification. Sequencing libraries are generated at high throughput using a tagmentation-based method. Fine-tuned parameters are then applied in the analysis to allow detection of variants even at low heteroplasmy frequency. The method was thoroughly benchmarked in a set of experiments designed to demonstrate its robustness, accuracy and sensitivity. The MitoRS method requires 5 ng total DNA as starting material. More than 96 samples can be processed in less than a day of laboratory work and sequenced in a single lane of an Illumina HiSeq flow cell. The lower limit for accurate quantification of single nucleotide variants has been measured at 1% frequency.

CONCLUSIONS: The MitoRS method enables the robust, accurate, and sensitive analysis of a large number of samples. Because it is cost-effective and simple to set up, we anticipate that this method will promote the analysis of mtDNA variants in large cohorts, and may help assess the impact of mtDNA heteroplasmy on metabolic health, brain function, cancer progression, or ageing.
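
As an illustration of what quantifying heteroplasmy means in practice, the sketch below computes the variant fraction at one mtDNA position from read counts and compares it with the 1% quantification limit reported in the abstract. The function names and the read counts are hypothetical; the abstract does not describe how MitoRS itself performs this calculation.

```python
# Minimal sketch, not the MitoRS pipeline: all names and counts are illustrative.

QUANT_LIMIT = 0.01  # 1% lower limit for accurate quantification (from the abstract)


def heteroplasmy_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at one mtDNA position that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def is_quantifiable(alt_reads: int, total_reads: int,
                    limit: float = QUANT_LIMIT) -> bool:
    """True when the observed variant fraction reaches the quantification limit."""
    return heteroplasmy_fraction(alt_reads, total_reads) >= limit


# Hypothetical position covered by 5000 reads, 50 of which carry the variant.
fraction = heteroplasmy_fraction(50, 5000)
print(f"variant fraction: {fraction:.3f}, quantifiable: {is_quantifiable(50, 5000)}")
```

A real analysis would also have to account for sequencing error and for reads originating from NUMTs, both of which this sketch ignores.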


Subjects
Mitochondrial DNA/analysis, Nucleic Acid Amplification Techniques/methods, Mitochondrial DNA/metabolism, High-Throughput Nucleotide Sequencing, Humans, INDEL Mutation, Single Nucleotide Polymorphism, Real-Time Polymerase Chain Reaction, DNA Sequence Analysis
2.
Mol Biol Evol ; 29(11): 3371-84, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22628532

ABSTRACT

Most fungal plant pathogens secrete effector proteins during pathogenesis to manipulate their host's defense and promote disease. These effectors are so diverse in sequence and distribution that they are essentially considered species-specific. However, we have recently shown the presence of homologous effectors in fungal species of the Dothideomycetes class. One such example is Ecp2, an effector originally described in the tomato pathogen Cladosporium fulvum but later detected in the plant pathogenic fungi Mycosphaerella fijiensis and Mycosphaerella graminicola as well. Here, using in silico sequence-similarity searches against a database of 135 fungal genomes and GenBank, we extend our queries for homologs of Ecp2 to the fungal kingdom and beyond, and further study their history of diversification. Our analyses show that Ecp2 homologs are members of an ancient and widely distributed superfamily of putative fungal effectors, which we term Hce2 for Homologs of C. fulvum Ecp2. Molecular evolutionary analyses show that the superfamily originated and diversified within the fungal kingdom, experiencing multiple lineage-specific expansions and losses that are consistent with the birth-and-death model of gene family evolution. Newly formed paralogs appear to be subject to diversification early after gene duplication events, whereas at later stages purifying selection acts to preserve diversity and the newly evolved putative functions. Some members of the Hce2 superfamily are fused to fungal Glycoside Hydrolase family 18 chitinases that show high similarity to the Zymocin killer toxin from the dairy yeast Kluyveromyces lactis, suggesting an analogous role in antagonistic interactions. The observed high rates of gene duplication and loss in the Hce2 superfamily, combined with diversification in both sequence and possibly function within and between species, suggest that Hce2s are involved in adaptation to stresses and new ecological niches. Such findings address the need to rationalize effector biology and evolution beyond the perspective of solely host-microbe interactions.


Subjects
Computational Biology/methods, Molecular Evolution, Fungal Proteins/genetics, Multigene Family, Amino Acid Sequence, Fungal Proteins/chemistry, Fungi/classification, Fungi/genetics, Gene Duplication/genetics, Genetic Speciation, Fungal Genome/genetics, Genetic Models, Molecular Sequence Annotation, Molecular Sequence Data, Phylogeny, Protein Tertiary Structure, Species Specificity
3.
Plant J ; 69(3): 475-88, 2012 Feb.
Article in English | MEDLINE | ID: mdl-21967390

ABSTRACT

Sireviruses are one of the three genera of Copia long terminal repeat (LTR) retrotransposons. They are exclusive to and highly abundant in plants, and have a genome structure that is unique among retrotransposons. Yet, perhaps due to the few references to the Sirevirus origin of some families, compounded by the difficulty in correctly assigning retrotransposon families to genera, Sireviruses have hardly featured in recent research. As a result, analysis at this key level of classification and details of their colonization of and impact on plant genomes are currently lacking. Recently, however, it became possible to accurately assign elements from diverse families to this genus in one step, based on highly conserved sequence motifs. Hence, Sirevirus dynamics in the relatively obese maize genome can now be comprehensively studied. Overall, we identified >10 600 intact and approximately 28 000 degenerate Sirevirus elements from a plethora of families, some brought into the genus for the first time. Sireviruses make up approximately 90% of the Copia population, and Sirevirus is the only genus that has successfully infiltrated the genome, possibly by experiencing intense amplification during the last 600 000 years while being constantly recycled by host mechanisms. They accumulate in chromosome-distal gene-rich areas, where they insert in between gene islands, mainly in preferred zones within their own genomes. Sirevirus LTRs are heavily methylated, and there is evidence for a palindromic consensus target sequence. This work brings Sireviruses into the spotlight, elucidating their lifestyle and history, and suggesting their crucial role in the current genomic make-up of maize, and possibly other plant hosts.


Subjects
Molecular Evolution, Plant Genome, Retroelements, Zea mays/genetics, Algorithms, DNA Methylation, Plant DNA/genetics, Phylogeny, DNA Sequence Analysis
4.
Plant Physiol ; 155(1): 271-81, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21098674

ABSTRACT

Although Arabidopsis (Arabidopsis thaliana) is the best studied plant species, the biological role of one-third of its proteins is still unknown. We developed a probabilistic protein function prediction method that integrates information from sequences, protein-protein interactions, and gene expression. The method was applied to proteins from Arabidopsis. Evaluation of prediction performance showed that our method has improved performance compared with single source-based prediction approaches and two existing integration approaches. An innovative feature of our method is that it enables transfer of functional information between proteins that are not directly associated with each other. We provide novel function predictions for 5,807 proteins. Recent experimental studies confirmed several of the predictions. We highlight these in detail for proteins predicted to be involved in flowering and floral organ development.


Subjects
Arabidopsis Proteins/genetics, Arabidopsis Proteins/metabolism, Arabidopsis/genetics, Computational Biology/methods, Genetic Databases, Plant Genome/genetics, Animals, Area Under Curve, Bayes Theorem, Flowers/embryology, Flowers/genetics, Markov Chains, Genetic Models, Molecular Sequence Annotation, Organogenesis/genetics, Reproducibility of Results
5.
PLoS One ; 5(12): e14147, 2010 Dec 20.
Article in English | MEDLINE | ID: mdl-21188141

ABSTRACT

A major challenge in the field of systems biology consists of predicting gene regulatory networks based on different training data. Within the DREAM4 initiative, we took part in the multifactorial sub-challenge that aimed to predict gene regulatory networks of size 100 from training data consisting of steady-state levels obtained after applying multifactorial perturbations to the original in silico network. Due to the static character of the challenge data, we tackled the problem via a sparse Gaussian Markov Random Field, which relates network topology to the inverse covariance of the gene measurements. For the computations, we used the Graphical Lasso algorithm, which provided a large range of candidate network topologies. The main task was to select the optimal network topology, and for that, different model selection criteria were explored. The selected networks were compared with the gold standards and the results ranked using the scoring metrics applied in the challenge, giving better insight into our submission and the way to improve it. Our approach provides an easy statistical and computational framework to infer gene regulatory networks that is suitable for large networks, even if the number of the observations (perturbations) is greater than the number of variables (genes).
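
The link between a Gaussian Markov Random Field and network topology can be sketched as follows: a zero off-diagonal entry in the inverse covariance (precision) matrix means that two genes are conditionally independent given all the others, so no edge is drawn between them. The example assumes a sparse precision matrix has already been estimated (for instance by a Graphical Lasso implementation); the matrix values, function names, and tolerance are hypothetical and not taken from the paper.

```python
import math

# Minimal sketch: read a network topology off an already estimated precision
# matrix. The estimation step itself (Graphical Lasso) is not shown here.


def edges_from_precision(precision, tol=1e-8):
    """Return undirected edges (i, j), i < j, where the precision entry is non-zero."""
    n = len(precision)
    return [(i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if abs(precision[i][j]) > tol]


def partial_correlations(precision):
    """Partial correlation rho_ij = -theta_ij / sqrt(theta_ii * theta_jj)."""
    n = len(precision)
    rho = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                rho[i][j] = -precision[i][j] / math.sqrt(
                    precision[i][i] * precision[j][j])
    return rho


# Hypothetical precision matrix for three genes arranged in a chain 0 - 1 - 2.
theta = [[2.0, -1.0, 0.0],
         [-1.0, 2.0, -1.0],
         [0.0, -1.0, 2.0]]

print(edges_from_precision(theta))        # edges implied by the non-zero entries
print(partial_correlations(theta)[0][1])  # strength of the 0 - 1 association
```

In this toy case genes 0 and 2 are correlated only through gene 1, which is why the entry linking them is zero and no direct edge is inferred.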


Subjects
Gene Regulatory Networks, Algorithms, Bayes Theorem, Computational Biology/methods, Gene Expression Profiling, Markov Chains, Statistical Models, Multivariate Analysis, Normal Distribution, Reproducibility of Results
6.
PLoS One ; 5(2): e9293, 2010 Feb 24.
Article in English | MEDLINE | ID: mdl-20195360

ABSTRACT

Inference of protein functions is one of the most important aims of modern biology. To fully exploit the large volumes of genomic data typically produced in modern-day genomic experiments, automated computational methods for protein function prediction are urgently needed. Established methods use sequence or structure similarity to infer functions, but those types of data do not suffice to determine the biological context in which proteins act. Current high-throughput biological experiments produce large amounts of data on the interactions between proteins. Such data can be used to infer interaction networks and to predict the biological process that a protein is involved in. Here, we develop a probabilistic approach for protein function prediction using network data, such as protein-protein interaction measurements. We take a Bayesian approach to an existing Markov Random Field method by performing simultaneous estimation of the model parameters and prediction of protein functions. We use an adaptive Markov Chain Monte Carlo algorithm that leads to more accurate parameter estimates and consequently to improved prediction performance compared to the standard Markov Random Field method. We tested our method using a high-quality S. cerevisiae validation network with 1622 proteins against 90 Gene Ontology terms of different levels of abstraction. Compared to three other protein function prediction methods, our approach shows very good prediction performance. Our method can be directly applied to protein-protein interaction or coexpression networks, but can also be extended to use multiple data sources. We apply our method to physical protein interaction data from S. cerevisiae and provide novel predictions, using 340 Gene Ontology terms, for 1170 unannotated proteins, and we evaluate the predictions using the available literature.
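
To make the Markov Random Field idea concrete, the sketch below shows one common auto-logistic formulation in which the probability that a protein has a function depends on how many of its interaction partners do or do not have it. The exact parameterization used in the paper is not given in the abstract, so the functional form, the parameter names, and the values are assumptions chosen for illustration.

```python
import math

# Minimal sketch of an auto-logistic conditional, assuming the form
# logit P = alpha + beta_with * n_with + beta_without * n_without.
# This is an illustrative assumption, not the paper's stated model.


def prob_has_function(n_with, n_without, alpha, beta_with, beta_without):
    """Conditional probability that a protein has the function, given its neighbours.

    n_with / n_without count interaction partners with / without the function.
    """
    score = alpha + beta_with * n_with + beta_without * n_without
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical parameter values; a Bayesian treatment would sample these
# jointly with the unknown labels instead of fixing them in advance.
params = dict(alpha=-1.0, beta_with=0.8, beta_without=-0.3)

p_supported = prob_has_function(3, 0, **params)    # three partners with the function
p_unsupported = prob_has_function(0, 3, **params)  # three partners without it
print(p_supported, p_unsupported)
```

In a sampling scheme, a conditional of this kind would be evaluated repeatedly for each unannotated protein while the parameters are updated, which is the simultaneous estimation and prediction the abstract refers to.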


Subjects
Bayes Theorem, Markov Chains, Protein Interaction Mapping/methods, Proteins/metabolism, Algorithms, Protein Databases, Gene Regulatory Networks, Monte Carlo Method, Protein Binding, Proteins/genetics, Reproducibility of Results, Saccharomyces cerevisiae Proteins/genetics, Saccharomyces cerevisiae Proteins/metabolism
7.
In Silico Biol ; 7(6): 575-82, 2007.
Article in English | MEDLINE | ID: mdl-18467770

ABSTRACT

The Gene Ontology (GO) is a widely used controlled vocabulary for the description of gene function. In this study we quantify the usage of multiple and hierarchically independent GO terms in the curated genome annotations of seven well-studied species. In most genomes, significant proportions (6-60%) of genes have been annotated with multiple and hierarchically independent terms. This may be necessary to attain adequate specificity of description. One noticeable exception is Arabidopsis thaliana, in which genes are much less frequently annotated with multiple terms (6-14%). In contrast, an analysis of the occurrence of InterPro hits in the proteomes of the seven species, followed by a mapping of the hits to GO terms, did not reveal an aberrant pattern for the A. thaliana genome. This study shows the widespread usage of multiple hierarchically independent GO terms in the functional annotation of genes. Consequently, probabilistic methods that aim to predict gene function automatically through integration of diverse genomic datasets, and that employ the GO, must be able to predict such multiple terms. We attribute the low frequency with which multiple GO terms are used in Arabidopsis to differing practices in the genome annotation and curation process between communities of annotators. This may bias genome-scale comparisons of gene function between different species. GO term assignment should therefore be performed according to strictly similar rules and standards.
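
One way to count genes annotated with multiple hierarchically independent terms is sketched below. It assumes that two terms are hierarchically independent when neither is an ancestor of the other in the GO graph; the abstract does not spell out the exact definition used in the study. The term identifiers and the tiny parent map are hypothetical stand-ins for real GO data.

```python
# Minimal sketch, assuming "hierarchically independent" means that neither
# term is an ancestor of the other. Term names below are placeholders.


def ancestors(term, parents):
    """All terms reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def independent_terms(terms, parents):
    """Drop every annotated term that is already implied by a more specific one."""
    terms = set(terms)
    implied = set()
    for term in terms:
        implied |= ancestors(term, parents)
    return terms - implied


def has_multiple_independent_terms(terms, parents):
    """True when a gene keeps more than one term after removing implied ancestors."""
    return len(independent_terms(terms, parents)) > 1


# Hypothetical ontology: A is the root, B and C are children of A, D is a child of B.
parents = {"B": ["A"], "C": ["A"], "D": ["B"]}

print(has_multiple_independent_terms({"B", "D"}, parents))  # B is an ancestor of D
print(has_multiple_independent_terms({"C", "D"}, parents))  # C and D are unrelated
```

Applying such a check to every gene in a curated annotation set and dividing by the number of annotated genes would give a proportion comparable in spirit to the percentages quoted in the abstract.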


Subjects
Gene Expression Regulation, Genes/physiology, Genome, Genetic Models, Animals, Arabidopsis/genetics, Plant Genome, Humans