1.
Brief Bioinform ; 21(2): 458-472, 2020 03 23.
Article in English | MEDLINE | ID: mdl-30698641

ABSTRACT

There are multiple definitions for low complexity regions (LCRs) in protein sequences, with all of them broadly considering LCRs as regions with fewer amino acid types compared to an average composition. Following this view, LCRs can also be defined as regions showing composition bias. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, and more generally the overlaps between different properties related to LCRs, using examples. We argue that statistical measures alone cannot capture all structural aspects of LCRs and recommend the combined usage of a variety of predictive tools and measurements. While the methodologies available to study LCRs are already very advanced, we foresee that a more comprehensive annotation of sequences in the databases will enable the improvement of predictions and a better understanding of the evolution and the connection between structure and function of LCRs. This will require the use of standards for the generation and exchange of data describing all aspects of LCRs. SHORT ABSTRACT: There are multiple definitions for low complexity regions (LCRs) in protein sequences. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, plus overlaps between different properties related to LCRs, using examples.
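As an illustration of the kind of statistical measure the abstract refers to, the following minimal sketch scores sliding windows of a protein sequence by the Shannon entropy of their residue composition and reports windows at or below a cutoff. It is not the method of any tool covered by the review; the function names, the window length and the threshold are assumptions chosen only for the example.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a segment."""
    counts = Counter(segment)
    total = len(segment)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def low_complexity_windows(sequence: str, window: int = 12, threshold: float = 2.2):
    """Return (start, entropy) for every window with entropy <= threshold.

    The window length and threshold are illustrative defaults (assumptions),
    not values taken from the review.
    """
    hits = []
    for start in range(len(sequence) - window + 1):
        entropy = shannon_entropy(sequence[start:start + window])
        if entropy <= threshold:
            hits.append((start, entropy))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence: a poly-Q stretch followed by a diverse segment.
    example = "QQQQQQQQQQQQ" + "ACDEFGHIKLMN"
    for start, entropy in low_complexity_windows(example, threshold=1.0):
        print(start, round(entropy, 3))
```

A composition-only score like this cannot separate a disordered homopolymeric tract from an ordered repeat, which is the limitation the abstract points out when it recommends combining several tools.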


Subjects
Proteins/chemistry , Algorithms , Amino Acid Sequence , Databases, Protein , Evolution, Molecular , Protein Conformation , Protein Domains
2.
Int J Mol Sci ; 23(24), 2022 Dec 14.
Article in English | MEDLINE | ID: mdl-36555557

ABSTRACT

Several types of haemoglobinopathies are caused by copy number variants (CNVs). While diagnosis is often based on haematological and biochemical parameters, a definitive diagnosis requires molecular DNA analysis. In some cases, the molecular characterisation of large deletions/duplications is challenging and inconclusive and often requires the use of specific diagnostic procedures, such as multiplex ligation-dependent probe amplification (MLPA). Herein, we collected and comprehensively analysed all known CNVs associated with haemoglobinopathies. The dataset of 291 CNVs was retrieved from the IthaGenes database and was further manually annotated to specify genomic locations, breakpoints and MLPA probes relevant for each CNV. We developed IthaCNVs, a publicly available and easy-to-use online tool that can facilitate the diagnosis of rare and diagnostically challenging haemoglobinopathy cases attributed to CNVs. Importantly, it facilitates the filtering of available entries based on the type of breakpoint information, on specific chromosomal and locus positions, on MLPA probes, and on affected gene(s). IthaCNVs brings together manually curated information about CNV genomic locations, functional effects, and information that can facilitate CNV characterisation through MLPA. It can help laboratory staff and clinicians confirm a suspected diagnosis of CNVs based on molecular DNA screening and analysis.


Subjects
DNA Copy Number Variations , Genome , Humans , DNA Copy Number Variations/genetics , Multiplex Polymerase Chain Reaction/methods , DNA , Genomics
3.
Glycobiology ; 29(5): 385-396, 2019 05 01.
Article in English | MEDLINE | ID: mdl-30835280

ABSTRACT

Despite the controversy regarding the importance of protein N-linked glycosylation in species of the genus Plasmodium, genes potentially encoding core subunits of the oligosaccharyltransferase (OST) complex have already been characterized in completely sequenced genomes of malaria parasites. Nevertheless, the currently established notion is that only four out of eight subunits of the OST complex, which is considered conserved across eukaryotes, are present in Plasmodium species. In this study, we carefully conduct computational analysis to provide unequivocal evidence that all components of the OST complex, with the exception of Swp1/Ribophorin II, can be reliably identified within completely sequenced plasmodial genomes. In fact, most of the subunits currently considered absent from Plasmodium correspond to uncharacterized protein sequences already present in sequence databases. Interestingly, the main reason why the unusually short Ost4 subunit (36 residues long in yeast) has not been identified so far in plasmodia (and possibly other species) is the failure of gene-prediction pipelines to detect such a short coding sequence. We further identify elusive OST subunits in select protist species with completely sequenced genomes. Thus, our work highlights the necessity of a systematic approach towards the characterization of OST subunits across eukaryotes. This is necessary both for obtaining a concrete picture of the evolution of the OST complex and for elucidating its possible role in eukaryotic pathogens.


Subjects
Computational Biology , Hexosyltransferases/metabolism , Membrane Proteins/metabolism , Plasmodium/enzymology , Animals , Databases, Protein , Drosophila melanogaster , Eukaryota/metabolism , Humans , Mice
4.
Elife ; 11, 2022 12 01.
Article in English | MEDLINE | ID: mdl-36453528

ABSTRACT

Haemoglobinopathies are the commonest monogenic diseases worldwide and are caused by variants in the globin gene clusters. With over 2400 variants detected to date, their interpretation using the American College of Medical Genetics and Genomics (ACMG)/Association for Molecular Pathology (AMP) guidelines is challenging and computational evidence can provide valuable input about their functional annotation. While many in silico predictors have already been developed, their performance varies for different genes and diseases. In this study, we evaluate 31 in silico predictors using a dataset of 1627 variants in HBA1, HBA2, and HBB. By varying the decision threshold for each tool, we analyse their performance (a) as binary classifiers of pathogenicity and (b) by using different non-overlapping pathogenic and benign thresholds for their optimal use in the ACMG/AMP framework. Our results show that CADD, Eigen-PC, and REVEL are the overall top performers, with the former reaching a moderate strength level for pathogenic prediction. Eigen-PC and REVEL achieve the highest accuracies for missense variants, while CADD is also a reliable predictor of non-missense variants. Moreover, SpliceAI is the top-performing splicing predictor, reaching a strong level of evidence, while GERP++ and phyloP are the most accurate conservation tools. This study provides evidence about the optimal use of computational tools in globin gene clusters under the ACMG/AMP framework.
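To make the idea of varying a decision threshold concrete, the sketch below treats a predictor's score as a binary classifier of pathogenicity and reports sensitivity and specificity at each candidate threshold. It is only an illustration under stated assumptions: the function names, the scores and the labels are hypothetical, higher scores are assumed to mean more pathogenic, and nothing here reproduces the study's dataset or its ACMG/AMP evidence-strength calculation.

```python
def confusion_at(scores, labels, threshold):
    """Count (TP, FP, TN, FN) when scores >= threshold are called pathogenic."""
    tp = fp = tn = fn = 0
    for score, is_pathogenic in zip(scores, labels):
        called_pathogenic = score >= threshold
        if called_pathogenic and is_pathogenic:
            tp += 1
        elif called_pathogenic and not is_pathogenic:
            fp += 1
        elif not called_pathogenic and is_pathogenic:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sweep(scores, labels, thresholds):
    """Return (threshold, sensitivity, specificity) for each threshold."""
    rows = []
    for t in thresholds:
        tp, fp, tn, fn = confusion_at(scores, labels, t)
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        rows.append((t, sensitivity, specificity))
    return rows


if __name__ == "__main__":
    # Hypothetical scores and labels (True = pathogenic), for illustration only.
    scores = [0.1, 0.4, 0.6, 0.9]
    labels = [False, False, True, True]
    for row in sweep(scores, labels, [0.0, 0.5, 1.0]):
        print(row)
```

Using two separate, non-overlapping cutoffs (one for pathogenic calls, one for benign calls) would leave intermediate scores unclassified, which corresponds to analysis (b) in the abstract.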


Subjects
Genomics , Nucleotides , Humans , Pathology, Molecular , Universities
5.
Sci Rep ; 10(1): 9505, 2020 06 11.
Article in English | MEDLINE | ID: mdl-32528034

ABSTRACT

To assess the role of core metabolism genes in bacterial virulence, independently of their effect on growth, we correlated the genome, the transcriptome and the pathogenicity in flies and mice of 30 fully sequenced Pseudomonas strains. Gene presence correlates robustly with pathogenicity differences among all Pseudomonas species, but not among the P. aeruginosa strains. However, gene expression differences are evident between highly and lowly pathogenic P. aeruginosa strains in multiple virulence factors and a few metabolism genes. Moreover, a noticeable fraction (16.5%) of the core metabolism genes of P. aeruginosa strain PA14, compared to 8.5% of the non-metabolic genes tested, appear necessary for full virulence when mutated. Most of these virulence-defective core metabolism mutants are compromised in at least one key virulence mechanism independently of auxotrophy. A pathway-level analysis of PA14 core metabolism uncovers beta-oxidation and the biosynthesis of amino acids, succinate, citramalate, and chorismate to be important for full virulence. Strikingly, the relative expression among P. aeruginosa strains of genes belonging to these metabolic pathways is indicative of their pathogenicity. Thus, P. aeruginosa strain-to-strain virulence variation remains largely obscure at the genome level, but can be dissected at the pathway level via functional transcriptomics of core metabolism.


Subjects
Pseudomonas aeruginosa/metabolism , Pseudomonas aeruginosa/pathogenicity , Animals , Gene Expression Regulation, Bacterial , Genes, Bacterial/genetics , Host-Pathogen Interactions , Male , Mutation , Pseudomonas aeruginosa/genetics , Pseudomonas aeruginosa/growth & development , Virulence
6.
J Clin Med ; 8(11), 2019 Nov 09.
Article in English | MEDLINE | ID: mdl-31717530

ABSTRACT

Haemoglobinopathies are common monogenic disorders with diverse clinical manifestations, partly attributed to the influence of modifier genes. Recent years have seen enormous growth in the amount of genetic data, instigating the need for ranking methods to identify candidate genes with strong modifying effects. Here, we present the first evidence-based gene ranking metric (IthaScore) for haemoglobinopathy-specific phenotypes by utilising curated data in the IthaGenes database. IthaScore successfully reflects current knowledge for well-established disease modifiers, while it can be dynamically updated with emerging evidence. Protein-protein interaction (PPI) network analysis and functional enrichment analysis were employed to identify new potential disease modifiers and to evaluate the biological profiles of selected phenotypes. The most relevant gene ontology (GO) and pathway gene annotations for (a) haemoglobin (Hb) F levels/Hb F response to hydroxyurea included urea cycle, arginine metabolism and vascular endothelial growth factor receptor (VEGFR) signalling, (b) response to iron chelators included xenobiotic metabolism and glucuronidation, and (c) stroke included cytokine signalling and inflammatory reactions. Our findings demonstrate the capacity of IthaGenes, together with dynamic gene ranking, to expand knowledge on the genetic and molecular basis of phenotypic variation in haemoglobinopathies and to identify additional candidate genes to potentially inform and improve diagnosis, prognosis and therapeutic management.
