1.
EBioMedicine ; 103: 105111, 2024 May.
Article in English | MEDLINE | ID: mdl-38583260

ABSTRACT

BACKGROUND: Lynch syndrome (LS) is one of the most common hereditary cancer syndromes worldwide. Dominantly inherited mutation in one of four DNA mismatch repair genes combined with somatic events leads to mismatch repair deficiency and microsatellite instability (MSI) in tumours. Due to a high lifetime risk of cancer, regular surveillance plays a key role in cancer prevention; yet the observation of frequent interval cancers points to insufficient cancer prevention by colonoscopy-based methods alone. This study aimed to identify precancerous functional changes in colonic mucosa that could facilitate the monitoring and prevention of cancer development in LS. METHODS: The study material comprised colon biopsy specimens (n = 71) collected during colonoscopy examinations from LS carriers (tumour-free, or diagnosed with adenoma, or diagnosed with carcinoma) and a control group, which included sporadic cases without LS or neoplasia. The majority (80%) of LS carriers had an inherited genetic MLH1 mutation. The remaining 20% included MSH2 mutation carriers (13%) and MSH6 mutation carriers (7%). The transcriptomes were first analysed with RNA-sequencing and followed up with Gorilla Ontology analysis and Reactome Knowledgebase and Ingenuity Pathway Analyses to detect functional changes that might be associated with the initiation of the neoplastic process in LS individuals. FINDINGS: With pathway and gene ontology analyses combined with measurement of mitotic perimeters from colonic mucosa and tumours, we found an increased tendency to chromosomal instability (CIN), already present in macroscopically normal LS mucosa. Our results suggest that CIN is an earlier aberration than MSI and may be the initial cancer driving aberration, whereas MSI accelerates tumour formation. Furthermore, our results suggest that MLH1 deficiency plays a significant role in the development of CIN. 
INTERPRETATION: The results validate our previous findings from mice and highlight early mitotic abnormalities as an important contributor and precancerous marker of colorectal tumourigenesis in LS. FUNDING: This work was supported by grants from the Jane and Aatos Erkko Foundation, the Academy of Finland (330606 and 331284), Cancer Foundation Finland sr, and the Sigrid Jusélius Foundation. Open access is funded by Helsinki University Library.


Subject(s)
Colorectal Neoplasms, Hereditary Nonpolyposis , Microsatellite Instability , Mitosis , Humans , Colorectal Neoplasms, Hereditary Nonpolyposis/genetics , Colorectal Neoplasms, Hereditary Nonpolyposis/pathology , Colorectal Neoplasms, Hereditary Nonpolyposis/complications , Female , Male , Mitosis/genetics , Middle Aged , Mutation , Adult , Aged , MutL Protein Homolog 1/genetics , Gene Expression Profiling , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Colorectal Neoplasms/etiology , Carcinogenesis/genetics , DNA Mismatch Repair/genetics , Transcriptome
2.
Mol Cell ; 83(18): 3360-3376.e11, 2023 09 21.
Article in English | MEDLINE | ID: mdl-37699397

ABSTRACT

Aging is associated with progressive phenotypic changes. Virtually all cellular phenotypes are produced by proteins, and their structural alterations can lead to age-related diseases. However, we still lack comprehensive knowledge of proteins undergoing structural-functional changes during cellular aging and their contributions to age-related phenotypes. Here, we conducted proteome-wide analysis of early age-related protein structural changes in budding yeast using limited proteolysis-mass spectrometry (LiP-MS). The results, compiled in online ProtAge catalog, unraveled age-related functional changes in regulators of translation, protein folding, and amino acid metabolism. Mechanistically, we found that folded glutamate synthase Glt1 polymerizes into supramolecular self-assemblies during aging, causing breakdown of cellular amino acid homeostasis. Inhibiting Glt1 polymerization by mutating the polymerization interface restored amino acid levels in aged cells, attenuated mitochondrial dysfunction, and led to lifespan extension. Altogether, this comprehensive map of protein structural changes enables identifying mechanisms of age-related phenotypes and offers opportunities for their reversal.


Subject(s)
Cellular Senescence , Longevity , Longevity/genetics , Polymerization , Amino Acids
3.
Nature ; 615(7953): 652-659, 2023 03.
Article in English | MEDLINE | ID: mdl-36890232

ABSTRACT

Increasing the proportion of locally produced plant protein in currently meat-rich diets could substantially reduce greenhouse gas emissions and loss of biodiversity [1]. However, plant protein production is hampered by the lack of a cool-season legume equivalent to soybean in agronomic value [2]. Faba bean (Vicia faba L.) has a high yield potential and is well suited for cultivation in temperate regions, but genomic resources are scarce. Here, we report a high-quality chromosome-scale assembly of the faba bean genome and show that it has expanded to a massive 13 Gb in size through an imbalance between the rates of amplification and elimination of retrotransposons and satellite repeats. Genes and recombination events are evenly dispersed across chromosomes and the gene space is remarkably compact considering the genome size, although with substantial copy number variation driven by tandem duplication. Demonstrating practical application of the genome sequence, we develop a targeted genotyping assay and use high-resolution genome-wide association analysis to dissect the genetic basis of seed size and hilum colour. The resources presented constitute a genomics-based breeding platform for faba bean, enabling breeders and geneticists to accelerate the improvement of sustainable protein production across the Mediterranean, subtropical and northern temperate agroecological zones.


Subject(s)
Crops, Agricultural , Diploidy , Genetic Variation , Genome, Plant , Genomics , Plant Breeding , Plant Proteins , Vicia faba , Chromosomes, Plant/genetics , Crops, Agricultural/genetics , Crops, Agricultural/metabolism , DNA Copy Number Variations/genetics , DNA, Satellite/genetics , Gene Amplification/genetics , Genes, Plant/genetics , Genetic Variation/genetics , Genome, Plant/genetics , Genome-Wide Association Study , Geography , Plant Breeding/methods , Plant Proteins/genetics , Plant Proteins/metabolism , Recombination, Genetic , Retroelements/genetics , Seeds/anatomy & histology , Seeds/genetics , Vicia faba/anatomy & histology , Vicia faba/genetics , Vicia faba/metabolism
4.
Curr Biol ; 33(6): 1009-1018.e7, 2023 03 27.
Article in English | MEDLINE | ID: mdl-36822202

ABSTRACT

In the face of the human-caused biodiversity crisis, understanding the theoretical basis of conservation efforts of endangered species and populations has become increasingly important. According to population genetics theory, population subdivision helps organisms retain genetic diversity, crucial for adaptation in a changing environment. Habitat topography is thought to be important for generating and maintaining population subdivision, but empirical cases are needed to test this assumption. We studied Saimaa ringed seals, landlocked in a labyrinthine lake and recovering from a drastic bottleneck, with additional samples from three other ringed seal subspecies. Using whole-genome sequences of 145 seals, we analyzed the distribution of variation and genetic relatedness among the individuals in relation to the habitat shape. Despite a severe history of genetic bottlenecks with prevalent homozygosity in Saimaa ringed seals, we found evidence for the population structure mirroring the subregions of the lake. Our genome-wide analyses showed that the subpopulations had retained unique variation and largely complementary patterns of homozygosity, highlighting the significance of habitat connectivity in conservation biology and the power of genomic tools in understanding its impact. The central role of the population substructure in preserving genetic diversity at the metapopulation level was confirmed by simulations. Integration of genetic analyses in conservation decisions gives hope to Saimaa ringed seals and other endangered species in fragmented habitats.


Subject(s)
Caniformia , Seals, Earless , Animals , Humans , Genome-Wide Association Study , Genetics, Population , Ecosystem , Seals, Earless/genetics , Endangered Species , Caniformia/genetics , Genetic Variation
5.
Protein Sci ; 32(1): e4519, 2023 01.
Article in English | MEDLINE | ID: mdl-36419248

ABSTRACT

Structural comparison reveals remote homology that often fails to be detected by sequence comparison. The DALI web server (http://ekhidna2.biocenter.helsinki.fi/dali) is a platform for structural analysis that provides database searches and interactive visualization, including structural alignments annotated with secondary structure, protein families and sequence logos, and 3D structure superimposition supported by color-coded sequence and structure conservation. Here, we are using DALI to mine the AlphaFold Database version 1, which increased the structural coverage of protein families by 20%. We found 100 remote homologous relationships hitherto unreported in the current reference database for protein domains, Pfam 35.0. In particular, we linked 35 domains of unknown function (DUFs) to the previously characterized families, generating a functional hypothesis that can be explored downstream in structural biology studies. Other findings include gene fusions, tandem duplications, and adjustments to domain boundaries. The evidence for homology can be browsed interactively through live examples on DALI's website.


Subject(s)
Proteins , Databases, Protein , Sequence Alignment , Proteins/chemistry , Protein Domains , Protein Structure, Secondary
6.
PLoS Comput Biol ; 18(6): e1010249, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35679225

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pcbi.1007419.].

7.
Nucleic Acids Res ; 50(W1): W210-W215, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35610055

ABSTRACT

Protein structure is key to understanding biological function. Structure comparison deciphers deep phylogenies, providing insight into functional conservation and functional shifts during evolution. Until recently, structural coverage of the protein universe was limited by the cost and labour involved in experimental structure determination. Recent breakthroughs in deep learning revolutionized structural bioinformatics by providing accurate structural models of numerous protein families for which no structural information existed. The Dali server for 3D protein structure comparison is widely used by crystallographers to relate new structures to pre-existing ones. Here, we report two most recent upgrades to the web server: (i) the foldomes of key organisms in the AlphaFold Database (version 1) are searchable by Dali, (ii) structural alignments are annotated with protein families. Using these new features, we discovered a novel functionally diverse subgroup within the WRKY/GCM1 clan. This was accomplished by linking the structurally characterized SWI/SNF and NAM families as well as the structural models of the CG-1 family and uncharacterized proteins to the structure of Gti1/Pac2, a previously known member of the WRKY/GCM1 clan. The Dali server is available at http://ekhidna2.biocenter.helsinki.fi/dali. This website is free and open to all users and there is no login requirement.


Subject(s)
Databases, Protein , Proteins , Software , Computers , Internet , Proteins/chemistry , Protein Conformation
8.
Protein Sci ; 31(1): 118-128, 2022 01.
Article in English | MEDLINE | ID: mdl-34562305

ABSTRACT

The facility of next-generation sequencing has led to an explosion of gene catalogs for novel genomes, transcriptomes and metagenomes, which are functionally uncharacterized. Computational inference has emerged as a necessary substitute for first-hand experimental evidence. PANNZER (Protein ANNotation with Z-scoRE) is a high-throughput functional annotation web server that stands out among similar publicly accessible web servers in supporting submission of up to 100,000 protein sequences at once and providing both Gene Ontology (GO) annotations and free text description predictions. Here, we demonstrate the use of PANNZER and discuss future plans and challenges. We present two case studies to illustrate problems related to data quality and method evaluation. Some commonly used evaluation metrics and evaluation datasets promote methods that favor unspecific and broad functional classes over more informative and specific classes. We argue that this can bias the development of automated function prediction methods. The PANNZER web server and source code are available at http://ekhidna2.biocenter.helsinki.fi/sanspanz/.


Subject(s)
Algorithms , Computational Biology , Databases, Protein , Molecular Sequence Annotation , Proteins , Software , Proteins/chemistry , Proteins/genetics
10.
Biosci Rep ; 40(7)2020 07 31.
Article in English | MEDLINE | ID: mdl-32583859

ABSTRACT

Smoking as a major risk factor for morbidity affects numerous regulatory systems of the human body including DNA methylation. Most of the previous studies with genome-wide methylation data are based on conventional association analysis and earliest threshold-based gene set analysis that lacks sensitivity to be able to reveal all the relevant effects of smoking. The aim of the present study was to investigate the impact of active smoking on DNA methylation at three biological levels: 5'-C-phosphate-G-3' (CpG) sites, genes and functionally related genes (gene sets). Gene set analysis was done with mGSZ, a modern threshold-free method previously developed by us that utilizes all the genes in the experiment and their differential methylation scores. Application of such method in DNA methylation study is novel. Epigenome-wide methylation levels were profiled from Young Finns Study (YFS) participants' whole blood from 2011 follow-up using Illumina Infinium HumanMethylation450 BeadChips. We identified three novel smoking related CpG sites and replicated 57 of the previously identified ones. We found that smoking is associated with hypomethylation in shore (genomic regions 0-2 kilobases from CpG island). We identified smoking related methylation changes in 13 gene sets with false discovery rate (FDR) ≤ 0.05, among which is olfactory receptor activity, the flagship novel finding of the present study. Overall, we extended the current knowledge by identifying: (i) three novel smoking related CpG sites, (ii) similar effects as aging on average methylation in shore, and (iii) a novel finding that olfactory receptor activity pathway responds to tobacco smoke and toxin exposure through epigenetic mechanisms.


Subject(s)
Cigarette Smoking/adverse effects , DNA Methylation , Epigenesis, Genetic , Adult , Aging/genetics , Cigarette Smoking/blood , Cigarette Smoking/genetics , CpG Islands/genetics , Epigenome/genetics , Female , Finland , Follow-Up Studies , Genome-Wide Association Study , Humans , Longitudinal Studies , Male , Middle Aged , Non-Smokers , Prospective Studies , Receptors, Odorant/metabolism , Signal Transduction/genetics , Smell/genetics , Smoke/adverse effects , Smokers , Nicotiana/adverse effects
11.
Methods Mol Biol ; 2112: 29-42, 2020.
Article in English | MEDLINE | ID: mdl-32006276

ABSTRACT

The exponential growth in the number of newly solved protein structures makes correlating and classifying the data an important task. Distance matrix alignment (Dali) is used routinely by crystallographers worldwide to screen the database of known structures for similarity to newly determined structures. Dali is easily accessible through the web server (http://ekhidna.biocenter.helsinki.fi/dali). Alternatively, the program may be downloaded and pairwise comparisons performed locally on Linux computers.


Subject(s)
Protein Conformation , Proteins/chemistry , Sequence Analysis, Protein , Structural Homology, Protein , Algorithms , Databases, Protein , Sequence Alignment , Software
12.
Virus Evol ; 6(2): veaa091, 2020 Jul.
Article in English | MEDLINE | ID: mdl-33408878

ABSTRACT

The study of microbiome data holds great potential for elucidating the biological and metabolic functioning of living organisms and their role in the environment. Metagenomic analyses have shown that humans, along with, for example, domestic animals, wildlife and arthropods, are colonized by an immense community of viruses. The current Coronavirus pandemic (COVID-19) heightens the need to rapidly detect previously unknown viruses in an unbiased way. The increasing availability of metagenomic data in this era of next-generation sequencing (NGS), along with increasingly affordable sequencing technologies, highlights the need for reliable and comprehensive methods to manage such data. In this article, we present a novel bioinformatics pipeline called LAZYPIPE for identifying both previously known and novel viruses in host-associated or environmental samples and give examples of virus discovery based on it. LAZYPIPE is a Unix-based pipeline for automated assembly and taxonomic profiling of NGS libraries, implemented as a collection of C++, Perl, and R scripts.

13.
Protein Sci ; 29(1): 128-140, 2020 01.
Article in English | MEDLINE | ID: mdl-31606894

ABSTRACT

DALI is a popular resource for comparing protein structures. The software is based on distance-matrix alignment. The associated web server provides tools to navigate, integrate and organize some data pushed out by genomics and structural genomics. The server has been running continuously for the past 25 years. Structural biologists routinely use DALI to compare a new structure against previously known protein structures. If significant similarities are discovered, it may indicate a distant homology, that is, that the structures are of shared origin. This may be significant in determining the molecular mechanisms, as these may remain very similar from a distant predecessor to the present day, for example, from the last common ancestor of humans and bacteria. Meta-analysis of independent reference-based evaluations of alignment accuracy and fold discrimination shows DALI at top rank in six out of 12 studies. The web server and standalone software are available from http://ekhidna2.biocenter.helsinki.fi/dali.


Subject(s)
Computational Biology/methods , Proteins/chemistry , Proteins/genetics , Databases, Protein , Humans , Internet , Models, Molecular , Protein Conformation , Protein Folding , Sequence Analysis, Protein , Software , Structural Homology, Protein
14.
PLoS Comput Biol ; 15(11): e1007419, 2019 11.
Article in English | MEDLINE | ID: mdl-31682632

ABSTRACT

Automated protein annotation using the Gene Ontology (GO) plays an important role in the biosciences. Evaluation has always been considered central to developing novel annotation methods, but little attention has been paid to the evaluation metrics themselves. Evaluation metrics define how well an annotation method performs and allow methods to be ranked against one another. Unfortunately, most of these metrics were adopted from the machine learning literature without establishing whether they were appropriate for GO annotations. We propose a novel approach for comparing GO evaluation metrics called Artificial Dilution Series (ADS). Our approach uses existing annotation data to generate a series of annotation sets with different levels of correctness (referred to as their signal level). We calculate the evaluation metric being tested for each annotation set in the series, allowing us to identify whether it can separate different signal levels. Finally, we contrast these results with several false positive annotation sets, which are designed to expose systematic weaknesses in GO assessment. We compared 37 evaluation metrics for GO annotation using ADS and identified drastic differences between metrics. We show that some metrics struggle to differentiate between different signal levels, while others give erroneously high scores to the false positive data sets. Based on our findings, we provide guidelines on which evaluation metrics perform well with the Gene Ontology and propose improvements to several well-known evaluation metrics. In general, we argue that evaluation metrics should be tested for their performance and we provide software for this purpose (https://bitbucket.org/plyusnin/ads/). ADS is applicable to other areas of science where the evaluation of prediction results is non-trivial.


Subject(s)
Computational Biology/methods , Molecular Sequence Annotation/classification , Molecular Sequence Annotation/methods , Algorithms , Benchmarking/methods , Databases, Genetic , Databases, Protein , Gene Ontology/trends , Reproducibility of Results , Software
15.
Bioinformatics ; 35(24): 5326-5327, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31263867

ABSTRACT

MOTIVATION: Protein structure comparison plays a fundamental role in understanding the evolutionary relationships between proteins. Here, we release a new version of the DaliLite standalone software. The novelties are hierarchical search of the structure database organized into sequence based clusters, and remote access to our knowledge base of structural neighbors. The detection of fold, superfamily and family level similarities by DaliLite and state-of-the-art competitors was benchmarked against a manually curated structural classification. RESULTS: Database search strategies were evaluated using Fmax with query-specific thresholds. DaliLite and DeepAlign outperformed TM-score based methods at all levels of the benchmark, and DaliLite outperformed DeepAlign at fold level. Hierarchical and knowledge-based searches got close to the performance of systematic pairwise comparison. The knowledge-based search was four times as efficient as the hierarchical search. The knowledge-based search dynamically adjusts the depth of the search, enabling a trade-off between speed and recall. AVAILABILITY AND IMPLEMENTATION: http://ekhidna2.biocenter.helsinki.fi/dali/README.v5.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Benchmarking , Algorithms , Databases, Factual , Proteins , Sequence Analysis, Protein , Software
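
The Fmax measure mentioned in the abstract above can be illustrated with a small Python sketch. It computes, for a single query, the maximum F-score over all possible score thresholds of a ranked hit list. The function name `f_max` and the input layout (parallel lists of scores and correctness flags) are assumptions made for illustration; the abstract does not say how DaliLite's benchmark represents hits or aggregates per-query values.

```python
def f_max(scores, labels):
    """Maximum F1 over all score thresholds for one query's hit list.

    scores: similarity scores, higher means more similar (hypothetical input)
    labels: True where the hit is correct according to a reference classification
    """
    total_true = sum(labels)
    if total_true == 0:
        return 0.0
    # Rank hits from most to least similar.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    best = 0.0
    true_positives = 0
    for i, (score, is_true) in enumerate(ranked, start=1):
        true_positives += is_true
        # A threshold cannot split tied scores, so evaluate only at the
        # last hit of each group of equal scores.
        if i < len(ranked) and ranked[i][0] == score:
            continue
        precision = true_positives / i
        recall = true_positives / total_true
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

Because the threshold is chosen separately for each hit list, calling `f_max` once per query corresponds to a query-specific threshold in the sense used above.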
16.
R Soc Open Sci ; 5(11): 180903, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30564397

ABSTRACT

An increasing number of mammalian species have been shown to have a history of hybridization and introgression based on genetic analyses. Only relatively few fossils, however, preserve genetic material, and morphology must be used to identify the species and determine whether morphologically intermediate fossils could represent hybrids. Because dental and cranial fossils are typically the key body parts studied in mammalian palaeontology, here we bracket the potential for phenotypically extreme hybridizations by examining uniquely preserved cranio-dental material of a captive hybrid between grey and ringed seals. We analysed how distinct these species are genetically and morphologically, how easy it is to identify the hybrids using morphology and whether comparable hybridizations happen in the wild. We show that the genetic distance between these species is more than twice the modern human-Neanderthal distance, but still within that of morphologically similar species pairs known to hybridize. By contrast, morphological and developmental analyses show grey and ringed seals to be highly disparate, and that the hybrid is a predictable intermediate. Genetic analyses of the parent populations reveal introgression in the wild, suggesting that grey-ringed seal hybridization is not limited to captivity. Taken together, we postulate that there is considerable potential for mammalian hybridization between phenotypically disparate taxa.

17.
BMC Bioinformatics ; 19(1): 278, 2018 07 31.
Article in English | MEDLINE | ID: mdl-30064374

ABSTRACT

BACKGROUND: Protein homology search is an important, yet time-consuming, step in everything from protein annotation to metagenomics. Its application, however, has become increasingly challenging, due to the exponential growth of protein databases. In order to perform homology search at the required scale, many methods have been proposed as alternatives to BLAST that make an explicit trade-off between sensitivity and speed. One such method, SANSparallel, uses a parallel implementation of the suffix array neighbourhood search (SANS) technique to achieve high speed and provides several modes to allow for greater sensitivity at the expense of performance. RESULTS: We present a new approach called asymmetric SANS together with scored seeds and an alternative suffix array ordering scheme called optimal substitution ordering. These techniques dramatically improve both the sensitivity and speed of the SANS approach. Our implementation, TOPAZ, is one of the top performing methods in terms of speed, sensitivity and scalability. In our benchmark, searching UniProtKB for homologous proteins to the Dickeya solani proteome, TOPAZ took less than 3 minutes to achieve a sensitivity of 0.84 compared to BLAST. CONCLUSIONS: Despite the trade-off homology search methods have to make between sensitivity and speed, TOPAZ stands out as one of the most sensitive and highest performance methods currently available.


Subject(s)
Databases, Protein , Software , Algorithms , Amino Acid Sequence , Bacterial Proteins/chemistry , Enterobacteriaceae/metabolism , Sequence Alignment
18.
BMC Bioinformatics ; 19(1): 257, 2018 07 05.
Article in English | MEDLINE | ID: mdl-29976145

ABSTRACT

BACKGROUND: Current high-throughput sequencing platforms provide capacity to sequence multiple samples in parallel. Different samples are labeled by attaching a short sample-specific nucleotide sequence, a barcode, to each DNA molecule prior to pooling them into a mix containing a number of libraries to be sequenced simultaneously. After sequencing, the samples are binned by identifying the barcode sequence within each sequence read. In order to tolerate sequencing errors, barcodes should be sufficiently far apart from each other in sequence space. An additional constraint due to both nucleotide usage and basecalling accuracy is that the proportion of different nucleotides should be in balance in each barcode position. The number of samples to be mixed in each sequencing run may vary, and this introduces the problem of how to select the best subset of the barcodes available at a sequencing core facility for each sequencing run. There are plenty of tools available for de novo barcode design, but they are not suitable for subset selection. RESULTS: We have developed a tool which can be used for three different tasks: 1) selecting an optimal barcode set from a larger set of candidates, 2) checking the compatibility of a user-defined set of barcodes, e.g. whether two or more libraries with existing barcodes can be combined in a single sequencing pool, and 3) augmenting an existing set of barcodes. In our approach the selection process is formulated as a minimization problem. We define the cost function and a set of constraints and use integer programming to solve the resulting combinatorial problem. Based on the desired number of barcodes to be selected and the set of candidate sequences given by the user, the necessary constraints are automatically generated and the optimal solution can be found. The method is implemented in the C programming language and a web interface is available at http://ekhidna2.biocenter.helsinki.fi/barcosel.
CONCLUSIONS: The increasing capacity of sequencing platforms raises the challenge of mixing barcodes. Our method allows the user to select a given number of barcodes from a larger existing barcode set so that sequencing errors are tolerated and the nucleotide balance is optimized. The tool is easy to access via a web browser.


Subject(s)
DNA Barcoding, Taxonomic/methods , High-Throughput Screening Assays/methods , Humans
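
The two constraints described in the abstract above (barcodes far apart in sequence space, nucleotides balanced at each position) can be sketched in Python as follows. This is a toy illustration only: it swaps the integer programming used by the authors for exhaustive search over subsets, which is feasible only for small candidate sets, and the cost function `imbalance`, the use of Hamming distance, and all function names are assumptions rather than the published tool's definitions.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def imbalance(barcodes):
    """Hypothetical cost: per position, the gap between the most and least
    frequent nucleotide, summed over all positions (0 means perfect balance)."""
    total = 0
    for column in zip(*barcodes):
        counts = [column.count(n) for n in "ACGT"]
        total += max(counts) - min(counts)
    return total


def select_barcodes(candidates, k, min_distance):
    """Pick k barcodes whose pairwise Hamming distance is at least
    min_distance and whose imbalance cost is minimal.

    Exhaustive search stands in for integer programming here.
    Returns None when no subset satisfies the distance constraint.
    """
    best, best_cost = None, None
    for subset in combinations(candidates, k):
        if any(hamming(a, b) < min_distance for a, b in combinations(subset, 2)):
            continue
        cost = imbalance(subset)
        if best_cost is None or cost < best_cost:
            best, best_cost = subset, cost
    return best
```

Calling `select_barcodes` with a list of candidate strings, the number of samples in the run, and a minimum distance corresponds to the first of the three tasks listed in the abstract; the compatibility check in the second task amounts to testing a fixed set with `hamming` and `imbalance` directly.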
19.
Nucleic Acids Res ; 46(W1): W84-W88, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29741643

ABSTRACT

The unprecedented growth of high-throughput sequencing has led to an ever-widening annotation gap in protein databases. While computational prediction methods are available to make up the shortfall, a majority of public web servers are hindered by practical limitations and poor performance. Here, we introduce PANNZER2 (Protein ANNotation with Z-scoRE), a fast functional annotation web server that provides both Gene Ontology (GO) annotations and free text description predictions. PANNZER2 uses SANSparallel to perform high-performance homology searches, making bulk annotation based on sequence similarity practical. PANNZER2 can output GO annotations from multiple scoring functions, enabling users to see which predictions are robust across predictors. Finally, PANNZER2 predictions scored within the top 10 methods for molecular function and biological process in the CAFA2 NK-full benchmark. The PANNZER2 web server is updated on a monthly schedule and is accessible at http://ekhidna2.biocenter.helsinki.fi/sanspanz/. The source code is available under the GNU Public Licence v3.


Subject(s)
Computational Biology/trends , Gene Ontology/trends , Internet , Software , Algorithms , Databases, Protein/trends , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation
20.
Nucleic Acids Res ; 46(W1): W479-W485, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29762724

ABSTRACT

We present AAI-profiler, a web server for exploratory analysis and quality control in comparative genomics. AAI-profiler summarizes proteome-wide sequence search results to identify novel species, assess the need for taxonomic reclassification and detect multi-isolate and contaminated samples. AAI-profiler visualises results using a scatterplot that shows the Average Amino-acid Identity (AAI) from the query proteome to all similar species in the sequence database. Taxonomic groups are indicated by colour and marker styles, making outliers easy to spot. AAI-profiler uses SANSparallel to perform high-performance homology searches, making proteome-wide analysis possible. We demonstrate the efficacy of AAI-profiler in the discovery of a close relationship between two bacterial symbionts of an omnivorous pirate bug (Orius) and a thrips (Frankliniella occidentalis), an important pest in agriculture. The symbionts represent novel species within the genus Rosenbergiella, so far described only in floral nectar. AAI-profiler is easy to use: the analysis presented required only two mouse clicks and was completed in a few minutes. AAI-profiler is available at http://ekhidna2.biocenter.helsinki.fi/AAI.


Subject(s)
Bacterial Proteins/genetics , Chlamydia trachomatis/classification , Erwinia/classification , Phylogeny , Proteome/genetics , Software , Amino Acid Sequence , Animals , Bacterial Proteins/classification , Bacterial Proteins/metabolism , Chlamydia trachomatis/genetics , Chlamydia trachomatis/isolation & purification , Erwinia/genetics , Erwinia/isolation & purification , Gene Expression , Genomics/methods , Heteroptera/microbiology , Internet , Molecular Sequence Annotation , Proteome/classification , Proteome/metabolism , Sequence Homology, Amino Acid , Symbiosis/physiology , Thysanoptera/microbiology
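
As a rough illustration of the Average Amino-acid Identity (AAI) summary described in the abstract above, the Python sketch below averages the best-hit identity of each query protein per subject species. The input layout (tuples of query protein, species, percent identity), the best-hit rule, and the function name are all assumptions; the abstract does not specify how AAI-profiler selects or weights hits.

```python
def average_amino_acid_identity(hits):
    """Summarise search hits as per-species AAI.

    hits: iterable of (query_protein, subject_species, percent_identity)
          tuples (hypothetical layout for this sketch).
    Returns {species: (aai, n_proteins)}, where aai is the mean of the
    best identity observed for each query protein in that species.
    """
    # Keep only the best hit per (species, query protein) pair.
    best = {}
    for query, species, identity in hits:
        key = (species, query)
        if key not in best or identity > best[key]:
            best[key] = identity

    # Accumulate the sum and count of best-hit identities per species.
    sums = {}
    for (species, _query), identity in best.items():
        total, n = sums.get(species, (0.0, 0))
        sums[species] = (total + identity, n + 1)

    return {species: (total / n, n) for species, (total, n) in sums.items()}
```

Returning the number of matched proteins alongside the mean lets a caller plot both values per species, in the spirit of the scatterplot mentioned in the abstract; which quantities the server actually places on each axis is not stated there.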