Results 1 - 14 of 14
1.
Am J Hum Genet ; 109(12): 2105-2109, 2022 12 01.
Article in English | MEDLINE | ID: mdl-36459978

ABSTRACT

Synonymous mutations change the DNA sequence of a gene without affecting the amino acid sequence of the encoded protein. Although some synonymous mutations can affect RNA splicing, translational efficiency, and mRNA stability, studies in human genetics, mutagenesis screens, and other experiments and evolutionary analyses have repeatedly shown that most synonymous variants are neutral or only weakly deleterious, with some notable exceptions. Based on a recent study in yeast, there have been claims that synonymous mutations could be as important as nonsynonymous mutations in causing disease, assuming the yeast findings hold up and translate to humans. Here, we argue that there is insufficient evidence to overturn the large, coherent body of knowledge establishing the predominant neutrality of synonymous variants in the human genome.
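The definition in this abstract can be illustrated with a short sketch. It assumes the standard nuclear genetic code and classifies a single codon change purely by whether the encoded amino acid changes; the function name `is_synonymous` is hypothetical, and effects on splicing, translational efficiency, or mRNA stability are deliberately not modelled.

```python
# Standard genetic code, built from the conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def is_synonymous(ref_codon, alt_codon):
    """Return True if both codons encode the same amino acid (or both are stops)."""
    return CODON_TABLE[ref_codon.upper()] == CODON_TABLE[alt_codon.upper()]


print(is_synonymous("CTG", "CTA"))  # True: both codons encode leucine
print(is_synonymous("GAG", "GTG"))  # False: glutamate becomes valine
```

A check like this only captures the protein-level definition; it says nothing about whether a given synonymous variant is neutral.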


Subjects
Biological Evolution , Saccharomyces cerevisiae , Humans , Mutation/genetics , Amino Acid Sequence , Human Genome/genetics
2.
Bioinformatics ; 37(19): 3152-3159, 2021 Oct 11.
Article in English | MEDLINE | ID: mdl-33970232

ABSTRACT

MOTIVATION: The annotation of small open reading frames (smORFs) of <100 codons (<300 nucleotides) is challenging due to the large number of such sequences in the genome. RESULTS: In this study, we developed a computational pipeline, which we have named ORFLine, that stringently identifies smORFs and classifies them according to their position within transcripts. We identified a total of 5744 unique smORFs in datasets from mouse B and T lymphocytes and systematically characterized them using ORFLine. We further searched smORFs for the presence of a signal peptide, which predicted known secreted chemokines as well as novel micropeptides. Four novel micropeptides show evidence of secretion and are therefore candidate mediators of immunoregulatory functions. AVAILABILITY AND IMPLEMENTATION: Freely available on the web at https://github.com/boboppie/ORFLine. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
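To make the size threshold concrete, the sketch below scans the forward strand of a DNA sequence for ATG-initiated open reading frames shorter than 100 codons. It is not the ORFLine pipeline, whose filtering steps are not described in the abstract; the function name `find_small_orfs`, the ATG-only start rule, and the choice to exclude the stop codon from the codon count are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_small_orfs(seq, max_codons=100):
    """Return (start, end, n_codons) for forward-strand ORFs with n_codons < max_codons.

    start and end are 0-based, end-exclusive coordinates; end includes the stop codon.
    n_codons counts the start codon but not the stop codon (an assumption).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons < max_codons:
                    orfs.append((start, pos + 3, n_codons))
                start = None  # look for the next ORF in this frame
    return orfs


print(find_small_orfs("CCATGAAATTTTAGCC"))  # [(2, 14, 3)]
```

A naive scan of this kind returns very large numbers of candidates on a real genome, which is the annotation difficulty the abstract refers to.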

4.
Blood ; 127(23): 2791-803, 2016 06 09.
Article in English | MEDLINE | ID: mdl-27084890

ABSTRACT

Inherited bleeding, thrombotic, and platelet disorders (BPDs) are diseases that affect ∼300 individuals per million births. With the exception of hemophilia and von Willebrand disease patients, a molecular analysis for patients with a BPD is often unavailable. Many specialized tests are usually required to reach a putative diagnosis and they are typically performed in a step-wise manner to control costs. This approach causes delays and a conclusive molecular diagnosis is often never reached, which can compromise treatment and impede rapid identification of affected relatives. To address this unmet diagnostic need, we designed a high-throughput sequencing platform targeting 63 genes relevant for BPDs. The platform can call single nucleotide variants, short insertions/deletions, and large copy number variants (though not inversions) which are subjected to automated filtering for diagnostic prioritization, resulting in an average of 5.34 candidate variants per individual. We sequenced 159 and 137 samples, respectively, from cases with and without previously known causal variants. Among the latter group, 61 cases had clinical and laboratory phenotypes indicative of a particular molecular etiology, whereas the remainder had an a priori highly uncertain etiology. All previously detected variants were recapitulated and, when the etiology was suspected but unknown or uncertain, a molecular diagnosis was reached in 56 of 61 and only 8 of 76 cases, respectively. The latter category highlights the need for further research into novel causes of BPDs. The ThromboGenomics platform thus provides an affordable DNA-based test to diagnose patients suspected of having a known inherited BPD.
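The diagnostic yields quoted in this abstract can be worked through as simple proportions. The snippet below uses only the counts stated above (56 of 61 and 8 of 76, which together make up the 137 cases without a previously known causal variant); the helper name `diagnostic_yield` is hypothetical.

```python
def diagnostic_yield(diagnosed, total):
    """Fraction of cases in which a molecular diagnosis was reached."""
    return diagnosed / total


suspected = diagnostic_yield(56, 61)          # etiology suspected but unknown
uncertain = diagnostic_yield(8, 76)           # etiology highly uncertain a priori
combined = diagnostic_yield(56 + 8, 61 + 76)  # all 137 cases without a known variant

print(f"{suspected:.1%}")  # 91.8%
print(f"{uncertain:.1%}")  # 10.5%
print(f"{combined:.1%}")   # 46.7%
```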


Subjects
Blood Platelet Disorders/genetics , Genetic Predisposition to Disease , Hemorrhage/genetics , High-Throughput Nucleotide Sequencing/methods , Thrombosis/genetics , Case-Control Studies , DNA Copy Number Variations , Female , Genetic Association Studies/methods , Humans , Male , Mutation , Single Nucleotide Polymorphism , DNA Sequence Analysis/methods
5.
Brief Bioinform ; 16(6): 932-40, 2015 Nov.
Article in English | MEDLINE | ID: mdl-25788326

ABSTRACT

Three principal approaches have been proposed for inferring the set of transcripts expressed in RNA samples using RNA-seq. The simplest approach uses curated annotations, which assumes the transcripts in a sample are a subset of the transcripts listed in a curated database. A more ambitious method involves aligning reads to a reference genome and using the alignments to infer the transcript structures, possibly with the aid of a curated transcript database. The most challenging approach is to assemble reads into putative transcripts de novo without the aid of reference data. We have systematically assessed the properties of these three approaches through a simulation study. We have found that the sensitivity of computational transcript set estimation is severely limited. Computational approaches (both genome-guided and de novo assembly) produce a large number of artefacts, which are assigned large expression estimates and absorb a substantial proportion of the signal when performing expression analysis. The approach using curated annotations shows good expression correlation even when the annotations are incomplete. Furthermore, any incorrect transcripts present in a curated set do not absorb much signal, so it is preferable to have a curated set with high sensitivity rather than high precision. Software to simulate transcript sets, expression values and sequence reads under a wider range of parameter values and to compare sensitivity, precision and signal-to-noise ratios of different methods is freely available online (https://github.com/boboppie/RSSS) and can be expanded by interested parties to include methods other than the exemplars presented in this article.
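The sensitivity and precision compared in this abstract can be sketched as set overlaps between the transcripts truly present and those reported by a method. This is a minimal illustration, not the RSSS software: transcripts are reduced to hypothetical string identifiers, whereas a real comparison would have to match exon structures, and the function name is an assumption.

```python
def sensitivity_and_precision(true_transcripts, predicted_transcripts):
    """Return (sensitivity, precision) for a predicted transcript set.

    sensitivity = true positives / number of true transcripts
    precision   = true positives / number of predicted transcripts
    """
    true_set = set(true_transcripts)
    predicted_set = set(predicted_transcripts)
    true_positives = len(true_set & predicted_set)
    sensitivity = true_positives / len(true_set) if true_set else 0.0
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    return sensitivity, precision


# Hypothetical identifiers: four expressed transcripts, three reported, one of them an artefact.
expressed = {"T1", "T2", "T3", "T4"}
reported = {"T1", "T2", "X1"}
sens, prec = sensitivity_and_precision(expressed, reported)
print(sens)  # 0.5
```

In this toy case the artefact `X1` lowers precision to two thirds, while the two missed transcripts limit sensitivity to one half.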


Subjects
RNA Sequence Analysis/methods , Genetic Databases , Messenger RNA/genetics
6.
Nucleic Acids Res ; 42(Web Server issue): W468-72, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24753429

ABSTRACT

InterMine (www.intermine.org) is a biological data warehousing system providing extensive automatically generated and configurable RESTful web services that underpin the web interface and can be re-used in many other applications: to find and filter data; export it in a flexible and structured way; to upload, use, manipulate and analyze lists; to provide services for flexible retrieval of sequence segments, and for other statistical and analysis tools. Here we describe these features and discuss how they can be used separately or in combinations to support integrative and comparative analysis.


Subjects
Factual Databases , Software , Animals , Chromosomes/chemistry , Humans , Internet , Mice , DNA Sequence Analysis , User-Computer Interface
7.
Genesis ; 53(8): 547-60, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26097192

ABSTRACT

InterMine is a data integration warehouse and analysis software system developed for large and complex biological data sets. Designed for integrative analysis, it can be accessed through a user-friendly web interface. For bioinformaticians, extensive web services as well as programming interfaces for most common scripting languages support access to all features. The web interface includes a useful identifier look-up system, and both simple and sophisticated search options. Interactive results tables enable exploration, and data can be filtered, summarized, and browsed. A set of graphical analysis tools provide a rich environment for data exploration including statistical enrichment of sets of genes or other entities. InterMine databases have been developed for the major model organisms, budding yeast, nematode worm, fruit fly, zebrafish, mouse, and rat together with a newly developed human database. Here, we describe how this has facilitated interoperation and development of cross-organism analysis tools and reports. InterMine as a data exploration and analysis tool is also described. All the InterMine-based systems described in this article are resources freely available to the scientific community.


Subjects
Factual Databases , Software , Animals , Computational Biology/methods , Genetic Databases , Genomics , Humans , Internet , Systems Integration , User-Computer Interface
8.
Nucleic Acids Res ; 40(Database issue): D1082-8, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22080565

ABSTRACT

In an effort to comprehensively characterize the functional elements within the genomes of the important model organisms Drosophila melanogaster and Caenorhabditis elegans, the NHGRI model organism Encyclopaedia of DNA Elements (modENCODE) consortium has generated an enormous library of genomic data along with detailed, structured information on all aspects of the experiments. The modMine database (http://intermine.modencode.org) described here has been built by the modENCODE Data Coordination Center to allow the broader research community to (i) search for and download data sets of interest among the thousands generated by modENCODE; (ii) access the data in an integrated form together with non-modENCODE data sets; and (iii) facilitate fine-grained analysis of the above data. The sophisticated search features are possible because of the collection of extensive experimental metadata by the consortium. Interfaces are provided to allow both biologists and bioinformaticians to exploit these rich modENCODE data sets now available via modMine.


Subjects
Caenorhabditis elegans/genetics , Genetic Databases , Drosophila melanogaster/genetics , Animals , Gene Expression , Helminth Genome , Insect Genome , Genomics , Internet , User-Computer Interface
9.
Bioinformatics ; 28(23): 3163-5, 2012 Dec 01.
Article in English | MEDLINE | ID: mdl-23023984

ABSTRACT

SUMMARY: InterMine is an open-source data warehouse system that facilitates the building of databases with complex data integration requirements and a need for a fast customizable query facility. Using InterMine, large biological databases can be created from a range of heterogeneous data sources, and the extensible data model allows for easy integration of new data types. The analysis tools include a flexible query builder, genomic region search and a library of 'widgets' performing various statistical analyses. The results can be exported in many commonly used formats. InterMine is a fully extensible framework where developers can add new tools and functionality. Additionally, there is a comprehensive set of web services, for which client libraries are provided in five commonly used programming languages. AVAILABILITY: Freely available from http://www.intermine.org under the LGPL license. CONTACT: g.micklem@gen.cam.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods , Database Management Systems , Factual Databases , Algorithms , Data Mining , Genomics , Internet , Programming Languages
10.
Front Oncol ; 13: 1172670, 2023.
Article in English | MEDLINE | ID: mdl-37346071

ABSTRACT

Introduction: The occurrence of metastasis is a threat to patients with colon cancer (CC), and the liver is the most common site of metastasis. However, the role of the extrahepatic organs in patients with liver metastasis (LM) has not been distinctly demonstrated. Therefore, this research aimed to explore the prognostic value of extrahepatic metastases (EHMs). Methods: In this retrospective study, a total of 13,662 colon cancer patients with LM between 2010 and 2015 were selected from the Surveillance, Epidemiology, and End Results database (SEER). Fine and Gray's analysis and K-M survival analysis were utilized to explore the impacts of the number of sites of EHMs and different sites of EHMs on prognosis. Finally, a prognostic nomogram model based on the number of sites of EHMs was constructed, and a string of validation methods was conducted, including concordance index (C-index), receiver operating characteristic curves (ROC), and decision curve analysis (DCA). Results: Patients without EHMs had better prognoses in cancer-specific survival (CSS) and overall survival (OS) than patients with EHMs (p < 0.001). Patients with different EHM sites differed in primary location site, grade, and histology. Cumulative incidence rates for CSS surpassed those for other causes in patients with 0, 1, 2, ≥ 3 EHMs, and patients with more EHM sites had a worse prognosis in CSS (p < 0.001). However, patients with different EHM sites had a minor difference in cumulative incidence rates for CSS (p = 0.106). Finally, a nomogram was constructed to predict the survival probability of patients with EHMs, which is based on the number of sites of EHMs and showed excellent predictive ability. Conclusion: The number of sites of EHMs was a significant prognostic factor of CC patients with LM. However, the sites of EHMs showed limited impact on survival. Furthermore, a nomogram based on the number of sites of EHMs was constructed to predict the OS of patients with EHMs accurately.
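The concordance index named among the validation methods can be sketched as follows. This is a generic Harrell-style C-index for right-censored survival data, not the study's own computation, which the abstract does not specify; the function name, the handling of tied scores (counted as half), and the choice to skip pairs with tied times are assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style C-index for right-censored data.

    A pair (i, j) is comparable when subject i had the event and a strictly
    shorter follow-up time than subject j. The pair is concordant when the
    subject who failed earlier also has the higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up times, event indicators (1 = event, 0 = censored), risks.
print(concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1]))  # 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by predicted risk.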

11.
Sci Adv ; 8(34): eabo6371, 2022 08 26.
Article in English | MEDLINE | ID: mdl-36026442

ABSTRACT

Large reference datasets of protein-coding variation in human populations have allowed us to determine which genes and genic subregions are intolerant to germline genetic variation. There is also a growing number of genes implicated in severe Mendelian diseases that overlap with genes implicated in cancer. We hypothesized that cancer-driving mutations might be enriched in genic subregions that are depleted of germline variation relative to somatic variation. We introduce a new metric, OncMTR (oncology missense tolerance ratio), which uses 125,748 exomes in the Genome Aggregation Database (gnomAD) to identify these genic subregions. We demonstrate that OncMTR can significantly predict driver mutations implicated in hematologic malignancies. Divergent OncMTR regions were enriched for cancer-relevant protein domains, and overlaying OncMTR scores on protein structures identified functionally important protein residues. Last, we performed a rare variant, gene-based collapsing analysis on an independent set of 394,694 exomes from the UK Biobank and find that OncMTR markedly improves genetic signals for hematologic malignancies.


Subjects
Germ-Line Mutation , Hematologic Neoplasms , Germ Cells , Hematologic Neoplasms/genetics , Humans
12.
Database (Oxford) ; 2022, 2022 07 12.
Article in English | MEDLINE | ID: mdl-35820040

ABSTRACT

HumanMine (www.humanmine.org) is an integrated database of human genomics and proteomics data that provides a powerful interface to support sophisticated exploration and analysis of data compiled from experimental, computational and curated data sources. Built using the InterMine data integration platform, HumanMine includes genes, proteins, pathways, expression levels, single nucleotide polymorphisms (SNPs), diseases and more, integrated into a single searchable database. HumanMine promotes integrative analysis, a powerful approach in modern biology that allows many sources of evidence to be analysed together. The data can be accessed through a user-friendly web interface as well as a powerful, scriptable web service application programming interface (API) to allow programmatic access to data. The web interface includes a useful identifier resolution system, sophisticated query options and interactive results tables that enable powerful exploration of data, including data summaries, filtering, browsing and export. A set of graphical analysis tools provides a rich environment for data exploration including statistical enrichment of sets of genes or other biological entities. HumanMine can be used for integrative multistaged analysis that can lead to new insights and uncover previously unknown relationships. Database URL: https://www.humanmine.org.


Subjects
Human Genome , Information Storage and Retrieval , Factual Databases , Humans , Proteomics
13.
Database (Oxford) ; 2013: bat060, 2013.
Article in English | MEDLINE | ID: mdl-23935057

ABSTRACT

Common metabolic and endocrine diseases such as diabetes affect millions of people worldwide and have a major health impact, frequently leading to complications and mortality. In a search for better prevention and treatment, there is ongoing research into the underlying molecular and genetic bases of these complex human diseases, as well as into the links with risk factors such as obesity. Although an increasing number of relevant genomic and proteomic data sets have become available, the quantity and diversity of the data make their efficient exploitation challenging. Here, we present metabolicMine, a data warehouse with a specific focus on the genomics, genetics and proteomics of common metabolic diseases. Developed in collaboration with leading UK metabolic disease groups, metabolicMine integrates data sets from a range of experiments and model organisms alongside tools for exploring them. The current version brings together information covering genes, proteins, orthologues, interactions, gene expression, pathways, ontologies, diseases, genome-wide association studies and single nucleotide polymorphisms. Although the emphasis is on human data, key data sets from mouse and rat are included. These are complemented by interoperation with the RatMine rat genomics database, with a corresponding mouse version under development by the Mouse Genome Informatics (MGI) group. The web interface contains a number of features including keyword search, a library of Search Forms, the QueryBuilder and list analysis tools. This provides researchers with many different ways to analyse, view and flexibly export data. Programming interfaces and automatic code generation in several languages are supported, and many of the features of the web interface are available through web services. The combination of diverse data sets integrated with analysis tools and a powerful query system makes metabolicMine a valuable research resource. The web interface makes it accessible to first-time users, whereas the Application Programming Interface (API) and web services provide convenient data access and tools for bioinformaticians. metabolicMine is freely available online at http://www.metabolicmine.org. Database URL: http://www.metabolicmine.org.


Subjects
Genetic Databases , Protein Databases , Metabolic Diseases/genetics , Metabolic Diseases/metabolism , Proteomics , Research , Animals , Genetic Association Studies , Humans , Internet , Mice , Rats
14.
PLoS One ; 7(7): e39396, 2012.
Article in English | MEDLINE | ID: mdl-22808034

ABSTRACT

Understanding cellular regulation of metabolism is a major challenge in systems biology. Thus far, the main assumption has been that enzyme levels are key regulators in metabolic networks. However, regulation analysis recently showed that metabolism is rarely controlled via enzyme levels only, but through non-obvious combinations of hierarchical (gene and enzyme levels) and metabolic regulation (mass action and allosteric interaction). Quantitative analyses relating changes in metabolic fluxes to changes in transcript or protein levels have revealed a remarkable lack of understanding of the regulation of these networks. We study metabolic regulation via feasibility analysis (FA). Inspired by the constraint-based approach of Flux Balance Analysis, FA incorporates a model describing kinetic interactions between molecules. We enlarge the portfolio of objectives for the cell by defining three main physiologically relevant objectives for the cell: function, robustness and temporal responsiveness. We postulate that the cell assumes one or a combination of these objectives and search for enzyme levels necessary to achieve this. We call the subspace of feasible enzyme levels the feasible enzyme space. Once this space is constructed, we can study how different objectives may (if possible) be combined, or evaluate the conditions at which the cells are faced with a trade-off among those. We apply FA to the experimental scenario of long-term carbon limited chemostat cultivation of yeast cells, studying how metabolism evolves optimally. Cells employ a mixed strategy composed of increasing enzyme levels for glucose uptake and hexokinase and decreasing levels of the remaining enzymes. This trade-off renders the cells specialized in this low-carbon flux state to compete for the available glucose and get rid of overcapacity. Overall, we show that FA is a powerful tool for systems biologists to study regulation of metabolism, interpret experimental data and evaluate hypotheses.
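For orientation, the constraint-based approach that this abstract cites as its inspiration, Flux Balance Analysis, is usually written as a linear programme. The formulation below is the textbook form, not the feasibility analysis (FA) method itself, whose kinetic model is not given in the abstract. The notation is introduced here for illustration: S is the stoichiometric matrix, v the vector of reaction fluxes, c the objective weights, and v^min, v^max the flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}
\end{aligned}
```

The constraint S v = 0 encodes the steady-state assumption that each internal metabolite is produced and consumed at equal rates.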


Subjects
Carbon/metabolism , Fungal Gene Expression Regulation , Glucose/metabolism , Hexokinase/genetics , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Algorithms , Molecular Evolution , Physiological Feedback , Hexokinase/metabolism , Kinetics , Metabolic Networks and Pathways , Biological Models , Probability Theory , Saccharomyces cerevisiae/enzymology , Saccharomyces cerevisiae Proteins/metabolism , Systems Biology , Genetic Transcription