Results 1 - 20 of 39
1.
J Virol ; 92(2), 2018 01 15.
Article in English | MEDLINE | ID: mdl-29093097

ABSTRACT

Epstein-Barr virus (EBV) is a causative agent of a variety of lymphomas, nasopharyngeal carcinoma (NPC), and ∼9% of gastric carcinomas (GCs). An important question is whether particular EBV variants are more oncogenic than others, but conclusions are currently hampered by the lack of sequenced EBV genomes. Here, we contribute to this question by mining whole-genome sequences of 201 GCs to identify 13 EBV-positive GCs and by assembling 13 new EBV genome sequences, almost doubling the number of available GC-derived EBV genome sequences and providing the first non-Asian EBV genome sequences from GC. Whole-genome sequence comparisons of all EBV isolates sequenced to date (85 from tumors and 57 from healthy individuals) showed that most GC and NPC EBV isolates were closely related although American Caucasian GC samples were more distant, suggesting a geographical component. However, EBV GC isolates were found to contain some consistent changes in protein sequences regardless of geographical origin. In addition, transcriptome data available for eight of the EBV-positive GCs were analyzed to determine which EBV genes are expressed in GC. In addition to the expected latency proteins (EBNA1, LMP1, and LMP2A), specific subsets of lytic genes were consistently expressed that did not reflect a typical lytic or abortive lytic infection, suggesting a novel mechanism of EBV gene regulation in the context of GC. These results are consistent with a model in which a combination of specific latent and lytic EBV proteins promotes tumorigenesis. IMPORTANCE: Epstein-Barr virus (EBV) is a widespread virus that causes cancer, including gastric carcinoma (GC), in a small subset of individuals. An important question is whether particular EBV variants are more cancer associated than others, but more EBV sequences are required to address this question. Here, we have generated 13 new EBV genome sequences from GC, almost doubling the number of EBV sequences from GC isolates and providing the first EBV sequences from non-Asian GC. We further identify sequence changes in some EBV proteins common to GC isolates. In addition, gene expression analysis of eight of the EBV-positive GCs showed consistent expression of both the expected latency proteins and a subset of lytic proteins that was not consistent with typical lytic or abortive lytic expression. These results suggest that novel mechanisms activate expression of some EBV lytic proteins and that their expression may contribute to oncogenesis.


Subjects
Adenocarcinoma/etiology , Epstein-Barr Virus Infections/complications , Epstein-Barr Virus Infections/virology , Gene Expression Regulation, Viral , Genome, Viral , Herpesvirus 4, Human/physiology , Stomach Neoplasms/etiology , Adenocarcinoma/pathology , Amino Acid Substitution , Computational Biology/methods , Epitopes, T-Lymphocyte , Epstein-Barr Virus Infections/immunology , Humans , Mutation , Phylogeny , Stomach Neoplasms/pathology , Whole Genome Sequencing
2.
Blood ; 130(4): 453-459, 2017 07 27.
Article in English | MEDLINE | ID: mdl-28600341

ABSTRACT

The National Cancer Institute Genomic Data Commons (GDC) is an information system for storing, analyzing, and sharing genomic and clinical data from patients with cancer. The recent high-throughput sequencing of cancer genomes and transcriptomes has produced a big data problem that precludes many cancer biologists and oncologists from gleaning knowledge from these data regarding the nature of malignant processes and the relationship between tumor genomic profiles and treatment response. The GDC aims to democratize access to cancer genomic data and to foster the sharing of these data to promote precision medicine approaches to the diagnosis and treatment of cancer.


Subjects
Databases, Genetic , Neoplasms/genetics , Precision Medicine , Software , Humans , National Cancer Institute (U.S.) , United States
3.
Bioinformatics ; 32(3): 453-5, 2016 Feb 01.
Article in English | MEDLINE | ID: mdl-26454281

ABSTRACT

SUMMARY: Sequence comparison of genetic material between known and unknown organisms plays a crucial role in genomics, metagenomics and phylogenetic analysis. The emerging long-read sequencing technologies can now produce reads of tens of kilobases in length that promise a more accurate assessment of their origin. To facilitate the classification of long and short DNA sequences, we have developed a Python package that implements a new sequence classification model that we have demonstrated to improve the classification accuracy when compared with other state-of-the-art classification methods. For the purpose of validation, and to demonstrate its usefulness, we test the combined sequence similarity score classifier (CSSSCL) using three different datasets, including a metagenomic dataset composed of short reads. AVAILABILITY AND IMPLEMENTATION: The package's source code and test datasets are available under the GPLv3 license at https://github.com/oicr-ibc/cssscl. CONTACT: ivan.borozan@oicr.on.ca SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Bacteria/classification , Metagenomics/methods , Models, Theoretical , Sequence Alignment , Software , Viruses/classification , Bacteria/genetics , Phylogeny , Sequence Analysis, DNA , Viruses/genetics
4.
Bioinformatics ; 31(9): 1396-404, 2015 May 01.
Article in English | MEDLINE | ID: mdl-25573913

ABSTRACT

MOTIVATION: Alignment-based sequence similarity searches, while accurate for some types of sequences, can produce incorrect results when used on more divergent but functionally related sequences that have undergone the sequence rearrangements observed in many bacterial and viral genomes. Here, we propose a classification model that exploits the complementary nature of alignment-based and alignment-free similarity measures with the aim of improving the accuracy with which DNA and protein sequences are characterized. RESULTS: Our model classifies sequences using a combined sequence similarity score calculated by adaptively weighting the contribution of different sequence similarity measures. Weights are determined independently for each sequence in the test set and reflect the discriminatory ability of individual similarity measures in the training set. Because the similarity between some sequences is determined more accurately with one type of measure rather than another, our classifier allows different sets of weights to be associated with different sequences. Using five different similarity measures, we show that our model significantly improves the classification accuracy over the current composition- and alignment-based models, when predicting the taxonomic lineage for both short viral sequence fragments and complete viral sequences. We also show that our model can be used effectively for the classification of reads from a real metagenome dataset as well as protein sequences. AVAILABILITY AND IMPLEMENTATION: All the datasets and the code used in this study are freely available at https://collaborators.oicr.on.ca/vferretti/borozan_csss/csss.html. CONTACT: ivan.borozan@gmail.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Sequence Alignment , Sequence Analysis, DNA/methods , Sequence Analysis, Protein/methods , Algorithms , Classification/methods , DNA, Viral , Metagenomics , Models, Theoretical , Viruses/classification
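
The idea of a combined sequence similarity score in entry 4 can be illustrated with a small sketch. This is not the authors' implementation: the function names, the weight normalization, and the use of a plain weighted average are assumptions made only to show how per-measure scores and per-sequence weights might be combined.

```python
def combined_score(similarities, weights):
    """Weighted average of per-measure similarity scores (hypothetical helper)."""
    if len(similarities) != len(weights):
        raise ValueError("one weight per similarity measure is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(similarities, weights)) / total


def classify(query_scores, weights):
    """Return the candidate label with the highest combined score.

    query_scores maps a candidate label to a list of similarity scores,
    one per measure; weights holds one weight per measure for this query.
    """
    return max(
        query_scores,
        key=lambda label: combined_score(query_scores[label], weights),
    )
```

Because the weights are passed per query, two queries scored against the same candidates can reach different decisions when their weights differ, which is the behaviour the abstract describes.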
5.
Nat Genet ; 39(8): 989-94, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17618283

ABSTRACT

Using a multistage genetic association approach comprising 7,480 affected individuals and 7,779 controls, we identified markers in chromosomal region 8q24 associated with colorectal cancer. In stage 1, we genotyped 99,632 SNPs in 1,257 affected individuals and 1,336 controls from Ontario. In stages 2-4, we performed serial replication studies using 4,024 affected individuals and 4,042 controls from Seattle, Newfoundland and Scotland. We identified one locus on chromosome 8q24 and another on 9p24 having combined odds ratios (OR) for stages 1-4 of 1.18 (trend; P = 1.41 × 10^-8) and 1.14 (trend; P = 1.32 × 10^-5), respectively. Additional analyses in 2,199 affected individuals and 2,401 controls from France and Europe supported the association at the 8q24 locus (OR = 1.16, trend; 95% confidence interval (c.i.): 1.07-1.26; P = 5.05 × 10^-4). A summary across all seven studies at the 8q24 locus was highly significant (OR = 1.17, c.i.: 1.12-1.23; P = 3.16 × 10^-11). This locus has also been implicated in prostate cancer.


Subjects
Chromosomes, Human, Pair 8 , Colorectal Neoplasms/genetics , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Case-Control Studies , Chromosome Mapping , Humans , Linkage Disequilibrium , Middle Aged
8.
BMC Endocr Disord ; 14: 9, 2014 Feb 01.
Article in English | MEDLINE | ID: mdl-24484869

ABSTRACT

BACKGROUND: Not all obese subjects have an adverse metabolic profile predisposing them to developing type 2 diabetes or cardiovascular disease. The BioSHaRE-EU Healthy Obese Project aims to gain insights into the consequences of (healthy) obesity using data on risk factors and phenotypes across several large-scale cohort studies. The aim of this study was to describe the prevalence of obesity, metabolic syndrome (MetS) and metabolically healthy obesity (MHO) in ten participating studies. METHODS: Ten different cohorts in seven countries were combined, using data transformed into a harmonized format. All participants were of European origin, with age 18-80 years. They had participated in a clinical examination for anthropometric and blood pressure measurements. Blood samples had been drawn for analysis of lipids and glucose. Presence of MetS was assessed in those with obesity (BMI ≥ 30 kg/m²) based on the 2001 NCEP ATP III criteria, as well as an adapted set of less strict criteria. MHO was defined as obesity, having none of the MetS components, and no previous diagnosis of cardiovascular disease. RESULTS: Data for 163,517 individuals were available; 17% were obese (11,465 men and 16,612 women). The prevalence of obesity varied from 11.6% in the Italian CHRIS cohort to 26.3% in the German KORA cohort. The age-standardized percentage of obese subjects with MetS ranged in women from 24% in CHRIS to 65% in the Finnish Health2000 cohort, and in men from 43% in CHRIS to 78% in the Finnish DILGOM cohort, with elevated blood pressure the most frequently occurring factor contributing to the prevalence of the metabolic syndrome. The age-standardized prevalence of MHO varied in women from 7% in Health2000 to 28% in NCDS, and in men from 2% in DILGOM to 19% in CHRIS. MHO was more prevalent in women than in men, and decreased with age in both sexes. CONCLUSIONS: Through a rigorous harmonization process, the BioSHaRE-EU consortium was able to compare key characteristics defining the metabolically healthy obese phenotype across ten cohort studies. There is considerable variability in the prevalence of healthy obesity across the different European populations studied, even when unified criteria were used to classify this phenotype.
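
The MHO definition stated in the METHODS of entry 8 (obesity, none of the MetS components, no previous cardiovascular disease diagnosis) can be written as a small decision rule. The sketch below is only an illustration: the function name, the argument names, and the returned labels are assumptions, and the caller is expected to have counted MetS components under whichever criteria set applies.

```python
OBESITY_BMI_THRESHOLD = 30.0  # kg/m², the obesity cut-off given in the abstract


def classify_obesity_phenotype(bmi, mets_components, prior_cvd):
    """Classify one participant (hypothetical helper, labels are illustrative).

    bmi             -- body mass index in kg/m²
    mets_components -- number of metabolic syndrome components present
    prior_cvd       -- True if cardiovascular disease was previously diagnosed
    """
    if mets_components < 0:
        raise ValueError("mets_components cannot be negative")
    if bmi < OBESITY_BMI_THRESHOLD:
        return "not obese"
    if mets_components == 0 and not prior_cvd:
        return "metabolically healthy obese"
    return "obese, not metabolically healthy"
```

For example, `classify_obesity_phenotype(32.5, 0, False)` falls in the MHO group, while the same BMI with one MetS component present does not.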

9.
Genomics ; 102(3): 140-7, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23603536

ABSTRACT

Using sequencing information to guide clinical decision-making requires coordination of a diverse set of people and activities. In clinical genomics, the process typically includes sample acquisition, template preparation, genome data generation, analysis to identify and confirm variant alleles, interpretation of clinical significance, and reporting to clinicians. We describe a software application developed within a clinical genomics study, to support this entire process. The software application tracks patients, samples, genomic results, decisions and reports across the cohort, monitors progress and sends reminders, and works alongside an electronic data capture system for the trial's clinical and genomic data. It incorporates systems to read, store, analyze and consolidate sequencing results from multiple technologies, and provides a curated knowledge base of tumor mutation frequency (from the COSMIC database) annotated with clinical significance and drug sensitivity to generate reports for clinicians. By supporting the entire process, the application provides deep support for clinical decision making, enabling the generation of relevant guidance in reports for verification by an expert panel prior to forwarding to the treating physician.


Subjects
Genetics, Medical/methods , Genome, Human , Genomics/methods , Information Management , Neoplasms/genetics , Precision Medicine , Software , Genetic Variation , Genomics/economics , High-Throughput Nucleotide Sequencing , Humans , Sequence Analysis, DNA , Sequence Analysis, RNA
10.
Nat Genet ; 37(10): 1108-12, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16186814

ABSTRACT

Genetic susceptibility to multiple sclerosis is associated with genes of the major histocompatibility complex (MHC), particularly HLA-DRB1 and HLA-DQB1 (ref. 1). Both locus and allelic heterogeneity have been reported in this genomic region. To clarify whether HLA-DRB1 itself, nearby genes in the region encoding the MHC or combinations of these loci underlie susceptibility to multiple sclerosis, we genotyped 1,185 Canadian and Finnish families with multiple sclerosis (n = 4,203 individuals) with a high-density SNP panel spanning the genes encoding the MHC and flanking genomic regions. Strong associations in Canadian and Finnish samples were observed with blocks in the HLA class II genomic region (P < 4.9 × 10^-13 and P < 2.0 × 10^-16, respectively), but the strongest association was with HLA-DRB1 (P < 4.4 × 10^-17). Conditioning on either HLA-DRB1 or the most significant HLA class II haplotype block found no additional block or SNP association independent of the HLA class II genomic region. This study therefore indicates that MHC-associated susceptibility to multiple sclerosis is determined by HLA class II alleles, their interactions and closely neighboring variants.


Subjects
HLA-DR Antigens/genetics , Histocompatibility Antigens Class II/genetics , Multiple Sclerosis/genetics , Polymorphism, Single Nucleotide , Canada , Finland , Genetic Predisposition to Disease , HLA-DRB1 Chains , Humans , Major Histocompatibility Complex/genetics , Multiple Sclerosis/ethnology , White People
11.
Int J Cancer ; 132(7): 1547-55, 2013 Apr 01.
Article in English | MEDLINE | ID: mdl-22948899

ABSTRACT

The successes of targeted drugs with companion predictive biomarkers and the technological advances in gene sequencing have generated enthusiasm for evaluating personalized cancer medicine strategies using genomic profiling. We assessed the feasibility of incorporating real-time analysis of somatic mutations within exons of 19 genes into patient management. Blood, tumor biopsy and archived tumor samples were collected from 50 patients recruited from four cancer centers. Samples were analyzed using three technologies: targeted exon sequencing using Pacific Biosciences PacBio RS, multiplex somatic mutation genotyping using Sequenom MassARRAY and Sanger sequencing. An expert panel reviewed results prior to reporting to clinicians. A clinical laboratory verified actionable mutations. Fifty patients were recruited. Nineteen actionable mutations were identified in 16 (32%) patients. Across technologies, results were in agreement in 100% of biopsy specimens and 95% of archival specimens. Profiling results from paired archival/biopsy specimens were concordant in 30/34 (88%) patients. We demonstrated that the use of next generation sequencing for real-time genomic profiling in advanced cancer patients is feasible. Additionally, actionable mutations identified in this study were relatively stable between archival and biopsy samples, implying that cancer mutations that are good predictors of drug response may remain constant across clinical stages.


Subjects
Antineoplastic Agents/pharmacology , Clinical Trials as Topic , Genes, Neoplasm/genetics , High-Throughput Nucleotide Sequencing , Neoplasms/genetics , Precision Medicine , Adult , Aged , Computational Biology , Feasibility Studies , Female , Follow-Up Studies , Humans , Male , Middle Aged , Mutation/genetics , Neoplasm Metastasis , Neoplasms/drug therapy
12.
Emerg Themes Epidemiol ; 10(1): 12, 2013 Nov 21.
Article in English | MEDLINE | ID: mdl-24257327

ABSTRACT

BACKGROUND: Individual-level data pooling of large population-based studies across research centres in international research projects faces many hurdles. The BioSHaRE (Biobank Standardisation and Harmonisation for Research Excellence in the European Union) project aims to address these issues by building a collaborative group of investigators and developing tools for data harmonization, database integration and federated data analyses. METHODS: Eight population-based studies in six European countries were recruited to participate in the BioSHaRE project. Through workshops, teleconferences and electronic communications, participating investigators identified a set of 96 variables targeted for harmonization to answer research questions of interest. Using each study's questionnaires, standard operating procedures, and data dictionaries, harmonization potential was assessed. Whenever harmonization was deemed possible, processing algorithms were developed and implemented in an open-source software infrastructure to transform study-specific data into the target (i.e. harmonized) format. Harmonized datasets located on servers at each research centre across Europe were interconnected through a federated database system to perform statistical analysis. RESULTS: Retrospective harmonization led to the generation of common format variables for 73% of matches considered (96 targeted variables across 8 studies). Authenticated investigators can now perform complex statistical analyses of harmonized datasets stored on distributed servers without actually sharing individual-level data using the DataSHIELD method. CONCLUSION: New Internet-based networking technologies and database management systems are providing the means to support collaborative, multi-center research in an efficient and secure manner. The results from this pilot project show that, given a strong collaborative relationship between participating studies, it is possible to seamlessly co-analyse internationally harmonized research databases while allowing each study to retain full control over individual-level data. We encourage additional collaborative research networks in epidemiology, public health, and the social sciences to make use of the open source tools presented herein.
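
The principle behind federated analysis in entry 12, analysing distributed data without moving individual-level records, can be illustrated with a pooled mean computed from per-site aggregates. This sketch does not use or reproduce the DataSHIELD software; the function names and the choice of a mean as the statistic are assumptions for illustration only.

```python
def site_summary(values):
    """Run at each study site: return only aggregate statistics.

    Individual-level values never leave this function's caller;
    only the count and the sum are shared.
    """
    return {"n": len(values), "sum": float(sum(values))}


def pooled_mean(summaries):
    """Run by the coordinating analyst: combine per-site aggregates."""
    total_n = sum(s["n"] for s in summaries)
    if total_n == 0:
        raise ValueError("no observations reported by any site")
    return sum(s["sum"] for s in summaries) / total_n
```

A real deployment would also need disclosure controls, for instance refusing to return aggregates for very small groups; that is left out here to keep the sketch short.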

13.
BMC Bioinformatics ; 13: 206, 2012 Aug 17.
Article in English | MEDLINE | ID: mdl-22901030

ABSTRACT

BACKGROUND: It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opportunity for pathogen detection and discovery in human tissues but requires development of new genome-wide bioinformatics tools. RESULTS: Here we present CaPSID (Computational Pathogen Sequence IDentification), a comprehensive bioinformatics platform for identifying, querying and visualizing both exogenous and endogenous pathogen nucleotide sequences in tumor genomes and transcriptomes. CaPSID includes a scalable, high performance database for data storage and a web application that integrates the genome browser JBrowse. CaPSID also provides useful metrics for sequence analysis of pre-aligned BAM files, such as gene and genome coverage, and is optimized to run efficiently on multiprocessor computers with low memory usage. CONCLUSIONS: To demonstrate the usefulness and efficiency of CaPSID, we carried out a comprehensive analysis of both a simulated dataset and transcriptome samples from ovarian cancer. CaPSID correctly identified all of the human and pathogen sequences in the simulated dataset, while in the ovarian dataset CaPSID's predictions were successfully validated in vitro.


Subjects
Computational Biology/methods , Databases, Genetic , Genome, Human , Software , Transcriptome , Algorithms , Cell Line, Tumor , Computer Simulation , Female , Humans , Internet , Oncogenic Viruses/genetics , Ovarian Neoplasms/genetics , Sensitivity and Specificity
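
Entry 13 mentions genome coverage as one of the metrics reported for aligned reads. As a rough illustration of what such a metric measures, the sketch below computes the fraction of a reference covered by at least one read from a list of alignment intervals. It is not CaPSID code: the function name and the half-open `(start, end)` interval convention are assumptions, and a real pipeline would read the intervals from BAM files.

```python
def genome_coverage(read_intervals, genome_length):
    """Fraction of reference positions covered by at least one read.

    read_intervals -- iterable of (start, end) pairs, 0-based, half-open
    genome_length  -- length of the reference sequence in bases
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")

    covered = 0
    current_start = current_end = None
    for start, end in sorted(read_intervals):
        # Clip each interval to the reference boundaries.
        start, end = max(start, 0), min(end, genome_length)
        if start >= end:
            continue
        if current_end is None or start > current_end:
            # Gap before this read: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent read: extend the merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length
```

Merging overlapping intervals before counting keeps positions covered by many reads from being counted more than once.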
14.
Hum Mutat ; 33(5): 867-73, 2012 May.
Article in English | MEDLINE | ID: mdl-22416047

ABSTRACT

Genetic and epidemiological research increasingly employs large collections of phenotypic and molecular observation data from high quality human and model organism samples. Standardization efforts have produced a few simple formats for exchange of these various data, but a lightweight and convenient data representation scheme for all data modalities does not exist, hindering successful data integration, such as assignment of mouse models to orphan diseases and phenotypic clustering for pathways. We report a unified system to integrate and compare observation data across experimental projects, disease databases, and clinical biobanks. The core object model (Observ-OM) comprises only four basic concepts to represent any kind of observation: Targets, Features, Protocols (and their Applications), and Values. An easy-to-use file format (Observ-TAB) employs Excel to represent individual and aggregate data in straightforward spreadsheets. The systems have been tested successfully on human biobank, genome-wide association studies, quantitative trait loci, model organism, and patient registry data using the MOLGENIS platform to quickly set up custom data portals. Our system will dramatically lower the barrier for future data sharing and facilitate integrated search across panels and species. All models, formats, documentation, and software are available for free and open source (LGPLv3) at http://www.observ-om.org.


Subjects
Information Dissemination/methods , Information Management , Animals , Computer Graphics , Databases, Genetic , Epidermolysis Bullosa Dystrophica/genetics , Genetic Association Studies , Humans , Medical Informatics , Mice , Phenotype , Quantitative Trait Loci
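
The four concepts named in entry 14 (Targets, Features, Protocols with their Applications, and Values) suggest a compact data structure. The sketch below is one possible reading and not the published Observ-OM schema: every class name, field name, and the `values_for` helper are assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Target:
    """The thing being observed, e.g. an individual or a sample."""
    name: str


@dataclass(frozen=True)
class Feature:
    """What is observed about a target, e.g. a measurement type."""
    name: str
    unit: str = ""


@dataclass
class Protocol:
    """A procedure that measures a set of features."""
    name: str
    features: List[Feature] = field(default_factory=list)


@dataclass
class ProtocolApplication:
    """One concrete run of a protocol."""
    protocol: Protocol
    performed_on: str  # date string; the format is left to the caller


@dataclass
class ObservedValue:
    """A value linking a target, a feature, and a protocol application."""
    target: Target
    feature: Feature
    application: ProtocolApplication
    value: str


def values_for(target: Target, observations: List[ObservedValue]) -> Dict[str, str]:
    """Collect feature name -> value for one target (hypothetical helper)."""
    return {o.feature.name: o.value for o in observations if o.target == target}
```

Keeping values as strings mirrors a spreadsheet-style exchange format, where typing is applied on import; a stricter model could add a data type to each feature.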
15.
Cancer Epidemiol Biomarkers Prev ; 31(1): 210-220, 2022 01.
Article in English | MEDLINE | ID: mdl-34737207

ABSTRACT

BACKGROUND: Fusobacterium nucleatum (F. nucleatum) activates oncogenic signaling pathways and induces inflammation to promote colorectal carcinogenesis. METHODS: We characterized F. nucleatum and its subspecies in colorectal tumors and examined associations with tumor characteristics and colorectal cancer-specific survival. We conducted deep sequencing of nusA, nusG, and bacterial 16S rRNA genes in tumors from 1,994 patients with colorectal cancer and assessed associations between F. nucleatum presence and clinical characteristics, colorectal cancer-specific mortality, and somatic mutations. RESULTS: F. nucleatum, which was present in 10.3% of tumors, was detected in a higher proportion of right-sided and advanced-stage tumors, particularly subspecies animalis. Presence of F. nucleatum was associated with higher colorectal cancer-specific mortality (HR, 1.97; P = 0.0004). This association was restricted to nonhypermutated, microsatellite-stable tumors (HR, 2.13; P = 0.0002) and those who received chemotherapy [HR, 1.92; confidence interval (CI), 1.07-3.45; P = 0.029]. Only F. nucleatum subspecies animalis, the main subspecies detected (65.8%), was associated with colorectal cancer-specific mortality (HR, 2.16; P = 0.0016); subspecies vincentii and nucleatum were not (HR, 1.07; P = 0.86). Additional adjustment for tumor stage suggests that the effect of F. nucleatum on mortality is partly driven by a stage shift. Presence of F. nucleatum was associated with microsatellite instable tumors, tumors with POLE exonuclease domain mutations, and ERBB3 mutations, and suggestively associated with TP53 mutations. CONCLUSIONS: F. nucleatum, and particularly subspecies animalis, was associated with a higher colorectal cancer-specific mortality and specific somatic mutated genes. IMPACT: Our findings identify the F. nucleatum subspecies animalis as negatively impacting colorectal cancer mortality, which may occur through a stage shift and its effect on chemoresistance.


Subjects
Colorectal Neoplasms , Fusobacterium nucleatum , Carcinogenesis , Colorectal Neoplasms/genetics , Humans , RNA, Ribosomal, 16S
16.
Nat Genet ; 52(3): 320-330, 2020 03.
Article in English | MEDLINE | ID: mdl-32025001

ABSTRACT

Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, for which whole-genome and, for a subset, whole-transcriptome sequencing data from 2,658 cancers across 38 tumor types were aggregated, we systematically investigated potential viral pathogens using a consensus approach that integrated three independent pipelines. Viruses were detected in 382 genome and 68 transcriptome datasets. We found a high prevalence of known tumor-associated viruses such as Epstein-Barr virus (EBV), hepatitis B virus (HBV) and human papilloma virus (HPV; for example, HPV16 or HPV18). The study revealed significant exclusivity of HPV and driver mutations in head-and-neck cancer and the association of HPV with APOBEC mutational signatures, which suggests that impaired antiviral defense is a driving force in cervical, bladder and head-and-neck carcinoma. For HBV, HPV16, HPV18 and adeno-associated virus-2 (AAV2), viral integration was associated with local variations in genomic copy numbers. Integrations at the TERT promoter were associated with high telomerase expression evidently activating this tumor-driving process. High levels of endogenous retrovirus (ERV1) expression were linked to a worse survival outcome in patients with kidney cancer.


Subjects
DNA Tumor Viruses/genetics , Genome, Human/genetics , Neoplasms/virology , Transcriptome , Tumor Virus Infections/virology , Virus Integration , DNA Copy Number Variations , Hepatitis B virus/genetics , Herpesvirus 4, Human/genetics , Humans , Mutation , Neoplasms/genetics , Papillomavirus Infections/genetics , Promoter Regions, Genetic/genetics , Telomerase/genetics
17.
Nat Commun ; 11(1): 3400, 2020 07 07.
Article in English | MEDLINE | ID: mdl-32636365

ABSTRACT

The Pan-Cancer Analysis of Whole Genomes (PCAWG) project generated a vast amount of whole-genome cancer sequencing resource data. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we provide a user's guide to the five publicly available online data exploration and visualization tools introduced in the PCAWG marker paper. These tools are ICGC Data Portal, UCSC Xena, Chromothripsis Explorer, Expression Atlas, and PCAWG-Scout. We detail use cases and analyses for each tool, show how they incorporate outside resources from the larger genomics ecosystem, and demonstrate how the tools can be used together to understand the biology of cancers more deeply. Together, the tools enable researchers to query the complex genomic PCAWG data dynamically and integrate external information, enabling and enhancing interpretation.


Subjects
Computational Biology/methods , Genome, Human , Neoplasms/genetics , Chromothripsis , Data Analysis , Databases, Genetic , Genomics , Humans , Internet , Mutation , Software , User-Computer Interface , Whole Genome Sequencing
18.
Hum Genet ; 125(4): 445-59, 2009 May.
Article in English | MEDLINE | ID: mdl-19247692

ABSTRACT

Asthma, atopy, and related phenotypes are heterogeneous complex traits, with both genetic and environmental risk factors. Extensive research has been conducted and over one hundred genes have been associated with asthma and atopy phenotypes, but many of these findings have failed to replicate in subsequent studies. To separate true associations from false positives, candidate genes need to be examined in large well-characterized samples, using standardized designs (genotyping, phenotyping and analysis). In an attempt to replicate previous associations we amalgamated the power and resources of four studies and genotyped 5,565 individuals to conduct a genetic association study of 93 previously associated candidate genes for asthma and related phenotypes using the same set of 861 single-nucleotide polymorphisms (SNPs), a common genotyping platform, and relatively harmonized phenotypes. We tested for association between SNPs and the dichotomous outcomes of asthma, atopy, atopic asthma, and airway hyperresponsiveness using a general allelic likelihood ratio test. No SNP in any gene reached significance levels that survived correction for all tested SNPs, phenotypes, and genes. Even after relaxing the usual stringent multiple testing corrections by performing a gene-based analysis (one gene at a time as if no other genes were typed) and by stratifying SNPs based on their prior evidence of association, no genes gave strong evidence of replication. There was weak evidence to implicate the following: IL13, IFNGR2, EDN1, and VDR in asthma; IL18, TBXA2R, IFNGR2, and VDR in atopy; TLR9, TBXA2R, VDR, NOD2, and STAT6 in airway hyperresponsiveness; TLR10, IFNGR2, STAT6, VDR, and NPSR1 in atopic asthma. Additionally, we found an excess of SNPs with small effect sizes (OR < 1.4). The low rate of replication may be due to small effect size, differences in phenotypic definition, differential environmental effects, and/or genetic heterogeneity. To aid in future replication studies of asthma genes, a comprehensive database was compiled and is available to the scientific community at http://genapha.icapture.ubc.ca/.


Subjects
Asthma/genetics , Polymorphism, Single Nucleotide , Alleles , Australia , Bronchial Hyperreactivity/genetics , Canada , Case-Control Studies , Family , Female , Gene Frequency , Genetics, Population , Genome-Wide Association Study , Genotype , Humans , Hypersensitivity, Immediate/genetics , Male , Phenotype
19.
Nucleic Acids Res ; 35(Database issue): D122-6, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17148480

ABSTRACT

We describe PReMod, a new database of genome-wide cis-regulatory module (CRM) predictions for both the human and the mouse genomes. The prediction algorithm, described previously in Blanchette et al. (2006) Genome Res., 16, 656-668, exploits the fact that many known CRMs are made of clusters of phylogenetically conserved and repeated transcription factor (TF) binding sites. Contrary to other existing databases, PReMod is not restricted to modules located proximal to genes, but in fact mostly contains distal predicted CRMs (pCRMs). Through its web interface, PReMod allows users to (i) identify pCRMs around a gene of interest; (ii) identify pCRMs that have binding sites for a given TF (or a set of TFs) or (iii) download the entire dataset for local analyses. Queries can also be refined by filtering for specific chromosomal regions, for specific regions relative to genes or for the presence of CpG islands. The output includes information about the binding sites predicted within the selected pCRMs, and a graphical display of their distribution within the pCRMs. It also provides a visual depiction of the chromosomal context of the selected pCRMs in terms of neighboring pCRMs and genes, all of which are linked to the UCSC Genome Browser and the NCBI. PReMod: http://genomequebec.mcgill.ca/PReMod.


Subjects
Databases, Nucleic Acid , Regulatory Elements, Transcriptional , Transcription Factors/metabolism , Algorithms , Animals , Binding Sites , Genomics , Humans , Internet , Mice , User-Computer Interface
20.
PLoS One ; 13(7): e0200926, 2018.
Article in English | MEDLINE | ID: mdl-30040866

ABSTRACT

BACKGROUND: The lack of accessible and structured documentation creates major barriers for investigators interested in understanding, properly interpreting and analyzing cohort data and biological samples. Providing the scientific community with open information is essential to optimize usage of these resources. A cataloguing toolkit is proposed by Maelstrom Research to answer these needs and support the creation of comprehensive and user-friendly study- and network-specific web-based metadata catalogues. METHODS: Development of the Maelstrom Research cataloguing toolkit was initiated in 2004. It was supported by the exploration of existing catalogues and standards, and guided by input from partner initiatives having used or pilot tested incremental versions of the toolkit. RESULTS: The cataloguing toolkit is built upon two main components: a metadata model and a suite of open-source software applications. The model sets out specific fields to describe study profiles; characteristics of the subpopulations of participants; timing and design of data collection events; and datasets/variables collected at each data collection event. It also includes the possibility to annotate variables with different classification schemes. When combined, the model and software support implementation of study and variable catalogues and provide a powerful search engine to facilitate data discovery. CONCLUSIONS: The Maelstrom Research cataloguing toolkit already serves several national and international initiatives and the suite of software is available to new initiatives through the Maelstrom Research website. With the support of new and existing partners, we hope to ensure regular improvements of the toolkit.


Subjects
Cohort Studies , Data Analysis , Databases, Factual , Epidemiologic Studies , Humans , Models, Statistical , Software , User-Computer Interface