2.
Bioinformatics ; 36(3): 660-665, 2020 02 01.
Article in English | MEDLINE | ID: mdl-31397839

ABSTRACT

MOTIVATION: DNA methylation plays an important role in regulating gene expression. DNA methylation is commonly analyzed using bisulfite sequencing (BS-seq)-based designs, such as whole-genome bisulfite sequencing (WGBS), reduced representation bisulfite sequencing (RRBS) and oxidative bisulfite sequencing (oxBS-seq). Furthermore, there has been growing interest in investigating the roles that genetic variants play in changing methylation levels (i.e. methylation quantitative trait loci or meQTLs), how methylation regulates the imprinting of gene expression (i.e. allele-specific methylation or ASM) and the differentially methylated regions (DMRs) among different cell types. However, none of the current simulation tools can generate different BS-seq data types (e.g. WGBS, RRBS and oxBS-seq) while modeling meQTLs, ASM and DMRs. RESULTS: We developed pWGBSSimla, a profile-based whole-genome bisulfite sequencing data simulator that simulates WGBS, RRBS and oxBS-seq data for different cell types based on real data. meQTLs and ASM are modeled based on the block structures of the methylation status at CpGs, whereas the simulation of DMRs is based on observations of methylation rates in real data. We demonstrated that pWGBSSimla adequately simulates data and allows performance comparisons among different methylation analysis methods. AVAILABILITY AND IMPLEMENTATION: pWGBSSimla is available at https://omicssimla.sourceforge.io. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Quantitative Trait Loci; Sulfites; Alleles; DNA Methylation; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA; Whole Genome Sequencing
3.
Gigascience ; 8(5)2019 05 01.
Article in English | MEDLINE | ID: mdl-31029063

ABSTRACT

BACKGROUND: An integrative multi-omics analysis approach that combines multiple types of omics data including genomics, epigenomics, transcriptomics, proteomics, metabolomics, and microbiomics has become increasingly popular for understanding the pathophysiology of complex diseases. Although many multi-omics analysis methods have been developed for complex disease studies, only a few simulation tools that simulate multiple types of omics data and model their relationships with disease status are available, and these tools have their limitations in simulating the multi-omics data. RESULTS: We developed the multi-omics data simulator OmicsSIMLA, which simulates genomics (i.e., single-nucleotide polymorphisms [SNPs] and copy number variations), epigenomics (i.e., bisulphite sequencing), transcriptomics (i.e., RNA sequencing), and proteomics (i.e., normalized reverse phase protein array) data at the whole-genome level. Furthermore, the relationships between different types of omics data, such as methylation quantitative trait loci (SNPs influencing methylation), expression quantitative trait loci (SNPs influencing gene expression), and expression quantitative trait methylations (methylations influencing gene expression), were modeled. More importantly, the relationships between these multi-omics data and the disease status were modeled as well. We used OmicsSIMLA to simulate a multi-omics dataset for breast cancer under a hypothetical disease model and used the data to compare the performance among existing multi-omics analysis methods in terms of disease classification accuracy and runtime. We also used OmicsSIMLA to simulate a multi-omics dataset with a scale similar to an ovarian cancer multi-omics dataset. The neural network-based multi-omics analysis method ATHENA was applied to both the real and simulated data and the results were compared. Our results demonstrated that complex disease mechanisms can be simulated by OmicsSIMLA, and ATHENA showed the highest prediction accuracy when the effects of multi-omics features (e.g., SNPs, copy number variations, and gene expression levels) on the disease were strong. Furthermore, similar results can be obtained from ATHENA when analyzing the simulated and real ovarian multi-omics data. CONCLUSIONS: OmicsSIMLA will be useful to evaluate the performance of different multi-omics analysis methods. Sample sizes and power can also be calculated by OmicsSIMLA when planning a new multi-omics disease study.


Subjects
Computational Biology; Genetic Diseases, Inborn/genetics; Genomics; Quantitative Trait Loci/genetics; Algorithms; DNA Copy Number Variations/genetics; DNA Methylation/genetics; Epigenomics; Genetic Diseases, Inborn/classification; Humans; Metabolomics; Polymorphism, Single Nucleotide/genetics; Proteomics; Transcriptome/genetics
4.
Front Genet ; 8: 228, 2017.
Article in English | MEDLINE | ID: mdl-29358944

ABSTRACT

Next-generation sequencing (NGS) has been widely used in genetic association studies to identify both common and rare variants associated with complex diseases. Various statistical association tests have been developed to analyze NGS data; however, most focus on identifying the marginal effects of a set of genetic variants on the disease. Only a few association tests for NGS data analysis have considered the interaction effects between genes. We developed three powerful gene-based gene-gene interaction tests for testing both the main effects and the interaction effects of common, low-frequency, and common with low-frequency variant pairs between two genes (the IGOF tests) in case-control studies using NGS data. We performed a comprehensive simulation study to verify that the proposed tests had appropriate type I error rates and significantly higher power than did other interaction tests for analyzing NGS data. The tests were applied to a whole-exome sequencing dataset for autism spectrum disorder (ASD) and the significant results were evaluated in another independent ASD cohort. The IGOF tests were implemented in C++ and are available at http://igof.sourceforge.net.

5.
PLoS Comput Biol ; 12(6): e1004980, 2016 06.
Article in English | MEDLINE | ID: mdl-27272119

ABSTRACT

In disease studies, family-based designs have become an attractive approach to analyzing next-generation sequencing (NGS) data for the identification of rare mutations enriched in families. Substantial research effort has been devoted to developing pipelines for automating sequence alignment, variant calling, and annotation. However, fewer pipelines have been designed specifically for disease studies. Most of the current analysis pipelines for family-based disease studies using NGS data focus on a specific function, such as identifying variants with Mendelian inheritance or identifying shared chromosomal regions among affected family members. Consequently, some other useful family-based analysis tools, such as imputation, linkage, and association tools, have yet to be integrated and automated. We developed FamPipe, a comprehensive analysis pipeline, which includes several family-specific analysis modules, including the identification of shared chromosomal regions among affected family members, prioritizing variants assuming a disease model, imputation of untyped variants, and linkage and association tests. We used simulation studies to compare properties of some modules implemented in FamPipe, and based on the results, we provided suggestions for the selection of modules to achieve an optimal analysis strategy. The pipeline is under the GNU GPL License and can be downloaded for free at http://fampipe.sourceforge.net.


Subjects
Genetic Predisposition to Disease/genetics; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Software; Computational Biology; Computer Simulation; Genetic Association Studies; Humans; Internet