Results 1 - 5 of 5
1.
Sci Rep ; 14(1): 7028, 2024 03 25.
Article in English | MEDLINE | ID: mdl-38528062

ABSTRACT

Accurate indel calling plays an important role in precision medicine. A benchmarking indel set is essential for thoroughly evaluating the indel calling performance of bioinformatics pipelines. A reference sample with a set of known-positive variants was developed in the FDA-led Sequencing Quality Control Phase 2 (SEQC2) project, but the known indels in the known-positive set were limited. This project sought to provide an enriched set of known indels that would be more translationally relevant by focusing on additional cancer-related regions. A thorough manual review process completed by 42 reviewers, two advisors, and a judging panel of three researchers significantly enriched the known indel set by an additional 516 indels. The extended benchmarking indel set has a large range of variant allele frequencies (VAFs), with 87% of them having a VAF below 20% in reference Sample A. The reference Sample A and the indel set can be used for comprehensive benchmarking of indel calling across a wider range of low VAF values. Indel length was also variable, but the majority were under 10 base pairs (bps). Most of the indels were within coding regions, with the remainder in the gene regulatory regions. Although high confidence can be derived from the robust study design and meticulous human review, this extensive indel set has not undergone orthogonal validation. The extended benchmarking indel set, along with the indels in the previously published known-positive set, was the truth set used to benchmark indel calling pipelines in a community challenge hosted on the precisionFDA platform. This benchmarking indel set and reference samples can be utilized for a comprehensive evaluation of indel calling pipelines. Additionally, the insights and solutions obtained during the manual review process can aid in improving the performance of these pipelines.


Subjects
Benchmarking , High-Throughput Nucleotide Sequencing , Humans , Computational Biology , Quality Control , INDEL Mutation , Single Nucleotide Polymorphism
2.
J Alzheimers Dis ; 87(2): 583-594, 2022.
Article in English | MEDLINE | ID: mdl-35311706

ABSTRACT

BACKGROUND: Structural brain imaging metrics and gene expression biomarkers have previously been used for Alzheimer's disease (AD) diagnosis and prognosis, but none of these studies explored integration of imaging and gene expression biomarkers for predicting mild cognitive impairment (MCI)-to-AD conversion 1-2 years into the future. OBJECTIVE: We investigated advantages of combining gene expression and structural brain imaging features for predicting MCI-to-AD conversion. Selection of the differentially expressed genes (DEGs) for classifying cognitively normal (CN) controls and AD patients was benchmarked against previously reported results. METHODS: The current work proposes integrating brain imaging and blood gene expression data from two public datasets (ADNI and ANM) to predict MCI-to-AD conversion. A novel pipeline for combining gene expression data from multiple platforms is proposed and evaluated in the two independent patient cohorts. RESULTS: Combining DEGs and imaging biomarkers for predicting MCI-to-AD conversion yielded 0.832-0.876 receiver operating characteristic (ROC) area under the curve (AUC), which exceeded the 0.808-0.840 AUC from using the imaging features alone. Using only three DEGs, the CN versus AD predictive model achieved 0.718, 0.858, and 0.873 cross-validation AUC for the ADNI, ANM1, and ANM2 datasets. CONCLUSION: For the first time we show that combining gene expression and imaging biomarkers yields better predictive performance than using imaging metrics alone. A novel pipeline for combining gene expression data from multiple platforms is proposed and evaluated to produce consistent results in the two independent patient cohorts. Using improved feature selection, we show that predictive models with fewer gene expression probes can achieve competitive performance.


Subjects
Alzheimer Disease , Cognitive Dysfunction , Alzheimer Disease/diagnostic imaging , Alzheimer Disease/genetics , Biomarkers , Brain/diagnostic imaging , Cognitive Dysfunction/diagnostic imaging , Cognitive Dysfunction/genetics , Disease Progression , Gene Expression , Humans , Magnetic Resonance Imaging/methods
3.
Genome Biol ; 22(1): 332, 2021 12 06.
Article in English | MEDLINE | ID: mdl-34872606

ABSTRACT

BACKGROUND: Cytosine modifications in DNA such as 5-methylcytosine (5mC) underlie a broad range of developmental processes, maintain cellular lineage specification, and can define or stratify types of cancer and other diseases. However, the wide variety of approaches available to interrogate these modifications has created a need for harmonized materials, methods, and rigorous benchmarking to improve genome-wide methylome sequencing applications in clinical and basic research. Here, we present a multi-platform assessment and cross-validated resource for epigenetics research from the FDA's Epigenomics Quality Control Group. RESULTS: Each sample is processed in multiple replicates by three whole-genome bisulfite sequencing (WGBS) protocols (TruSeq DNA methylation, Accel-NGS MethylSeq, and SPLAT), oxidative bisulfite sequencing (TrueMethyl), an enzymatic deamination method (EMSeq), targeted methylation sequencing (Illumina Methyl Capture EPIC), single-molecule long-read nanopore sequencing from Oxford Nanopore Technologies, and 850k Illumina methylation arrays. After rigorous quality assessment and comparison to Illumina EPIC methylation microarrays and testing on a range of algorithms (Bismark, BitMapperBS, and bwa-meth), we find overall high concordance between assays, but also differences in efficiency of read mapping, CpG capture, coverage, and platform performance, and variable performance across 26 microarray normalization algorithms. CONCLUSIONS: The data provided herein can guide the use of these DNA reference materials in epigenomics research, as well as provide best practices for experimental design in future studies. By leveraging seven human cell lines that are designated as publicly available reference materials, these data can be used as a baseline to advance epigenomics research.


Subjects
Genetic Epigenesis , Epigenomics/methods , Quality Control , 5-Methylcytosine , Algorithms , CpG Islands , DNA/genetics , DNA Methylation , Epigenome , Human Genome , High-Throughput Nucleotide Sequencing , Humans , Sequence Alignment , DNA Sequence Analysis/methods , Sulfites , Whole Genome Sequencing/methods
5.
Nat Biotechnol ; 28(8): 827-38, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20676074

ABSTRACT

Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.


Subjects
Liver Diseases/genetics , Lung Diseases/genetics , Neoplasms/genetics , Neoplasms/mortality , Oligonucleotide Array Sequence Analysis/methods , Oligonucleotide Array Sequence Analysis/standards , Animals , Breast Neoplasms/diagnosis , Breast Neoplasms/genetics , Animal Disease Models , Female , Gene Expression Profiling/methods , Gene Expression Profiling/standards , Guidelines as Topic , Humans , Liver Diseases/etiology , Liver Diseases/pathology , Lung Diseases/etiology , Lung Diseases/pathology , Multiple Myeloma/diagnosis , Multiple Myeloma/genetics , Neoplasms/diagnosis , Neuroblastoma/diagnosis , Neuroblastoma/genetics , Predictive Value of Tests , Quality Control , Rats , Survival Analysis