Results 1 - 4 of 4
1.
Cytometry A ; 101(4): 351-360, 2022 04.
Article in English | MEDLINE | ID: mdl-34967113

ABSTRACT

Mislabeling samples or data with the wrong participant information can affect study integrity and lead investigators to draw inaccurate conclusions. Quality control to prevent these types of errors is commonly embedded into the analysis of genomic datasets, but a similar identification strategy is not standard for cytometric data. Here, we present a method for detecting sample identification errors in cytometric data using expression of human leukocyte antigen (HLA) class I alleles. We measured HLA-A*02 and HLA-B*07 expression in three longitudinal samples from each of 41 participants using a 33-marker CyTOF panel designed to identify major immune cell types. Three of the 123 samples (2.4%) showed HLA allele expression that did not match their longitudinal pairs. Furthermore, the cytometric signatures of these same three samples did not match qPCR HLA class I allele data, suggesting that they were accurately identified as mismatches. We conclude that this technique is useful for detecting sample-labeling errors in cytometric analyses of longitudinal data. This technique could also be used in conjunction with another method, like GWAS or PCR, to detect errors in cross-sectional data. We suggest that widespread adoption of this or similar techniques will improve the quality of clinical studies that utilize cytometry.
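
The longitudinal consistency check described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes each sample has already been reduced to a positive/negative call for HLA-A*02 and HLA-B*07, and the function name `flag_mismatches`, the participant IDs, and the example data are all hypothetical.

```python
from collections import Counter

def flag_mismatches(calls):
    """Flag samples whose HLA call disagrees with the participant's majority.

    calls maps (participant_id, timepoint) -> (a02_positive, b07_positive).
    The binary calls are an assumption; how they are derived from CyTOF
    intensities is outside this sketch.
    """
    by_participant = {}
    for (pid, tp), call in calls.items():
        by_participant.setdefault(pid, []).append((tp, call))

    flagged = []
    for pid, samples in by_participant.items():
        counts = Counter(call for _, call in samples)
        majority, n = counts.most_common(1)[0]
        if n * 2 <= len(samples):
            # No strict majority: every sample of this participant is suspect.
            flagged.extend((pid, tp) for tp, _ in samples)
        else:
            flagged.extend((pid, tp) for tp, call in samples if call != majority)
    return sorted(flagged)

# Hypothetical data: three timepoints for each of two participants.
example_calls = {
    ("P01", 1): (True, False), ("P01", 2): (True, False), ("P01", 3): (True, False),
    ("P02", 1): (False, True), ("P02", 2): (True, True), ("P02", 3): (False, True),
}
flagged = flag_mismatches(example_calls)
print(flagged)
```

A flagged sample would then be checked against an independent genotype source, such as the qPCR data the abstract mentions.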


Subjects
Cross-Sectional Studies; Alleles; Humans; Real-Time Polymerase Chain Reaction
2.
BMC Genomics ; 20(Suppl 12): 1001, 2019 Dec 30.
Article in English | MEDLINE | ID: mdl-31888490

ABSTRACT

BACKGROUND: Inadvertent sample swaps are a real threat to data quality in any medium- to large-scale omics study. While matches between samples from the same individual can in principle be identified from a few well-characterized single nucleotide polymorphisms (SNPs), omics data types often provide only low to moderate coverage, thus requiring integration of evidence from a large number of SNPs to determine whether two samples derive from the same individual. METHODS: We select about six thousand SNPs in the human genome and develop a Bayesian framework that is able to robustly identify sample matches between next-generation sequencing data sets. RESULTS: We validate our approach on a variety of data sets. Most importantly, we show that our approach can establish identity between different omics data types such as Exome, RNA-Seq, and MethylCap-Seq. We demonstrate how identity detection degrades with sample quality and read coverage, but show that twenty million reads of a fairly low-quality RNA-Seq sample are still sufficient for reliable sample identification. CONCLUSION: Our tool, SMASH, is able to identify sample mismatches in next-generation sequencing data sets between different sequencing modalities and for low-quality sequencing data.
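
To illustrate how evidence from many SNPs can be integrated in a Bayesian way, here is a minimal sketch. It is not SMASH's actual model: it assumes discrete genotype calls (0, 1, or 2 alternate alleles), Hardy-Weinberg genotype priors, a single symmetric genotyping error rate, and independent SNPs. The function names and default parameters are hypothetical.

```python
import math

def hwe_prior(p):
    """Hardy-Weinberg genotype prior for alternate-allele frequency p."""
    q = 1.0 - p
    return [q * q, 2.0 * p * q, p * p]

def obs_lik(obs, true_g, err):
    """P(observed genotype | true genotype) under a symmetric error model."""
    return 1.0 - err if obs == true_g else err / 2.0

def posterior_same(geno_a, geno_b, freqs, err=0.05, prior_same=0.5):
    """Posterior probability that two samples come from the same individual.

    geno_a, geno_b: genotype calls (0/1/2) at the same SNPs.
    freqs: assumed population alternate-allele frequency per SNP.
    """
    log_odds = math.log(prior_same) - math.log(1.0 - prior_same)
    for a, b, p in zip(geno_a, geno_b, freqs):
        prior = hwe_prior(p)
        # Same individual: both observations share one true genotype.
        same = sum(prior[g] * obs_lik(a, g, err) * obs_lik(b, g, err) for g in range(3))
        # Different individuals: observations are independent draws.
        marg_a = sum(prior[g] * obs_lik(a, g, err) for g in range(3))
        marg_b = sum(prior[g] * obs_lik(b, g, err) for g in range(3))
        log_odds += math.log(same) - math.log(marg_a * marg_b)
    # Numerically stable logistic transform.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)

# Hypothetical example: 20 SNPs, all with allele frequency 0.5.
freqs = [0.5] * 20
sample = [0, 1, 2, 1] * 5
print(posterior_same(sample, sample, freqs))
print(posterior_same([0] * 20, [2] * 20, freqs))
```

Each SNP contributes only a modest log-odds shift, which is why low-coverage data needs many sites before the posterior becomes decisive.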


Subjects
Genomics/methods; Polymorphism, Single Nucleotide/genetics; Software; Bayes Theorem; Genome, Human/genetics; High-Throughput Nucleotide Sequencing; Humans; Reproducibility of Results; Sequence Analysis, DNA
3.
Gigascience ; 13, 2024 01 02.
Article in English | MEDLINE | ID: mdl-38832466

ABSTRACT

BACKGROUND: Due to human error, sample swapping remains a common issue in large cohort studies with heterogeneous data types (e.g., a mix of Oxford Nanopore Technologies, Pacific Biosciences, and Illumina data). At present, all sample swapping detection methods require alignment, positional sorting, and indexing of the data in order to compare samples, steps that are costly and may be unnecessary (e.g., if data are only used for genome assembly). As studies include more samples and new sequencing data types, robust quality control tools will become increasingly important. FINDINGS: The similarity between samples can be determined using indexed k-mer sequence variants. To increase statistical power, we use coverage information on variant sites, calculating similarity using a likelihood ratio-based test. The per-sample error rate and coverage bias (i.e., missing sites) can also be estimated from this information and used to determine whether a spatially indexed principal component analysis (PCA)-based prescreening method is applicable; such prescreening can greatly speed up analysis by avoiding exhaustive all-to-all comparisons. CONCLUSIONS: Because this tool processes raw data, is faster than alignment, and can be used on very low-coverage data, it can save substantial computational resources in standard quality control (QC) pipelines. It is robust enough to be used on different sequencing data types, which is important in studies that leverage the strengths of different sequencing technologies. In addition to its primary use case of sample swap detection, this method also provides information useful in QC, such as error rate and coverage bias, as well as population-level PCA ancestry analysis visualization.
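
The idea of reading variant-site coverage straight from raw reads with indexed k-mers can be sketched as follows. This is an illustrative sketch only, not the published tool: it assumes each variant site is represented by one reference k-mer and one alternate k-mer of equal length, that no k-mer is its own reverse complement or collides with another site's k-mer, and all names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return seq.translate(_COMPLEMENT)[::-1]

def build_index(variants):
    """Map each allele k-mer (and its reverse complement) to (site, allele).

    variants: dict site_id -> (ref_kmer, alt_kmer).
    """
    index = {}
    for site, (ref, alt) in variants.items():
        for allele, kmer in (("ref", ref), ("alt", alt)):
            index[kmer] = (site, allele)
            index[revcomp(kmer)] = (site, allele)
    return index

def count_alleles(reads, index, k):
    """Count ref/alt k-mer hits per site by sliding a window over raw reads."""
    counts = {}
    for read in reads:
        for i in range(len(read) - k + 1):
            hit = index.get(read[i:i + k])
            if hit is not None:
                site, allele = hit
                counts.setdefault(site, {"ref": 0, "alt": 0})[allele] += 1
    return counts

# Hypothetical site with 5-mers; real tools would use much longer k-mers.
variants = {"s1": ("GATTC", "GACTC")}
reads = ["AAGATTCAA", "CCGAGTCCC", "TTGACTCTT"]
counts = count_alleles(reads, build_index(variants), k=5)
print(counts)
```

Per-site ref/alt counts of this kind are the coverage information that a likelihood ratio-based comparison between two samples could then consume, with no alignment or sorting step.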


Subjects
High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA; Humans; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Software; Principal Component Analysis; Computational Biology/methods; Algorithms
4.
J Microbiol Methods ; 197: 106482, 2022 06.
Article in English | MEDLINE | ID: mdl-35551970

ABSTRACT

In the Netherlands, local laboratories are involved in the primary diagnosis of tuberculosis. Positive Mycobacterium tuberculosis complex cultures are sent to the National Institute for Public Health and the Environment (RIVM) for species identification, epidemiological typing, and screening for resistance by Whole Genome Sequencing (WGS). Occasional sample-swaps and cross-contaminations are known to occur in the diagnostic procedures. Such errors may lead to incorrect diagnoses resulting in the unnecessary or sub-optimal treatment of patients. Internal controls throughout the process ideally allow the early detection of such mistakes.


Subjects
Mycobacterium tuberculosis; Tuberculosis, Lymph Node; DNA; Genome, Bacterial; Humans; Mycobacterium tuberculosis/genetics; Whole Genome Sequencing/methods