Results 1 - 20 of 29
1.
Genome Biol ; 25(1): 163, 2024 06 20.
Article in English | MEDLINE | ID: mdl-38902799

ABSTRACT

BACKGROUND: Copy number variation (CNV) is a key genetic characteristic for cancer diagnostics and can be used as a biomarker for the selection of therapeutic treatments. Using data sets established in our previous study, we benchmark the performance of cancer CNV calling by six of the most recent and commonly used software tools in terms of detection accuracy, sensitivity, and reproducibility. By comparison with orthogonal methods, such as microarray and Bionano, we also explore the consistency of CNV calling across different technologies on a challenging genome. RESULTS: While consistent results are observed for copy gain, loss, and loss of heterozygosity (LOH) calls across sequencing centers, CNV callers, and different technologies, variation in CNV calls is mostly affected by the determination of genome ploidy. Using consensus results from six CNV callers and confirmation from three orthogonal methods, we establish a high-confidence CNV call set for the reference cancer cell line (HCC1395). CONCLUSIONS: NGS technologies and current bioinformatics tools can offer reliable results for detection of copy gain, loss, and LOH. However, when working with a hyper-diploid genome, some software tools can call excessive copy gain or loss due to inaccurate assessment of genome ploidy. With performance matrices on various experimental conditions, this study raises awareness within the cancer research community for the selection of sequencing platforms, sample preparation, sequencing coverage, and the choice of CNV detection tools.


Subjects
Computational Biology , DNA Copy Number Variations , High-Throughput Nucleotide Sequencing , Loss of Heterozygosity , Neoplasms , Software , Humans , High-Throughput Nucleotide Sequencing/methods , Neoplasms/genetics , Computational Biology/methods , Diploidy , Human Genome , Tumor Cell Line , Reproducibility of Results , DNA Sequence Analysis/methods
2.
BMC Med Imaging ; 24(1): 122, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38789963

ABSTRACT

In response to the low real-time performance and accuracy of traditional sports injury monitoring, this article conducts research on a real-time injury monitoring system using the SVM model as an example. Video detection is performed to capture human movements, followed by human joint detection. Polynomial fitting analysis is used to extract joint motion patterns, and the average of training data is calculated as a reference point. The raw data is then normalized to adjust position and direction, and dimensionality reduction is achieved through singular value decomposition to enhance processing efficiency and model training speed. A support vector machine classifier is used to classify and identify the processed data. The experimental section monitors sports injuries and investigates the accuracy of the system's monitoring. Compared to mainstream models such as Random Forest and Naive Bayes, the SVM utilized demonstrates good performance in accuracy, sensitivity, and specificity, reaching 94.2%, 92.5%, and 96.0% respectively.
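The accuracy, sensitivity, and specificity figures quoted above all derive from the counts of a binary confusion matrix. The following is a minimal sketch of that arithmetic, assuming labels are encoded as 1 (injury) and 0 (no injury) and that both classes are present; the function name `binary_metrics` and the label encoding are illustrative assumptions, not details taken from the article.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for 0/1 labels.

    Assumes at least one positive and one negative example in y_true;
    otherwise the corresponding ratio is undefined (division by zero).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # fraction of true injuries detected
        "specificity": tn / (tn + fp),  # fraction of non-injuries correctly rejected
    }
```

Sensitivity and specificity are reported separately because a monitor that rarely raises an alarm can score high accuracy on imbalanced data while still missing injuries.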


Subjects
Athletic Injuries , Deep Learning , Support Vector Machine , Humans , Athletic Injuries/diagnostic imaging , Video Recording , Sensitivity and Specificity , Algorithms
3.
Food Chem ; 447: 138969, 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-38507947

ABSTRACT

Food authenticity is extremely important, and widely targeted bi-omics is a promising pipeline because it incorporates metabolomics and peptidomics. Colla Corii Asini (CCA, Ejiao) is one of the most popular tonic edible materials, and counterfeit and adulterated products are widespread. An attempt was made to develop a high-throughput and reliable DI-MRM3 program to facilitate widely targeted bi-omics of CCA. First, a predictive MRM program captured metabolites and peptides in trypsin-digested gelatins. After data alignment and structure annotation, primary parameters such as Q1 → Q3 → QLIT, CE, and EE were optimized for all 17 metabolites and 34 peptides by online ER-MS. Although a single run took only 6.5 min, high selectivity was achieved for each analyte. Statistical results showed that nine peptides contributed to distinguishing CCA from other gelatins. After cross-validation with LC-MRM, DI-MRM3 was shown to be reproducible and high-throughput for widely targeted bi-omics of CCA, suggesting that it is a meaningful tool for food authenticity.


Subjects
Gelatin , Peptides , Gelatin/chemistry , Metabolomics , China
4.
Nanoscale ; 16(12): 6199-6214, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38446101

ABSTRACT

While the filtering and accumulation effects of the extracellular matrix (ECM) on nanoparticles (NPs) have been experimentally observed, the detailed interactions between NPs and specific biomolecules within the ECM remain poorly understood and pose challenges for in vivo molecular-level investigations. Herein, we adopt molecular dynamics simulations to elucidate the impacts of methyl-, hydroxy-, amine-, and carboxyl-modified gold NPs on the cell-binding domains of fibronectin (Fn), an indispensable component of the ECM for cell attachment and signaling. Simulation results show that NPs can specifically bind to distinct Fn domains, and the strength of these interactions depends on the physicochemical properties of NPs. NP-NH3+ exhibits the highest affinity to domains rich in acidic residues, leading to strong electrostatic interactions that induce severe deformation, potentially disrupting the normal functioning of Fn. NP-CH3 and NP-COO- selectively occupy the RGD/PHSRN motifs, which may hinder their recognition by integrins on the cell surface. Additionally, NPs can disrupt the dimerization of Fn through competing for residues at the dimer interface or by diminishing the shape complementarity between dimerized proteins. The mechanical stretching of Fn, crucial for ECM fibrillogenesis, is suppressed by NPs due to their local rigidifying effect. These results provide valuable molecular-level insights into the impacts of various NPs on the ECM, holding significant implications for advancing nanomedicine and nanosafety evaluation.


Subjects
Fibronectins , Nanoparticles , Fibronectins/chemistry , Integrins/metabolism , Extracellular Matrix/metabolism , Signal Transduction
5.
Sci Rep ; 14(1): 7028, 2024 03 25.
Article in English | MEDLINE | ID: mdl-38528062

ABSTRACT

Accurate indel calling plays an important role in precision medicine. A benchmarking indel set is essential for thoroughly evaluating the indel calling performance of bioinformatics pipelines. A reference sample with a set of known-positive variants was developed in the FDA-led Sequencing Quality Control Phase 2 (SEQC2) project, but the known indels in the known-positive set were limited. This project sought to provide an enriched set of known indels that would be more translationally relevant by focusing on additional cancer related regions. A thorough manual review process completed by 42 reviewers, two advisors, and a judging panel of three researchers significantly enriched the known indel set by an additional 516 indels. The extended benchmarking indel set has a large range of variant allele frequencies (VAFs), with 87% of them having a VAF below 20% in reference Sample A. The reference Sample A and the indel set can be used for comprehensive benchmarking of indel calling across a wider range of VAF values in the lower range. Indel length was also variable, but the majority were under 10 base pairs (bps). Most of the indels were within coding regions, with the remainder in the gene regulatory regions. Although high confidence can be derived from the robust study design and meticulous human review, this extensive indel set has not undergone orthogonal validation. The extended benchmarking indel set, along with the indels in the previously published known-positive set, was the truth set used to benchmark indel calling pipelines in a community challenge hosted on the precisionFDA platform. This benchmarking indel set and reference samples can be utilized for a comprehensive evaluation of indel calling pipelines. Additionally, the insights and solutions obtained during the manual review process can aid in improving the performance of these pipelines.


Subjects
Benchmarking , High-Throughput Nucleotide Sequencing , Humans , Computational Biology , Quality Control , INDEL Mutation , Single Nucleotide Polymorphism
6.
Genome Biol ; 25(1): 34, 2024 01 24.
Article in English | MEDLINE | ID: mdl-38268000

ABSTRACT

BACKGROUND: Various laboratory-developed metabolomic methods lead to big challenges in inter-laboratory comparability and effective integration of diverse datasets. RESULTS: As part of the Quartet Project, we establish a publicly available suite of four metabolite reference materials derived from B lymphoblastoid cell lines from a family of parents and monozygotic twin daughters. We generate comprehensive LC-MS-based metabolomic data from the Quartet reference materials using targeted and untargeted strategies in different laboratories. The Quartet multi-sample-based signal-to-noise ratio enables objective assessment of the reliability of intra-batch and cross-batch metabolomics profiling in detecting intrinsic biological differences among the four groups of samples. Significant variations in the reliability of the metabolomics profiling are identified across laboratories. Importantly, ratio-based metabolomics profiling, by scaling the absolute values of a study sample relative to those of a common reference sample, enables cross-laboratory quantitative data integration. Thus, we construct the ratio-based high-confidence reference datasets between two reference samples, providing "ground truth" for inter-laboratory accuracy assessment, which enables objective evaluation of quantitative metabolomics profiling using various instruments and protocols. CONCLUSIONS: Our study provides the community with rich resources and best practices for inter-laboratory proficiency tests and data integration, ensuring reliability of large-scale and longitudinal metabolomic studies.


Subjects
Liquid Chromatography-Mass Spectrometry , Metabolomics , Humans , Reproducibility of Results , Cell Line , Monozygotic Twins
7.
Genome Biol ; 24(1): 277, 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38049885

ABSTRACT

BACKGROUND: Recent state-of-the-art sequencing technologies enable the investigation of challenging regions in the human genome and expand the scope of variant benchmarking datasets. Herein, we sequence a Chinese Quartet, comprising two monozygotic twin daughters and their biological parents, using four short and long sequencing platforms (Illumina, BGI, PacBio, and Oxford Nanopore Technology). RESULTS: The long reads from the monozygotic twin daughters are phased into paternal and maternal haplotypes using the parent-child genetic map and for each haplotype. We also use long reads to generate haplotype-resolved whole-genome assemblies with completeness and continuity exceeding that of GRCh38. Using this Quartet, we comprehensively catalogue the human variant landscape, generating a dataset of 3,962,453 SNVs, 886,648 indels (< 50 bp), 9726 large deletions (≥ 50 bp), 15,600 large insertions (≥ 50 bp), 40 inversions, 31 complex structural variants, and 68 de novo mutations which are shared between the monozygotic twin daughters. Variants underrepresented in previous benchmarks owing to their complexity-including those located at long repeat regions, complex structural variants, and de novo mutations-are systematically examined in this study. CONCLUSIONS: In summary, this study provides high-quality haplotype-resolved assemblies and a comprehensive set of benchmarking resources for two Chinese monozygotic twin samples which, relative to existing benchmarks, offers expanded genomic coverage and insight into complex variant categories.


Subjects
Benchmarking , East Asian People , Monozygotic Twins , Humans , East Asian People/genetics , Genomics , Haplotypes , High-Throughput Nucleotide Sequencing , DNA Sequence Analysis , Monozygotic Twins/genetics , Twin Studies as Topic
8.
Genome Biol ; 24(1): 270, 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38012772

ABSTRACT

BACKGROUND: Genomic DNA reference materials are widely recognized as essential for ensuring data quality in omics research. However, relying solely on reference datasets to evaluate the accuracy of variant calling results is incomplete, as they are limited to benchmark regions. Therefore, it is important to develop DNA reference materials that enable the assessment of variant detection performance across the entire genome. RESULTS: We established a DNA reference material suite from four immortalized cell lines derived from a family of parents and monozygotic twins. Comprehensive reference datasets of 4.2 million small variants and 15,000 structural variants were integrated and certified for evaluating the reliability of germline variant calls inside the benchmark regions. Importantly, the genetic built-in-truth of the Quartet family design enables estimation of the precision of variant calls outside the benchmark regions. Using the Quartet reference materials along with study samples, batch effects are objectively monitored and alleviated by training a machine learning model with the Quartet reference datasets to remove potential artifact calls. Moreover, the matched RNA and protein reference materials and datasets from the Quartet project enable cross-omics validation of variant calls from multiomics data. CONCLUSIONS: The Quartet DNA reference materials and reference datasets provide a unique resource for objectively assessing the quality of germline variant calls throughout the whole-genome regions and improving the reliability of large-scale genomic profiling.
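One way to read the "built-in truth" of a parents-plus-monozygotic-twins design is as a pedigree consistency check: at any site, the twins should carry the same genotype, and that genotype should be formable from one paternal and one maternal allele. The sketch below illustrates that idea only. It assumes autosomal diploid genotypes encoded as allele pairs, ignores de novo mutations, and uses hypothetical function names; it is not the authors' pipeline.

```python
def quartet_consistent(father, mother, twin1, twin2):
    """Check one site for twin identity and Mendelian consistency.

    Genotypes are pairs of allele codes, e.g. (0, 1) for a heterozygote.
    """
    # Monozygotic twins are expected to share the same genotype.
    if sorted(twin1) != sorted(twin2):
        return False
    a, b = twin1
    # One allele must come from each parent, in either assignment.
    return (a in father and b in mother) or (b in father and a in mother)


def consistency_rate(sites):
    """Fraction of sites, given as (father, mother, twin1, twin2), that pass."""
    flags = [quartet_consistent(*site) for site in sites]
    return sum(flags) / len(flags)
```

Under these assumptions, sites that fail the check are candidates for calling errors, which is why such a rate can serve as a proxy for precision where no benchmark calls exist.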


Subjects
Benchmarking , Human Genome , Humans , Reproducibility of Results , Single Nucleotide Polymorphism , Germ Cells , High-Throughput Nucleotide Sequencing/methods
9.
Genome Biol ; 24(1): 245, 2023 10 26.
Article in English | MEDLINE | ID: mdl-37884999

ABSTRACT

The Quartet Data Portal facilitates community access to well-characterized reference materials, reference datasets, and related resources established based on a family of four individuals with identical twins from the Quartet Project. Users can request DNA, RNA, protein, and metabolite reference materials, as well as datasets generated across omics, platforms, labs, protocols, and batches. Reproducible analysis tools allow for objective performance assessment of user-submitted data, while interactive visualization tools support rapid exploration of reference datasets. A closed-loop "distribution-collection-evaluation-integration" workflow enables updates and integration of community-contributed multiomics data. Ultimately, this portal helps promote the advancement of reference datasets and multiomics quality control.


Subjects
Multiomics , Software , Humans , Quality Control
11.
Nat Biotechnol ; 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37679545

ABSTRACT

Certified RNA reference materials are indispensable for assessing the reliability of RNA sequencing to detect intrinsically small biological differences in clinical settings, such as molecular subtyping of diseases. As part of the Quartet Project for quality control and data integration of multi-omics profiling, we established four RNA reference materials derived from immortalized B-lymphoblastoid cell lines from four members of a monozygotic twin family. Additionally, we constructed ratio-based transcriptome-wide reference datasets between two samples, providing cross-platform and cross-laboratory 'ground truth'. Investigation of the intrinsically subtle biological differences among the Quartet samples enables sensitive assessment of cross-batch integration of transcriptomic measurements at the ratio level. The Quartet RNA reference materials, combined with the ratio-based reference datasets, can serve as unique resources for assessing and improving the quality of transcriptomic data in clinical and biological settings.

12.
Nat Biotechnol ; 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37679543

ABSTRACT

Characterization and integration of the genome, epigenome, transcriptome, proteome and metabolome of different datasets is difficult owing to a lack of ground truth. Here we develop and characterize suites of publicly available multi-omics reference materials of matched DNA, RNA, protein and metabolites derived from immortalized cell lines from a family quartet of parents and monozygotic twin daughters. These references provide built-in truth defined by relationships among the family members and the information flow from DNA to RNA to protein. We demonstrate how using a ratio-based profiling approach that scales the absolute feature values of a study sample relative to those of a concurrently measured common reference sample produces reproducible and comparable data suitable for integration across batches, labs, platforms and omics types. Our study identifies reference-free 'absolute' feature quantification as the root cause of irreproducibility in multi-omics measurement and data integration and establishes the advantages of ratio-based multi-omics profiling with common reference materials.

13.
Genome Biol ; 24(1): 201, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37674217

ABSTRACT

BACKGROUND: Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms have been proposed to facilitate data integration. However, their respective advantages and limitations are not adequately assessed in terms of omics types, performance metrics, and application scenarios. RESULTS: As part of the Quartet Project for quality control and data integration of multiomics profiling, we comprehensively assess the performance of seven batch-effect correction algorithms based on different performance metrics of clinical relevance, i.e., the accuracy of identifying differentially expressed features, the robustness of predictive models, and the ability to accurately cluster cross-batch samples into their own donors. The ratio-based method, i.e., scaling absolute feature values of study samples relative to those of concurrently profiled reference material(s), is found to be much more effective and broadly applicable than others, especially when batch effects are completely confounded with biological factors of study interest. We further provide practical guidelines for implementing the ratio-based approach in increasingly large-scale multiomics studies. CONCLUSIONS: Multiomics measurements are prone to batch effects, which can be effectively corrected using ratio-based scaling of the multiomics data. Our study lays the foundation for eliminating batch effects at a ratio scale.
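The ratio-based scaling described above can be illustrated with a small sketch: within each batch, every feature value of a study sample is divided by the value of the same feature in the reference sample(s) profiled in that batch. The data layout, the function name `ratio_scale`, and the choice to average multiple reference replicates are assumptions made for illustration; the abstract does not specify them, and a log transform of the ratio is a common variant that is omitted here.

```python
from statistics import mean


def ratio_scale(batch, reference_ids):
    """Scale study samples in one batch relative to its reference sample(s).

    batch maps sample id -> {feature name: absolute value}.
    reference_ids is the set of sample ids that are reference material.
    Assumes every sample has the same features and reference values are nonzero.
    """
    features = list(next(iter(batch.values())).keys())
    # Per-feature reference level, averaged over reference replicates.
    ref_level = {f: mean(batch[r][f] for r in reference_ids) for f in features}
    return {
        sample: {f: values[f] / ref_level[f] for f in features}
        for sample, values in batch.items()
        if sample not in reference_ids  # return study samples only
    }
```

Because a multiplicative batch effect shifts the study and reference samples of a batch by the same factor, it cancels in the ratio, which is the intuition behind comparing batches on the ratio scale.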


Subjects
Algorithms , Multiomics , Base Composition , Benchmarking , Clinical Relevance
14.
Eur J Med Chem ; 260: 115728, 2023 Nov 15.
Article in English | MEDLINE | ID: mdl-37625288

ABSTRACT

The mitochondria have been identified as key targets in nonalcoholic fatty liver disease (NAFLD), one of the most prevalent chronic liver damage diseases globally. Meanwhile, the biological information analysis in this study revealed that SIRT1, PPARG, PPARA, and PPARGC1A (mitochondrial biogenesis-related proteins) were NAFLD therapeutic targets. Therefore, the design and synthesis of targeted drugs that promote mitochondrial biogenesis and improve mitochondrial function are particularly important for NAFLD treatment. Recently, we introduced butyls, hydroxyls, and halogens to benzophenone and synthesized a series of NAFLD-related 4-butylpolyhydroxybenzophenone compounds, aiming at investigating the hepatoprotective activity from the aspect of mitochondrial biogenesis. The structure-activity relationship demonstrated that hydroxyl and ketone groups were active groups interacting with mitochondrial biogenesis proteins (SIRT1 and PGC1α), and the activity was stronger when the o-hydroxyl group was present on the benzene ring. In contrast, the activity was little affected by the presence of the p-hydroxyl group, m-hydroxyl group, butyl group type, or halogen. In addition, in vitro studies confirmed that these compounds could directly bind to SIRT1 and PGC1α, markedly promote their interaction, significantly increase the expression of proteins and genes related to mitochondrial biogenesis (SIRT1, PGC1α, NRF1, TFAM, COX1, and ND6) and subsequently ameliorate mitochondria dysfunction, which was evidenced by the decreased ROS, upregulated ATP production, increased MMP, and enhanced mitochondrial number. According to the outcomes of our in vitro and in vivo experiments, 4-butyl-polyhydroxybenzophenone compounds could also effectively reduce the formation of lipid droplets and liver injury index (ALT, AST, LDH, AKP, γ-GT, and GDH) and improve the level of antioxidant enzymes (GSH and SOD). 
In particular, treatment with these compounds after a high-fat diet significantly reduced body weight, decreased liver coefficient, attenuated liver damage, and ameliorated lipid accumulation in rat liver, demonstrating their therapeutic effects on NAFLD. Mechanistically, 4-butyl-polyhydroxybenzophenone compounds promoted mitochondrial biogenesis and eventually prevented NAFLD liver injury by activating the PGC1α signaling pathway in a SIRT1-dependent manner, which was strongly supported by the SIRT1 inhibitor EX527.


Subjects
Non-alcoholic Fatty Liver Disease , Animals , Rats , Halogens , Non-alcoholic Fatty Liver Disease/drug therapy , Organelle Biogenesis , Peroxisome Proliferator-Activated Receptor Gamma Coactivator 1-alpha , Sirtuin 1
15.
Anal Methods ; 15(21): 2588-2598, 2023 06 01.
Article in English | MEDLINE | ID: mdl-37226530

ABSTRACT

The homeostasis of the bile acid (BA) submetabolome, which is composed of hundreds of correlated BA species, contributes substantially to maintaining physiological status. It is challenging to understand the transformational rules amongst endogenous BAs, but it is viable to profile the in vitro metabolism of BA analogues, as a compromise approach to isotopic labeling of BAs, to deduce the metabolism of BAs. An attempt is made here to characterize the metabolites of 23-nordeoxycholic acid (norDCA), a deoxycholic acid analogue with a C23-CH2 defect, after in vitro incubation with enzyme-enriched liver subcellular fractions of mouse, rat or human. A predictive multiple-reaction monitoring mode was deployed for sensitive metabolite detection, leading to the capture of twelve metabolites (M1-M12). After putative structural annotation by analyzing MS/MS spectra, special attention was paid to isomeric identification. Dozens of authentic BAs were collected and measured for modeling of the quantitative structure-retention time relationships. Because modifications in LC-MS/MS behaviors in response to the C23-CH2 difference were characterized by comparing several pairs, the rules of a 14.02 Da shift and a 2.4-4.2 min distance were applied to improve identification confidence by matching with several authentic BAs bearing C23-CH2 additions compared to the metabolites. Consequently, confirmative structural identification was achieved for all metabolites. Metabolic pathways leading to M1-M12 were proposed, and hydroxylation, oxidation, epimerization, sulfation, and glucuronidation served as the primary metabolism channels for norDCA. Together, the findings provide meaningful information about the correlations between different endogenous BAs, and the structural identification strategy offers a promising idea when facing an isomeric discrimination challenge.
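The matching rule mentioned above (a 14.02 Da mass shift and a 2.4-4.2 min retention-time distance between a metabolite and an authentic BA bearing the extra CH2) can be expressed as a simple predicate. This is a minimal sketch under stated assumptions: the mass tolerance is hypothetical, the retention-time distance is treated as an absolute difference because the abstract does not state its direction, and the function name is illustrative.

```python
MASS_SHIFT_DA = 14.02      # CH2 mass shift quoted in the abstract
RT_GAP_MIN = (2.4, 4.2)    # retention-time distance range quoted in the abstract
MASS_TOL_DA = 0.01         # hypothetical tolerance, not from the abstract


def matches_ch2_homolog(metabolite, authentic):
    """Return True if an authentic BA looks like the CH2-extended counterpart.

    Each argument is a (m/z, retention time in minutes) pair.
    """
    mass_diff = authentic[0] - metabolite[0]
    rt_gap = abs(authentic[1] - metabolite[1])  # direction is an assumption
    mass_ok = abs(mass_diff - MASS_SHIFT_DA) <= MASS_TOL_DA
    rt_ok = RT_GAP_MIN[0] <= rt_gap <= RT_GAP_MIN[1]
    return mass_ok and rt_ok
```

Requiring both conditions is what narrows the candidates: isomeric authentic standards share the same mass, so the retention-time window is the criterion that discriminates among them.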


Subjects
Bile Acids and Salts , Tandem Mass Spectrometry , Rats , Humans , Mice , Animals , Liquid Chromatography , Deoxycholic Acid
16.
J Hazard Mater ; 453: 131430, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37080032

ABSTRACT

By linking the cation and anion motifs of ionic liquids (ILs), zwitterionic liquids (ZILs) exhibit at least 146-2740-fold and 112-1550-fold lower cytotoxicity in human gastric and colon cells, respectively, than the structurally related ILs. Computer simulation shows that ZIL molecules hardly penetrate the cell membranes, in contrast to ILs. These findings reveal a novel mechanism by which ZILs evade cytotoxicity, establishing a structure-based design principle for the next generation of sustainable ZILs.


Subjects
Ionic Liquids , Humans , Ionic Liquids/toxicity , Computer Simulation , Anions
17.
Oxid Med Cell Longev ; 2023: 3782230, 2023.
Article in English | MEDLINE | ID: mdl-36659905

ABSTRACT

Nonalcoholic fatty liver disease (NAFLD) has reached epidemic proportions with no pharmacological treatment approved. Several highly accessible computational tools were employed to predict the activities of twelve novel compounds prior to actual chemical synthesis. We began our work by designing two or three hydroxyl groups appended to the phenyl ketone core, followed by prediction of drug-likeness and targets. Most predicted targets for each compound overlapped with NAFLD targets (≥80%). Enrichment analysis showed that these compounds might regulate oxidoreductase activity. Then, these compounds were synthesized and confirmed by IR, MS, 1H, and 13C NMR. Their cell viability demonstrated that twelve compounds exhibited appreciable potencies against NAFLD (EC50 values ≤ 13.5 µM). Furthermore, the most potent compound 5f effectively prevented NAFLD progression as evidenced by the change in histological features. 5f significantly reduced total cholesterol and triglyceride levels in vitro/in vivo, and the effects of 5f were significantly stronger than those of the control drug. The proteomic data showed that oxidoreductase activity was the most significantly enriched, and this finding was consistent with docking results. In summary, this validated presynthesis prediction approach was cost-saving and worthy of popularization. The novel synthetic phenyl ketone derivative 5f holds great therapeutic potential by modulating oxidoreductase activity to counter NAFLD.


Subjects
Non-alcoholic Fatty Liver Disease , Humans , Molecular Docking Simulation , Non-alcoholic Fatty Liver Disease/drug therapy , Oxidoreductases , Proteomics
18.
Sci Data ; 9(1): 201, 2022 05 12.
Article in English | MEDLINE | ID: mdl-35551205

ABSTRACT

Rat is one of the most widely-used models in chemical safety evaluation and biomedical research. However, the knowledge about its microRNA (miRNA) expression patterns across multiple organs and various developmental stages is still limited. Here, we constructed a comprehensive rat miRNA expression BodyMap using a diverse collection of 320 RNA samples from 11 organs of both sexes of juvenile, adolescent, adult and aged Fischer 344 rats with four biological replicates per group. Following the Illumina TruSeq Small RNA protocol, an average of 5.1 million 50 bp single-end reads was generated per sample, yielding a total of 1.6 billion reads. The quality of the resulting miRNA-seq data was deemed to be high from raw sequences, mapped sequences, and biological reproducibility. Importantly, aliquots of the same RNA samples have previously been used to construct the mRNA BodyMap. The currently presented miRNA-seq dataset along with the existing mRNA-seq dataset from the same RNA samples provides a unique resource for studying the expression characteristics of existing and novel miRNAs, and for integrative analysis of miRNA-mRNA interactions, thereby facilitating better utilization of rats for biomarker discovery.


Subjects
MicroRNAs , Inbred F344 Rats , Transcriptome , Animals , Female , Gene Expression Profiling , Male , MicroRNAs/genetics , Messenger RNA/genetics , Rats , Inbred F344 Rats/genetics , Reproducibility of Results , RNA Sequence Analysis
19.
Genome Biol ; 23(1): 2, 2022 01 03.
Article in English | MEDLINE | ID: mdl-34980216

ABSTRACT

BACKGROUND: Reproducible detection of inherited variants with whole genome sequencing (WGS) is vital for the implementation of precision medicine and is a complicated process in which each step affects variant call quality. Systematically assessing reproducibility of inherited variants with WGS and impact of each step in the process is needed for understanding and improving quality of inherited variants from WGS. RESULTS: To dissect the impact of factors involved in detection of inherited variants with WGS, we sequence triplicates of eight DNA samples representing two populations on three short-read sequencing platforms using three library kits in six labs and call variants with 56 combinations of aligners and callers. We find that bioinformatics pipelines (callers and aligners) have a larger impact on variant reproducibility than WGS platform or library preparation. Single-nucleotide variants (SNVs), particularly outside difficult-to-map regions, are more reproducible than small insertions and deletions (indels), which are least reproducible when > 5 bp. Increasing sequencing coverage improves indel reproducibility but has limited impact on SNVs above 30×. CONCLUSIONS: Our findings highlight sources of variability in variant detection and the need for improvement of bioinformatics pipelines in the era of precision medicine with WGS.


Subjects
Human Genome , Single Nucleotide Polymorphism , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Reproducibility of Results , Whole Genome Sequencing
20.
Sci Data ; 8(1): 296, 2021 11 09.
Article in English | MEDLINE | ID: mdl-34753956

ABSTRACT

With the rapid advancement of sequencing technologies, next generation sequencing (NGS) analysis has been widely applied in cancer genomics research. More recently, NGS has been adopted in clinical oncology to advance personalized medicine. Clinical applications of precision oncology require accurate tests that can distinguish tumor-specific mutations from artifacts introduced during NGS processes or data analysis. Therefore, there is an urgent need for best practices in cancer mutation detection using NGS and for standard reference data sets for systematically measuring accuracy and reproducibility across platforms and methods. Within the SEQC2 consortium context, we established paired tumor-normal reference samples and generated whole-genome (WGS) and whole-exome sequencing (WES) data using sixteen library protocols and seven sequencing platforms at six different centers. We systematically interrogated somatic mutations in the reference samples to identify factors affecting detection reproducibility and accuracy in cancer genomes. These large cross-platform/site WGS and WES datasets using well-characterized reference samples will represent a powerful resource for benchmarking NGS technologies and bioinformatics pipelines, and for cancer genomics studies.


Subjects
Exome Sequencing , Human Genome , Neoplasms/genetics , Whole Genome Sequencing , Benchmarking , Tumor Cell Line , Computational Biology , Genomics , Humans , Precision Medicine