Results 1 - 20 of 29
1.
BMC Med Imaging ; 24(1): 122, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38789963

ABSTRACT

In response to the low real-time performance and accuracy of traditional sports injury monitoring, this article conducts research on a real-time injury monitoring system using the SVM model as an example. Video detection is performed to capture human movements, followed by human joint detection. Polynomial fitting analysis is used to extract joint motion patterns, and the average of training data is calculated as a reference point. The raw data is then normalized to adjust position and direction, and dimensionality reduction is achieved through singular value decomposition to enhance processing efficiency and model training speed. A support vector machine classifier is used to classify and identify the processed data. The experimental section monitors sports injuries and investigates the accuracy of the system's monitoring. Compared to mainstream models such as Random Forest and Naive Bayes, the SVM utilized demonstrates good performance in accuracy, sensitivity, and specificity, reaching 94.2%, 92.5%, and 96.0% respectively.
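The three reported figures (accuracy, sensitivity, specificity) follow from a binary confusion matrix. A minimal sketch of that arithmetic is given below; the function name and the toy labels are hypothetical and are not taken from the article.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: share of true injuries that were flagged.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of non-injuries that were correctly left unflagged.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Hypothetical labels: 1 = injury, 0 = no injury.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
acc, sens, spec = classification_metrics(y_true, y_pred)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```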


Subject(s)
Athletic Injuries , Deep Learning , Support Vector Machine , Humans , Athletic Injuries/diagnostic imaging , Video Recording , Sensitivity and Specificity , Algorithms
2.
Breast Cancer Res ; 23(1): 53, 2021 05 01.
Article in English | MEDLINE | ID: mdl-33933153

ABSTRACT

We identified a rare missense germline mutation in BARD1 (c.403G>A or p.Asp135Asn) as pathogenic using integrated genomics and transcriptomics profiling of germline and tumor samples from an early-onset triple-negative breast cancer patient who was later administered a PARP inhibitor for 2 months. We demonstrated in cell and mouse models that, compared to the wild-type, (1) c.403G>A mutant cell lines were more sensitive to irradiation, a DNA-damaging agent, and a PARP inhibitor; (2) the c.403G>A mutation inhibited interaction between BARD1 and RAD51 (but not BRCA1); and (3) c.403G>A mutant mice were hypersensitive to ionizing radiation. Our study sheds light on the clinical interpretation of rare germline mutations of BARD1.


Subject(s)
Triple Negative Breast Neoplasms/genetics , Tumor Suppressor Proteins/genetics , Tumor Suppressor Proteins/metabolism , Ubiquitin-Protein Ligases/genetics , Ubiquitin-Protein Ligases/metabolism , Animals , DNA Damage/genetics , Female , Gene Expression Profiling , Genetic Predisposition to Disease/genetics , Genomics , Germ-Line Mutation , Humans , Mice , Mutation, Missense , Poly(ADP-ribose) Polymerase Inhibitors/therapeutic use , Rad51 Recombinase/metabolism , Radiation Tolerance/genetics , Triple Negative Breast Neoplasms/drug therapy , Triple Negative Breast Neoplasms/metabolism
3.
Nucleic Acids Res ; 47(D1): D1090-D1101, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407536

ABSTRACT

One important aspect of precision medicine aims to deliver the right medicine to the right patient at the right dose at the right time based on the unique 'omics' features of each individual patient, thus maximizing drug efficacy and minimizing adverse drug reactions. However, fragmentation and heterogeneity of available data make it challenging to readily obtain first-hand information regarding some particular diseases, drugs, genes and variants of interest. Therefore, we developed the Precision Medicine Knowledgebase (PreMedKB) by seamlessly integrating the four fundamental components of precision medicine: diseases, genes, variants and drugs. PreMedKB allows for search of comprehensive information within each of the four components, the relationships between any two or more components, and importantly, the interpretation of the clinical meanings of a patient's genetic variants. PreMedKB is an efficient and user-friendly tool to assist researchers, clinicians or patients in interpreting a patient's genetic profile in terms of discovering potential pathogenic variants, recommending therapeutic regimens, designing panels for genetic testing kits, and matching patients for clinical trials. PreMedKB is freely accessible and available at http://www.fudan-pgx.org/premedkb/index.html#/home.


Subject(s)
Disease/genetics , Genetic Variation , Knowledge Bases , Pharmacogenetics/methods , Precision Medicine/methods , Computational Biology/methods , Humans , Information Storage and Retrieval/methods , Internet , Reproducibility of Results
4.
Environ Sci Technol ; 54(17): 10783-10796, 2020 09 01.
Article in English | MEDLINE | ID: mdl-32786597

ABSTRACT

Tris(1,3-dichloro-2-propyl)phosphate (TDCPP) is an environmental contaminant that has attracted increasing concern due to its presence in environmental media and biological samples. Our previous study demonstrated that exposure to TDCPP reduced the lifespan of Caenorhabditis elegans, but the mechanisms, including the relevant signaling pathways, are unclear. The current study found that TDCPP exposure triggers an unconventional insulin/insulin-like growth factor signaling (IIS) pathway, not by disrupting the insulin-like growth factor-1 receptor DAF-2/IGF1R but by inhibiting the downstream tumor-suppressor factor DAF-18/PTEN. This inhibition reduces PI(3,4,5)P3 (PIP3) dephosphorylation, causing buildup that increases the activation of the Akt/Protein Kinase B (PKB) family of serine/threonine kinases. This activation induces DAF-16/FoxO phosphorylation and promotes the sequestration of DAF-16/FoxO in the cytoplasm, reducing the lifespan of nematodes. Our results have important diagnostic and therapeutic implications for controlling TDCPP-related diseases, especially those originating with IIS pathway components.


Subject(s)
Caenorhabditis elegans Proteins , Longevity , Animals , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/genetics , Forkhead Transcription Factors/metabolism , Insulin , Insulin-Like Growth Factor I , Mutation , Organophosphorus Compounds , Phosphates , Receptor, Insulin/metabolism , Signal Transduction
5.
Food Chem ; 447: 138969, 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-38507947

ABSTRACT

Food authenticity is extremely important, and widely targeted bi-omics is a promising pipeline because it incorporates metabolomics and peptidomics. Colla Corii Asini (CCA, Ejiao) is one of the most popular tonic edible materials, with counterfeit and adulterated products being widespread. An attempt was made to develop a high-throughput and reliable DI-MRM3 program to facilitate widely targeted bi-omics of CCA. First, a predictive MRM program captured metabolites and peptides in trypsin-digested gelatins. After data alignment and structure annotation, primary parameters such as Q1 → Q3 → QLIT, CE, and EE were optimized for all 17 metabolites and 34 peptides by online ER-MS. Although a single run took only 6.5 min, high selectivity was reached for each analyte. Statistical results showed that nine peptides contributed to distinguishing CCA from other gelatins. After cross-validation with LC-MRM, DI-MRM3 was shown to be reproducible and high-throughput for widely targeted bi-omics of CCA, suggesting a meaningful tool for food authenticity.


Subject(s)
Gelatin , Peptides , Gelatin/chemistry , Metabolomics , China
6.
Nanoscale ; 16(12): 6199-6214, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38446101

ABSTRACT

While the filtering and accumulation effects of the extracellular matrix (ECM) on nanoparticles (NPs) have been experimentally observed, the detailed interactions between NPs and specific biomolecules within the ECM remain poorly understood and pose challenges for in vivo molecular-level investigations. Herein, we adopt molecular dynamics simulations to elucidate the impacts of methyl-, hydroxy-, amine-, and carboxyl-modified gold NPs on the cell-binding domains of fibronectin (Fn), an indispensable component of the ECM for cell attachment and signaling. Simulation results show that NPs can specifically bind to distinct Fn domains, and the strength of these interactions depends on the physicochemical properties of NPs. NP-NH3+ exhibits the highest affinity to domains rich in acidic residues, leading to strong electrostatic interactions that induce severe deformation, potentially disrupting the normal functioning of Fn. NP-CH3 and NP-COO- selectively occupy the RGD/PHSRN motifs, which may hinder their recognition by integrins on the cell surface. Additionally, NPs can disrupt the dimerization of Fn through competing for residues at the dimer interface or by diminishing the shape complementarity between dimerized proteins. The mechanical stretching of Fn, crucial for ECM fibrillogenesis, is suppressed by NPs due to their local rigidifying effect. These results provide valuable molecular-level insights into the impacts of various NPs on the ECM, holding significant implications for advancing nanomedicine and nanosafety evaluation.


Subject(s)
Fibronectins , Nanoparticles , Fibronectins/chemistry , Integrins/metabolism , Extracellular Matrix/metabolism , Signal Transduction
7.
Genome Biol ; 25(1): 34, 2024 01 24.
Article in English | MEDLINE | ID: mdl-38268000

ABSTRACT

BACKGROUND: Various laboratory-developed metabolomic methods lead to big challenges in inter-laboratory comparability and effective integration of diverse datasets. RESULTS: As part of the Quartet Project, we establish a publicly available suite of four metabolite reference materials derived from B lymphoblastoid cell lines from a family of parents and monozygotic twin daughters. We generate comprehensive LC-MS-based metabolomic data from the Quartet reference materials using targeted and untargeted strategies in different laboratories. The Quartet multi-sample-based signal-to-noise ratio enables objective assessment of the reliability of intra-batch and cross-batch metabolomics profiling in detecting intrinsic biological differences among the four groups of samples. Significant variations in the reliability of the metabolomics profiling are identified across laboratories. Importantly, ratio-based metabolomics profiling, by scaling the absolute values of a study sample relative to those of a common reference sample, enables cross-laboratory quantitative data integration. Thus, we construct the ratio-based high-confidence reference datasets between two reference samples, providing "ground truth" for inter-laboratory accuracy assessment, which enables objective evaluation of quantitative metabolomics profiling using various instruments and protocols. CONCLUSIONS: Our study provides the community with rich resources and best practices for inter-laboratory proficiency tests and data integration, ensuring reliability of large-scale and longitudinal metabolomic studies.


Subject(s)
Liquid Chromatography-Mass Spectrometry , Metabolomics , Humans , Reproducibility of Results , Cell Line , Twins, Monozygotic
8.
Genome Biol ; 25(1): 163, 2024 06 20.
Article in English | MEDLINE | ID: mdl-38902799

ABSTRACT

BACKGROUND: Copy number variation (CNV) is a key genetic characteristic for cancer diagnostics and can be used as a biomarker for the selection of therapeutic treatments. Using data sets established in our previous study, we benchmark the performance of cancer CNV calling by the six most recent and commonly used software tools on their detection accuracy, sensitivity, and reproducibility. In comparison to other orthogonal methods, such as microarray and Bionano, we also explore the consistency of CNV calling across different technologies on a challenging genome. RESULTS: While consistent results are observed for copy gain, loss, and loss of heterozygosity (LOH) calls across sequencing centers, CNV callers, and different technologies, variation in CNV calls is mostly affected by the determination of genome ploidy. Using consensus results from six CNV callers and confirmation from three orthogonal methods, we establish a high-confidence CNV call set for the reference cancer cell line (HCC1395). CONCLUSIONS: NGS technologies and current bioinformatics tools can offer reliable results for detection of copy gain, loss, and LOH. However, when working with a hyper-diploid genome, some software tools can call excessive copy gain or loss due to inaccurate assessment of genome ploidy. With performance matrices on various experimental conditions, this study raises awareness within the cancer research community for the selection of sequencing platforms, sample preparation, sequencing coverage, and the choice of CNV detection tools.


Subject(s)
Computational Biology , DNA Copy Number Variations , High-Throughput Nucleotide Sequencing , Loss of Heterozygosity , Neoplasms , Software , Humans , High-Throughput Nucleotide Sequencing/methods , Neoplasms/genetics , Computational Biology/methods , Diploidy , Genome, Human , Cell Line, Tumor , Reproducibility of Results , Sequence Analysis, DNA/methods
9.
Sci Rep ; 14(1): 7028, 2024 03 25.
Article in English | MEDLINE | ID: mdl-38528062

ABSTRACT

Accurate indel calling plays an important role in precision medicine. A benchmarking indel set is essential for thoroughly evaluating the indel calling performance of bioinformatics pipelines. A reference sample with a set of known-positive variants was developed in the FDA-led Sequencing Quality Control Phase 2 (SEQC2) project, but the known indels in the known-positive set were limited. This project sought to provide an enriched set of known indels that would be more translationally relevant by focusing on additional cancer related regions. A thorough manual review process completed by 42 reviewers, two advisors, and a judging panel of three researchers significantly enriched the known indel set by an additional 516 indels. The extended benchmarking indel set has a large range of variant allele frequencies (VAFs), with 87% of them having a VAF below 20% in reference Sample A. The reference Sample A and the indel set can be used for comprehensive benchmarking of indel calling across a wider range of VAF values in the lower range. Indel length was also variable, but the majority were under 10 base pairs (bps). Most of the indels were within coding regions, with the remainder in the gene regulatory regions. Although high confidence can be derived from the robust study design and meticulous human review, this extensive indel set has not undergone orthogonal validation. The extended benchmarking indel set, along with the indels in the previously published known-positive set, was the truth set used to benchmark indel calling pipelines in a community challenge hosted on the precisionFDA platform. This benchmarking indel set and reference samples can be utilized for a comprehensive evaluation of indel calling pipelines. Additionally, the insights and solutions obtained during the manual review process can aid in improving the performance of these pipelines.
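The summary statistics quoted above (share of indels with a VAF below 20%, share shorter than 10 bp) can be sketched as follows. This is an illustration only: the records are invented, and the VAF is assumed to be computed as alternate-allele reads divided by total reads at the site, which the abstract does not spell out.

```python
def vaf(record):
    """Variant allele frequency, assumed here to be alt reads / total reads."""
    return record["alt_reads"] / record["total_reads"]


def indel_length(record):
    """Indel length in bp as the size difference between REF and ALT alleles."""
    return abs(len(record["alt"]) - len(record["ref"]))


# Hypothetical indel calls, not taken from the benchmarking set.
indels = [
    {"ref": "A", "alt": "ATG", "alt_reads": 12, "total_reads": 200},
    {"ref": "CTTA", "alt": "C", "alt_reads": 90, "total_reads": 300},
    {"ref": "G", "alt": "GACGTACGTACGT", "alt_reads": 5, "total_reads": 100},
    {"ref": "TAA", "alt": "T", "alt_reads": 30, "total_reads": 200},
]

low_vaf = [r for r in indels if vaf(r) < 0.20]
short = [r for r in indels if indel_length(r) < 10]
frac_low_vaf = len(low_vaf) / len(indels)
frac_short = len(short) / len(indels)
print(f"VAF < 20%: {frac_low_vaf:.0%}, length < 10 bp: {frac_short:.0%}")
```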


Subject(s)
Benchmarking , High-Throughput Nucleotide Sequencing , Humans , Computational Biology , Quality Control , INDEL Mutation , Polymorphism, Single Nucleotide
10.
Anal Methods ; 15(21): 2588-2598, 2023 06 01.
Article in English | MEDLINE | ID: mdl-37226530

ABSTRACT

The homeostasis of the bile acid (BA) submetabolome, which comprises hundreds of correlated BA species, contributes greatly to maintaining physiological status. Understanding the transformational rules amongst endogenous BAs is challenging, but profiling the in vitro metabolism of BA analogues, as a compromise alternative to isotopic labeling of BAs, is a viable way to deduce the metabolism of BAs. An attempt is made here to characterize the metabolites of 23-nordeoxycholic acid (norDCA), a deoxycholic acid analogue with a C23-CH2 defect, after in vitro incubation with enzyme-enriched liver subcellular fractions of mouse, rat or human. A predictive multiple-reaction monitoring mode was deployed for sensitive metabolite detection, leading to the capture of twelve metabolites (M1-M12). After putative structural annotation by analyzing MS/MS spectra, special attention was paid to isomeric identification. Dozens of authentic BAs were collected and measured for modeling of the quantitative structure-retention time relationships. Because modifications in LC-MS/MS behaviors in response to the C23-CH2 difference were characterized by comparing several pairs, the rules of a 14.02 Da shift and a 2.4-4.2 min distance were applied to improve identification confidence by matching with several authentic BAs bearing C23-CH2 additions compared to the metabolites. Consequently, confirmative structural identification was achieved for all metabolites. Metabolic pathways corresponding to M1-M12 were proposed, and hydroxylation, oxidation, epimerization, sulfation, and glucuronidation served as the primary metabolism channels for norDCA. Together, the findings provide meaningful information about the correlations between different endogenous BAs, and the structural identification strategy offers a promising idea when facing an isomeric discrimination challenge.
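The matching rule described above (a 14.02 Da mass shift and a 2.4-4.2 min retention-time distance between a norDCA metabolite and an authentic BA carrying the extra CH2) might be checked as in the sketch below. The mass tolerance, the use of an absolute retention-time gap, and the peak values are assumptions made for illustration; the abstract does not state them.

```python
MASS_SHIFT_DA = 14.02       # CH2 mass difference quoted in the abstract
MASS_TOL_DA = 0.01          # hypothetical tolerance, not from the source
RT_GAP_MIN = (2.4, 4.2)     # retention-time distance window from the abstract


def matches_ch2_homologue(metabolite, authentic):
    """True if `authentic` looks like the CH2-extended partner of `metabolite`."""
    mass_diff = authentic["mz"] - metabolite["mz"]
    # Direction of the retention shift is not stated, so compare the gap size.
    rt_gap = abs(authentic["rt"] - metabolite["rt"])
    mass_ok = abs(mass_diff - MASS_SHIFT_DA) <= MASS_TOL_DA
    rt_ok = RT_GAP_MIN[0] <= rt_gap <= RT_GAP_MIN[1]
    return mass_ok and rt_ok


# Illustrative peaks (m/z, retention time in minutes); values are made up.
metabolite = {"mz": 377.27, "rt": 10.0}
candidate_a = {"mz": 391.29, "rt": 13.1}   # shift and gap both within the rule
candidate_b = {"mz": 391.29, "rt": 15.0}   # right mass shift, gap too large

print(matches_ch2_homologue(metabolite, candidate_a))
print(matches_ch2_homologue(metabolite, candidate_b))
```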


Subject(s)
Bile Acids and Salts , Tandem Mass Spectrometry , Rats , Humans , Mice , Animals , Chromatography, Liquid , Deoxycholic Acid
11.
J Hazard Mater ; 453: 131430, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37080032

ABSTRACT

By linking the cation and anion motifs of ionic liquids (ILs), zwitterionic liquids (ZILs) exhibit at least 146- to 2740-fold and 112- to 1550-fold lower cytotoxicity in human gastric and colon cells, respectively, than the structurally related ILs. Computer simulation shows that ZIL molecules hardly penetrate the cell membranes, in contrast to ILs. These findings reveal a novel mechanism for ZILs to evade cytotoxicity, establishing a structure-based design principle for the next generation of sustainable ZILs.


Subject(s)
Ionic Liquids , Humans , Ionic Liquids/toxicity , Computer Simulation , Anions
12.
Eur J Med Chem ; 260: 115728, 2023 Nov 15.
Article in English | MEDLINE | ID: mdl-37625288

ABSTRACT

The mitochondria have been identified as key targets in nonalcoholic fatty liver disease (NAFLD), one of the most prevalent chronic liver damage diseases globally. Meanwhile, the biological information analysis in this study revealed that SIRT1, PPARG, PPARA, and PPARGC1A (mitochondrial biogenesis-related proteins) were NAFLD therapeutic targets. Therefore, the design and synthesis of targeted drugs that promote mitochondrial biogenesis and improve mitochondrial function are particularly important for NAFLD treatment. Recently, we introduced butyls, hydroxyls, and halogens to benzophenone and synthesized a series of NAFLD-related 4-butyl-polyhydroxybenzophenone compounds, aiming at investigating the hepatoprotective activity from the aspect of mitochondrial biogenesis. The structure-activity relationship demonstrated that hydroxyl and ketone groups were active groups interacting with mitochondrial biogenesis proteins (SIRT1 and PGC1α), and the activity was stronger when the o-hydroxyl group was present on the benzene ring. In contrast, the activity was little affected by the presence of the p-hydroxyl group, m-hydroxyl group, butyl group type, or halogen. In addition, in vitro studies confirmed that these compounds could directly bind to SIRT1 and PGC1α, markedly promote their interaction, significantly increase the expression of proteins and genes related to mitochondrial biogenesis (SIRT1, PGC1α, NRF1, TFAM, COX1, and ND6) and subsequently ameliorate mitochondrial dysfunction, which was evidenced by the decreased ROS, upregulated ATP production, increased MMP, and enhanced mitochondrial number. According to the outcomes of our in vitro and in vivo experiments, 4-butyl-polyhydroxybenzophenone compounds could also effectively reduce the formation of lipid droplets and liver injury index (ALT, AST, LDH, AKP, γ-GT, and GDH) and improve the level of antioxidant enzymes (GSH and SOD). In particular, treatment with these compounds after a high-fat diet could significantly reduce body weight, decrease liver coefficient, attenuate liver damage, and ameliorate lipid accumulation in rat liver, demonstrating their therapeutic effects on NAFLD. Mechanistically, 4-butyl-polyhydroxybenzophenone compounds promoted mitochondrial biogenesis and eventually prevented NAFLD liver injury by activating the PGC1α signaling pathway in a SIRT1-dependent manner, which was strongly supported by SIRT1 inhibitor EX527.


Subject(s)
Non-alcoholic Fatty Liver Disease , Animals , Rats , Halogens , Non-alcoholic Fatty Liver Disease/drug therapy , Organelle Biogenesis , Peroxisome Proliferator-Activated Receptor Gamma Coactivator 1-alpha , Sirtuin 1
13.
Oxid Med Cell Longev ; 2023: 3782230, 2023.
Article in English | MEDLINE | ID: mdl-36659905

ABSTRACT

Nonalcoholic fatty liver disease (NAFLD) has reached epidemic proportions with no pharmacological treatment approved. Several highly accessible computational tools were employed to predict the activities of twelve novel compounds prior to actual chemical synthesis. We began our work by designing two or three hydroxyl groups appended to the phenyl ketone core, followed by prediction of drug-likeness and targets. Most predicted targets for each compound overlapped with NAFLD targets (≥80%). Enrichment analysis showed that these compounds might regulate oxidoreductase activity. Then, these compounds were synthesized and confirmed by IR, MS, 1H, and 13C NMR. Their cell viability demonstrated that twelve compounds exhibited appreciable potencies against NAFLD (EC50 values ≤ 13.5 µM). Furthermore, the most potent compound 5f effectively prevented NAFLD progression as evidenced by the change in histological features. 5f significantly reduced total cholesterol and triglyceride levels in vitro/in vivo, and the effects of 5f were significantly stronger than those of the control drug. The proteomic data showed that oxidoreductase activity was the most significantly enriched, and this finding was consistent with docking results. In summary, this validated presynthesis prediction approach was cost-saving and worthy of popularization. The novel synthetic phenyl ketone derivative 5f holds great therapeutic potential by modulating oxidoreductase activity to counter NAFLD.


Subject(s)
Non-alcoholic Fatty Liver Disease , Humans , Molecular Docking Simulation , Non-alcoholic Fatty Liver Disease/drug therapy , Oxidoreductases , Proteomics
14.
Genome Biol ; 24(1): 245, 2023 10 26.
Article in English | MEDLINE | ID: mdl-37884999

ABSTRACT

The Quartet Data Portal facilitates community access to well-characterized reference materials, reference datasets, and related resources established based on a family of four individuals with identical twins from the Quartet Project. Users can request DNA, RNA, protein, and metabolite reference materials, as well as datasets generated across omics, platforms, labs, protocols, and batches. Reproducible analysis tools allow for objective performance assessment of user-submitted data, while interactive visualization tools support rapid exploration of reference datasets. A closed-loop "distribution-collection-evaluation-integration" workflow enables updates and integration of community-contributed multiomics data. Ultimately, this portal helps promote the advancement of reference datasets and multiomics quality control.


Subject(s)
Multiomics , Software , Humans , Quality Control
15.
Genome Biol ; 24(1): 201, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37674217

ABSTRACT

BACKGROUND: Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms are proposed to facilitate data integration. However, their respective advantages and limitations are not adequately assessed in terms of omics types, the performance metrics, and the application scenarios. RESULTS: As part of the Quartet Project for quality control and data integration of multiomics profiling, we comprehensively assess the performance of seven batch-effect correction algorithms based on different performance metrics of clinical relevance, i.e., the accuracy of identifying differentially expressed features, the robustness of predictive models, and the ability to accurately cluster cross-batch samples into their own donors. The ratio-based method, i.e., by scaling absolute feature values of study samples relative to those of concurrently profiled reference material(s), is found to be much more effective and broadly applicable than others, especially when batch effects are completely confounded with biological factors of study interest. We further provide practical guidelines for implementing the ratio-based approach in increasingly large-scale multiomics studies. CONCLUSIONS: Multiomics measurements are prone to batch effects, which can be effectively corrected using ratio-based scaling of the multiomics data. Our study lays the foundation for eliminating batch effects at a ratio scale.
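The ratio-based idea described above can be illustrated with a small sketch: each study sample's feature value is divided by the value of the same feature in a reference sample profiled in the same batch. The log2 transform, the function name, and the toy numbers are assumptions for illustration; the abstract states only that values are scaled relative to the reference.

```python
import math


def ratio_scale(study, reference):
    """Log2 ratio of each study feature to the same feature in the reference."""
    return {
        feature: math.log2(value / reference[feature])
        for feature, value in study.items()
        if feature in reference and value > 0 and reference[feature] > 0
    }


# Toy data: batch 2 carries a hypothetical 3x multiplicative batch effect.
batch1 = {
    "study": {"featA": 200.0, "featB": 50.0},
    "reference": {"featA": 100.0, "featB": 100.0},
}
batch2 = {
    "study": {"featA": 600.0, "featB": 150.0},
    "reference": {"featA": 300.0, "featB": 300.0},
}

ratios1 = ratio_scale(batch1["study"], batch1["reference"])
ratios2 = ratio_scale(batch2["study"], batch2["reference"])
print(ratios1)
print(ratios2)
```

In this toy case the absolute values differ threefold between batches, while the ratios to the concurrently measured reference agree, because a multiplicative batch effect cancels in the division.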


Subject(s)
Algorithms , Multiomics , Base Composition , Benchmarking , Clinical Relevance
16.
Genome Biol ; 24(1): 277, 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38049885

ABSTRACT

BACKGROUND: Recent state-of-the-art sequencing technologies enable the investigation of challenging regions in the human genome and expand the scope of variant benchmarking datasets. Herein, we sequence a Chinese Quartet, comprising two monozygotic twin daughters and their biological parents, using four short and long sequencing platforms (Illumina, BGI, PacBio, and Oxford Nanopore Technology). RESULTS: The long reads from the monozygotic twin daughters are phased into paternal and maternal haplotypes using the parent-child genetic map and for each haplotype. We also use long reads to generate haplotype-resolved whole-genome assemblies with completeness and continuity exceeding that of GRCh38. Using this Quartet, we comprehensively catalogue the human variant landscape, generating a dataset of 3,962,453 SNVs, 886,648 indels (< 50 bp), 9726 large deletions (≥ 50 bp), 15,600 large insertions (≥ 50 bp), 40 inversions, 31 complex structural variants, and 68 de novo mutations which are shared between the monozygotic twin daughters. Variants underrepresented in previous benchmarks owing to their complexity-including those located at long repeat regions, complex structural variants, and de novo mutations-are systematically examined in this study. CONCLUSIONS: In summary, this study provides high-quality haplotype-resolved assemblies and a comprehensive set of benchmarking resources for two Chinese monozygotic twin samples which, relative to existing benchmarks, offers expanded genomic coverage and insight into complex variant categories.


Subject(s)
Benchmarking , East Asian People , Twins, Monozygotic , Humans , East Asian People/genetics , Genomics , Haplotypes , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Twins, Monozygotic/genetics , Twin Studies as Topic
17.
Genome Biol ; 24(1): 270, 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38012772

ABSTRACT

BACKGROUND: Genomic DNA reference materials are widely recognized as essential for ensuring data quality in omics research. However, relying solely on reference datasets to evaluate the accuracy of variant calling results is incomplete, as they are limited to benchmark regions. Therefore, it is important to develop DNA reference materials that enable the assessment of variant detection performance across the entire genome. RESULTS: We established a DNA reference material suite from four immortalized cell lines derived from a family of parents and monozygotic twins. Comprehensive reference datasets of 4.2 million small variants and 15,000 structural variants were integrated and certified for evaluating the reliability of germline variant calls inside the benchmark regions. Importantly, the genetic built-in-truth of the Quartet family design enables estimation of the precision of variant calls outside the benchmark regions. Using the Quartet reference materials along with study samples, batch effects are objectively monitored and alleviated by training a machine learning model with the Quartet reference datasets to remove potential artifact calls. Moreover, the matched RNA and protein reference materials and datasets from the Quartet project enable cross-omics validation of variant calls from multiomics data. CONCLUSIONS: The Quartet DNA reference materials and reference datasets provide a unique resource for objectively assessing the quality of germline variant calls throughout the whole-genome regions and improving the reliability of large-scale genomic profiling.
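One way to read the "genetic built-in-truth" of a parents-plus-monozygotic-twins design is as a consistency check: at each site the twins should share a genotype, and that genotype should be explainable by one allele from each parent. The sketch below illustrates this reading under stated assumptions; it is not the authors' pipeline, it ignores genuine de novo mutations, and the genotypes are invented.

```python
def mendelian_consistent(father, mother, twin1, twin2):
    """Check one diploid site; genotypes are 2-tuples of allele codes."""
    # Monozygotic twins are expected to carry the same genotype.
    if sorted(twin1) != sorted(twin2):
        return False
    a, b = twin1
    # One allele must be inheritable from each parent.
    return (a in father and b in mother) or (b in father and a in mother)


# Hypothetical calls: (father, mother, twin 1, twin 2) per site.
sites = [
    ((0, 1), (0, 0), (0, 1), (0, 1)),   # consistent
    ((0, 1), (0, 0), (1, 1), (1, 1)),   # mother cannot pass allele 1
    ((0, 1), (0, 1), (0, 1), (1, 1)),   # twins disagree
    ((1, 1), (0, 1), (1, 0), (0, 1)),   # consistent, allele order ignored
]

consistent = [mendelian_consistent(*site) for site in sites]
consistency_rate = sum(consistent) / len(sites)
print(consistent, consistency_rate)
```

A consistency rate of this kind can serve as a rough proxy for call precision in regions that lack a benchmark, which is the use the abstract points to.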


Subject(s)
Benchmarking , Genome, Human , Humans , Reproducibility of Results , Polymorphism, Single Nucleotide , Germ Cells , High-Throughput Nucleotide Sequencing/methods
18.
Nat Biotechnol ; 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37679545

ABSTRACT

Certified RNA reference materials are indispensable for assessing the reliability of RNA sequencing to detect intrinsically small biological differences in clinical settings, such as molecular subtyping of diseases. As part of the Quartet Project for quality control and data integration of multi-omics profiling, we established four RNA reference materials derived from immortalized B-lymphoblastoid cell lines from four members of a monozygotic twin family. Additionally, we constructed ratio-based transcriptome-wide reference datasets between two samples, providing cross-platform and cross-laboratory 'ground truth'. Investigation of the intrinsically subtle biological differences among the Quartet samples enables sensitive assessment of cross-batch integration of transcriptomic measurements at the ratio level. The Quartet RNA reference materials, combined with the ratio-based reference datasets, can serve as unique resources for assessing and improving the quality of transcriptomic data in clinical and biological settings.

19.
Nat Biotechnol ; 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37679543

ABSTRACT

Characterization and integration of the genome, epigenome, transcriptome, proteome and metabolome of different datasets is difficult owing to a lack of ground truth. Here we develop and characterize suites of publicly available multi-omics reference materials of matched DNA, RNA, protein and metabolites derived from immortalized cell lines from a family quartet of parents and monozygotic twin daughters. These references provide built-in truth defined by relationships among the family members and the information flow from DNA to RNA to protein. We demonstrate how using a ratio-based profiling approach that scales the absolute feature values of a study sample relative to those of a concurrently measured common reference sample produces reproducible and comparable data suitable for integration across batches, labs, platforms and omics types. Our study identifies reference-free 'absolute' feature quantification as the root cause of irreproducibility in multi-omics measurement and data integration and establishes the advantages of ratio-based multi-omics profiling with common reference materials.

20.
Sci Data ; 9(1): 201, 2022 05 12.
Article in English | MEDLINE | ID: mdl-35551205

ABSTRACT

The rat is one of the most widely used models in chemical safety evaluation and biomedical research. However, knowledge about its microRNA (miRNA) expression patterns across multiple organs and various developmental stages is still limited. Here, we constructed a comprehensive rat miRNA expression BodyMap using a diverse collection of 320 RNA samples from 11 organs of both sexes of juvenile, adolescent, adult and aged Fischer 344 rats with four biological replicates per group. Following the Illumina TruSeq Small RNA protocol, an average of 5.1 million 50 bp single-end reads was generated per sample, yielding a total of 1.6 billion reads. The quality of the resulting miRNA-seq data was deemed to be high from raw sequences, mapped sequences, and biological reproducibility. Importantly, aliquots of the same RNA samples have previously been used to construct the mRNA BodyMap. The currently presented miRNA-seq dataset along with the existing mRNA-seq dataset from the same RNA samples provides a unique resource for studying the expression characteristics of existing and novel miRNAs, and for integrative analysis of miRNA-mRNA interactions, thereby facilitating better utilization of rats for biomarker discovery.


Subject(s)
MicroRNAs , Rats, Inbred F344 , Transcriptome , Animals , Female , Gene Expression Profiling , Male , MicroRNAs/genetics , RNA, Messenger/genetics , Rats , Rats, Inbred F344/genetics , Reproducibility of Results , Sequence Analysis, RNA