Results 1 - 20 of 86
1.
Genome Biol ; 25(1): 254, 2024 Oct 03.
Article in English | MEDLINE | ID: mdl-39363244

ABSTRACT

Batch effects in omics data are notoriously common technical variations unrelated to study objectives, and may result in misleading outcomes if uncorrected, or hinder biomedical discovery if over-corrected. Assessing and mitigating batch effects is crucial for ensuring the reliability and reproducibility of omics data and minimizing the impact of technical variations on biological interpretation. In this review, we highlight the profound negative impact of batch effects and the urgent need to address this challenging problem in large-scale omics studies. We summarize potential sources of batch effects, current progress in evaluating and correcting them, and consortium efforts aiming to tackle them.


Subject(s)
Genomics , Genomics/methods , Humans , Reproducibility of Results , Proteomics/methods
2.
Int J Parasitol ; 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-39168434

ABSTRACT

Millions of livestock animals worldwide are infected with the haematophagous barber's pole worm, Haemonchus contortus, the aetiological agent of haemonchosis. Despite the major significance of this parasite worldwide and its widespread resistance to current treatments, the lack of a high-quality genome for the well-defined strain of this parasite from Australia, called Haecon-5, has constrained research in a number of areas including host-parasite interactions, drug discovery and population genetics. To enable research in these areas, we report here a chromosome-contiguous genome (∼280 Mb) for Haecon-5 with high-quality models for 19,234 protein-coding genes. Comparative genomic analyses show significant genomic similarity (synteny) with a UK strain of H. contortus, called MHco3(ISE).N1 (abbreviated as "ISE"), but we also discover marked differences in genomic structure/gene arrangements, distribution of nucleotide variability (single nucleotide polymorphisms (SNPs) and indels) and orthology between Haecon-5 and ISE. We used the genome and extensive transcriptomic resources for Haecon-5 to predict a subset of essential single-copy genes employing a "cross-species" machine learning (ML) approach using a range of features from nucleotide/protein sequences, protein orthology, subcellular localisation, single-cell RNA-seq and/or histone methylation data available for the model organisms Caenorhabditis elegans and Drosophila melanogaster. From a set of 1,464 conserved single copy genes, transcribed in key life-cycle stages of H. contortus, we identified 232 genes whose homologs have critical functions in C. elegans and/or D. melanogaster, and prioritised 10 of them for further characterisation; nine of the 10 genes likely play roles in neurophysiological processes, germline, hypodermis and/or respiration, and one is an unknown (orphan) gene for which no detailed functional information exists. 
Future studies of these genes/gene products are warranted to elucidate their roles in parasite biology, host-parasite interplay and/or disease. Clearly, the present Haecon-5 reference genome and associated resources now underpin a broad range of fundamental investigations of H. contortus and could assist in accelerating the discovery of novel intervention targets and drug candidates to combat haemonchosis.

3.
Int J Mol Sci ; 25(16), 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39201452

ABSTRACT

Haemonchus contortus (the barber's pole worm), a highly pathogenic gastric nematode of ruminants, causes significant economic losses in the livestock industry worldwide. H. contortus has become a valuable model organism for both fundamental and applied research (e.g., drug and vaccine discovery) because of the availability of well-defined laboratory strains (e.g., MHco3(ISE).N1 in the UK and Haecon-5 in Australia) and genomic, transcriptomic and proteomic data sets. Many recent investigations have relied heavily on the use of the chromosome-contiguous genome of MHco3(ISE).N1 in the absence of a genome for Haecon-5. However, there has been no genetic comparison of these and other strains to date. Here, we assembled and characterised the mitochondrial genome (14.1 kb) of Haecon-5 and compared it with that of MHco3(ISE).N1 and two other strains (i.e., McMaster and NZ_Hco_NP) from Australasia. We detected 276 synonymous and 25 non-synonymous single nucleotide polymorphisms (SNPs) within Haecon-5. Between the Haecon-5 and MHco3(ISE).N1 strains, we recorded 345 SNPs, 31 of which were non-synonymous and linked to fixed amino acid differences in seven protein-coding genes (nad5, nad6, nad1, atp6, nad2, cytb and nad4) between these strains. Pronounced variation (344 and 435 SNPs) was seen between Haecon-5 and each of the other two strains from Australasia. The question remains as to what impact these mitogenomic mutations might have on the biology and physiology of H. contortus, which warrants exploration. The high degree of mitogenomic variability recorded here among these strains suggests that further work should be undertaken to assess the nature and extent of the nuclear genomic variation within H. contortus.


Subject(s)
Mitochondrial Genome , Haemonchus , Single Nucleotide Polymorphism , Animals , Haemonchus/genetics , Phylogeny , Genetic Variation , Australia
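
The distinction between synonymous and non-synonymous SNPs reported in the abstract above can be illustrated with a minimal sketch. It assumes the invertebrate mitochondrial genetic code (NCBI translation table 5), which is commonly applied to nematode mitochondrial genes; the function name `classify_snp` and the example codons are hypothetical and are not taken from the study.

```python
from itertools import product

# Assumption: NCBI translation table 5 (invertebrate mitochondrial code).
# Codons are enumerated in TCAG order, so TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, position, alt_base):
    """Classify a single-base substitution within one codon.

    codon:    reference codon, e.g. "ATT"
    position: 0-based index of the substituted base within the codon
    alt_base: the alternative base observed at that position
    """
    mutated = codon[:position] + alt_base + codon[position + 1:]
    if CODON_TABLE[codon] == CODON_TABLE[mutated]:
        return "synonymous"
    return "non-synonymous"


# Hypothetical examples (not variants from the study):
example_silent = classify_snp("ATT", 2, "C")    # Ile -> Ile
example_changing = classify_snp("ATT", 2, "A")  # Ile -> Met under table 5
```

Under the standard nuclear code ATA encodes isoleucine, so the second example would be synonymous there; the choice of translation table therefore changes the counts.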
4.
Nat Commun ; 15(1): 6167, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39039053

ABSTRACT

Translating RNA-seq into clinical diagnostics requires ensuring the reliability and cross-laboratory consistency of detecting clinically relevant subtle differential expressions, such as those between different disease subtypes or stages. As part of the Quartet project, we present an RNA-seq benchmarking study across 45 laboratories using the Quartet and MAQC reference samples spiked with ERCC controls. Based on multiple types of 'ground truth', we systematically assess the real-world RNA-seq performance and investigate the influencing factors involved in 26 experimental processes and 140 bioinformatics pipelines. Here we show greater inter-laboratory variations in detecting subtle differential expressions among the Quartet samples. Experimental factors including mRNA enrichment and strandedness, and each bioinformatics step, emerge as primary sources of variations in gene expression. We underscore the profound influence of experimental execution, and provide best practice recommendations for experimental designs, strategies for filtering low-expression genes, and the optimal gene annotation and analysis pipelines. In summary, this study lays the foundation for the development and quality control of RNA-seq for clinical diagnostic purposes.


Subject(s)
Benchmarking , Computational Biology , Quality Control , RNA-Seq , Reference Standards , Benchmarking/methods , Humans , RNA-Seq/methods , RNA-Seq/standards , Computational Biology/methods , Reproducibility of Results , RNA Sequence Analysis/methods , RNA Sequence Analysis/standards , Gene Expression Profiling/methods , Gene Expression Profiling/standards , Messenger RNA/genetics , Messenger RNA/metabolism
5.
Phenomics ; 4(2): 109-124, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38884056

ABSTRACT

RNA sequencing (RNAseq) technology has become increasingly important in precision medicine and clinical diagnostics, and emerged as a powerful tool for identifying protein-coding genes, performing differential gene analysis, and inferring immune cell composition. Human peripheral blood samples are widely used for RNAseq, providing valuable insights into individual biomolecular information. Blood samples can be classified as whole blood (WB), plasma, serum, and remaining sediment samples, including plasma-free blood (PFB) and serum-free blood (SFB) samples that are generally considered less useful byproducts during the processes of plasma and serum separation, respectively. However, the feasibility of using PFB and SFB samples for transcriptome analysis remains unclear. In this study, we aimed to assess the suitability of employing PFB or SFB samples as an alternative RNA source in transcriptomic analysis. We performed a comparative analysis of WB, PFB, and SFB samples for different applications. Our results revealed that PFB samples exhibit greater similarity to WB samples than SFB samples in terms of protein-coding gene expression patterns, detection of differentially expressed genes, and immunological characterizations, suggesting that PFB can serve as a viable alternative to WB for transcriptomic analysis. Our study contributes to the optimization of blood sample utilization and the advancement of precision medicine research. Supplementary Information: The online version contains supplementary material available at 10.1007/s43657-023-00121-1.

6.
Nat Genet ; 56(5): 846-860, 2024 May.
Article in English | MEDLINE | ID: mdl-38641644

ABSTRACT

Methylation quantitative trait loci (mQTLs) are essential for understanding the role of DNA methylation changes in genetic predisposition, yet they have not been fully characterized in East Asians (EAs). Here we identified mQTLs in whole blood from 3,523 Chinese individuals and replicated them in additional 1,858 Chinese individuals from two cohorts. Over 9% of mQTLs displayed specificity to EAs, facilitating the fine-mapping of EA-specific genetic associations, as shown for variants associated with height. Trans-mQTL hotspots revealed biological pathways contributing to EA-specific genetic associations, including an ERG-mediated 233 trans-mCpG network, implicated in hematopoietic cell differentiation, which likely reflects binding efficiency modulation of the ERG protein complex. More than 90% of mQTLs were shared between different blood cell lineages, with a smaller fraction of lineage-specific mQTLs displaying preferential hypomethylation in the respective lineages. Our study provides new insights into the mQTL landscape across genetic ancestries and their downstream effects on cellular processes and diseases/traits.


Subject(s)
DNA Methylation , East Asian People , Quantitative Trait Loci , Female , Humans , Male , East Asian People/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Multifactorial Inheritance , Single Nucleotide Polymorphism
7.
Nat Cancer ; 5(4): 673-690, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38347143

ABSTRACT

Molecular profiling guides precision treatment of breast cancer; however, Asian patients are underrepresented in publicly available large-scale studies. We established a comprehensive multiomics cohort of 773 Chinese patients with breast cancer and systematically analyzed their genomic, transcriptomic, proteomic, metabolomic, radiomic and digital pathology characteristics. Here we show that compared to breast cancers in white individuals, Asian individuals had more targetable AKT1 mutations. Integrated analysis revealed a higher proportion of HER2-enriched subtype and correspondingly more frequent ERBB2 amplification and higher HER2 protein abundance in the Chinese HR+HER2+ cohort, stressing anti-HER2 therapy for these individuals. Furthermore, comprehensive metabolomic and proteomic analyses revealed ferroptosis as a potential therapeutic target for basal-like tumors. The integration of clinical, transcriptomic, metabolomic, radiomic and pathological features allowed for efficient stratification of patients into groups with varying recurrence risks. Our study provides a public resource and new insights into the biology and ancestry specificity of breast cancer in the Asian population, offering potential for further precision treatment approaches.


Subject(s)
Asian People , Breast Neoplasms , ErbB-2 Receptor , Humans , Breast Neoplasms/genetics , Breast Neoplasms/therapy , Female , Asian People/genetics , ErbB-2 Receptor/genetics , Mutation , Proteomics/methods , Gene Expression Profiling/methods , Proto-Oncogene Proteins c-akt/metabolism , Proto-Oncogene Proteins c-akt/genetics , Middle Aged , China/epidemiology , Ferroptosis/genetics , Adult , Metabolomics/methods , Transcriptome , Tumor Biomarkers/genetics , East Asian People
8.
Adv Sci (Weinh) ; 11(15): e2305546, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38342612

ABSTRACT

The heterogeneity of triple-negative breast cancers (TNBC) remains challenging for various treatments. Ferroptosis, a recently identified form of cell death resulting from the unrestrained peroxidation of phospholipids, represents a potential vulnerability in TNBC. In this study, a high intensity focused ultrasound (HIFU)-driven nanomotor is developed for effective therapy of TNBC through induction of ferroptosis. Through bioinformatics analysis of typical ferroptosis-associated genes in the FUSCCTNBC dataset, gambogic acid is identified as a promising ferroptosis drug and loaded into the nanomotor. It is found that the rapid motion of nanomotors propelled by HIFU significantly enhanced tumor accumulation and penetration. More importantly, HIFU not only actuated nanomotors to trigger effective ferroptosis of TNBC cells, but also drove nanomotors to activate ferroptosis-mediated antitumor immunity in primary and metastatic TNBC models, resulting in effective tumor regression and prevention of metastases. Overall, HIFU-driven nanomotors show great potential for ferroptosis-immunotherapy of TNBC.


Subject(s)
Ferroptosis , Triple Negative Breast Neoplasms , Humans , Triple Negative Breast Neoplasms/therapy , Immunotherapy , Cell Death , Computational Biology
9.
Genome Biol ; 25(1): 34, 2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38268000

ABSTRACT

BACKGROUND: Various laboratory-developed metabolomic methods lead to big challenges in inter-laboratory comparability and effective integration of diverse datasets. RESULTS: As part of the Quartet Project, we establish a publicly available suite of four metabolite reference materials derived from B lymphoblastoid cell lines from a family of parents and monozygotic twin daughters. We generate comprehensive LC-MS-based metabolomic data from the Quartet reference materials using targeted and untargeted strategies in different laboratories. The Quartet multi-sample-based signal-to-noise ratio enables objective assessment of the reliability of intra-batch and cross-batch metabolomics profiling in detecting intrinsic biological differences among the four groups of samples. Significant variations in the reliability of the metabolomics profiling are identified across laboratories. Importantly, ratio-based metabolomics profiling, by scaling the absolute values of a study sample relative to those of a common reference sample, enables cross-laboratory quantitative data integration. Thus, we construct the ratio-based high-confidence reference datasets between two reference samples, providing "ground truth" for inter-laboratory accuracy assessment, which enables objective evaluation of quantitative metabolomics profiling using various instruments and protocols. CONCLUSIONS: Our study provides the community with rich resources and best practices for inter-laboratory proficiency tests and data integration, ensuring reliability of large-scale and longitudinal metabolomic studies.


Subject(s)
Liquid Chromatography-Mass Spectrometry , Metabolomics , Humans , Reproducibility of Results , Cell Line , Monozygotic Twins
10.
Int J Parasitol Drugs Drug Resist ; 24: 100522, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38295619

ABSTRACT

Within the context of our anthelmintic discovery program, we recently identified and evaluated a quinoline derivative, called ABX464 or obefazimod, as a nematocidal candidate; synthesised a series of analogues which were assessed for activity against the free-living nematode Caenorhabditis elegans; and predicted compound-target relationships by thermal proteome profiling (TPP) and in silico docking. Here, we logically extended this work and critically evaluated the anthelmintic activity of ABX464 analogues on Haemonchus contortus (barber's pole worm), a highly pathogenic nematode of ruminant livestock. First, we tested a series of 44 analogues on H. contortus (larvae and adults) to investigate the nematocidal pharmacophore of ABX464, and identified one compound that had greater potency than the parent compound and showed moderate activity against a select number of other parasitic nematodes (including Ancylostoma, Heligmosomoides and Strongyloides species). Using TPP and in silico modelling studies, we predicted protein HCON_00074590 (a predicted aldo-keto reductase) as a target candidate for ABX464 in H. contortus. Future work aims to optimise this compound as a nematocidal candidate and investigate its pharmacokinetic properties. Overall, this study presents a first step toward the development of a new nematocide.


Subject(s)
Anthelmintics , Haemonchus , Nematoda , Quinolines , Animals , Antinematodal Agents/pharmacology , Anthelmintics/pharmacology , Structure-Activity Relationship , Caenorhabditis elegans , Quinolines/pharmacology
11.
Bioorg Med Chem ; 98: 117540, 2024 Jan 15.
Article in English | MEDLINE | ID: mdl-38134663

ABSTRACT

Global challenges with treatment failures and/or widespread resistance in parasitic worms against commercially available anthelmintics lend impetus to the development of new anthelmintics with novel mechanism(s) of action. The free-living nematode Caenorhabditis elegans is an important model organism used for drug discovery, including the screening and structure-activity investigation of new compounds, and target deconvolution. Previously, we conducted a whole-organism phenotypic screen of the 'Pandemic Response Box' (from Medicines for Malaria Venture, MMV) and identified a hit compound, called ABX464, with activity against C. elegans and a related, parasitic nematode, Haemonchus contortus. Here, we tested a series of 44 synthesized analogues to explore the pharmacophore of activity on C. elegans and revealed five compounds whose potency was similar to or greater than that of ABX464, but which were not toxic to human hepatoma (HepG2) cells. Subsequently, we employed thermal proteome profiling (TPP), protein structure prediction and an in silico-docking algorithm to predict ABX464-target candidates. Taken together, the findings from this study contribute significantly to the early-stage drug discovery of a new nematocide based on ABX464. Future work is aimed at validating the ABX464-protein interactions identified here, and at assessing ABX464 and associated analogues against a panel of parasitic nematodes, towards developing a new anthelmintic with a mechanism of action that is distinct from any of the compounds currently available commercially.


Subject(s)
Anthelmintics , Nematoda , Quinolines , Animals , Humans , Caenorhabditis elegans , Anthelmintics/pharmacology , Anthelmintics/chemistry , Structure-Activity Relationship
12.
Genome Biol ; 24(1): 277, 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38049885

ABSTRACT

BACKGROUND: Recent state-of-the-art sequencing technologies enable the investigation of challenging regions in the human genome and expand the scope of variant benchmarking datasets. Herein, we sequence a Chinese Quartet, comprising two monozygotic twin daughters and their biological parents, using four short and long sequencing platforms (Illumina, BGI, PacBio, and Oxford Nanopore Technology). RESULTS: The long reads from the monozygotic twin daughters are phased into paternal and maternal haplotypes using the parent-child genetic map and for each haplotype. We also use long reads to generate haplotype-resolved whole-genome assemblies with completeness and continuity exceeding that of GRCh38. Using this Quartet, we comprehensively catalogue the human variant landscape, generating a dataset of 3,962,453 SNVs, 886,648 indels (< 50 bp), 9,726 large deletions (≥ 50 bp), 15,600 large insertions (≥ 50 bp), 40 inversions, 31 complex structural variants, and 68 de novo mutations which are shared between the monozygotic twin daughters. Variants underrepresented in previous benchmarks owing to their complexity (including those located at long repeat regions, complex structural variants, and de novo mutations) are systematically examined in this study. CONCLUSIONS: In summary, this study provides high-quality haplotype-resolved assemblies and a comprehensive set of benchmarking resources for two Chinese monozygotic twin samples which, relative to existing benchmarks, offers expanded genomic coverage and insight into complex variant categories.


Subject(s)
Benchmarking , East Asian People , Monozygotic Twins , Humans , East Asian People/genetics , Genomics , Haplotypes , High-Throughput Nucleotide Sequencing , DNA Sequence Analysis , Monozygotic Twins/genetics , Twin Studies as Topic
13.
Genome Biol ; 24(1): 270, 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38012772

ABSTRACT

BACKGROUND: Genomic DNA reference materials are widely recognized as essential for ensuring data quality in omics research. However, relying solely on reference datasets to evaluate the accuracy of variant calling results is incomplete, as they are limited to benchmark regions. Therefore, it is important to develop DNA reference materials that enable the assessment of variant detection performance across the entire genome. RESULTS: We established a DNA reference material suite from four immortalized cell lines derived from a family of parents and monozygotic twins. Comprehensive reference datasets of 4.2 million small variants and 15,000 structural variants were integrated and certified for evaluating the reliability of germline variant calls inside the benchmark regions. Importantly, the genetic built-in-truth of the Quartet family design enables estimation of the precision of variant calls outside the benchmark regions. Using the Quartet reference materials along with study samples, batch effects are objectively monitored and alleviated by training a machine learning model with the Quartet reference datasets to remove potential artifact calls. Moreover, the matched RNA and protein reference materials and datasets from the Quartet project enable cross-omics validation of variant calls from multiomics data. CONCLUSIONS: The Quartet DNA reference materials and reference datasets provide a unique resource for objectively assessing the quality of germline variant calls throughout the whole-genome regions and improving the reliability of large-scale genomic profiling.


Subject(s)
Benchmarking , Human Genome , Humans , Reproducibility of Results , Single Nucleotide Polymorphism , Germ Cells , High-Throughput Nucleotide Sequencing/methods
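
The "built-in truth" of a family of two parents and monozygotic twins, as described in the abstract above, can be sketched as a consistency check on genotype calls. This is an illustrative sketch only, not the study's pipeline: it assumes unphased diploid autosomal genotypes encoded as allele pairs, and the function names and example calls are hypothetical.

```python
from itertools import product


def is_mendelian_consistent(father, mother, twin1, twin2):
    """Return True if the twins share a genotype that the parents could transmit.

    Each genotype is an unphased pair of alleles, e.g. (0, 1).
    Assumption: diploid autosomal site; genuine de novo mutations would be
    flagged as violations by this simple check.
    """
    # Monozygotic twins are expected to carry identical genotypes.
    if sorted(twin1) != sorted(twin2):
        return False
    # A child genotype takes one allele from each parent.
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(twin1)) in possible


def mendelian_consistency_rate(calls):
    """Fraction of sites whose four genotype calls are mutually consistent."""
    consistent = sum(is_mendelian_consistent(*site) for site in calls)
    return consistent / len(calls)


# Hypothetical calls: (father, mother, twin1, twin2) per site.
example_calls = [
    ((0, 1), (0, 0), (0, 1), (0, 1)),  # consistent
    ((0, 0), (0, 0), (0, 1), (0, 1)),  # allele 1 absent from both parents
    ((0, 1), (1, 1), (1, 1), (0, 1)),  # twins disagree
    ((1, 1), (0, 1), (0, 1), (1, 0)),  # consistent; allele order is ignored
]
example_rate = mendelian_consistency_rate(example_calls)
```

Because the check needs no external truth set, a rate computed this way can be evaluated at any site with calls for all four family members, including sites outside benchmark regions.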
14.
Genome Biol ; 24(1): 245, 2023 Oct 26.
Article in English | MEDLINE | ID: mdl-37884999

ABSTRACT

The Quartet Data Portal facilitates community access to well-characterized reference materials, reference datasets, and related resources established based on a family of four individuals with identical twins from the Quartet Project. Users can request DNA, RNA, protein, and metabolite reference materials, as well as datasets generated across omics, platforms, labs, protocols, and batches. Reproducible analysis tools allow for objective performance assessment of user-submitted data, while interactive visualization tools support rapid exploration of reference datasets. A closed-loop "distribution-collection-evaluation-integration" workflow enables updates and integration of community-contributed multiomics data. Ultimately, this portal helps promote the advancement of reference datasets and multiomics quality control.


Subject(s)
Multiomics , Software , Humans , Quality Control
16.
Genome Biol ; 24(1): 201, 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37674217

ABSTRACT

BACKGROUND: Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms are proposed to facilitate data integration. However, their respective advantages and limitations are not adequately assessed in terms of omics types, the performance metrics, and the application scenarios. RESULTS: As part of the Quartet Project for quality control and data integration of multiomics profiling, we comprehensively assess the performance of seven batch-effect correction algorithms based on different performance metrics of clinical relevance, i.e., the accuracy of identifying differentially expressed features, the robustness of predictive models, and the ability to accurately cluster cross-batch samples into their own donors. The ratio-based method, i.e., by scaling absolute feature values of study samples relative to those of concurrently profiled reference material(s), is found to be much more effective and broadly applicable than others, especially when batch effects are completely confounded with biological factors of study interest. We further provide practical guidelines for implementing the ratio-based approach in increasingly large-scale multiomics studies. CONCLUSIONS: Multiomics measurements are prone to batch effects, which can be effectively corrected using ratio-based scaling of the multiomics data. Our study lays the foundation for eliminating batch effects at a ratio scale.


Subject(s)
Algorithms , Multiomics , Base Composition , Benchmarking , Clinical Relevance
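
The ratio-based scaling described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes strictly positive feature values, a log2 scale, a multiplicative feature-specific batch effect, and reference replicates identified by column index. The function name `ratio_scale` and all numbers are hypothetical.

```python
import numpy as np


def ratio_scale(batch_values, reference_columns):
    """Express each sample as a log2 ratio to the reference profiled in the same batch.

    batch_values:      features x samples array of positive absolute values
    reference_columns: column indices of the reference-material replicates
    """
    log_values = np.log2(batch_values)
    # Per-feature mean of the reference replicates on the log scale.
    reference_profile = log_values[:, reference_columns].mean(axis=1, keepdims=True)
    return log_values - reference_profile


# Hypothetical data: column 0 is the reference sample, column 1 a study sample.
true_abundance = np.array([[100.0, 200.0],
                           [50.0, 25.0],
                           [10.0, 40.0]])
# Assumed feature-specific multiplicative distortions for two batches.
batch_a = true_abundance * np.array([[1.0], [2.0], [0.5]])
batch_b = true_abundance * np.array([[3.0], [0.5], [4.0]])

ratios_a = ratio_scale(batch_a, [0])
ratios_b = ratio_scale(batch_b, [0])
```

In this toy setting the absolute values of the two batches disagree, while the ratio-scaled values coincide, because any factor shared by the study sample and the reference within a batch cancels in the ratio.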
17.
Genome Biol ; 24(1): 202, 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37674236

ABSTRACT

BACKGROUND: Quantitative proteomics is an indispensable tool in life science research. However, there is a lack of reference materials for evaluating the reproducibility of label-free liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based measurements among different instruments and laboratories. RESULTS: Here, we develop the Quartet standard as a proteome reference material with built-in truths, and distribute the same aliquots to 15 laboratories with nine conventional LC-MS/MS platforms across six cities in China. Relative abundances of over 12,000 proteins on 816 mass spectrometry files are obtained and compared for reproducibility among the instruments and laboratories to ultimately generate proteomics benchmark datasets. There is a wide dynamic range of proteomes spanning about 7 orders of magnitude, and the injection order has marked effects on quantitative rather than qualitative characteristics. CONCLUSION: Overall, the Quartet offers valuable standard materials and data resources for improving the quality control of proteomic analyses as well as the reproducibility and reliability of research findings.


Subject(s)
Proteomics , Tandem Mass Spectrometry , Liquid Chromatography , Reproducibility of Results , Proteome
18.
Nat Biotechnol ; 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37679545

ABSTRACT

Certified RNA reference materials are indispensable for assessing the reliability of RNA sequencing to detect intrinsically small biological differences in clinical settings, such as molecular subtyping of diseases. As part of the Quartet Project for quality control and data integration of multi-omics profiling, we established four RNA reference materials derived from immortalized B-lymphoblastoid cell lines from four members of a monozygotic twin family. Additionally, we constructed ratio-based transcriptome-wide reference datasets between two samples, providing cross-platform and cross-laboratory 'ground truth'. Investigation of the intrinsically subtle biological differences among the Quartet samples enables sensitive assessment of cross-batch integration of transcriptomic measurements at the ratio level. The Quartet RNA reference materials, combined with the ratio-based reference datasets, can serve as unique resources for assessing and improving the quality of transcriptomic data in clinical and biological settings.

19.
Nat Biotechnol ; 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37679543

ABSTRACT

Characterization and integration of the genome, epigenome, transcriptome, proteome and metabolome of different datasets is difficult owing to a lack of ground truth. Here we develop and characterize suites of publicly available multi-omics reference materials of matched DNA, RNA, protein and metabolites derived from immortalized cell lines from a family quartet of parents and monozygotic twin daughters. These references provide built-in truth defined by relationships among the family members and the information flow from DNA to RNA to protein. We demonstrate how using a ratio-based profiling approach that scales the absolute feature values of a study sample relative to those of a concurrently measured common reference sample produces reproducible and comparable data suitable for integration across batches, labs, platforms and omics types. Our study identifies reference-free 'absolute' feature quantification as the root cause of irreproducibility in multi-omics measurement and data integration and establishes the advantages of ratio-based multi-omics profiling with common reference materials.

20.
Int J Mol Sci ; 24(15), 2023 Aug 01.
Article in English | MEDLINE | ID: mdl-37569696

ABSTRACT

Biodiversity within the animal kingdom is associated with extensive molecular diversity. The expansion of genomic, transcriptomic and proteomic data sets for invertebrate groups and species with unique biological traits necessitates reliable in silico tools for the accurate identification and annotation of molecules and molecular groups. However, conventional tools are inadequate for lesser-known organismal groups, such as eukaryotic pathogens (parasites), so that improved approaches are urgently needed. Here, we established a combined sequence- and structure-based workflow system to harness well-curated publicly available data sets and resources to identify, classify and annotate proteases and protease inhibitors of a highly pathogenic parasitic roundworm (nematode) of global relevance, called Haemonchus contortus (barber's pole worm). This workflow performed markedly better than conventional, sequence-based classification and annotation alone and allowed the first genome-wide characterisation of protease and protease inhibitor genes and gene products in this worm. In total, we identified 790 genes encoding 860 proteases and protease inhibitors representing 83 gene families. The proteins inferred included 280 metallo-, 145 cysteine, 142 serine, 121 aspartic and 81 "mixed" proteases as well as 91 protease inhibitors, all of which had marked physicochemical diversity and inferred involvements in >400 biological processes or pathways. A detailed investigation revealed a remarkable expansion of some protease or inhibitor gene families, which are likely linked to parasitism (e.g., host-parasite interactions, immunomodulation and blood-feeding) and exhibit stage- or sex-specific transcription profiles. This investigation provides a solid foundation for detailed explorations of the structures and functions of proteases and protease inhibitors of H. contortus and related nematodes, and it could assist in the discovery of new drug or vaccine targets against infections or diseases.


Subject(s)
Haemonchus , Nematoda , Parasites , Animals , Male , Female , Haemonchus/genetics , Haemonchus/chemistry , Haemonchus/metabolism , Host-Parasite Interactions/genetics , Peptide Hydrolases/metabolism , Proteomics , Protease Inhibitors/pharmacology , Protease Inhibitors/metabolism , Endopeptidases/metabolism , Informatics