ABSTRACT
INTRODUCTION: Maximising alternative sample types for genomics in advanced lung cancer is important because bronchoscopic samples are sometimes insufficient for this purpose. Further, the clinical applications of comprehensive molecular analysis such as whole genome sequencing (WGS) are rapidly developing. Diff-Quik cytology smears from EBUS TBNA are an alternative source of DNA, but their feasibility for WGS has not been previously demonstrated. METHODS: Diff-Quik smears were collected along with research cell pellets. RESULTS: Tumour content of smears was compared with that of research cell pellets from 42 patients, which showed good correlation (Spearman correlation 0.85, P < 0.0001). A subset of eight smears underwent WGS, which presented similar mutation profiles to WGS of the matched cell pellet. DNA yield was predicted using a regression equation of the smears' cytology features, which correctly predicted DNA yield > 1500 ng in 7 out of 8 smears. CONCLUSIONS: WGS of commonly collected Diff-Quik slides is feasible and their DNA yield can be predicted.
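As an illustration of the Spearman correlation quoted above, the following minimal sketch computes a rank correlation between two lists of tumour content estimates. The function names and the example values are hypothetical and are not taken from the study; the study's own statistical software is not stated in the abstract.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Illustrative tumour content percentages only, not data from the study.
smear_tumour_pct = [10, 30, 45, 60, 80]
pellet_tumour_pct = [15, 25, 50, 55, 90]
print(spearman(smear_tumour_pct, pellet_tumour_pct))
```

Because the statistic depends only on ranks, it is insensitive to any monotonic rescaling of either estimate, which suits semi-quantitative microscopy scores.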
Subject(s)
Lung Neoplasms , Humans , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Biopsy, Fine-Needle , Endosonography , Whole Genome Sequencing , Endoscopic Ultrasound-Guided Fine Needle Aspiration , Bronchoscopy , Lymph Nodes/pathology
ABSTRACT
OBJECTIVES: Analysis of oral dysbiosis in individuals sharing genetic and environmental risk factors with rheumatoid arthritis (RA) patients may illuminate how microbiota contribute to disease susceptibility. We studied the oral microbiota in a prospective cohort of patients with RA, first-degree relatives (FDR) and healthy controls (HC), then genomically and functionally characterised streptococcal species from each group to understand their potential contribution to RA development. METHODS: After DNA extraction from tongue swabs, targeted 16S rRNA gene sequencing and statistical analysis, we defined a microbial dysbiosis score based on an operational taxonomic unit signature of disease. After selective culture from swabs, we identified streptococci by sequencing. We examined the ability of streptococcal cell walls (SCW) from isolates to induce cytokines from splenocytes and arthritis in ZAP-70-mutant SKG mice. RESULTS: RA and FDR were more likely to have periodontitis symptoms. An oral microbial dysbiosis score discriminated RA and HC subjects and predicted similarity of FDR to RA. Streptococcaceae were major contributors to the score. We identified 10 out of 15 streptococcal isolates as S. parasalivarius sp. nov., a distinct sister species to S. salivarius. Tumour necrosis factor and interleukin 6 production in vitro differed in response to individual S. parasalivarius isolates, suggesting strain-specific effects on innate immunity. Cytokine secretion was associated with the presence of proteins potentially involved in S. parasalivarius SCW synthesis. Systemic administration of SCW from RA- and HC-associated S. parasalivarius strains induced similar chronic arthritis. CONCLUSIONS: Dysbiosis-associated periodontal inflammation and barrier dysfunction may permit arthritogenic insoluble pro-inflammatory pathogen-associated molecules, like SCW, to reach synovial tissue.
Subject(s)
Arthritis, Rheumatoid/microbiology , Biopolymers/isolation & purification , Dysbiosis/microbiology , Peptidoglycan/isolation & purification , Periodontitis/microbiology , Streptococcus/isolation & purification , Adult , Animals , Disease Susceptibility/microbiology , Female , Humans , Male , Mice , Microbiota , Middle Aged , Mouth/microbiology , Pedigree , RNA, Ribosomal, 16S
ABSTRACT
We examined transcriptional changes in CD4+ T cells during blood-stage Plasmodium falciparum infection in individuals without a history of previous parasite exposure. Transcription of CXCL8 (encoding interleukin 8) in CD4+ T cells was identified as an early biomarker of submicroscopic P. falciparum infection, with predictive power for parasite growth. Following antiparasitic drug treatment, a CD4+ T-cell regulatory phenotype developed. PD1 expression on CD49b+CD4+ T (putative type I regulatory T) cells after drug treatment negatively correlated with earlier parasite growth. Blockade of PD1 but no other immune checkpoint molecules tested increased interferon γ and interleukin 10 production in an ex vivo antigen-specific cellular assay at the peak of infection. These results demonstrate the early development of an immunoregulatory CD4+ T-cell phenotype in blood-stage P. falciparum infection and show that a selective immune checkpoint blockade may be used to modulate early developing antiparasitic immunoregulatory pathways as part of malaria vaccine and/or drug treatment protocols.
Subject(s)
Interleukin-8/genetics , Malaria Vaccines/immunology , Malaria, Falciparum/immunology , Plasmodium falciparum/immunology , Adolescent , Adult , Biomarkers/analysis , CD4-Positive T-Lymphocytes/immunology , Computational Biology , Humans , Lymphocyte Activation , Malaria, Falciparum/parasitology , Middle Aged , Parasitemia , Phenotype , T-Lymphocytes, Regulatory/immunology , Young Adult
ABSTRACT
BACKGROUND: Correct identification of the amyloidosis-causing protein is crucial for clinical management. Recently, the Mayo Clinic reported laser-capture microdissection (LCM) with liquid chromatography-coupled tandem mass spectrometry (MS/MS) as a new diagnostic tool for amyloid diagnosis. Here, we report an independent implementation of this proteomic diagnostics method at the Princess Alexandra Hospital Amyloidosis Centre in Brisbane, Australia. RESULTS: From 2010 to 2014, 138 biopsies received from 35 different organ sites were analysed by LCM-MS/MS using Congo Red staining to visualise amyloid deposits. There was insufficient tissue in the block for LCM for 7 cases. An amyloid forming protein was ultimately identified in 121 out of 131 attempted cases (94 %). Of the 121 successful cases, the Mayo Clinic amyloid proteomic signature (at least two of Serum Amyloid P, ApoE and ApoA4) was detected in 92 (76 %). Low levels of additional amyloid forming proteins were frequently identified with the main amyloid forming protein, which may reflect co-deposition of fibrils. Furthermore, vitronectin and clusterin were frequently identified in our samples. Adding vitronectin to the amyloid signature increases the number of positive cases, suggesting a potential 4th protein for the signature. In terms of clinical impact, amyloid typing by immunohistochemistry was attempted in 88 cases and reported as diagnostic in 39; however, 5 of these were subsequently revealed by proteomic analysis to be incorrect. Overall, the referring clinician's diagnosis of amyloid subtype was altered by proteomic analysis in 24 % of cases. While LCM-MS/MS was highly robust in protein identification, clinical information was still required for subtyping, particularly for systemic versus localized amyloidosis. CONCLUSIONS: This study reports the independent implementation and evaluation of a proteomics-based diagnostic for amyloidosis subtyping. Our results support LCM-MS/MS as a powerful new diagnostic technique for amyloidosis, but also identified some challenges and further development opportunities.
ABSTRACT
Endobronchial ultrasound-guided transbronchial needle aspiration (EBUS-TBNA) is often the only source of tumor tissue from patients with advanced, inoperable lung cancer. EBUS-TBNA aspirates are used for the diagnosis, staging, and genomic testing to inform therapy options. Here we extracted DNA and RNA from 220 EBUS-TBNA aspirates to evaluate their suitability for whole genome (WGS), whole exome (WES), and comprehensive panel sequencing. For a subset of 40 cases, the same nucleic acid extraction was sequenced using WGS, WES, and the TruSight Oncology 500 assay. Genomic features were compared between sequencing platforms and compared with those reported by clinical testing. A total of 204 aspirates (92.7%) had sufficient DNA (100 ng) for comprehensive panel sequencing, and 109 aspirates (49.5%) had sufficient material for WGS. Comprehensive sequencing platforms detected all seven clinically reported tier 1 actionable mutations, an additional three (7%) tier 1 mutations, six (15%) tier 2-3 mutations, and biomarkers of potential immunotherapy benefit (tumor mutation burden and microsatellite instability). As expected, WGS was more suited for the detection and discovery of emerging novel biomarkers of treatment response. WGS could be performed in half of all EBUS-TBNA aspirates, which points to the enormous potential of EBUS-TBNA as source material for large, well-curated discovery-based studies for novel and more effective predictors of treatment response. Comprehensive panel sequencing is possible in the vast majority of fresh EBUS-TBNA aspirates and enhances the detection of actionable mutations over current clinical testing.
ABSTRACT
BACKGROUND: Cytology smears are commonly collected during endobronchial ultrasound-guided transbronchial needle aspiration (EBUS TBNA) procedures but are rarely used for molecular testing. Studies are needed to demonstrate their great potential, in particular for the prediction of malignant cell DNA content and for utility in molecular diagnostics using large gene panels. METHODS: A prospective study was performed on samples from 66 patients with malignant lymph nodes who underwent EBUS TBNA. All patients had air-dried, Diff-Quik cytology smears and formalin-fixed, paraffin-embedded cell blocks collected for cytopathology and molecular testing. One hundred eighty-five smears were evaluated by microscopy to estimate malignant cell percentage and abundance and to calculate smear size and were subjected to DNA extraction. DNA from 56 smears from 27 patients was sequenced with the TruSight Oncology 500 assay (Illumina). RESULTS: Each microscopy parameter had a significant effect on the DNA yield. An algorithm was developed that predicted a >50-ng DNA yield of a smear with an area under the curve of 0.86. Fifty DNA samples (89%) with varying malignant yields were successfully sequenced. Low-malignant-cell content (<25%) and smear area (<15%) were the main reasons for failure. All standard-of-care mutations were detected in replicate smears from individual patients, regardless of malignant cell content. Tier 1/2 mutations were discovered in two cases where standard-of-care specimens were inadequate for sequencing. Smears were scored for tumor mutation burden. CONCLUSIONS: Microscopy of Diff-Quik smears can triage samples for comprehensive panel sequencing, which highlights smears as an excellent alternative to traditional testing with cell blocks.
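The area under the curve quoted above summarises how well a score separates smears that reach a DNA yield threshold from those that do not. The sketch below shows one standard way to compute it, as the fraction of positive/negative pairs that the score ranks correctly. The scores and labels are hypothetical, and the abstract does not describe the prediction algorithm itself, so this illustrates only the metric.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted scores (higher means more likely positive).
    labels: truthy for positives (e.g. yield above the threshold), falsy otherwise.
    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for s, l in zip(scores, labels) if l]
    negatives = [s for s, l in zip(scores, labels) if not l]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))

# Illustrative values only: hypothetical predicted scores and whether each
# smear actually exceeded the yield threshold.
predicted = [0.9, 0.2, 0.8, 0.1]
exceeded_threshold = [1, 1, 0, 0]
print(roc_auc(predicted, exceeded_threshold))
```

An AUC of 0.5 corresponds to a score with no discriminating ability, and 1.0 to perfect separation.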
Subject(s)
Lung Neoplasms , Humans , Prospective Studies , Lung Neoplasms/diagnosis , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Endoscopic Ultrasound-Guided Fine Needle Aspiration/methods , Mutation , Lymph Nodes/pathology
ABSTRACT
Introduction: Tumour Mutation Burden (TMB) is a potential biomarker for immune cancer therapies. Here we investigated parameters that might affect TMB using duplicate cytology smears obtained from endobronchial ultrasound transbronchial needle aspiration (EBUS TBNA)-sampled malignant lymph nodes. Methods: Individual Diff-Quik cytology smears were prepared for each needle pass. DNA extracted from each smear underwent sequencing using a large gene panel (TruSight Oncology 500 [TSO500], Illumina). TMB was estimated using the TSO500 Local App v. 2.0 (Illumina). Results: Twenty patients had two or more Diff-Quik smears (total 45 smears) which passed sequencing quality control. Average smear TMB was 8.7 ± 5.0 mutations per megabase (Mb). Sixteen of the 20 patients had paired samples with minimal differences in TMB score (average difference 1.3 ± 0.85). Paired samples from 13 patients had concordant TMB (scores below or above a threshold of 10 mutations/Mb). Markedly discrepant TMB was observed in four cases, with an average difference of 11.3 ± 2.7 mutations/Mb. Factors affecting TMB calling included sample tumour content, the amount of DNA used in sequencing, and bona fide heterogeneity of node tumour between paired samples. Conclusion: TMB assessment is feasible from EBUS-TBNA smears from a single needle pass. Repeated samples of a lymph node station have minimal variation in TMB in most cases. However, these novel data show how tumour content and minor changes in the site of node sampling can impact TMB. Further study is needed on whether all node aspirates should be combined into one sample, or whether testing independent nodes using smears is needed.
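The paired comparison described above can be sketched as follows: each smear is classed as TMB-high or TMB-low against the 10 mutations/Mb threshold, and a pair is concordant when both smears fall on the same side. The function names and example values are hypothetical, and how a score exactly at the threshold was handled in the study is not stated, so counting it as high here is an assumption.

```python
TMB_THRESHOLD = 10.0  # mutations/Mb, the threshold quoted in the abstract

def is_tmb_high(tmb, threshold=TMB_THRESHOLD):
    # Assumption: a score exactly at the threshold counts as high.
    return tmb >= threshold

def paired_tmb_summary(tmb_a, tmb_b, threshold=TMB_THRESHOLD):
    """Absolute TMB difference and threshold concordance for two smears."""
    return {
        "difference": abs(tmb_a - tmb_b),
        "concordant": is_tmb_high(tmb_a, threshold) == is_tmb_high(tmb_b, threshold),
    }

# Illustrative values only, not data from the study.
example_pairs = {"patient_A": (7.1, 8.6), "patient_B": (6.3, 17.2)}
for patient, (smear_1, smear_2) in example_pairs.items():
    print(patient, paired_tmb_summary(smear_1, smear_2))
```

Note that a small absolute difference can still produce a discordant call when both scores sit close to the threshold, so difference and concordance are worth reporting together.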
ABSTRACT
Oesophageal adenocarcinoma is a poor prognosis cancer and the molecular features underpinning response to treatment remain unclear. We investigate whole genome, transcriptomic and methylation data from 115 oesophageal adenocarcinoma patients mostly from the DOCTOR phase II clinical trial (Australian New Zealand Clinical Trials Registry-ACTRN12609000665235), with exploratory analysis pre-specified in the study protocol of the trial. We report genomic features associated with poorer overall survival, such as the APOBEC mutational and RS3-like rearrangement signatures. We also show that positron emission tomography non-responders have more sub-clonal genomic copy number alterations. Transcriptomic analysis categorises patients into four immune clusters correlated with survival. The immune suppressed cluster is associated with worse survival, enriched with myeloid-derived cells, and an epithelial-mesenchymal transition signature. The immune hot cluster is associated with better survival, enriched with lymphocytes, myeloid-derived cells, and an immune signature including CCL5, CD8A, and NKG7. The immune clusters highlight patients who may respond to immunotherapy and thus may guide future clinical trials.
Subject(s)
Adenocarcinoma , Esophageal Neoplasms , Humans , Neoadjuvant Therapy , Multiomics , Australia , Adenocarcinoma/drug therapy , Adenocarcinoma/genetics , Esophageal Neoplasms/drug therapy , Esophageal Neoplasms/genetics
ABSTRACT
BACKGROUND: Many families and individuals do not meet criteria for a known hereditary cancer syndrome but display unusual clusters of cancers. These families may carry pathogenic variants in cancer predisposition genes and be at higher risk for developing cancer. METHODS: This multi-centre prospective study recruited 195 cancer-affected participants suspected to have a hereditary cancer syndrome for whom previous clinical targeted genetic testing was either not informative or not available. To identify pathogenic disease-causing variants explaining participant presentation, germline whole-genome sequencing (WGS) and a comprehensive cancer virtual gene panel analysis were undertaken. RESULTS: Pathogenic variants consistent with the presenting cancer(s) were identified in 5.1% (10/195) of participants and pathogenic variants considered secondary findings with potential risk management implications were identified in another 9.7% (19/195) of participants. Health economic analysis estimated the marginal cost per case with an actionable variant was significantly lower for upfront WGS with virtual panel ($8744AUD) compared to standard testing followed by WGS ($24,894AUD). Financial analysis suggests that national adoption of diagnostic WGS testing would require a ninefold increase in government annual expenditure compared to conventional testing. CONCLUSIONS: These findings make a case for replacing conventional testing with WGS to deliver clinically important benefits for cancer patients and families. The uptake of such an approach will depend on the perspectives of different payers on affordability.
Subject(s)
Neoplastic Syndromes, Hereditary , Humans , Prospective Studies , Oncogenes , Genetic Testing , Germ Cells
ABSTRACT
Introduction: Endobronchial ultrasound-guided transbronchial needle aspiration (EBUS TBNA) is an important means of obtaining tissue in advanced lung cancer. Optimizing the EBUS TBNA needling technique is important to maintain procedural simplicity and maximize sample quality for emerging molecular diagnostics. Methods: We prospectively compared three versus 10 agitations of the needle in sequential passes into the lymph node using separate needles. The resulting Diff-Quik cytology smears were quantitatively assessed using microscopic measures (tumor cell cellularity, abundance scores, erythrocyte contamination) and DNA yields. Microscopy was reported by two cytopathologists, and an inter-rater assessment was made by four additional cytopathologists. Results: In 86 patients confirmed as having malignant disease by EBUS TBNA (45 males, 41 females), a mean of 5.3 smears were made per patient, with a total of 459 smears scored by pathologists and 168 paired smears extracted for DNA. There was no significant difference between three versus 10 agitations for smear cellularity (p = 0.44), DNA yield (p = 0.84), or DNA integrity (p = 0.20), but there was significantly less contamination by erythrocytes from three agitations (chi-square p = 0.008). There was significantly more DNA in the first pass into the node using three agitations than with other passes and with 10 agitations (pass × agitations interaction, p = 0.031). Reviewing pathologists correctly classified smears as more than or equal to 25% cellularity 86.3% of the time (κ = 0.63 [95% confidence interval: 0.55-0.71]). Conclusions: Three agitations are noninferior to 10 agitations for overall abundance of malignant cells and DNA content on smears. A smear with adequate DNA for panel sequencing could almost always be made with the first needle pass using three agitations.
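The κ statistic reported above measures agreement beyond what chance alone would produce. The following minimal sketch computes Cohen's kappa for two raters making a binary call (cellularity at or above 25%, or below). The ratings are hypothetical, and the abstract does not say which kappa variant was used for the multi-rater assessment, so the two-rater form here is an assumption made for illustration.

```python
def cohens_kappa(rater_1, rater_2):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_1) != len(rater_2) or not rater_1:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_1)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    # Expected agreement if the raters labelled independently at their own rates.
    labels = set(rater_1) | set(rater_2)
    expected = sum(
        (rater_1.count(label) / n) * (rater_2.count(label) / n) for label in labels
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)

# Illustrative calls only (1 = cellularity >= 25%, 0 = below), not study data.
reference_call = [1, 1, 1, 0, 0, 0, 1, 0]
reviewer_call = [1, 1, 0, 0, 0, 0, 1, 1]
print(cohens_kappa(reference_call, reviewer_call))
```

In this example the raters agree on 6 of 8 smears (75%), but half of that agreement would be expected by chance, which is why kappa is lower than the raw agreement rate.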
ABSTRACT
BACKGROUND: Endometrial cancer (EC) is a major gynecological cancer with increasing incidence. It comprises four molecular subtypes with differing etiology, prognoses, and responses to chemotherapy. In the future, clinical trials testing new single agents or combination therapies will be targeted to the molecular subtype most likely to respond. As pre-clinical models that faithfully represent the molecular subtypes of EC are urgently needed, we sought to develop and characterize a panel of novel EC patient-derived xenograft (PDX) models. METHODS: Here, we report whole exome or whole genome sequencing of 11 PDX models and their matched primary tumor. Analysis of multiple PDX lineages and passages was performed to study tumor heterogeneity across lineages and/or passages. Based on recent reports of frequent defects in the homologous recombination (HR) pathway in EC, we assessed mutational signatures and HR deficiency scores and correlated these with in vivo responses to the PARP inhibitor (PARPi) talazoparib in six PDXs representing the copy number high/p53-mutant and mismatch-repair deficient molecular subtypes of EC. RESULTS: PDX models were successfully generated from grade 2/3 tumors, including three uterine carcinosarcomas. The models showed similar histomorphology to the primary tumors and represented all four molecular subtypes of EC, including five mismatch-repair deficient models. The different PDX lineages showed a wide range of inter-tumor and intra-tumor heterogeneity. However, for most PDX models, one arm recapitulated the molecular landscape of the primary tumor without major genomic drift. An in vivo response to talazoparib was detected in four copy number high models. Two models (carcinosarcomas) showed a response consistent with stable disease and two models (one copy number high serous EC and another carcinosarcoma) showed significant tumor growth inhibition, albeit one consistent with progressive disease; however, all lacked the HR deficiency genomic signature. CONCLUSIONS: EC PDX models represent the four molecular subtypes of disease and can capture intra-tumor heterogeneity of the original primary tumor. PDXs of the copy number high molecular subtype showed sensitivity to PARPi; however, deeper and more durable responses will likely require combination of PARPi with other agents.
Subject(s)
Antineoplastic Agents , Endometrial Neoplasms , Antineoplastic Agents/therapeutic use , Endometrial Neoplasms/drug therapy , Endometrial Neoplasms/genetics , Female , Genomics , Heterografts , Humans , Poly(ADP-ribose) Polymerase Inhibitors/pharmacology , Xenograft Model Antitumor Assays
ABSTRACT
We concurrently examine the whole genome, transcriptome, methylome, and immune cell infiltrates in baseline tumors from 77 patients with advanced cutaneous melanoma treated with anti-PD-1 with or without anti-CTLA-4. We show that high tumor mutation burden (TMB), neoantigen load, expression of IFNγ-related genes, programmed death ligand expression, low PSMB8 methylation (therefore high expression), and T cells in the tumor microenvironment are associated with response to immunotherapy. No specific mutation correlates with therapy response. A multivariable model combining the TMB and IFNγ-related gene expression robustly predicts response (89% sensitivity, 53% specificity, area under the curve [AUC], 0.84); tumors with high TMB and a high IFNγ signature show the best response to immunotherapy. This model validates in an independent cohort (80% sensitivity, 59% specificity, AUC, 0.79). Except for a JAK3 loss-of-function mutation, for patients who did not respond as predicted there is no obvious biological mechanism that clearly explained their outlier status, consistent with intratumor and intertumor heterogeneity in response to immunotherapy.
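For readers less familiar with the performance figures quoted above, the sketch below shows how sensitivity and specificity are derived from a confusion matrix. Treating responders as the positive class is an assumption, the counts are hypothetical, and nothing here reproduces the study's multivariable model.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    Sensitivity: fraction of actual positives (assumed here to be responders)
    that the model predicts as positive.
    Specificity: fraction of actual negatives that the model predicts as negative.
    """
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one actual positive and one actual negative")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# Illustrative counts only, not data from the study.
sens, spec = sensitivity_specificity(true_pos=8, false_neg=2, true_neg=6, false_pos=4)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

A model with high sensitivity but modest specificity, as reported above, misses few responders at the cost of predicting response in some patients who do not respond.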
Subject(s)
Antineoplastic Agents, Immunological/therapeutic use , Carcinoma, Non-Small-Cell Lung/drug therapy , Lung Neoplasms/drug therapy , Melanoma/drug therapy , Skin Neoplasms/drug therapy , B7-H1 Antigen/immunology , Biomarkers, Tumor/genetics , CTLA-4 Antigen/immunology , Carcinoma, Non-Small-Cell Lung/immunology , Humans , Immunotherapy/methods , Lung Neoplasms/genetics , Lung Neoplasms/immunology , Melanoma/immunology , Mutation/genetics , Skin Neoplasms/immunology , Tumor Microenvironment/immunology , Melanoma, Cutaneous Malignant
ABSTRACT
BACKGROUND: Malignant pleural mesothelioma (MPM) has a poor overall survival with few treatment options. Whole genome sequencing (WGS) combined with the immune features of MPM offers the prospect of identifying changes that could inform future clinical trials. METHODS: We analysed somatic mutations from 229 MPM samples, including previously published data and 58 samples that had undergone WGS within this study. This was combined with RNA-seq analysis to characterize the tumour immune environment. RESULTS: The comprehensive genome analysis identified 12 driver genes, including new candidate genes. Whole genome doubling was a frequent event that correlated with shorter survival. Mutational signature analysis revealed SBS5/40 were dominant in 93% of samples, and defects in homologous recombination repair were infrequent in our cohort. The tumour immune environment contained high M2 macrophage infiltrate linked with MMP2, MMP14, TGFB1 and CCL2 expression, representing an immune suppressive environment. The expression of TGFB1 was associated with overall survival. A small subset of samples (less than 10%) had a higher proportion of CD8 T cells and a high cytolytic score, suggesting a 'hot' immune environment independent of the somatic mutations. CONCLUSIONS: We propose that accounting for genomic and immune microenvironment status may influence therapeutic planning in the future.
Subject(s)
Lung Neoplasms , Mesothelioma, Malignant , Mesothelioma , Pleural Neoplasms , Genomics , Humans , Lung Neoplasms/genetics , Mesothelioma/genetics , Pleural Neoplasms/genetics , Pleural Neoplasms/pathology , Tumor Microenvironment/genetics
ABSTRACT
Melanoma is a cancer of melanocytes, with multiple subtypes based on body site location. Cutaneous melanoma is associated with skin exposed to ultraviolet radiation; uveal melanoma occurs in the eyes; mucosal melanoma occurs in internal mucous membranes; and acral melanoma occurs on the palms, soles, and nail beds. Here, we present the largest whole-genome sequencing study of melanoma to date, with 570 tumors profiled, as well as methylation and RNA sequencing for subsets of tumors. Uveal melanoma is genomically distinct from other melanoma subtypes, harboring the lowest tumor mutation burden and with significantly mutated genes in the G-protein signaling pathway. Most cutaneous, acral, and mucosal melanomas share alterations in components of the MAPK, PI3K, p53, p16, and telomere pathways. However, the mechanism by which these pathways are activated or inactivated varies between melanoma subtypes. Additionally, we identify potential novel germline predisposition genes for some of the less common melanoma subtypes. SIGNIFICANCE: This is the largest whole-genome analysis of melanoma to date, comprehensively comparing the genomics of the four major melanoma subtypes. This study highlights both similarities and differences between the subtypes, providing insights into the etiology and biology of melanoma. This article is highlighted in the In This Issue feature, p. 2711.
Subject(s)
Melanoma , Skin Neoplasms , Humans , Melanoma/pathology , Skin Neoplasms/genetics , Skin Neoplasms/metabolism , Ultraviolet Rays , Genomics , Mutation , Melanoma, Cutaneous Malignant
ABSTRACT
Here we report the DNA methylation profile of 84 sporadic pancreatic neuroendocrine tumors (PanNETs) with associated clinical and genomic information. We identified three subgroups of PanNETs, termed T1, T2 and T3, with distinct patterns of methylation. The T1 subgroup was enriched for functional tumors and ATRX, DAXX and MEN1 wild-type genotypes. The T2 subgroup contained tumors with mutations in ATRX, DAXX and MEN1 and recurrent patterns of chromosomal losses in half of the genome with no association between regions with recurrent loss and methylation levels. T2 tumors were larger and had lower methylation in the MGMT gene body, which showed positive correlation with gene expression. The T3 subgroup harboured mutations in MEN1 with recurrent loss of chromosome 11, was enriched for grade G1 tumors and showed histological parameters associated with better prognosis. Our results suggest a role for methylation in both driving tumorigenesis and potentially stratifying prognosis in PanNETs.
Subject(s)
Biomarkers, Tumor/genetics , Carcinoma, Neuroendocrine/genetics , DNA Methylation , Epigenesis, Genetic , Epigenome , Pancreatic Neoplasms/genetics , Carcinoma, Neuroendocrine/metabolism , Epigenomics , Genetic Predisposition to Disease , Humans , Neoplasm Grading , Pancreatic Neoplasms/pathology , Phenotype , Tumor Burden
ABSTRACT
To increase understanding of the genomic landscape of acral melanoma, a rare form of melanoma occurring on palms, soles or nail beds, whole genome sequencing of 87 tumors with matching transcriptome sequencing for 63 tumors was performed. Here we report that mutational signature analysis reveals a subset of tumors, mostly subungual, with an ultraviolet radiation signature. Significantly mutated genes are BRAF, NRAS, NF1, NOTCH2, PTEN and TYRP1. Mutations and amplification of KIT are also common. Structural rearrangement and copy number signatures show that whole genome duplication, aneuploidy and complex rearrangements are common. Complex rearrangements occur recurrently and are associated with amplification of TERT, CDK4, MDM2, CCND1, PAK1 and GAB2, indicating potential therapeutic options.
Subject(s)
Melanoma/genetics , Skin Neoplasms/genetics , Female , GTP Phosphohydrolases/genetics , GTP Phosphohydrolases/metabolism , Gene Amplification , Gene Dosage , Genomics , Humans , Male , Melanoma/metabolism , Membrane Glycoproteins/genetics , Membrane Glycoproteins/metabolism , Membrane Proteins/genetics , Membrane Proteins/metabolism , Mutation , Oxidoreductases/genetics , Oxidoreductases/metabolism , Proto-Oncogene Proteins B-raf/genetics , Proto-Oncogene Proteins B-raf/metabolism , Receptor, Notch2/genetics , Receptor, Notch2/metabolism , Skin Neoplasms/metabolism , Whole Genome Sequencing
ABSTRACT
Culture-independent techniques, such as shotgun metagenomics and 16S rRNA amplicon sequencing, have dramatically changed the way we can examine microbial communities. Recently, changes in microbial community structure and dynamics have been associated with a growing list of human diseases. The identification and comparison of bacteria driving those changes requires the development of sound statistical tools, especially if microbial biomarkers are to be used in a clinical setting. We present mixMC, a novel multivariate data analysis framework for metagenomic biomarker discovery. mixMC accounts for the compositional nature of 16S data and enables detection of subtle differences when high inter-subject variability is present due to microbial sampling performed repeatedly on the same subjects, but in multiple habitats. Through data dimension reduction, the multivariate methods provide insightful graphical visualisations to characterise each type of environment in a detailed manner. We applied mixMC to 16S microbiome studies focusing on multiple body sites in healthy individuals, compared our results with existing statistical tools and illustrated the added value of using multivariate methodologies to fully characterise and compare microbial communities.
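To make "the compositional nature of 16S data" concrete: sequencing counts carry only relative information, because the total number of reads per sample is set by the sequencer rather than by the community. One common way to handle this is the centred log-ratio (CLR) transform, sketched below. Whether this matches the transformation used inside mixMC is an assumption, as is the pseudocount of 1 used to avoid taking the log of zero; the function name and counts are hypothetical.

```python
import math

def clr_transform(counts, pseudocount=1.0):
    """Centred log-ratio transform of one sample's taxon counts.

    Each value becomes log(count) minus the mean log count of the sample,
    which equals the log of the count divided by the sample's geometric mean.
    The pseudocount is an assumption made here to handle zero counts.
    """
    if not counts:
        raise ValueError("counts must be non-empty")
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Illustrative counts only: the same community sequenced at two depths.
shallow = [1, 2, 4]
deep = [10, 20, 40]
print(clr_transform(shallow, pseudocount=0.0))
print(clr_transform(deep, pseudocount=0.0))
```

With no pseudocount, the two samples give identical CLR values even though the deeper one has ten times as many reads, which is the property that makes the transform suitable for comparing samples of different sequencing depth.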