ABSTRACT
BACKGROUND: X-chromosome inactivation (XCI) is an epigenetic process that occurs during early development in mammalian females by randomly silencing one of two copies of the X chromosome in each cell. The preferential inactivation of either the maternal or paternal copy of the X chromosome in a majority of cells results in a skewed or non-random pattern of X inactivation and is observed in over 25% of adult females. Identifying skewed X inactivation is of clinical significance in patients with suspected rare genetic diseases due to the possibility of biased expression of disease-causing genes present on the active X chromosome. The current clinical test for the detection of skewed XCI relies on the methylation status of the methylation-sensitive restriction enzyme (HpaII) binding site present in proximity to short tandem polymorphic repeats on the androgen receptor (AR) gene. This single-locus approach yields uninformative or inconclusive data for 10-20% of tests. Further, recent studies have shown inconsistency between methylation of the AR locus and the state of inactivation of the X chromosome. Herein, we develop a method for estimating X inactivation status, using exome and transcriptome sequencing data derived from blood in 227 female samples. We built a reference model for evaluation of XCI in 135 females from the GTEx consortium. We tested and validated the model on 11 female individuals with different types of undiagnosed rare genetic disorders who were clinically tested for X-skew using the AR gene assay and compared results to our outlier-based analysis technique. RESULTS: In comparison to the AR clinical test for identification of X inactivation, our method was concordant with the AR method in 9 samples, discordant in 1, and provided a measure of X inactivation in 1 sample with uninformative clinical results. 
We applied this method on an additional 81 females presenting to the clinic with phenotypes consistent with different hereditary disorders without a known genetic diagnosis. CONCLUSIONS: This study presents the use of transcriptome and exome sequencing data to provide an accurate and complete estimation of X-inactivation and skew status in a cohort of female patients with different types of suspected rare genetic disease.
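To make the idea of reading skew from sequencing data concrete, here is a minimal sketch. It is not the authors' outlier-based model or their GTEx reference; it only assumes that heterozygous X-linked SNPs have been identified from the exome and that RNA-seq read counts per allele are available at those sites. The function name `xci_skew` and the example counts are hypothetical, and a real analysis would also need to exclude genes that escape inactivation.

```python
from statistics import median

def xci_skew(sites):
    """Estimate X-inactivation skew from allele-specific RNA read counts.

    `sites` is a list of (ref_reads, alt_reads) tuples, one per heterozygous
    X-linked SNP. Without phasing we cannot tell which parental copy is
    active, so we report the median major-allele fraction (0.5 to 1.0).
    """
    fractions = []
    for ref, alt in sites:
        total = ref + alt
        if total == 0:
            continue  # no expression evidence at this site
        fractions.append(max(ref, alt) / total)
    if not fractions:
        raise ValueError("no informative sites")
    return median(fractions)

# Hypothetical read counts, for illustration only.
balanced = xci_skew([(48, 52), (55, 45), (50, 50)])
skewed = xci_skew([(92, 8), (5, 95), (88, 12)])
print(f"balanced sample: {balanced:.2f}, skewed sample: {skewed:.2f}")
```

Using many expressed sites rather than a single locus is what lets a sequencing-based estimate remain informative when one locus, such as AR, is homozygous.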
Subjects
Exome; X Chromosome Inactivation; Adult; Humans; Female; Transcriptome; Exome Sequencing; Chromosomes, Human, X/genetics
ABSTRACT
Most rare disease patients (50-75%) undergoing genomic sequencing remain unsolved, often due to lack of information about the variants identified. Data review over time can leverage novel information regarding disease-causing variants and genes, increasing diagnostic yield. However, time and resource constraints have limited reanalysis of genetic data in the clinical laboratory setting. We developed RENEW (REannotation of NEgative WES/WGS), an automated reannotation procedure that uses relevant new information in online genomic databases to enable rapid review of genomic findings. We tested RENEW in an unselected cohort of 1066 undiagnosed cases with a broad spectrum of phenotypes from the Mayo Clinic Center for Individualized Medicine using new information in ClinVar, HGMD and OMIM between the date of previous analysis/testing and April of 2022. 5741 variants prioritized by RENEW were rapidly reviewed by variant interpretation specialists. Mean analysis time was approximately 20 s per variant (32 h total time). Reviewed cases were classified as: 879 (93.0%) undiagnosed, 63 (6.6%) putatively diagnosed, and 4 (0.4%) definitively diagnosed. New strategies are needed to enable efficient review of genomic findings in unsolved cases. We report on a fast and practical approach to address this need and improve overall diagnostic success in patient testing through a recurrent reannotation process.
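The workload and outcome figures in this abstract can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable names are ours. It shows that the reported percentages are relative to the sum of the three outcome counts (946 cases) rather than to the 1066-case cohort, and that they match the computed shares to within rounding.

```python
# Review workload: variants reviewed times mean seconds per variant.
variants = 5741
seconds_per_variant = 20
total_hours = variants * seconds_per_variant / 3600

# Outcome counts as reported in the abstract.
counts = {
    "undiagnosed": 879,
    "putatively diagnosed": 63,
    "definitively diagnosed": 4,
}
reviewed = sum(counts.values())
shares = {label: round(100 * n / reviewed, 1) for label, n in counts.items()}

print(f"total review time: {total_hours:.1f} h")
print(f"cases with a reported outcome: {reviewed}")
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```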
Subjects
Genomics; Humans; Genomics/methods; Exome/genetics; Exome Sequencing/methods; Databases, Genetic; Genetic Testing/methods; Genome, Human; Whole Genome Sequencing/methods; Phenotype
ABSTRACT
MOTIVATION: Next-generation sequencing is rapidly improving diagnostic rates in rare Mendelian diseases, but even with whole genome or whole exome sequencing, the majority of cases remain unsolved. Increasingly, RNA sequencing is being used to solve many cases that evade diagnosis through sequencing alone. Specifically, the detection of aberrant splicing in many rare disease patients suggests that identifying RNA splicing outliers is particularly useful for determining causal Mendelian disease genes. However, there is as yet a paucity of statistical methodologies to detect splicing outliers. RESULTS: We developed LeafCutterMD, a new statistical framework that significantly improves on the previously published LeafCutter in the context of detecting outlier splicing events. Through simulations and analysis of real patient data, we demonstrate that LeafCutterMD has better power than the state-of-the-art methodology while controlling false-positive rates. When applied to a cohort of disease-affected probands from the Mayo Clinic Center for Individualized Medicine, LeafCutterMD recovered all aberrantly spliced genes that had previously been identified by manual curation efforts. AVAILABILITY AND IMPLEMENTATION: The source code for this method is available under the open-source Apache 2.0 license in the latest release of the LeafCutter software package available online at http://davidaknowles.github.io/leafcutter. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Genome; Rare Diseases; Algorithms; High-Throughput Nucleotide Sequencing; Humans; RNA Splicing; Rare Diseases/diagnosis; Rare Diseases/genetics; Sequence Analysis, RNA; Software
ABSTRACT
PURPOSE: Exome sequencing often identifies pathogenic genetic variants in patients with undiagnosed diseases. Nevertheless, frequent findings of variants of uncertain significance necessitate additional efforts to establish causality before reaching a conclusive diagnosis. To provide comprehensive genomic testing to patients with undiagnosed disease, we established an Individualized Medicine Clinic, which offered clinical exome testing and included a Translational Omics Program (TOP) that provided variant curation, research activities, or research exome sequencing. METHODS: From 2012 to 2018, 1101 unselected patients with undiagnosed diseases received exome testing. Outcomes were reviewed to assess impact of the TOP and patient characteristics on diagnostic rates through descriptive and multivariate analyses. RESULTS: The overall diagnostic yield was 24.9% (274 of 1101 patients), with 174 (15.8% of 1101) diagnosed on the basis of clinical exome sequencing alone. Four hundred twenty-three patients with nondiagnostic or without access to clinical exome sequencing were evaluated by the TOP, with 100 (9% of 1101) patients receiving a diagnosis, accounting for 36.5% of the diagnostic yield. The identification of a genetic diagnosis was influenced by the age at time of testing and the disease phenotype of the patient. CONCLUSION: Integration of translational research activities into clinical practice of a tertiary medical center can significantly increase the diagnostic yield of patients with undiagnosed disease.
Subjects
Exome; Undiagnosed Diseases; Exome/genetics; Genetic Testing; Humans; Phenotype; Translational Research, Biomedical; Exome Sequencing
ABSTRACT
BACKGROUND: COVID-19 is caused by the SARS-CoV-2 virus and has strikingly heterogeneous clinical manifestations, with most individuals contracting mild disease but a substantial minority experiencing fulminant cardiopulmonary symptoms or death. The clinical covariates and the laboratory tests performed on a patient provide robust statistics to guide clinical treatment. Deep learning approaches on a data set of this nature enable patient stratification and provide methods to guide clinical treatment. OBJECTIVE: Here, we report on the development and prospective validation of a state-of-the-art machine learning model to provide mortality prediction shortly after confirmation of SARS-CoV-2 infection in the Mayo Clinic patient population. METHODS: We retrospectively constructed one of the largest and most geographically diverse COVID-19 data sets reported in the published literature, drawn from laboratory information systems and electronic health records, which included 11,807 patients residing in 41 states of the United States of America and treated at medical sites across 5 states in 3 time zones. Traditional machine learning models were evaluated independently as well as in a stacked learner approach by using AutoGluon, and various recurrent neural network architectures were considered. The traditional machine learning models were implemented using the AutoGluon-Tabular framework, whereas the recurrent neural networks utilized the TensorFlow Keras framework. We trained these models to operate solely using routine laboratory measurements and clinical covariates available within 72 hours of a patient's first positive COVID-19 nucleic acid test result. RESULTS: The GRU-D recurrent neural network achieved peak cross-validation performance with 0.938 (SE 0.004) as the area under the receiver operating characteristic (AUROC) curve. 
This model retained strong performance by reducing the follow-up time to 12 hours (0.916 [SE 0.005] AUROC), and the leave-one-out feature importance analysis indicated that the most independently valuable features were age, Charlson comorbidity index, minimum oxygen saturation, fibrinogen level, and serum iron level. In the prospective testing cohort, this model provided an AUROC of 0.901 and a statistically significant difference in survival (P<.001, hazard ratio for those predicted to survive, 95% CI 0.043-0.106). CONCLUSIONS: Our deep learning approach using GRU-D provides an alert system to flag mortality for COVID-19-positive patients by using clinical covariates and laboratory values within a 72-hour window after the first positive nucleic acid test result.
Subjects
COVID-19; Clinical Laboratory Information Systems; Deep Learning; Algorithms; Electronic Health Records; Humans; Retrospective Studies; SARS-CoV-2
ABSTRACT
PURPOSE: We report a female infant identified by newborn screening for severe combined immunodeficiencies (NBS SCID) with T cell lymphopenia (TCL). The patient had persistently elevated alpha-fetoprotein (AFP) with IgA deficiency, and elevated IgM. Gene sequencing for a SCID panel was uninformative. We sought to determine the cause of the immunodeficiency in this infant. METHODS: We performed whole-exome sequencing (WES) on the patient and parents to identify a genetic diagnosis. Based on the WES result, we developed a novel flow cytometric panel for rapid assessment of DNA repair defects using blood samples. We also performed whole transcriptome sequencing (WTS) on fibroblast RNA from the patient and father for abnormal transcript analysis. RESULTS: WES revealed a pathogenic paternally inherited indel in ATM. We used the flow panel to assess several proteins in the DNA repair pathway in lymphocyte subsets. The patient had absent phosphorylation of ATM, resulting in absent or aberrant phosphorylation of downstream proteins, including γH2AX. However, ataxia-telangiectasia (AT) is an autosomal recessive condition, and the abnormal functional data did not correspond with a single ATM variant. WTS revealed in-frame reciprocal fusion transcripts involving ATM and SLC35F2 indicating a chromosome 11 inversion within 11q22.3, of maternal origin. Inversion breakpoints were identified within ATM intron 16 and SLC35F2 intron 7. CONCLUSIONS: We identified a novel ATM-breaking chromosome 11 inversion in trans with a pathogenic indel (compound heterozygote) resulting in non-functional ATM protein, consistent with a diagnosis of AT. Utilization of several molecular and functional assays allowed successful resolution of this case.
Subjects
Genomics; Immunologic Deficiency Syndromes/etiology; Immunologic Deficiency Syndromes/metabolism; Proteomics; Biomarkers; Computational Biology/methods; DNA; Female; Gene Expression Profiling; Genetic Variation; Genomics/methods; Humans; Immunologic Deficiency Syndromes/diagnosis; Immunophenotyping; Infant; Proteins; Proteomics/methods; RNA; Exome Sequencing
ABSTRACT
Advanced cholangiocarcinoma continues to harbor a difficult prognosis and therapeutic options have been limited. During the course of a clinical trial of whole-genome sequencing seeking druggable targets, we examined six patients with advanced cholangiocarcinoma. Integrated genome-wide and whole transcriptome sequence analyses were performed on tumors from six patients with advanced, sporadic intrahepatic cholangiocarcinoma (SIC) to identify potential therapeutically actionable events. Among the somatic events captured in our analysis, we uncovered two novel therapeutically relevant genomic contexts that, when acted upon, resulted in preliminary evidence of anti-tumor activity. Genome-wide structural analysis of sequence data revealed recurrent translocation events involving the FGFR2 locus in three of six assessed patients. These observations and supporting evidence triggered the use of FGFR inhibitors in these patients. In one example, preliminary anti-tumor activity of pazopanib (in vitro FGFR2 IC50≈350 nM) was noted in a patient with an FGFR2-TACC3 fusion. After progression on pazopanib, the same patient also had stable disease on ponatinib, a pan-FGFR inhibitor (in vitro, FGFR2 IC50≈8 nM). In an independent non-FGFR2 translocation patient, exome and transcriptome analysis revealed an allele-specific somatic nonsense mutation (E384X) in ERRFI1, a direct negative regulator of EGFR activation. Rapid and robust disease regression was noted in this ERRFI1 inactivated tumor when treated with erlotinib, an EGFR kinase inhibitor. FGFR2 fusions and ERRFI1 mutations may represent novel targets in sporadic intrahepatic cholangiocarcinoma and trials should be characterized in larger cohorts of patients with these aberrations.
Subjects
Bile Duct Neoplasms/drug therapy; Cholangiocarcinoma/drug therapy; ErbB Receptors/metabolism; Receptor, Fibroblast Growth Factor, Type 2/genetics; Signal Transduction/genetics; Bile Duct Neoplasms/genetics; Bile Duct Neoplasms/pathology; Bile Ducts, Intrahepatic/pathology; Cell Line, Tumor; Cholangiocarcinoma/genetics; Cholangiocarcinoma/pathology; ErbB Receptors/antagonists & inhibitors; ErbB Receptors/genetics; Erlotinib Hydrochloride; Genome, Human; Humans; Imidazoles/administration & dosage; Indazoles; Molecular Targeted Therapy; Mutation; Prognosis; Protein Kinase Inhibitors; Pyridazines/administration & dosage; Pyrimidines/administration & dosage; Quinazolines/administration & dosage; Receptor, Fibroblast Growth Factor, Type 2/antagonists & inhibitors; Receptor, Fibroblast Growth Factor, Type 2/metabolism; Sulfonamides/administration & dosage; Transcriptome
ABSTRACT
MOTIVATION: Exome sequencing (exome-seq) data, which are typically used for calling exonic mutations, have also been utilized in detecting DNA copy number variations (CNVs). Despite the existence of several CNV detection tools, there is still a great need for a sensitive and accurate CNV-calling algorithm that has built-in QC steps and does not require a paired reference for each sample. RESULTS: We developed a novel method named PatternCNV, which (i) accounts for the read coverage variations between exons while leveraging the consistencies of this variability across different samples; (ii) reduces alignment BAM files to WIG format and therefore greatly accelerates computation; (iii) incorporates multiple QC measures designed to identify outlier samples and batch effects; and (iv) provides a variety of visualization options including chromosome, gene and exon-level views of CNVs, along with a tabular summarization of the exon-level CNVs. Compared with other CNV-calling algorithms using data from a lymphoma exome-seq study, PatternCNV has higher sensitivity and specificity. AVAILABILITY AND IMPLEMENTATION: The software for PatternCNV is implemented using Perl and R, and can be used in Mac or Linux environments. Software and user manual are available at http://bioinformaticstools.mayo.edu/research/patterncnv/, and R package at https://github.com/topsoil/patternCNV/.
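Point (i) above, using the consistent exon-to-exon coverage pattern seen across samples as the baseline instead of a paired reference, can be illustrated with a small sketch. This is not PatternCNV's implementation (which is written in Perl and R); it is a simplified stand-in that assumes per-exon mean coverage values are already available, and the function name and the numbers are hypothetical.

```python
from math import log2
from statistics import median

def exon_log2_ratios(sample, cohort):
    """Per-exon log2 ratio of one sample against a cohort-derived pattern.

    `sample` is a list of per-exon coverage values; `cohort` is a list of
    such lists from other samples over the same exons. Each sample is first
    scaled by its own median (assumed nonzero) so that sequencing depth
    cancels, then the cohort median per exon serves as the expected pattern.
    """
    def normalize(coverage):
        m = median(coverage)
        return [c / m for c in coverage]

    norm_sample = normalize(sample)
    norm_cohort = [normalize(c) for c in cohort]

    ratios = []
    for i, observed in enumerate(norm_sample):
        expected = median(s[i] for s in norm_cohort)
        if observed <= 0 or expected <= 0:
            ratios.append(None)  # ratio undefined without coverage
        else:
            ratios.append(log2(observed / expected))
    return ratios

# Hypothetical coverage: the third exon of the sample has half the expected depth.
cohort = [[100, 200, 50, 100], [110, 210, 55, 105], [90, 190, 45, 95]]
sample = [100, 200, 25, 100]
ratios = exon_log2_ratios(sample, cohort)
print(ratios)
```

A log2 ratio near -1 is what a heterozygous deletion would produce in an idealized diploid sample; real data are noisier, which is why QC steps such as those listed in point (iii) matter.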
Subjects
Algorithms; DNA Copy Number Variations; Exome/genetics; Genomics/methods; Sequence Analysis, DNA; Exons/genetics; Software
ABSTRACT
BACKGROUND: Next generation sequencing (NGS)-based assays continue to redefine the field of genetic testing. Owing to the complexity of the data, bioinformatics has become a necessary component in any laboratory implementing a clinical NGS test. CONTENT: The computational components of an NGS-based work flow can be conceptualized as primary, secondary, and tertiary analytics. Each of these components addresses a necessary step in the transformation of raw data into clinically actionable knowledge. Understanding the basic concepts of these analysis steps is important in assessing and addressing the informatics needs of a molecular diagnostics laboratory. Equally critical is a familiarity with the regulatory requirements addressing the bioinformatics analyses. These and other topics are covered in this review article. SUMMARY: Bioinformatics has become an important component in clinical laboratories generating, analyzing, maintaining, and interpreting data from molecular genetics testing. Given the rapid adoption of NGS-based clinical testing, service providers must develop informatics work flows that adhere to the rigor of clinical laboratory standards, yet are flexible to changes as the chemistry and software for analyzing sequencing data mature.
Subjects
Clinical Laboratory Techniques/methods; Computational Biology/methods; Sequence Analysis, DNA/methods; Algorithms; Clinical Laboratory Techniques/instrumentation; Clinical Laboratory Techniques/trends; Computational Biology/instrumentation; Computational Biology/trends; Genome, Human; Humans; Practice Guidelines as Topic; Sequence Analysis, DNA/trends
ABSTRACT
MOTIVATION: Genomic data are prevalent, leading to frequent encounters with uninterpreted variants or mutations with unknown mechanisms of effect. Researchers must manually aggregate data from multiple sources and across related proteins, mentally translating effects between the genome and proteome, to attempt to understand mechanisms. MATERIALS AND METHODS: P2T2 presents diverse data and annotation types in a unified protein-centric view, facilitating the interpretation of coding variants and hypothesis generation. Information from primary sequence, domain, motif, and structural levels is presented and also organized into the first Paralog Annotation Analysis across the human proteome. RESULTS: Our tool assists research efforts to interpret genomic variation by aggregating diverse, relevant, and proteome-wide information into a unified interactive web-based interface. Additionally, we provide a REST API enabling automated data queries or the repurposing of data for other studies. CONCLUSION: The unified protein-centric interface presented in P2T2 will help researchers interpret novel variants identified through next-generation sequencing. Code and server link available at github.com/GenomicInterpretation/p2t2.
ABSTRACT
Detecting gene fusions involving driver oncogenes is pivotal in clinical diagnosis and treatment of cancer patients. Recent developments in next-generation sequencing (NGS) technologies have enabled improved assays for bioinformatics-based gene fusion detection. In clinical applications, where a small number of fusions are clinically actionable, targeted polymerase chain reaction (PCR)-based NGS chemistries, such as the QIAseq RNAscan assay, aim to improve accuracy compared to standard RNA sequencing. Existing informatics methods for gene fusion detection in NGS-based RNA sequencing assays traditionally use a transcriptome-based spliced alignment approach or a de-novo assembly approach. Transcriptome-based spliced alignment methods face challenges with short read mapping yielding low quality alignments. De-novo assembly-based methods yield longer contigs from short reads that can be more sensitive for genomic rearrangements, but face performance and scalability challenges. Consequently, there exists a need for a method to efficiently and accurately detect fusions in targeted PCR-based NGS chemistries. We describe SeekFusion, a highly accurate and computationally efficient pipeline enabling identification of gene fusions from PCR-based NGS chemistries. Utilizing biological samples processed with the QIAseq RNAscan assay and in-silico simulated data, we demonstrate that SeekFusion gene fusion detection accuracy outperforms popular existing methods such as STAR-Fusion, TOPHAT-Fusion and JAFFA-hybrid. We also present results from 4,484 patient samples tested for neurological tumors and sarcoma, encompassing details on some novel fusions identified.
ABSTRACT
Gestational trophoblastic disease (GTD) is a heterogeneous group of lesions arising from placental tissue. Epithelioid trophoblastic tumor (ETT), derived from chorionic-type trophoblast, is the rarest form of GTD with only approximately 130 cases described in the literature. Due to its morphologic mimicry of epithelioid smooth muscle tumors and carcinoma, ETT can be misdiagnosed. To date, molecular characterization of ETTs is lacking. Furthermore, ETT is difficult to treat when disease spreads beyond the uterus. Here using RNA-Seq analysis in a cohort of ETTs and other gestational trophoblastic lesions we describe the discovery of LPCAT1-TERT fusion transcripts that occur in ETTs and coincide with underlying genomic deletions. Through cell-growth assays we demonstrate that LPCAT1-TERT fusion proteins can positively modulate cell proliferation and therefore may represent future treatment targets. Furthermore, we demonstrate that TERT upregulation appears to be a characteristic of ETTs, even in the absence of LPCAT1-TERT fusions, and that it appears linked to copy number gains of chromosome 5. No evidence of TERT upregulation was identified in other trophoblastic lesions tested, including placental site trophoblastic tumors and placental site nodules, which are thought to be the benign chorionic-type trophoblast counterpart to ETT. These findings indicate that LPCAT1-TERT fusions and copy-number driven TERT activation may represent novel markers for ETT, with the potential to improve the diagnosis, treatment, and outcome for women with this rare form of GTD.
Subjects
1-Acylglycerophosphocholine O-Acyltransferase/genetics; Epithelioid Cells/pathology; Gestational Trophoblastic Disease/etiology; Oncogene Proteins, Fusion/genetics; Telomerase/genetics; Trophoblastic Neoplasms/pathology; Uterine Neoplasms/pathology; 1-Acylglycerophosphocholine O-Acyltransferase/metabolism; Adult; Biomarkers, Tumor/genetics; Cell Proliferation; Epithelioid Cells/metabolism; Female; Gestational Trophoblastic Disease/pathology; Humans; Middle Aged; Oncogene Proteins, Fusion/metabolism; Pregnancy; Telomerase/metabolism; Trophoblastic Neoplasms/genetics; Trophoblastic Neoplasms/metabolism; Uterine Neoplasms/genetics; Uterine Neoplasms/metabolism
ABSTRACT
BACKGROUND: To date, there are no clinically reliable predictive markers of response to the current treatment regimens for advanced colorectal cancer. The aim of the current study was to compare and assess the power of transcriptional profiling using a generic microarray and a disease-specific transcriptome-based microarray. We also examined the biological and clinical relevance of the disease-specific transcriptome. METHODS: DNA microarray profiling was carried out on isogenic sensitive and 5-FU-resistant HCT116 colorectal cancer cell lines using the Affymetrix HG-U133 Plus2.0 array and the Almac Diagnostics Colorectal cancer disease specific Research tool. In addition, DNA microarray profiling was also carried out on pre-treatment metastatic colorectal cancer biopsies using the colorectal cancer disease specific Research tool. The two microarray platforms were compared based on detection of probesets and biological information. RESULTS: The results demonstrated that the disease-specific transcriptome-based microarray was able to out-perform the generic genomic-based microarray on a number of levels including detection of transcripts and pathway analysis. In addition, the disease-specific microarray contains a high percentage of antisense transcripts and further analysis demonstrated that a number of these exist in sense:antisense pairs. Comparison between cell line models and metastatic CRC patient biopsies further demonstrated that a number of the identified sense:antisense pairs were also detected in CRC patient biopsies, suggesting potential clinical relevance. CONCLUSIONS: Analysis from our in vitro and clinical experiments has demonstrated that many transcripts exist in sense:antisense pairs including IGF2BP2, which may have a direct regulatory function in the context of colorectal cancer. 
While the functional relevance of the antisense transcripts has been established by many studies, their functional role is currently unclear; however, the numbers that have been detected by the disease-specific microarray would suggest that they may be important regulatory transcripts. This study has demonstrated the power of a disease-specific transcriptome-based approach and highlighted the potential novel biologically and clinically relevant information that is gained when using such a methodology.
Subjects
Colorectal Neoplasms/genetics; Drug Resistance, Neoplasm/genetics; Gene Expression Profiling/methods; Gene Expression Regulation, Neoplastic; Oligonucleotide Array Sequence Analysis; Antimetabolites, Antineoplastic/pharmacology; Antimetabolites, Antineoplastic/therapeutic use; Biopsy; Cell Survival/drug effects; Colorectal Neoplasms/drug therapy; Colorectal Neoplasms/pathology; Fluorouracil/pharmacology; Genotype; HCT116 Cells; Humans; Insulin-Like Growth Factor Binding Protein 2/genetics; Phenotype; Prognosis; RNA, Antisense/metabolism; Reproducibility of Results; Reverse Transcriptase Polymerase Chain Reaction
ABSTRACT
BACKGROUND: RNA polymerase III (Pol III)-related disorders are autosomal recessive neurodegenerative disorders caused by variants in POLR3A or POLR3B. Recently, a novel phenotype of adult-onset spastic ataxia was identified in individuals with the c.1909+22G>A POLR3A variant in compound heterozygosity. METHODS: Whole-exome sequencing was performed in the proband and parents. Variants were confirmed by Sanger sequencing. RNA sequencing was performed to evaluate splicing implications. RESULTS: A 42-year-old female was evaluated for unexplained neurological findings with a slow progressive decline in gait and walking speed since adolescence. WES revealed a novel missense variant (c.3593A>G, p.Lys1198Arg) in exon 27 of POLR3A in compound heterozygosity with the c.1909+22G>A variant. A summary of previously reported clinical features from individuals with pathogenic biallelic alterations in POLR3A and an adult-onset phenotype is consistent with our findings. RNA analysis revealed that c.3593A>G drives the production of four RNA transcript products, each with different functional impacts. CONCLUSION: The novel dual-class c.3593A>G variant in POLR3A causes an amino acid substitution and complex disruption of splicing. Our report supports the need to investigate variants near splice junctions for proper interpretation. Current interpretation guidelines need to address best practices for inclusion of predicted or measured transcriptional disruption pending functional activity or reliable transcript abundance estimates.
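The coding position and the protein change of the missense variant can be cross-checked with simple arithmetic. The sketch below is only a consistency check on the notation, not part of the study: it maps coding position 3593 to its codon and confirms from the standard genetic code that a lysine-to-arginine change at the second codon position requires an A-to-G substitution. The helper names are ours, and the two lysine codons are tried because the abstract does not state which one is present.

```python
# Subset of the standard genetic code needed for this check.
CODONS = {
    "AAA": "Lys", "AAG": "Lys",
    "AGA": "Arg", "AGG": "Arg",
    "ACA": "Thr", "ACG": "Thr",
}

def codon_position(c_pos):
    """Map a 1-based coding position to (codon number, position in codon)."""
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3 + 1

def substitute(codon, pos_in_codon, alt):
    """Return the codon with the base at the 1-based position replaced."""
    return codon[:pos_in_codon - 1] + alt + codon[pos_in_codon:]

codon_number, pos_in_codon = codon_position(3593)
print(f"c.3593 falls in codon {codon_number}, position {pos_in_codon}")

for lys_codon in ("AAA", "AAG"):
    for alt in ("G", "C"):
        new_codon = substitute(lys_codon, pos_in_codon, alt)
        print(f"{lys_codon} -> {new_codon}: {CODONS[new_codon]}")
```

Position 3593 lands on the second base of codon 1198, and only the A-to-G change turns either lysine codon into an arginine codon; an A-to-C change at that position would encode threonine instead.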
Subjects
Cerebellar Ataxia/genetics; Genetic Testing/standards; RNA Polymerase III/genetics; Adult; Cerebellar Ataxia/diagnostic imaging; Cerebellar Ataxia/pathology; Female; Genetic Testing/methods; Humans; Mutation, Missense; Phenotype; RNA Polymerase III/metabolism; RNA Splicing
ABSTRACT
BACKGROUND: More than 20% of human transcripts have naturally occurring antisense products (or natural antisense transcripts--NATs), some of which may play a key role in a range of human diseases. To date, several databases of in silico defined human sense-antisense (SAS) pairs have appeared; however, no study has focused on differential expression of SAS pairs in breast tissue. We therefore investigated the expression levels of sense and antisense transcripts in normal and malignant human breast epithelia using the Affymetrix HG-U133 Plus 2.0 and Almac Diagnostics Breast Cancer DSA microarray technologies as well as massively parallel signature sequencing (MPSS) data. RESULTS: The expression of more than 2500 antisense transcripts was detected by DSA microarray in normal breast duct luminal cells and in primary breast tumors substantially enriched for their epithelial cell content. Expression of 431 NATs was confirmed by either of the other two technologies. A corresponding sense transcript could be identified on DSA for 257 antisense transcripts. Of these SAS pairs, 163 have not been previously reported. A positive correlation of differential expression between normal and malignant breast samples was observed for most SAS pairs. Orientation-specific RT-QPCR of selected SAS pairs validated their expression in several breast cancer cell lines and solid breast tumours. CONCLUSION: Disease-focused and antisense-enriched microarray platforms (such as Breast Cancer DSA) confirm the assumption that antisense transcription in the human breast is more prevalent than previously anticipated. Expression of a proportion of these NATs has already been confirmed by other technologies while the true existence of the remaining ones has to be validated. Nevertheless, future studies will reveal whether the relative abundances of antisense and sense transcripts have regulatory influences on the translation of these mRNAs.
Subjects
Antisense Elements (Genetics)/genetics; Gene Expression Profiling; Mammary Glands, Human/metabolism; Oligonucleotide Array Sequence Analysis/methods; Cell Line, Tumor; Genome, Human; Humans
ABSTRACT
BACKGROUND: We describe a patient presenting with pachygyria, epilepsy, developmental delay, short stature, failure to thrive, facial dysmorphisms, and multiple osteochondromas. METHODS: The patient underwent extensive genetic testing and analysis in an attempt to diagnose the cause of his condition. Clinical testing included metaphase karyotyping, array comparative genomic hybridization, direct sequencing, multiplex ligation-dependent probe amplification, and trio-based exome sequencing. Subsequently, research-based whole transcriptome sequencing was conducted to determine whether it might shed light on the undiagnosed phenotype. RESULTS: Clinical exome sequencing of patient and parent samples revealed a maternally inherited splice-site variant in the doublecortin (DCX) gene that was classified as likely pathogenic and diagnostic of the patient's neurological phenotype. Clinical array comparative genomic hybridization analysis revealed a 16p13.3 deletion that could not be linked to the patient phenotype based on affected genes. Further clinical testing to determine the cause of the patient's multiple osteochondromas was unrevealing despite extensive profiling of the most likely causative genes, EXT1 and EXT2, including mutation screening by direct sequence analysis and multiplex ligation-dependent probe amplification. Whole transcriptome sequencing identified a SAMD12-EXT1 fusion transcript that could have resulted from a chromosomal deletion, leading to the loss of EXT1 function. Re-review of the clinical array comparative genomic hybridization results indicated a possible unreported mosaic deletion affecting the SAMD12 and EXT1 genes that corresponded precisely to the introns predicted to be affected by a fusion-causing deletion. 
The existence of the mosaic deletion was subsequently confirmed clinically by an increased-density copy number array and orthogonal methodologies. CONCLUSIONS: While mosaic mutations and deletions of EXT1 and EXT2 have been reported in the context of multiple osteochondromas, to our knowledge, this is the first time that transcriptomics technologies have been used to diagnose a patient via fusion transcript analysis in the congenital disease setting.
Subjects
Exostoses, Multiple Hereditary/genetics; Gene Fusion; N-Acetylglucosaminyltransferases/genetics; Nerve Tissue Proteins/genetics; Child; Exostoses, Multiple Hereditary/pathology; Gene Deletion; Humans; Male; RNA, Messenger/genetics; Sterile Alpha Motif/genetics
ABSTRACT
BACKGROUND: RNA sequencing has been proposed as a means of increasing diagnostic rates in studies of undiagnosed rare inherited disease. Recent studies have reported diagnostic improvements in the range of 7.5-35% by profiling splicing, gene expression quantification, and allele-specific expression. To date, however, no study has systematically assessed the presence of gene-fusion transcripts in cases of germline disease. Fusion transcripts are routinely identified in cancer studies and are increasingly recognized as having diagnostic, prognostic, or therapeutic relevance. Isolated reports exist of fusion transcripts being detected in cases of developmental and neurological phenotypes, and thus systematic application of fusion detection to germline conditions may further increase diagnostic rates. However, current fusion detection methods are unsuited to the investigation of germline disease due to performance biases arising from their development using tumor, cell-line, or in silico data. METHODS: We describe a tailored approach to fusion candidate identification and prioritization in a cohort of 47 undiagnosed, suspected inherited disease patients. We modify an existing fusion transcript detection algorithm by eliminating its cell line-derived filtering steps and instead prioritize candidates using a custom workflow that integrates genomic and transcriptomic sequence alignment, biological and technical annotations, customized categorization logic, and phenotypic prioritization. RESULTS: We demonstrate that our approach to fusion transcript identification and prioritization detects genuine fusion events excluded by standard analyses and efficiently removes phenotypically unimportant candidates and false positive events, resulting in a reduced candidate list enriched for events with potential phenotypic relevance. We describe the successful genetic resolution of two previously undiagnosed disease cases through the detection of pathogenic fusion transcripts.
Furthermore, we report the experimental validation of five additional cases of fusion transcripts with potential phenotypic relevance. CONCLUSIONS: The approach we describe can be implemented to enable the detection of phenotypically relevant fusion transcripts in studies of rare inherited disease. Fusion transcript detection has the potential to increase diagnostic rates in rare inherited disease and should be included in RNA-based analytical pipelines aimed at genetic diagnosis.
Subjects
Genetic Association Studies; Genetic Diseases, Inborn/diagnosis; Genetic Diseases, Inborn/genetics; Genetic Predisposition to Disease; Mutant Chimeric Proteins/genetics; Rare Diseases/diagnosis; Rare Diseases/genetics; Adolescent; Adult; Aged; Child; Child, Preschool; Female; Genetic Association Studies/methods; Genetic Markers; Humans; Infant; Inheritance Patterns; Male; Middle Aged; Phenotype; Workflow; Young Adult
ABSTRACT
Purpose: Demand is increasing for clinical genomic sequencing to provide diagnoses for patients presenting with phenotypes indicative of genetic diseases, but for whom routine genetic testing failed to yield a diagnosis. DNA-based testing using high-throughput technologies often identifies variants with insufficient evidence to determine whether they are disease-causal or benign, leading to categorization as variants of uncertain significance (VUS). Methods: We used molecular modeling and simulation to generate specific hypotheses for the molecular effects of variants in the human glucose transporter, GLUT10 (SLC2A10). As with many disease-relevant membrane proteins, no experimentally derived 3D structure exists for GLUT10. An atomic model was generated and used to evaluate multiple variants, including pathogenic, benign, and VUS. Results: These analyses yielded detailed mechanistic data, not currently predictable from sequence, including altered protein stability, charge distribution of ligand binding surfaces, and shifts toward or away from transport-competent conformations. Consideration of the two major conformations of GLUT10 was important as variants have conformation-specific effects. We generated detailed molecular hypotheses for the functional impact of variants in GLUT10 and propose means to determine their pathogenicity. Conclusion: The type of workflow we present here is valuable for increasing the throughput and resolution with which VUS effects can be assessed and interpreted.
ABSTRACT
Whole exome sequencing (WES) is utilized in diagnostic odyssey cases to identify the underlying genetic cause associated with complex phenotypes. Recent publications suggest that WES reveals the genetic cause in ~25% of these cases and is most successful when applied to children with neurological disease. The residual 75% of cases remain genetically elusive until more information becomes available in the literature or functional studies are pursued. WES performed on three families with presumed ciliopathy diagnoses, including orofaciodigital (OFD) syndrome, fetal encephalocele, or Joubert-related disorder, identified compound heterozygous variants in C2CD3. Biallelic variants in C2CD3 have previously been associated with ciliopathies, including OFD syndrome type 14 (OFD14; MIM: 615948). As three of the six identified variants were predicted to affect splicing, exon-skipping analysis using either RNA sequencing or PCR-based methods was completed to determine the pathogenicity of these variants, and showed that each of the splicing variants led to a frameshifted protein product. Using these studies in combination with the 2015 ACMG guidelines, each of the six identified variants was classified as either pathogenic or likely pathogenic, and they are therefore likely responsible for our patients' phenotypes. Each of the families had a distinct clinical phenotype and severity of disease, extending from lethal to viable. These findings highlight that there is a broad phenotypic spectrum associated with C2CD3-mediated disease and that not all patients present with the typical features of OFD14.