1.
Curr Hematol Malig Rep ; 18(6): 284-291, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37947937

ABSTRACT

PURPOSE OF REVIEW: The length of telomeres, protective structures at the chromosome ends, is a well-established biomarker for pathological conditions including multisystemic syndromes called telomere biology disorders. Approaches to measure telomere length (TL) differ on whether they estimate average, distribution, or chromosome-specific TL, and each presents its own advantages and limitations. RECENT FINDINGS: The development of long-read sequencing and publication of the telomere-to-telomere human genome reference has allowed for scalable and high-resolution TL estimation in pre-existing sequencing datasets but is still impractical as a dedicated TL test. As sequencing costs continue to fall and strategies for selectively enriching telomere regions prior to sequencing improve, these approaches may become a promising alternative to classic methods. Measurement methods rely on probe hybridization, qPCR, or, more recently, computational methods using sequencing data. Refinements of existing techniques and new approaches have been recently developed, but a test that is accurate, simple, and scalable is still lacking.


Subject(s)
Telomere , Humans , Forecasting , Telomere/genetics
2.
Nat Commun ; 13(1): 6286, 2022 10 21.
Article in English | MEDLINE | ID: mdl-36271076

ABSTRACT

A GGGGCC24+ hexanucleotide repeat expansion (HRE) in the C9ORF72 gene is the most common genetic cause of amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD), fatal neurodegenerative diseases with no cure or approved treatments that substantially slow disease progression or extend survival. Mechanistic underpinnings of neuronal death include C9ORF72 haploinsufficiency, sequestration of RNA-binding proteins in the nucleus, and production of dipeptide repeat proteins. Here, we used an adeno-associated viral vector system to deliver CRISPR/Cas9 gene-editing machineries to effectuate the removal of the HRE from the C9ORF72 genomic locus. We demonstrate successful excision of the HRE in primary cortical neurons and brains of three mouse models containing the expansion (500-600 repeats) as well as in patient-derived iPSC motor neurons and brain organoids (450 repeats). This resulted in a reduction of RNA foci, poly-dipeptides and haploinsufficiency, major hallmarks of C9-ALS/FTD, making this a promising therapeutic approach to these diseases.


Subject(s)
Amyotrophic Lateral Sclerosis , Frontotemporal Dementia , Animals , Mice , Frontotemporal Dementia/genetics , Frontotemporal Dementia/metabolism , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/metabolism , C9orf72 Protein/genetics , C9orf72 Protein/metabolism , DNA Repeat Expansion/genetics , CRISPR-Cas Systems , Motor Neurons/metabolism , Dipeptides/metabolism , RNA/metabolism
3.
Bioinformatics ; 38(7): 1788-1793, 2022 03 28.
Article in English | MEDLINE | ID: mdl-35022670

ABSTRACT

MOTIVATION: Telomeres are the repetitive sequences found at the ends of eukaryotic chromosomes and are often thought of as a 'biological clock,' with their average length shortening during division in most cells. In addition to their association with senescence, abnormal telomere lengths are well known to be associated with multiple cancers and short telomere syndromes, and to be risk factors for a broad range of diseases. While a majority of methods for measuring telomere length will report average lengths across all chromosomes, it is known that aberrations in specific chromosome arms are biomarkers for certain diseases. Due to their repetitive nature, characterizing telomeres at this resolution is prohibitive for short read sequencing approaches, and remains challenging even with longer reads. RESULTS: We present Telogator: a method for reporting chromosome-specific telomere length from long read sequencing data. We demonstrate Telogator's sensitivity in detecting chromosome-specific telomere length in simulated data across a range of read lengths and error rates. Telogator is then applied to 10 germline samples, yielding a high correlation with short read methods in reporting average telomere length. In addition, we investigate common subtelomere rearrangements and identify the minimum read length required to anchor telomere/subtelomere boundaries in samples with these haplotypes. AVAILABILITY AND IMPLEMENTATION: Telogator is written in Python3 and is available at github.com/zstephens/telogator. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Repetitive Sequences, Nucleic Acid , Telomere , Telomere/genetics , Haplotypes
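
To make the idea of read-level telomere length concrete, the following is a minimal sketch that measures the run of exact TTAGGG copies at the end of a single read. It is not Telogator's algorithm, which the abstract does not describe; the function name `terminal_repeat_length` is hypothetical, and exact matching is an assumption that would underestimate length on real long reads containing sequencing errors.

```python
import re

# Canonical human telomeric repeat unit (an assumption for this sketch;
# variant repeats and the reverse-complement strand are not handled).
TELOMERE_UNIT = "TTAGGG"


def terminal_repeat_length(read: str, unit: str = TELOMERE_UNIT) -> int:
    """Return the length in bases of the run of exact `unit` copies
    that ends at the 3' end of `read` (0 if the read does not end in one)."""
    match = re.search(f"(?:{re.escape(unit)})+$", read.upper())
    return len(match.group(0)) if match else 0


if __name__ == "__main__":
    example_read = "ACGTACGT" + "TTAGGG" * 5  # hypothetical read
    print(terminal_repeat_length(example_read))
```

A tool working on real data would also need to tolerate mismatches and anchor the non-telomeric part of the read to a specific chromosome arm, which is the harder problem the abstract refers to.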
4.
JAMIA Open ; 4(3): ooab065, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34377961

ABSTRACT

MOTIVATION: Genomic data are prevalent, leading to frequent encounters with uninterpreted variants or mutations with unknown mechanisms of effect. Researchers must manually aggregate data from multiple sources and across related proteins, mentally translating effects between the genome and proteome, to attempt to understand mechanisms. MATERIALS AND METHODS: P2T2 presents diverse data and annotation types in a unified protein-centric view, facilitating the interpretation of coding variants and hypothesis generation. Information from primary sequence, domain, motif, and structural levels is presented and also organized into the first Paralog Annotation Analysis across the human proteome. RESULTS: Our tool assists research efforts to interpret genomic variation by aggregating diverse, relevant, and proteome-wide information into a unified interactive web-based interface. Additionally, we provide a REST API enabling automated data queries, or repurposing data for other studies. CONCLUSION: The unified protein-centric interface presented in P2T2 will help researchers interpret novel variants identified through next-generation sequencing. Code and server link available at github.com/GenomicInterpretation/p2t2.

5.
Front Genet ; 12: 716586, 2021.
Article in English | MEDLINE | ID: mdl-34394200

ABSTRACT

Long read sequencing technologies have the potential to accurately detect and phase variation in genomic regions that are difficult to fully characterize with conventional short read methods. These difficult to sequence regions include several clinically relevant genes with highly homologous pseudogenes, many of which are prone to gene conversions or other types of complex structural rearrangements. We present PB-Motif, a new method for identifying rearrangements between two highly homologous genomic regions using PacBio long reads. PB-Motif leverages clustering and filtering techniques to efficiently report rearrangements in the presence of sequencing errors and other systematic artifacts. Supporting reads for each high-confidence rearrangement can then be used for copy number estimation and phased variant calling. First, we demonstrate PB-Motif's accuracy with simulated sequence rearrangements of PMS2 and its pseudogene PMS2CL using simulated reads sweeping over a range of sequencing error rates. We then apply PB-Motif to 26 clinical samples, characterizing CYP21A2 and its pseudogene CYP21A1P as part of a diagnostic assay for congenital adrenal hyperplasia. We successfully identify damaging variation and patient carrier status concordant with clinical diagnosis obtained from multiplex ligation-dependent amplification (MLPA) and Sanger sequencing. The source code is available at: github.com/zstephens/pb-motif.

6.
Bioinformatics ; 37(11): 1598-1599, 2021 07 12.
Article in English | MEDLINE | ID: mdl-31808791

ABSTRACT

MOTIVATION: DNA methylation can be measured at the single CpG level using sodium bisulfite conversion of genomic DNA followed by sequencing or array hybridization. Many analytic tools have been developed, yet there is still a high demand for a comprehensive and multifaceted tool suite to analyze, annotate, QC and visualize the DNA methylation data. RESULTS: We developed the CpGtools package to analyze DNA methylation data generated from bisulfite sequencing or Illumina methylation arrays. The CpGtools package consists of three types of modules: (i) 'CpG position modules' focus on analyzing the genomic positions of CpGs, including associating other genomic and epigenomic features to a given list of CpGs and generating the DNA motif logo enriched in the genomic contexts of a given list of CpGs; (ii) 'CpG signal modules' are designed to analyze DNA methylation values, such as performing the PCA or t-SNE analyses, using Bayesian Gaussian mixture modeling to classify CpG sites into fully methylated, partially methylated and unmethylated groups, profiling the average DNA methylation level over user-specified genomics regions and generating the bean/violin plots and (iii) 'differential CpG analysis modules' focus on identifying differentially methylated CpGs between groups using different statistical methods including Fisher's Exact Test, Student's t-test, ANOVA, non-parametric tests, linear regression, logistic regression, beta-binomial regression and Bayesian estimation. AVAILABILITY AND IMPLEMENTATION: CpGtools is written in Python under the open-source GPL license. The source code and documentation are freely available at https://github.com/liguowang/cpgtools. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
DNA Methylation , High-Throughput Nucleotide Sequencing , Bayes Theorem , CpG Islands , Humans , Sequence Analysis, DNA
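
Of the statistical methods listed in the abstract above, Fisher's Exact Test is the simplest to show in full. The sketch below computes a two-sided p-value for a 2x2 table of methylated and unmethylated read counts at one CpG in two groups. It is an independent illustration using only the standard library, not CpGtools code; the function name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups, columns are (methylated, unmethylated) counts.
    The p-value sums the hypergeometric probabilities of all tables with
    the same margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


if __name__ == "__main__":
    # Hypothetical counts: group 1 has 18 methylated / 2 unmethylated reads,
    # group 2 has 6 methylated / 14 unmethylated reads.
    print(fisher_exact_two_sided(18, 2, 6, 14))
```

In a genome-wide analysis this test would be repeated per CpG, so the resulting p-values would normally need multiple-testing correction.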
7.
J Clin Med ; 9(11)2020 Nov 17.
Article in English | MEDLINE | ID: mdl-33213041

ABSTRACT

(1) Background: Arthrofibrosis is a common cause of patient debility and dissatisfaction after total knee arthroplasty (TKA). The diversity of molecular pathways involved in arthrofibrosis disease progression suggests that effective treatments for arthrofibrosis may require a multimodal approach to counter the complex cellular mechanisms that direct disease pathogenesis. In this study, we leveraged RNA-seq data to define genes that are suppressed in arthrofibrosis patients and identified adiponectin (ADIPOQ) as a potential candidate. We hypothesized that signaling pathways activated by ADIPOQ and the cognate receptors ADIPOR1 and ADIPOR2 may prevent fibrosis-related events that contribute to arthrofibrosis. (2) Methods: Therefore, ADIPOR1 and ADIPOR2 were analyzed in a TGFβ1 inducible cell model for human myofibroblastogenesis by both loss- and gain-of-function experiments. (3) Results: Treatment with AdipoRon, which is a small molecule agonist of ADIPOR1 and ADIPOR2, decreased expression of collagens (COL1A1, COL3A1, and COL6A1) and the myofibroblast marker smooth muscle α-actin (ACTA2) at both mRNA and protein levels in basal and TGFβ1-induced cells. (4) Conclusions: Thus, ADIPOR1 and ADIPOR2 represent potential drug targets that may attenuate the pathogenesis of arthrofibrosis by suppressing TGFβ-dependent induction of myofibroblasts. These findings also suggest that AdipoRon therapy may reduce the development of arthrofibrosis by mediating anti-fibrotic effects in joint capsular tissues.

8.
Mol Ther Methods Clin Dev ; 18: 738-750, 2020 Sep 11.
Article in English | MEDLINE | ID: mdl-32913881

ABSTRACT

The effectiveness of cell-based therapies to treat liver failure is often limited by the diseased liver environment. Here, we provide preclinical proof of concept for hepatocyte transplantation into lymph nodes as a cure for liver failure in a large-animal model with hereditary tyrosinemia type 1 (HT1), a metabolic liver disease caused by deficiency of fumarylacetoacetate hydrolase (FAH) enzyme. Autologous porcine hepatocytes were transduced ex vivo with a lentiviral vector carrying the pig Fah gene and transplanted into mesenteric lymph nodes. Hepatocytes showed early (6 h) and durable (8 months) engraftment in lymph nodes, with reproduction of vascular and hepatic microarchitecture. Subsequently, hepatocytes migrated to and repopulated the native diseased liver. The corrected cells generated sufficient liver mass to clinically ameliorate the acute liver failure and HT1 disease as early as 97 days post-transplantation. Integration site analysis defined the corrected hepatocytes in the liver as a subpopulation of hepatocytes from lymph nodes, indicating that the lymph nodes served as a source for healthy hepatocytes to repopulate a diseased liver. Therefore, ectopic transplantation of healthy hepatocytes cures this pig model of liver failure and presents a promising approach for the development of cures for liver disease in patients.

10.
Gynecol Oncol ; 156(2): 387-392, 2020 02.
Article in English | MEDLINE | ID: mdl-31787246

ABSTRACT

OBJECTIVE: We aimed to assess whether endometrial cancer (EC) can be detected in shed DNA collected with vaginal tampon by analyzing copy number, methylation markers, and mutations. METHODS: Tampons were collected prior to hysterectomy from 38 EC patients and 28 women with benign indications. Extracted tampon DNA underwent the following: 1) low-coverage whole genome sequencing (LC-WGS) to assess copy number, 2) pyrosequencing to measure percent promoter methylation of HOXA9, RASSF1, and CDH13 and 3) next generation sequencing (NGS) to identify mutations in 19 genes associated with EC identified through The Cancer Genome Atlas. Sensitivity and specificity for each test and test combinations were calculated. RESULTS: Methylation analysis yielded the highest specificities but lowest sensitivities (37-40% sensitivity; 100% specificity for HOXA9, RASSF1 and HTR1B) while mutation analysis had improved sensitivity (50% sensitivity; 83% specificity). Only one "false positive" result for copy number variants was identified among women with benign surgical indications, which was based on detection of copy number changes, and associated with a leiomyosarcoma that was only recognized at hysterectomy. Counting a positive result in any of the 3 biomarker classes as a positive test resulted in a sensitivity of 92% and specificity of 86%. Mutation analysis did not add sensitivity to the combination of analysis of copy number and methylation. CONCLUSIONS: This study demonstrates a proof-of-principle for non-invasive yet precise detection of endometrial cancer. We propose that with improved biomarker testing, it may be possible to develop a clinically useful test for detecting EC.


Subject(s)
DNA Methylation , Endometrial Neoplasms/genetics , Gene Dosage , Menstrual Hygiene Products , Biomarkers, Tumor/genetics , Diagnosis, Differential , Endometrial Neoplasms/diagnosis , Endometrial Neoplasms/pathology , Female , Humans , Middle Aged , Mutation , Uterine Diseases/diagnosis , Uterine Diseases/genetics , Uterine Diseases/pathology , Vaginal Smears/methods
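
The "any of the 3 biomarker classes" rule in the abstract above can be sketched as a logical OR over per-class results, followed by the usual sensitivity and specificity calculation. The sample data below are entirely hypothetical and chosen only to exercise the code; they are not the study's data, and the function name is an assumption.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    calls: Sequence[bool], truths: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for boolean test calls
    against boolean disease status."""
    tp = sum(c and t for c, t in zip(calls, truths))
    fn = sum((not c) and t for c, t in zip(calls, truths))
    tn = sum((not c) and (not t) for c, t in zip(calls, truths))
    fp = sum(c and (not t) for c, t in zip(calls, truths))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased sample")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical samples: ((copy_number, methylation, mutation), has_cancer)
    samples = [
        ((True, False, False), True),
        ((False, True, False), True),
        ((False, False, False), True),
        ((False, False, True), False),
        ((False, False, False), False),
        ((False, False, False), False),
        ((False, False, False), False),
    ]
    # A sample is called positive if any biomarker class is positive.
    calls = [any(flags) for flags, _ in samples]
    truths = [truth for _, truth in samples]
    print(sensitivity_specificity(calls, truths))
```

Combining tests with OR can only raise sensitivity and can only lower specificity relative to each individual test, which matches the trade-off the abstract reports.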
11.
Hepatol Int ; 13(4): 490-500, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31214875

ABSTRACT

BACKGROUND: Although molecular characterization of iCCA has been studied recently, integrative analysis of molecular and clinical characterization has not been fully established. If molecular features of iCCA can be predicted based on clinical findings, we can move toward selecting targeted treatments. We analyzed RNA sequencing data annotated with clinicopathologic data to clarify molecular-specific clinical features and to evaluate potential therapies for molecular subtypes. METHODS: We performed next-generation RNA sequencing of 30 surgically resected iCCA from Korean patients and the clinicopathologic features were analyzed. The RNA sequences from 32 iCCA resected from US patients were used for validation. RESULTS: Patients were grouped into two subclasses on the basis of unsupervised clustering, which showed a difference in 5-year survival rates (48.5% vs 14.2%, p = 0.007) and similar survival outcome in the US samples. In subclass B (poor prognosis), both data sets were similar in higher carcinoembryonic antigen and cancer antigen 19-9 levels, underlying cholangitis, and bile duct-type pathology; in subclass A (better prognosis), there was more frequent viral hepatitis and cholangiolar-type pathology. On pathway analysis, subclass A had enriched liver-related signatures. Subclass B had enriched inflammation-related and TP53 pathways, with more frequent KRAS mutations. CCA cell lines with similar gene expression patterns of subclass A were sensitive to gemcitabine. CONCLUSIONS: Two molecular subtypes of iCCA with distinct clinicopathological differences were identified. Knowledge of clinical and pathologic characteristics can predict molecular subtypes, and knowledge of different subtype signaling pathways may lead to more rational, targeted approaches to treatment.


Subject(s)
Bile Duct Neoplasms/mortality , Cholangiocarcinoma/mortality , Aged , Aged, 80 and over , Antimetabolites, Antineoplastic/therapeutic use , Bile Duct Neoplasms/drug therapy , Bile Duct Neoplasms/genetics , Bile Ducts, Intrahepatic , Cholangiocarcinoma/drug therapy , Cholangiocarcinoma/genetics , Deoxycytidine/analogs & derivatives , Deoxycytidine/therapeutic use , Female , Genes, Neoplasm/genetics , Humans , Male , Middle Aged , Mutation/genetics , Prognosis , RNA, Neoplasm/genetics , RNA, Neoplasm/metabolism , Republic of Korea/epidemiology , Retrospective Studies , United States/epidemiology , Up-Regulation , Gemcitabine
12.
Gastroenterology ; 157(1): 210-226.e12, 2019 07.
Article in English | MEDLINE | ID: mdl-30878468

ABSTRACT

BACKGROUND & AIMS: The CCNE1 locus, which encodes cyclin E1, is amplified in many types of cancer cells and is activated in hepatocellular carcinomas (HCCs) from patients infected with hepatitis B virus or adeno-associated virus type 2, due to integration of the virus nearby. We investigated cell-cycle and oncogenic effects of cyclin E1 overexpression in tissues of mice. METHODS: We generated mice with doxycycline-inducible expression of Ccne1 (Ccne1T mice) and activated overexpression of cyclin E1 from age 3 weeks onward. At 14 months of age, livers were collected from mice that overexpress cyclin E1 and nontransgenic mice (controls) and analyzed for tumor burden and by histology. Mouse embryonic fibroblasts (MEFs) and hepatocytes from Ccne1T and control mice were analyzed to determine the extent to which cyclin E1 overexpression perturbs S-phase entry, DNA replication, and numbers and structures of chromosomes. Tissues from 4-month-old Ccne1T and control mice (which at that age were free of tumors) were analyzed for chromosome alterations, to investigate the mechanisms by which cyclin E1 predisposes hepatocytes to transformation. RESULTS: Ccne1T mice developed more hepatocellular adenomas and HCCs than control mice. Tumors developed only in livers of Ccne1T mice, despite high levels of cyclin E1 in other tissues. Ccne1T MEFs had defects that promoted chromosome missegregation and aneuploidy, including incomplete replication of DNA, centrosome amplification, and formation of nonperpendicular mitotic spindles. Whereas Ccne1T mice accumulated near-diploid aneuploid cells in multiple tissues and organs, polyploidization was observed only in hepatocytes, with losses and gains of whole chromosomes, DNA damage, and oxidative stress. CONCLUSIONS: Livers, but not other tissues of mice with inducible overexpression of cyclin E1, develop tumors.
More hepatocytes from the cyclin E1-overexpressing mice were polyploid than from control mice, and had losses or gains of whole chromosomes, DNA damage, and oxidative stress; all of these have been observed in human HCC cells. The increased risk of HCC in patients with hepatitis B virus or adeno-associated virus type 2 infection might involve activation of cyclin E1 and its effects on chromosomes and genomes of liver cells.


Subject(s)
Adenoma, Liver Cell/genetics , Carcinoma, Hepatocellular/genetics , Chromosomal Instability/genetics , Cyclin E/genetics , Liver Neoplasms/genetics , Liver/metabolism , Oncogene Proteins/genetics , Adenoma, Liver Cell/pathology , Adenoma, Liver Cell/virology , Animals , Carcinoma, Hepatocellular/pathology , Carcinoma, Hepatocellular/virology , Chromosome Structures , DNA Damage/genetics , DNA Replication , Dependovirus , Fibroblasts , Hepatitis B, Chronic , Hepatocytes , Liver/pathology , Liver Neoplasms/pathology , Liver Neoplasms/virology , Liver Neoplasms, Experimental/genetics , Liver Neoplasms, Experimental/pathology , Mice , Oxidative Stress/genetics , Parvoviridae Infections , Parvovirinae , Polyploidy , S Phase Cell Cycle Checkpoints
13.
BMC Genomics ; 19(1): 841, 2018 Nov 27.
Article in English | MEDLINE | ID: mdl-30482155

ABSTRACT

BACKGROUND: Copy number alterations (CNAs) are defined as somatic gains or losses of DNA regions. The profiles of CNAs may provide a fingerprint specific to a tumor type or tumor grade. Low-coverage sequencing for reporting CNAs has recently gained interest since being successfully translated into clinical applications. Ovarian serous carcinomas can be classified into two largely mutually exclusive grades, low grade and high grade, based on their histologic features. Grade classification based on genomics may provide valuable clues on how best to manage these patients in the clinic. Based on the study of ovarian serous carcinomas, we explore the methodology of combining CNA reporting from low-coverage sequencing with machine learning techniques to stratify tumor biospecimens of different grades. RESULTS: We have developed a data-driven methodology for tumor classification using the profiles of CNAs reported by low-coverage sequencing. The proposed method, called Bag-of-Segments, is used to summarize fixed-length CNA features predictive of tumor grades. These features are further processed by machine learning techniques to obtain classification models. High accuracy is obtained for classifying ovarian serous carcinoma into high and low grades based on leave-one-out cross-validation experiments. Models that are only weakly influenced by the sequence coverage and the purity of the sample can also be built, which would be of higher relevance for clinical applications. The patterns captured by Bag-of-Segments features correlate with current clinical knowledge: low grade ovarian tumors are related to aneuploidy events associated with mitotic errors, while high grade ovarian tumors are induced by DNA repair gene malfunction. CONCLUSIONS: The proposed data-driven method obtains high accuracy with various parametrizations for the ovarian serous carcinoma study, indicating that it has good generalization potential towards other CNA classification problems.
This method could be applied to the more difficult task of classifying ovarian serous carcinomas with ambiguous histology or in those with low grade tumor co-existing with high grade tumor. The closer genomic relationship of these tumor samples to low or high grade may provide important clinical value.


Subject(s)
Cystadenocarcinoma, Serous/classification , DNA Copy Number Variations , Data Science/methods , Genome, Human , Ovarian Neoplasms/classification , Cystadenocarcinoma, Serous/genetics , Cystadenocarcinoma, Serous/pathology , Female , Humans , Neoplasm Grading , Ovarian Neoplasms/genetics , Ovarian Neoplasms/pathology , Whole Genome Sequencing
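
The leave-one-out cross-validation mentioned in the abstract above can be sketched as follows: each sample is held out in turn, a model is fit on the rest, and the held-out sample is predicted. The abstract does not name the classifier, so a nearest-centroid rule is used here purely as a stand-in; the feature vectors and labels are hypothetical and do not represent Bag-of-Segments features.

```python
from typing import Hashable, List, Sequence


def nearest_centroid_predict(
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[Hashable],
    x: Sequence[float],
) -> Hashable:
    """Predict the label whose training centroid is closest to `x`."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label: Hashable) -> float:
        n = counts[label]
        return sum((s / n - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=squared_distance)


def leave_one_out_accuracy(
    xs: List[Sequence[float]], ys: List[Hashable]
) -> float:
    """Fraction of samples predicted correctly when each is held out once."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        correct += nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]
    return correct / len(xs)


if __name__ == "__main__":
    # Hypothetical two-dimensional feature vectors for six tumors.
    features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
                [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
    grades = ["low", "low", "low", "high", "high", "high"]
    print(leave_one_out_accuracy(features, grades))
```

Leave-one-out is a natural choice for small clinical cohorts because every sample is used for both training and testing, although its accuracy estimate can have high variance.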
14.
Hum Hered ; 83(2): 79-91, 2018.
Article in English | MEDLINE | ID: mdl-30347404

ABSTRACT

AIMS: We propose a novel machine learning approach to expand the knowledge about drug-target interactions. Our method may help to develop effective, less harmful treatment strategies and to enable the detection of novel indications for existing drugs. METHODS: We developed a novel machine learning strategy to predict drug-target interactions based on drug side effects and traits from genome-wide association studies. We integrated data from the databases SIDER and GWASdb and utilized them in a unique way by a neural network approach. RESULTS: We validate our method using drug-target interactions from the STITCH database. In addition, we compare the chemical similarity of the predicted target to known targets of the drug under consideration and present literature-based evidence for predicted interactions. We find drug combination warnings for drugs we predict to target the same protein, hinting to synergistic effects aggravating harmful events. This substantiates the translational value of our approach, because we are able to detect drugs that should be taken together with care due to common mechanisms of action. CONCLUSION: Taken together, we conclude that our approach is able to generate a novel and clinically applicable insight into the molecular determinants of drug action.


Subject(s)
Drug Interactions , Drug-Related Side Effects and Adverse Reactions , Genome-Wide Association Study , Machine Learning , Humans , Neural Networks, Computer
15.
BMC Med Genomics ; 11(Suppl 3): 67, 2018 Sep 14.
Article in English | MEDLINE | ID: mdl-30255803

ABSTRACT

BACKGROUND: RNA-seq is the most commonly used sequencing application. Not only does it measure gene expression but it is also an excellent medium to detect important variants such as single nucleotide variants (SNVs), insertions/deletions (indels) or fusion transcripts. However, detection of these variants from RNA-seq is challenging and complex. Here we describe a sensitive and accurate analytical pipeline which detects various mutations at once for translational precision medicine. METHODS: The pipeline incorporates the most sensitive aligners for indels in RNA-seq, the best practice for data preprocessing and variant calling, and STAR-Fusion for chimeric transcripts. Variants/mutations are annotated, and key genes can be extracted for further investigation and clinical actions. Three datasets were used to evaluate the performance of the pipeline for SNVs, indels and fusion transcripts. RESULTS: For the well-defined variants from NA12878 by the GIAB project, sensitivities of about 95% and 80% were obtained for SNVs and indels, respectively, in matching RNA-seq. Comparison with other variant-specific tools showed good performance of the pipeline. For the lung cancer dataset with 41 known and oncogenic mutations, 39 were detected by the pipeline with the STAR aligner and all by the GSNAP aligner. An actionable EML4 and ALK fusion was also detected in one of the tumors, which also demonstrated outlier ALK expression. For 9 fusions spiked into RNA-seq libraries at different concentrations, the pipeline was able to detect all in unfiltered results, although some at very low concentrations may be missed when filtering is applied. CONCLUSIONS: The new RNA-seq workflow is an accurate and comprehensive mutation profiler from RNA-seq. Key or actionable mutations are reliably detected from RNA-seq, which makes it a practical alternative source for personalized medicine.


Subject(s)
Biomarkers, Tumor/genetics , High-Throughput Nucleotide Sequencing/methods , INDEL Mutation , Lung Neoplasms/genetics , Polymorphism, Single Nucleotide , Precision Medicine , Sequence Analysis, RNA/methods , Adenocarcinoma/genetics , Humans , Software
16.
BMC Cancer ; 18(1): 743, 2018 07 18.
Article in English | MEDLINE | ID: mdl-30021563

ABSTRACT

Correction to: BMC Cancer (2018) 18:577 DOI https://doi.org/10.1186/s12885-018-4345-2.

17.
BMC Bioinformatics ; 19(1): 271, 2018 07 17.
Article in English | MEDLINE | ID: mdl-30016933

ABSTRACT

BACKGROUND: Transfer of genetic material from microbes or viruses into the host genome is known as horizontal gene transfer (HGT). The integration of viruses into the human genome is associated with multiple cancers, and these can now be detected using next-generation sequencing methods such as whole genome sequencing and RNA-sequencing. RESULTS: We designed a novel computational workflow, HGT-ID, to identify the integration of viruses into the human genome using the sequencing data. The HGT-ID workflow primarily follows a four-step procedure: i) pre-processing of unaligned reads, ii) virus detection using subtraction approach, iii) identification of virus integration site using discordant and soft-clipped reads and iv) HGT candidates prioritization through a scoring function. Annotation and visualization of the events, as well as primer design for experimental validation, are also provided in the final report. We evaluated the tool performance with the well-understood cervical cancer samples. The HGT-ID workflow accurately detected known human papillomavirus (HPV) integration sites with high sensitivity and specificity compared to previous HGT methods. We applied HGT-ID to The Cancer Genome Atlas (TCGA) whole-genome sequencing data (WGS) from liver tumor-normal pairs. Multiple hepatitis B virus (HBV) integration sites were identified in TCGA liver samples and confirmed by HGT-ID using the RNA-Seq data from the matched liver pairs. This shows the applicability of the method in both the data types and cross-validation of the HGT events in liver samples. We also processed 220 breast tumor WGS data through the workflow; however, there were no HGT events detected in those samples. CONCLUSIONS: HGT-ID is a novel computational workflow to detect the integration of viruses in the human genome using the sequencing data. It is fast and accurate with functions such as prioritization, annotation, visualization and primer design for future validation of HGTs. 
The HGT-ID workflow is released under the MIT License and available at http://kalarikrlab.org/Software/HGT-ID.html .


Subject(s)
Gene Transfer, Horizontal/genetics , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Virus Integration/genetics , Algorithms , Base Sequence , Breast Neoplasms/virology , Cell Line, Tumor , Computer Simulation , Female , Humans , ROC Curve , Software , Whole Genome Sequencing , Workflow
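
Step iii of the workflow described above relies on soft-clipped reads, which in SAM/BAM alignments are marked by `S` operations at the ends of the CIGAR string. The sketch below extracts the soft-clip lengths from a CIGAR string; it is an illustration of that one concept, not HGT-ID code, and the function name and the 20-base threshold in the usage example are hypothetical.

```python
import re
from typing import Tuple

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_lengths(cigar: str) -> Tuple[int, int]:
    """Return (left, right) soft-clip lengths for a SAM CIGAR string.

    Hard clips (H) may flank soft clips, so they are ignored when looking
    at the ends. An unavailable CIGAR ("*") yields (0, 0).
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right


if __name__ == "__main__":
    min_clip = 20  # hypothetical threshold for a candidate breakpoint read
    for cigar in ["100M", "30S70M", "64M36S", "5H20S50M25S5H"]:
        left, right = soft_clip_lengths(cigar)
        is_candidate = max(left, right) >= min_clip
        print(cigar, left, right, is_candidate)
```

A read whose clipped portion aligns to a viral genome while the rest aligns to the human genome is the kind of evidence that supports an integration site, alongside discordant read pairs.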
18.
BMC Cancer ; 18(1): 577, 2018 May 21.
Article in English | MEDLINE | ID: mdl-29783934

ABSTRACT

BACKGROUND: The right drug to the right patient at the right time is one of the ideals of Individualized Medicine (IM) and remains one of the most compelling promises of the post-genomic age. The addition of genomic information is expected to increase the precision of an individual patient's treatment, resulting in improved outcomes. While pilot studies have been encouraging, key aspects of interpreting tumor genomics information, such as somatic activation of drug transport or metabolism, have not been systematically evaluated. METHODS: In this work, we developed a simple rule-based approach to classify the therapies administered to each patient from The Cancer Genome Atlas PanCancer dataset (n = 2858) as effective or ineffective. Our Therapy Efficacy model used each patient's drug target and pharmacokinetic (PK) gene expression profile; the specific genes considered for each patient depended on the therapies they received. Patients who received predictably ineffective therapies were considered at high-risk of cancer-related mortality and those who did not receive ineffective therapies were considered at low-risk. The utility of our Therapy Efficacy model was assessed using per-cancer and pan-cancer differential survival. RESULTS: Our simple rule-based Therapy Efficacy model classified 143 (5%) patients as high-risk. High-risk patients had age ranges comparable to low-risk patients of the same cancer type and tended to be later stage and higher grade (odds ratios of 1.6 and 1.4, respectively). A significant pan-cancer association was identified between predictions of our Therapy Efficacy model and poorer overall survival (hazard ratio, HR = 1.47, p = 6.3 × 10⁻³). Individually, drug export (HR = 1.49, p = 4.70 × 10⁻³) and drug metabolism (HR = 1.73, p = 9.30 × 10⁻⁵) genes demonstrated significant survival associations. Survival associations for target gene expression are mechanism-dependent. Similar results were observed for event-free survival.
CONCLUSIONS: While the resolution of clinical information within the dataset is not ideal, and modeling the relative contribution of each gene to the activity of each therapy remains a challenge, our approach demonstrates that somatic PK alterations should be integrated into the interpretation of somatic transcriptomic profiles as they likely have a significant impact on the survival of specific patients. We believe that this approach will aid the prospective design of personalized therapeutic strategies.
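The rule-based classification described in this abstract can be illustrated with a minimal sketch. The abstract does not state the actual rules, so everything specific here is an assumption made for illustration: the z-score cutoffs, the gene names (`GENE_TARGET`, `GENE_EXPORT`), the `Therapy` structure, and the direction of the target-expression rule (which the abstract notes is mechanism-dependent). It shows only the general shape: a therapy is flagged as predictably ineffective from the patient's target and PK gene expression, and a patient who received any flagged therapy is labeled high-risk.

```python
from dataclasses import dataclass
from typing import Dict, Sequence

# Hypothetical cutoffs on z-scored expression; not taken from the abstract.
LOW_TARGET_Z = -1.0   # assumed: a target expressed this low suggests no effect
HIGH_PK_Z = 1.0       # assumed: export/metabolism genes this high suggest drug clearance


@dataclass
class Therapy:
    name: str
    target_genes: Sequence[str]  # genes encoding the drug's targets
    pk_genes: Sequence[str]      # drug export / metabolism genes


def predicted_ineffective(therapy: Therapy, expr_z: Dict[str, float]) -> bool:
    """Flag a therapy when any target is low or any PK gene is high.

    Genes missing from the profile are treated as z = 0 (no evidence).
    """
    target_low = any(expr_z.get(g, 0.0) <= LOW_TARGET_Z for g in therapy.target_genes)
    pk_high = any(expr_z.get(g, 0.0) >= HIGH_PK_Z for g in therapy.pk_genes)
    return target_low or pk_high


def patient_risk(therapies: Sequence[Therapy], expr_z: Dict[str, float]) -> str:
    """A patient is high-risk if any received therapy is flagged as ineffective."""
    flagged = any(predicted_ineffective(t, expr_z) for t in therapies)
    return "high" if flagged else "low"


# Illustrative usage with made-up gene names and expression values.
drug = Therapy("drug_x", target_genes=["GENE_TARGET"], pk_genes=["GENE_EXPORT"])
risk_a = patient_risk([drug], {"GENE_TARGET": 0.2, "GENE_EXPORT": 2.3})
risk_b = patient_risk([drug], {"GENE_TARGET": 0.2, "GENE_EXPORT": -0.4})
```

The genes examined depend on the therapies each patient received, which is why the rule takes the therapy list and the expression profile together rather than scoring a fixed gene panel.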


Subject(s)
Antineoplastic Agents/pharmacokinetics , Models, Biological , Neoplasms/drug therapy , Precision Medicine/methods , Antineoplastic Agents/therapeutic use , Datasets as Topic , Gene Expression Profiling , Humans , Neoplasms/genetics , Pharmacogenomic Variants/genetics , Progression-Free Survival , Proportional Hazards Models , Treatment Outcome
19.
Clin Cancer Res ; 24(14): 3299-3308, 2018 07 15.
Article in English | MEDLINE | ID: mdl-29618619

ABSTRACT

Purpose: Homozygous deletions play important roles in carcinogenesis. The genome-wide screening for homozygously deleted genes in many different cancer types with a large number of patient specimens representing the tumor heterogeneity has not been done. Experimental Design: We performed integrative analyses of the copy-number profiles of 10,759 patients across 31 cancer types from The Cancer Genome Atlas project. Results: We found that the type-I interferon, α-, and β-defensin genes were homozygously deleted in 19 cancer types with high frequencies (7%-31%, median = 12%; interquartile range = 10%-16.5%). Patients with homozygous deletion of interferons exhibited significantly shortened overall or disease-free survival time in a number of cancer types, whereas patients with homozygous deletion of defensins did not significantly associate with worse overall or disease-free survival. Gene expression analyses suggested that homozygous deletion of interferon and defensin genes could activate genes involved in oncogenic and cell-cycle pathways but repress other genes involved in immune response pathways, suggesting their roles in promoting tumorigenesis and helping cancer cells evade immune surveillance. Further analysis of the whole exomes of 109 patients with melanoma demonstrated that the homozygous deletion of interferon (P = 0.0029, OR = 11.8) and defensin (P = 0.06, OR = 2.79) genes are significantly associated with resistance to anti-CTLA4 immunotherapy. Conclusions: Our analysis reveals that the homozygous deletion of interferon and defensin genes is prevalent in human cancers, and importantly this feature can be used as a novel prognostic biomarker for immunotherapy resistance. Clin Cancer Res; 24(14); 3299-308. ©2018 AACR.


Subject(s)
Defensins/genetics , Drug Resistance, Neoplasm/genetics , Gene Frequency , Homozygote , Interferon Type I/genetics , Neoplasms/genetics , Sequence Deletion , Antineoplastic Agents, Immunological/pharmacology , Antineoplastic Agents, Immunological/therapeutic use , Computational Biology/methods , Databases, Genetic , Genomics/methods , Humans , Neoplasms/drug therapy , Neoplasms/immunology , Neoplasms/mortality , Prognosis , Treatment Outcome
20.
Mol Carcinog ; 57(1): 114-124, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28926134

ABSTRACT

Chromosome instability (CIN) is widely observed in both sporadic and hereditary colorectal cancer (CRC). Defects in APC and WNT signaling are primarily associated with CIN in hereditary CRC, but the genetic causes for CIN in sporadic CRC remain elusive. Using high-density SNP array and exome data from The Cancer Genome Atlas (TCGA), we characterized loss of heterozygosity (LOH) and copy number variation (CNV) in the peripheral blood, normal colon, and corresponding tumor tissue in 15 CRC patients with proficient mismatch repair (pMMR) and 24 CRC patients with deficient MMR. We found a high frequency of 18q LOH in tumors and arm-specific enrichment of genetic aberrations on 18q in the normal colon (primarily copy-neutral LOH) and blood (primarily copy gain). These aberrations were specific to sporadic, pMMR CRC. Though in tumor samples genetic aberrations were observed for genes commonly mutated in hereditary CRC (e.g., APC, CTNNB1, SMAD4, BRAF), none of them showed LOH or CNV in the normal colon or blood. DCC, located on 18q21.1, topped the list of genes with genetic aberrations in the tumor. In an independent cohort of 13 patients subjected to Whole Genome Sequencing (WGS), we found LOH and CNV on 18q in adenomatous polyp and tumor tissues. Our data suggest that patients with sporadic CRC may have genetic aberrations preferentially enriched on 18q in their blood, normal colon epithelium, and non-malignant polyp lesions that may prove useful as a clinical marker for sporadic CRC detection and risk assessment.


Subject(s)
Colorectal Neoplasms/genetics , DNA Copy Number Variations , DNA Mismatch Repair/genetics , Loss of Heterozygosity , Aged , Aged, 80 and over , Chromosomal Instability , Chromosomes, Human, Pair 18/genetics , Cohort Studies , Colorectal Neoplasms/pathology , Female , Genotype , Humans , Male , Middle Aged , Mutation