Results 1 - 20 of 58
1.
Curr Hematol Malig Rep ; 18(6): 284-291, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37947937

ABSTRACT

PURPOSE OF REVIEW: The length of telomeres, protective structures at the chromosome ends, is a well-established biomarker for pathological conditions including multisystemic syndromes called telomere biology disorders. Approaches to measure telomere length (TL) differ on whether they estimate average, distribution, or chromosome-specific TL, and each presents its own advantages and limitations. RECENT FINDINGS: The development of long-read sequencing and publication of the telomere-to-telomere human genome reference has allowed for scalable and high-resolution TL estimation in pre-existing sequencing datasets but is still impractical as a dedicated TL test. As sequencing costs continue to fall and strategies for selectively enriching telomere regions prior to sequencing improve, these approaches may become a promising alternative to classic methods. Measurement methods rely on probe hybridization, qPCR or more recently, computational methods using sequencing data. Refinements of existing techniques and new approaches have been recently developed but a test that is accurate, simple, and scalable is still lacking.


Subjects
Telomere , Humans , Forecasting , Telomere/genetics
2.
Nat Commun ; 13(1): 6286, 2022 10 21.
Article in English | MEDLINE | ID: mdl-36271076

ABSTRACT

A GGGGCC24+ hexanucleotide repeat expansion (HRE) in the C9ORF72 gene is the most common genetic cause of amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD), fatal neurodegenerative diseases with no cure or approved treatments that substantially slow disease progression or extend survival. Mechanistic underpinnings of neuronal death include C9ORF72 haploinsufficiency, sequestration of RNA-binding proteins in the nucleus, and production of dipeptide repeat proteins. Here, we used an adeno-associated viral vector system to deliver CRISPR/Cas9 gene-editing machineries to effectuate the removal of the HRE from the C9ORF72 genomic locus. We demonstrate successful excision of the HRE in primary cortical neurons and brains of three mouse models containing the expansion (500-600 repeats) as well as in patient-derived iPSC motor neurons and brain organoids (450 repeats). This resulted in a reduction of RNA foci, poly-dipeptides and haploinsufficiency, major hallmarks of C9-ALS/FTD, making this a promising therapeutic approach to these diseases.


Subjects
Amyotrophic Lateral Sclerosis , Frontotemporal Dementia , Animals , Mice , Frontotemporal Dementia/genetics , Frontotemporal Dementia/metabolism , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/metabolism , C9orf72 Protein/genetics , C9orf72 Protein/metabolism , DNA Repeat Expansion/genetics , CRISPR-Cas Systems , Motor Neurons/metabolism , Dipeptides/metabolism , RNA/metabolism
3.
Nat Commun ; 13(1): 5012, 2022 08 25.
Article in English | MEDLINE | ID: mdl-36008405

ABSTRACT

Conventional therapy for hereditary tyrosinemia type-1 (HT1) with 2-(2-nitro-4-trifluoromethylbenzoyl)-1,3-cyclohexanedione (NTBC) delays and in some cases fails to prevent disease progression to liver fibrosis, liver failure, and activation of tumorigenic pathways. Here we demonstrate cure of HT1 by direct, in vivo administration of a therapeutic lentiviral vector targeting the expression of a human fumarylacetoacetate hydrolase (FAH) transgene in the porcine model of HT1. This therapy is well tolerated and provides stable long-term expression of FAH in pigs with HT1. Genomic integration displays a benign profile, with subsequent fibrosis and tumorigenicity gene expression patterns similar to wild-type animals as compared to NTBC-treated or diseased untreated animals. Indeed, the phenotypic and genomic data following in vivo lentiviral vector administration demonstrate comparative superiority over other therapies including ex vivo cell therapy and therefore support clinical application of this approach.


Subjects
Precancerous Conditions , Tyrosinemias , Animals , Disease Models, Animal , Genetic Therapy , Humans , Hydrolases/genetics , Hydrolases/metabolism , Liver Cirrhosis/therapy , Nitrobenzoates/pharmacology , Nitrobenzoates/therapeutic use , Swine , Tyrosinemias/genetics , Tyrosinemias/therapy
5.
Bioinformatics ; 38(7): 1788-1793, 2022 03 28.
Article in English | MEDLINE | ID: mdl-35022670

ABSTRACT

MOTIVATION: Telomeres are the repetitive sequences found at the ends of eukaryotic chromosomes and are often thought of as a 'biological clock,' with their average length shortening during division in most cells. In addition to their association with senescence, abnormal telomere lengths are well known to be associated with multiple cancers, short telomere syndromes and as risk factors for a broad range of diseases. While a majority of methods for measuring telomere length will report average lengths across all chromosomes, it is known that aberrations in specific chromosome arms are biomarkers for certain diseases. Due to their repetitive nature, characterizing telomeres at this resolution is prohibitive for short read sequencing approaches, and remains challenging even with longer reads. RESULTS: We present Telogator: a method for reporting chromosome-specific telomere length from long read sequencing data. We demonstrate Telogator's sensitivity in detecting chromosome-specific telomere length in simulated data across a range of read lengths and error rates. Telogator is then applied to 10 germline samples, yielding a high correlation with short read methods in reporting average telomere length. In addition, we investigate common subtelomere rearrangements and identify the minimum read length required to anchor telomere/subtelomere boundaries in samples with these haplotypes. AVAILABILITY AND IMPLEMENTATION: Telogator is written in Python3 and is available at github.com/zstephens/telogator. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Repetitive Sequences, Nucleic Acid , Telomere , Telomere/genetics , Haplotypes
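To illustrate why telomere length can in principle be read off a long read, the sketch below measures the exact tandem run of the canonical human telomeric repeat TTAGGG at the end of a read. It is not Telogator's algorithm, which the abstract does not describe; the function name `terminal_repeat_length` and the toy read are assumptions made for this example, and real long reads contain sequencing errors that exact matching would not tolerate.

```python
def terminal_repeat_length(read: str, unit: str = "TTAGGG") -> int:
    """Return the length, in bases, of the exact tandem run of `unit`
    found at the 3' end of `read` (0 if the read does not end in `unit`)."""
    read = read.upper()
    n = len(unit)
    end = len(read)
    # Walk backwards one repeat unit at a time while the unit matches exactly.
    while end >= n and read[end - n:end] == unit:
        end -= n
    return len(read) - end


# Hypothetical toy read: 10 bases of non-telomeric sequence, then 5 repeats.
toy_read = "ACGTACGTAC" + "TTAGGG" * 5
print(terminal_repeat_length(toy_read))  # 30
```

A practical tool would need to tolerate mismatches and indels and to locate the telomere/subtelomere boundary, which is the harder problem the abstract refers to.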
6.
PLoS One ; 16(9): e0250915, 2021.
Article in English | MEDLINE | ID: mdl-34550971

ABSTRACT

The integration of viruses into the human genome is known to be associated with tumorigenesis in many cancers, but the accurate detection of integration breakpoints from short read sequencing data is made difficult by human-viral homologies, viral genome heterogeneity, coverage limitations, and other factors. To address this, we present Exogene, a sensitive and efficient workflow for detecting viral integrations from paired-end next generation sequencing data. Exogene's read filtering and breakpoint detection strategies yield integration coordinates that are highly concordant with long read validation. We demonstrate this concordance across 6 TCGA Hepatocellular carcinoma (HCC) tumor samples, identifying integrations of hepatitis B virus that are also supported by long reads. Additionally, we applied Exogene to targeted capture data from 426 previously studied HCC samples, achieving 98.9% concordance with existing methods and identifying 238 high-confidence integrations that were not previously reported. Exogene is applicable to multiple types of paired-end sequence data, including genome, exome, RNA-Seq and targeted capture.


Subjects
Carcinoma, Hepatocellular/virology , Computational Biology/methods , Hepatitis B virus/physiology , Hepatitis B/genetics , Liver Neoplasms/virology , Virus Integration , Carcinoma, Hepatocellular/genetics , High-Throughput Nucleotide Sequencing , Humans , Liver Neoplasms/genetics , Sequence Analysis, DNA , Sequence Analysis, RNA , Software , Exome Sequencing , Workflow
7.
Cancer Inform ; 20: 11769351211027592, 2021.
Article in English | MEDLINE | ID: mdl-34234399

ABSTRACT

BACKGROUND: Thousands of gene fusions have been reported in prostate cancer, but their authenticity, incidence, and tumor specificity have not been thoroughly evaluated, nor have their genomic characteristics been carefully explored. METHODS: We developed FusionVet, a tool dedicated to validating known fusion genes using RNA-seq alignments. Using FusionVet, we re-assessed 2727 gene fusions reported from 36 studies using the RNA-seq data generated by The Cancer Genome Atlas (TCGA). We also explored their genomic characteristics and interrogated the transcriptomic and DNA methylomic consequences of the E26 transformation-specific (ETS) fusions. RESULTS: We found that nearly two-thirds of reported fusions are intra-chromosomal, and 80% of them were formed between 2 protein-coding genes. Although most (76%) genes were fused to only 1 partner, we observed many fusion hub genes that have multiple fusion partners, including ETS family genes, androgen receptor signaling pathway genes, tumor suppressor genes, and proto-oncogenes. More than 90% of the reported fusions cannot be validated by TCGA RNA-seq data. For those fusions that can be validated, 5% were detected from tumor and normal samples with similar frequencies, and only 4% (120 fusions) were tumor-specific. The occurrences of ERG, ETV1, and ETV4 fusions were mutually exclusive, and their fusion statuses were tightly associated with overexpression. In addition, we found that ERG fusions significantly co-occurred with PTEN deletion but were mutually exclusive with common genomic alterations such as SPOP mutation and FOXA1 mutation. CONCLUSIONS: Most of the reported fusion genes cannot be validated by TCGA samples. The ETS family and androgen response genes were significantly enriched in prostate cancer-specific fusion genes. Transcription activity was significantly repressed, and the DNA methylation was significantly increased in samples carrying ERG fusion.

8.
Tissue Eng Part A ; 27(23-24): 1503-1516, 2021 12.
Article in English | MEDLINE | ID: mdl-33975459

ABSTRACT

Metal orthopedic implants are largely biocompatible and generally achieve long-term structural fixation. However, some orthopedic implants may loosen over time even in the absence of infection. In vivo fixation failure is multifactorial, but the fundamental biological defect is cellular dysfunction at the host-implant interface. Strategies to reduce the risk of short- and long-term loosening include surface modifications, implant metal alloy type, and adjuvant substances such as polymethylmethacrylate cement. Surface modifications (e.g., increased surface rugosity) can increase osseointegration and biological ingrowth of orthopedic implants. However, the localized responses of cells to implant surface modifications need to be better characterized. As an in vitro model for investigating cellular responses to metallic orthopedic implants, we cultured mesenchymal stromal/stem cells on clinical-grade titanium disks (Ti6Al4V) that differed in surface roughness as high (porous structured), medium (grit blasted), and low (bead blasted). Topological characterization of clinically relevant titanium (Ti) materials combined with differential mRNA expression analyses (RNA-seq and real-time quantitative polymerase chain reaction) revealed alterations to the biological phenotype of cells cultured on titanium structures that favor early extracellular matrix production and observable responses to oxidative stress and heavy metal stress. These results provide a descriptive model for the interpretation of cellular responses at the interface between native host tissues and three-dimensionally printed modular orthopedic implants, and will guide future studies aimed at increasing the long-term retention of such materials after total joint arthroplasty. 
Impact statement: Using an in vitro model of implant-to-cell interactions by culturing mesenchymal stromal cells (MSCs) on clinically relevant titanium materials of varying topological roughness, we identified mRNA expression patterns consistent with early extracellular matrix (ECM) production and responses to oxidative/heavy metal stress. Implants with high surface roughness may delay the differentiation and ECM formation of MSCs and alter the expression of genes sensitive to reactive oxygen species and protein kinases. In combination with ongoing animal studies, these results will guide future studies aimed at increasing the long-term retention of widely used titanium materials after total joint arthroplasty.


Subjects
Mesenchymal Stem Cells , Titanium , Alloys/metabolism , Animals , Humans , Osseointegration/physiology , Phenotype , Prostheses and Implants , Surface Properties , Titanium/pharmacology
9.
BMC Med Genomics ; 14(1): 65, 2021 03 01.
Article in English | MEDLINE | ID: mdl-33648520

ABSTRACT

BACKGROUND: Traditionally, mutational burden and mutational signatures have been assessed by tumor-normal pair DNA sequencing. Obtaining both normal and tumor samples is not always feasible from a clinical perspective, which led us to investigate the efficacy of using RNA sequencing of only the tumor sample to determine the mutational burden and signatures, and subsequently the molecular cause of the cancer. The potential advantages include reducing the cost of testing, and simultaneously providing information on the gene expression profile and gene fusions present in the tumor. RESULTS: In this study, we devised supervised and unsupervised learning methods to determine mutational signatures from tumor RNA-seq data. As applications, we applied the methods to a training set of 587 TCGA uterine cancer RNA-seq samples, and examined them in an independent testing set of 521 TCGA colorectal cancer RNA-seq samples. Both diseases are known to be associated with microsatellite instability-high (MSI-H) status and driver defects in DNA polymerase ɛ (POLɛ). Among variants called from RNA-seq, we found that the majority (> 95%) are likely germline variants, leading to C > T enriched germline variants (dbSNP) widely applicable in tumor and normal RNA-seq samples. We found significant associations between RNA-derived mutational burdens and MSI/POLɛ status, and no significant relationship between RNA-seq total coverage and derived mutational burdens. Additionally, we found that over 80% of variants could be explained by using the COSMIC mutational signature-5, -6 and -10, which are implicated in natural aging, MSI-H, and POLɛ, respectively. For classifying tumor type, within UCEC we achieved a recall of 0.56 and 0.78, and specificity of 0.66 and 0.99 for MSI and POLɛ respectively. By applying learnt RNA signatures from UCEC to COAD, we were able to improve our classification of both MSI and POLɛ. CONCLUSIONS: Taken together, our work provides a novel method to detect RNA-seq derived mutational signatures with effective procedures to remove likely germline variants. It can lead to accurate classification of underlying driving mechanisms of DNA damage deficiency.


Subjects
Microsatellite Instability , RNA-Seq , Mutation , Exome Sequencing
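Mutational signature analyses typically start by tallying single-base substitutions into six classes named after the pyrimidine of the reference base pair (C>A, C>G, C>T, T>A, T>C, T>G), so that G>A is counted as C>T. The sketch below shows only that tallying step on a hypothetical list of variants; it is not the supervised or unsupervised method of the abstract, and the names `substitution_class`, `class_fractions`, and `toy_variants` are assumptions for illustration. For stranded RNA-seq data, collapsing the two strands may not be appropriate.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref: str, alt: str) -> str:
    """Map a single-base substitution to its pyrimidine-based class, e.g. G>A -> C>T."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Purine reference: report the change on the complementary strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def class_fractions(variants):
    """Return the fraction of variants falling in each substitution class."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in variants)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical (ref, alt) calls, not taken from the study.
toy_variants = [("C", "T"), ("G", "A"), ("T", "C"), ("A", "G"), ("C", "A")]
fractions = class_fractions(toy_variants)
print(sorted(fractions.items()))
```

Published signature catalogues additionally split each class by the flanking bases, which this sketch omits.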
11.
Gynecol Oncol ; 156(2): 387-392, 2020 02.
Article in English | MEDLINE | ID: mdl-31787246

ABSTRACT

OBJECTIVE: We aimed to assess whether endometrial cancer (EC) can be detected in shed DNA collected with vaginal tampon by analyzing copy number, methylation markers, and mutations. METHODS: Tampons were collected prior to hysterectomy from 38 EC patients and 28 women with benign indications. Extracted tampon DNA underwent the following: 1) low-coverage whole genome sequencing (LC-WGS) to assess copy number, 2) pyrosequencing to measure percent promoter methylation of HOXA9, RASSF1, and CDH13 and 3) next generation sequencing (NGS) to identify mutations in 19 genes associated with EC identified through The Cancer Genome Atlas. Sensitivity and specificity for each test and test combinations were calculated. RESULTS: Methylation analysis yielded the highest specificities but lowest sensitivities (37-40% sensitivity; 100% specificity for HOXA9, RASSF1 and HTR1B) while mutation analysis had improved sensitivity (50% sensitivity; 83% specificity). Only one "false positive" result for copy number variants was identified among women with benign surgical indications, which was based on detection of copy number changes, and associated with a leiomyosarcoma that was only recognized at hysterectomy. Considering any of the 3 biomarker classes as a positive resulted in a sensitivity of 92% and specificity of 86%. Mutation analysis did not add sensitivity to the combination of analysis of copy number and methylation. CONCLUSIONS: This study demonstrates a proof-of-principle for non-invasive yet precise detection of endometrial cancer. We propose that with improved biomarker testing, it may be possible to develop a clinically useful test for detecting EC.


Subjects
DNA Methylation , Endometrial Neoplasms/genetics , Gene Dosage , Menstrual Hygiene Products , Biomarkers, Tumor/genetics , Diagnosis, Differential , Endometrial Neoplasms/diagnosis , Endometrial Neoplasms/pathology , Female , Humans , Middle Aged , Mutation , Uterine Diseases/diagnosis , Uterine Diseases/genetics , Uterine Diseases/pathology , Vaginal Smears/methods
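The rule described in the abstract, calling a sample positive when any of several tests is positive, trades specificity for sensitivity. The sketch below computes both measures for such a combined rule. All per-sample results are invented toy values, not data from the study, and the names `sensitivity_specificity` and `toy_samples` are assumptions for illustration.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) from parallel lists of booleans."""
    tp = sum(t and p for t, p in zip(truth, predicted))
    fn = sum(t and not p for t, p in zip(truth, predicted))
    tn = sum((not t) and (not p) for t, p in zip(truth, predicted))
    fp = sum((not t) and p for t, p in zip(truth, predicted))
    return tp / (tp + fn), tn / (tn + fp)


# Toy rows: (has_cancer, copy_number_pos, methylation_pos, mutation_pos).
toy_samples = [
    (True, True, False, False),
    (True, False, True, False),
    (True, False, False, False),
    (True, False, False, True),
    (False, False, False, False),
    (False, False, False, True),
    (False, False, False, False),
    (False, False, False, False),
]

truth = [row[0] for row in toy_samples]
# Combined rule: positive if any of the three biomarker classes is positive.
combined = [any(row[1:]) for row in toy_samples]

sens, spec = sensitivity_specificity(truth, combined)
print(sens, spec)  # 0.75 0.75
```

In this toy set the mutation test alone detects one of four cancers but also produces the only false positive, which mirrors the general trade-off rather than the reported figures.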
12.
Pac Symp Biocomput ; 25: 599-610, 2020.
Article in English | MEDLINE | ID: mdl-31797631

ABSTRACT

Shallow-depth whole-genome sequencing (WGS) of circulating cell-free DNA (ccfDNA) is a popular approach for non-invasive genomic screening assays, including liquid biopsy for early detection of invasive tumors as well as non-invasive prenatal screening (NIPS) for common fetal trisomies. In contrast to nuclear DNA WGS, ccfDNA WGS exhibits extensive inter- and intra-sample coverage variability that is not fully explained by typical sources of variation in WGS, such as GC content. This variability may inflate false positive and false negative screening rates of copy-number alterations and aneuploidy, particularly if these features are present at a relatively low proportion of total sequenced content. Herein, we propose an empirically-driven coverage correction strategy that leverages prior annotation information in a multi-distance learning context to improve within-sample coverage profile correction. Specifically, we train a weighted k-nearest neighbors-style method on non-pregnant female donor ccfDNA WGS samples, and apply it to NIPS samples to evaluate coverage profile variability reduction. We additionally characterize improvement in the discrimination of positive fetal trisomy cases relative to normal controls, and compare our results against a more traditional regression-based approach to profile coverage correction based on GC content and mappability. Under cross-validation, performance measures indicated benefit to combining the two feature sets relative to either in isolation. We also observed substantial improvement in coverage profile variability reduction in leave-out clinical NIPS samples, with variability reduced by 26.5-53.5% relative to the standard regression-based method as quantified by median absolute deviation. Finally, we observed improved discrimination of screening-positive trisomy cases: the method reduced ccfDNA WGS coverage variability while additionally improving NIPS trisomy screening assay performance. Overall, our results indicate that machine learning approaches can substantially improve ccfDNA WGS coverage profile correction and downstream analyses.


Subjects
Genetic Testing , Prenatal Diagnosis , Trisomy , Cell-Free Nucleic Acids/genetics , Computational Biology , DNA/genetics , DNA Copy Number Variations , Female , Fetus , Genetic Testing/methods , High-Throughput Nucleotide Sequencing , Humans , Machine Learning , Pregnancy , Prenatal Diagnosis/methods , Sequence Analysis, DNA
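The abstract quantifies coverage variability with the median absolute deviation (MAD), a robust spread measure that is less sensitive to outlier bins than the standard deviation. The sketch below computes the MAD of a coverage profile before and after a correction and reports the percent reduction. The bin counts are invented toy values, and the names `median_absolute_deviation`, `raw_bins`, and `corrected_bins` are assumptions for illustration; the correction method itself is not reproduced here.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)


# Toy per-bin read counts (hypothetical), before and after a coverage correction.
raw_bins = [80, 100, 130, 90, 120, 100, 70]
corrected_bins = [95, 100, 105, 98, 102, 100, 97]

mad_raw = median_absolute_deviation(raw_bins)
mad_corrected = median_absolute_deviation(corrected_bins)
reduction_pct = 100 * (mad_raw - mad_corrected) / mad_raw
print(mad_raw, mad_corrected, reduction_pct)  # 20 2 90.0
```

The 90% reduction here is a property of the toy numbers only and should not be compared with the 26.5-53.5% range reported in the abstract.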
13.
Hepatol Int ; 13(4): 490-500, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31214875

ABSTRACT

BACKGROUND: Although molecular characterization of intrahepatic cholangiocarcinoma (iCCA) has been studied recently, integrative analysis of molecular and clinical characterization has not been fully established. If molecular features of iCCA can be predicted based on clinical findings, we can move toward selecting targeted treatment. We analyzed RNA sequencing data annotated with clinicopathologic data to clarify molecular-specific clinical features and to evaluate potential therapies for molecular subtypes. METHODS: We performed next-generation RNA sequencing of 30 surgically resected iCCA from Korean patients and the clinicopathologic features were analyzed. The RNA sequences from 32 iCCA resected from US patients were used for validation. RESULTS: Patients were grouped into two subclasses on the basis of unsupervised clustering, which showed a difference in 5-year survival rates (48.5% vs 14.2%, p = 0.007) and a similar survival outcome in the US samples. In subclass B (poor prognosis), both data sets were similar in higher carcinoembryonic antigen and cancer antigen 19-9 levels, underlying cholangitis, and bile duct-type pathology; in subclass A (better prognosis), there was more frequent viral hepatitis and cholangiolar-type pathology. On pathway analysis, subclass A had enriched liver-related signatures. Subclass B had enriched inflammation-related and TP53 pathways, with more frequent KRAS mutations. CCA cell lines with similar gene expression patterns of subclass A were sensitive to gemcitabine. CONCLUSIONS: Two molecular subtypes of iCCA with distinct clinicopathological differences were identified. Knowledge of clinical and pathologic characteristics can predict molecular subtypes, and knowledge of different subtype signaling pathways may lead to more rational, targeted approaches to treatment.


Subjects
Bile Duct Neoplasms/mortality , Cholangiocarcinoma/mortality , Aged , Aged, 80 and over , Antimetabolites, Antineoplastic/therapeutic use , Bile Duct Neoplasms/drug therapy , Bile Duct Neoplasms/genetics , Bile Ducts, Intrahepatic , Cholangiocarcinoma/drug therapy , Cholangiocarcinoma/genetics , Deoxycytidine/analogs & derivatives , Deoxycytidine/therapeutic use , Female , Genes, Neoplasm/genetics , Humans , Male , Middle Aged , Mutation/genetics , Prognosis , RNA, Neoplasm/genetics , RNA, Neoplasm/metabolism , Republic of Korea/epidemiology , Retrospective Studies , United States/epidemiology , Up-Regulation , Gemcitabine
14.
Blood ; 133(26): 2776-2789, 2019 06 27.
Article in English | MEDLINE | ID: mdl-31101622

ABSTRACT

Anaplastic large cell lymphomas (ALCLs) represent a relatively common group of T-cell non-Hodgkin lymphomas (T-NHLs) that are unified by similar pathologic features but demonstrate marked genetic heterogeneity. ALCLs are broadly classified as being anaplastic lymphoma kinase (ALK)+ or ALK-, based on the presence or absence of ALK rearrangements. Exome sequencing of 62 T-NHLs identified a previously unreported recurrent mutation in the musculin gene, MSC E116K, exclusively in ALK- ALCLs. Additional sequencing for a total of 238 T-NHLs confirmed the specificity of MSC E116K for ALK- ALCL and further demonstrated that 14 of 15 mutated cases (93%) had coexisting DUSP22 rearrangements. Musculin is a basic helix-loop-helix (bHLH) transcription factor that heterodimerizes with other bHLH proteins to regulate lymphocyte development. The E116K mutation localized to the DNA binding domain of musculin and permitted formation of musculin-bHLH heterodimers but prevented their binding to authentic target sequences. Functional analysis showed MSC E116K acted in a dominant-negative fashion, reversing wild-type musculin-induced repression of MYC and cell cycle inhibition. Chromatin immunoprecipitation-sequencing and transcriptome analysis identified the cell cycle regulatory gene E2F2 as a direct transcriptional target of musculin. MSC E116K reversed E2F2-induced cell cycle arrest and promoted expression of the CD30-IRF4-MYC axis, whereas its expression was reciprocally induced by binding of IRF4 to the MSC promoter. Finally, ALCL cells expressing MSC E116K were preferentially targeted by the BET inhibitor JQ1. These findings identify a novel recurrent MSC mutation as a key driver of the CD30-IRF4-MYC axis and cell cycle progression in a unique subset of ALCLs.


Subjects
Basic Helix-Loop-Helix Transcription Factors/genetics , Lymphoma, Large-Cell, Anaplastic/genetics , Anaplastic Lymphoma Kinase/genetics , Cell Cycle/genetics , Gene Expression Regulation, Neoplastic/genetics , Humans , Mutation
15.
Gastroenterology ; 157(1): 210-226.e12, 2019 07.
Article in English | MEDLINE | ID: mdl-30878468

ABSTRACT

BACKGROUND & AIMS: The CCNE1 locus, which encodes cyclin E1, is amplified in many types of cancer cells and is activated in hepatocellular carcinomas (HCCs) from patients infected with hepatitis B virus or adeno-associated virus type 2, due to integration of the virus nearby. We investigated cell-cycle and oncogenic effects of cyclin E1 overexpression in tissues of mice. METHODS: We generated mice with doxycycline-inducible expression of Ccne1 (Ccne1T mice) and activated overexpression of cyclin E1 from age 3 weeks onward. At 14 months of age, livers were collected from mice that overexpress cyclin E1 and nontransgenic mice (controls) and analyzed for tumor burden and by histology. Mouse embryonic fibroblasts (MEFs) and hepatocytes from Ccne1T and control mice were analyzed to determine the extent to which cyclin E1 overexpression perturbs S-phase entry, DNA replication, and numbers and structures of chromosomes. Tissues from 4-month-old Ccne1T and control mice (which at that age were free of tumors) were analyzed for chromosome alterations, to investigate the mechanisms by which cyclin E1 predisposes hepatocytes to transformation. RESULTS: Ccne1T mice developed more hepatocellular adenomas and HCCs than control mice. Tumors developed only in livers of Ccne1T mice, despite high levels of cyclin E1 in other tissues. Ccne1T MEFs had defects that promoted chromosome missegregation and aneuploidy, including incomplete replication of DNA, centrosome amplification, and formation of nonperpendicular mitotic spindles. Whereas Ccne1T mice accumulated near-diploid aneuploid cells in multiple tissues and organs, polyploidization was observed only in hepatocytes, with losses and gains of whole chromosomes, DNA damage, and oxidative stress. CONCLUSIONS: Livers, but not other tissues of mice with inducible overexpression of cyclin E1, develop tumors. More hepatocytes from the cyclin E1-overexpressing mice were polyploid than from control mice, and had losses or gains of whole chromosomes, DNA damage, and oxidative stress; all of these have been observed in human HCC cells. The increased risk of HCC in patients with hepatitis B virus or adeno-associated virus type 2 infection might involve activation of cyclin E1 and its effects on chromosomes and genomes of liver cells.


Subjects
Adenoma, Liver Cell/genetics , Carcinoma, Hepatocellular/genetics , Chromosomal Instability/genetics , Cyclin E/genetics , Liver Neoplasms/genetics , Liver/metabolism , Oncogene Proteins/genetics , Adenoma, Liver Cell/pathology , Adenoma, Liver Cell/virology , Animals , Carcinoma, Hepatocellular/pathology , Carcinoma, Hepatocellular/virology , Chromosome Structures , DNA Damage/genetics , DNA Replication , Dependovirus , Fibroblasts , Hepatitis B, Chronic , Hepatocytes , Liver/pathology , Liver Neoplasms/pathology , Liver Neoplasms/virology , Liver Neoplasms, Experimental/genetics , Liver Neoplasms, Experimental/pathology , Mice , Oxidative Stress/genetics , Parvoviridae Infections , Parvovirinae , Polyploidy , S Phase Cell Cycle Checkpoints
16.
BMC Med Genomics ; 12(Suppl 1): 15, 2019 01 31.
Article in English | MEDLINE | ID: mdl-30704449

ABSTRACT

BACKGROUND: Predicting cellular responses to drugs has been a major challenge for personalized drug therapy regimens. Recent pharmacogenomic studies measured the sensitivities of heterogeneous cell lines to numerous drugs, and provided valuable data resources to develop and validate computational approaches for the prediction of drug responses. Most current approaches predict drug sensitivity by building prediction models with individual genes, which suffer from low reproducibility due to biologic variability and from difficulty in interpreting the biological relevance of novel gene-drug associations. As an alternative, pathway activity scores derived from gene expression could predict drug response of cancer cells. METHOD: In this study, pathway-based prediction models were built with four approaches inferring pathway activity in an unsupervised manner, including competitive scoring approaches (DiffRank and GSVA) and self-contained scoring approaches (PLAGE and Z-score). These unsupervised pathway activity inference approaches were applied to predict drug responses of cancer cells using data from the Cancer Cell Line Encyclopedia (CCLE). RESULTS: Our analysis on all the 24 drugs from CCLE demonstrated that pathway-based models achieved better predictions for 14 out of the 24 drugs, while taking fewer features as inputs. Further investigation indicated that pathway-based models indeed captured pathways involving drug-related genes (targets, transporters and metabolic enzymes) for the majority of drugs, whereas gene-models failed to identify these drug-related genes in most cases. Among the four approaches, competitive scoring (DiffRank and GSVA) provided more accurate predictions and captured more pathways involving drug-related genes than self-contained scoring (PLAGE and Z-Score). Detailed interpretation of top pathways from the top method (DiffRank) highlights the merit of pathway-based approaches to predict drug response by identifying pathways relevant to drug mechanisms. CONCLUSION: Taken together, pathway-based modeling with inferred pathway activity is a promising alternative to predict drug response, with the ability to easily interpret results and provide biological insights into the mechanisms of drug actions.


Subjects
Antineoplastic Agents/pharmacology , Computational Biology/methods , Cell Line, Tumor , Genes, Neoplasm/genetics , Humans , Models, Biological
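A self-contained pathway score of the Z-score kind can be sketched as follows: each gene is standardized across samples, and a sample's pathway activity is the sum of its standardized values over the pathway's genes divided by the square root of the number of genes used. This is one common formulation and an assumption here, since the abstract does not spell out the formula; the gene names, expression values, and the function name `pathway_z_scores` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def pathway_z_scores(expression, pathway_genes):
    """Return one pathway activity score per sample.

    expression: dict mapping gene name -> list of expression values,
                one per sample, all lists of equal length.
    pathway_genes: iterable of gene names belonging to the pathway.
    Genes that are missing or have zero variance are skipped.
    """
    totals = None
    used = 0
    for gene in pathway_genes:
        values = expression.get(gene)
        if values is None:
            continue
        mu, sd = mean(values), stdev(values)
        if sd == 0:
            continue  # a constant gene carries no information across samples
        if totals is None:
            totals = [0.0] * len(values)
        for i, v in enumerate(values):
            totals[i] += (v - mu) / sd
        used += 1
    if used == 0:
        raise ValueError("no usable pathway genes found in expression data")
    return [t / sqrt(used) for t in totals]


# Hypothetical expression matrix: three samples, three genes.
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [2.0, 4.0, 6.0],
    "GENE_C": [5.0, 5.0, 5.0],  # zero variance, skipped
}
scores = pathway_z_scores(toy_expression, ["GENE_A", "GENE_B", "GENE_C"])
print(scores)
```

The resulting per-sample scores would then serve as input features to a drug-response model in place of individual gene values, which is why such models can use fewer features.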
17.
BMC Bioinformatics ; 19(Suppl 20): 508, 2018 Dec 21.
Article in English | MEDLINE | ID: mdl-30577744

ABSTRACT

BACKGROUND: With applications in cancer, drug metabolism, and disease etiology, understanding structural variation in the human genome is critical in advancing the thrusts of individualized medicine. However, structural variants (SVs) remain challenging to detect with high sensitivity using short read sequencing technologies. This problem is exacerbated when considering complex SVs comprised of multiple overlapping or nested rearrangements. Longer reads, such as those from Pacific Biosciences platforms, often span multiple breakpoints of such events, and thus provide a way to unravel small-scale complexities in SVs with higher confidence. RESULTS: We present CORGi (COmplex Rearrangement detection with Graph-search), a method for the detection and visualization of complex local genomic rearrangements. This method leverages the ability of long reads to span multiple breakpoints to untangle SVs that appear very complicated with respect to a reference genome. We validated our approach against both simulated long reads and real data from two long read sequencing technologies. We demonstrate the ability of our method to identify breakpoints inserted in synthetic data with high accuracy, and the ability to detect and plot SVs from NA12878 germline, achieving 88.4% concordance between the two sets of sequence data. The patterns of complexity we find in many NA12878 SVs match known mechanisms associated with DNA replication and structural variant formation, and highlight the ability of our method to automatically label complex SVs with an intuitive combination of adjacent or overlapping reference transformations. CONCLUSIONS: CORGi is a method for interrogating genomic regions suspected to contain local rearrangements using long reads. Using pairwise alignments and graph search, CORGi produces labels and visualizations for local SVs of arbitrary complexity.


Subjects
Genomic Structural Variation , Sequence Analysis, DNA/methods , Computer Simulation , Gene Duplication , Genome, Human , Humans , Sequence Alignment , Software
18.
BMC Genomics ; 19(1): 841, 2018 Nov 27.
Article in English | MEDLINE | ID: mdl-30482155

ABSTRACT

BACKGROUND: Copy number alterations (CNAs) are defined as somatic gains or losses of DNA regions. CNA profiles may provide a fingerprint specific to a tumor type or tumor grade. Low-coverage sequencing for reporting CNAs has recently gained interest since it was successfully translated into clinical applications. Ovarian serous carcinomas can be classified into two largely mutually exclusive grades, low grade and high grade, based on their histologic features. Grade classification based on genomics may provide valuable clues on how best to manage these patients in the clinic. Using ovarian serous carcinomas as a case study, we explore combining CNA reporting from low-coverage sequencing with machine learning techniques to stratify tumor biospecimens of different grades. RESULTS: We have developed a data-driven methodology for tumor classification using the CNA profiles reported by low-coverage sequencing. The proposed method, called Bag-of-Segments, is used to summarize fixed-length CNA features predictive of tumor grade. These features are further processed by machine learning techniques to obtain classification models. High accuracy is obtained for classifying ovarian serous carcinoma into high and low grades in leave-one-out cross-validation experiments. Models that are only weakly influenced by sequence coverage and sample purity can also be built, which would be of higher relevance for clinical applications. The patterns captured by Bag-of-Segments features correlate with current clinical knowledge: low-grade ovarian tumors are related to aneuploidy events associated with mitotic errors, while high-grade ovarian tumors are induced by DNA repair gene malfunction. CONCLUSIONS: The proposed data-driven method obtains high accuracy with various parametrizations for the ovarian serous carcinoma study, indicating that it has good generalization potential towards other CNA classification problems. This method could be applied to the more difficult task of classifying ovarian serous carcinomas with ambiguous histology or those in which low-grade tumor co-exists with high-grade tumor. The closer genomic relationship of these tumor samples to low or high grade may provide important clinical value.
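
The leave-one-out cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' Bag-of-Segments implementation: the feature vectors, the labels, and the nearest-centroid classifier below are hypothetical stand-ins chosen only to show how each sample is held out in turn and predicted from the rest.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def nearest_centroid_predict(train_x, train_y, query):
    """Return the label whose class centroid lies closest to `query`."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        # Column-wise mean of all training vectors with this label.
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(centroids, key=lambda label: dist(centroids[label], query))


def leave_one_out_accuracy(features, labels):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical fixed-length feature vectors (NOT data from the study).
features = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.25],
            [0.90, 0.80], [0.80, 0.90], [0.85, 0.75]]
labels = ["low", "low", "low", "high", "high", "high"]

accuracy = leave_one_out_accuracy(features, labels)
print(f"Leave-one-out accuracy: {accuracy:.2f}")
```

Leave-one-out is a common choice when the number of tumor samples is small, because every sample serves as a test case exactly once while the remaining samples are used for training.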


Subjects
Cystadenocarcinoma, Serous/classification , DNA Copy Number Variations , Data Science/methods , Genome, Human , Ovarian Neoplasms/classification , Cystadenocarcinoma, Serous/genetics , Cystadenocarcinoma, Serous/pathology , Female , Humans , Neoplasm Grading , Ovarian Neoplasms/genetics , Ovarian Neoplasms/pathology , Whole Genome Sequencing
19.
BMC Med Genomics ; 11(Suppl 3): 67, 2018 Sep 14.
Article in English | MEDLINE | ID: mdl-30255803

ABSTRACT

BACKGROUND: RNA-seq is the most commonly used sequencing application. Not only does it measure gene expression, but it is also an excellent medium for detecting important variants such as single nucleotide variants (SNVs), insertions/deletions (indels), and fusion transcripts. However, detecting these variants from RNA-seq is challenging and complex. Here we describe a sensitive and accurate analytical pipeline that detects various mutations at once for translational precision medicine. METHODS: The pipeline incorporates the most sensitive aligners for indels in RNA-seq, best practices for data preprocessing and variant calling, and STAR-Fusion for chimeric transcripts. Variants/mutations are annotated, and key genes can be extracted for further investigation and clinical action. Three datasets were used to evaluate the performance of the pipeline for SNVs, indels, and fusion transcripts. RESULTS: For the well-defined variants from NA12878 by the GIAB project, sensitivities of about 95% and 80% were obtained for SNVs and indels, respectively, in matching RNA-seq. Comparison with other variant-specific tools showed good performance of the pipeline. For the lung cancer dataset with 41 known oncogenic mutations, 39 were detected by the pipeline with the STAR aligner and all 41 with the GSNAP aligner. An actionable EML4-ALK fusion was also detected in one of the tumors, which also showed outlier ALK expression. For 9 fusions spiked into RNA-seq libraries at different concentrations, the pipeline detected all of them in unfiltered results, although some at very low concentrations may be missed when filtering is applied. CONCLUSIONS: The new workflow is an accurate and comprehensive mutation profiler for RNA-seq data. Key or actionable mutations are reliably detected from RNA-seq, which makes it a practical alternative source for personalized medicine.
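
As a worked illustration of the sensitivity figures, the sketch below applies the standard definition, sensitivity = TP / (TP + FN), to the lung cancer counts quoted in the abstract (39 of 41 known mutations with STAR, 41 of 41 with GSNAP). The function name is an assumption introduced here and is not part of the published pipeline.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of known variants that were detected: TP / (TP + FN)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("sensitivity is undefined without any known variants")
    return true_positives / total


# Counts taken from the abstract: 41 known mutations in the lung cancer dataset.
star_sensitivity = sensitivity(true_positives=39, false_negatives=2)
gsnap_sensitivity = sensitivity(true_positives=41, false_negatives=0)

print(f"STAR:  {star_sensitivity:.1%}")
print(f"GSNAP: {gsnap_sensitivity:.1%}")
```

Sensitivity alone says nothing about false positives, so a full evaluation would also report precision or specificity against the same truth set.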


Subjects
Biomarkers, Tumor/genetics , High-Throughput Nucleotide Sequencing/methods , INDEL Mutation , Lung Neoplasms/genetics , Polymorphism, Single Nucleotide , Precision Medicine , Sequence Analysis, RNA/methods , Adenocarcinoma/genetics , Humans , Software
20.
BMC Bioinformatics ; 19(1): 271, 2018 Jul 17.
Article in English | MEDLINE | ID: mdl-30016933

ABSTRACT

BACKGROUND: Transfer of genetic material from microbes or viruses into the host genome is known as horizontal gene transfer (HGT). The integration of viruses into the human genome is associated with multiple cancers, and these integrations can now be detected using next-generation sequencing methods such as whole-genome sequencing and RNA sequencing. RESULTS: We designed a novel computational workflow, HGT-ID, to identify the integration of viruses into the human genome using sequencing data. The HGT-ID workflow follows a four-step procedure: i) pre-processing of unaligned reads, ii) virus detection using a subtraction approach, iii) identification of virus integration sites using discordant and soft-clipped reads, and iv) prioritization of HGT candidates through a scoring function. Annotation and visualization of the events, as well as primer design for experimental validation, are also provided in the final report. We evaluated the tool's performance on well-understood cervical cancer samples. The HGT-ID workflow accurately detected known human papillomavirus (HPV) integration sites with high sensitivity and specificity compared to previous HGT methods. We applied HGT-ID to The Cancer Genome Atlas (TCGA) whole-genome sequencing (WGS) data from liver tumor-normal pairs. Multiple hepatitis B virus (HBV) integration sites were identified in TCGA liver samples and confirmed by HGT-ID using the RNA-Seq data from the matched liver pairs. This shows the applicability of the method to both data types and provides cross-validation of the HGT events in liver samples. We also processed WGS data from 220 breast tumors through the workflow; however, no HGT events were detected in those samples. CONCLUSIONS: HGT-ID is a novel computational workflow to detect the integration of viruses in the human genome using sequencing data. It is fast and accurate, with functions such as prioritization, annotation, visualization, and primer design for future validation of HGTs. The HGT-ID workflow is released under the MIT License and available at http://kalarikrlab.org/Software/HGT-ID.html .
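
Step iii of the procedure relies on soft-clipped reads, whose clipped ends mark candidate integration breakpoints. The sketch below shows one generic way to derive such positions from a SAM-style alignment position and CIGAR string. It is not HGT-ID's actual code: the function name and the `min_clip` threshold are assumptions made for illustration.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference sequence.
REF_CONSUMING = set("MDN=X")


def soft_clip_breakpoints(pos, cigar, min_clip=20):
    """Return candidate breakpoints implied by soft clips of a single read.

    pos      -- 1-based leftmost aligned reference position (SAM POS field)
    cigar    -- CIGAR string of the alignment
    min_clip -- hypothetical minimum soft-clip length worth reporting
    """
    # Hard clips carry no sequence, so ignore them when looking at read ends.
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    breakpoints = []
    if not ops:
        return breakpoints
    # A leading soft clip places the breakpoint at the first aligned base.
    if ops[0][1] == "S" and ops[0][0] >= min_clip:
        breakpoints.append(("left", pos))
    # A trailing soft clip places the breakpoint at the last aligned base.
    if len(ops) > 1 and ops[-1][1] == "S" and ops[-1][0] >= min_clip:
        ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
        breakpoints.append(("right", pos + ref_len - 1))
    return breakpoints


print(soft_clip_breakpoints(1000, "60M40S"))          # clip on the right end
print(soft_clip_breakpoints(1000, "40S60M"))          # clip on the left end
print(soft_clip_breakpoints(500, "30S50M2D18M25S"))   # clips on both ends
```

In practice, breakpoints supported by several reads, and by discordant read pairs in which one mate maps to the human genome and the other to a viral genome, would be clustered before any scoring step.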


Subjects
Gene Transfer, Horizontal/genetics , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Virus Integration/genetics , Algorithms , Base Sequence , Breast Neoplasms/virology , Cell Line, Tumor , Computer Simulation , Female , Humans , ROC Curve , Software , Whole Genome Sequencing , Workflow