Results 1 - 12 of 12
1.
Blood ; 137(20): 2800-2816, 2021 05 20.
Article in English | MEDLINE | ID: mdl-33206936

ABSTRACT

The transformation of chronic lymphocytic leukemia (CLL) to high-grade B-cell lymphoma is known as Richter syndrome (RS), a rare event with dismal prognosis. In this study, we conducted whole-genome sequencing (WGS) of paired circulating CLL (PB-CLL) and RS biopsies (tissue-RS) from 17 patients recruited into a clinical trial (CHOP-O). We found that tissue-RS was enriched for mutations in poor-risk CLL drivers and genes in the DNA damage response (DDR) pathway. In addition, we identified genomic aberrations not previously implicated in RS, including the protein tyrosine phosphatase receptor (PTPRD) and tumor necrosis factor receptor-associated factor 3 (TRAF3). In the noncoding genome, we discovered activation-induced cytidine deaminase-related and unrelated kataegis in tissue-RS affecting regulatory regions of key immune-regulatory genes. These include BTG2, CXCR4, NFATC1, PAX5, NOTCH-1, SLC44A5, FCRL3, SELL, TNIP2, and TRIM13. Furthermore, differences between the global mutation signatures of pairs of PB-CLL and tissue-RS samples implicate DDR as the dominant mechanism driving transformation. Pathway-based clonal deconvolution analysis showed that genes in the MAPK and DDR pathways demonstrate high clonal-expansion probability. Direct comparison of nodal-CLL and tissue-RS pairs from an independent cohort confirmed differential expression of the same pathways by RNA expression profiling. Our integrated analysis of WGS and RNA expression data significantly extends previous targeted approaches, which were limited by the lack of germline samples, and it facilitates the identification of novel genomic correlates implicated in RS transformation, which could be targeted therapeutically. Our results inform the future selection of investigative agents for a UK clinical platform study. This trial was registered at www.clinicaltrials.gov as #NCT03899337.


Subjects
Clonal Evolution/genetics , Neoplastic Gene Expression Regulation/genetics , B-Cell Chronic Lymphocytic Leukemia/pathology , Diffuse Large B-Cell Lymphoma/pathology , Neoplasm RNA/genetics , Transcriptome , Aged , Aged 80 and over , Humanized Monoclonal Antibodies/therapeutic use , Antineoplastic Combined Chemotherapy Protocols/administration & dosage , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Base Sequence , Clone Cells/pathology , Combined Modality Therapy , Cyclophosphamide/administration & dosage , DNA Repair , Disease Progression , Doxorubicin/administration & dosage , Female , Gene Regulatory Networks , Neoplasm Genes , Humans , B-Cell Chronic Lymphocytic Leukemia/genetics , Diffuse Large B-Cell Lymphoma/drug therapy , Diffuse Large B-Cell Lymphoma/genetics , Male , Middle Aged , Mutation , Neoplasm Proteins/genetics , Prednisone/administration & dosage , Prospective Studies , Neoplasm RNA/biosynthesis , Syndrome , Vincristine/administration & dosage , Whole Genome Sequencing
2.
Bioinformatics ; 37(2): 147-154, 2021 04 19.
Article in English | MEDLINE | ID: mdl-32722772

ABSTRACT

MOTIVATION: Tumours are composed of distinct cancer cell populations (clones), which continuously adapt to their local micro-environment. Standard methods for clonal deconvolution seek to identify groups of mutations and estimate the prevalence of each group in the tumour, while considering its purity and copy number profile. These methods have been applied on cross-sectional data and on longitudinal data after discarding information on the timing of sample collection. Two key questions are how can we incorporate such information in our analyses and is there any benefit in doing so? RESULTS: We developed a clonal deconvolution method, which incorporates explicitly the temporal spacing of longitudinally sampled tumours. By merging a Dirichlet Process Mixture Model with Gaussian Process priors and using as input a sequence of several sparsely collected samples, our method can reconstruct the temporal profile of the abundance of any mutation cluster supported by the data as a continuous function of time. We benchmarked our method on whole genome, whole exome and targeted sequencing data from patients with chronic lymphocytic leukaemia, on liquid biopsy data from a patient with melanoma and on synthetic data and we found that incorporating information on the timing of tissue collection improves model performance, as long as data of sufficient volume and complexity are available for estimating free model parameters. Thus, our approach is particularly useful when collecting a relatively long sequence of tumour samples is feasible, as in liquid cancers (e.g. leukaemia) and liquid biopsies. AVAILABILITY AND IMPLEMENTATION: The statistical methodology presented in this paper is freely available at github.com/dvav/clonosGP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
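
As a rough illustration of what "considering purity and copy number" means when estimating the prevalence of a mutation, the sketch below converts a variant allele fraction (VAF) into a cancer cell fraction (CCF) using the standard mixing relation between tumour and normal cells. It is not the clonosGP model (which combines a Dirichlet Process Mixture Model with Gaussian Process priors); the function name, parameter names and default values are hypothetical and chosen only for this example.

```python
def cancer_cell_fraction(vaf, purity, tumour_cn=2, multiplicity=1, normal_cn=2):
    """Convert a variant allele fraction into a cancer cell fraction.

    Assumes (for illustration) that the mutation sits on `multiplicity`
    copies in every mutated tumour cell and is absent from normal cells.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must lie in (0, 1]")
    # Average number of copies of the locus per cell in the mixed sample.
    mean_copies = purity * tumour_cn + (1.0 - purity) * normal_cn
    ccf = vaf * mean_copies / (purity * multiplicity)
    # Sampling noise can push the raw value above 1, so cap it.
    return min(ccf, 1.0)


# A heterozygous mutation at VAF 0.2 in a diploid region of an 80% pure sample.
example_ccf = cancer_cell_fraction(vaf=0.2, purity=0.8)
```

A clonal deconvolution method would then cluster such per-mutation estimates; a longitudinal method like the one described above additionally models how each cluster's abundance changes between sampling times.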


Subjects
High-Throughput Nucleotide Sequencing , Neoplasms , Cross-Sectional Studies , Exome , Humans , Mutation , Neoplasms/genetics , Software , Tumor Microenvironment , Whole Genome Sequencing
3.
Bioinformatics ; 33(19): 3058-3064, 2017 Oct 01.
Article in English | MEDLINE | ID: mdl-28575251

ABSTRACT

MOTIVATION: The identification of genetic variants influencing gene expression (known as expression quantitative trait loci or eQTLs) is important in unravelling the genetic basis of complex traits. Detecting multiple eQTLs simultaneously in a population based on paired DNA-seq and RNA-seq assays employs two competing types of models: models which rely on appropriate transformations of RNA-seq data (and are powered by a mature mathematical theory), or count-based models, which represent digital gene expression explicitly, thus rendering such transformations unnecessary. The latter constitutes an immensely popular methodology, which is however plagued by mathematical intractability. RESULTS: We develop tractable count-based models, which are amenable to efficient estimation through the introduction of latent variables and the appropriate application of recent statistical theory in a sparse Bayesian modelling framework. Furthermore, we examine several transformation methods for RNA-seq read counts and we introduce arcsin, logit and Laplace smoothing as preprocessing steps for transformation-based models. Using natural and carefully simulated data from the 1000 Genomes and gEUVADIS projects, we benchmark both approaches under a variety of scenarios, including the presence of noise and violation of basic model assumptions. We demonstrate that an arcsin transformation of Laplace-smoothed data is at least as good as state-of-the-art models, particularly at small samples. Furthermore, we show that an over-dispersed Poisson model is comparable to the celebrated Negative Binomial, but much easier to estimate. These results provide strong support for transformation-based versus count-based (particularly Negative-Binomial-based) models for eQTL mapping. AVAILABILITY AND IMPLEMENTATION: All methods are implemented in the free software eQTLseq: https://github.com/dvav/eQTLseq. CONTACT: dimitris.vavoulis@well.ox.ac.uk. 
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
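
The preprocessing named in the abstract, an arcsin transformation of Laplace-smoothed read counts, can be sketched as follows. The abstract does not give the exact formulas, so this example assumes the common textbook forms: add-one smoothing of each sample's counts into proportions, followed by the arcsine square-root transform. The function names and the pseudocount value are hypothetical and are not taken from eQTLseq.

```python
import math


def laplace_smoothed_proportions(counts, pseudocount=1.0):
    """Turn one sample's read counts into strictly positive proportions."""
    total = sum(counts) + pseudocount * len(counts)
    return [(c + pseudocount) / total for c in counts]


def arcsin_transform(proportions):
    """Arcsine square-root transform, mapping (0, 1) into (0, pi/2)."""
    return [math.asin(math.sqrt(p)) for p in proportions]


# Read counts for three genes in a single sample (made-up numbers).
sample_counts = [0, 10, 90]
smoothed = laplace_smoothed_proportions(sample_counts)
transformed = arcsin_transform(smoothed)
```

Smoothing keeps zero counts away from the boundary of the unit interval, which matters for transforms such as the logit that are undefined at 0 and 1.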


Subjects
Gene Expression , High-Throughput Nucleotide Sequencing/methods , Statistical Models , Quantitative Trait Loci , RNA Sequence Analysis/methods , Bayes Theorem , Genetic Variation , Software
4.
Genet Med ; 20(10): 1196-1205, 2018 10.
Article in English | MEDLINE | ID: mdl-29388947

ABSTRACT

PURPOSE: Fresh-frozen (FF) tissue is the optimal source of DNA for whole-genome sequencing (WGS) of cancer patients. However, it is not always available, limiting the widespread application of WGS in clinical practice. We explored the viability of using formalin-fixed, paraffin-embedded (FFPE) tissues, available routinely for cancer patients, as a source of DNA for clinical WGS. METHODS: We conducted a prospective study using DNAs from matched FF, FFPE, and peripheral blood germ-line specimens collected from 52 cancer patients (156 samples) following routine diagnostic protocols. We compared somatic variants detected in FFPE and matching FF samples. RESULTS: We found the single-nucleotide variant agreement reached 71% across the genome and somatic copy-number alterations (CNAs) detection from FFPE samples was suboptimal (0.44 median correlation with FF) due to nonuniform coverage. CNA detection was improved significantly with lower reverse crosslinking temperature in FFPE DNA extraction (80 °C or 65 °C depending on the methods). Our final data showed somatic variant detection from FFPE for clinical decision making is possible. We detected 98% of clinically actionable variants (including 30/31 CNAs). CONCLUSION: We present the first prospective WGS study of cancer patients using FFPE specimens collected in a routine clinical environment proving WGS can be applied in the clinic.


Subjects
DNA Copy Number Variations/genetics , Human Genome/genetics , Neoplasms/genetics , Whole Genome Sequencing/methods , Decision Making , Female , Humans , Male , Neoplasms/blood , Neoplasms/pathology , Paraffin Embedding , Single Nucleotide Polymorphism/genetics
5.
PLoS Med ; 14(2): e1002230, 2017 02.
Article in English | MEDLINE | ID: mdl-28196074

ABSTRACT

BACKGROUND: Single gene tests to predict whether cancers respond to specific targeted therapies are performed increasingly often. Advances in sequencing technology, collectively referred to as next generation sequencing (NGS), mean the entire cancer genome or parts of it can now be sequenced at speed with increased depth and sensitivity. However, translation of NGS into routine cancer care has been slow. Healthcare stakeholders are unclear about the clinical utility of NGS and are concerned it could be an expensive addition to cancer diagnostics, rather than an affordable alternative to single gene testing. METHODS AND FINDINGS: We validated a 46-gene hotspot cancer panel assay allowing multiple gene testing from small diagnostic biopsies. From 1 January 2013 to 31 December 2013, solid tumour samples (including non-small-cell lung carcinoma [NSCLC], colorectal carcinoma, and melanoma) were sequenced in the context of the UK National Health Service from 351 consecutively submitted prospective cases for which treating clinicians thought the patient had potential to benefit from more extensive genetic analysis. Following histological assessment, tumour-rich regions of formalin-fixed paraffin-embedded (FFPE) sections underwent macrodissection, DNA extraction, NGS, and analysis using a pipeline centred on Torrent Suite software. With a median turnaround time of seven working days, an integrated clinical report was produced indicating the variants detected, including those with potential diagnostic, prognostic, therapeutic, or clinical trial entry implications. Accompanying phenotypic data were collected, and a detailed cost analysis of the panel compared with single gene testing was undertaken to assess affordability for routine patient care. Panel sequencing was successful for 97% (342/351) of tumour samples in the prospective cohort and showed 100% concordance with known mutations (detected using cobas assays). 
At least one mutation was identified in 87% (296/342) of tumours. A locally actionable mutation (i.e., available targeted treatment or clinical trial) was identified in 122/351 patients (35%). Forty patients received targeted treatment, in 22/40 (55%) cases solely due to use of the panel. Examination of published data on the potential efficacy of targeted therapies showed theoretically actionable mutations (i.e., mutations for which targeted treatment was potentially appropriate) in 66% (71/107) and 39% (41/105) of melanoma and NSCLC patients, respectively. At a cost of £339 (US$449) per patient, the panel was less expensive locally than performing more than two or three single gene tests. Study limitations include the use of FFPE samples, which do not always provide high-quality DNA, and the use of "real world" data: submission of cases for sequencing did not always follow clinical guidelines, meaning that when mutations were detected, patients were not always eligible for targeted treatments on clinical grounds. CONCLUSIONS: This study demonstrates that more extensive tumour sequencing can identify mutations that could improve clinical decision-making in routine cancer care, potentially improving patient outcomes, at an affordable level for healthcare providers.
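
The percentages quoted in this abstract can be recomputed directly from the counts it reports. The short check below does only that arithmetic; the dictionary labels are paraphrases introduced for this example, while every numerator, denominator and percentage is copied from the abstract.

```python
# (numerator, denominator, percentage as reported in the abstract)
reported = {
    "panel sequencing successful": (342, 351, 97),
    "at least one mutation identified": (296, 342, 87),
    "locally actionable mutation": (122, 351, 35),
    "targeted treatment solely due to panel": (22, 40, 55),
    "theoretically actionable, melanoma": (71, 107, 66),
    "theoretically actionable, NSCLC": (41, 105, 39),
}


def percent(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


recomputed = {label: percent(n, d) for label, (n, d, _) in reported.items()}
consistent = all(recomputed[label] == reported[label][2] for label in reported)
```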


Subjects
Non-Small-Cell Lung Carcinoma/diagnosis , Colorectal Neoplasms/diagnosis , Genomics , Melanoma/diagnosis , Pathology/methods , Pathology/standards , Adolescent , Adult , Aged , Aged 80 and over , Non-Small-Cell Lung Carcinoma/economics , Non-Small-Cell Lung Carcinoma/genetics , Non-Small-Cell Lung Carcinoma/therapy , Child , Clinical Decision-Making , Colorectal Neoplasms/economics , Colorectal Neoplasms/genetics , Colorectal Neoplasms/therapy , Female , High-Throughput Nucleotide Sequencing , Humans , Male , Melanoma/economics , Melanoma/genetics , Melanoma/therapy , Middle Aged , National Health Programs , Prospective Studies , Retrospective Studies , United Kingdom , Young Adult
6.
Nucleic Acids Res ; 43(Database issue): D227-33, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25414345

ABSTRACT

We present updates to the SUPERFAMILY 1.75 (http://supfam.org) online resource and protein sequence collection. The hidden Markov model library that provides sequence homology to SCOP structural domains remains unchanged at version 1.75. In the last 4 years SUPERFAMILY has more than doubled its holding of curated complete proteomes over all cellular life, from 1400 proteomes reported previously in 2010 up to 3258 at present. Outside of the main sequence collection, SUPERFAMILY continues to provide domain annotation for sequences provided by other resources such as: UniProt, Ensembl, PDB, much of JGI Phytozome and selected subcollections of NCBI RefSeq. Despite this growth in data volume, SUPERFAMILY now provides users with an expanded and daily updated phylogenetic tree of life (sTOL). This tree is built with genomic-scale domain annotation data as before, but constantly updated when new species are introduced to the sequence library. Our Gene Ontology and other functional and phenotypic annotations previously reported have stood up to critical assessment by the function prediction community. We have now introduced these data in an integrated manner online at the level of an individual sequence, and--in the case of whole genomes--with enrichment analysis against a taxonomically defined background.


Subjects
Protein Databases , Tertiary Protein Structure , Gene Ontology , Molecular Sequence Annotation , Phylogeny , Proteins/classification , Proteins/genetics , Proteome/chemistry , Protein Sequence Analysis
8.
PLoS Comput Biol ; 8(3): e1002401, 2012.
Article in English | MEDLINE | ID: mdl-22396632

ABSTRACT

Traditional approaches to the problem of parameter estimation in biophysical models of neurons and neural networks usually adopt a global search algorithm (for example, an evolutionary algorithm), often in combination with a local search method (such as gradient descent) in order to minimize the value of a cost function, which measures the discrepancy between various features of the available experimental data and model output. In this study, we approach the problem of parameter estimation in conductance-based models of single neurons from a different perspective. By adopting a hidden-dynamical-systems formalism, we expressed parameter estimation as an inference problem in these systems, which can then be tackled using a range of well-established statistical inference methods. The particular method we used was Kitagawa's self-organizing state-space model, which was applied on a number of Hodgkin-Huxley-type models using simulated or actual electrophysiological data. We showed that the algorithm can be used to estimate a large number of parameters, including maximal conductances, reversal potentials, kinetics of ionic currents, measurement and intrinsic noise, based on low-dimensional experimental data and sufficiently informative priors in the form of pre-defined constraints imposed on model parameters. The algorithm remained operational even when very noisy experimental data were used. Importantly, by combining the self-organizing state-space model with an adaptive sampling algorithm akin to the Covariance Matrix Adaptation Evolution Strategy, we achieved a significant reduction in the variance of parameter estimates. The algorithm did not require the explicit formulation of a cost function and it was straightforward to apply on compartmental models and multiple data sets. 
Overall, the proposed methodology is particularly suitable for resolving high-dimensional inference problems based on noisy electrophysiological data and, therefore, a potentially useful tool in the construction of biophysical neuron models.
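
The idea of treating parameter estimation as inference in a hidden dynamical system can be sketched with a toy example in the spirit of a self-organizing state-space model: the unknown parameter is appended to the hidden state and a particle filter tracks both together. The sketch below is a deliberate simplification and not the authors' implementation. It assumes a passive leaky membrane with one unknown conductance rather than a Hodgkin-Huxley-type model, it uses a fixed random-walk jitter on the parameter rather than the adaptive sampling scheme described in the abstract, and every name and numerical value is hypothetical.

```python
import math
import random


def simulate(g_true, steps, rng, dt=0.1, e_rev=-65.0, current=10.0,
             process_sd=0.2, obs_sd=1.0):
    """Simulate noisy voltage observations from a leaky membrane."""
    v, observations = e_rev, []
    for _ in range(steps):
        v += dt * (-g_true * (v - e_rev) + current) + rng.gauss(0.0, process_sd)
        observations.append(v + rng.gauss(0.0, obs_sd))
    return observations


def estimate_conductance(observations, rng, n_particles=500, dt=0.1,
                         e_rev=-65.0, current=10.0, process_sd=0.2,
                         obs_sd=1.0, g_bounds=(0.1, 2.0), g_jitter=0.01):
    """Bootstrap particle filter over the augmented state (voltage, conductance)."""
    lo, hi = g_bounds
    # The uniform draw over g_bounds plays the role of a constraint-style prior.
    particles = [(e_rev + rng.gauss(0.0, 1.0), rng.uniform(lo, hi))
                 for _ in range(n_particles)]
    for y in observations:
        moved, log_w = [], []
        for v, g in particles:
            # The parameter follows a small random walk, clipped to its bounds.
            g = min(hi, max(lo, g + rng.gauss(0.0, g_jitter)))
            v += dt * (-g * (v - e_rev) + current) + rng.gauss(0.0, process_sd)
            moved.append((v, g))
            # Gaussian observation log-likelihood, up to an additive constant.
            log_w.append(-0.5 * ((y - v) / obs_sd) ** 2)
        top = max(log_w)  # subtract the maximum to avoid numerical underflow
        weights = [math.exp(lw - top) for lw in log_w]
        particles = rng.choices(moved, weights=weights, k=n_particles)
    # Posterior mean of the conductance after the final observation.
    return sum(g for _, g in particles) / n_particles


rng = random.Random(0)
data = simulate(g_true=0.5, steps=300, rng=rng)
g_hat = estimate_conductance(data, rng)
```

No cost function is written down anywhere in this sketch: the observation model alone determines the particle weights, which is the property the abstract highlights.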


Subjects
Action Potentials/physiology , Algorithms , Neurological Models , Statistical Models , Neurons/physiology , Animals , Computer Simulation , Humans
9.
Genome Med ; 15(1): 94, 2023 11 09.
Article in English | MEDLINE | ID: mdl-37946251

ABSTRACT

BACKGROUND: Whole genome sequencing is increasingly being used for the diagnosis of patients with rare diseases. However, the diagnostic yields of many studies, particularly those conducted in a healthcare setting, are often disappointingly low, at 25-30%. This is in part because although entire genomes are sequenced, analysis is often confined to in silico gene panels or coding regions of the genome. METHODS: We undertook WGS on a cohort of 122 unrelated rare disease patients and their relatives (300 genomes) who had been pre-screened by gene panels or arrays. Patients were recruited from a broad spectrum of clinical specialties. We applied a bioinformatics pipeline that would allow comprehensive analysis of all variant types. We combined established bioinformatics tools for phenotypic and genomic analysis with our novel algorithms (SVRare, ALTSPLICE and GREEN-DB) to detect and annotate structural, splice site and non-coding variants. RESULTS: Our diagnostic yield was 43/122 cases (35%), although 47/122 cases (39%) were considered solved when novel candidate genes with supporting functional data were taken into account. Structural, splice site and deep intronic variants contributed to 20/47 (43%) of our solved cases. Five genes that are novel, or were novel at the time of discovery, were identified, whilst a further three genes are putative novel disease genes with evidence of causality. We identified variants of uncertain significance in a further fourteen candidate genes. The phenotypic spectrum associated with RMND1 was expanded to include polymicrogyria. Two patients with secondary findings in FBN1 and KCNQ1 were confirmed to have previously unidentified Marfan and long QT syndromes, respectively, and were referred for further clinical interventions. Clinical diagnoses were changed in six patients and treatment adjustments made for eight individuals, which for five patients was considered life-saving. 
CONCLUSIONS: Genome sequencing is increasingly being considered as a first-line genetic test in routine clinical settings and can make a substantial contribution to rapidly identifying a causal aetiology for many patients, shortening their diagnostic odyssey. We have demonstrated that structural, splice site and intronic variants make a significant contribution to diagnostic yield and that comprehensive analysis of the entire genome is essential to maximise the value of clinical genome sequencing.


Subjects
Genetic Variation , Rare Diseases , Humans , Rare Diseases/diagnosis , Rare Diseases/genetics , Whole Genome Sequencing , Genetic Testing , Mutation , Cell Cycle Proteins
10.
Nat Genet ; 54(11): 1675-1689, 2022 11.
Article in English | MEDLINE | ID: mdl-36333502

ABSTRACT

The value of genome-wide over targeted driver analyses for predicting clinical outcomes of cancer patients is debated. Here, we report the whole-genome sequencing of 485 chronic lymphocytic leukemia patients enrolled in clinical trials as part of the United Kingdom's 100,000 Genomes Project. We identify an extended catalog of recurrent coding and noncoding genetic mutations that represents a source for future studies and provide the most complete high-resolution map of structural variants, copy number changes and global genome features including telomere length, mutational signatures and genomic complexity. We demonstrate the relationship of these features with clinical outcome and show that integration of 186 distinct recurrent genomic alterations defines five genomic subgroups that associate with response to therapy, refining conventional outcome prediction. While requiring independent validation, our findings highlight the potential of whole-genome sequencing to inform future risk stratification in chronic lymphocytic leukemia.


Subjects
B-Cell Chronic Lymphocytic Leukemia , Humans , B-Cell Chronic Lymphocytic Leukemia/genetics , Whole Genome Sequencing , Mutation , Genomics , Prognosis
11.
Methods Mol Biol ; 2082: 123-146, 2020.
Article in English | MEDLINE | ID: mdl-31849012

ABSTRACT

The discovery of genomic polymorphisms influencing gene expression (also known as expression quantitative trait loci or eQTLs) can be formulated as a sparse Bayesian multivariate/multiple regression problem. An important aspect in the development of such models is the implementation of bespoke inference methodologies, a process which can become quite laborious, when multiple candidate models are being considered. We describe automatic, black-box inference in such models using Stan, a popular probabilistic programming language. The utilization of systems like Stan can facilitate model prototyping and testing, thus accelerating the data modeling process. The code described in this chapter can be found at https://github.com/dvav/eQTLBookChapter .


Subjects
Bayes Theorem , Chromosome Mapping , Computational Biology/methods , Gene Expression , Quantitative Trait Loci , Software , Algorithms , Gene Expression Profiling/methods , Single Nucleotide Polymorphism , Programming Languages
12.
Genome Biol ; 16: 39, 2015 Feb 20.
Article in English | MEDLINE | ID: mdl-25853652

ABSTRACT

We present a statistical methodology, DGEclust, for differential expression analysis of digital expression data. Our method treats differential expression as a form of clustering, thus unifying these two concepts. Furthermore, it simultaneously addresses the problem of how many clusters are supported by the data and uncertainty in parameter estimation. DGEclust successfully identifies differentially expressed genes under a number of different scenarios, maintaining a low error rate and an excellent control of its false discovery rate with reasonable computational requirements. It is formulated to perform particularly well on low-replicated data and be applicable to multi-group data. DGEclust is available at http://dvav.github.io/dgeclust/.


Subjects
Gene Expression Profiling/methods , Gene Expression/genetics , Oligonucleotide Array Sequence Analysis/methods , Algorithms , Cluster Analysis , Humans , Software