1.
Nucleic Acids Res ; 49(3): e17, 2021 02 22.
Article in English | MEDLINE | ID: mdl-33347581

ABSTRACT

Chromatin immunoprecipitation (IP) followed by sequencing (ChIP-seq) is the gold standard to detect transcription-factor (TF) binding sites in the genome. Its success depends on appropriate controls removing systematic biases. The predominantly used controls, i.e. DNA input, correct for uneven sonication, but not for nonspecific interactions of the IP antibody. Another type of control, 'mock' IP, corrects for both issues, but is not widely used because it is considered susceptible to technical noise. The tradeoff between the two control types has not been investigated systematically. Therefore, we generated comparable DNA input and mock IP experiments. Because mock IPs contain only nonspecific interactions, the sites predicted from them using DNA input indicate the spurious-site abundance. This abundance is highly correlated with the 'genomic activity' (e.g. chromatin openness). In particular, compared to cell lines, complex samples such as whole organisms have more spurious sites, probably because they contain multiple cell types, resulting in more expressed genes and more open chromatin. Consequently, DNA input and mock IP controls performed similarly for cell lines, whereas for complex samples, mock IP substantially reduced the number of spurious sites. However, DNA input is still informative; thus, we developed a simple framework integrating both controls, improving binding site detection.


Subject(s)
Chromatin Immunoprecipitation Sequencing/methods; Transcription Factors/metabolism; Antibodies; Binding Sites; Cell Line; DNA; Humans
2.
Nucleic Acids Res ; 49(D1): D916-D923, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33270111

ABSTRACT

The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.


Subject(s)
COVID-19/prevention & control; Computational Biology/methods; Databases, Genetic; Genomics/methods; Molecular Sequence Annotation/methods; SARS-CoV-2/genetics; Animals; COVID-19/epidemiology; COVID-19/virology; Epidemics; Humans; Internet; Mice; Pseudogenes/genetics; RNA, Long Noncoding/genetics; SARS-CoV-2/metabolism; SARS-CoV-2/physiology; Transcription, Genetic/genetics
3.
Bioinformatics ; 36(21): 5145-5150, 2021 01 29.
Article in English | MEDLINE | ID: mdl-32726397

ABSTRACT

MOTIVATION: Functional genomics data are becoming clinically actionable, raising privacy concerns. However, quantifying privacy leakage via genotyping is difficult due to the heterogeneous nature of sequencing techniques. Thus, we present FANCY, a tool that rapidly estimates the number of leaking variants from raw RNA-Seq, ATAC-Seq and ChIP-Seq reads, without explicit genotyping. FANCY employs supervised regression using overall sequencing statistics as features and provides an estimate of the overall privacy risk before data release. RESULTS: FANCY can predict the cumulative number of leaking SNVs with an average R² of 0.95 across all independent test sets. Because accurate prediction matters most when the number of leaked variants is low, we also developed a special version of the model that predicts with higher accuracy in that regime. AVAILABILITY AND IMPLEMENTATION: A Python and MATLAB implementation of FANCY, as well as custom scripts to generate the features, can be found at https://github.com/gersteinlab/FANCY. We also provide Jupyter notebooks so that users can optimize the parameters in the regression model based on their own data. An easy-to-use webserver that takes inputs and displays results can be found at fancy.gersteinlab.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
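
As a rough illustration of the regression idea in this abstract, the sketch below fits a one-feature least-squares line and reports R². It is not FANCY's implementation: the single feature (sequencing depth), the data values, and the function names are all hypothetical.

```python
# Minimal sketch, NOT the FANCY model: ordinary least squares on a single
# hypothetical feature, followed by the coefficient of determination (R^2).

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(ys, preds):
    """R^2 = 1 - SS_res / SS_tot for observed ys and predicted preds."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

if __name__ == "__main__":
    # Hypothetical training data: sequencing depth vs. leaking-variant count.
    depths = [10, 20, 30, 40]
    leaks = [105, 195, 310, 390]
    slope, intercept = fit_line(depths, leaks)
    preds = [slope * d + intercept for d in depths]
    print(f"R^2 on the toy data: {r_squared(leaks, preds):.3f}")
```

A real model of this kind would use several sequencing statistics as features and evaluate R² on held-out test sets rather than on the training data.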


Subject(s)
Privacy; Software; Genomics; Humans; RNA-Seq; Exome Sequencing
4.
Genome Res ; 28(4): 448-459, 2018 04.
Article in English | MEDLINE | ID: mdl-29563166

ABSTRACT

Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology.


Subject(s)
Evolution, Molecular; Genome/genetics; Muridae/genetics; Phylogeny; Animals; Binding Sites; CCCTC-Binding Factor/genetics; Chromosomes/genetics; Karyotyping/methods; Long Interspersed Nucleotide Elements/genetics; Mice; Retroelements/genetics; Species Specificity
5.
Nucleic Acids Res ; 47(D1): D766-D773, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30357393

ABSTRACT

The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org.


Subject(s)
Databases, Genetic; Genome, Human/genetics; Genomics; Pseudogenes/genetics; Animals; Computational Biology; Humans; Internet; Mice; Molecular Sequence Annotation; Software
6.
BMC Bioinformatics ; 21(1): 227, 2020 Jun 04.
Article in English | MEDLINE | ID: mdl-32498674

ABSTRACT

BACKGROUND: Mutations arise in the human genome in two major settings: the germline and the soma. These settings involve different inheritance patterns, time scales, chromatin structures, and environmental exposures, all of which impact the resulting distribution of substitutions. Nonetheless, many of the same single nucleotide variants (SNVs) are shared between germline and somatic mutation databases, such as between the gnomAD database of 120,000 germline exomes and the TCGA database of 10,000 somatic exomes. Here, we sought to explain this overlap. RESULTS: After strict filtering to exclude common germline polymorphisms and sites with poor coverage or mappability, we found 336,987 variants shared between the somatic and germline databases. A uniform statistical model explains 34% of these shared variants; a model that incorporates the varying mutation rates of the basic mutation types explains another 50% of shared variants; and a model that includes extended nucleotide contexts (e.g. surrounding 3 bases on either side) explains an additional 4% of shared variants. Analysis of read depth finds mixed evidence that up to 4% of the shared variants may represent germline variants leaked into somatic call sets. 9% of the shared variants are not explained by any model. Sequencing errors and convergent evolution did not account for these. We surveyed other factors as well: Cancers driven by endogenous mutational processes share a greater fraction of variants with the germline, and recently derived germline variants were more likely to be somatically shared than were ancient germline ones. CONCLUSIONS: Overall, we find that shared variants largely represent bona fide biological occurrences of the same variant in the germline and somatic setting and arise primarily because DNA has some of the same basic chemical vulnerabilities in either setting. Moreover, we find mixed evidence that somatic call-sets leak appreciable numbers of germline variants, which is relevant to genomic privacy regulations. In future studies, the similar chemical vulnerability of DNA between the somatic and germline settings might be used to help identify disease-related genes by guiding the development of background-mutation models that are informed by both somatic and germline patterns of variation.
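
One way to read the "uniform statistical model" mentioned in this abstract is as a chance-overlap baseline. The sketch below is an assumption about what such a baseline could look like, not the authors' actual model: if germline and somatic variants were each drawn uniformly at random from the same pool of candidate variants, the expected number shared is the product of the two set sizes divided by the pool size. All names and example numbers are hypothetical.

```python
# Minimal sketch of a chance-overlap baseline, assuming uniform sampling.
# This is an illustrative assumption, not the model used in the paper.

def expected_shared(n_germline, n_somatic, n_candidates):
    """Expected overlap of two uniformly random subsets of a common pool.

    This is the mean of the hypergeometric distribution:
    n_germline * n_somatic / n_candidates.
    """
    return n_germline * n_somatic / n_candidates

def fraction_explained(expected, observed):
    """Share of the observed overlap attributable to chance, capped at 1."""
    return min(expected / observed, 1.0)

if __name__ == "__main__":
    # Hypothetical counts chosen only to make the arithmetic easy to follow.
    exp = expected_shared(n_germline=100, n_somatic=50, n_candidates=1000)
    print("expected shared by chance:", exp)
    print("fraction of 20 observed explained:", fraction_explained(exp, 20))
```

Richer models, such as ones with mutation-type-specific rates or nucleotide context, would replace the single uniform probability with a per-variant probability and sum over variants.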


Subject(s)
Databases, Genetic; Germ-Line Mutation/genetics; Alleles; Biological Evolution; Epigenesis, Genetic; Gene Frequency/genetics; Humans; Mutation Rate; Neoplasms/genetics; Nucleotides/genetics; Phylogeny; Sequence Analysis, DNA
7.
Nature ; 553(7689): 405, 2018 Jan.
Article in English | MEDLINE | ID: mdl-32094824
8.
Nature ; 553(7689): 405, 2018 01 25.
Article in English | MEDLINE | ID: mdl-29368714
9.
PLoS Genet ; 9(1): e1003242, 2013.
Article in English | MEDLINE | ID: mdl-23359205

ABSTRACT

The era of whole-genome sequencing has revealed that gene copy-number changes caused by duplication and deletion events have important evolutionary, functional, and phenotypic consequences. Recent studies have therefore focused on revealing the extent of variation in copy-number within natural populations of humans and other species. These studies have found a large number of copy-number variants (CNVs) in humans, many of which have been shown to have clinical or evolutionary importance. For the most part, these studies have failed to detect an important class of gene copy-number polymorphism: gene duplications caused by retrotransposition, which result in a new intron-less copy of the parental gene being inserted into a random location in the genome. Here we describe a computational approach leveraging next-generation sequence data to detect gene copy-number variants caused by retrotransposition (retroCNVs), and we report the first genome-wide analysis of these variants in humans. We find that retroCNVs account for a substantial fraction of gene copy-number differences between any two individuals. Moreover, we show that these variants may often result in expressed chimeric transcripts, underscoring their potential for the evolution of novel gene functions. By locating the insertion sites of these duplicates, we are able to show that retroCNVs have had an important role in recent human adaptation, and we also uncover evidence that positive selection may currently be driving multiple retroCNVs toward fixation. Together these findings imply that retroCNVs are an especially important class of polymorphism, and that future studies of copy-number variation should search for these variants in order to illuminate their potential evolutionary and functional relevance.


Subject(s)
Computational Biology/methods; DNA Copy Number Variations/genetics; Gene Duplication; Retroelements/genetics; Base Sequence; Biological Evolution; Chromosome Mapping; Humans; Introns; Phenotype; Sequence Analysis, DNA; Sequence Deletion
10.
Genomics ; 105(5-6): 265-72, 2015 May.
Article in English | MEDLINE | ID: mdl-25666663

ABSTRACT

Somatically acquired chromosomal rearrangements occur at early stages during tumorigenesis and can be used to indirectly detect tumor cells, serving as highly sensitive and tumor-specific biomarkers. Advances in high-throughput sequencing have allowed the genome-wide identification of patient-specific chromosomal rearrangements to be used as personalized biomarkers to efficiently assess response to treatment, detect residual disease and monitor disease recurrence. However, sequencing and data processing costs still represent major obstacles for the widespread application of personalized biomarkers in oncology. We developed a computational pipeline (ICRmax) for the cost-effective identification of a minimal set of tumor-specific interchromosomal rearrangements (ICRs). We examined ICRmax performance on sequencing data from rectal tumors and simulated data achieving an average accuracy of 68% for ICR identification. ICRmax identifies ICRs from low-coverage sequenced tumors, eliminates the need to sequence a matched normal tissue and significantly reduces the costs that limit the utilization of personalized biomarkers in the clinical setting.


Subject(s)
Biomarkers, Tumor/metabolism; Chromosome Aberrations; Computational Biology/methods; Neoplasms/diagnosis; Humans
11.
Bioinformatics ; 29(9): 1235-7, 2013 May 01.
Article in English | MEDLINE | ID: mdl-23457042

ABSTRACT

MOTIVATION: Retrocopies are copies of mature RNAs that are usually devoid of regulatory sequences and introns. They have routinely been classified as processed pseudo-genes with little or no biological relevance. However, recent findings have revealed functional roles for retrocopies, as well as their high frequency in some organisms, such as primates. Despite their increasing importance, there is no user-friendly and publicly available resource for the study of retrocopies. RESULTS: Here, we present RCPedia, an integrative and user-friendly database designed for the study of retrocopied genes. RCPedia contains a complete catalogue of the retrocopies that are known to be present in human and five other primate genomes, their genomic context, inter-species conservation and gene expression data. RCPedia also offers a streamlined data representation and an efficient query system. AVAILABILITY AND IMPLEMENTATION: RCPedia is available at http://www.bioinfo.mochsl.org.br/rcpedia.


Subject(s)
Databases, Genetic; Genes; Animals; Exons; Genome; Humans; Primates
12.
Oncotarget ; 15: 200-218, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38484152

ABSTRACT

We describe the analytical validation of NeXT Personal®, an ultra-sensitive, tumor-informed circulating tumor DNA (ctDNA) assay for detecting residual disease, monitoring therapy response, and detecting recurrence in patients diagnosed with solid tumor cancers. NeXT Personal uses whole genome sequencing of tumor and matched normal samples combined with advanced analytics to accurately identify up to ~1,800 somatic variants specific to the patient's tumor. A personalized panel is created, targeting these variants and then used to sequence cell-free DNA extracted from patient plasma samples for ultra-sensitive detection of ctDNA. The NeXT Personal analytical validation is based on panels designed from tumor and matched normal samples from two cell lines, and from 123 patients across nine cancer types. Analytical measurements demonstrated a detection threshold of 1.67 parts per million (PPM) with a limit of detection at 95% (LOD95) of 3.45 PPM. NeXT Personal showed linearity over a range of 0.8 to 300,000 PPM (Pearson correlation coefficient = 0.9998). Precision varied from a coefficient of variation of 12.8% to 3.6% over a range of 25 to 25,000 PPM. The assay targets 99.9% specificity, with this validation study measuring 100% specificity and in silico methods giving us a confidence interval of 99.92 to 100%. In summary, this study demonstrates NeXT Personal as an ultra-sensitive, highly quantitative and robust ctDNA assay that can be used to detect residual disease, monitor treatment response, and detect recurrence in patients.


Subject(s)
Circulating Tumor DNA; Neoplasms; Humans; Circulating Tumor DNA/genetics; Mutation; Neoplasms/diagnosis; Neoplasms/genetics; DNA, Neoplasm/genetics; Biological Assay; Biomarkers, Tumor/genetics
13.
Nucleic Acids Res ; 39(14): 6056-68, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21493686

ABSTRACT

Although patterns of somatic alterations have been reported for tumor genomes, little is known on how they compare with alterations present in non-tumor genomes. A comparison of the two would be crucial to better characterize the genetic alterations driving tumorigenesis. We sequenced the genomes of a lymphoblastoid (HCC1954BL) and a breast tumor (HCC1954) cell line derived from the same patient and compared the somatic alterations present in both. The lymphoblastoid genome presents a comparable number and similar spectrum of nucleotide substitutions to that found in the tumor genome. However, a significant difference in the ratio of non-synonymous to synonymous substitutions was observed between both genomes (P = 0.031). Protein-protein interaction analysis revealed that mutations in the tumor genome preferentially affect hub-genes (P = 0.0017) and are co-selected to present synergistic functions (P < 0.0001). KEGG analysis showed that in the tumor genome most mutated genes were organized into signaling pathways related to tumorigenesis. No such organization or synergy was observed in the lymphoblastoid genome. Our results indicate that endogenous mutagens and replication errors can generate the overall number of mutations required to drive tumorigenesis and that it is the combination rather than the frequency of mutations that is crucial to complete tumorigenic transformation.


Subject(s)
Breast Neoplasms/genetics; Genetic Variation; Genome, Human; Cell Line, Transformed; Cell Line, Tumor; Chromosome Aberrations; Female; Humans; Lymphocytes; Middle Aged; Mutation; Point Mutation; Protein Interaction Mapping; Sequence Analysis, DNA
14.
RNA Biol ; 9(11): 1339-43, 2012 Nov.
Article in English | MEDLINE | ID: mdl-23064119

ABSTRACT

Understanding alternative splicing is crucial to elucidate the mechanisms behind several biological phenomena, including diseases. The huge amount of expressed sequences available nowadays represents an opportunity and a challenge to catalog and display alternative splicing events (ASEs). Although several groups have faced this challenge with relative success, we still lack a computational tool that uses a simple and straightforward method to retrieve, name and present ASEs. Here we present SPLOOCE, a portal for the analysis of human splicing variants. SPLOOCE uses a method based on regular expressions for retrieval of ASEs. We propose a simple syntax that is able to capture the complexity of ASEs.
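
To make the regular-expression idea in this abstract concrete, here is a minimal sketch. The encoding is a hypothetical one invented for this example, not SPLOOCE's actual syntax: a transcript is written as one character per reference exon, `I` for an included exon and `S` for a skipped one, and a regular expression finds runs of skipped exons flanked by included exons.

```python
import re

# Hypothetical encoding (NOT the SPLOOCE syntax): one character per
# reference exon, "I" = included in the transcript, "S" = skipped.
# An exon-skipping event is a run of "S" with an "I" on both sides.
EXON_SKIP = re.compile(r"(?<=I)S+(?=I)")

def find_skipped_exons(code):
    """Return (first_exon, last_exon) pairs, 1-based, for each skipped run."""
    return [(m.start() + 1, m.end()) for m in EXON_SKIP.finditer(code)]

if __name__ == "__main__":
    for transcript in ("IIII", "IISII", "ISSIISI", "SII"):
        print(transcript, find_skipped_exons(transcript))
```

The lookbehind and lookahead keep the flanking exons out of the match, so two events that share a flanking exon are both reported. A leading or trailing run of `S` is deliberately not reported, since in this toy encoding it would correspond to an alternative first or last exon rather than exon skipping.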


Subject(s)
Alternative Splicing; Computational Biology; Databases, Nucleic Acid; RNA Splice Sites; Humans; Internet; Oligonucleotide Array Sequence Analysis
15.
Clin Cancer Res ; 27(15): 4265-4276, 2021 08 01.
Article in English | MEDLINE | ID: mdl-34341053

ABSTRACT

PURPOSE: While immune checkpoint blockade (ICB) has become a pillar of cancer treatment, biomarkers that consistently predict patient response remain elusive due to the complex mechanisms driving immune response to tumors. We hypothesized that a multi-dimensional approach modeling both tumor and immune-related molecular mechanisms would better predict ICB response than simpler mutation-focused biomarkers, such as tumor mutational burden (TMB). EXPERIMENTAL DESIGN: Tumors from a cohort of patients with late-stage melanoma (n = 51) were profiled using an immune-enhanced exome and transcriptome platform. We demonstrate increasing predictive power with deeper modeling of neoantigens and immune-related resistance mechanisms to ICB. RESULTS: Our neoantigen burden score, which integrates both exome and transcriptome features, more significantly stratified responders and nonresponders (P = 0.016) than TMB alone (P = 0.049). Extension of this model to include immune-related resistance mechanisms affecting the antigen presentation machinery, such as HLA allele-specific LOH, resulted in a composite neoantigen presentation score (NEOPS) that demonstrated further increased association with therapy response (P = 0.002). CONCLUSIONS: NEOPS proved the statistically strongest biomarker compared with all single-gene biomarkers, expression signatures, and TMB biomarkers evaluated in this cohort. Subsequent confirmation of these findings in an independent cohort of patients (n = 110) suggests that NEOPS is a robust, novel biomarker of ICB response in melanoma.


Subject(s)
Drug Resistance, Neoplasm/immunology; Melanoma/drug therapy; Melanoma/immunology; Models, Immunological; Forecasting; Humans; Treatment Outcome
16.
Cancer Res ; 81(16): 4194-4204, 2021 08 15.
Article in English | MEDLINE | ID: mdl-34045189

ABSTRACT

STK11 (liver kinase B1, LKB1) is the fourth most frequently mutated gene in lung adenocarcinoma, with loss of function observed in up to 30% of all cases. Our previous work identified a 16-gene signature for LKB1 loss of function through mutational and nonmutational mechanisms. In this study, we applied this genetic signature to The Cancer Genome Atlas (TCGA) lung adenocarcinoma samples and discovered a novel association between LKB1 loss and widespread DNA demethylation. LKB1-deficient tumors showed depletion of S-adenosyl-methionine (SAM-e), which is the primary substrate for DNMT1 activity. Lower methylation following LKB1 loss involved repetitive elements (RE) and altered RE transcription, as well as decreased sensitivity to azacytidine. Demethylated CpGs were enriched for FOXA family consensus binding sites, and nuclear expression, localization, and turnover of FOXA was dependent upon LKB1. Overall, these findings demonstrate that a large number of lung adenocarcinomas exhibit global hypomethylation driven by LKB1 loss, which has implications for both epigenetic therapy and immunotherapy in these cancers. SIGNIFICANCE: Lung adenocarcinomas with LKB1 loss demonstrate global genomic hypomethylation associated with depletion of SAM-e, reduced expression of DNMT1, and increased transcription of repetitive elements.


Subject(s)
AMP-Activated Protein Kinase Kinases/physiology; Adenocarcinoma/genetics; DNA Methylation; Lung Neoplasms/genetics; S-Adenosylmethionine/metabolism; AMP-Activated Protein Kinase Kinases/genetics; Adenocarcinoma/metabolism; Cell Line; Cell Survival; Cluster Analysis; Computational Biology; CpG Islands; Databases, Genetic; Epigenesis, Genetic; Genes, ras; Humans; Lung Neoplasms/metabolism; Methionine; Mutation; Oligonucleotide Array Sequence Analysis; Proto-Oncogene Proteins p21(ras)/genetics; Repetitive Sequences, Nucleic Acid
17.
Nat Genet ; 52(3): 306-319, 2020 03.
Article in English | MEDLINE | ID: mdl-32024998

ABSTRACT

About half of all cancers have somatic integrations of retrotransposons. Here, to characterize their role in oncogenesis, we analyzed the patterns and mechanisms of somatic retrotransposition in 2,954 cancer genomes from 38 histological cancer subtypes within the framework of the Pan-Cancer Analysis of Whole Genomes (PCAWG) project. We identified 19,166 somatically acquired retrotransposition events, which affected 35% of samples and spanned a range of event types. Long interspersed nuclear element (LINE-1; L1 hereafter) insertions emerged as the most frequent type of somatic structural variation in esophageal adenocarcinoma, and the second most frequent in head-and-neck and colorectal cancers. Aberrant L1 integrations can delete megabase-scale regions of a chromosome, which sometimes leads to the removal of tumor-suppressor genes, and can induce complex translocations and large-scale duplications. Somatic retrotranspositions can also initiate breakage-fusion-bridge cycles, leading to high-level amplification of oncogenes. These observations illuminate a relevant role of L1 retrotransposition in remodeling the cancer genome, with potential implications for the development of human tumors.


Subject(s)
Carcinogenesis/genetics; Gene Rearrangement/genetics; Genome, Human/genetics; Long Interspersed Nucleotide Elements/genetics; Neoplasms/genetics; Retroelements/genetics; Humans; Neoplasms/pathology
18.
Genome Biol ; 20(1): 109, 2019 05 29.
Article in English | MEDLINE | ID: mdl-31142351

ABSTRACT

Data science allows the extraction of practical insights from large-scale data. Here, we contextualize it as an umbrella term, encompassing several disparate subdomains. We focus on how genomics fits as a specific application subdomain, in terms of the well-known 3V data and 4M process frameworks (volume-velocity-variety and measurement-mining-modeling-manipulation, respectively). We further analyze the technical and cultural "exports" and "imports" between genomics and other data-science subdomains (e.g., astronomy). Finally, we discuss how data value, privacy, and ownership are pressing issues for data science applications, in general, and are especially relevant to genomics, due to the persistent nature of DNA.


Subject(s)
Data Science; Genomics
19.
BMC Med Genomics ; 12(1): 104, 2019 07 09.
Article in English | MEDLINE | ID: mdl-31288802

ABSTRACT

BACKGROUND: Different pathogenic germline mutations in the RET oncogene are identified in MEN 2, a hereditary syndrome characterized by medullary thyroid carcinoma (MTC) and other endocrine tumors. Although genetic predisposition is recognized, not all RET mutation carriers will develop the disease during their lifetime, and RET mutation carriers belonging to the same family may present clinical heterogeneity. It has been suggested that a single germline mutation might not be sufficient for development of MEN 2-associated tumors and a somatic bi-allelic alteration might be required. Here we investigated the presence of a somatic second-hit mutation in the RET gene in MTC. METHODS: We integrated Multiplex Ligation-dependent Probe Amplification (MLPA) and whole exome sequencing (WES) to search for copy number alteration (CNA) in the RET gene in MTC samples and medullary thyroid cell lines (TT and MZ-CR-1). We next found reads spanning exon-exon boundaries on RET, indicative of a retrocopy. We subsequently searched for RET retrocopies in the human reference genome (GRCh37) and in the 1000 Genomes Project data, by looking for reads reporting joined exons in the RET locus or distinct genomic regions. To determine RET retrocopy specificity and recurrence, DNA isolated from sporadic and MEN 2-associated MTC (n = 37), peripheral blood (n = 3) and papillary thyroid carcinomas with RET fusion (n = 10) samples was tested using PCR-sequencing methodology. RESULTS: Through MLPA we found evidence of CNA in the RET gene in MTC samples and MTC cell lines. WES analysis reinforced the presence of the CNA and hinted at a retroposed copy of RET not found in the human reference genome or the 1000 Genomes Project. Extended analysis confirmed the presence of a somatic MTC-related retrocopy of RET in both sporadic and hereditary tumors. We further unveiled a recurrent (28%) novel point mutation (p.G548V) found exclusively in the retrocopy of RET. The mutation was also found in cDNA of mutated samples, suggesting it might be functional. CONCLUSION: We here report a somatic-specific RET retroposed copy in MTC samples and cell lines. Our results support the idea that generation of retrocopies in somatic cells is likely to contribute to MTC genesis and progression.


Subject(s)
Carcinoma, Neuroendocrine/genetics; Gene Dosage/genetics; Proto-Oncogene Proteins c-ret/genetics; Retroelements/genetics; Thyroid Neoplasms/genetics; Carcinoma, Neuroendocrine/pathology; Cell Line, Tumor; Female; Humans; Male; Thyroid Neoplasms/pathology