Results 1 - 20 of 67
1.
Database (Oxford) ; 2024, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38687868

ABSTRACT

Cancer cell lines are an important component in biological and medical research, enabling studies of cellular mechanisms as well as the development and testing of pharmaceuticals. Genomic alterations in cancer cell lines are widely studied as models for oncogenetic events and are represented in a wide range of primary resources. We have created a comprehensive, curated knowledge resource, cancercelllines.org, with the aim of enabling easy access to genomic profiling data in cancer cell lines, curated from a variety of resources and integrating both copy number and single nucleotide variant data. We have gathered over 5600 copy number profiles as well as single nucleotide variant annotations for 16 000 cell lines and provide these data with mappings to the GRCh38 reference genome. Both genomic variations and associated curated metadata can be queried through the GA4GH Beacon v2 Application Programming Interface (API) and a graphical user interface with extensive data retrieval enabled using GA4GH data schemas under a permissive licensing scheme. Database URL: https://cancercelllines.org.


Subject(s)
Databases, Genetic , Genomics , Neoplasms , Humans , Cell Line, Tumor , Neoplasms/genetics , Genomics/methods , DNA Copy Number Variations/genetics , User-Computer Interface , Polymorphism, Single Nucleotide
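
As a rough illustration of the Beacon v2 access mentioned in the abstract above, the sketch below only assembles a query URL for copy number deletions overlapping a genomic range; it sends nothing over the network. The base path, the `g_variants` endpoint and the parameter names (`referenceName`, `start`, `end`, `variantType`) follow common Beacon v2 conventions but are assumptions here, not details taken from the abstract, and the coordinates are arbitrary. The resource's own API documentation should be checked before use.

```python
from urllib.parse import urlencode

# Hypothetical base path; the real endpoint layout may differ.
BEACON_BASE = "https://cancercelllines.org/beacon"


def build_variant_query(reference_name, start, end, variant_type):
    """Return a Beacon v2-style range query URL (nothing is sent)."""
    params = {
        "referenceName": reference_name,  # assumed parameter name
        "start": start,                   # assumed parameter name
        "end": end,                       # assumed parameter name
        "variantType": variant_type,      # assumed parameter name and value
    }
    return f"{BEACON_BASE}/g_variants?{urlencode(params)}"


if __name__ == "__main__":
    # Arbitrary example range on chromosome 9.
    print(build_variant_query("9", 21000000, 22000000, "DEL"))
```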
2.
Bioinform Adv ; 4(1): vbae045, 2024.
Article in English | MEDLINE | ID: mdl-38560553

ABSTRACT

Motivation: With the proliferation of research means and computational methodologies, published biomedical literature is growing exponentially in numbers and volume. Cancer cell lines are frequently used models in biological and medical research that are currently applied for a wide range of purposes, from studies of cellular mechanisms to drug development, which has led to a wealth of related data and publications. Sifting through large quantities of text to gather relevant information on cell lines of interest is tedious and extremely slow when performed by humans. Hence, novel computational information extraction and correlation mechanisms are required to boost meaningful knowledge extraction. Results: In this work, we present the design, implementation, and application of a novel data extraction and exploration system. This system extracts deep semantic relations between textual entities from scientific literature to enrich existing structured clinical data concerning cancer cell lines. We introduce a new public data exploration portal, which enables automatic linking of genomic copy number variant plots with ranked, related entities such as affected genes. Each relation is accompanied by literature-derived evidence, allowing for deep, yet rapid, literature search, using existing structured data as a springboard. Availability and implementation: Our system is publicly available on the web at https://cancercelllines.org.

4.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38300514

ABSTRACT

Somatic copy number alterations (SCNAs) are a predominant type of oncogenomic alterations that affect a large proportion of the genome in the majority of cancer samples. Current technologies allow high-throughput measurement of such copy number aberrations, generating results consisting of frequently large sets of SCNA segments. However, the automated annotation and integration of such data are particularly challenging because the measured signals reflect biased, relative copy number ratios. In this study, we introduce labelSeg, an algorithm designed for rapid and accurate annotation of CNA segments, with the aim of enhancing the interpretation of tumor SCNA profiles. Leveraging density-based clustering and exploiting the length-amplitude relationships of SCNA, our algorithm proficiently identifies distinct relative copy number states from individual segment profiles. Its compatibility with most CNA measurement platforms makes it suitable for large-scale integrative data analysis. We confirmed its performance on both simulated and sample-derived data from The Cancer Genome Atlas reference dataset, and we demonstrated its utility in integrating heterogeneous segment profiles from different data sources and measurement platforms. Our comparative and integrative analysis revealed common SCNA patterns in cancer and protein-coding genes with a strong correlation between SCNA and messenger RNA expression, promoting the investigation into the role of SCNA in cancer development.


Subject(s)
DNA Copy Number Variations , Neoplasms , Humans , Neoplasms/genetics , Algorithms , Cluster Analysis , Data Analysis
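
To make the idea of density-based grouping of segment values concrete, here is a minimal sketch. It is not the labelSeg algorithm: it ignores segment length entirely, works on one dimension only, and the `eps` and `min_size` thresholds, the function names and the toy log2 ratios are all illustrative assumptions. Values closer than `eps` chain into one cluster, and the cluster nearest zero is treated as the baseline.

```python
from statistics import median


def cluster_1d(values, eps=0.1, min_size=2):
    """Group sorted values whose consecutive gaps are <= eps; drop small groups."""
    clusters, current = [], []
    for v in sorted(values):
        if current and v - current[-1] > eps:
            clusters.append(current)
            current = []
        current.append(v)
    if current:
        clusters.append(current)
    return [c for c in clusters if len(c) >= min_size]


def label_clusters(clusters):
    """Label the cluster nearest 0 as baseline, the others as gain or loss."""
    if not clusters:
        return []
    centers = [median(c) for c in clusters]
    base = min(centers, key=abs)
    labels = []
    for center in centers:
        if center == base:
            labels.append((center, "baseline"))
        elif center > base:
            labels.append((center, "gain"))
        else:
            labels.append((center, "loss"))
    return labels


if __name__ == "__main__":
    # Toy log2 ratios for one profile (illustrative values only).
    log2_ratios = [-0.52, -0.48, -0.50, 0.01, -0.02, 0.03, 0.00, 0.41, 0.45]
    for center, label in label_clusters(cluster_1d(log2_ratios)):
        print(f"{center:+.2f} {label}")
```

A real annotation method would also need to handle profiles whose baseline is shifted away from zero, which is one of the difficulties the abstract points to.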
5.
Sci Rep ; 14(1): 3331, 2024 02 09.
Article in English | MEDLINE | ID: mdl-38336885

ABSTRACT

Short tandem repeat (STR) mutations are prevalent in colorectal cancer (CRC), especially in tumours with the microsatellite instability (MSI) phenotype. While STR length variations are known to regulate gene expression under physiological conditions, the functional impact of STR mutations in CRC remains unclear. Here, we integrate STR mutation data with clinical information and gene expression data to study the gene regulatory effects of STR mutations in CRC. We confirm that STR mutability in CRC highly depends on the MSI status, repeat unit size, and repeat length. Furthermore, we present a set of 1244 putative expression STRs (eSTRs) for which the STR length is associated with gene expression levels in CRC tumours. The length of 73 eSTRs is associated with expression levels of cancer-related genes, nine of which are CRC-specific genes. We show that linear models describing eSTR-gene expression relationships allow for predictions of gene expression changes in response to eSTR mutations. Moreover, we found an increased mutability of eSTRs in MSI tumours. Our evidence of gene regulatory roles for eSTRs in CRC highlights a mostly overlooked way through which tumours may modulate their phenotypes. Future extensions of these findings could uncover new STR-based targets in the treatment of cancer.


Subject(s)
Colorectal Neoplasms , Microsatellite Repeats , Humans , Microsatellite Repeats/genetics , Mutation , Microsatellite Instability , Colorectal Neoplasms/pathology , Gene Expression
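
The linear eSTR-expression relationship described above can be sketched with an ordinary least-squares fit. The data below are made-up toy values, not results from the study, and the function names are hypothetical; the point is only that a fitted slope turns a change in repeat length into a predicted change in expression.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predicted_change(slope, length_delta):
    """Expected expression change for a given change in repeat length."""
    return slope * length_delta


if __name__ == "__main__":
    # Made-up toy values: repeat length (in units) vs. normalized expression.
    lengths = [10, 11, 12, 13]
    expression = [2.0, 2.5, 3.0, 3.5]
    slope, intercept = fit_line(lengths, expression)
    # A contraction by two repeat units under this toy model:
    print(slope, intercept, predicted_change(slope, -2))
```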
6.
PLoS One ; 18(5): e0285433, 2023.
Article in English | MEDLINE | ID: mdl-37196000

ABSTRACT

The Global Alliance for Genomics and Health (GA4GH) is a standards-setting organization that is developing a suite of coordinated standards for genomics. The GA4GH Phenopacket Schema is a standard for sharing disease and phenotype information that characterizes an individual person or biosample. The Phenopacket Schema is flexible and can represent clinical data for any kind of human disease including rare disease, complex disease, and cancer. It also allows consortia or databases to apply additional constraints to ensure uniform data collection for specific goals. We present phenopacket-tools, an open-source Java library and command-line application for construction, conversion, and validation of phenopackets. Phenopacket-tools simplifies construction of phenopackets by providing concise builders, programmatic shortcuts, and predefined building blocks (ontology classes) for concepts such as anatomical organs, age of onset, biospecimen type, and clinical modifiers. Phenopacket-tools can be used to validate the syntax and semantics of phenopackets as well as to assess adherence to additional user-defined requirements. The documentation includes examples showing how to use the Java library and the command-line tool to create and validate phenopackets. We demonstrate how to create, convert, and validate phenopackets using the library or the command-line application. Source code, API documentation, comprehensive user guide and a tutorial can be found at https://github.com/phenopackets/phenopacket-tools. The library can be installed from the public Maven Central artifact repository and the application is available as a standalone archive. The phenopacket-tools library helps developers implement and standardize the collection and exchange of phenotypic and other clinical data for use in phenotype-driven genomic diagnostics, translational research, and precision medicine applications.


Subject(s)
Neoplasms , Software , Humans , Genomics , Databases, Factual , Gene Library
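
For orientation, the following sketch shows what a very small phenopacket-like record and a validation pass might look like. It is plain Python and does not use phenopacket-tools (a Java library) or reproduce its checks. The field names follow the Phenopacket Schema as commonly documented, while the choice of required fields, the CURIE pattern and the `check_phenopacket` helper are assumptions made for illustration.

```python
import re

# Loose pattern for ontology identifiers such as "HP:0001250" (an assumption).
CURIE = re.compile(r"^[A-Za-z][A-Za-z0-9_]*:[A-Za-z0-9_]+$")


def check_phenopacket(packet):
    """Return a list of problems found by a few illustrative checks."""
    problems = []
    for key in ("id", "metaData"):  # assumed minimal required fields
        if key not in packet:
            problems.append(f"missing field: {key}")
    for feature in packet.get("phenotypicFeatures", []):
        term_id = feature.get("type", {}).get("id", "")
        if not CURIE.match(term_id):
            problems.append(f"not a CURIE: {term_id!r}")
    return problems


if __name__ == "__main__":
    packet = {
        "id": "example-1",  # hypothetical identifier
        "subject": {"id": "patient-1", "sex": "FEMALE"},
        "phenotypicFeatures": [
            {"type": {"id": "HP:0001250", "label": "Seizure"}},
        ],
        "metaData": {"createdBy": "example-curator"},
    }
    print(check_phenopacket(packet))
```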
7.
Adv Genet (Hoboken) ; 4(1): 2200016, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36910590

ABSTRACT

The Global Alliance for Genomics and Health (GA4GH) is developing a suite of coordinated standards for genomics for healthcare. The Phenopacket is a new GA4GH standard for sharing disease and phenotype information that characterizes an individual person, linking that individual to detailed phenotypic descriptions, genetic information, diagnoses, and treatments. A detailed example is presented that illustrates how to use the schema to represent the clinical course of a patient with retinoblastoma, including demographic information, the clinical diagnosis, phenotypic features and clinical measurements, an examination of the extirpated tumor, therapies, and the results of genomic analysis. The Phenopacket Schema, together with other GA4GH data and technical standards, will enable data exchange and provide a foundation for the computational analysis of disease and phenotype information to improve our ability to diagnose and conduct research on all types of disorders, including cancer and rare diseases.

9.
Hum Mutat ; 43(6): 791-799, 2022 06.
Article in English | MEDLINE | ID: mdl-35297548

ABSTRACT

Beacon is a basic data discovery protocol issued by the Global Alliance for Genomics and Health (GA4GH). The main goal addressed by version 1 of the Beacon protocol was to test the feasibility of broadly sharing human genomic data, through providing simple "yes" or "no" responses to queries about the presence of a given variant in datasets hosted by Beacon providers. The popularity of this concept has fostered the design of a version 2 that better serves real-world requirements and addresses the needs of clinical genomics research and healthcare, as assessed by several contributing projects and organizations. In particular, rare disease genetics and cancer research will benefit from new case-level and genomic-variant-level requests and the enabling of richer phenotype and clinical queries as well as support for fuzzy searches. Beacon is designed as a "lingua franca" to bridge data collections hosted in software solutions with different and rich interfaces. Beacon version 2 works alongside popular standards like Phenopackets, OMOP, or FHIR, allowing implementing consortia to return matches in beacon responses and provide a handover to their preferred data exchange format. The protocol is being explored by other research domains and is being tested in several international projects.


Subject(s)
Genomics , Information Dissemination , Humans , Information Dissemination/methods , Phenotype , Rare Diseases , Software
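
The version 1 idea of answering only "yes" or "no" to a variant query can be illustrated with a toy in-memory lookup. This sketch is not an implementation of the Beacon protocol: there is no network layer, no request or response schema, and the `Variant` and `ToyBeacon` names and the example variants are hypothetical.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    reference_name: str
    position: int
    reference_bases: str
    alternate_bases: str


class ToyBeacon:
    """Holds a private set of variants and reveals only presence or absence."""

    def __init__(self, variants):
        self._variants = set(variants)

    def exists(self, query):
        # The caller learns a boolean, never the underlying records.
        return query in self._variants


if __name__ == "__main__":
    beacon = ToyBeacon([Variant("17", 7674220, "C", "T")])  # made-up entry
    print(beacon.exists(Variant("17", 7674220, "C", "T")))
    print(beacon.exists(Variant("17", 7674220, "C", "G")))
```

Returning only a boolean is what lets a data holder confirm that a variant is present without exposing sample-level records; the richer case-level and phenotype queries of version 2 go beyond this sketch.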
10.
Front Genet ; 13: 1017657, 2022.
Article in English | MEDLINE | ID: mdl-36726722

ABSTRACT

Genome variation is the direct cause of cancer and driver of its clonal evolution. While the impact of many point mutations can be evaluated through their modification of individual genomic elements, even a single copy number aberration (CNA) may encompass hundreds of genes and therefore pose challenges to untangle potentially complex functional effects. However, consistent, recurring and disease-specific patterns in the genome-wide CNA landscape imply that particular CNA may promote cancer-type-specific characteristics. Discerning essential cancer-promoting alterations from the inherent co-dependency in CNA would improve the understanding of mechanisms of CNA and provide new insights into cancer biology and potential therapeutic targets. Here we implement a model using segmental breakpoints to discover non-random gene coverage by copy number deletion (CND). With a diverse set of cancer types from multiple resources, this model identified common and cancer-type-specific oncogenes and tumor suppressor genes as well as cancer-promoting functional pathways. Confirmed by differential expression analysis of data from corresponding cancer types, the results show that for most cancer types, despite dissimilarity of their CND landscapes, similar canonical pathways are affected. In 25 analyses of 17 cancer types, we have identified 19 to 169 significant genes by copy deletion, including RB1, PTEN and CDKN2A as the most significantly deleted genes among all cancer types. We have also shown a shared dependence on core pathways for cancer progression in different cancers as well as cancer type separation by genome-wide significance scores. While this work provides a reference for gene specific significance in many cancers, it chiefly contributes a general framework to derive genome-wide significance and molecular insights in CND profiles with a potential for the analysis of rare cancer types as well as non-coding regions.

11.
Database (Oxford) ; 2021, 2021 07 17.
Article in English | MEDLINE | ID: mdl-34272855

ABSTRACT

In cancer, copy number aberrations (CNAs) represent a type of nearly ubiquitous and frequently extensive structural genome variations. To disentangle the molecular mechanisms underlying tumorigenesis as well as identify and characterize molecular subtypes, the comparative and meta-analysis of large genomic variant collections can be of immense importance. Over the last decades, cancer genomic profiling projects have resulted in a large amount of somatic genome variation profiles, however segregated in a multitude of individual studies and datasets. The Progenetix project, initiated in 2001, curates individual cancer CNA profiles and associated metadata from published oncogenomic studies and data repositories with the aim to empower integrative analyses spanning all different cancer biologies. During the last few years, the fields of genomics and cancer research have seen significant advancement in terms of molecular genetics technology, disease concepts, data standard harmonization as well as data availability, in an increasingly structured and systematic manner. For the Progenetix resource, continuous data integration, curation and maintenance have resulted in the most comprehensive representation of cancer genome CNA profiling data with 138 663 (including 115 357 tumor) copy number variation (CNV) profiles. In this article, we report a 4.5-fold increase in sample number since 2013, improvements in data quality, ontology representation with a CNV landscape summary over 51 distinctive National Cancer Institute Thesaurus cancer terms as well as updates in database schemas, and data access including new web front-end and programmatic data access. Database URL: progenetix.org.


Subject(s)
DNA Copy Number Variations , Neoplasms , DNA Copy Number Variations/genetics , Genome , Genomics , Humans , Neoplasms/genetics
12.
Front Genet ; 12: 654887, 2021.
Article in English | MEDLINE | ID: mdl-34054918

ABSTRACT

Copy number aberrations (CNA) are one of the most important classes of genomic mutations related to oncogenetic effects. In the past three decades, a vast amount of CNA data has been generated by molecular-cytogenetic and genome sequencing based methods. While this data has been instrumental in the identification of cancer-related genes and promoted research into the relation between CNA and histo-pathologically defined cancer types, the heterogeneity of source data and derived CNV profiles poses great challenges for data integration and comparative analysis. Furthermore, a majority of existing studies have focused on the association of CNA with pre-selected "driver" genes, with limited application to rare drivers and other genomic elements. In this study, we developed a bioinformatics pipeline to integrate a collection of 44,988 high-quality CNA profiles of high diversity. Using a hybrid model of neural networks and an attention algorithm, we generated the CNA signatures of 31 cancer subtypes, depicting the uniqueness of their respective CNA landscapes. Finally, we constructed a multi-label classifier to identify the cancer type and the organ of origin from copy number profiling data. The investigation of the signatures suggested common patterns, not only of physiologically related cancer types but also of clinico-pathologically distant cancer types such as different cancers originating from the neural crest. Further experiments of classification models confirmed the effectiveness of the signatures in distinguishing different cancer types and demonstrated their potential in tumor classification.

13.
Cell Genom ; 1(2), 2021 Nov 10.
Article in English | MEDLINE | ID: mdl-35128509

ABSTRACT

We promote a shared vision and guide for how and when to federate genomic and health-related data sharing, enabling connections and insights across independent, secure databases. The GA4GH encourages a federated approach wherein data providers have the mandate and resources to share, but where data cannot move for legal or technical reasons. We recommend a federated approach to connect national genomics initiatives into a global network and precision medicine resource.

14.
Cell Genom ; 1(2), 2021 Nov 10.
Article in English | MEDLINE | ID: mdl-35072136

ABSTRACT

The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. We describe the GA4GH organization, which is fueled by the development efforts of eight Work Streams and informed by the needs of 24 Driver Projects and other key stakeholders. We present the GA4GH suite of secure, interoperable technical standards and policy frameworks and review the current status of standards, their relevance to key domains of research and clinical care, and future plans of GA4GH. Broad international participation in building, adopting, and deploying GA4GH standards and frameworks will catalyze an unprecedented effort in data sharing that will be critical to advancing genomic medicine and ensuring that all populations can access its benefits.

15.
Cell Genom ; 1(2), 2021 Nov 10.
Article in English | MEDLINE | ID: mdl-35311178

ABSTRACT

Maximizing the personal, public, research, and clinical value of genomic information will require the reliable exchange of genetic variation data. We report here the Variation Representation Specification (VRS, pronounced "verse"), an extensible framework for the computable representation of variation that complements contemporary human-readable and flat file standards for genomic variation representation. VRS provides semantically precise representations of variation and leverages this design to enable federated identification of biomolecular variation with globally consistent and unique computed identifiers. The VRS framework includes a terminology and information model, machine-readable schema, data sharing conventions, and a reference implementation, each of which is intended to be broadly useful and freely available for community use. VRS was developed by a partnership among national information resource providers, public initiatives, and diagnostic testing laboratories under the auspices of the Global Alliance for Genomics and Health (GA4GH).
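
The abstract gives no detail on how the computed identifiers are derived. As background, GA4GH identifiers of this kind rest on a truncated digest (SHA-512, first 24 bytes, base64url-encoded, often written `sha512t24u`), which the sketch below implements. The `toy_identifier` helper, its `example` prefix and the sorted-key JSON dump are hypothetical stand-ins and do not reproduce the canonical VRS serialization.

```python
import base64
import hashlib
import json


def sha512t24u(blob: bytes) -> str:
    """Truncated SHA-512 digest: first 24 bytes, base64url-encoded."""
    digest = hashlib.sha512(blob).digest()
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def toy_identifier(obj: dict, prefix: str = "example") -> str:
    """Hypothetical helper: digest of a sorted-key JSON dump (not VRS serialization)."""
    blob = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return f"{prefix}:{sha512t24u(blob)}"


if __name__ == "__main__":
    print(sha512t24u(b""))  # z4PhNX7vuL3xVChQ1m2AB9Yg5AULVxXc
    # Key order does not affect the identifier because keys are sorted first.
    print(toy_identifier({"start": 10, "end": 11}) == toy_identifier({"end": 11, "start": 10}))
```

Because the identifier depends only on the serialized content, two parties who describe the same variation in the same way arrive at the same identifier without consulting a central registry, which is what makes federated identification possible.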

16.
Front Oncol ; 10: 584095, 2020.
Article in English | MEDLINE | ID: mdl-33344238

ABSTRACT

Copy number aberrations (CNV/CNA) represent a major contribution to the somatic mutation landscapes in cancers, and their identification can lead to the discovery of oncogenetic targets as well as improved disease (sub-) classification. Diffuse large B cell lymphoma (DLBCL) is the most common lymphoma in Western countries and up to 40% of the affected individuals still succumb to the disease. DLBCL is a heterogeneous group of disorders, and what we call DLBCL today is not necessarily the same disease as a few years ago. This review focuses on types and frequencies of regional DNA CNVs in DLBCL, not otherwise specified, and in two particular conditions: the transformation from indolent lymphomas and DLBCL in individuals with immunodeficiency.

17.
Cell Rep ; 32(5): 107985, 2020 08 04.
Article in English | MEDLINE | ID: mdl-32755579

ABSTRACT

PARP inhibitors (PARPi) cause synthetic lethality in BRCA-deficient tumors. Whether specific vulnerabilities to PARPi exist beyond BRCA mutations and related defects in homology-directed repair (HDR) is not well understood. Here, we identify the ubiquitin E3 ligase TRIP12 as a negative regulator of PARPi sensitivity. We show that TRIP12 controls steady-state PARP1 levels and limits PARPi-induced cytotoxic PARP1 trapping. Upon loss of TRIP12, elevated PARPi-induced PARP1 trapping causes increased DNA replication stress, DNA damage, cell cycle arrest, and cell death. Mechanistically, we demonstrate that TRIP12 binds PARP1 via a central PAR-binding WWE domain and, using its carboxy-terminal HECT domain, catalyzes polyubiquitylation of PARP1, triggering proteasomal degradation and preventing supra-physiological PARP1 accumulation. Further, in cohorts of breast and ovarian cancer patients, PARP1 abundance is negatively correlated with TRIP12 expression. We thus propose TRIP12 as a regulator of PARP1 stability and PARPi-induced PARP trapping, with potential implications for PARPi sensitivity and resistance.


Subject(s)
Carrier Proteins/metabolism , Poly (ADP-Ribose) Polymerase-1/metabolism , Poly(ADP-ribose) Polymerase Inhibitors/pharmacology , Ubiquitin-Protein Ligases/metabolism , Amino Acid Sequence , Carrier Proteins/chemistry , Cell Line, Tumor , DNA Damage , Down-Regulation/drug effects , HEK293 Cells , Humans , Models, Biological , Mutagens/toxicity , Neoplasms/pathology , Poly ADP Ribosylation/drug effects , Poly Adenosine Diphosphate Ribose/metabolism , Proteasome Endopeptidase Complex/metabolism , Protein Binding/drug effects , Protein Domains , Protein Stability/drug effects , Proteolysis/drug effects , Signal Transduction/drug effects , Ubiquitin-Protein Ligases/chemistry , Ubiquitination/drug effects
18.
Genomics ; 112(5): 3331-3341, 2020 09.
Article in English | MEDLINE | ID: mdl-32413400

ABSTRACT

BACKGROUND: Copy number variations (CNV) are regional deviations from the normal autosomal bi-allelic DNA content. While germline CNVs are a major contributor to genomic syndromes and inherited diseases, the majority of cancers accumulate extensive "somatic" CNV (sCNV or CNA) during the process of oncogenetic transformation and progression. While specific sCNV have closely been associated with tumorigenesis, intriguingly many neoplasias exhibit recurrent sCNV patterns beyond the involvement of a few cancer driver genes. Currently, CNV profiles of tumor samples are generated using genomic micro-arrays or high-throughput DNA sequencing. Regardless of the underlying technology, genomic copy number data is derived from the relative assessment and integration of multiple signals, with the data generation process being prone to contamination from several sources. Estimated copy number values have no absolute or strictly linear correlation to their corresponding DNA levels, and the extent of deviation differs between sample profiles, which poses a great challenge for data integration and comparison in large scale genome analysis. RESULTS: In this study, we present a novel method named "Minimum Error Calibration and Normalization for Copy Numbers Analysis" (Mecan4CNA). It only requires CNV segmentation files as input, is platform independent, and has a high performance with limited hardware requirements. For a given multi-sample copy number dataset, Mecan4CNA can batch-normalize all samples to the corresponding true copy number levels of the main tumor clones. Experiments of Mecan4CNA on simulated data showed an overall accuracy of 93% and 91% in determining the normal level and single copy alteration (i.e. duplication or loss of one allele), respectively. Comparison of estimated normal levels and single copy alterations with existing methods and karyotyping data on the NCI-60 tumor cell line produced coherent results. To estimate the method's impact on downstream analyses, we performed GISTIC analyses on the original and Mecan4CNA normalized data from The Cancer Genome Atlas (TCGA) where the normalized data showed prominent improvements of both sensitivity and specificity in detecting focal regions. CONCLUSIONS: Mecan4CNA provides an advanced method for CNA data normalization, especially in meta-analyses involving large profile numbers and heterogeneous source data quality. With its informative output and visualization options, Mecan4CNA also can improve the interpretation of individual CNA profiles. Mecan4CNA is freely available as a Python package and through its code repository on GitHub.


Subject(s)
DNA Copy Number Variations , Genomics/methods , Cell Line, Tumor , Humans , Software
19.
Oncology ; 98(6): 329-331, 2020.
Article in English | MEDLINE | ID: mdl-32408309

ABSTRACT

Oncology has undergone rapid progress, with emerging developments in areas including cancer stem cells, molecularly targeted therapies, genomic analyses, and individually tailored immunotherapy. These advances have expanded the tools available in the fight against cancer. Some of these have seen broad media coverage resulting in justified public attention. However, these achievements have only been possible due to rapid developments in the expanding field of biomedical informatics and information technology (IT). Artificial intelligence, radiomics, electronic health records, and electronic patient-reported outcome measures (ePROMS) are only a few of the developments enabling further progress in oncology. The promising impact of IT in oncology will only become reality through a multidisciplinary approach to the complex challenges ahead.


Subject(s)
Medical Oncology/methods , Neoplasms/immunology , Neoplasms/therapy , Artificial Intelligence , Communication , Humans , Immunotherapy/methods , Patient Reported Outcome Measures
20.
Nat Genet ; 52(4): 448-457, 2020 04.
Article in English | MEDLINE | ID: mdl-32246132

ABSTRACT

Precision oncology relies on accurate discovery and interpretation of genomic variants, enabling individualized diagnosis, prognosis and therapy selection. We found that six prominent somatic cancer variant knowledgebases were highly disparate in content, structure and supporting primary literature, impeding consensus when evaluating variants and their relevance in a clinical setting. We developed a framework for harmonizing variant interpretations to produce a meta-knowledgebase of 12,856 aggregate interpretations. We demonstrated large gains in overlap between resources across variants, diseases and drugs as a result of this harmonization. We subsequently demonstrated improved matching between a patient cohort and harmonized interpretations of potential clinical significance, observing an increase from an average of 33% per individual knowledgebase to 57% in aggregate. Our analyses illuminate the need for open, interoperable sharing of variant interpretation data. We also provide a freely available web interface (search.cancervariants.org) for exploring the harmonized interpretations from these six knowledgebases.


Subject(s)
Genetic Variation/genetics , Neoplasms/genetics , Databases, Genetic , Diploidy , Genomics/methods , Humans , Knowledge Bases , Precision Medicine/methods