Results 1 - 20 of 67
1.
Bioinform Adv ; 4(1): vbae045, 2024.
Article in English | MEDLINE | ID: mdl-38560553

ABSTRACT

Motivation: With the proliferation of research means and computational methodologies, published biomedical literature is growing exponentially in numbers and volume. Cancer cell lines are frequently used models in biological and medical research that are currently applied for a wide range of purposes, from studies of cellular mechanisms to drug development, which has led to a wealth of related data and publications. Sifting through large quantities of text to gather relevant information on cell lines of interest is tedious and extremely slow when performed by humans. Hence, novel computational information extraction and correlation mechanisms are required to boost meaningful knowledge extraction. Results: In this work, we present the design, implementation, and application of a novel data extraction and exploration system. This system extracts deep semantic relations between textual entities from scientific literature to enrich existing structured clinical data concerning cancer cell lines. We introduce a new public data exploration portal, which enables automatic linking of genomic copy number variant plots with ranked, related entities such as affected genes. Each relation is accompanied by literature-derived evidence, allowing for deep, yet rapid, literature search, using existing structured data as a springboard. Availability and implementation: Our system is publicly available on the web at https://cancercelllines.org.

2.
Database (Oxford) ; 2024, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38687868

ABSTRACT

Cancer cell lines are an important component in biological and medical research, enabling studies of cellular mechanisms as well as the development and testing of pharmaceuticals. Genomic alterations in cancer cell lines are widely studied as models for oncogenetic events and are represented in a wide range of primary resources. We have created a comprehensive, curated knowledge resource (cancercelllines.org) with the aim of enabling easy access to genomic profiling data in cancer cell lines, curated from a variety of resources and integrating both copy number and single nucleotide variant data. We have gathered over 5600 copy number profiles as well as single nucleotide variant annotations for 16 000 cell lines and provide these data with mappings to the GRCh38 reference genome. Both genomic variations and associated curated metadata can be queried through the GA4GH Beacon v2 Application Programming Interface (API) and a graphical user interface with extensive data retrieval enabled using GA4GH data schemas under a permissive licensing scheme. Database URL: https://cancercelllines.org.
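
A request to a Beacon v2 API of this kind might be assembled as in the sketch below. The base path `https://cancercelllines.org/beacon` is an assumption, not a documented endpoint, and the coordinates are made up; the parameter names follow the generic Beacon v2 genomic range query conventions. The resource's own API documentation remains authoritative.

```python
from urllib.parse import urlencode

# Hypothetical base URL; consult the resource's API documentation for the real one.
BEACON_BASE = "https://cancercelllines.org/beacon"


def build_range_query(reference_name, start, end, variant_type, assembly_id="GRCh38"):
    """Return a GET URL for a Beacon v2 style genomic range query."""
    params = {
        "assemblyId": assembly_id,
        "referenceName": reference_name,
        "start": start,
        "end": end,
        "variantType": variant_type,
    }
    return f"{BEACON_BASE}/g_variants?{urlencode(params)}"


# Illustrative coordinates only: deletions within a 1 Mb window on chromosome 9.
url = build_range_query("9", 21000000, 22000000, "DEL")
print(url)
```

The sketch only builds the URL and sends nothing over the network, so it can be inspected before being passed to an HTTP client.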


Subjects
Databases, Genetic , Genomics , Neoplasms , Humans , Cell Line, Tumor , Neoplasms/genetics , Genomics/methods , DNA Copy Number Variations/genetics , User-Computer Interface , Polymorphism, Single Nucleotide
4.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38300514

ABSTRACT

Somatic copy number alterations (SCNAs) are a predominant type of oncogenomic alterations that affect a large proportion of the genome in the majority of cancer samples. Current technologies allow high-throughput measurement of such copy number aberrations, generating results consisting of frequently large sets of SCNA segments. However, the automated annotation and integration of such data are particularly challenging because the measured signals reflect biased, relative copy number ratios. In this study, we introduce labelSeg, an algorithm designed for rapid and accurate annotation of CNA segments, with the aim of enhancing the interpretation of tumor SCNA profiles. Leveraging density-based clustering and exploiting the length-amplitude relationships of SCNA, our algorithm proficiently identifies distinct relative copy number states from individual segment profiles. Its compatibility with most CNA measurement platforms makes it suitable for large-scale integrative data analysis. We confirmed its performance on both simulated and sample-derived data from The Cancer Genome Atlas reference dataset, and we demonstrated its utility in integrating heterogeneous segment profiles from different data sources and measurement platforms. Our comparative and integrative analysis revealed common SCNA patterns in cancer and protein-coding genes with a strong correlation between SCNA and messenger RNA expression, promoting the investigation into the role of SCNA in cancer development.


Subjects
DNA Copy Number Variations , Neoplasms , Humans , Neoplasms/genetics , Algorithms , Cluster Analysis , Data Analysis
5.
Sci Rep ; 14(1): 3331, 2024 02 09.
Article in English | MEDLINE | ID: mdl-38336885

ABSTRACT

Short tandem repeat (STR) mutations are prevalent in colorectal cancer (CRC), especially in tumours with the microsatellite instability (MSI) phenotype. While STR length variations are known to regulate gene expression under physiological conditions, the functional impact of STR mutations in CRC remains unclear. Here, we integrate STR mutation data with clinical information and gene expression data to study the gene regulatory effects of STR mutations in CRC. We confirm that STR mutability in CRC highly depends on the MSI status, repeat unit size, and repeat length. Furthermore, we present a set of 1244 putative expression STRs (eSTRs) for which the STR length is associated with gene expression levels in CRC tumours. The length of 73 eSTRs is associated with expression levels of cancer-related genes, nine of which are CRC-specific genes. We show that linear models describing eSTR-gene expression relationships allow for predictions of gene expression changes in response to eSTR mutations. Moreover, we found an increased mutability of eSTRs in MSI tumours. Our evidence of gene regulatory roles for eSTRs in CRC highlights a mostly overlooked way through which tumours may modulate their phenotypes. Future extensions of these findings could uncover new STR-based targets in the treatment of cancer.
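
The idea of using a linear model to predict expression changes from an eSTR mutation can be illustrated with an ordinary least-squares fit. This is a minimal sketch on made-up numbers, not the authors' pipeline; the function names `fit_line` and `predict_change` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_change(slope, old_length, new_length):
    """Predicted expression change when the repeat length changes."""
    return slope * (new_length - old_length)


# Invented data: STR length (repeat units) against expression level.
lengths = [10, 11, 12, 13, 14]
expression = [5.0, 5.5, 6.0, 6.5, 7.0]

slope, intercept = fit_line(lengths, expression)
delta = predict_change(slope, old_length=12, new_length=10)
print(slope, intercept, delta)
```

Under such a model, a contraction by two repeat units shifts the predicted expression by twice the slope, in the opposite direction for a positive slope.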


Subjects
Colorectal Neoplasms , Microsatellite Repeats , Humans , Microsatellite Repeats/genetics , Mutation , Microsatellite Instability , Colorectal Neoplasms/pathology , Gene Expression
6.
PLoS One ; 18(5): e0285433, 2023.
Article in English | MEDLINE | ID: mdl-37196000

ABSTRACT

The Global Alliance for Genomics and Health (GA4GH) is a standards-setting organization that is developing a suite of coordinated standards for genomics. The GA4GH Phenopacket Schema is a standard for sharing disease and phenotype information that characterizes an individual person or biosample. The Phenopacket Schema is flexible and can represent clinical data for any kind of human disease including rare disease, complex disease, and cancer. It also allows consortia or databases to apply additional constraints to ensure uniform data collection for specific goals. We present phenopacket-tools, an open-source Java library and command-line application for construction, conversion, and validation of phenopackets. Phenopacket-tools simplifies construction of phenopackets by providing concise builders, programmatic shortcuts, and predefined building blocks (ontology classes) for concepts such as anatomical organs, age of onset, biospecimen type, and clinical modifiers. Phenopacket-tools can be used to validate the syntax and semantics of phenopackets as well as to assess adherence to additional user-defined requirements. The documentation includes examples showing how to use the Java library and the command-line tool to create and validate phenopackets. We demonstrate how to create, convert, and validate phenopackets using the library or the command-line application. Source code, API documentation, comprehensive user guide and a tutorial can be found at https://github.com/phenopackets/phenopacket-tools. The library can be installed from the public Maven Central artifact repository and the application is available as a standalone archive. The phenopacket-tools library helps developers implement and standardize the collection and exchange of phenotypic and other clinical data for use in phenotype-driven genomic diagnostics, translational research, and precision medicine applications.
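
To give a sense of what constructing and checking a phenopacket involves, the sketch below builds a small phenopacket-like structure as a plain Python dictionary and tests it for required top-level fields. It does not use the phenopacket-tools Java API; the helper `missing_fields`, the example identifiers, and the reduced required-field list are assumptions made for illustration, and real validation should rely on the library and the schema itself.

```python
# Assumed minimal set of required top-level fields for this sketch only.
REQUIRED_TOP_LEVEL = ("id", "metaData")


def missing_fields(phenopacket):
    """Return the required top-level fields absent from the given dict."""
    return [key for key in REQUIRED_TOP_LEVEL if key not in phenopacket]


# Invented example record with one phenotypic feature as an ontology class.
packet = {
    "id": "example-1",
    "subject": {"id": "patient-1", "sex": "FEMALE"},
    "phenotypicFeatures": [
        {"type": {"id": "HP:0001250", "label": "Seizure"}},
    ],
    "metaData": {
        "created": "2023-01-01T00:00:00Z",
        "createdBy": "example-curator",
        "phenopacketSchemaVersion": "2.0",
    },
}

print(missing_fields(packet))
print(missing_fields({"id": "incomplete"}))
```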


Subjects
Neoplasms , Software , Humans , Genomics , Databases, Factual , Gene Library
7.
Adv Genet (Hoboken) ; 4(1): 2200016, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36910590

ABSTRACT

The Global Alliance for Genomics and Health (GA4GH) is developing a suite of coordinated standards for genomics for healthcare. The Phenopacket is a new GA4GH standard for sharing disease and phenotype information that characterizes an individual person, linking that individual to detailed phenotypic descriptions, genetic information, diagnoses, and treatments. A detailed example is presented that illustrates how to use the schema to represent the clinical course of a patient with retinoblastoma, including demographic information, the clinical diagnosis, phenotypic features and clinical measurements, an examination of the extirpated tumor, therapies, and the results of genomic analysis. The Phenopacket Schema, together with other GA4GH data and technical standards, will enable data exchange and provide a foundation for the computational analysis of disease and phenotype information to improve our ability to diagnose and conduct research on all types of disorders, including cancer and rare diseases.

9.
Hum Mutat ; 43(6): 791-799, 2022 06.
Article in English | MEDLINE | ID: mdl-35297548

ABSTRACT

Beacon is a basic data discovery protocol issued by the Global Alliance for Genomics and Health (GA4GH). The main goal addressed by version 1 of the Beacon protocol was to test the feasibility of broadly sharing human genomic data, through providing simple "yes" or "no" responses to queries about the presence of a given variant in datasets hosted by Beacon providers. The popularity of this concept has fostered the design of a version 2, that better serves real-world requirements and addresses the needs of clinical genomics research and healthcare, as assessed by several contributing projects and organizations. Particularly, rare disease genetics and cancer research will benefit from new case level and genomic variant level requests and the enabling of richer phenotype and clinical queries as well as support for fuzzy searches. Beacon is designed as a "lingua franca" to bridge data collections hosted in software solutions with different and rich interfaces. Beacon version 2 works alongside popular standards like Phenopackets, OMOP, or FHIR, allowing implementing consortia to return matches in beacon responses and provide a handover to their preferred data exchange format. The protocol is being explored by other research domains and is being tested in several international projects.
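
The "yes" or "no" response model of Beacon version 1 can be illustrated with a toy in-memory lookup. This is only a sketch of the concept; the class `ToyBeacon`, its method names, and the variant coordinates are hypothetical and do not reflect the protocol's actual request or response format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    reference_name: str
    start: int
    reference_bases: str
    alternate_bases: str


class ToyBeacon:
    """Answers only whether a given variant is present in the hosted dataset."""

    def __init__(self, variants):
        self._variants = set(variants)

    def exists(self, variant):
        # No sample-level or clinical data are revealed, only presence.
        return variant in self._variants


# Invented coordinates, for illustration only.
beacon = ToyBeacon([Variant("17", 7674220, "C", "T")])

print(beacon.exists(Variant("17", 7674220, "C", "T")))
print(beacon.exists(Variant("17", 7674220, "C", "A")))
```

Version 2, as the abstract describes, extends this model with case-level and variant-level requests and richer phenotype and clinical queries.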


Subjects
Genomics , Information Dissemination , Humans , Information Dissemination/methods , Phenotype , Rare Diseases , Software
10.
Front Genet ; 13: 1017657, 2022.
Article in English | MEDLINE | ID: mdl-36726722

ABSTRACT

Genome variation is the direct cause of cancer and driver of its clonal evolution. While the impact of many point mutations can be evaluated through their modification of individual genomic elements, even a single copy number aberration (CNA) may encompass hundreds of genes and therefore pose challenges to untangle potentially complex functional effects. However, consistent, recurring and disease-specific patterns in the genome-wide CNA landscape imply that particular CNA may promote cancer-type-specific characteristics. Discerning essential cancer-promoting alterations from the inherent co-dependency in CNA would improve the understanding of mechanisms of CNA and provide new insights into cancer biology and potential therapeutic targets. Here we implement a model using segmental breakpoints to discover non-random gene coverage by copy number deletion (CND). With a diverse set of cancer types from multiple resources, this model identified common and cancer-type-specific oncogenes and tumor suppressor genes as well as cancer-promoting functional pathways. Confirmed by differential expression analysis of data from corresponding cancer types, the results show that for most cancer types, despite dissimilarity of their CND landscapes, similar canonical pathways are affected. In 25 analyses of 17 cancer types, we have identified 19 to 169 significant genes by copy deletion, including RB1, PTEN and CDKN2A as the most significantly deleted genes among all cancer types. We have also shown a shared dependence on core pathways for cancer progression in different cancers as well as cancer type separation by genome-wide significance scores. While this work provides a reference for gene specific significance in many cancers, it chiefly contributes a general framework to derive genome-wide significance and molecular insights in CND profiles with a potential for the analysis of rare cancer types as well as non-coding regions.

11.
Database (Oxford) ; 2021, 2021 07 17.
Article in English | MEDLINE | ID: mdl-34272855

ABSTRACT

In cancer, copy number aberrations (CNAs) represent a type of nearly ubiquitous and frequently extensive structural genome variations. To disentangle the molecular mechanisms underlying tumorigenesis as well as identify and characterize molecular subtypes, the comparative and meta-analysis of large genomic variant collections can be of immense importance. Over the last decades, cancer genomic profiling projects have resulted in a large amount of somatic genome variation profiles, however segregated in a multitude of individual studies and datasets. The Progenetix project, initiated in 2001, curates individual cancer CNA profiles and associated metadata from published oncogenomic studies and data repositories with the aim to empower integrative analyses spanning all different cancer biologies. During the last few years, the fields of genomics and cancer research have seen significant advancement in terms of molecular genetics technology, disease concepts, data standard harmonization as well as data availability, in an increasingly structured and systematic manner. For the Progenetix resource, continuous data integration, curation and maintenance have resulted in the most comprehensive representation of cancer genome CNA profiling data with 138 663 (including 115 357 tumor) copy number variation (CNV) profiles. In this article, we report a 4.5-fold increase in sample number since 2013, improvements in data quality, ontology representation with a CNV landscape summary over 51 distinctive National Cancer Institute Thesaurus cancer terms as well as updates in database schemas, and data access including new web front-end and programmatic data access. Database URL: progenetix.org.


Subjects
DNA Copy Number Variations , Neoplasms , DNA Copy Number Variations/genetics , Genome , Genomics , Humans , Neoplasms/genetics
12.
Front Genet ; 12: 654887, 2021.
Article in English | MEDLINE | ID: mdl-34054918

ABSTRACT

Copy number aberrations (CNA) are one of the most important classes of genomic mutations related to oncogenetic effects. In the past three decades, a vast amount of CNA data has been generated by molecular-cytogenetic and genome sequencing based methods. While this data has been instrumental in the identification of cancer-related genes and promoted research into the relation between CNA and histo-pathologically defined cancer types, the heterogeneity of source data and derived CNV profiles pose great challenges for data integration and comparative analysis. Furthermore, a majority of existing studies have been focused on the association of CNA to pre-selected "driver" genes with limited application to rare drivers and other genomic elements. In this study, we developed a bioinformatics pipeline to integrate a collection of 44,988 high-quality CNA profiles of high diversity. Using a hybrid model of neural networks and attention algorithm, we generated the CNA signatures of 31 cancer subtypes, depicting the uniqueness of their respective CNA landscapes. Finally, we constructed a multi-label classifier to identify the cancer type and the organ of origin from copy number profiling data. The investigation of the signatures suggested common patterns, not only of physiologically related cancer types but also of clinico-pathologically distant cancer types such as different cancers originating from the neural crest. Further experiments of classification models confirmed the effectiveness of the signatures in distinguishing different cancer types and demonstrated their potential in tumor classification.

13.
Cell Genom ; 1(2), 2021 Nov 10.
Article in English | MEDLINE | ID: mdl-35128509

ABSTRACT

We promote a shared vision and guide for how and when to federate genomic and health-related data sharing, enabling connections and insights across independent, secure databases. The GA4GH encourages a federated approach wherein data providers have the mandate and resources to share, but where data cannot move for legal or technical reasons. We recommend a federated approach to connect national genomics initiatives into a global network and precision medicine resource.

14.
Cell Genom ; 1(2), 2021 Nov 10.
Article in English | MEDLINE | ID: mdl-35311178

ABSTRACT

Maximizing the personal, public, research, and clinical value of genomic information will require the reliable exchange of genetic variation data. We report here the Variation Representation Specification (VRS, pronounced "verse"), an extensible framework for the computable representation of variation that complements contemporary human-readable and flat file standards for genomic variation representation. VRS provides semantically precise representations of variation and leverages this design to enable federated identification of biomolecular variation with globally consistent and unique computed identifiers. The VRS framework includes a terminology and information model, machine-readable schema, data sharing conventions, and a reference implementation, each of which is intended to be broadly useful and freely available for community use. VRS was developed by a partnership among national information resource providers, public initiatives, and diagnostic testing laboratories under the auspices of the Global Alliance for Genomics and Health (GA4GH).
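
Computed identifiers of this kind rest on a truncated digest: in GA4GH specifications, `sha512t24u` takes the SHA-512 hash of a byte string, keeps the first 24 bytes, and encodes them as base64url. The sketch below implements that digest; the `toy_identifier` helper and its JSON serialization are simplified stand-ins, not the VRS serialization rules, and the example object is invented.

```python
import base64
import hashlib
import json


def sha512t24u(blob: bytes) -> str:
    """SHA-512 digest, truncated to 24 bytes, base64url-encoded."""
    digest = hashlib.sha512(blob).digest()
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def toy_identifier(obj: dict) -> str:
    # Simplified stand-in for canonical serialization: sorted keys, compact JSON.
    blob = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return sha512t24u(blob)


# The same content yields the same identifier regardless of key order.
id_a = toy_identifier({"start": 100, "end": 101, "state": "T"})
id_b = toy_identifier({"state": "T", "end": 101, "start": 100})
print(id_a == id_b, len(id_a))
```

Because the identifier depends only on the serialized content, independent systems can arrive at the same identifier for the same variation without a central registry, which is what makes federated identification possible.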

15.
Cell Genom ; 1(2), 2021 Nov 10.
Article in English | MEDLINE | ID: mdl-35072136

ABSTRACT

The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. We describe the GA4GH organization, which is fueled by the development efforts of eight Work Streams and informed by the needs of 24 Driver Projects and other key stakeholders. We present the GA4GH suite of secure, interoperable technical standards and policy frameworks and review the current status of standards, their relevance to key domains of research and clinical care, and future plans of GA4GH. Broad international participation in building, adopting, and deploying GA4GH standards and frameworks will catalyze an unprecedented effort in data sharing that will be critical to advancing genomic medicine and ensuring that all populations can access its benefits.

16.
Front Oncol ; 10: 584095, 2020.
Article in English | MEDLINE | ID: mdl-33344238

ABSTRACT

Copy number aberrations (CNV/CNA) represent a major contribution to the somatic mutation landscapes in cancers, and their identification can lead to the discovery of oncogenetic targets as well as improved disease (sub-) classification. Diffuse large B cell lymphoma (DLBCL) is the most common lymphoma in Western countries and up to 40% of the affected individuals still succumb to the disease. DLBCL is a heterogeneous group of disorders, and what we call DLBCL today is not necessarily the same disease as a few years ago. This review focuses on types and frequencies of regional DNA CNVs in DLBCL, not otherwise specified, and in two particular conditions: the transformation from indolent lymphomas and DLBCL in individuals with immunodeficiency.

17.
Cell Rep ; 32(5): 107985, 2020 08 04.
Article in English | MEDLINE | ID: mdl-32755579

ABSTRACT

PARP inhibitors (PARPi) cause synthetic lethality in BRCA-deficient tumors. Whether specific vulnerabilities to PARPi exist beyond BRCA mutations and related defects in homology-directed repair (HDR) is not well understood. Here, we identify the ubiquitin E3 ligase TRIP12 as negative regulator of PARPi sensitivity. We show that TRIP12 controls steady-state PARP1 levels and limits PARPi-induced cytotoxic PARP1 trapping. Upon loss of TRIP12, elevated PARPi-induced PARP1 trapping causes increased DNA replication stress, DNA damage, cell cycle arrest, and cell death. Mechanistically, we demonstrate that TRIP12 binds PARP1 via a central PAR-binding WWE domain and, using its carboxy-terminal HECT domain, catalyzes polyubiquitylation of PARP1, triggering proteasomal degradation and preventing supra-physiological PARP1 accumulation. Further, in cohorts of breast and ovarian cancer patients, PARP1 abundance is negatively correlated with TRIP12 expression. We thus propose TRIP12 as regulator of PARP1 stability and PARPi-induced PARP trapping, with potential implications for PARPi sensitivity and resistance.


Subjects
Carrier Proteins/metabolism , Poly (ADP-Ribose) Polymerase-1/metabolism , Poly(ADP-ribose) Polymerase Inhibitors/pharmacology , Ubiquitin-Protein Ligases/metabolism , Amino Acid Sequence , Carrier Proteins/chemistry , Cell Line, Tumor , DNA Damage , Down-Regulation/drug effects , HEK293 Cells , Humans , Models, Biological , Mutagens/toxicity , Neoplasms/pathology , Poly ADP Ribosylation/drug effects , Poly Adenosine Diphosphate Ribose/metabolism , Proteasome Endopeptidase Complex/metabolism , Protein Binding/drug effects , Protein Domains , Protein Stability/drug effects , Proteolysis/drug effects , Signal Transduction/drug effects , Ubiquitin-Protein Ligases/chemistry , Ubiquitination/drug effects
18.
Genomics ; 112(5): 3331-3341, 2020 09.
Article in English | MEDLINE | ID: mdl-32413400

ABSTRACT

BACKGROUND: Copy number variations (CNV) are regional deviations from the normal autosomal bi-allelic DNA content. While germline CNVs are a major contributor to genomic syndromes and inherited diseases, the majority of cancers accumulate extensive "somatic" CNV (sCNV or CNA) during the process of oncogenetic transformation and progression. While specific sCNV have closely been associated with tumorigenesis, intriguingly many neoplasias exhibit recurrent sCNV patterns beyond the involvement of a few cancer driver genes. Currently, CNV profiles of tumor samples are generated using genomic micro-arrays or high-throughput DNA sequencing. Regardless of the underlying technology, genomic copy number data is derived from the relative assessment and integration of multiple signals, with the data generation process being prone to contamination from several sources. Estimated copy number values have no absolute or strictly linear correlation to their corresponding DNA levels, and the extent of deviation differs between sample profiles, which poses a great challenge for data integration and comparison in large scale genome analysis. RESULTS: In this study, we present a novel method named "Minimum Error Calibration and Normalization for Copy Numbers Analysis" (Mecan4CNA). It only requires CNV segmentation files as input, is platform independent, and has a high performance with limited hardware requirements. For a given multi-sample copy number dataset, Mecan4CNA can batch-normalize all samples to the corresponding true copy number levels of the main tumor clones. Experiments of Mecan4CNA on simulated data showed an overall accuracy of 93% and 91% in determining the normal level and single copy alteration (i.e. duplication or loss of one allele), respectively. Comparison of estimated normal levels and single copy alterations with existing methods and karyotyping data on the NCI-60 tumor cell line produced coherent results. To estimate the method's impact on downstream analyses, we performed GISTIC analyses on the original and Mecan4CNA normalized data from the Cancer Genome Atlas (TCGA) where the normalized data showed prominent improvements of both sensitivity and specificity in detecting focal regions. CONCLUSIONS: Mecan4CNA provides an advanced method for CNA data normalization, especially in meta-analyses involving large profile numbers and heterogeneous source data quality. With its informative output and visualization options, Mecan4CNA also can improve the interpretation of individual CNA profiles. Mecan4CNA is freely available as a Python package and through its code repository on GitHub.
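
Once a sample's normal level and single copy alteration level have been estimated, one simple way to use them is a linear rescaling of segment values onto a copy number scale. The sketch below shows that idea only; it is not the Mecan4CNA package's interface or its estimation procedure, the function `calibrate` is hypothetical, the segment values are invented, and the linear mapping is a simplifying assumption.

```python
def calibrate(values, normal_level, single_copy_level):
    """Map segment values to copy numbers, assuming a linear signal scale.

    normal_level is the signal for two copies; single_copy_level is the
    signal for a one-copy gain (three copies).
    """
    step = single_copy_level - normal_level
    if step == 0:
        raise ValueError("normal and single-copy levels must differ")
    return [2 + (value - normal_level) / step for value in values]


# Invented segment values for one sample.
segments = [0.1, 0.6, -0.4, 1.1]
copy_numbers = calibrate(segments, normal_level=0.1, single_copy_level=0.6)
print([round(cn, 6) for cn in copy_numbers])
```

Applying each sample's own pair of levels puts profiles from different platforms on a common scale, which is the purpose of normalization before integrative analysis.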


Subjects
DNA Copy Number Variations , Genomics/methods , Cell Line, Tumor , Humans , Software
19.
Oncology ; 98(6): 329-331, 2020.
Article in English | MEDLINE | ID: mdl-32408309

ABSTRACT

Oncology has undergone rapid progress, with emerging developments in areas including cancer stem cells, molecularly targeted therapies, genomic analyses, and individually tailored immunotherapy. These advances have expanded the tools available in the fight against cancer. Some of these have seen broad media coverage resulting in justified public attention. However, these achievements have only been possible due to rapid developments in the expanding field of biomedical informatics and information technology (IT). Artificial intelligence, radiomics, electronic health records, and electronic patient-reported outcome measures (ePROMS) are only a few of the developments enabling further progress in oncology. The promising impact of IT in oncology will only become reality through a multidisciplinary approach to the complex challenges ahead.


Subjects
Medical Oncology/methods , Neoplasms/immunology , Neoplasms/therapy , Artificial Intelligence , Communication , Humans , Immunotherapy/methods , Patient Reported Outcome Measures
20.
Database (Oxford) ; 2020, 2020 01 01.
Article in English | MEDLINE | ID: mdl-32239182

ABSTRACT

Cancers arise from the accumulation of somatic genome mutations, which can be influenced by inherited genomic variants and external factors such as environmental or lifestyle-related exposure. Due to the heterogeneity of cancers, precise information about the genomic composition of germline and malignant tissues has to be correlated with morphological, clinical and extrinsic features to advance medical knowledge and treatment options. With global differences in cancer frequencies and disease types, geographic data is of importance to understand the interplay between genetic ancestry and environmental influence in cancer incidence, progression and treatment outcome. In this study, we analyzed the current landscape of oncogenomic screening publications for geographic information content and quality, to address underrepresented study populations and thereby to fill prominent gaps in our understanding of interactions between somatic variations, population genetics and environmental factors in oncogenesis. We conclude that while the use of proxy-derived geographic annotations can be useful for coarse-grained associations, the study of geo-correlated factors in cancer causation and progression will benefit from standardized geographic provenance annotations. Additionally, publication-derived geographic provenance data allowed us to highlight stark inequality in the geographies of cancer genome profiling, with a near lack of sizable studies from Africa and other large regions.


Subjects
Databases, Genetic , Gene Expression Profiling/methods , Genome, Human/genetics , Genomics/methods , Neoplasms/genetics , Data Curation/methods , Data Mining/methods , Europe , Geography , Humans , Internet , Metadata , Publications/statistics & numerical data , United States