Results 1 - 10 of 10
1.
Cancer Res ; 84(9): 1396-1403, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38488504

ABSTRACT

The NCI's Cloud Resources (CR) are the analytical components of the Cancer Research Data Commons (CRDC) ecosystem. This review describes how the three CRs (Broad Institute FireCloud, Institute for Systems Biology Cancer Gateway in the Cloud, and Seven Bridges Cancer Genomics Cloud) provide access and availability to large, cloud-hosted, multimodal cancer datasets, as well as offer tools and workspaces for performing data analysis where the data resides, without download or storage. In addition, users can upload their own data and tools into their workspaces, allowing researchers to create custom analysis workflows and integrate CRDC-hosted data with their own. See related articles by Brady et al., p. 1384, Wang et al., p. 1388, and Kim et al., p. 1404.


Subject(s)
Cloud Computing , National Cancer Institute (U.S.) , Neoplasms , Humans , Neoplasms/genetics , United States , Biomedical Research , Genomics/methods , Computational Biology/methods
2.
Cancer Res ; 84(9): 1404-1409, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38488510

ABSTRACT

More than ever, scientific progress in cancer research hinges on our ability to combine datasets and extract meaningful interpretations to better understand diseases and ultimately inform the development of better treatments and diagnostic tools. To enable the successful sharing and use of big data, the NCI developed the Cancer Research Data Commons (CRDC), providing access to a large, comprehensive, and expanding collection of cancer data. The CRDC is a cloud-based data science infrastructure that eliminates the need for researchers to download and store large-scale datasets by allowing them to perform analysis where data reside. Over the past 10 years, the CRDC has made significant progress in providing access to data and tools along with training and outreach to support the cancer research community. In this review, we provide an overview of the history and the impact of the CRDC to date, lessons learned, and future plans to further promote data sharing, accessibility, interoperability, and reuse. See related articles by Brady et al., p. 1384, Wang et al., p. 1388, and Pot et al., p. 1396.


Subject(s)
Information Dissemination , National Cancer Institute (U.S.) , Neoplasms , Humans , United States , Neoplasms/therapy , Information Dissemination/methods , Biomedical Research/trends , Databases, Factual , Big Data
3.
Radiographics ; 43(12): e230180, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37999984

ABSTRACT

The remarkable advances of artificial intelligence (AI) technology are revolutionizing established approaches to the acquisition, interpretation, and analysis of biomedical imaging data. Development, validation, and continuous refinement of AI tools requires easy access to large high-quality annotated datasets, which are both representative and diverse. The National Cancer Institute (NCI) Imaging Data Commons (IDC) hosts large and diverse publicly available cancer image data collections. By harmonizing all data based on industry standards and colocalizing it with analysis and exploration resources, the IDC aims to facilitate the development, validation, and clinical translation of AI tools and address the well-documented challenges of establishing reproducible and transparent AI processing pipelines. Balanced use of established commercial products with open-source solutions, interconnected by standard interfaces, provides value and performance, while preserving sufficient agility to address the evolving needs of the research community. Emphasis on the development of tools, use cases to demonstrate the utility of uniform data representation, and cloud-based analysis aim to ease adoption and help define best practices. Integration with other data in the broader NCI Cancer Research Data Commons infrastructure opens opportunities for multiomics studies incorporating imaging data to further empower the research community to accelerate breakthroughs in cancer detection, diagnosis, and treatment. Published under a CC BY 4.0 license.


Subject(s)
Artificial Intelligence , Neoplasms , United States , Humans , National Cancer Institute (U.S.) , Reproducibility of Results , Diagnostic Imaging , Multiomics , Neoplasms/diagnostic imaging
4.
Genes Chromosomes Cancer ; 62(8): 441-448, 2023 Aug.
Article in English | MEDLINE | ID: mdl-36695636

ABSTRACT

Cytogenetic analysis provides important information on the genetic mechanisms of cancer. The Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer (Mitelman DB) is the largest catalog of acquired chromosome aberrations, presently comprising >70 000 cases across multiple cancer types. Although this resource has enabled the identification of chromosome abnormalities leading to specific cancers and cancer mechanisms, a large-scale, systematic analysis of these aberrations and their downstream implications has been difficult due to the lack of a standard, automated mapping from aberrations to genomic coordinates. We previously introduced CytoConverter as a tool that automates such conversions. CytoConverter has now been updated with improved interpretation of karyotypes and has been integrated with the Mitelman DB, providing a comprehensive mapping of the 70 000+ cases to genomic coordinates, as well as visualization of the frequencies of chromosomal gains and losses. Importantly, all CytoConverter-generated genomic coordinates are publicly available in Google BigQuery, a cloud-based data warehouse, facilitating data exploration and integration with other datasets hosted by the Institute for Systems Biology Cancer Gateway in the Cloud (ISB-CGC) Resource. We demonstrate the use of BigQuery for integrative analysis of Mitelman DB with other cancer datasets, including a comparison of the frequency of imbalances identified in Mitelman DB cases with those found in The Cancer Genome Atlas (TCGA) copy number datasets. This solution provides opportunities to leverage the power of cloud computing for low-cost, scalable, and integrated analysis of chromosome aberrations and gene fusions in cancer.


Subject(s)
Cloud Computing , Neoplasms , Humans , Chromosome Aberrations , Karyotyping , Neoplasms/genetics , Gene Fusion
5.
F1000Res ; 11: 493, 2022.
Article in English | MEDLINE | ID: mdl-36761837

ABSTRACT

Synthetic lethal interactions (SLIs), genetic interactions in which the simultaneous inactivation of two genes leads to a lethal phenotype, are promising targets for therapeutic intervention in cancer, as exemplified by the recent success of PARP inhibitors in treating BRCA1/2-deficient tumors. We present SL-Cloud, a new component of the Institute for Systems Biology Cancer Gateway in the Cloud (ISB-CGC), that provides an integrated framework of cloud-hosted data resources and curated workflows to enable facile prediction of SLIs. This resource addresses two main challenges related to SLI inference: the need to wrangle and preprocess large multi-omic datasets and the availability of multiple comparable prediction approaches. SL-Cloud enables customizable computational inference of SLIs and testing of prediction approaches across multiple datasets. We anticipate that cancer researchers will find utility in this tool for discovery of SLIs to support further investigation into potential drug targets for anticancer therapies.


Subject(s)
Cloud Computing , Neoplasms , Humans , Neoplasms/genetics , Systems Biology , Multiomics
6.
Onco (Basel) ; 2(2): 129-144, 2022 Jun.
Article in English | MEDLINE | ID: mdl-37841494

ABSTRACT

Whole genome sequencing (WGS) has helped to revolutionize biology, but the computational challenge remains for extracting valuable inferences from this information. Here, we present the cancer-associated variants from the Cancer Genome Atlas (TCGA) WGS dataset. This set of data will allow cancer researchers to further expand their analysis beyond the exomic regions of the genome to the entire genome. A total of 1342 WGS alignments available from the consortium were processed with VarScan2 and deposited to the NCI Cancer Cloud. The sample set covers 18 different cancers and reveals 157,313,519 pooled (non-unique) cancer-associated single-nucleotide variations (SNVs) across all samples. There was an average of 117,223 SNVs per sample, with a range from 1111 to 775,470 and a standard deviation of 163,273. The dataset was incorporated into BigQuery, which allows for fast access and cross-mapping, which will allow researchers to enrich their current studies with a plethora of newly available genomic data.

7.
Cancer Res ; 81(16): 4188-4193, 2021 Aug 15.
Article in English | MEDLINE | ID: mdl-34185678

ABSTRACT

The National Cancer Institute (NCI) Cancer Research Data Commons (CRDC) aims to establish a national cloud-based data science infrastructure. Imaging Data Commons (IDC) is a new component of CRDC supported by the Cancer Moonshot. The goal of IDC is to enable a broad spectrum of cancer researchers, with and without imaging expertise, to easily access and explore the value of deidentified imaging data and to support integrated analyses with nonimaging data. We achieve this goal by colocating versatile imaging collections with cloud-based computing resources and data exploration, visualization, and analysis tools. The IDC pilot was released in October 2020 and is being continuously populated with radiology and histopathology collections. IDC provides access to curated imaging collections, accompanied by documentation, a user forum, and a growing number of analysis use cases that aim to demonstrate the value of a data commons framework applied to cancer imaging research. SIGNIFICANCE: This study introduces NCI Imaging Data Commons, a new repository of the NCI Cancer Research Data Commons, which will support cancer imaging research on the cloud.


Subject(s)
Diagnostic Imaging/methods , National Cancer Institute (U.S.) , Neoplasms/diagnostic imaging , Neoplasms/genetics , Biomedical Research/trends , Cloud Computing , Computational Biology/methods , Computer Graphics , Computer Security , Data Interpretation, Statistical , Databases, Factual , Diagnostic Imaging/standards , Humans , Image Processing, Computer-Assisted , Pilot Projects , Programming Languages , Radiology/methods , Radiology/standards , Reproducibility of Results , Software , United States , User-Computer Interface
8.
Cancer Res ; 77(21): e7-e10, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-29092928

ABSTRACT

The ISB Cancer Genomics Cloud (ISB-CGC) is one of three pilot projects funded by the National Cancer Institute to explore new approaches to computing on large cancer datasets in a cloud environment. With a focus on Data as a Service, the ISB-CGC offers multiple avenues for accessing and analyzing The Cancer Genome Atlas, TARGET, and other important references such as GENCODE and COSMIC using the Google Cloud Platform. The open approach allows researchers to choose approaches best suited to the task at hand: from analyzing terabytes of data using complex workflows to developing new analysis methods in common languages such as Python, R, and SQL; to using an interactive web application to create synthetic patient cohorts and to explore the wealth of available genomic data. Links to resources and documentation can be found at www.isb-cgc.org. Cancer Res; 77(21); e7-10. ©2017 AACR.


Subject(s)
Cloud Computing , Computational Biology , Genomics , Neoplasms/genetics , Datasets as Topic , Genome, Human , Humans , Internet , National Cancer Institute (U.S.) , Research/trends , Software , United States
9.
Science ; 345(6201): 1181-4, 2014 Sep 05.
Article in English | MEDLINE | ID: mdl-25190796

ABSTRACT

Coffee is a valuable beverage crop due to its characteristic flavor, aroma, and the stimulating effects of caffeine. We generated a high-quality draft genome of the species Coffea canephora, which displays a conserved chromosomal gene order among asterid angiosperms. Although it shows no sign of the whole-genome triplication identified in Solanaceae species such as tomato, the genome includes several species-specific gene family expansions, among them N-methyltransferases (NMTs) involved in caffeine production, defense-related genes, and alkaloid and flavonoid enzymes involved in secondary compound synthesis. Comparative analyses of caffeine NMTs demonstrate that these genes expanded through sequential tandem duplications independently of genes from cacao and tea, suggesting that caffeine in eudicots is of polyphyletic origin.


Subject(s)
Caffeine/genetics , Coffea/genetics , Evolution, Molecular , Genome, Plant , Methyltransferases/physiology , Plant Proteins/physiology , Caffeine/biosynthesis , Coffea/classification , Methyltransferases/genetics , Phylogeny , Plant Proteins/genetics
10.
Plant Physiol ; 154(3): 1053-66, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20864545

ABSTRACT

Polyploidization constitutes a common mode of evolution in flowering plants. This event provides the raw material for the divergence of function in homeologous genes, leading to phenotypic novelty that can contribute to the success of polyploids in nature or their selection for use in agriculture. Mounting evidence underlined the existence of homeologous expression biases in polyploid genomes; however, strategies to analyze such transcriptome regulation remained scarce. Important factors regarding homeologous expression biases remain to be explored, such as whether this phenomenon influences specific genes, how paralogs are affected by genome doubling, and what is the importance of the variability of homeologous expression bias to genotype differences. This study reports the expressed sequence tag assembly of the allopolyploid Coffea arabica and one of its direct ancestors, Coffea canephora. The assembly was used for the discovery of single nucleotide polymorphisms through the identification of high-quality discrepancies in overlapped expressed sequence tags and for gene expression information indirectly estimated by the transcript redundancy. Sequence diversity profiles were evaluated within C. arabica (Ca) and C. canephora (Cc) and used to deduce the transcript contribution of the Coffea eugenioides (Ce) ancestor. The assignment of the C. arabica haplotypes to the C. canephora (CaCc) or C. eugenioides (CaCe) ancestral genomes allowed us to analyze gene expression contributions of each subgenome in C. arabica. In silico data were validated by the quantitative polymerase chain reaction and allele-specific combination TaqMAMA-based method. The presence of differential expression of C. arabica homeologous genes and its implications in coffee gene expression, ontology, and physiology are discussed.


Subject(s)
Coffea/genetics , Expressed Sequence Tags , Genome, Plant , Polymorphism, Single Nucleotide , DNA, Plant/genetics , Data Mining , Gene Expression Regulation, Plant , Gene Frequency , Haplotypes , Sequence Analysis, DNA , Tetraploidy