1.
Cancer Cell ; 37(4): 551-568.e14, 2020 04 13.
Article in English | MEDLINE | ID: mdl-32289277

ABSTRACT

The development of precision medicine approaches for diffuse large B cell lymphoma (DLBCL) is confounded by its pronounced genetic, phenotypic, and clinical heterogeneity. Recent multiplatform genomic studies revealed the existence of genetic subtypes of DLBCL using clustering methodologies. Here, we describe an algorithm that determines the probability that a patient's lymphoma belongs to one of seven genetic subtypes based on its genetic features. This classification reveals genetic similarities between these DLBCL subtypes and various indolent and extranodal lymphoma types, suggesting a shared pathogenesis. These genetic subtypes also have distinct gene expression profiles, immune microenvironments, and outcomes following immunochemotherapy. Functional analysis of genetic subtype models highlights distinct vulnerabilities to targeted therapy, supporting the use of this classification in precision medicine trials.


Subject(s)
Biomarkers, Tumor/genetics; Genetic Heterogeneity; Lymphoma, Large B-Cell, Diffuse/classification; Lymphoma, Large B-Cell, Diffuse/genetics; Molecular Targeted Therapy; Animals; Apoptosis; Cell Proliferation; Female; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Humans; Lymphoma, Large B-Cell, Diffuse/drug therapy; Lymphoma, Large B-Cell, Diffuse/pathology; Mice; Mice, Inbred NOD; Mice, SCID; Precision Medicine; Tumor Cells, Cultured; Tumor Microenvironment; Xenograft Model Antitumor Assays
2.
Methods Mol Biol ; 1956: 283-303, 2019.
Article in English | MEDLINE | ID: mdl-30779040

ABSTRACT

High-throughput mRNA sequencing (RNA-Seq) provides both qualitative and quantitative evaluation of the transcriptome. This method uses complementary DNA (cDNA) to generate several million short sequence reads that are aligned to a reference genome, allowing comprehensive characterization of the transcripts in a cell. RNA-Seq has a wide variety of applications, which has led to pervasive adoption of this method well beyond the genomics community and to its deployment as a standard part of the toolkit applied in the life sciences. This chapter describes a protocol to perform mRNA sequencing using the Illumina NextSeq or MiSeq platforms, presents sequencing data quality metrics, and outlines a bioinformatic pipeline for sequence alignment, digital gene expression, identification of gene fusions, detection of transcript isoforms, description and annotation of genetic variants, and de novo immunoglobulin gene assembly.


Subject(s)
Genomics/methods; Lymphoma, B-Cell/genetics; RNA, Messenger/genetics; Sequence Analysis, RNA/methods; Alternative Splicing; Gene Expression Profiling/methods; Gene Fusion; Genes, Immunoglobulin; High-Throughput Nucleotide Sequencing/methods; Humans; Mutation; Polymorphism, Single Nucleotide; Software; Transcriptome
3.
N Engl J Med ; 378(15): 1396-1407, 2018 04 12.
Article in English | MEDLINE | ID: mdl-29641966

ABSTRACT

BACKGROUND: Diffuse large B-cell lymphomas (DLBCLs) are phenotypically and genetically heterogeneous. Gene-expression profiling has identified subgroups of DLBCL (activated B-cell-like [ABC], germinal-center B-cell-like [GCB], and unclassified) according to cell of origin that are associated with a differential response to chemotherapy and targeted agents. We sought to extend these findings by identifying genetic subtypes of DLBCL based on shared genomic abnormalities and to uncover therapeutic vulnerabilities based on tumor genetics. METHODS: We studied 574 DLBCL biopsy samples using exome and transcriptome sequencing, array-based DNA copy-number analysis, and targeted amplicon resequencing of 372 genes to identify genes with recurrent aberrations. We developed and implemented an algorithm to discover genetic subtypes based on the co-occurrence of genetic alterations. RESULTS: We identified four prominent genetic subtypes in DLBCL, termed MCD (based on the co-occurrence of MYD88 L265P and CD79B mutations), BN2 (based on BCL6 fusions and NOTCH2 mutations), N1 (based on NOTCH1 mutations), and EZB (based on EZH2 mutations and BCL2 translocations). Genetic aberrations in multiple genes distinguished each genetic subtype from other DLBCLs. These subtypes differed phenotypically, as judged by differences in gene-expression signatures and responses to immunochemotherapy, with favorable survival in the BN2 and EZB subtypes and inferior outcomes in the MCD and N1 subtypes. Analysis of genetic pathways suggested that MCD and BN2 DLBCLs rely on "chronic active" B-cell receptor signaling that is amenable to therapeutic inhibition. CONCLUSIONS: We uncovered genetic subtypes of DLBCL with distinct genotypic, epigenetic, and clinical characteristics, providing a potential nosology for precision-medicine strategies in DLBCL. (Funded by the Intramural Research Program of the National Institutes of Health and others.)


Subject(s)
Gene Expression Profiling; Genetic Heterogeneity; Lymphoma, Large B-Cell, Diffuse/genetics; Mutation; Antineoplastic Combined Chemotherapy Protocols/therapeutic use; Biopsy; Epigenesis, Genetic; Exome; Genotype; Humans; Kaplan-Meier Estimate; Lymphoma, Large B-Cell, Diffuse/classification; Lymphoma, Large B-Cell, Diffuse/drug therapy; Lymphoma, Large B-Cell, Diffuse/mortality; Prognosis; Sequence Analysis, DNA; Transcriptome
4.
Occup Environ Med ; 73(6): 417-24, 2016 Jun.
Article in English | MEDLINE | ID: mdl-27102331

ABSTRACT

BACKGROUND: Mapping job titles to standardised occupation classification (SOC) codes is an important step in identifying occupational risk factors in epidemiological studies. Because manual coding is time-consuming and has moderate reliability, we developed an algorithm called SOCcer (Standardized Occupation Coding for Computer-assisted Epidemiologic Research) to assign SOC-2010 codes based on free-text job description components. METHODS: Job title and task-based classifiers were developed by comparing job descriptions to multiple sources linking job and task descriptions to SOC codes. An industry-based classifier was developed based on the SOC prevalence within an industry. These classifiers were used in a logistic model trained using 14 983 jobs with expert-assigned SOC codes to obtain empirical weights for an algorithm that scored each SOC/job description. We assigned the highest-scoring SOC code to each job. SOCcer was validated in 2 occupational data sources by comparing SOC codes obtained from SOCcer to expert-assigned SOC codes and lead exposure estimates obtained by linking SOC codes to a job-exposure matrix. RESULTS: For 11 991 case-control study jobs, SOCcer-assigned codes agreed with 44.5% and 76.3% of manually assigned codes at the 6-digit and 2-digit level, respectively. Agreement increased with the score, providing a mechanism to identify assignments needing review. Good agreement was observed between lead estimates based on SOCcer and manual SOC assignments (κ 0.6-0.8). Poorer performance was observed for inspection job descriptions, which included abbreviations and worksite-specific terminology. CONCLUSIONS: Although some manual coding will remain necessary, using SOCcer may improve the efficiency of incorporating occupation into large-scale epidemiological studies.


Subject(s)
Industry/classification; Job Description; Natural Language Processing; Occupations/classification; Algorithms; Carcinoma, Renal Cell; Case-Control Studies; Epidemiologic Methods; Epidemiologic Studies; Humans; Logistic Models; Reproducibility of Results; Software; United States; United States Occupational Safety and Health Administration
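
The scoring step described in record 4 (three classifier outputs combined by a logistic model, with the highest-scoring code assigned) might be sketched as below. This is a minimal illustration and not the SOCcer implementation: the weights, the feature values, the review threshold, and all function names are hypothetical, and the two SOC codes serve only as example candidates.

```python
import math

# Hypothetical weights; the published model estimated its weights from
# expert-coded jobs, and those values are not reproduced here.
WEIGHTS = {"intercept": -4.0, "title": 3.0, "task": 2.0, "industry": 1.0}


def logistic_score(features):
    """Combine the three classifier outputs into a score between 0 and 1."""
    z = WEIGHTS["intercept"] + sum(
        WEIGHTS[name] * features[name] for name in ("title", "task", "industry")
    )
    return 1.0 / (1.0 + math.exp(-z))


def assign_soc(candidates):
    """Return the highest-scoring SOC code and its score."""
    scored = {code: logistic_score(f) for code, f in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


def needs_review(score, threshold=0.5):
    """Flag low-scoring assignments for manual review (threshold is assumed)."""
    return score < threshold


# Illustrative classifier outputs for one job description.
CANDIDATES = {
    "53-3032": {"title": 0.9, "task": 0.7, "industry": 0.5},
    "29-1141": {"title": 0.1, "task": 0.2, "industry": 0.3},
}

best_code, best_score = assign_soc(CANDIDATES)
print(best_code, round(best_score, 3), needs_review(best_score))
```

Because the abstract reports that agreement rose with the score, a threshold of this kind is one plausible way to route low-confidence assignments to a human coder.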
5.
IEEE Int Conf Bioinform Biomed Workshops ; 2015: 1586-1590, 2015 Nov.
Article in English | MEDLINE | ID: mdl-27042700

ABSTRACT

Longitudinal studies play a key role in various fields, including epidemiology, clinical research, and genomic analysis. Currently, the most popular methods in longitudinal data analysis are model-driven regression approaches, which impose strong prior assumptions and are unable to scale to large problems in the manner of machine learning algorithms. In this work, we propose a novel longitudinal support vector regression (LSVR) algorithm that not only takes advantage of one of the most popular machine learning methods, but is also able to model the temporal nature of longitudinal data by taking into account observational dependence within subjects. We test LSVR on publicly available data from the DREAM-Phil Bowen ALS Prediction Prize4Life challenge. Results suggest that LSVR is at a minimum competitive with favored machine learning methods and is able to outperform those methods in predicting ALS score one month in advance.

6.
Article in English | MEDLINE | ID: mdl-25221787

ABSTRACT

Mapping job titles to standardized occupation classification (SOC) codes is an important step in evaluating changes in health risks over time as measured in inspection databases. However, manual SOC coding is cost-prohibitive for very large studies. Computer-based SOC coding systems can improve the efficiency of incorporating occupational risk factors into large-scale epidemiological studies. We present a novel method of mapping verbatim job titles to SOC codes using a large table of prior knowledge available in the public domain that includes detailed descriptions of the tasks and activities, and their synonyms, relevant to each SOC code. Job titles are compared to our knowledge base to find the closest matching SOC code. A soft Jaccard index is used to measure the similarity between a previously unseen job title and the knowledge base. Additional information such as standardized industrial codes can be incorporated to improve the SOC code determination by providing additional context to break ties in matches.
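
The abstract of record 6 does not define its soft Jaccard index, so the following is only one plausible reading: two tokens count as shared when their string similarity passes a threshold, which lets misspelled job titles still match. The similarity measure (Python's `difflib.SequenceMatcher`), the 0.8 threshold, and the function names are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def token_similarity(a, b):
    """Character-level similarity between two tokens, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def soft_jaccard(tokens_a, tokens_b, threshold=0.8):
    """Jaccard index in which near-identical tokens count as shared.

    Each token of one set is greedily paired with its most similar
    unmatched token in the other set; the pair counts as an intersection
    element only if the similarity reaches the threshold.
    """
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 1.0
    unmatched_b = set(set_b)
    matches = 0
    for a in sorted(set_a):
        best = max(unmatched_b, key=lambda b: token_similarity(a, b), default=None)
        if best is not None and token_similarity(a, best) >= threshold:
            matches += 1
            unmatched_b.remove(best)
    union_size = len(set_a) + len(set_b) - matches
    return matches / union_size


# A misspelled title still matches its knowledge-base entry.
print(soft_jaccard(["truck", "drivr"], ["truck", "driver"]))
print(soft_jaccard(["truck", "driver"], ["nurse"]))
```

With an exact-match Jaccard index, the misspelled title above would share only one of three distinct tokens with the knowledge-base entry; the soft variant treats both tokens as shared.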

7.
Article in English | MEDLINE | ID: mdl-25599092

ABSTRACT

Research into modeling the progression of Alzheimer's disease (AD) has made recent progress in identifying plasma proteomic biomarkers to identify the disease at the pre-clinical stage. In contrast with cerebrospinal fluid (CSF) biomarkers and PET imaging, plasma biomarker diagnoses have the advantage of being cost-effective and minimally invasive, thereby improving our understanding of AD and hopefully leading to early interventions as research into this subject advances. The Alzheimer's Disease Neuroimaging Initiative (ADNI) has collected data on 190 plasma analytes from individuals diagnosed with AD as well as subjects with mild cognitive impairment and cognitively normal (CN) controls. We propose an approach to classify subjects as AD or CN via an ensemble of classifiers trained and validated on ADNI data. Classifier performance is enhanced by an augmentation of a selective biomarker feature space with principal components obtained from the entire set of biomarkers. This procedure yields accuracy of 89% and area under the ROC curve of 94%.

8.
Article in English | MEDLINE | ID: mdl-25621319

ABSTRACT

Lysosomes are subcellular organelles playing a vital role in the endocytosis process of the cell. Lysosomal acidity is an important factor in assuring proper functioning of the enzymes within the organelle, and can be assessed by labeling the lysosomes with pH-sensitive fluorescence probes. To enhance our understanding of the acidification mechanisms, the goal of this work is to develop a method that can accurately detect and characterize the acidity of each lysosome captured in ratiometric fluorescence images. We present an algorithm that utilizes the h-dome transformation and reconciles spots detected independently from two wavelength channels. We evaluated our algorithm using simulated images for which the exact locations were known. The h-dome algorithm achieved an f-score as high as 0.890. We also computed the fluorescence ratios from lysosomes in live HeLa cell images with known lysosomal pHs. Using leave-one-out cross-validation, we demonstrated that the new algorithm was able to achieve much better pH prediction accuracy than the conventional method.
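
As background for record 8, the h-dome transformation subtracts from a signal its morphological reconstruction by dilation from the same signal lowered by h, leaving only peaks ("domes") of height at most h. The sketch below shows this on a one-dimensional intensity profile using plain Python lists. It is an illustration of the general transform only: the authors worked on two-channel fluorescence images, and their detection and channel-reconciliation steps are not reproduced here. Function names are hypothetical.

```python
def reconstruct_by_dilation(marker, mask):
    """Grayscale reconstruction by dilation in 1D.

    Repeatedly dilates the marker with a 3-sample window and clips it
    under the mask until nothing changes.
    """
    current = list(marker)
    while True:
        dilated = []
        for i in range(len(current)):
            lo, hi = max(0, i - 1), min(len(current), i + 2)
            dilated.append(min(max(current[lo:hi]), mask[i]))
        if dilated == current:
            return current
        current = dilated


def h_dome(signal, h):
    """Return the h-dome image: signal minus its reconstruction from signal - h."""
    marker = [value - h for value in signal]
    background = reconstruct_by_dilation(marker, signal)
    return [s - b for s, b in zip(signal, background)]


# Two spots of different brightness on a flat background of 1.
profile = [1, 1, 5, 1, 1, 3, 1]
print(h_dome(profile, 2))
```

Both spots come out with the same dome height even though one is brighter, which is why the transform is useful for picking out local peaks regardless of their absolute intensity.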

9.
Int J Biomed Imaging ; 2009: 528639, 2009.
Article in English | MEDLINE | ID: mdl-19672315

ABSTRACT

Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment involving digital band pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared memory processors, to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speedup factor of 46.66 relative to the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23 × 23 × 23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through local internet (NIHnet). The experimental results demonstrate that parallel computing provides a source of high computational power to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real-time.
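
The speedup factors quoted in record 9 are ratios of serial to parallel execution time. A small sketch of that arithmetic follows, together with parallel efficiency (speedup divided by worker count). The worker count of 8 is an assumption derived from "four Dual-Core" processors; the abstract does not say how many workers the parallel MATLAB runs used, and the function names are hypothetical.

```python
def speedup(serial_seconds, parallel_seconds):
    """Speedup factor: how many times faster the parallel run is."""
    return serial_seconds / parallel_seconds


def efficiency(speedup_factor, workers):
    """Fraction of ideal linear scaling achieved across the workers."""
    return speedup_factor / workers


WORKERS = 8  # assumed: four dual-core processors

# Reported speedup factors from the abstract.
for label, factor in [("filtration", 46.66),
                      ("reconstruction", 4.57),
                      ("oximetry", 4.25)]:
    print(label, round(efficiency(factor, WORKERS), 3))
```

Under this assumption the filtration figure exceeds the worker count, which would suggest that part of the gain comes from moving serial MATLAB code to compiled C++ rather than from parallelism alone.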

10.
Genome Biol ; 9 Suppl 2: S6, 2008.
Article in English | MEDLINE | ID: mdl-18834497

ABSTRACT

We introduce the first meta-service for information extraction in molecular biology, the BioCreative MetaServer (BCMS; http://bcms.bioinfo.cnio.es/). This prototype platform is a joint effort of 13 research groups and provides automatically generated annotations for PubMed/Medline abstracts. Annotation types cover gene names, gene IDs, species, and protein-protein interactions. The annotations are distributed by the meta-server in both human- and machine-readable formats (HTML/XML). This service is intended to be used by biomedical researchers and database annotators, and in biomedical language processing. The platform allows direct comparison, unified access, and result aggregation of the annotations.


Subject(s)
Biomedical Research/methods; Computational Biology/methods; Information Storage and Retrieval; Internet; Humans
11.
Article in English | MEDLINE | ID: mdl-17951839

ABSTRACT

The ability to identify gene mentions in text and normalize them to the proper unique identifiers is crucial for "down-stream" text mining applications in bioinformatics. We have developed a rule-based algorithm that divides the normalization task into two steps. The first step includes pattern matching for gene symbols and an approximate term searching technique for gene names. Next, the algorithm measures several features based on morphological, statistical, and contextual information to estimate the level of confidence that the correct identifier is selected for a potential mention. Uniqueness, inverse distance, and coverage are three novel features we quantified. The algorithm was evaluated against the BioCreAtIvE datasets. The feature weights were tuned by the Nelder-Mead simplex method. An F-score of .7622 and an AUC (area under the recall-precision curve) of .7461 were achieved on the test data using the set of weights optimized to the training data.


Subject(s)
Abstracting and Indexing/methods; Artificial Intelligence; Database Management Systems; Genes; Information Storage and Retrieval/methods; Natural Language Processing; PubMed; Computer Graphics; Confidence Intervals; Data Interpretation, Statistical; Documentation/methods; Internet; User-Computer Interface
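
Record 11 describes a confidence estimate built from weighted features and an evaluation by F-score. The sketch below shows the two calculations in their simplest form. The linear combination, the feature values, and the weights are assumptions for illustration; the abstract names the features (uniqueness, inverse distance, coverage) but not how they are computed or combined, and the Nelder-Mead tuning step is not shown.

```python
def confidence(features, weights):
    """Weighted sum of feature values, each assumed to lie in [0, 1]."""
    return sum(weights[name] * value for name, value in features.items())


def f_score(true_positives, false_positives, false_negatives):
    """Balanced F-score: the harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical weights and feature values for one candidate gene identifier.
WEIGHTS = {"uniqueness": 0.5, "inverse_distance": 0.25, "coverage": 0.25}
FEATURES = {"uniqueness": 1.0, "inverse_distance": 0.5, "coverage": 0.5}

print(confidence(FEATURES, WEIGHTS))
print(f_score(true_positives=3, false_positives=1, false_negatives=1))
```

In a tuning loop, the F-score on training data would serve as the objective that a derivative-free optimizer such as Nelder-Mead tries to maximize over the weights.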