Results 1 - 11 of 11
1.
Clin Pharmacol Ther ; 108(3): 542-552, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32535886

ABSTRACT

Personalized medicine, or the tailoring of health interventions to an individual's nuanced and often unique genetic, biochemical, physiological, behavioral, and/or exposure profile, is seen by many as a biological necessity given the great heterogeneity of pathogenic processes underlying most diseases. However, testing and ultimately proving the benefit of strategies or algorithms connecting the mechanisms of action of specific interventions to patient pathophysiological profiles (referred to here as "intervention matching schemes" (IMS)) is complex for many reasons. We argue that IMS are likely to be pervasive, if not ubiquitous, in future health care, but raise important questions about their broad deployment and the contexts within which their utility can be proven. For example, one could question the need to, the efficiency associated with, and the reliability of, strategies for comparing competing or perhaps complementary IMS. We briefly summarize some of the more salient issues surrounding the vetting of IMS in cancer contexts and argue that IMS are at the foundation of many modern clinical trials and intervention strategies, such as basket, umbrella, and adaptive trials. In addition, IMS are at the heart of proposed "rapid learning systems" in hospitals, and implicit in cell replacement strategies, such as cytotoxic T-cell therapies targeting patient-specific neo-antigen profiles. We also consider the need for sensitivity to issues surrounding the deployment of IMS and comment on directions for future research.


Subjects
Antineoplastic Agents/therapeutic use , Artificial Intelligence , Biomarkers, Tumor/genetics , Decision Support Techniques , Models, Theoretical , Neoplasms/drug therapy , Antineoplastic Agents/adverse effects , Clinical Decision-Making , Humans , Molecular Diagnostic Techniques , Molecular Targeted Therapy , Neoplasms/genetics , Neoplasms/pathology , Precision Medicine , Predictive Value of Tests
2.
Prostate ; 72(4): 376-85, 2012 Mar.
Article in English | MEDLINE | ID: mdl-21671247

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) have identified approximately three dozen single nucleotide polymorphisms (SNPs) consistently associated with prostate cancer (PCa) risk. Despite the reproducibility of these associations, the molecular mechanism for most of these SNPs has not been well elaborated as most lie within non-coding regions of the genome. Androgens play a key role in prostate carcinogenesis. Recently, using ChIP-on-chip technology, 22,447 androgen receptor (AR) binding sites have been mapped throughout the genome, greatly expanding the genomic regions potentially involved in androgen-mediated activity. METHODOLOGY/PRINCIPAL FINDINGS: To test the hypothesis that sequence variants in AR binding sites are associated with PCa risk, we performed a systematic evaluation among two existing PCa GWAS cohorts; the Johns Hopkins Hospital and the Cancer Genetic Markers of Susceptibility (CGEMS) study population. We demonstrate that regions containing AR binding sites are significantly enriched for PCa risk-associated SNPs, that is, more than expected by chance alone. In addition, compared with the entire genome, these newly observed risk-associated SNPs in these regions are significantly more likely to overlap with established PCa risk-associated SNPs from previous GWAS. These results are consistent with our previous finding from a bioinformatics analysis that one-third of the 33 known PCa risk-associated SNPs discovered by GWAS are located in regions of the genome containing AR binding sites. CONCLUSIONS/SIGNIFICANCE: The results to date provide novel statistical evidence suggesting an androgen-mediated mechanism by which some PCa associated SNPs act to influence PCa risk. However, these results are hypothesis generating and ultimately warrant testing through in-depth molecular analyses.
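
The enrichment claim ("more than expected by chance alone") can be illustrated with a hypergeometric tail probability. This is a minimal sketch, not the authors' procedure: the abstract does not name the test that was used, and the function name, argument names, and counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(total_snps, snps_in_regions, risk_snps, risk_in_regions):
    """P(X >= risk_in_regions) when `risk_snps` SNPs are drawn without
    replacement from `total_snps`, of which `snps_in_regions` fall inside
    binding-site regions. The choice of test is an assumption."""
    upper = min(risk_snps, snps_in_regions)
    denom = comb(total_snps, risk_snps)
    tail = sum(
        comb(snps_in_regions, k) * comb(total_snps - snps_in_regions, risk_snps - k)
        for k in range(risk_in_regions, upper + 1)
    )
    return tail / denom

# Made-up counts: 1000 SNPs, 100 in binding regions, 30 risk SNPs, 10 of them in regions.
p_value = hypergeom_enrichment_p(1000, 100, 30, 10)
```

A small p-value here would only say that the overlap is larger than random placement predicts; it says nothing about mechanism, which matches the abstract's own caveat that the results are hypothesis generating.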


Subjects
Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Polymorphism, Single Nucleotide/genetics , Prostatic Neoplasms/genetics , Receptors, Androgen/genetics , Base Sequence , Binding Sites/genetics , Case-Control Studies , Cohort Studies , DNA, Neoplasm/genetics , Humans , Male , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis , Risk Factors
3.
Genome Res ; 21(1): 47-55, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21147910

ABSTRACT

Advanced prostate cancer can progress to systemic metastatic tumors, which are generally androgen insensitive and ultimately lethal. Here, we report a comprehensive genomic survey for somatic events in systemic metastatic prostate tumors using both high-resolution copy number analysis and targeted mutational survey of 3508 exons from 577 cancer-related genes using next generation sequencing. Focal homozygous deletions were detected at 8p22, 10q23.31, 13q13.1, 13q14.11, and 13q14.12. Key genes mapping within these deleted regions include PTEN, BRCA2, C13ORF15, and SIAH3. Focal high-level amplifications were detected at 5p13.2-p12, 14q21.1, 7q22.1, and Xq12. Key amplified genes mapping within these regions include SKP2, FOXA1, and AR. Furthermore, targeted mutational analysis of normal-tumor pairs has identified somatic mutations in genes known to be associated with prostate cancer including AR and TP53, but has also revealed novel somatic point mutations in genes including MTOR, BRCA2, ARHGEF12, and CHD5. Finally, in one patient where multiple independent metastatic tumors were available, we show common and divergent somatic alterations that occur at both the copy number and point mutation level, supporting a model for a common clonal progenitor with metastatic tumor-specific divergence. Our study represents a deep genomic analysis of advanced metastatic prostate tumors and has revealed candidate somatic alterations, possibly contributing to lethal prostate cancer.


Subjects
DNA Mutational Analysis , Gene Dosage/genetics , Genes, Neoplasm/genetics , Neoplasm Metastasis/genetics , Prostatic Neoplasms/genetics , Comparative Genomic Hybridization , DNA, Neoplasm/analysis , Exons/genetics , Genes, Tumor Suppressor , Humans , Male , Neoplasm Metastasis/pathology , Oligonucleotide Array Sequence Analysis , Oncogenes/genetics , Point Mutation/genetics , Prostatic Neoplasms/pathology , Sequence Analysis, DNA
4.
Bioinformatics ; 26(17): 2192-4, 2010 Sep 01.
Article in English | MEDLINE | ID: mdl-20605925

ABSTRACT

SUMMARY: Large volumes of data generated by high-throughput sequencing instruments present non-trivial challenges in data storage, content access and transfer. We present G-SQZ, a Huffman coding-based sequencing-reads-specific representation scheme that compresses data without altering the relative order. G-SQZ has achieved from 65% to 81% compression on benchmark datasets, and it allows selective access without scanning and decoding from start. This article focuses on describing the underlying encoding scheme and its software implementation, and a more theoretical problem of optimal compression is out of scope. The immediate practical benefits include reduced infrastructure and informatics costs in managing and analyzing large sequencing data. AVAILABILITY: http://public.tgen.org/sqz. Academic/non-profit: SOURCE: available at no cost under a non-open-source license by requesting from the web-site; Binary: available for direct download at no cost. For-Profit: Submit request for for-profit license from the web-site.
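
The Huffman-coding idea behind the scheme can be sketched as follows. This is a generic illustration rather than the G-SQZ implementation: treating each (base, quality) pair as one symbol is an assumption the abstract does not confirm, and the read data and function names are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count

def build_huffman_code(symbols):
    """Return {symbol: bitstring} for a prefix-free Huffman code."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(w, next(tiebreak), {s: ""}) for s, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]

def encode(symbols, code):
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    """Decode a bitstring; symbols come back in their original order."""
    inverse = {b: s for s, b in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

# Hypothetical read: bases paired with Phred-style quality characters.
pairs = list(zip("ACGTACGTAAAA", "IIIIIIIIHHHH"))
code = build_huffman_code(pairs)
bits = encode(pairs, code)
```

Frequent symbols receive shorter bitstrings, which is where the size reduction comes from. Selective access without decoding from the start, as the abstract describes, would need an additional index that this sketch does not attempt.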


Subjects
Data Compression , Sequence Analysis, DNA/methods , Software , Algorithms , Computational Biology/methods
5.
J Comput Biol ; 16(4): 565-77, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19361328

ABSTRACT

As a first step in analyzing high-throughput data in genome-wide studies, several algorithms are available to identify and prioritize candidates lists for downstream fine-mapping. The prioritized candidates could be differentially expressed genes, aberrations in comparative genomics hybridization studies, or single nucleotide polymorphisms (SNPs) in association studies. Different analysis algorithms are subject to various experimental artifacts and analytical features that lead to different candidate lists. However, little research has been carried out to theoretically quantify the consensus between different candidate lists and to compare the study specific accuracy of the analytical methods based on a known reference candidate list. Within the context of genome-wide studies, we propose a generic mathematical framework to statistically compare ranked lists of candidates from different algorithms with each other or, if available, with a reference candidate list. To cope with the growing need for intuitive visualization of high-throughput data in genome-wide studies, we describe a complementary customizable visualization tool. As a case study, we demonstrate application of our framework to the comparison and visualization of candidate lists generated in a DNA-pooling based genome-wide association study of CEPH data in the HapMap project, where prior knowledge from individual genotyping can be used to generate a true reference candidate list. The results provide a theoretical basis to compare the accuracy of various methods and to identify redundant methods, thus providing guidance for selecting the most suitable analysis method in genome-wide studies.
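
One simple way to quantify agreement between two ranked candidate lists is the fraction of shared entries at each depth k. The sketch below is only an illustration of that idea; it is not the mathematical framework proposed in the paper, whose details the abstract does not give, and the SNP labels are hypothetical.

```python
def overlap_at_depth(ranked_a, ranked_b, k):
    """Fraction of the top-k entries that two ranked lists have in common."""
    if k <= 0:
        raise ValueError("k must be positive")
    return len(set(ranked_a[:k]) & set(ranked_b[:k])) / k

def overlap_curve(ranked_a, ranked_b):
    """Overlap fraction at every depth from 1 to the shorter list's length."""
    depth = min(len(ranked_a), len(ranked_b))
    return [overlap_at_depth(ranked_a, ranked_b, k) for k in range(1, depth + 1)]

# Hypothetical candidate lists from two analysis methods.
method_1 = ["snp1", "snp2", "snp3", "snp4", "snp5"]
method_2 = ["snp2", "snp1", "snp5", "snp3", "snp4"]
curve = overlap_curve(method_1, method_2)
```

Replacing `method_2` with a known reference list turns the same curve into a rough accuracy profile for `method_1`, in the spirit of the reference-list comparison the abstract describes.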


Subjects
Algorithms , Genome-Wide Association Study/methods , Models, Statistical , Alleles , Case-Control Studies , DNA/genetics , Genetic Predisposition to Disease , Haplotypes , Humans , Oligonucleotide Array Sequence Analysis
6.
Nat Genet ; 40(10): 1153-5, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18758462

ABSTRACT

We carried out a fine-mapping study in the HNF1B gene at 17q12 in two study populations and identified a second locus associated with prostate cancer risk, approximately 26 kb centromeric to the first known locus (rs4430796); these loci are separated by a recombination hot spot. We confirmed the association with a SNP in the second locus (rs11649743) in five additional populations, with P = 1.7 × 10^-9 for an allelic test of the seven studies combined. The association at each SNP remained significant after adjustment for the other SNP.


Subjects
Chromosomes, Human, Pair 17/genetics , Genetic Predisposition to Disease/genetics , Haplotypes/genetics , Hepatocyte Nuclear Factor 1-beta/genetics , Polymorphism, Single Nucleotide/genetics , Prostatic Neoplasms/genetics , Aged , Chromosome Mapping , Genetic Linkage , Humans , Male , Middle Aged , Prostatic Neoplasms/pathology , Risk Factors
7.
BMC Bioinformatics ; 7: 274, 2006 May 31.
Article in English | MEDLINE | ID: mdl-16737545

ABSTRACT

BACKGROUND: Overfitting the data is a salient issue for classifier design in small-sample settings. This is why selecting a classifier from a constrained family of classifiers, ones that do not possess the potential to too finely partition the feature space, is typically preferable. But overfitting is not merely a consequence of the classifier family; it is highly dependent on the classification rule used to design a classifier from the sample data. Thus, it is possible to consider families that are rather complex but for which there are classification rules that perform well for small samples. Such classification rules can be advantageous because they facilitate satisfactory classification when the class-conditional distributions are not easily separated and the sample is not large. Here we consider neural networks, from the perspectives of classical design based solely on the sample data and from noise-injection-based design. RESULTS: This paper provides an extensive simulation-based comparative study of noise-injected neural-network design. It considers a number of different feature-label models across various small sample sizes using varying amounts of noise injection. Besides comparing noise-injected neural-network design to classical neural-network design, the paper compares it to a number of other classification rules. Our particular interest is with the use of microarray data for expression-based classification for diagnosis and prognosis. To that end, we consider noise-injected neural-network design as it relates to a study of survivability of breast cancer patients. CONCLUSION: The conclusion is that in many instances noise-injected neural network design is superior to the other tested methods, and in almost all cases it does not perform substantially worse than the best of the other methods. Since the amount of noise injected is consequential, the effect of differing amounts of injected noise must be considered.
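
Noise injection can be sketched as augmenting a small training sample with perturbed replicates of each point before a classifier is designed. The sketch below assumes independent zero-mean Gaussian noise on every feature and a fixed number of replicates; the abstract does not state the noise model or schedule used in the study, and the function and parameter names are hypothetical.

```python
import random

def inject_noise(samples, labels, copies, sigma, seed=0):
    """Return the original points followed by `copies` Gaussian-perturbed
    replicates of each point; a replicate keeps its source point's label."""
    rng = random.Random(seed)
    out_x, out_y = [list(x) for x in samples], list(labels)
    for x, y in zip(samples, labels):
        for _ in range(copies):
            out_x.append([v + rng.gauss(0.0, sigma) for v in x])
            out_y.append(y)
    return out_x, out_y

# Two hypothetical expression profiles, one per class.
xs, ys = inject_noise([[0.0, 1.0], [2.0, 3.0]], ["good", "poor"], copies=3, sigma=0.1)
```

Here `sigma` plays the role of the "amount of noise injected" that the abstract identifies as consequential: too little leaves the sample effectively unchanged, while too much blurs the class boundary.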


Subjects
Breast Neoplasms/genetics , Data Interpretation, Statistical , Gene Expression Regulation, Neoplastic , Neural Networks, Computer , Algorithms , Breast Neoplasms/diagnosis , Breast Neoplasms/mortality , Cluster Analysis , Computer Simulation , Diagnosis, Computer-Assisted , Female , Gene Expression Profiling/methods , Humans , Linear Models , Nonlinear Dynamics , Oligonucleotide Array Sequence Analysis , Prognosis , Survival Analysis
8.
Bioinformatics ; 22(7): 837-42, 2006 Apr 01.
Article in English | MEDLINE | ID: mdl-16428263

ABSTRACT

MOTIVATION: Given a large set of potential features, such as the set of all gene-expression values from a microarray, it is necessary to find a small subset with which to classify. The task of finding an optimal feature set of a given size is inherently combinatoric because to assure optimality all feature sets of a given size must be checked. Thus, numerous suboptimal feature-selection algorithms have been proposed. There are strong impediments to evaluate feature-selection algorithms using real data when data are limited, a common situation in genetic classification. The difficulty is compound. First, there are no class-conditional distributions from which to draw data points, only a single small labeled sample. Second, there are no test data with which to estimate the feature-set errors, and one must depend on a training-data-based error estimator. Finally, there is no optimal feature set with which to compare the feature sets found by the algorithms. RESULTS: This paper describes a genetic test bed for the evaluation of feature-selection algorithms. It begins with a large biological feature-label dataset that is used as an empirical distribution and, using massively parallel computation, finds the top feature sets of various sizes based on a given sample size and classification rule. The user can draw random samples from the data, apply a proposed algorithm, and evaluate the proficiency of the proposed algorithm via three different measures (code provided). A key feature of the test bed is that, once a dataset is input, a single command creates the entire test bed relative to the dataset. The particular dataset used for the first version of the test bed comes from a microarray-based classification study that analyzes a large number of microarrays, prepared with RNA from breast tumor samples from each of 295 patients. 
AVAILABILITY: The software and supplementary material are available at http://public.tgen.org/tgen-cb/support/testbed/ CONTACT: edward@ece.tamu.edu.
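
The combinatorial nature of optimal feature selection can be made concrete with an exhaustive search over all subsets of a given size. This is a toy sketch, not the test bed's code: `toy_error` is a made-up stand-in for a real classification-error estimate, and all names are hypothetical.

```python
from itertools import combinations
from math import comb

def best_feature_sets(n_features, size, error_fn, top=3):
    """Score every feature subset of the given size and return the `top`
    lowest-error subsets as (error, subset) pairs."""
    scored = [(error_fn(s), s) for s in combinations(range(n_features), size)]
    return sorted(scored)[:top]

def toy_error(subset):
    """Hypothetical error: features 1 and 4 are informative, the rest are not."""
    hits = len({1, 4} & set(subset))
    return (2 - hits) / 4

best = best_feature_sets(6, 2, toy_error)
n_candidates = comb(6, 2)  # number of subsets that had to be checked
```

The count `comb(n_features, size)` grows very quickly with the number of features, which is why the abstract notes that suboptimal algorithms are used in practice and why the test bed relies on massively parallel computation to establish the true top feature sets.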


Subjects
Algorithms , Computer Simulation , Gene Expression Profiling/methods , Breast Neoplasms , Data Collection , Databases, Genetic , Female , Humans , Models, Statistical , Oligonucleotide Array Sequence Analysis/methods
9.
Article in English | MEDLINE | ID: mdl-18427588

ABSTRACT

When using cDNA microarrays, normalization to correct labeling bias is a common preliminary step before further data analysis is applied, its objective being to reduce the variation between arrays. To date, assessment of the effectiveness of normalization has mainly been confined to the ability to detect differentially expressed genes. Since a major use of microarrays is the expression-based phenotype classification, it is important to evaluate microarray normalization procedures relative to classification. Using a model-based approach, we model the systemic-error process to generate synthetic gene-expression values with known ground truth. These synthetic expression values are subjected to typical normalization methods and passed through a set of classification rules, the objective being to carry out a systematic study of the effect of normalization on classification. Three normalization methods are considered: offset, linear regression, and Lowess regression. Seven classification rules are considered: 3-nearest neighbor, linear support vector machine, linear discriminant analysis, regular histogram, Gaussian kernel, perceptron, and multiple perceptron with majority voting. The results of the first three are presented in the paper, with the full results being given on a complementary website. The conclusion from the different experiment models considered in the study is that normalization can have a significant benefit for classification under difficult experimental conditions, with linear and Lowess regression slightly outperforming the offset method.
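
Two of the three normalization methods named above, offset and linear regression, can be sketched on log-ratio data. The definitions here are assumptions, since the abstract does not spell them out: offset is taken to mean subtracting the mean log-ratio, and linear regression to mean removing a least-squares line fitted against average log-intensity. Lowess regression is omitted for brevity.

```python
from statistics import mean

def offset_normalize(log_ratios):
    """Subtract a constant so the log-ratios average to zero (assumed definition)."""
    shift = mean(log_ratios)
    return [m - shift for m in log_ratios]

def linear_normalize(intensity, log_ratios):
    """Fit log_ratio = a + b * intensity by least squares and return the
    residuals. Fails with ZeroDivisionError if all intensities are equal."""
    a_bar, m_bar = mean(intensity), mean(log_ratios)
    sxx = sum((a - a_bar) ** 2 for a in intensity)
    sxy = sum((a - a_bar) * (m - m_bar) for a, m in zip(intensity, log_ratios))
    slope = sxy / sxx
    intercept = m_bar - slope * a_bar
    return [m - (intercept + slope * a) for a, m in zip(intensity, log_ratios)]

# Hypothetical array with a labeling bias that grows linearly with intensity.
intensity = [1.0, 2.0, 3.0, 4.0]
log_ratios = [0.5 + 0.1 * a for a in intensity]
residuals = linear_normalize(intensity, log_ratios)
```

On this synthetic array the offset method removes only the average bias, whereas the linear fit also removes the intensity-dependent trend.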

10.
Cancer Epidemiol Biomarkers Prev ; 14(11 Pt 1): 2563-8, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16284379

ABSTRACT

It is widely hypothesized that the interactions of multiple genes influence individual risk to prostate cancer. However, current efforts at identifying prostate cancer risk genes primarily rely on single-gene approaches. In an attempt to fill this gap, we carried out a study to explore the joint effect of multiple genes in the inflammation pathway on prostate cancer risk. We studied 20 genes in the Toll-like receptor signaling pathway as well as several cytokines. For each of these genes, we selected and genotyped haplotype-tagging single nucleotide polymorphisms (SNP) among 1,383 cases and 780 controls from the CAPS (CAncer Prostate in Sweden) study population. A total of 57 SNPs were included in the final analysis. A data mining method, multifactor dimensionality reduction, was used to explore the interaction effects of SNPs on prostate cancer risk. Interaction effects were assessed for all possible n SNP combinations, where n = 2, 3, or 4. For each n SNP combination, the model providing lowest prediction error among 100 cross-validations was chosen. The statistical significance levels of the best models in each n SNP combination were determined using permutation tests. A four-SNP interaction (one SNP each from IL-10, IL-1RN, TIRAP, and TLR5) had the lowest prediction error (43.28%, P = 0.019). Our ability to analyze a large number of SNPs in a large sample size is one of the first efforts in exploring the effect of high-order gene-gene interactions on prostate cancer risk, and this is an important contribution to this new and quickly evolving field.
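
The core step of multifactor dimensionality reduction, labeling each multi-locus genotype cell as high or low risk and counting misclassifications, can be sketched as below. This is a simplified illustration, not the study's analysis: it reports the error on the same data used to build the rule, without the cross-validation and permutation testing the abstract describes, and the genotype coding and names are hypothetical.

```python
from collections import defaultdict

def mdr_error(genotypes, is_case, snp_indices):
    """Misclassification rate of the MDR high/low-risk rule for one SNP
    combination. Requires at least one control in the data."""
    n_cases = sum(is_case)
    n_controls = len(is_case) - n_cases
    threshold = n_cases / n_controls  # overall case:control ratio
    cells = defaultdict(lambda: [0, 0])  # genotype combination -> [cases, controls]
    for row, case in zip(genotypes, is_case):
        key = tuple(row[i] for i in snp_indices)
        cells[key][0 if case else 1] += 1
    errors = 0
    for cases, controls in cells.values():
        high_risk = cases > threshold * controls
        # High-risk cells misclassify their controls; low-risk cells their cases.
        errors += controls if high_risk else cases
    return errors / len(is_case)

# Hypothetical data in which risk depends on the two SNPs jointly, not singly.
genotypes = [[0, 0], [0, 0], [0, 1], [0, 1], [1, 0], [1, 0], [1, 1], [1, 1]]
is_case = [1, 1, 0, 0, 0, 0, 1, 1]
joint_error = mdr_error(genotypes, is_case, (0, 1))
single_error = mdr_error(genotypes, is_case, (0,))
```

In this toy data neither SNP is informative alone, yet the pair separates cases from controls, which is the kind of interaction effect that single-gene approaches would miss.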


Subjects
Inflammation , Polymorphism, Single Nucleotide , Prostatic Neoplasms/genetics , Prostatic Neoplasms/immunology , Toll-Like Receptors/genetics , Case-Control Studies , Genetic Predisposition to Disease , Genotype , Haplotypes , Humans , Male , Prognosis , Prostatic Neoplasms/etiology , Registries/statistics & numerical data , Risk Factors , Signal Transduction
11.
Bioinformatics ; 21(8): 1509-15, 2005 Apr 15.
Article in English | MEDLINE | ID: mdl-15572470

ABSTRACT

MOTIVATION: Given the joint feature-label distribution, increasing the number of features always results in decreased classification error; however, this is not the case when a classifier is designed via a classification rule from sample data. Typically (but not always), for fixed sample size, the error of a designed classifier decreases and then increases as the number of features grows. The potential downside of using too many features is most critical for small samples, which are commonplace for gene-expression-based classifiers for phenotype discrimination. For fixed sample size and feature-label distribution, the issue is to find an optimal number of features. RESULTS: Since only in rare cases is there a known distribution of the error as a function of the number of features and sample size, this study employs simulation for various feature-label distributions and classification rules, and across a wide range of sample and feature-set sizes. To achieve the desired end, finding the optimal number of features as a function of sample size, it employs massively parallel computation. Seven classifiers are treated: 3-nearest-neighbor, Gaussian kernel, linear support vector machine, polynomial support vector machine, perceptron, regular histogram and linear discriminant analysis. Three Gaussian-based models are considered: linear, nonlinear and bimodal. In addition, real patient data from a large breast-cancer study is considered. To mitigate the combinatorial search for finding optimal feature sets, and to model the situation in which subsets of genes are co-regulated and correlation is internal to these subsets, we assume that the covariance matrix of the features is blocked, with each block corresponding to a group of correlated features. Altogether there are a large number of error surfaces for the many cases. These are provided in full on a companion website, which is meant to serve as resource for those working with small-sample classification. 
AVAILABILITY: For the companion website, please visit http://public.tgen.org/tamu/ofs/ CONTACT: e-dougherty@ee.tamu.edu.
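
The blocked covariance structure mentioned in the abstract can be sketched as a block-diagonal matrix in which features within a block are correlated and features in different blocks are not. Equal variances and a single within-block correlation `rho` are simplifying assumptions made here; the abstract states only that the matrix is blocked.

```python
def blocked_covariance(n_features, block_size, variance, rho):
    """Block-diagonal covariance matrix as nested lists: same-block features
    have correlation rho, different-block features are uncorrelated."""
    if n_features % block_size:
        raise ValueError("n_features must be a multiple of block_size")
    cov = [[0.0] * n_features for _ in range(n_features)]
    for i in range(n_features):
        for j in range(n_features):
            if i == j:
                cov[i][j] = variance
            elif i // block_size == j // block_size:
                cov[i][j] = rho * variance
    return cov

# Four hypothetical genes in two co-regulated groups of two.
cov = blocked_covariance(4, 2, variance=2.0, rho=0.5)
```

Each block stands for one group of co-regulated genes, so a feature-set search only has to reason about correlation inside a block rather than across the whole gene set.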


Subjects
Algorithms , Artificial Intelligence , DNA/genetics , Gene Expression Profiling/methods , Models, Genetic , Oligonucleotide Array Sequence Analysis/methods , Pattern Recognition, Automated/methods , Sequence Analysis, DNA/methods , Computer Simulation , Models, Statistical , Sample Size