Results 1 - 20 of 31
1.
Cell Rep ; 20(3): 572-585, 2017 07 18.
Article in English | MEDLINE | ID: mdl-28723562

ABSTRACT

Myelodysplastic syndromes and chronic myelomonocytic leukemia are blood disorders characterized by ineffective hematopoiesis and progressive marrow failure that can transform into acute leukemia. The DNA methyltransferase inhibitor 5-azacytidine (AZA) is the most effective pharmacological option, but only ∼50% of patients respond. A response only manifests after many months of treatment and is transient. The reasons underlying AZA resistance are unknown, and few alternatives exist for non-responders. Here, we show that AZA responders have more hematopoietic progenitor cells (HPCs) in the cell cycle. Non-responder HPC quiescence is mediated by integrin α5 (ITGA5) signaling, and their hematopoietic potential is improved by combining AZA with an ITGA5 inhibitor. AZA response is associated with the induction of an inflammatory response in HPCs in vivo. By molecular bar coding and tracking individual clones, we found that, although AZA alters the sub-clonal contribution to different lineages, founder clones are not eliminated and continue to drive hematopoiesis even in complete responders.


Subject(s)
Azacitidine/administration & dosage; Drug Resistance; Genomics; Myelodysplastic Syndromes; Aged; Aged, 80 and over; Drug Resistance/drug effects; Drug Resistance/genetics; Female; Humans; Integrin alpha Chains/genetics; Integrin alpha Chains/metabolism; Middle Aged; Myelodysplastic Syndromes/drug therapy; Myelodysplastic Syndromes/genetics; Myelodysplastic Syndromes/metabolism
2.
J Proteome Res ; 16(7): 2359-2369, 2017 07 07.
Article in English | MEDLINE | ID: mdl-28580786

ABSTRACT

Tandem mass spectrometry is one of the most popular techniques for quantitation of proteomes. There exists a large variety of options in each stage of data preprocessing that impact the bias and variance of the summarized protein-level values. Using a newly released data set satisfying a replicated Latin squares design, a diverse set of performance metrics has been developed and implemented in a web-based application, Quantitative Performance Evaluator for Proteomics (QPEP). QPEP has the flexibility to allow users to apply their own method to preprocess this data set and share the results, allowing direct and straightforward comparison of new methodologies. Application of these new metrics to three case studies highlights that (i) the summarization of peptides to proteins is robust to the choice of peptide summary used, (ii) the differences between iTRAQ labels are stronger than the differences between experimental runs, and (iii) the commercial software ProteinPilot performs equivalently well at between-sample normalization to more complicated methods developed by academics. Importantly, finding (ii) underscores the benefits of using the principles of randomization and blocking to avoid the experimental measurements being confounded by technical factors. Data are available via ProteomeXchange with identifier PXD003608.


Subject(s)
Peptides/analysis; Proteome/analysis; Proteomics/statistics & numerical data; Saccharomyces cerevisiae Proteins/isolation & purification; Software; Tandem Mass Spectrometry/standards; Benchmarking; Internet; Reproducibility of Results; Saccharomyces cerevisiae/chemistry
3.
Stat Appl Genet Mol Biol ; 16(1): 31-45, 2017 03 01.
Article in English | MEDLINE | ID: mdl-28284040

ABSTRACT

Output from analysis of a high-throughput 'omics' experiment very often is a ranked list. One commonly encountered example is a ranked list of differentially expressed genes from a gene expression experiment, with a length of many hundreds of genes. There are numerous situations where interest is in the comparison of outputs following, say, two (or more) different experiments, or of different approaches to the analysis that produce different ranked lists. Rather than considering exact agreement between the rankings, following others, we consider two ranked lists to be in agreement if the rankings differ by some fixed distance. Generally only a relatively small subset of the k top-ranked items will be in agreement. So the aim is to find the point k at which the probability of agreement in rankings changes from being greater than 0.5 to being less than 0.5. We use penalized splines and a Bayesian logit model, to give a nonparametric smooth to the sequence of agreements, as well as pointwise credible intervals for the probability of agreement. Our approach produces a point estimate and a credible interval for k. R code is provided. The method is applied to rankings of genes from breast cancer microarray experiments.
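
The idea of locating the point k at which agreement drops below 0.5 can be illustrated with a rough sketch. It deliberately swaps the penalized-spline Bayesian logit smoother described in the abstract for a plain moving average, so it is not the authors' method and gives no credible interval. The function names, the distance `delta`, and the window width are all assumptions made for illustration.

```python
def agreement_indicators(rank_a, rank_b, delta):
    """rank_a, rank_b: dicts mapping item -> rank (1 = top).

    Items are visited in the order given by rank_a. Returns a list of
    0/1 values: 1 when the two ranks differ by at most delta.
    """
    items = sorted(rank_a, key=rank_a.get)
    return [1 if abs(rank_a[i] - rank_b[i]) <= delta else 0 for i in items]


def moving_average(xs, window):
    """Centered moving average, with the window truncated at both ends."""
    half = window // 2
    out = []
    for j in range(len(xs)):
        lo, hi = max(0, j - half), min(len(xs), j + half + 1)
        out.append(sum(xs[lo:hi]) / (hi - lo))
    return out


def estimate_k(indicators, window=5):
    """Number of top-ranked items before the smoothed agreement
    first falls below 0.5 (all items if it never does)."""
    smooth = moving_average(indicators, window)
    for j, p in enumerate(smooth):
        if p < 0.5:
            return j
    return len(indicators)


# Hypothetical agreement sequence: mostly 1s near the top, 0s further down.
print(estimate_k([1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0], window=5))  # prints 7
```

A moving average is sensitive to the window width, which is one reason a model-based smoother with interval estimates, as in the abstract, may be preferable in practice.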


Subject(s)
Computational Biology/methods; Gene Expression Profiling; Bayes Theorem; Gene Expression; Humans; Microarray Analysis; Oligonucleotide Array Sequence Analysis; Probability
4.
BMC Bioinformatics ; 16: 145, 2015 May 06.
Article in English | MEDLINE | ID: mdl-25943746

ABSTRACT

BACKGROUND: Bisulphite sequencing enables the detection of cytosine methylation. The sequence of the methylation states of cytosines on any given read forms a methylation pattern that carries substantially more information than merely studying the average methylation level at individual positions. In order to understand better the complexity of DNA methylation landscapes in biological samples, it is important to study the diversity of these methylation patterns. However, the accurate quantification of methylation patterns is subject to sequencing errors and spurious signals due to incomplete bisulphite conversion of cytosines. RESULTS: A statistical model is developed which accounts for the distribution of DNA methylation patterns at any given locus. The model incorporates the effects of sequencing errors and spurious reads, and enables estimation of the true underlying distribution of methylation patterns. CONCLUSIONS: Calculation of the estimated distribution over methylation patterns is implemented in the R Bioconductor package MPFE. Source code and documentation of the package are also available for download at http://bioconductor.org/packages/3.0/bioc/html/MPFE.html .


Subject(s)
Algorithms; Bees/physiology; Brain/metabolism; DNA Methylation; High-Throughput Nucleotide Sequencing/methods; Models, Statistical; Animals; Cytosine/chemistry; Documentation; Programming Languages; Sulfites/chemistry
5.
PeerJ ; 2: e576, 2014.
Article in English | MEDLINE | ID: mdl-25337456

ABSTRACT

Background. A number of algorithms exist for analysing RNA-sequencing data to infer profiles of differential gene expression. Problems inherent in building algorithms around statistical models of overdispersed count data are formidable and frequently lead to non-uniform p-value distributions for null-hypothesis data and to inaccurate estimates of false discovery rates (FDRs). This can lead to an inaccurate measure of significance and loss of power to detect differential expression. Results. We use synthetic and real biological data to assess the ability of several available R packages to accurately estimate FDRs. The packages surveyed are based on statistical models of overdispersed Poisson data and include edgeR, DESeq, DESeq2, PoissonSeq and QuasiSeq. Also tested is an add-on package to edgeR and DESeq which we introduce called Polyfit. Polyfit aims to address the problem of a non-uniform null p-value distribution for two-class datasets by adapting the Storey-Tibshirani procedure. Conclusions. We find the best performing package in the sense that it achieves a low FDR which is accurately estimated over the full range of p-values, albeit with a very slow run time, is the QLSpline implementation of QuasiSeq. This finding holds provided the number of biological replicates in each condition is at least 4. The next best performing packages are edgeR and DESeq2. When the number of biological replicates is sufficiently high, and within a range accessible to multiplexed experimental designs, the Polyfit extension improves the performance of DESeq (for approximately 6 or more replicates per condition), making its performance comparable with that of edgeR and DESeq2 in our tests with synthetic data.
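
For readers unfamiliar with the Storey-Tibshirani procedure mentioned above, the following is a minimal sketch of its basic form. It is not Polyfit's adaptation and not the code of any of the R packages named. It assumes a single fixed tuning value `lam` for estimating the proportion of true nulls, whereas the published procedure smooths the estimate over a range of values. All function and parameter names are hypothetical.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Estimate the proportion of true null hypotheses.

    Under a uniform null, p-values above lam come mostly from nulls,
    so pi0 is roughly #{p > lam} / (m * (1 - lam)), capped at 1.
    """
    m = len(pvalues)
    pi0 = sum(1 for p in pvalues if p > lam) / (m * (1.0 - lam))
    return min(1.0, pi0)


def qvalues(pvalues, lam=0.5):
    """q-value for each p-value, returned in the input order."""
    m = len(pvalues)
    pi0 = estimate_pi0(pvalues, lam)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvalues[i] / rank)
        q[i] = running_min
    return q
```

The estimate of pi0 relies on the null p-values being uniform, which is exactly the assumption the abstract reports as frequently violated for overdispersed count models.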

6.
BMC Bioinformatics ; 14 Suppl 2: S24, 2013.
Article in English | MEDLINE | ID: mdl-23369017

ABSTRACT

BACKGROUND: Mass spectrometry-based protein identification is a very challenging task. The main identification approaches include de novo sequencing and database searching. Both approaches have shortcomings, so an integrative approach has been developed. The integrative approach firstly infers partial peptide sequences, known as tags, directly from tandem spectra through de novo sequencing, and then puts these sequences into a database search to see if a close peptide match can be found. However the current implementation of this integrative approach has several limitations. Firstly, simplistic de novo sequencing is applied and only very short sequence tags are used. Secondly, most integrative methods apply an algorithm similar to BLAST to search for exact sequence matches and do not accommodate sequence errors well. Thirdly, by applying these methods the integrated de novo sequencing makes a limited contribution to the scoring model which is still largely based on database searching. RESULTS: We have developed a new integrative protein identification method which can integrate de novo sequencing more efficiently into database searching. Evaluated on large real datasets, our method outperforms popular identification methods.


Subject(s)
Algorithms; Mass Spectrometry/methods; Proteins/analysis; Sequence Analysis, Protein/methods; Software; Amino Acid Sequence; Computational Biology/methods; Databases, Protein
7.
BMC Genomics ; 13: 484, 2012 Sep 17.
Article in English | MEDLINE | ID: mdl-22985019

ABSTRACT

BACKGROUND: RNA sequencing (RNA-Seq) has emerged as a powerful approach for the detection of differential gene expression with both high-throughput and high resolution capabilities possible depending upon the experimental design chosen. Multiplex experimental designs are now readily available; these can be utilised to increase the numbers of samples or replicates profiled at the cost of decreased sequencing depth generated per sample. These strategies impact on the power of the approach to accurately identify differential expression. This study presents a detailed analysis of the power to detect differential expression in a range of scenarios including simulated null and differential expression distributions with varying numbers of biological or technical replicates, sequencing depths and analysis methods. RESULTS: Differential and non-differential expression datasets were simulated using a combination of negative binomial and exponential distributions derived from real RNA-Seq data. These datasets were used to evaluate the performance of three commonly used differential expression analysis algorithms and to quantify the changes in power with respect to true and false positive rates when simulating variations in sequencing depth, biological replication and multiplex experimental design choices. CONCLUSIONS: This work quantitatively explores comparisons between contemporary analysis tools and experimental design choices for the detection of differential expression using RNA-Seq. We found that the DESeq algorithm performs more conservatively than edgeR and NBPSeq. With regard to testing of various experimental designs, this work strongly suggests that greater power is gained through the use of biological replicates relative to library (technical) replicates and sequencing depth. Strikingly, sequencing depth could be reduced to as low as 15% without substantial impacts on false positive or true positive rates.


Subject(s)
Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Statistics as Topic/methods; Algorithms
8.
Stat Appl Genet Mol Biol ; 11(1): Article 3, 2012.
Article in English | MEDLINE | ID: mdl-22624182

ABSTRACT

The D(2) statistic, defined as the number of matches of words of some pre-specified length k, is a computationally fast alignment-free measure of biological sequence similarity. However, there is some debate about its suitability for this purpose as the variability in D(2) may be dominated by the terms that reflect the noise in each of the single sequences only. We examine the extent of the problem and the effectiveness of overcoming it by using two mean-centred variants of this statistic, D(2)* and D(2)c. We conclude that all three statistics are potentially useful measures of sequence similarity, for which reasonably accurate p-values can be estimated under a null hypothesis of sequences composed of identically and independently distributed letters. We show that D(2) and D(2)c, and to a somewhat lesser extent D(2)*, perform well in tests to classify moderate length query sequences as putative cis-regulatory modules.
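
The basic D(2) count defined in the first sentence can be sketched in a few lines. This is an illustration only: it covers the uncentred statistic for exact word matches, not the mean-centred variants D(2)* and D(2)c, and the function names are assumptions.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Number of (position in seq_a, position in seq_b) pairs
    whose k-letter words are identical."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # A word seen n times in seq_a and m times in seq_b contributes n * m.
    return sum(n * counts_b[word] for word, n in counts_a.items())


print(d2("ACGT", "ACGT", 2))  # prints 3 (AC, CG and GT each match once)
print(d2("AAAA", "AAA", 2))   # prints 6 (AA occurs 3 times and 2 times)
```

Counting words first keeps the cost roughly linear in the sequence lengths, which is the sense in which the statistic is computationally fast.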


Subject(s)
Sequence Alignment; Sequence Analysis, DNA/methods; Base Sequence; Databases, Factual; Sequence Analysis, DNA/statistics & numerical data
9.
BMC Proc ; 5 Suppl 9: S24, 2011 Nov 29.
Article in English | MEDLINE | ID: mdl-22373266

ABSTRACT

Model selection procedures for simultaneous analysis of all single-nucleotide polymorphisms in genome-wide association studies are most suitable for making full use of the data for a complex disease study. In this paper we consider a penalized regression using the LASSO procedure and show that post-processing of the penalized-regression results with subsequent stepwise selection may lead to improved identification of causal single-nucleotide polymorphisms.

10.
Stat Appl Genet Mol Biol ; 8: Article 43, 2009.
Article in English | MEDLINE | ID: mdl-19883369

ABSTRACT

Word matches are often used in sequence comparison methods, either as a measure of sequence similarity or in the first search steps of algorithms such as BLAST or BLAT. The D2 statistic is the number of matches of words of k letters between two sequences. Recent advances have been made in the characterization of this statistic and in the approximation of its distribution. Here, these results are extended to the case of approximate word matches. We compute the exact value of the variance of the D2 statistic for the case of a uniform letter distribution, and introduce a method to provide accurate approximations of the variance in the remaining cases. This enables the distribution of D2 to be approximated for typical situations arising in biological research. We apply these results to the identification of cis-regulatory modules, and show that this method detects such sequences with a high accuracy. The ability to approximate the distribution of D2 for both exact and approximate word matches will enable the use of this statistic in a more precise manner for sequence comparison, database searches, and identification of transcription factor binding sites.
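
The extension to approximate word matches can be pictured with a brute-force sketch. It assumes that an approximate match means two k-letter words differing in at most `t` positions (Hamming distance); the abstract does not spell out the definition, so this reading, along with the function names, is an assumption. The sketch counts matches only and does not attempt the variance calculations the abstract describes.

```python
def hamming(u, v):
    """Number of positions at which two equal-length words differ."""
    return sum(1 for a, b in zip(u, v) if a != b)


def d2_approx(seq_a, seq_b, k, t):
    """Count pairs of k-letter words, one from each sequence,
    that differ in at most t positions. With t = 0 this is the
    exact-match D2 count."""
    words_a = [seq_a[i:i + k] for i in range(len(seq_a) - k + 1)]
    words_b = [seq_b[j:j + k] for j in range(len(seq_b) - k + 1)]
    return sum(1 for u in words_a for v in words_b if hamming(u, v) <= t)


print(d2_approx("AAT", "AAA", 2, 0))  # prints 2
print(d2_approx("AAT", "AAA", 2, 1))  # prints 4
```

The pairwise loop costs on the order of the product of the two sequence lengths times k, so it is suitable only for short sequences.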


Subject(s)
Algorithms; Regulatory Elements, Transcriptional/genetics; Sequence Alignment/methods; Sequence Homology, Nucleic Acid; Base Sequence; Binding Sites; Databases, Nucleic Acid; Transcription Factors
11.
Pathology ; 41(2): 173-7, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19152190

ABSTRACT

AIM: To compare the relative diagnostic efficacy of several different tests used to establish a diagnosis of phaeochromocytoma, in patients with a proven diagnosis of phaeochromocytoma, and in hospital patients with significant disease of other types. METHODS: We prospectively compared biochemical markers of catecholamine output and metabolism in plasma and urine in 22 patients with histologically proven phaeochromocytoma, 15 intensive care unit (ICU) patients, 30 patients on chronic haemodialysis and both hypertensive (n = 10) and normotensive (n = 16) controls. RESULTS: Receiver operating characteristic curves were plotted. At the point of maximum efficiency, plasma free metanephrines showed 100% sensitivity and 97.6% specificity, compared with plasma catecholamines (78.6% and 70.7%), urine catecholamines (78.6% and 87.8%), urine metanephrines (85.7% and 95.1%), and urine hydroxymethoxymandelic acid (HMMA or VMA) (93.0% and 75.8%). All patients with phaeochromocytoma had plasma free metanephrine concentrations at least 27% above the upper limit of the reference range. Only three other patients (two on haemodialysis and one in ICU) had PFM concentrations more than 50% above the upper limit of the reference range. CONCLUSIONS: In patients with phaeochromocytoma, plasma free metanephrines displayed superior diagnostic sensitivity and specificity compared with other biochemical markers of catecholamine output and metabolism.
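
The phrase "point of maximum efficiency" can be made concrete with a small sketch. It assumes that efficiency means the fraction of all subjects classified correctly and that a result is called positive when it exceeds the cutoff; the abstract does not define either, and the function names and the values used in the test call are hypothetical, not data from the study.

```python
def sens_spec(cases, controls, cutoff):
    """Sensitivity and specificity when a result above cutoff is positive."""
    true_pos = sum(1 for v in cases if v > cutoff)
    true_neg = sum(1 for v in controls if v <= cutoff)
    return true_pos / len(cases), true_neg / len(controls)


def max_efficiency_cutoff(cases, controls):
    """Return the observed value that, used as a cutoff, classifies the
    most subjects correctly. Ties go to the lowest such cutoff."""
    best_cutoff, best_correct = None, -1
    for cutoff in sorted(set(cases) | set(controls)):
        true_pos = sum(1 for v in cases if v > cutoff)
        true_neg = sum(1 for v in controls if v <= cutoff)
        if true_pos + true_neg > best_correct:
            best_cutoff, best_correct = cutoff, true_pos + true_neg
    return best_cutoff


# Made-up marker concentrations, for illustration only.
cases = [5.0, 6.0, 7.0]
controls = [1.0, 2.0, 5.5]
cutoff = max_efficiency_cutoff(cases, controls)
print(cutoff, sens_spec(cases, controls, cutoff))
```

Each candidate cutoff corresponds to one point on the receiver operating characteristic curve, so this search amounts to picking one point on that curve.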


Subject(s)
Adrenal Gland Neoplasms/diagnosis; Biomarkers, Tumor/analysis; Catecholamines/analysis; Metanephrine/analysis; Pheochromocytoma/diagnosis; Adolescent; Adrenal Gland Neoplasms/blood; Adrenal Gland Neoplasms/urine; Adult; Catecholamines/metabolism; Child; Child, Preschool; Humans; Infant; Middle Aged; Pheochromocytoma/blood; Pheochromocytoma/urine; ROC Curve; Sensitivity and Specificity
12.
J Biol Chem ; 284(4): 2584-92, 2009 Jan 23.
Article in English | MEDLINE | ID: mdl-19028686

ABSTRACT

Cathepsin K is responsible for the degradation of type I collagen in osteoclast-mediated bone resorption. Collagen fragments are known to be biologically active in a number of cell types. Here, we investigate their potential to regulate osteoclast activity. Mature murine osteoclasts were seeded on type I collagen for actin ring assays or dentine discs for resorption assays. Cells were treated with cathepsins K-, L-, or MMP-1-predigested type I collagen or soluble bone fragments for 24 h. The presence of actin rings was determined fluorescently by staining for actin. We found that the percentage of osteoclasts displaying actin rings and the area of resorbed dentine decreased significantly on addition of cathepsin K-digested type I collagen or bone fragments, but not with cathepsin L or MMP-1 digests. Counterintuitively, actin ring formation was found to decrease in the presence of the cysteine proteinase inhibitor LHVS and in cathepsin K-deficient osteoclasts. However, cathepsin L deficiency or the general MMP inhibitor GM6001 had no effect on the presence of actin rings. Predigestion of the collagen matrix with cathepsin K, but not by cathepsin L or MMP-1 resulted in an increased actin ring presence in cathepsin K-deficient osteoclasts. These studies suggest that cathepsin K interaction with type I collagen is required for 1) the release of cryptic Arg-Gly-Asp motifs during the initial attachment of osteoclasts and 2) termination of resorption via the creation of autocrine signals originating from type I collagen degradation.


Subject(s)
Actins/metabolism; Bone Resorption/metabolism; Cathepsins/metabolism; Osteoclasts/metabolism; Animals; Bone Resorption/genetics; Bone Resorption/pathology; Cathepsin K; Cathepsin L; Cathepsins/deficiency; Cathepsins/genetics; Cell Proliferation; Cells, Cultured; Collagen/metabolism; Cysteine Endopeptidases/metabolism; Enzyme Activation; Metalloproteases/antagonists & inhibitors; Metalloproteases/metabolism; Mice; Mice, Inbred C57BL; Mice, Knockout; Osteoclasts/cytology; Osteoclasts/drug effects; Protease Inhibitors/pharmacology; Solubility
13.
J Biomed Biotechnol ; 2009: 587405, 2009.
Article in English | MEDLINE | ID: mdl-20111740

ABSTRACT

Scientific advances are raising expectations that patient-tailored treatment will soon be available. The development of resulting clinical approaches needs to be based on well-designed experimental and observational procedures that provide data to which proper biostatistical analyses are applied. Gene expression microarray and related technologies are rapidly evolving and are providing extremely large gene expression profiles containing many thousands of measurements. Choosing a subset from these gene expression measurements to include in a gene expression signature is one of the many challenges needing to be met. Choice of this signature depends on many factors, including the selection of patients in the training set. So the reliability and reproducibility of the resultant prognostic gene signature needs to be evaluated, in such a way as to be relevant to the clinical setting. A relatively straightforward approach is based on cross validation, with separate selection of genes at each iteration to avoid selection bias. Within this approach we developed two different methods, one based on forward selection, the other on genes that were statistically significant in all training blocks of data. We demonstrate our approach to gene signature evaluation with a well-known breast cancer data set.
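
The point about repeating gene selection inside every cross-validation iteration can be sketched as follows. This is a toy illustration, not either of the two methods developed in the paper: it substitutes a difference-in-means ranking and a nearest-centroid classifier on a binary label for the paper's prognostic modelling, and it assumes every training block contains samples of both classes. All names are hypothetical.

```python
def select_genes(X, y, m):
    """Rank genes by absolute difference in class means, computed on the
    supplied samples only, and return the indices of the top m genes."""
    scores = []
    for g in range(len(X[0])):
        ones = [row[g] for row, lab in zip(X, y) if lab == 1]
        zeros = [row[g] for row, lab in zip(X, y) if lab == 0]
        scores.append(abs(sum(ones) / len(ones) - sum(zeros) / len(zeros)))
    return sorted(range(len(scores)), key=lambda g: -scores[g])[:m]


def centroid(X, y, label, genes):
    """Mean expression of the chosen genes over samples with this label."""
    rows = [row for row, lab in zip(X, y) if lab == label]
    return [sum(r[g] for r in rows) / len(rows) for g in genes]


def cross_validate(X, y, n_folds, m):
    """Return (accuracy, gene sets chosen in each fold)."""
    n = len(X)
    correct, chosen = 0, []
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        test = [i for i in range(n) if i % n_folds == fold]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        # Selection sees training samples only, so the held-out samples
        # cannot leak into the choice of genes.
        genes = select_genes(X_tr, y_tr, m)
        chosen.append(genes)
        c1 = centroid(X_tr, y_tr, 1, genes)
        c0 = centroid(X_tr, y_tr, 0, genes)
        for i in test:
            x = [X[i][g] for g in genes]
            d1 = sum((u - v) ** 2 for u, v in zip(x, c1))
            d0 = sum((u - v) ** 2 for u, v in zip(x, c0))
            correct += int((1 if d1 < d0 else 0) == y[i])
    return correct / n, chosen
```

Comparing the gene sets returned for the different folds gives a rough sense of how stable the selection is, which relates to the reproducibility concern raised in the abstract.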


Subject(s)
Breast Neoplasms/genetics; Computational Biology/methods; Gene Expression Profiling/methods; Breast Neoplasms/pathology; Computational Biology/standards; Female; Gene Expression Profiling/standards; Humans; Proportional Hazards Models; Reproducibility of Results
14.
BMC Bioinformatics ; 8: 486, 2007 Dec 20.
Article in English | MEDLINE | ID: mdl-18096081

ABSTRACT

BACKGROUND: An ever increasing number of techniques are being used to find genes with similar profiles from microarray studies. Visualization of gene expression profiles can aid this process, potentially contributing to the identification of co-regulated genes and gene function as well as network development. RESULTS: We introduce the h-Profile plot to display gene expression profiles. Thumbnail versions of plots of gene expression profiles are plotted at coordinates such that profiles of similar shape are located in the same sector, with decreasing variance towards the origin. Negatively correlated profiles can easily be identified. A new method for selecting genes with fixed periodicity, but different phase and amplitude is described and used to demonstrate the use of the plots on cell cycle data. CONCLUSION: Visualization tools for gene expression data are important and h-profile plots provide a timely contribution to the field. They allow the simultaneous visualization of many gene expression profiles and can be used for the identification of genes with similar or reversed profiles, the foundation step in many analyses.


Subject(s)
Databases, Genetic; Gene Expression Profiling/methods; Gene Expression Regulation; Gene Expression Regulation/physiology; Protein Array Analysis/methods; Time Factors; Yeasts/genetics
15.
BMC Genomics ; 8: 404, 2007 Nov 07.
Article in English | MEDLINE | ID: mdl-17986358

ABSTRACT

BACKGROUND: Hypertension is a complex disease with many contributory genetic and environmental factors. We aimed to identify common targets for therapy by gene expression profiling of a resistance artery taken from animals representing two different models of hypertension. We studied gene expression and morphology of a saphenous artery branch in normotensive WKY rats, spontaneously hypertensive rats (SHR) and adrenocorticotropic hormone (ACTH)-induced hypertensive rats. RESULTS: Differential remodeling of arteries occurred in SHR and ACTH-treated rats, involving changes in both smooth muscle and endothelium. Increased expression of smooth muscle cell growth promoters and decreased expression of growth suppressors confirmed smooth muscle cell proliferation in SHR but not in ACTH. Differential gene expression between arteries from the two hypertensive models extended to the renin-angiotensin system, MAP kinase pathways, mitochondrial activity, lipid metabolism, extracellular matrix and calcium handling. In contrast, arteries from both hypertensive models exhibited significant increases in caveolin-1 expression and decreases in the regulators of G-protein signalling, Rgs2 and Rgs5. Increased protein expression of caveolin-1 and increased incidence of caveolae was found in both smooth muscle and endothelial cells of arteries from both hypertensive models. CONCLUSION: We conclude that the majority of differences in gene expression found in the saphenous artery taken from rats with two different forms of hypertension reflect distinctive morphological and physiological alterations. However, changes in common to caveolin-1 expression and G protein signalling, through attenuation of Rgs2 and Rgs5, may contribute to hypertension through augmentation of vasoconstrictor pathways and provide potential targets for common drug development.


Subject(s)
Blood Vessels/metabolism; Caveolin 1/genetics; Gene Expression Profiling; Hypertension/genetics; Models, Genetic; RGS Proteins/genetics; Animals; Oligonucleotide Array Sequence Analysis; Polymerase Chain Reaction; Rats; Rats, Inbred SHR; Rats, Inbred WKY; Species Specificity
17.
Genome Biol ; 8(1): R12, 2007.
Article in English | MEDLINE | ID: mdl-17239257

ABSTRACT

BACKGROUND: T cells in the thymus undergo opposing positive and negative selection processes so that the only T cells entering circulation are those bearing a T cell receptor (TCR) with a low affinity for self. The mechanism differentiating negative from positive selection is poorly understood, despite the fact that inherited defects in negative selection underlie organ-specific autoimmune disease in AIRE-deficient people and the non-obese diabetic (NOD) mouse strain. RESULTS: Here we use homogeneous populations of T cells undergoing either positive or negative selection in vivo together with genome-wide transcription profiling on microarrays to identify the gene expression differences underlying negative selection to an Aire-dependent organ-specific antigen, including the upregulation of a genomic cluster in the cytogenetic band 2F. Analysis of defective negative selection in the autoimmune-prone NOD strain demonstrates a global impairment in the induction of the negative selection response gene set, but little difference in positive selection response genes. Combining expression differences with genetic linkage data, we identify differentially expressed candidate genes, including Bim, Bnip3, Smox, Pdrg1, Id1, Pdcd1, Ly6c, Pdia3, Trim30 and Trim12. CONCLUSION: The data provide a molecular map of the negative selection response in vivo and, by analysis of deviations from this pathway in the autoimmune susceptible NOD strain, suggest that susceptibility arises from small expression differences in genes acting at multiple points in the pathway between the TCR and cell death.


Subject(s)
Diabetes Mellitus/genetics; Gene Expression Profiling; Genetic Predisposition to Disease; Genomics; Selection, Genetic; T-Lymphocytes/metabolism; Alleles; Animals; Cell Differentiation; Female; Gene Expression Regulation; Genetic Linkage; Genome; Mice; Mice, Inbred NOD; Mice, Mutant Strains; Organ Specificity; RNA, Messenger/analysis; RNA, Messenger/genetics; T-Lymphocytes/cytology; T-Lymphocytes/immunology; Thymus Gland/cytology
18.
Bioinformatics ; 22(17): 2162-3, 2006 Sep 01.
Article in English | MEDLINE | ID: mdl-16766557

ABSTRACT

UNLABELLED: Most phylogenetic methods assume that the sequences evolved under homogeneous, stationary and reversible conditions. Compositional heterogeneity in data intended for studies of phylogeny suggests that the data did not evolve under these conditions. SeqVis, a Java application for analysis of nucleotide content, reads sequence alignments in several formats and plots the nucleotide content in a tetrahedron. Once plotted, outliers can be identified, thus allowing for decisions on the applicability of the data for phylogenetic analysis. AVAILABILITY: http://www.bio.usyd.edu.au/jermiin/programs.htm.
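
Plotting nucleotide content in a tetrahedron works because the four base frequencies of a sequence sum to one, so they can serve as weights on the four corners. The sketch below shows that mapping. It is not SeqVis code (SeqVis is a Java application); the corner coordinates, the choice to ignore gaps and ambiguous characters, and the function names are assumptions.

```python
# Corners of a regular tetrahedron centred on the origin, one per base.
# The assignment of bases to corners is arbitrary.
VERTICES = {
    "A": (1.0, 1.0, 1.0),
    "C": (1.0, -1.0, -1.0),
    "G": (-1.0, 1.0, -1.0),
    "T": (-1.0, -1.0, 1.0),
}


def composition(seq):
    """Frequencies of A, C, G and T, ignoring every other character."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: counts[base] / total for base in "ACGT"}


def tetrahedron_point(seq):
    """3-D point given by weighting each corner by its base frequency."""
    freqs = composition(seq)
    return tuple(
        sum(freqs[base] * VERTICES[base][axis] for base in "ACGT")
        for axis in range(3)
    )


print(tetrahedron_point("AAAA"))  # prints (1.0, 1.0, 1.0), the A corner
print(tetrahedron_point("ACGT"))  # prints (0.0, 0.0, 0.0), the centre
```

Sequences whose points lie far from the main cluster would be the compositional outliers the abstract refers to.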


Subject(s)
Algorithms; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Software; User-Computer Interface; Base Sequence; Computer Graphics; Conserved Sequence; Molecular Sequence Data; Sequence Homology, Nucleic Acid
19.
Gastroenterology ; 130(6): 1670-8, 2006 May.
Article in English | MEDLINE | ID: mdl-16697731

ABSTRACT

BACKGROUND & AIMS: Pancreatic adenocarcinoma is a most devastating cancer that presents late and is rapidly progressive. This study aimed to identify unique, tissue-specific protein biomarkers capable of differentiating pancreatic adenocarcinoma (PC) from adjacent uninvolved pancreatic tissue (AP), benign pancreatic disease (B), and nonmalignant tumor tissue (NM). METHODS: Tissue samples representing PC (n = 31), AP (n = 44), and B (n = 19) tissue were analyzed on hydrophobic protein chip arrays by surface-enhanced laser desorption/ionization time-of-flight mass spectrometry. Training models were developed using logistic regression and validated using the 10-fold cross-validation approach. RESULTS: The hydrophobic protein chip array revealed 13 protein peaks differentially expressed between PC and AP (receiver operating characteristic [ROC] area under the curve [AUC], 0.64-0.85), 8 between PC and B (ROC AUC, 0.67-0.78), and 12 between PC and NM tissue (ROC AUC, 0.63-0.81). Logistic regression and cross-validation identified overlapping panels of peaks to develop a training model that distinguished PC from AP (77.4% sensitivity, 84.1% specificity), B (83.9% sensitivity, 78.9% specificity), and NM tissue (58.1% sensitivity, 90.5% specificity). The final panels selected correctly classified 80.6% of PC and 88.6% of AP samples (ROC AUC, 0.92), 93.5% of PC and 89.5% of B samples (ROC AUC, 0.99), and 71.0% of PC and 92.1% of NM samples (ROC AUC, 0.91). CONCLUSIONS: This study used surface-enhanced laser desorption/ionization time-of-flight mass spectrometry to discover a number of protein panels that can distinguish effectively between pancreatic adenocarcinoma, benign, and adjacent pancreatic tissue. Identification of these proteins will add to our understanding of the biology of pancreatic cancer. Furthermore, these protein panels may have important diagnostic implications.


Subject(s)
Adenocarcinoma/pathology; Biomarkers, Tumor/analysis; Pancreatic Neoplasms/pathology; Pancreatitis, Chronic/pathology; Proteomics/classification; Adenocarcinoma/surgery; Adult; Aged; Aged, 80 and over; Analysis of Variance; Biopsy, Needle; Case-Control Studies; Diagnosis, Differential; Female; Humans; Immunohistochemistry; Male; Middle Aged; Multivariate Analysis; Pancreatic Neoplasms/surgery; Pancreatitis, Chronic/surgery; Prognosis; ROC Curve; Reference Values; Risk Assessment; Sampling Studies; Sensitivity and Specificity; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization; Statistics, Nonparametric
20.
Biometrics ; 61(2): 630-2; discussion 632-4, 2005 Jun.
Article in English | MEDLINE | ID: mdl-16011715

ABSTRACT

This note is in response to Wouters et al. (2003, Biometrics 59, 1131-1139) who compared three methods for exploring gene expression data. Contrary to their summary that principal component analysis is not very informative, we show that it is possible to determine principal component analyses that are useful for exploratory analysis of microarray data. We also present another biplot representation, the GE-biplot (Gene Expression biplot), that is a useful method for exploring gene expression data with the major advantage of being able to aid interpretation of both the samples and the genes relative to each other.


Subject(s)
Computational Biology/methods; Data Interpretation, Statistical; Oligonucleotide Array Sequence Analysis/methods; Principal Component Analysis; Algorithms; Gene Expression; Models, Genetic; Models, Statistical; Models, Theoretical; Multivariate Analysis; Reproducibility of Results