Results 1 - 20 of 42
1.
Bioinformatics ; 37(16): 2299-2307, 2021 Aug 25.
Article in English | MEDLINE | ID: mdl-33599251

ABSTRACT

MOTIVATION: Off-target predictions are crucial in gene editing research. Recently, significant progress has been made in predicting off-target mutations, particularly with CRISPR-Cas9 data, thanks to the use of deep learning. CRISPR-Cas9 is a gene editing technique that allows manipulation of DNA fragments. The sgRNA-DNA (single guide RNA-DNA) sequence encoding used for deep neural networks, however, has a strong impact on prediction accuracy. We propose a novel encoding of sgRNA-DNA sequences that aggregates sequence data with no loss of information. RESULTS: In our experiments, we compare the proposed sgRNA-DNA sequence encoding, applied in a deep learning prediction framework, with state-of-the-art encoding and prediction methods. We demonstrate the superior accuracy of our approach in a simulation study involving Feedforward Neural Networks (FNNs), Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) as well as the traditional Random Forest (RF), Naive Bayes (NB) and Logistic Regression (LR) classifiers. We highlight the quality of our results by building several FNNs, CNNs and RNNs with various layer depths and performing predictions on two popular gene editing datasets (CRISPOR and GUIDE-seq). In all our experiments, the new encoding led to more accurate off-target predictions, improving the area under the Receiver Operating Characteristic (ROC) curve by up to 35%. AVAILABILITY AND IMPLEMENTATION: The code and data used in this study are available at: https://github.com/dagrate/dl-offtarget. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
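The paper's exact encoding lives in the linked repository; purely as an illustration of what a lossless pairwise encoding looks like (this sketch is not the authors' scheme, and the function names are hypothetical), one can concatenate one-hot channels for the guide base and the genomic base, so a mismatch and its direction are both preserved:

```python
# Illustrative lossless sgRNA-DNA pair encoding (NOT the paper's scheme):
# concatenate one-hot vectors of the guide base and the DNA base, so each
# position retains both bases and hence the direction of any mismatch.
BASES = "ACGT"

def one_hot(base):
    vec = [0] * len(BASES)
    vec[BASES.index(base)] = 1
    return vec

def encode_pair(sgrna, dna):
    assert len(sgrna) == len(dna), "sequences must be aligned"
    return [one_hot(g) + one_hot(d) for g, d in zip(sgrna, dna)]

# A matched position and a G->T mismatch yield distinct 8-bit vectors.
encoded = encode_pair("ACG", "ACT")
```

An OR-style encoding that merges the two channels into four bits would map a G-T mismatch and a T-G mismatch to the same vector; keeping the channels separate avoids that information loss.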

2.
PLoS Biol ; 17(4): e3000188, 2019 04.
Article in English | MEDLINE | ID: mdl-30964856

ABSTRACT

The need for replication of initial results has been rediscovered only recently in many fields of research. In preclinical biomedical research, it is common practice to conduct exact replications with the same sample sizes as those used in the initial experiments. Such replication attempts, however, have lower probability of replication than is generally appreciated. Indeed, in the common scenario of an effect just reaching statistical significance, the statistical power of the replication experiment assuming the same effect size is approximately 50%-in essence, a coin toss. Accordingly, we use the provocative analogy of "replicating" a neuroprotective drug animal study with a coin flip to highlight the need for larger sample sizes in replication experiments. Additionally, we provide detailed background for the probability of obtaining a significant p value in a replication experiment and discuss the variability of p values as well as pitfalls of simple binary significance testing in both initial preclinical experiments and replication studies with small sample sizes. We conclude that power analysis for determining the sample size for a replication study is obligatory within the currently dominant hypothesis testing framework. Moreover, publications should include effect size point estimates and corresponding measures of precision, e.g., confidence intervals, to allow readers to assess the magnitude and direction of reported effects and to potentially combine the results of initial and replication study later through Bayesian or meta-analytic approaches.
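The coin-toss claim is easy to verify numerically. A minimal sketch under a normal-approximation assumption (not code from the paper): if the initial study just reached two-sided p = 0.05 and the replication uses the same sample size, with the true effect assumed equal to the observed one, the replication's power is about 50%.

```python
from statistics import NormalDist

def replication_power(p_initial, alpha=0.05):
    """Power of an exact same-n replication, assuming the true effect equals
    the initially observed effect (two-sided z-tests, normal approximation)."""
    nd = NormalDist()
    z_obs = nd.inv_cdf(1 - p_initial / 2)   # z implied by the initial p-value
    z_crit = nd.inv_cdf(1 - alpha / 2)      # replication significance cutoff
    # P(replication z falls beyond either critical value), centred at z_obs
    return (1 - nd.cdf(z_crit - z_obs)) + nd.cdf(-z_crit - z_obs)

print(round(replication_power(0.05), 2))  # 0.5 -- the coin toss
```

Even an initial p = 0.01 only lifts same-n replication power to roughly 0.73, which is why the abstract argues for larger replication samples.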


Subject(s)
Biomedical Research/methods , Reproducibility of Results , Research Design/statistics & numerical data , Animals , Bayes Theorem , Biomedical Research/statistics & numerical data , Data Interpretation, Statistical , Humans , Models, Statistical , Probability , Publications , Sample Size
3.
Proc Natl Acad Sci U S A ; 113(44): 12360-12367, 2016 11 01.
Article in English | MEDLINE | ID: mdl-27791185

ABSTRACT

Translational control of gene expression plays a key role during the early phases of embryonic development. Here we describe a transcriptional regulator of mouse embryonic stem cells (mESCs), Yin-yang 2 (YY2), that is controlled by the translation inhibitors, Eukaryotic initiation factor 4E-binding proteins (4E-BPs). YY2 plays a critical role in regulating mESC functions through control of key pluripotency factors, including Octamer-binding protein 4 (Oct4) and Estrogen-related receptor-β (Esrrb). Importantly, overexpression of YY2 directs the differentiation of mESCs into cardiovascular lineages. We show that the splicing regulator Polypyrimidine tract-binding protein 1 (PTBP1) promotes the retention of an intron in the 5'-UTR of Yy2 mRNA that confers sensitivity to 4E-BP-mediated translational suppression. Thus, we conclude that YY2 is a major regulator of mESC self-renewal and lineage commitment and document a multilayer regulatory mechanism that controls its expression.


Subject(s)
Alternative Splicing/physiology , Cell Differentiation , Cell Self Renewal/physiology , Embryonic Stem Cells/metabolism , Gene Expression Regulation, Developmental , Transcription Factors/metabolism , Animals , Blastocyst/metabolism , Carrier Proteins/metabolism , Cell Lineage , Cell Self Renewal/genetics , Heterogeneous-Nuclear Ribonucleoproteins/genetics , Introns , Mice , Mice, Knockout , Models, Biological , Octamer Transcription Factor-3/metabolism , Phosphoproteins , Polypyrimidine Tract-Binding Protein/genetics , Protein Biosynthesis/genetics , RNA, Messenger/metabolism , RNA, Small Interfering/genetics , RNA, Small Interfering/metabolism , Receptors, Estrogen/metabolism , Transcription Factors/genetics , Transcription, Genetic/physiology , YY1 Transcription Factor/metabolism
4.
Bioinformatics ; 33(20): 3258-3267, 2017 Oct 15.
Article in English | MEDLINE | ID: mdl-28633418

ABSTRACT

MOTIVATION: Considerable attention has been paid recently to improve data quality in high-throughput screening (HTS) and high-content screening (HCS) technologies widely used in drug development and chemical toxicity research. However, several environmentally- and procedurally-induced spatial biases in experimental HTS and HCS screens decrease measurement accuracy, leading to increased numbers of false positives and false negatives in hit selection. Although effective bias correction methods and software have been developed over the past decades, almost all of these tools have been designed to reduce the effect of additive bias only. Here, we address the case of multiplicative spatial bias. RESULTS: We introduce three new statistical methods meant to reduce multiplicative spatial bias in screening technologies. We assess the performance of the methods with synthetic and real data affected by multiplicative spatial bias, including comparisons with current bias correction methods. We also describe a wider data correction protocol that integrates methods for removing both assay and plate-specific spatial biases, which can be either additive or multiplicative. CONCLUSIONS: The methods for removing multiplicative spatial bias and the data correction protocol are effective in detecting and cleaning experimental data generated by screening technologies. As our protocol is of a general nature, it can be used by researchers analyzing current or next-generation high-throughput screens. AVAILABILITY AND IMPLEMENTATION: The AssayCorrector program, implemented in R, is available on CRAN. CONTACT: makarenkov.vladimir@uqam.ca. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
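The paper's three methods ship in the AssayCorrector R package; purely to make "multiplicative spatial bias" concrete, here is a multiplicative analogue of median polish (my sketch, not one of the published methods): each well is modeled as signal × row factor × column factor, and the row/column factors are estimated by medians and divided out.

```python
from statistics import median

def remove_multiplicative_row_col_bias(plate, n_iter=5):
    """Divide out row and column factors estimated by medians.
    Sketch only; assumes the bias is multiplicative and separable."""
    data = [[float(v) for v in row] for row in plate]
    rows, cols = len(data), len(data[0])
    for _ in range(n_iter):
        for i in range(rows):
            m = median(data[i])
            data[i] = [v / m for v in data[i]]
        for j in range(cols):
            m = median(data[i][j] for i in range(rows))
            for i in range(rows):
                data[i][j] /= m
    return data
```

On a plate whose values are exactly a product of row and column factors, the procedure recovers a flat surface, which is the defining property an additive-only correction would miss.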


Subject(s)
Biological Assay/methods , Computational Biology/methods , High-Throughput Screening Assays/methods , Software , Bias , Drug Discovery/methods , HIV Infections/drug therapy , Humans , Toxicology/methods
5.
Brief Bioinform ; 16(6): 974-86, 2015 Nov.
Article in English | MEDLINE | ID: mdl-25750417

ABSTRACT

Significant efforts have been made recently to improve data throughput and data quality in screening technologies related to drug design. The modern pharmaceutical industry relies heavily on high-throughput screening (HTS) and high-content screening (HCS) technologies, which include small molecule, complementary DNA (cDNA) and RNA interference (RNAi) types of screening. Data generated by these screening technologies are subject to several environmental and procedural systematic biases, which introduce errors into the hit identification process. We first review systematic biases typical of HTS and HCS screens. We highlight that study design issues and the way in which data are generated are crucial for providing unbiased screening results. Considering various data sets, including the publicly available ChemBank data, we assess the rates of systematic bias in experimental HTS by using plate-specific and assay-specific error detection tests. We describe main data normalization and correction techniques and introduce a general data preprocessing protocol. This protocol can be recommended for academic and industrial researchers involved in the analysis of current or next-generation HTS data.
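One of the standard normalizations such a preprocessing protocol builds on is the plate-wise robust z-score (an illustrative Python sketch; the review covers many more techniques): the median and MAD replace the mean and SD so that genuine hits and outliers do not distort the scale.

```python
from statistics import median

def robust_z_scores(plate_values):
    """Robust z-score normalization of one plate's raw measurements."""
    med = median(plate_values)
    mad = median(abs(v - med) for v in plate_values)
    scale = 1.4826 * mad  # consistency factor: MAD -> SD under normality
    return [(v - med) / scale for v in plate_values]
```

Because the median and MAD ignore extreme values, an active compound inflates its own z-score without inflating the plate's estimated noise level.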


Subject(s)
High-Throughput Nucleotide Sequencing/standards , DNA, Complementary/genetics , RNA Interference , Reproducibility of Results
6.
Mol Cell Proteomics ; 13(2): 489-502, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24319055

ABSTRACT

Endoglin and activin receptor-like kinase 1 are specialized transforming growth factor-beta (TGF-β) superfamily receptors, primarily expressed in endothelial cells. Mutations in the corresponding ENG or ACVRL1 genes lead to hereditary hemorrhagic telangiectasia (HHT1 and HHT2, respectively). To discover proteins interacting with endoglin, ACVRL1 and TGF-β receptor type 2 and involved in TGF-β signaling, we applied LUMIER, a high-throughput mammalian interactome mapping technology. Using stringent criteria, we identified 181 novel unique and shared interactions with ACVRL1, TGF-β receptor type 2, and endoglin, defining potential novel important vascular networks. In particular, the regulatory subunit B-beta of the protein phosphatase PP2A (PPP2R2B) interacted with all three receptors. Interestingly, the PPP2R2B gene lies in an interval in linkage disequilibrium with HHT3, for which the gene remains unidentified. We show that PPP2R2B protein interacts with the ACVRL1/TGFBR2/endoglin complex and recruits PP2A to nitric oxide synthase 3 (NOS3). Endoglin overexpression in endothelial cells inhibits the association of PPP2R2B with NOS3, whereas endoglin-deficient cells show enhanced PP2A-NOS3 interaction and lower levels of endogenous NOS3 Serine 1177 phosphorylation. Our data suggest that endoglin regulates NOS3 activation status by regulating PPP2R2B access to NOS3, and that PPP2R2B might be the HHT3 gene. Furthermore, endoglin and ACVRL1 contribute to several novel networks, including TGF-β dependent and independent ones, critical for vascular function and potentially defective in HHT.


Subject(s)
Activin Receptors, Type II/metabolism , Antigens, CD/metabolism , Blood Vessels/metabolism , Protein Interaction Maps , Receptors, Cell Surface/metabolism , Animals , Embryo, Mammalian , Endoglin , Endothelium, Vascular/metabolism , Endothelium, Vascular/pathology , HEK293 Cells , Humans , Mice , Mice, Knockout , Protein Binding , Telangiectasia, Hereditary Hemorrhagic/metabolism , Telangiectasia, Hereditary Hemorrhagic/pathology , Transforming Growth Factor beta/metabolism
7.
Bioinformatics ; 29(23): 3067-72, 2013 Dec 01.
Article in English | MEDLINE | ID: mdl-24058057

ABSTRACT

MOTIVATION: Advantages of statistical testing of high-throughput screens include P-values, which provide objective benchmarks of compound activity, and false discovery rate estimation. The cost of replication required for statistical testing, however, may often be prohibitive. We introduce the single assay-wide variance experimental (SAVE) design whereby a small replicated subset of an entire screen is used to derive empirical Bayes random error estimates, which are applied to the remaining majority of unreplicated measurements. RESULTS: The SAVE design is able to generate P-values comparable with those generated with full replication data. It performs almost as well as the random variance model t-test with duplicate data and outperforms the commonly used Z-scores with unreplicated data and the standard t-test. We illustrate the approach with simulated data and with experimental small molecule and small interfering RNA screens. The SAVE design provides substantial performance improvements over unreplicated screens with only slight increases in cost.
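The published SAVE design uses an empirical Bayes random-error model; as a simplified stand-in (my sketch, not the paper's estimator), the core idea can be shown with a pooled error variance estimated from a small set of duplicate pairs and then reused to test the unreplicated majority:

```python
from statistics import NormalDist, mean

def save_style_pvalues(duplicate_pairs, single_measurements):
    """Estimate error variance from a replicated subset and reuse it to
    test the unreplicated measurements (two-sided z-tests against zero)."""
    # each duplicate pair (a, b) contributes (a - b)^2 / 2 to the variance
    pooled_var = mean((a - b) ** 2 / 2 for a, b in duplicate_pairs)
    sd = pooled_var ** 0.5
    nd = NormalDist()
    return [2 * nd.cdf(-abs(x) / sd) for x in single_measurements]
```

The pooled estimate is what makes p-values possible without replicating every well; the paper's empirical Bayes machinery refines this by borrowing strength across measurements rather than assuming one common variance.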


Subject(s)
High-Throughput Screening Assays/methods , Models, Theoretical , Pharmaceutical Preparations/chemistry , Research Design , Bayes Theorem , Computer Simulation
8.
Bioinformatics ; 28(13): 1775-82, 2012 Jul 01.
Article in English | MEDLINE | ID: mdl-22563067

ABSTRACT

MOTIVATION: Rapid advances in biomedical sciences and genetics have increased the pressure on drug development companies to promptly translate new knowledge into treatments for disease. Impelled by the demand and facilitated by technological progress, the number of compounds evaluated during the initial high-throughput screening (HTS) step of the drug discovery process has steadily increased. As a highly automated large-scale process, HTS is prone to systematic error caused by various technological and environmental factors. A number of error correction methods have been designed to reduce the effect of systematic error in experimental HTS (Brideau et al., 2003; Carralot et al., 2012; Kevorkov and Makarenkov, 2005; Makarenkov et al., 2007; Malo et al., 2010). Despite their power to correct systematic error when it is present, the applicability of those methods in practice is limited by the fact that they can potentially introduce a bias when applied to unbiased data. We describe two new methods for eliminating systematic error from HTS data based on prior knowledge of the error location. This information can be obtained using a specific version of the t-test or of the χ² goodness-of-fit test, as discussed in Dragiev et al. (2011). We show that both new methods constitute an important improvement over the standard practice of not correcting for systematic error at all, as well as over the B-score correction procedure (Brideau et al., 2003), which is widely used in modern HTS. We also suggest a more general data preprocessing framework in which the new methods can be applied in combination with the Well Correction procedure (Makarenkov et al., 2007). Such a framework allows for removing systematic biases affecting all plates of a given screen as well as those affecting some of its individual plates.
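A minimal sketch of the key idea, correcting only where tests have already located systematic error so that unbiased data pass through untouched (the function and flagging convention below are mine, not the published algorithms):

```python
from statistics import mean

def correct_flagged_rows(plate, flagged_rows):
    """Additively re-centre only the rows that a prior statistical test
    flagged as biased; unflagged rows are returned unchanged, avoiding the
    bias that blanket corrections can introduce into clean data."""
    grand_mean = mean(v for row in plate for v in row)
    corrected = [row[:] for row in plate]
    for i in flagged_rows:
        shift = mean(plate[i]) - grand_mean
        corrected[i] = [v - shift for v in plate[i]]
    return corrected
```

Contrast this with B-score-style polishing, which adjusts every row and column regardless of whether any bias was detected there.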


Subject(s)
High-Throughput Screening Assays/methods , Computer Simulation , Drug Discovery
9.
Bioinformatics ; 28(20): 2632-9, 2012 Oct 15.
Article in English | MEDLINE | ID: mdl-22914219

ABSTRACT

MOTIVATION: Image non-uniformity (NU) refers to systematic, slowly varying spatial gradients in images that result in a bias that can affect all downstream image processing, quantification and statistical analysis steps. Image NU is poorly modeled in the field of high-content screening (HCS), however, such that current conventional correction algorithms may be either inappropriate for HCS or fail to take advantage of the information available in HCS image data. RESULTS: A novel image NU bias correction algorithm, termed intensity quantile estimation and mapping (IQEM), is described. The algorithm estimates the full non-linear form of the image NU bias by mapping pixel intensities to a reference intensity quantile function. IQEM accounts for the variation in NU bias over broad cell intensity ranges and data acquisition times, both of which are characteristic of HCS image datasets. Validation of the method, using simulated and HCS microtubule polymerization screen images, is presented. Two requirements of IQEM are that the dataset consists of large numbers of images acquired under identical conditions and that cells are distributed with no within-image spatial preference. AVAILABILITY AND IMPLEMENTATION: MATLAB function files are available at http://nadon-mugqic.mcgill.ca/.
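The MATLAB implementation is linked above; to make "mapping pixel intensities to a reference intensity quantile function" concrete, here is a rank-based sketch in Python (global over a flat pixel list, whereas IQEM estimates a spatially varying, intensity-dependent version of this mapping):

```python
def quantile_map(pixels, reference):
    """Replace each pixel with the reference-distribution value at the same
    quantile rank. Global sketch of quantile mapping, not IQEM itself."""
    ref = sorted(reference)
    n = len(pixels)
    order = sorted(range(n), key=lambda i: pixels[i])
    mapped = [0.0] * n
    for rank, idx in enumerate(order):
        j = round(rank / (n - 1) * (len(ref) - 1)) if n > 1 else 0
        mapped[idx] = ref[j]
    return mapped
```

Because the mapping is monotone in rank, it corrects an arbitrary non-linear intensity distortion while preserving the ordering of pixel intensities.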


Subject(s)
Algorithms , Image Processing, Computer-Assisted/methods , HeLa Cells , Humans , Microtubules/ultrastructure
10.
Proc Natl Acad Sci U S A ; 107(50): 21487-92, 2010 Dec 14.
Article in English | MEDLINE | ID: mdl-21115840

ABSTRACT

Regulation of gene expression through translational control is a fundamental mechanism implicated in many biological processes ranging from memory formation to innate immunity and whose dysregulation contributes to human diseases. Genome wide analyses of translational control strive to identify differential translation independent of cytosolic mRNA levels. For this reason, most studies measure genes' translation levels as log ratios (translation levels divided by corresponding cytosolic mRNA levels obtained in parallel). Counterintuitively, arising from a mathematical necessity, these log ratios tend to be highly correlated with the cytosolic mRNA levels. Accordingly, they do not effectively correct for cytosolic mRNA level and generate substantial numbers of biological false positives and false negatives. We show that analysis of partial variance, which produces estimates of translational activity that are independent of cytosolic mRNA levels, is a superior alternative. When combined with a variance shrinkage method for estimating error variance, analysis of partial variance has the additional benefit of having greater statistical power and identifying fewer genes as translationally regulated resulting merely from unrealistically low variance estimates rather than from large changes in translational activity. In contrast to log ratios, this formal analytical approach estimates translation effects in a statistically rigorous manner, eliminates the need for inefficient and error-prone heuristics, and produces results that agree with biological function. The method is applicable to datasets obtained from both the commonly used polysome microarray method and the sequencing-based ribosome profiling method.
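The essence of analysis of partial variance can be sketched in a few lines (illustrative Python, not the authors' implementation, which additionally applies variance shrinkage): regress translation levels on cytosolic mRNA levels and analyze the residuals, which, unlike log ratios, are uncorrelated with cytosolic mRNA by construction.

```python
from statistics import mean

def translation_residuals(translation, cytosolic):
    """Ordinary least-squares residuals of translation on cytosolic mRNA:
    the part of translational activity not explained by cytosolic levels."""
    mx, my = mean(cytosolic), mean(translation)
    sxx = sum((x - mx) ** 2 for x in cytosolic)
    slope = sum((x - mx) * (y - my)
                for x, y in zip(cytosolic, translation)) / sxx
    return [y - (my + slope * (x - mx))
            for x, y in zip(cytosolic, translation)]
```

A log ratio implicitly forces the regression slope to 1, which is exactly why it stays correlated with cytosolic mRNA when the true slope differs from 1.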


Subject(s)
Genome, Human , Genome-Wide Association Study , Protein Biosynthesis , Databases, Genetic , Gene Expression Regulation , Humans , Oligonucleotide Array Sequence Analysis , RNA, Messenger/genetics , RNA, Messenger/metabolism , Ribosomes/genetics , Ribosomes/metabolism
11.
Bioinformatics ; 27(10): 1440-1, 2011 May 15.
Article in English | MEDLINE | ID: mdl-21422072

ABSTRACT

UNLABELLED: Translational control of gene expression has emerged as a major mechanism that regulates many biological processes and shows dysregulation in human diseases including cancer. When studying differential translation, levels of both actively translating mRNAs and total cytosolic mRNAs are obtained where the latter is used to correct for a possible contribution of differential cytosolic mRNA levels to the observed differential levels of actively translated mRNAs. We have recently shown that analysis of partial variance (APV) corrects for cytosolic mRNA levels more effectively than the commonly applied log ratio approach. APV provides a high degree of specificity and sensitivity for detecting biologically meaningful translation changes, especially when combined with a variance shrinkage method for estimating random error. Here we describe the anota (analysis of translational activity) R-package which implements APV, allows scrutiny of associated statistical assumptions and provides biologically motivated filters for analysis of genome wide datasets. Although the package was developed for analysis of differential translation in polysome microarray or ribosome-profiling datasets, any high-dimensional data that result in paired controls, such as RNP immunoprecipitation-microarray (RIP-CHIP) datasets, can be successfully analyzed with anota. AVAILABILITY: The anota Bioconductor package, www.bioconductor.org.


Subject(s)
Gene Expression Regulation , Genome-Wide Association Study/methods , Protein Biosynthesis , Software , Genome, Human , Humans , RNA, Messenger/biosynthesis , RNA, Messenger/genetics , RNA, Messenger/metabolism , Ribosomes/genetics , Ribosomes/metabolism
12.
BMC Bioinformatics ; 12: 25, 2011 Jan 19.
Article in English | MEDLINE | ID: mdl-21247425

ABSTRACT

BACKGROUND: High-throughput screening (HTS) is a key part of the drug discovery process during which thousands of chemical compounds are screened and their activity levels measured in order to identify potential drug candidates (i.e., hits). Many technical, procedural or environmental factors can cause systematic measurement error or inequalities in the conditions in which the measurements are taken. Such systematic error has the potential to critically affect the hit selection process. Several error correction methods and software packages have been developed to address this issue in the context of experimental HTS [1-7]. Despite their power to reduce the impact of systematic error when applied to error-perturbed datasets, those methods also have one disadvantage: they introduce a bias when applied to data not containing any systematic error [6]. Hence, we need first to assess the presence of systematic error in a given HTS assay, and then carry out a systematic error correction method if and only if the presence of systematic error has been confirmed by statistical tests. RESULTS: We tested three statistical procedures to assess the presence of systematic error in experimental HTS data: the χ² goodness-of-fit test, Student's t-test and the Kolmogorov-Smirnov test [8] preceded by the Discrete Fourier Transform (DFT) method [9]. We applied these procedures first to raw HTS measurements and then to estimated hit distribution surfaces. The three competing tests were applied to analyse simulated datasets containing different types of systematic error, and to a real HTS dataset. Their accuracy was compared under various error conditions. CONCLUSIONS: A successful assessment of the presence of systematic error in experimental HTS assays is possible when the appropriate statistical methodology is used. Namely, the t-test should be carried out by researchers to determine whether systematic error is present in their HTS data prior to applying any error correction method. This important step can significantly improve the quality of selected hits.
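As one concrete instance of such an assessment (a sketch; the paper evaluates the χ², t- and Kolmogorov-Smirnov variants in detail), the χ² goodness-of-fit statistic for row-wise hit counts tests whether hits are spread uniformly across the rows of a plate:

```python
def chi2_uniformity_statistic(hit_counts):
    """Chi-squared goodness-of-fit statistic against a uniform spread of
    hits across rows (or columns). Compare with the chi-squared critical
    value at len(hit_counts) - 1 degrees of freedom; a large value
    suggests row-wise systematic error."""
    expected = sum(hit_counts) / len(hit_counts)
    return sum((observed - expected) ** 2 / expected
               for observed in hit_counts)
```

Only when this statistic exceeds the critical value would a correction method be applied, in line with the paper's "correct if and only if error is detected" recommendation.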


Subject(s)
Drug Discovery , High-Throughput Screening Assays/methods , Data Interpretation, Statistical , Software
13.
Bioinformatics ; 26(1): 98-103, 2010 Jan 01.
Article in English | MEDLINE | ID: mdl-19892804

ABSTRACT

MOTIVATION: Labeling techniques are being used increasingly to estimate relative protein abundances in quantitative proteomic studies. These techniques require the accurate measurement of correspondingly labeled peptide peak intensities to produce high-quality estimates of differential expression ratios. In mass spectrometers with counting detectors, the measurement noise varies with intensity and consequently accuracy increases with the number of ions detected. Consequently, the relative variability of peptide intensity measurements varies with intensity. This effect must be accounted for when combining information from multiple peptides to estimate relative protein abundance. RESULTS: We examined a variety of algorithms that estimate protein differential expression ratios from multiple peptide intensity measurements. Algorithms that account for the variation of measurement error with intensity were found to provide the most accurate estimates of differential abundance. A simple Sum-of-Intensities algorithm provided the best estimates of true protein ratios of all algorithms tested.
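The winning Sum-of-Intensities estimator is simple enough to state directly (a sketch; the function name is mine): summing intensities across peptides before taking the ratio implicitly down-weights low-intensity peptides, whose counting noise is relatively larger.

```python
def sum_of_intensities_ratio(peptide_intensity_pairs):
    """Protein ratio = sum of labeled intensities / sum of reference
    intensities across all peptides of the protein."""
    labeled = sum(a for a, _ in peptide_intensity_pairs)
    reference = sum(b for _, b in peptide_intensity_pairs)
    return labeled / reference

# Intensity-weighted: the noisy low-intensity peptide (20 vs 5) barely
# moves the estimate, unlike a plain mean of per-peptide ratios.
ratio = sum_of_intensities_ratio([(200.0, 100.0), (20.0, 5.0)])
```

For the example above, averaging the per-peptide ratios (2.0 and 4.0) would give 3.0, whereas the intensity-weighted estimate stays close to the ratio of the high-count peptide.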


Subject(s)
Algorithms , Isotope Labeling/methods , Peptide Mapping/methods , Proteins/analysis , Proteins/chemistry , Amino Acid Sequence , Molecular Sequence Data , Sensitivity and Specificity
14.
Proc Natl Acad Sci U S A ; 105(31): 10853-8, 2008 Aug 05.
Article in English | MEDLINE | ID: mdl-18664580

ABSTRACT

Activation of the phosphatidylinositol 3-kinase (PI3K)/AKT signaling pathway is a frequent occurrence in human cancers and a major promoter of chemotherapeutic resistance. Inhibition of one downstream target in this pathway, mTORC1, has shown potential to improve chemosensitivity. However, the mechanisms and genetic modifications that confer sensitivity to mTORC1 inhibitors remain unclear. Here, we demonstrate that loss of TSC2 in the E mu-myc murine lymphoma model leads to mTORC1 activation and accelerated oncogenesis caused by a defective apoptotic program despite compromised AKT phosphorylation. Tumors from Tsc2(+/-)E mu-Myc mice underwent rapid apoptosis upon blockade of mTORC1 by rapamycin. We identified myeloid cell leukemia sequence 1 (Mcl-1), a bcl-2 like family member, as a translationally regulated genetic determinant of mTORC1-dependent survival. Our results indicate that the extent by which rapamycin can modulate expression of Mcl-1 is an important feature of the rapamycin response.


Subject(s)
Gene Expression Regulation, Neoplastic/physiology , Lymphoma/metabolism , Proto-Oncogene Proteins c-bcl-2/metabolism , Signal Transduction/physiology , Sirolimus/metabolism , Transcription Factors/metabolism , Animals , Gene Expression Regulation, Neoplastic/drug effects , Immunoblotting , Immunoprecipitation , Mechanistic Target of Rapamycin Complex 1 , Mice , Multiprotein Complexes , Myeloid Cell Leukemia Sequence 1 Protein , Proteins , Reverse Transcriptase Polymerase Chain Reaction , Sirolimus/pharmacology , TOR Serine-Threonine Kinases , Transcription Factors/antagonists & inhibitors , Tuberous Sclerosis Complex 2 Protein , Tumor Suppressor Proteins/genetics , Tumor Suppressor Proteins/metabolism
15.
BMC Bioinformatics ; 10: 45, 2009 Feb 03.
Article in English | MEDLINE | ID: mdl-19192265

ABSTRACT

BACKGROUND: DNA microarrays provide data for genome wide patterns of expression between observation classes. Microarray studies often have small samples sizes, however, due to cost constraints or specimen availability. This can lead to poor random error estimates and inaccurate statistical tests of differential expression. We compare the performance of the standard t-test, fold change, and four small n statistical test methods designed to circumvent these problems. We report results of various normalization methods for empirical microarray data and of various random error models for simulated data. RESULTS: Three Empirical Bayes methods (CyberT, BRB, and limma t-statistics) were the most effective statistical tests across simulated and both 2-colour cDNA and Affymetrix experimental data. The CyberT regularized t-statistic in particular was able to maintain expected false positive rates with simulated data showing high variances at low gene intensities, although at the cost of low true positive rates. The Local Pooled Error (LPE) test introduced a bias that lowered false positive rates below theoretically expected values and had lower power relative to the top performers. The standard two-sample t-test and fold change were also found to be sub-optimal for detecting differentially expressed genes. The generalized log transformation was shown to be beneficial in improving results with certain data sets, in particular high variance cDNA data. CONCLUSION: Pre-processing of data influences performance and the proper combination of pre-processing and statistical testing is necessary for obtaining the best results. All three Empirical Bayes methods assessed in our study are good choices for statistical tests for small n microarray studies for both Affymetrix and cDNA data. Choice of method for a particular study will depend on software and normalization preferences.
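The shared ingredient of the three Empirical Bayes winners is variance moderation; a minimal sketch of the limma-style formula (illustrative values, not code from any of the packages):

```python
def moderated_variance(sample_var, df, prior_var, prior_df):
    """Shrink a gene's sample variance toward a prior variance pooled
    across genes; this prevents small-n genes with near-zero sample
    variance from producing spuriously large t-statistics."""
    return (prior_df * prior_var + df * sample_var) / (prior_df + df)
```

With, say, 2 residual degrees of freedom per gene and a prior worth 4 degrees of freedom, a gene whose sample variance happens to be 0 is still tested against a variance of two-thirds of the pooled value rather than 0.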


Subject(s)
Computational Biology/methods , Models, Statistical , Oligonucleotide Array Sequence Analysis/methods , Algorithms , DNA, Complementary/chemistry , Gene Expression Profiling/methods
16.
Trends Genet ; 22(2): 84-9, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16377025

ABSTRACT

Many biologists believe that data analysis expertise lags behind the capacity for producing high-throughput data. One view within the bioinformatics community is that biological scientists need to develop algorithmic skills to meet the demands of the new technologies. In this article, we argue that the broader concept of inferential literacy, which includes understanding of data characteristics, experimental design and statistical analysis, in addition to computation, more adequately encompasses what is needed for efficient progress in high-throughput biology.


Subject(s)
Gene Expression Profiling/standards , Oligonucleotide Array Sequence Analysis/standards , Animals , Computational Biology , Humans , Reproducibility of Results , Software
17.
Bioinformatics ; 24(15): 1735-6, 2008 Aug 01.
Article in English | MEDLINE | ID: mdl-18450812

ABSTRACT

UNLABELLED: Jain et al. introduced the Local Pooled Error (LPE) statistical test designed for use with small sample size microarray gene-expression data. Based on an asymptotic proof, the test multiplicatively adjusts the standard error for a test of differences between two classes of observations by π/2 due to the use of medians rather than means as measures of central tendency. The adjustment is upwardly biased at small sample sizes, however, producing fewer than expected small P-values with a consequent loss of statistical power. We present an empirical correction to the adjustment factor which removes the bias and produces theoretically expected P-values when distributional assumptions are met. Our adjusted LPE measure should prove useful to ongoing methodological studies designed to improve the LPE's performance for microarray and proteomics applications and for future work on other high-throughput biotechnologies. AVAILABILITY: The software is implemented in the R language and can be downloaded from the Bioconductor project website (http://www.bioconductor.org).


Subject(s)
Algorithms , Artifacts , Data Interpretation, Statistical , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Reproducibility of Results , Sensitivity and Specificity
18.
Nat Biotechnol ; 24(2): 167-75, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16465162

ABSTRACT

High-throughput screening is an early critical step in drug discovery. Its aim is to screen a large number of diverse chemical compounds to identify candidate 'hits' rapidly and accurately. Few statistical tools are currently available, however, to detect quality hits with a high degree of confidence. We examine statistical aspects of data preprocessing and hit identification for primary screens. We focus on concerns related to positional effects of wells within plates, choice of hit threshold and the importance of minimizing false-positive and false-negative rates. We argue that replicate measurements are needed to verify assumptions of current methods and to suggest data analysis strategies when assumptions are not met. The integration of replicates with robust statistical methods in primary screens will facilitate the discovery of reliable hits, ultimately improving the sensitivity and specificity of the screening process.


Subject(s)
Biological Assay/methods , Biometry/methods , Data Interpretation, Statistical , Drug Design , Drug Evaluation, Preclinical/methods , Gene Expression Profiling/methods , Microarray Analysis/methods , Guidelines as Topic , Reproducibility of Results , Sensitivity and Specificity
19.
Elife ; 8, 2019 Jul 29.
Article in English | MEDLINE | ID: mdl-31355746

ABSTRACT

A range of problems currently undermines public trust in biomedical research. We discuss four erroneous beliefs that may prevent the biomedical research community from recognizing the need to focus on deserving this trust, and thus which act as powerful barriers to necessary improvements in the research process.


Subject(s)
Biomedical Research , Culture , Public Opinion , Trust , Humans
20.
Bioinformatics ; 23(13): 1648-57, 2007 Jul 01.
Article in English | MEDLINE | ID: mdl-17463024

ABSTRACT

MOTIVATION: High-throughput screening (HTS) is an early-stage process in drug discovery which allows thousands of chemical compounds to be tested in a single study. We report a method for correcting HTS data prior to the hit selection process (i.e. selection of active compounds). The proposed correction minimizes the impact of systematic errors which may affect the hit selection in HTS. The introduced method, called a well correction, proceeds by correcting the distribution of measurements within wells of a given HTS assay. We use simulated and experimental data to illustrate the advantages of the new method compared to other widely-used methods of data correction and hit selection in HTS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
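A sketch of the well-correction idea in Python (the paper defines the exact procedure; this simplified version z-scores each well position across all plates of the screen, removing biases that recur at the same well on every plate):

```python
from statistics import mean, stdev

def well_correction(plates):
    """plates: list of plates, each a row-major 2-D list of measurements.
    Normalizes every well position across plates (simplified sketch)."""
    n_rows, n_cols = len(plates[0]), len(plates[0][0])
    corrected = [[[0.0] * n_cols for _ in range(n_rows)] for _ in plates]
    for i in range(n_rows):
        for j in range(n_cols):
            series = [plate[i][j] for plate in plates]
            m, s = mean(series), stdev(series)
            for p, value in enumerate(series):
                corrected[p][i][j] = (value - m) / s
    return corrected
```

Because the statistics are computed per well position rather than per plate, a well that reads systematically high on every plate (an edge effect, for example) is centred without disturbing genuine plate-to-plate differences in compound activity.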


Subject(s)
Artifacts , Biological Assay/methods , Data Interpretation, Statistical , Drug Design , Drug Evaluation, Preclinical/methods , Technology, Pharmaceutical/methods , Sensitivity and Specificity