Results 1 - 17 of 17
1.
BMC Genomics ; 23(1): 504, 2022 Jul 12.
Article in English | MEDLINE | ID: mdl-35831808

ABSTRACT

BACKGROUND: Using single-cell RNA sequencing (scRNA-seq) data to diagnose disease is an effective technique in medical research. Several statistical methods have been developed for the classification of RNA sequencing (RNA-seq) data, including Poisson linear discriminant analysis (PLDA), negative binomial linear discriminant analysis (NBLDA), and zero-inflated Poisson logistic discriminant analysis (ZIPLDA). Nevertheless, few existing methods perform well for large-sample scRNA-seq data, in particular when the distributional assumption is also violated. RESULTS: We propose a deep learning classifier (scDLC) for large-sample scRNA-seq data, based on long short-term memory recurrent neural networks (LSTMs). Our new scDLC does not require prior knowledge of the data distribution; instead, it takes into account the dependency among the most outstanding feature genes in the LSTM model. An LSTM is a special recurrent neural network that can learn long-term dependencies of a sequence. CONCLUSIONS: Simulation studies show that our new scDLC performs consistently better than the existing methods in a wide range of settings with large sample sizes. Four real scRNA-seq datasets are also analyzed, and the results agree with the simulations in that our new scDLC always performs the best. The code, named "scDLC", is publicly available at https://github.com/scDLC-code/code .


Subject(s)
Deep Learning , Discriminant Analysis , Gene Expression Profiling/methods , RNA/genetics , RNA-Seq , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods
2.
Bioinformatics ; 34(8): 1329-1335, 2018 04 15.
Article in English | MEDLINE | ID: mdl-29186294

ABSTRACT

Motivation: With the development of high-throughput techniques, RNA-sequencing (RNA-seq) is becoming increasingly popular as an alternative for gene expression analysis, such as RNA profiling and classification. Identifying which disease type a new patient belongs to from RNA-seq data has been recognized as a vital problem in medical research. As RNA-seq data are discrete, statistical methods developed for classifying microarray data cannot be readily applied to RNA-seq data classification. Witten proposed a Poisson linear discriminant analysis (PLDA) to classify RNA-seq data in 2011. Note, however, that count datasets are frequently characterized by excess zeros in real RNA-seq or microRNA sequence data (i.e. when the sequencing depth is insufficient, or for small RNAs with a length of 18-30 nucleotides). Therefore, a new model is needed to analyze RNA-seq data with an excess of zeros. Results: In this paper, we propose a Zero-Inflated Poisson Logistic Discriminant Analysis (ZIPLDA) for RNA-seq data with an excess of zeros. The new method assumes that the data are from a mixture of two distributions: one is a point mass at zero, and the other follows a Poisson distribution. We then consider a logistic relation between the probability of observing zeros and the mean of the genes and the sequencing depth in the model. Simulation studies show that the proposed method performs better than, or at least as well as, the existing methods in a wide range of settings. Two real datasets, a breast cancer RNA-seq dataset and a microRNA-seq dataset, are also analyzed, and the results agree with the simulations in that our proposed method outperforms the existing competitors. Availability and implementation: The software is available at http://www.math.hkbu.edu.hk/~tongt. Contact: xwan@comp.hkbu.edu.hk or tongt@hkbu.edu.hk. Supplementary information: Supplementary data are available at Bioinformatics online.
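The two-component mixture described in this abstract (a point mass at zero plus a Poisson distribution) can be illustrated with a minimal sketch. The function names and the numeric values below are hypothetical and chosen for illustration only; the sketch treats the zero-inflation probability `pi_zero` as a fixed input and does not reproduce the logistic relation to gene mean and sequencing depth, whose exact form is not given in the abstract.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of observing count k when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k: int, lam: float, pi_zero: float) -> float:
    """Zero-inflated Poisson probability of count k.

    With probability pi_zero the count is a structural zero; otherwise it
    is drawn from Poisson(lam). pi_zero is an assumed, fixed input here.
    """
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base


# Illustrative values only (not taken from the paper).
p_zero_plain = poisson_pmf(0, 2.0)   # zero probability under a plain Poisson
p_zero_zip = zip_pmf(0, 2.0, 0.3)    # zero probability with 30% zero inflation
```

Under these assumed values the zero-inflated model assigns more probability to a zero count than the plain Poisson with the same mean, which is the behaviour the excess-zeros argument relies on.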


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Software , Breast Neoplasms/genetics , Discriminant Analysis , Female , Humans , MicroRNAs
3.
Cancer Immunol Immunother ; 66(6): 717-729, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28246881

ABSTRACT

Non-Hodgkin lymphoma (NHL) is an incurable lymphoproliferative cancer, and patients with NHL have a poor prognosis. The present study explored the regulatory mechanism of expression and possible roles of the immunosuppressive B7-H4 molecule in human NHL. For functional studies, NHL-reactive T cell lines were generated via the isolation of allogeneic CD3+ T cells from healthy donors and repeated in vitro stimulation with irradiated NHL cells isolated from patients. B7-H4 was found to be distributed in NHL cells and tissues, and its surface protein expression levels were further upregulated by the incubation of NHL cells with interleukin (IL)-6, IL-10, or interferon-γ. Additionally, the supernatants of tumor-associated macrophages (tMφs) upregulated B7-H4 surface expression by producing IL-6 and IL-10. B7-H4 expressed in NHL cells inhibited the cytotoxic activity of NHL-reactive T cells. Conversely, the inhibition of B7-H4 in NHL cells promoted T cell immunity and sensitized NHL cells to cytolysis. Furthermore, tMφ-induced B7-H4 promoted NHL cell evasion of the T cell immune response. In conclusion, this study shows that NHL-expressed B7-H4 is an important immunosuppressive factor that inhibits host anti-tumor immunity to NHL. Targeting tumor-expressed B7-H4 may thus provide a new treatment strategy for NHL patients.


Subject(s)
Interleukin-10/metabolism , Interleukin-6/metabolism , Lymphoma, Non-Hodgkin/immunology , Lymphoma, Non-Hodgkin/metabolism , Macrophages/immunology , T-Lymphocytes, Regulatory/immunology , Tumor Escape , V-Set Domain-Containing T-Cell Activation Inhibitor 1/metabolism , Cell Communication/immunology , Humans , Lymphoma, Non-Hodgkin/pathology , Tumor Cells, Cultured
4.
Genome Res ; 23(9): 1522-40, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23804400

ABSTRACT

DNA methylation plays key roles in diverse biological processes such as X chromosome inactivation, transposable element repression, genomic imprinting, and tissue-specific gene expression. Sequencing-based DNA methylation profiling provides an unprecedented opportunity to map and compare complete DNA methylomes. This includes one of the most widely applied technologies for measuring DNA methylation: methylated DNA immunoprecipitation followed by sequencing (MeDIP-seq), coupled with a complementary method, methylation-sensitive restriction enzyme sequencing (MRE-seq). A computational approach that integrates data from these two different but complementary assays and predicts methylation differences between samples has been unavailable. Here, we present a novel integrative statistical framework M&M (for integration of MeDIP-seq and MRE-seq) that dynamically scales, normalizes, and combines MeDIP-seq and MRE-seq data to detect differentially methylated regions. Using sample-matched whole-genome bisulfite sequencing (WGBS) as a gold standard, we demonstrate superior accuracy and reproducibility of M&M compared to existing analytical methods for MeDIP-seq data alone. M&M leverages the complementary nature of MeDIP-seq and MRE-seq data to allow rapid comparative analysis between whole methylomes at a fraction of the cost of WGBS. Comprehensive analysis of nineteen human DNA methylomes with M&M reveals distinct DNA methylation patterns among different tissue types, cell types, and individuals, potentially underscoring divergent epigenetic regulation at different scales of phenotypic diversity. We find that differential DNA methylation at enhancer elements, with concurrent changes in histone modifications and transcription factor binding, is common at the cell, tissue, and individual levels, whereas promoter methylation is more prominent in reinforcing fundamental tissue identities.


Subject(s)
Algorithms , DNA Methylation , Genome, Human , Sequence Analysis, DNA/methods , Data Interpretation, Statistical , High-Throughput Nucleotide Sequencing/methods , Humans , Organ Specificity
5.
BMC Genomics ; 15: 868, 2014 Oct 06.
Article in English | MEDLINE | ID: mdl-25286960

ABSTRACT

BACKGROUND: Aberrant DNA methylation is a hallmark of many cancers. Classically there are two types of endometrial cancer, endometrioid adenocarcinoma (EAC), or Type I, and uterine papillary serous carcinoma (UPSC), or Type II. However, the whole-genome DNA methylation changes in these two classical types of endometrial cancer are still unknown. RESULTS: Here we describe complete genome-wide DNA methylome maps of EAC, UPSC, and normal endometrium, obtained by applying a combined strategy of methylated DNA immunoprecipitation sequencing (MeDIP-seq) and methylation-sensitive restriction enzyme digestion sequencing (MRE-seq). We discovered distinct genome-wide DNA methylation patterns in EAC and UPSC: 27,009 and 15,676 recurrent differentially methylated regions (DMRs) were identified, respectively, compared with normal endometrium. Over 80% of DMRs were in intergenic and intronic regions. The majority of these DMRs were not interrogated on the commonly used Infinium 450K array platform. Large-scale demethylation of chromosome X was detected in UPSC, accompanied by decreased XIST expression. Importantly, we discovered that the majority of the DMRs harbored promoter or enhancer functions and are specifically associated with genes related to uterine development and disease. Among these, abnormal methylation of transposable elements (TEs) may provide a novel mechanism to deregulate normal endometrium-specific enhancers derived from specific TEs. CONCLUSIONS: DNA methylation changes are an important signature of endometrial cancer and regulate gene expression by affecting not only proximal promoters but also distal enhancers.


Subject(s)
Endometrial Neoplasms/genetics , Endometrial Neoplasms/physiopathology , Enhancer Elements, Genetic/genetics , Promoter Regions, Genetic/genetics , Uterine Neoplasms/genetics , Uterine Neoplasms/physiopathology , Adaptor Proteins, Signal Transducing/genetics , Aldehyde Dehydrogenase 1 Family , Carcinoma, Papillary/genetics , Carcinoma, Papillary/metabolism , Chromosomes, Human, X , CpG Islands , DNA (Cytosine-5-)-Methyltransferases/genetics , DNA (Cytosine-5-)-Methyltransferases/metabolism , DNA Methylation , DNA Transposable Elements/genetics , Female , Humans , Kruppel-Like Factor 4 , Kruppel-Like Transcription Factors/genetics , MutL Protein Homolog 1 , Nuclear Proteins/genetics , Polymorphism, Single Nucleotide , RNA, Long Noncoding/genetics , Retinal Dehydrogenase/genetics , Sequence Analysis, DNA
6.
Beijing Da Xue Xue Bao Yi Xue Ban ; 44(3): 437-43, 2012 Jun 18.
Article in Chinese | MEDLINE | ID: mdl-22692318

ABSTRACT

OBJECTIVE: To investigate the tissue distribution characteristics of 1,3-diphenyl-1,3-propanedione (DPPD) in mice. METHODS: Male ICR mice were dosed with DPPD 500 mg/kg via oral gavage, and tissue samples of the heart, liver, spleen, lungs, kidneys and muscle of each mouse were collected as scheduled. At each time point, the concentrations of DPPD in the mouse tissues were measured by high performance liquid chromatography (HPLC). The main pharmacokinetic parameters were calculated with Thermo Kinetica 4.4.1 software. RESULTS: DPPD was absorbed rapidly after oral administration. DPPD concentrations were higher in the liver and kidney (liver: AUC(tot)=41.92 µg×h/g, kidney: AUC(tot)=40.40 µg×h/g). The drug distributed rapidly to the liver and lungs (T(max)=0.32 h and 0.33 h, respectively) after oral administration, whereas in the muscle T(max) was 3.85 h. The maximum concentration of DPPD was found in the liver (C(max)=31.20 µg/g), which was also the highest concentration among all the tissues studied. DPPD could be detected at low concentrations within 24 h in all the tissues examined. CONCLUSION: DPPD was distributed unevenly across tissues. Drug concentrations were higher in the liver, kidney and muscle, and lower in the lungs and spleen.


Subject(s)
Chalcones/pharmacokinetics , Animals , Chromatography, High Pressure Liquid , Male , Mice , Mice, Inbred ICR , Tissue Distribution
7.
Stat Methods Med Res ; 30(1): 112-128, 2021 01.
Article in English | MEDLINE | ID: mdl-32726188

ABSTRACT

Hidden Markov models are useful in simultaneously analyzing a longitudinal observation process and its dynamic transition. Existing hidden Markov models focus on mean regression for the longitudinal response. However, the tails of the response distribution are as important as the center in many substantive studies. We propose a quantile hidden Markov model to provide a systematic method to examine the entire conditional distribution of the response given the hidden state and potential covariates. Instead of considering homogeneous hidden Markov models, which assume that the probabilities of between-state transitions are independent of subject- and time-specific characteristics, we allow the transition probabilities to depend on exogenous covariates, thereby yielding nonhomogeneous Markov chains and making the proposed model more flexible than its homogeneous counterpart. We develop a Bayesian approach coupled with efficient Markov chain Monte Carlo methods for statistical inference. Simulations are conducted to assess the empirical performance of the proposed method. The proposed methodology is applied to a cocaine use study to provide new insights into the prevention of cocaine use.


Subject(s)
Models, Statistical , Bayes Theorem , Markov Chains , Monte Carlo Method
8.
Stat Methods Med Res ; 30(7): 1640-1653, 2021 07.
Article in English | MEDLINE | ID: mdl-34134561

ABSTRACT

For a nonparametric Behrens-Fisher problem, a directional-sum test is proposed based on a division-combination strategy. A one-layer wild bootstrap procedure is given to calculate its statistical significance. We conduct simulation studies with data generated from lognormal, t and Laplace distributions to show that the proposed test can control the type I error rates properly and is more powerful than the existing rank-sum and maximum-type tests under most of the considered scenarios. Applications to a dietary intervention trial further show the performance of the proposed test.


Subject(s)
Diet , Research Design , Computer Simulation , Models, Statistical
9.
PLoS One ; 15(6): e0234094, 2020.
Article in English | MEDLINE | ID: mdl-32589640

ABSTRACT

An important inferential task in functional linear models is to test the dependence between the response and the functional predictor. The traditional testing theory was constructed based on functional principal component analysis, which requires estimating the covariance operator of the functional predictor. Due to the intrinsic high-dimensionality of functional data, the sample is often not large enough to allow accurate estimation of the covariance operator, which leaves the follow-up test underpowered. To avoid the expensive estimation of the covariance operator, we propose a nonparametric method called Functional Linear models with U-statistics TEsting (FLUTE) to test the dependence assumption. We show that the FLUTE test is more powerful than the current benchmark method (Kokoszka P, 2008; Patilea V, 2016) in the small or moderate sample case. We further prove the asymptotic normality of our test statistic under both the null hypothesis and a local alternative hypothesis. The merit of our method is demonstrated by both simulation studies and real examples.


Subject(s)
Models, Statistical , Canada , Linear Models , Statistics, Nonparametric , Weather
10.
BMC Bioinformatics ; 10: 146, 2009 May 15.
Article in English | MEDLINE | ID: mdl-19445669

ABSTRACT

BACKGROUND: Time-course microarray experiments produce vector gene expression profiles across a series of time points. Clustering genes based on these profiles is important in discovering functionally related and co-regulated genes. Earlier clustering algorithms do not take advantage of the ordering in a time-course study, explicit use of which should allow more sensitive detection of genes that display a consistent pattern over time. Peddada et al. [1] proposed a clustering algorithm that can incorporate the temporal ordering using order-restricted statistical inference. This algorithm is, however, very time-consuming and hence inapplicable to most microarray experiments that contain a large number of genes. Its computational burden also makes it difficult to assess the clustering reliability, which is a very important measure when clustering noisy microarray data. RESULTS: We propose a computationally efficient information criterion-based clustering algorithm, called ORICC, that also takes account of the ordering in time-course microarray experiments by embedding the order-restricted inference into a model selection framework. Genes are assigned to the profile they best match, as determined by a newly proposed information criterion for order-restricted inference. In addition, we developed a bootstrap procedure to assess ORICC's clustering reliability for every gene. Simulation studies show that the ORICC method is robust, always gives better clustering accuracy than Peddada's method, and reduces computation time by a factor of hundreds. Under some scenarios, its accuracy is also better than that of some other existing clustering methods for short time-course microarray data, such as STEM [2] and Wang et al. [3]. It is also computationally much faster than Wang et al. [3]. CONCLUSION: Our ORICC algorithm, which takes advantage of the temporal ordering in time-course microarray experiments, provides good clustering accuracy and is much faster than Peddada's method. Moreover, the clustering reliability for each gene can also be assessed, which is unavailable in Peddada's method. In a real data example, the ORICC algorithm identifies new and interesting genes that previous analyses failed to reveal.


Subject(s)
Cluster Analysis , Gene Expression Profiling/methods , Models, Genetic , Models, Statistical , Oligonucleotide Array Sequence Analysis/methods , Algorithms , Breast Neoplasms , Computer Simulation , Databases, Factual , Female , Genes , Humans , Research Design
11.
PLoS One ; 13(8): e0201586, 2018.
Article in English | MEDLINE | ID: mdl-30086146

ABSTRACT

DNA methylation is an essential epigenetic modification involved in regulating the expression of mammalian genomes. A variety of experimental approaches to generate genome-wide or whole-genome DNA methylation data have emerged in recent years. Methylated DNA immunoprecipitation followed by sequencing (MeDIP-seq) is one of the major tools used in whole-genome epigenetic studies. However, analyzing these data with adequate accuracy, sensitivity, and speed remains an important challenge. Existing methods, such as BATMAN and MEDIPS, analyze MeDIP-seq data by dividing the whole genome into equal-length windows and assume that each CpG in the same window has the same methylation level. More precise work is necessary to estimate the methylation level of each CpG site in the whole genome. In this paper, we propose a method, Statistical Inferences with MeDIP-seq Data (SIMD), to infer the methylation level for each CpG site. In addition, we analyze a real dataset for DNA methylation. The results show that our method displays improved precision in detecting differentially methylated CpG sites compared to the existing method. To meet the demands of the application, we have developed an R package called "SIMD", which is freely available at https://github.com/FocusPaka/SIMD.


Subject(s)
DNA Methylation , Epigenomics/methods , Whole Genome Sequencing/methods , Algorithms , CpG Islands , Epigenesis, Genetic , Gene Expression Regulation , Humans , Internet
12.
J Comput Biol ; 24(11): 1099-1111, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28414553

ABSTRACT

High-throughput techniques bring novel tools and also statistical challenges to genomic research. Identification of which type of diseases a new patient belongs to has been recognized as an important problem. For high-dimensional small sample size data, the classical discriminant methods suffer from the singularity problem and are, therefore, no longer applicable in practice. In this article, we propose a geometric diagonalization method for the regularized discriminant analysis. We then consider a bias correction to further improve the proposed method. Simulation studies show that the proposed method performs better than, or at least as well as, the existing methods in a wide range of settings. A microarray dataset and an RNA-seq dataset are also analyzed and they demonstrate the superiority of the proposed method over the existing competitors, especially when the number of samples is small or the number of genes is large. Finally, we have developed an R package called "GDRDA" which is available upon request.


Subject(s)
Algorithms , Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Gene Expression Profiling , High-Throughput Nucleotide Sequencing/methods , Computer Simulation , Discriminant Analysis , Female , Humans
13.
PLoS One ; 11(7): e0159084, 2016.
Article in English | MEDLINE | ID: mdl-27416030

ABSTRACT

Representation-based classification methods, such as Sparse Representation Classification (SRC) and Linear Regression Classification (LRC), have been successfully developed for the face recognition problem. However, most of these methods use the original face images without any preprocessing for recognition. Thus, their performance may be affected by some problematic factors (such as illumination and expression variances) in the face images. In order to overcome this limitation, a novel supervised filter learning algorithm is proposed for representation-based face recognition in this paper. The underlying idea of our algorithm is to learn a filter so that the within-class representation residuals of the faces' Local Binary Pattern (LBP) features are minimized and the between-class representation residuals of the faces' LBP features are maximized. Therefore, the LBP features of filtered face images are more discriminative for representation-based classifiers. Furthermore, we also extend our algorithm to the heterogeneous face recognition problem. Extensive experiments are carried out on five databases and the experimental results verify the efficacy of the proposed algorithm.


Subject(s)
Artificial Intelligence , Facial Recognition , Pattern Recognition, Automated/methods , Algorithms , Databases, Factual , Face , Image Interpretation, Computer-Assisted/methods , Lighting
14.
BioData Min ; 7: 15, 2014.
Article in English | MEDLINE | ID: mdl-25285156

ABSTRACT

BACKGROUND: Next generation sequencing technologies are powerful new tools for investigating a wide range of biological and medical questions. Statistical and computational methods are key to analyzing massive and complex sequencing data. In order to derive gene expression measures and compare these measures across samples or libraries, we first need to normalize read counts to adjust for varying sample sequencing depths and other potentially technical effects. RESULTS: In this paper, we develop a normalization method based on iterating median of M-values (IMM) for detecting differentially expressed (DE) genes. Compared to a previous approach, TMM, the IMM method improves the accuracy of DE detection. Simulation studies show that the IMM method outperforms other methods for sample normalization. We also examine real data and find that the genes detected by IMM but not by TMM are much more accurate than the genes detected by TMM but not by IMM. In addition, we found that the gene UNC5C is highly associated with kidney cancer.

15.
BioData Min ; 7(1): 30, 2014.
Article in English | MEDLINE | ID: mdl-25503379

ABSTRACT

[This corrects the article DOI: 10.1186/1756-0381-7-15.].

16.
PLoS One ; 9(11): e113198, 2014.
Article in English | MEDLINE | ID: mdl-25419662

ABSTRACT

Recently, Sparse Representation-based Classification (SRC) has attracted a lot of attention for its applications to various tasks, especially in biometric techniques such as face recognition. However, factors such as lighting, expression, pose and disguise variations in face images will decrease the performances of SRC and most other face recognition techniques. In order to overcome these limitations, we propose a robust face recognition method named Locality Constrained Joint Dynamic Sparse Representation-based Classification (LCJDSRC) in this paper. In our method, a face image is first partitioned into several smaller sub-images. Then, these sub-images are sparsely represented using the proposed locality constrained joint dynamic sparse representation algorithm. Finally, the representation results for all sub-images are aggregated to obtain the final recognition result. Compared with other algorithms which process each sub-image of a face image independently, the proposed algorithm regards the local matching-based face recognition as a multi-task learning problem. Thus, the latent relationships among the sub-images from the same face image are taken into account. Meanwhile, the locality information of the data is also considered in our algorithm. We evaluate our algorithm by comparing it with other state-of-the-art approaches. Extensive experiments on four benchmark face databases (ORL, Extended YaleB, AR and LFW) demonstrate the effectiveness of LCJDSRC.


Subject(s)
Algorithms , Face/anatomy & histology , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Biometric Identification/methods , Humans , Reproducibility of Results
17.
Comput Med Imaging Graph ; 32(8): 685-98, 2008 Dec.
Article in English | MEDLINE | ID: mdl-18818051

ABSTRACT

Image segmentation is often required as a preliminary and indispensable stage in computer-aided medical image processing, particularly during the clinical analysis of magnetic resonance (MR) brain images. In this paper, we present a modified fuzzy c-means (FCM) algorithm for MRI brain image segmentation. In order to reduce the effect of noise during segmentation, the proposed method incorporates both the local spatial context and the non-local information into the standard FCM clustering algorithm, using a novel dissimilarity index in place of the usual distance metric. The efficiency of the proposed algorithm is demonstrated by extensive segmentation experiments using both simulated and real MR images and by comparison with other state-of-the-art algorithms.
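For orientation, the standard FCM algorithm that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the textbook alternating updates on one-dimensional intensities with the usual absolute-difference distance; it is not the modified algorithm of the paper, and it does not include the spatial context, the non-local information, or the dissimilarity index mentioned above. The function name, the small `eps` guard, and the sample intensities are assumptions made for the example.

```python
def fcm_1d(values, centers, m=2.0, iters=50, eps=1e-12):
    """Standard fuzzy c-means on scalar values.

    values  : list of intensities to cluster
    centers : initial cluster centers (one per cluster)
    m       : fuzzifier, must be greater than 1
    eps     : small constant that avoids division by zero when a value
              coincides with a center (an implementation convenience)
    Returns the final centers and the memberships computed in the last
    iteration (one row per value, one column per cluster).
    """
    centers = list(centers)
    memberships = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk) ** (2 / (m - 1))
        memberships = []
        for x in values:
            dists = [abs(x - c) + eps for c in centers]
            row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                   for d_i in dists]
            memberships.append(row)
        # Center update: membership-weighted mean with weights u_ik ** m
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            centers[i] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centers, memberships


# Hypothetical intensities forming two well-separated groups.
intensities = [0.0, 0.1, 0.2, 9.8, 9.9, 10.0]
final_centers, final_memberships = fcm_1d(intensities, [1.0, 8.0])
```

Each membership row sums to one, so a pixel can belong partly to several clusters; a hard segmentation would then assign each value to the cluster with the largest membership.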


Subject(s)
Brain/anatomy & histology , Cluster Analysis , Fuzzy Logic , Image Enhancement/methods , Magnetic Resonance Imaging/methods , Humans , Imaging, Three-Dimensional/methods , Information Storage and Retrieval/methods , Pattern Recognition, Automated/methods , Weights and Measures