Results 1 - 20 of 46
1.
Infancy ; 28(3): 597-618, 2023 05.
Article in English | MEDLINE | ID: mdl-36757022

ABSTRACT

Caregivers' touches that occur alongside words and utterances could aid in the detection of word/utterance boundaries and the mapping of word forms to word meanings. We examined changes in caregivers' use of touches with their speech directed to infants using a multimodal cross-sectional corpus of 35 Korean mother-child dyads across three age groups of infants (8, 14, and 27 months). We tested the hypothesis that caregivers' frequency and use of touches with speech change with infants' development. Results revealed that the frequency of word/utterance-touch alignment as well as word + touch co-occurrence is highest in speech addressed to the youngest group of infants. Thus, this study provides support for the hypothesis that caregivers' use of touch during dyadic interactions is sensitive to infants' age in a way similar to caregivers' use of speech alone and could provide cues useful to infants' language learning at critical points in early development.


Subjects
Mothers , Touch , Female , Humans , Infant , Cross-Sectional Studies , Language , Republic of Korea
2.
Cogn Psychol ; 125: 101360, 2021 03.
Article in English | MEDLINE | ID: mdl-33472104

ABSTRACT

Interest in computational modeling of cognition and behavior continues to grow. To be most productive, modelers should be equipped with tools that ensure optimal efficiency in data collection and in the integrity of inference about the phenomenon of interest. Traditionally, models in cognitive science have been parametric, which are particularly susceptible to model misspecification because their strong assumptions (e.g. parameterization, functional form) may introduce unjustified biases in data collection and inference. To address this issue, we propose a data-driven nonparametric framework for model development, one that also includes optimal experimental design as a goal. It combines Gaussian Processes, a stochastic process often used for regression and classification, with active learning, from machine learning, to iteratively fit the model and use it to optimize the design selection throughout the experiment. The approach, dubbed Gaussian process with active learning (GPAL), is an extension of the parametric, adaptive design optimization (ADO) framework (Cavagnaro, Myung, Pitt, & Kujala, 2010). We demonstrate the application and features of GPAL in a delay discounting task and compare its performance to ADO in two experiments. The results show that GPAL is a viable modeling framework that is noteworthy for its high sensitivity to individual differences, identifying novel patterns in the data that were missed by the model-constrained ADO. This investigation represents a first step towards the development of a data-driven cognitive modeling framework that serves as a middle ground between raw data, which can be difficult to interpret, and parametric models, which rely on strong assumptions.
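
The loop described in this abstract (fit a Gaussian process, then use it to choose the next design) can be sketched as follows. This is a minimal illustration, not the authors' GPAL implementation: the squared-exponential kernel, the noise level, the "largest posterior variance" selection rule, and the noiseless toy response are all assumptions made for the example.

```python
import math

def rbf(a, b, length_scale=2.0):
    """Squared-exponential kernel between two scalar design points (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def gp_posterior(x_train, y_train, x_star, noise=0.01):
    """Posterior mean and variance of a zero-mean GP at a single point x_star."""
    k_tt = [[rbf(a, b) + (noise if i == j else 0.0)
             for j, b in enumerate(x_train)]
            for i, a in enumerate(x_train)]
    k_ts = [rbf(a, x_star) for a in x_train]
    weights = solve(k_tt, k_ts)  # K^{-1} k_*
    mean = sum(w * y for w, y in zip(weights, y_train))
    var = rbf(x_star, x_star) - sum(w * k for w, k in zip(weights, k_ts))
    return mean, max(var, 0.0)

def next_design(x_train, y_train, candidates):
    """Active-learning step: choose the candidate the model is least certain about."""
    return max(candidates, key=lambda c: gp_posterior(x_train, y_train, c)[1])

# Toy run: candidate delays 0..10 and a hypothetical noiseless response curve.
candidates = [float(d) for d in range(0, 11)]
x_obs, y_obs = [0.0], [1.0]
for _ in range(3):
    x_new = next_design(x_obs, y_obs, candidates)
    x_obs.append(x_new)
    y_obs.append(math.exp(-0.3 * x_new))
```

Selecting by posterior variance is only one possible acquisition rule; it is used here because it needs nothing beyond the quantities the GP already provides.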


Subjects
Research Design , Bayes Theorem , Humans , Normal Distribution , Stochastic Processes
3.
Molecules ; 24(7), 2019 Apr 10.
Article in English | MEDLINE | ID: mdl-30974800

ABSTRACT

Recent research in DNA nanotechnology has demonstrated that biological substrates can be used for computing at a molecular level. However, in vitro demonstrations of DNA computations use preprogrammed, rule-based methods which lack the adaptability that may be essential in developing molecular systems that function in dynamic environments. Here, we introduce an in vitro molecular algorithm that 'learns' molecular models from training data, opening the possibility of 'machine learning' in wet molecular systems. Our algorithm enables enzymatic weight update by targeting internal loop structures in DNA and ensemble learning, based on the hypernetwork model. This novel approach allows massively parallel processing of DNA with enzymes for specific structural selection for learning in an iterative manner. We also introduce an intuitive method of DNA data construction to dramatically reduce the number of unique DNA sequences needed to cover the large search space of feature sets. By combining molecular computing and machine learning the proposed algorithm makes a step closer to developing molecular computing technologies for future access to more intelligent molecular systems.


Subjects
DNA , Machine Learning , Molecular Models , Computer Neural Networks , Nucleic Acid Conformation , DNA/chemistry , DNA/genetics
4.
Telemed J E Health ; 24(10): 753-772, 2018 10.
Article in English | MEDLINE | ID: mdl-29420125

ABSTRACT

BACKGROUND: Stress recognition using electrocardiogram (ECG) signals requires the intractable long-term heart rate variability (HRV) parameter extraction process. This study proposes a novel deep learning framework to recognize the stressful states, the Deep ECGNet, using ultra short-term raw ECG signals without any feature engineering methods. METHODS: The Deep ECGNet was developed through various experiments and analysis of ECG waveforms. We proposed the optimal recurrent and convolutional neural networks architecture, and also the optimal convolution filter length (related to the P, Q, R, S, and T wave durations of ECG) and pooling length (related to the heart beat period) based on the optimization experiments and analysis on the waveform characteristics of ECG signals. The experiments were also conducted with conventional methods using HRV parameters and frequency features as a benchmark test. The data used in this study were obtained from Kwangwoon University in Korea (13 subjects, Case 1) and KU Leuven University in Belgium (9 subjects, Case 2). Experiments were designed according to various experimental protocols to elicit stressful conditions. RESULTS: The proposed framework to recognize stress conditions, the Deep ECGNet, outperformed the conventional approaches with the highest accuracy of 87.39% for Case 1 and 73.96% for Case 2, respectively, that is, 16.22% and 10.98% improvements compared with those of the conventional HRV method. CONCLUSIONS: We proposed an optimal deep learning architecture and its parameters for stress recognition, and the theoretical consideration on how to design the deep learning structure based on the periodic patterns of the raw ECG data. Experimental results in this study have proved that the proposed deep learning model, the Deep ECGNet, is an optimal structure to recognize the stress conditions using ultra short-term ECG data.


Subjects
Deep Learning , Electrocardiography/methods , Computer-Assisted Image Processing/methods , Psychological Stress/physiopathology , Adult , Belgium , Heart Rate/physiology , Humans , Male , Computer Neural Networks , Republic of Korea , Young Adult
5.
Nucleic Acids Res ; 41(18): 8464-74, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23887935

ABSTRACT

Aberrant DNA methylation of CpG islands, CpG island shores and first exons is known to play a key role in the altered gene expression patterns in all human cancers. To date, a systematic study on the effect of DNA methylation on gene expression using high resolution data has not been reported. In this study, we conducted an integrated analysis of MethylCap-sequencing data and Affymetrix gene expression microarray data for 30 breast cancer cell lines representing different breast tumor phenotypes. As well-developed methods for the integrated analysis do not currently exist, we created a series of four different analysis methods. On the computational side, our goal is to develop methylome data analysis protocols for the integrated analysis of DNA methylation and gene expression data on the genome scale. On the cancer biology side, we present comprehensive genome-wide methylome analysis results for differentially methylated regions and their potential effect on gene expression in 30 breast cancer cell lines representing three molecular phenotypes, luminal, basal A and basal B. Our integrated analysis demonstrates that methylation status of different genomic regions may play a key role in establishing transcriptional patterns in molecular subtypes of human breast cancer.


Subjects
Breast Neoplasms/genetics , DNA Methylation , Neoplastic Gene Expression Regulation , Binding Sites , Breast Neoplasms/classification , Breast Neoplasms/metabolism , Tumor Cell Line , Down-Regulation , Female , Gene Expression Profiling , Genomics/methods , Humans , Phenotype , Genetic Promoter Regions , Transcription Factors/metabolism
6.
J Biomed Inform ; 49: 101-11, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24524888

ABSTRACT

Predicting the clinical outcomes of cancer patients is a challenging task in biomedicine. A personalized and refined therapy based on predicting prognostic outcomes of cancer patients has been actively sought in the past decade. Accurate prognostic prediction requires higher-order representations of complex dependencies among genetic factors. However, identifying the co-regulatory roles and functional effects of genetic interactions on cancer prognosis is hindered by the complexity of the interactions. Here we propose a prognostic prediction model based on evolutionary learning that identifies higher-order prognostic biomarkers of cancer clinical outcomes. The proposed model represents the interactions of prognostic genes as a combinatorial space. It adopts a flexible hypergraph structure composed of a large population of hyperedges that encode higher-order relationships among many genetic factors. The hyperedge population is optimized by an evolutionary learning method based on sequential Bayesian sampling. The proposed learning approach effectively balances performance and parsimony of the model using information-theoretic dependency and complexity-theoretic regularization priors. Using MAQC-II project data, we demonstrate that our model can handle high-dimensional data more effectively than state-of-the-art classification models. We also identify potential gene interactions characterizing prognosis and recurrence risk in cancer.


Subjects
Bayes Theorem , Learning , Neoplasms/therapy , Humans , Neoplasms/pathology , Treatment Outcome
7.
BMC Bioinformatics ; 13 Suppl 17: S12, 2012.
Article in English | MEDLINE | ID: mdl-23282075

ABSTRACT

BACKGROUND: Biclustering has been utilized to find functionally important patterns in biological problems. Here a bicluster is a submatrix that consists of a subset of rows and a subset of columns in a matrix, and contains homogeneous patterns. The problem of finding biclusters is still challenging because of the computational complexity of capturing patterns from two-dimensional features. RESULTS: We propose a Probabilistic COevolutionary Biclustering Algorithm (PCOBA) that can cluster the rows and columns in a matrix simultaneously by utilizing a dynamic adaptation of multiple species and adopting probabilistic learning. In biclustering problems, a coevolutionary search is suitable since it can optimize interdependent subcomponents formed of rows and columns. Furthermore, acquiring statistical information on two populations using probabilistic learning can improve the ability of the search to reach the optimum value. We evaluated the performance of PCOBA on a synthetic dataset and yeast expression profiles. The results demonstrated that PCOBA outperformed previous evolutionary computation methods as well as other biclustering methods. CONCLUSIONS: Our approach for searching particular biological patterns could be valuable for systematically understanding functional relationships between genes and other biological components at a genome-wide level.
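
The definition of a bicluster given here (a subset of rows and a subset of columns whose entries form a homogeneous pattern) can be made concrete with a small scoring sketch. The score below is the mean squared residue of Cheng and Church, a commonly used homogeneity measure; it is swapped in for illustration and is not claimed to be the fitness function used by PCOBA. The toy matrix is invented.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by rows and cols.

    A value of 0 means every entry is explained by its row mean, column
    mean, and the overall mean, i.e. a perfectly additive pattern.
    """
    sub = [[matrix[r][c] for c in cols] for r in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(row) / n_cols for row in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    total_mean = sum(row_means) / n_rows
    residue = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue += (sub[i][j] - row_means[i] - col_means[j] + total_mean) ** 2
    return residue / (n_rows * n_cols)

# Hypothetical expression matrix: rows 0, 1, 3 and columns 0, 1, 3 hide an
# additive pattern, while the matrix as a whole does not follow one.
data = [
    [1, 2, 9, 3],
    [2, 3, 0, 4],
    [7, 1, 5, 8],
    [4, 5, 2, 6],
]
bicluster_score = mean_squared_residue(data, [0, 1, 3], [0, 1, 3])
full_score = mean_squared_residue(data, [0, 1, 2, 3], [0, 1, 2, 3])
```

A search procedure, evolutionary or otherwise, would look for row and column subsets that keep such a score low while keeping the submatrix large.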


Subjects
Gene Expression Profiling/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Algorithms , Cluster Analysis , Molecular Evolution , Gene Expression , Saccharomyces cerevisiae/genetics
8.
Nucleic Acids Res ; 36(Web Server issue): W411-5, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18508809

ABSTRACT

Protein-protein interaction (PPI) extraction has been an important research topic in the bio-text mining area, since the PPI information is critical for understanding biological processes. However, there are very few open systems available on the Web and most of the systems focus on keyword searching based on predefined PPIs. PIE (Protein Interaction information Extraction system) is a configurable Web service to extract PPIs from literature, including user-provided papers as well as PubMed articles. After providing abstracts or papers, the prediction results are displayed in an easily readable form with essential, yet compact features. The PIE interface supports more features such as PDF file extraction, PubMed search tool and network communication, which are useful for biologists and bio-system developers. The PIE system utilizes natural language processing techniques and machine learning methodologies to predict PPI sentences, which results in high precision performance for Web users. PIE is freely available at http://bi.snu.ac.kr/pie/.


Subjects
Protein Interaction Mapping , Software , Internet , PubMed , User-Computer Interface
9.
Front Psychol ; 11: 602623, 2020.
Article in English | MEDLINE | ID: mdl-33456445

ABSTRACT

We describe a corpus of speech taking place between 30 Korean mother-child pairs, divided into three groups: Prelexical (M = 0;08), Early-Lexical (M = 1;02), and Advanced-Lexical (M = 2;03). In addition to the child-directed speech (CDS), this corpus includes two different formalities of adult-directed speech (ADS), i.e., family-directed ADS (ADS_Fam) and experimenter-directed ADS (ADS_Exp). Our analysis of the mean length of utterance (MLU) in CDS, family-, and experimenter-directed ADS found significant differences between CDS and ADS_Fam, and between ADS_Fam and ADS_Exp, but not between CDS and ADS_Exp. Our finding suggests that researchers should pay more attention to controlling the level of formality in CDS and ADS when comparing the two registers for their speech characteristics. The corpus was transcribed in the CHAT format of the CHILDES system, so users can easily extract data related to verbal behavior in the mother-child interaction using the CLAN program of CHILDES.
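
The MLU comparison in this abstract rests on a simple quantity, which can be sketched as follows. This is a simplified illustration and not the CLAN procedure: it counts whitespace-separated words rather than morphemes, and the example utterances are invented.

```python
def mean_length_of_utterance(utterances):
    """Average number of whitespace-separated tokens per non-empty utterance.

    Assumption: words stand in for morphemes, which is a simplification.
    """
    lengths = [len(u.split()) for u in utterances if u.strip()]
    if not lengths:
        raise ValueError("no non-empty utterances to measure")
    return sum(lengths) / len(lengths)

# Hypothetical utterances, one list per register.
cds = ["look at the ball", "ball", "where is it"]
ads = ["did you remember to call the clinic this morning"]
mlu_cds = mean_length_of_utterance(cds)
mlu_ads = mean_length_of_utterance(ads)
```

Comparing such per-register values across speakers is the kind of analysis the abstract reports; the statistical test itself is outside this sketch.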

10.
Lab Chip ; 9(3): 479-82, 2009 Feb 07.
Article in English | MEDLINE | ID: mdl-19156301

ABSTRACT

We present a novel active mixing method in a microfluidic chip, where the controlled stirring of magnetic particles is used to achieve an effective mixing of fluids. To perform mixing, the ferromagnetic particles were embedded and manipulated under the influence of a rotating magnetic field. By aligning the magnetic beads along the magnetic field lines, rod-like structures are formed, functioning as small stir bars. Under higher flow conditions the particles did not form the typical rod structure but rather formed aggregates, which were even more beneficial for mixing. Our system reached a 96% mixing efficiency in a relatively short distance (800 µm) at a flow rate of 1.2-4.8 mm/s. These results demonstrate that our mixing method is useful for microfluidic devices with low aspect ratios and molecules with large molecular weights.


Subjects
Ferric Compounds/chemistry , Magnetics , Microfluidic Analytical Techniques/methods , Dextrans/chemistry , Fluorescein-5-isothiocyanate/analogs & derivatives , Fluorescein-5-isothiocyanate/chemistry , Fluorescence Microscopy , Microspheres , Physical Phenomena
11.
BMC Genomics ; 10 Suppl 3: S29, 2009 Dec 03.
Article in English | MEDLINE | ID: mdl-19958493

ABSTRACT

BACKGROUND: Gene regulation is a key mechanism in higher eukaryotic cellular processes. One of the major challenges in gene regulation studies is to identify regulators affecting the expression of their target genes in specific biological processes. Despite their importance, regulators involved in diverse biological processes still remain largely unrevealed. In the present study, we propose a kernel-based approach to efficiently identify core regulatory elements involved in specific biological processes using gene expression profiles. RESULTS: We developed a framework that can detect correlations between gene expression profiles and the upstream sequences on the basis of the kernel canonical correlation analysis (kernel CCA). Using a yeast cell cycle dataset, we demonstrated that upstream sequence patterns were closely related to gene expression profiles based on the canonical correlation scores obtained by measuring the correlation between them. Our results showed that the cell cycle-specific regulatory motifs could be found successfully based on the motif weights derived through kernel CCA. Furthermore, we identified co-regulatory motif pairs using the same framework. CONCLUSION: Given expression profiles, our method was able to identify regulatory motifs involved in specific biological processes. The method could be applied to the elucidation of the unknown regulatory mechanisms associated with complex gene regulatory processes.


Subjects
Cell Cycle , Transcriptional Regulatory Elements , Saccharomyces cerevisiae/chemistry , Saccharomyces cerevisiae/cytology , Gene Expression Profiling , Fungal Gene Expression Regulation , Saccharomyces cerevisiae/genetics , DNA Sequence Analysis
12.
Bioinformatics ; 23(9): 1141-7, 2007 May 01.
Article in English | MEDLINE | ID: mdl-17350973

ABSTRACT

MOTIVATION: MicroRNAs (miRNAs) and mRNAs constitute an important part of gene regulatory networks, influencing diverse biological phenomena. Elucidating closely related miRNAs and mRNAs can be an essential first step towards the discovery of their combinatorial effects on different cellular states. Here, we propose a probabilistic learning method to identify synergistic miRNAs involving regulation of their condition-specific target genes (mRNAs) from multiple information sources, i.e. computationally predicted target genes of miRNAs and their respective expression profiles. RESULTS: We used data sets consisting of miRNA-target gene binding information and expression profiles of miRNAs and mRNAs on human cancer samples. Our method allowed us to detect functionally correlated miRNA-mRNA modules involved in specific biological processes from multiple data sources by using a balanced fitness function and efficient searching over multiple populations. The proposed algorithm found two miRNA-mRNA modules, highly correlated with respect to their expression and biological function. Moreover, the mRNAs included in the same module showed much higher correlations when the related miRNAs were highly expressed, demonstrating our method's ability for finding coherent miRNA-mRNA modules. Most members of these modules have been reported to be closely related with cancer. Consequently, our method can provide a primary source of miRNA and target sets presumed to constitute closely related parts of gene regulatory pathways.


Subjects
Artificial Intelligence , Population Genetics , MicroRNAs/genetics , Genetic Models , Automated Pattern Recognition/methods , Messenger RNA/genetics , RNA Sequence Analysis/methods , Base Sequence , Binding Sites , Computer Simulation , Molecular Evolution , Molecular Sequence Data
13.
Int J Radiat Biol ; 84(9): 734-41, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18821387

ABSTRACT

PURPOSE: The biological effects of exposure to mobile phone emitted radiofrequency (RF) radiation are the subject of intense study, yet the hypothesis that RF exposure is a potential health hazard remains controversial. In this paper, we monitored cellular and molecular changes in Jurkat human T lymphoma cells after irradiating with 1763 MHz RF radiation to understand the effect of RF radiation on immune cells. MATERIALS AND METHODS: Jurkat T-cells were exposed to RF radiation to assess the effects on cell proliferation, cell cycle progression, DNA damage and gene expression. Jurkat cells were exposed to 1763 MHz RF radiation at 10 W/kg specific absorption rate (SAR) and compared to sham exposed cells. RESULTS: RF exposure did not produce significant changes in cell numbers, cell cycle distributions, or levels of DNA damage. In genome-wide analysis of gene expression, no genes changed more than two-fold upon RF radiation, while ten genes changed by 1.3- to 1.8-fold. Among these ten genes, two cytokine receptor genes, chemokine (C-X-C motif) receptor 3 (CXCR3) and interleukin 1 receptor, type II (IL1R2), were down-regulated upon RF radiation, but they were not directly related to cell proliferation or DNA damage responses. CONCLUSION: These results indicate that alterations in cell proliferation, cell cycle progression, DNA integrity or global gene expression were not detected in Jurkat T cells after exposure to 1763 MHz RF radiation at 10 W/kg SAR for 24 h.


Subjects
Radio Waves , T-Lymphocytes/radiation effects , Animals , Cattle , Cell Phone , Environmental Exposure , Gene Expression Profiling , Humans , Jurkat Cells , Oligonucleotide Array Sequence Analysis , T-Lymphocytes/cytology , T-Lymphocytes/metabolism
14.
Int J Radiat Biol ; 84(11): 909-15, 2008 Nov.
Article in English | MEDLINE | ID: mdl-19016139

ABSTRACT

PURPOSE: Radiofrequency (RF) exposure at the frequency of mobile phones has been reported not to induce cellular damage in in vitro and in vivo models. We chose HEI-OC1 immortalized mouse auditory hair cells to characterize the cellular response to 1763 MHz RF exposure, because auditory cells could be exposed to mobile phone frequencies. MATERIALS AND METHODS: Cells were exposed to 1763 MHz RF at a 20 W/kg specific absorption rate (SAR) in a code division multiple access (CDMA) exposure chamber for 24 and 48 h to check for changes in cell cycle, DNA damage, stress response, and gene expression. RESULTS: Neither cell cycle changes nor DNA damage was detected in RF-exposed cells. The expression of heat shock proteins (HSP) and the phosphorylation of mitogen-activated protein kinases (MAPK) did not change either. We tried to identify any alteration in gene expression using microarrays. Using the Applied Biosystems 1700 full genome expression mouse microarray, we found that only 29 genes (0.09% of total genes examined) were changed by more than 1.5-fold on RF exposure. CONCLUSION: From these results, we could not find any evidence of the induction of cellular responses, including cell cycle distribution, DNA damage, stress response and gene expression, after 1763 MHz RF exposure at an SAR of 20 W/kg in HEI-OC1 auditory hair cells.


Subjects
Cell Phone , Environmental Exposure , Auditory Hair Cells/radiation effects , Radio Waves/adverse effects , Animals , Biomarkers/metabolism , Cell Line , Cochlea/cytology , Gene Expression Regulation/radiation effects , Auditory Hair Cells/metabolism , Mice , Oligonucleotide Array Sequence Analysis
15.
Nucleic Acids Res ; 34(Web Server issue): W455-8, 2006 Jul 01.
Article in English | MEDLINE | ID: mdl-16845048

ABSTRACT

ProMiR is a web-based service for the prediction of potential microRNAs (miRNAs) in a query sequence of 60-150 nt, using a probabilistic colearning model. Identification of miRNAs requires a computational method to predict clustered and nonclustered, conserved and nonconserved miRNAs in various species. Here we present an improved version of ProMiR for identifying new clusters near known or unknown miRNAs. This new version, ProMiR II, integrates additional evidence, such as free energy data, G/C ratio, conservation score and entropy of candidate sequences, for more controllable prediction of miRNAs in mouse and human genomes. It also provides a wider range of services, e.g. the prediction of miRNA genes in long nonrelated sequences such as viral genomes. Importantly, we have validated this method using several case studies. All data used in ProMiR II are structured in the MySQL database for efficient analysis. The ProMiR II web server is available at http://cbit.snu.ac.kr/~ProMiR2/.


Subjects
MicroRNAs/genetics , Software , Animals , Base Sequence , Conserved Sequence , Viral Genome , Genomics/methods , Humans , Internet , Mice , Statistical Models , Multigene Family , Rats , User-Computer Interface
16.
Biosystems ; 91(1): 69-75, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17897776

ABSTRACT

Many DNA-based technologies, such as DNA computing, DNA nanoassembly and DNA biochips, rely on DNA hybridization reactions. Previous hybridization models have focused on macroscopic reactions between two DNA strands at the sequence level. Here, we propose a novel population-based Monte Carlo algorithm that simulates a microscopic model of reacting DNA molecules. The algorithm uses two essential thermodynamic quantities of DNA molecules: the binding energy of bound DNA strands and the entropy of unbound strands. Using this evolutionary Monte Carlo method, we obtain a minimum free energy configuration in the equilibrium state. We applied this method to a logical reasoning problem and compared the simulation results with the experimental results of the wet-lab DNA experiments performed subsequently. Our simulation predicted the experimental results quantitatively.


Subjects
Algorithms , Biological Evolution , DNA/genetics , Genetic Hybridization/genetics , Monte Carlo Method , Computer Simulation
17.
IEEE Trans Pattern Anal Mach Intell ; 40(1): 106-118, 2018 01.
Article in English | MEDLINE | ID: mdl-28186880

ABSTRACT

We consider the problem of learning a local metric in order to enhance the performance of nearest neighbor classification. Conventional metric learning methods attempt to separate data distributions in a purely discriminative manner; here we show how to take advantage of information from parametric generative models. We focus on the bias in the information-theoretic error arising from finite sampling effects, and find an appropriate local metric that maximally reduces the bias based upon knowledge from generative models. As a byproduct, the asymptotic theoretical analysis in this work relates metric learning to dimensionality reduction from a novel perspective, which was not understood from previous discriminative approaches. Empirical experiments show that this learned local metric enhances the discriminative nearest neighbor performance on various datasets using simple class conditional generative models such as a Gaussian.

18.
IEEE Trans Pattern Anal Mach Intell ; 40(1): 92-105, 2018 01.
Article in English | MEDLINE | ID: mdl-28186879

ABSTRACT

Classical discriminant analysis attempts to discover a low-dimensional subspace where class label information is maximally preserved under projection. Canonical methods for estimating the subspace optimize an information-theoretic criterion that measures the separation between the class-conditional distributions. Unfortunately, direct optimization of the information-theoretic criteria is generally non-convex and intractable in high-dimensional spaces. In this work, we propose a novel, tractable algorithm for discriminant analysis that considers the class-conditional densities as interacting fluids in the high-dimensional embedding space. We use the Bhattacharyya criterion as a potential function that generates forces between the interacting fluids, and derive a computationally tractable method for finding the low-dimensional subspace that optimally constrains the resulting fluid flow. We show that this model properly reduces to the optimal solution for homoscedastic data as well as for heteroscedastic Gaussian distributions with equal means. We also extend this model to discover optimal filters for discriminating Gaussian processes and provide experimental results and comparisons on a number of datasets.
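
For readers unfamiliar with the Bhattacharyya criterion named in this abstract, the sketch below evaluates the standard closed-form Bhattacharyya distance between two univariate Gaussians. It only illustrates the criterion itself; the fluid-flow algorithm the abstract proposes is not reproduced, and the restriction to one dimension is a simplification made for the example.

```python
import math

def bhattacharyya_distance(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two univariate Gaussians.

    The first term grows with the separation of the means; the second
    grows with the mismatch between the variances.
    """
    var_sum = var1 + var2
    mean_term = 0.25 * (mean1 - mean2) ** 2 / var_sum
    var_term = 0.5 * math.log(var_sum / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term

# Equal variances: only the mean term contributes.
d_means = bhattacharyya_distance(0.0, 1.0, 2.0, 1.0)
# Equal means, unequal variances: only the variance term contributes.
d_vars = bhattacharyya_distance(0.0, 1.0, 0.0, 4.0)
```

The second case corresponds to the heteroscedastic, equal-mean setting mentioned in the abstract: the classes remain distinguishable through their spread even though their means coincide.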

19.
Bioinformatics ; 22(16): 2005-11, 2006 Aug 15.
Article in English | MEDLINE | ID: mdl-16899491

ABSTRACT

MOTIVATION: An important issue in stem cell biology is to understand how to direct differentiation towards a specific cell type. To elucidate the mechanism, previous studies have focused on identifying the responsible gene regulators, which have, however, failed to provide a systemic view of regulatory modules. To obtain a unified description of the regulatory modules, we characterized major stem cell species by employing a co-clustering latent variable model (LVM). The LVM-based method allowed us to elucidate the cell type-specific transcription factors, using genomic sequences as well as expression profiles. RESULTS: We used a list of genes enriched in each of 21 stem cell subpopulations, and their upstream genomic sequences. The LVM-based study allowed us to uncover the regulatory modules for each stem cell cluster, e.g. GABP and E2F for the proliferation phase, and Ap2alpha and Ap2gamma for the quiescence phase. Furthermore, the identities of the stem cell clusters were well revealed by the constituent genes that were directly targeted by the modules. Consequently, our analytical framework was demonstrated to be useful through a detailed case study of stem cell differentiation and can be applied to problems with similar characteristics.


Subjects
Computational Biology/methods , Stem Cells/cytology , Algorithms , Animals , Cell Differentiation , Cell Lineage , Cell Proliferation , Cluster Analysis , Gene Expression Profiling , Genome , Humans , Multigene Family , Oligonucleotide Array Sequence Analysis , Automated Pattern Recognition
20.
Nucleic Acids Res ; 33(11): 3570-81, 2005.
Article in English | MEDLINE | ID: mdl-15987789

ABSTRACT

MicroRNAs (miRNAs) are small regulatory RNAs of approximately 22 nt. Although hundreds of miRNAs have been identified through experimental complementary DNA cloning methods and computational efforts, previous approaches could detect only abundantly expressed miRNAs or close homologs of previously identified miRNAs. Here, we introduce a probabilistic co-learning model for miRNA gene finding, ProMiR, which simultaneously considers the structure and sequence of miRNA precursors (pre-miRNAs). On 5-fold cross-validation with 136 referenced human datasets, the efficiency of the classification shows 73% sensitivity and 96% specificity. When applied to genome screening for novel miRNAs on human chromosomes 16, 17, 18 and 19, ProMiR effectively searches distantly homologous patterns over diverse pre-miRNAs, detecting at least 23 novel miRNA gene candidates. Importantly, the miRNA gene candidates do not demonstrate clear sequence similarity to the known miRNA genes. By quantitative PCR followed by RNA interference against Drosha, we experimentally confirmed that 9 of the 23 representative candidate genes express transcripts that are processed by the miRNA biogenesis enzyme Drosha in HeLa cells, indicating that ProMiR may successfully predict miRNA genes with at least 40% accuracy. Our study suggests that the miRNA gene family may be more abundant than previously anticipated, and confer highly extensive regulatory networks on eukaryotic cells.
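
The evaluation figures quoted in this abstract follow the usual definitions, sketched below. The confusion-matrix counts are hypothetical values chosen only to reproduce the reported percentages; the abstract does not give the underlying counts. The last line uses the two numbers the abstract does state (9 confirmed out of 23 tested candidates).

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real positives that the classifier recovers."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of real negatives that the classifier rejects."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts, picked only so the ratios match 73% and 96%.
example_sensitivity = sensitivity(true_pos=73, false_neg=27)
example_specificity = specificity(true_neg=96, false_pos=4)

# Counts stated in the abstract: 9 of 23 candidates confirmed experimentally.
confirmed_fraction = 9 / 23
```

The confirmed fraction works out to roughly 0.39, which is the basis for the accuracy estimate of about 40% given in the abstract.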


Subjects
MicroRNAs/genetics , Statistical Models , Algorithms , Human Chromosomes , Computational Biology , HeLa Cells , Humans , Markov Chains , MicroRNAs/chemistry , Nucleic Acid Conformation , RNA Precursors/genetics , Ribonuclease III/metabolism , DNA Sequence Analysis