Results 1 - 20 of 43
1.
J Autism Dev Disord ; 53(5): 2050-2061, 2023 May.
Article in English | MEDLINE | ID: mdl-35220523

ABSTRACT

Autism spectrum disorders (ASD) are strikingly more prevalent in males, but the molecular mechanisms responsible for ASD sex-differential risk are poorly understood. Abnormally short telomeres have been associated with autism. Examination of relative telomere lengths (RTL) among non-syndromic male (N = 14) and female (N = 10) children with autism revealed that only autistic male children had significantly shorter RTL than typically-developing controls (N = 24) and paired siblings (N = 10). While the average RTL of autistic girls did not differ significantly from that of controls, it was substantially longer than that of autistic boys. Our findings indicate a sexually-dimorphic pattern of RTL in childhood autism and could have important implications for RTL as a potential biomarker and for the role(s) of telomeres in the molecular mechanisms responsible for ASD sex-biased prevalence and etiology.


Subject(s)
Autism Spectrum Disorder , Autistic Disorder , Child , Humans , Male , Female , Autistic Disorder/genetics , Autism Spectrum Disorder/genetics , Sex Characteristics , Biomarkers , Telomere
2.
BMC Bioinformatics ; 21(1): 584, 2020 Dec 17.
Article in English | MEDLINE | ID: mdl-33334319

ABSTRACT

BACKGROUND: Predicting physical interaction between proteins is one of the greatest challenges in computational biology. There is a considerable variety of protein interactions, and a huge number of protein sequences and synthetic peptides have unknown interacting counterparts. Most co-evolutionary methods discover a combination of physical interplays and functional associations. However, only a handful of approaches specifically infer physical interactions. Hybrid co-evolutionary methods exploit inter-protein residue coevolution to identify specific physically interacting proteins. In this study, we introduce a hybrid co-evolutionary approach to predict physical interplays between pairs of protein families, starting from protein sequences only. RESULTS: In the present analysis, pairs of multiple sequence alignments are constructed for each dimer, and the covariation between residues in those pairs is calculated by CCMpred (Contacts from Correlated Mutations predicted) and three mutual information based approaches for ten accessible surface area threshold groups. Then, all residue couplings between the proteins of each dimer are unified into a single Frobenius norm value. Norms of the residue contact matrices of all dimers at different accessible surface area thresholds are fed into a support vector machine as single or multiple feature models. Training the classifiers with single features shows no apparent differences in accuracy among the methods across accessible surface area thresholds. Nevertheless, the mutual information product and context likelihood of relatedness procedures tend to have overall higher and lower performance, respectively, than the other two methods across accessible surface area cut-offs. The results also demonstrate that training the support vector machine with multiple norm features for several accessible surface area thresholds leads to a considerable improvement in prediction performance. In this context, CCMpred achieves a roughly better overall performance than the mutual information based approaches. The best accuracy, sensitivity, specificity, precision and negative predictive value for that method are 0.98, 1, 0.962, 0.96, and 0.962, respectively. CONCLUSIONS: In this paper, by feeding norm values of protein dimers into support vector machines at different accessible surface area thresholds, we demonstrate that even a small number of proteins in pairs of multiple alignments could allow one to accurately discriminate between positive and negative dimers.


Subject(s)
Proteins/chemistry , Support Vector Machine , Databases, Protein , Dimerization , Evolution, Molecular , Protein Interaction Maps , Proteins/metabolism
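
The step in which all residue couplings of a dimer are unified into a single Frobenius norm can be illustrated with a minimal sketch. The coupling values below are made up, and the function name `frobenius_norm` is an illustrative assumption; the abstract does not describe how the coupling matrix is prepared before the norm is taken.

```python
import math

def frobenius_norm(matrix):
    """Square root of the sum of squared entries of a 2-D list."""
    return math.sqrt(sum(value * value for row in matrix for value in row))

# Hypothetical inter-protein coupling scores: rows are residues of one
# protein, columns are residues of its partner (values are invented).
couplings = [
    [3.0, 0.0],
    [0.0, 4.0],
]

# One scalar feature per dimer, which could then be passed to a classifier.
feature = frobenius_norm(couplings)
print(feature)  # 5.0
```

Repeating this for each accessible surface area threshold would give one feature per threshold, which is presumably what the multiple-feature models combine.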
3.
Sci Rep ; 10(1): 8384, 2020 05 20.
Article in English | MEDLINE | ID: mdl-32433480

ABSTRACT

Since the world population is ageing, dementia is going to be a growing concern. Alzheimer's disease is the most common form of dementia. The pathogenesis of Alzheimer's disease has been extensively studied, yet much remains unknown. Therefore, we aimed to extract new knowledge from existing data. We analysed about 2700 upregulated genes and 2200 downregulated genes from three studies on the CA1 region of the hippocampus of brains with Alzheimer's disease. We found that only the calcium signalling pathway, enriched by 48 downregulated genes, was consistent between all three studies. We predicted miR-129 to target nine out of the 48 genes. Then, we validated that miR-129 regulates six out of the nine genes in HEK cells. We noticed that four out of the six genes play a role in synaptic plasticity. Finally, we confirmed the upregulation of miR-129 in the hippocampus of rats with scopolamine-induced amnesia as a model of Alzheimer's disease. We suggest that future research should investigate the possible role of miR-129 in synaptic plasticity and Alzheimer's disease. This paper presents a novel framework to gain insight into potential biomarkers and targets for diagnosis and treatment of diseases.


Subject(s)
Alzheimer Disease/metabolism , Alzheimer Disease/physiopathology , Brain/metabolism , Brain/physiopathology , Hippocampus/physiology , Neuronal Plasticity/physiology , Animals , Male , Microarray Analysis , Rats
4.
IEEE/ACM Trans Comput Biol Bioinform ; 17(5): 1555-1562, 2020.
Article in English | MEDLINE | ID: mdl-30990436

ABSTRACT

The joint graphical lasso (JGL) is a Gaussian graphical modeling approach for estimating multiple graphical models corresponding to distinct but related groups. Molecular apocrine (MA) breast cancer tumors have characteristics similar to those of the luminal and basal subtypes. Because of the relationship between MA tumors and these two subtypes, this paper investigates the similarities and differences between the MA gene association network and the networks corresponding to the other tumors by taking advantage of JGL properties. Two distinct JGL graphical models are applied to two sub-datasets containing gene expression information, one for the MA and luminal tumors and one for the MA and basal tumors. Then, topological comparisons between the networks, such as finding the shared edges, are performed. In addition, several support vector machine (SVM) classification models are built to assess the power of some critical nodes in the networks, such as hub nodes, to discriminate between the tumor samples. The JGL approach provides an appropriate tool to observe the networks of the MA tumor and other subtypes in one map. The results obtained by comparing the networks could help generate new insight into MA tumors for future studies.


Subject(s)
Biomarkers, Tumor/genetics , Breast Neoplasms , Transcriptome/genetics , Breast Neoplasms/classification , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Computational Biology , Databases, Genetic , Female , Humans , Support Vector Machine , Transcription Factors/genetics
5.
Sci Rep ; 8(1): 4009, 2018 03 05.
Article in English | MEDLINE | ID: mdl-29507384

ABSTRACT

Currently, only a few tools are capable of detecting genome-wide Copy Number Variations (CNVs) based on sequencing of multiple samples. Although aberrations in mate pair insertion sizes provide additional hints for CNV detection based on multiple samples, the majority of the current tools rely only on the depth of coverage. Here, we propose a new algorithm (MSeq-CNV) which allows detecting common CNVs across multiple samples. MSeq-CNV applies a mixture density for modeling aberrations in depth of coverage and abnormalities in the mate pair insertion sizes. Each component in this mixture density applies a Binomial distribution for modeling the number of mate pairs with aberration in the insertion size and a Poisson distribution for emitting the read counts at each genomic position. MSeq-CNV is applied to simulated data and also to real data from six HapMap individuals with high-coverage sequencing in the 1000 Genomes Project. These individuals include a CEU trio of European ancestry and a YRI trio of Nigerian ethnicity. The ancestry of these individuals is studied by clustering the identified CNVs. MSeq-CNV is also applied to detect CNVs in two samples with low-coverage sequencing in the 1000 Genomes Project and six samples from the Simons Genome Diversity Project.


Subject(s)
DNA Copy Number Variations , Sequence Analysis, DNA/standards , Algorithms , Gene Deletion , Genome, Human , HapMap Project , Heterozygote , Homozygote , Humans , Poisson Distribution , Sequence Analysis, DNA/methods
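
The mixture density described in this abstract, with a Binomial term for aberrant mate pairs and a Poisson term for read counts in each component, can be sketched as follows. This is not the MSeq-CNV implementation: the component labels, the parameter values, and the function names are illustrative assumptions, and only a single sample at a single position is scored.

```python
import math

def log_binomial_pmf(k, n, p):
    """Log-probability of k successes in n trials with success rate p."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1.0 - p)

def log_poisson_pmf(k, lam):
    """Log-probability of observing count k under a Poisson mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def component_posteriors(read_count, n_pairs, n_aberrant, components):
    """Posterior weight of each mixture component at one genomic position.

    components maps a label to (mixing weight, Poisson mean of the read
    count, Binomial probability that a mate pair has an aberrant insert size).
    """
    log_terms = {
        label: math.log(weight)
        + log_poisson_pmf(read_count, depth_mean)
        + log_binomial_pmf(n_aberrant, n_pairs, aberrant_prob)
        for label, (weight, depth_mean, aberrant_prob) in components.items()
    }
    top = max(log_terms.values())  # subtract the max for numerical stability
    unnormalised = {label: math.exp(v - top) for label, v in log_terms.items()}
    total = sum(unnormalised.values())
    return {label: v / total for label, v in unnormalised.items()}

# Hypothetical parameters, chosen only for illustration.
components = {
    "normal": (0.9, 30.0, 0.02),
    "deletion": (0.1, 15.0, 0.50),
}
posteriors = component_posteriors(read_count=14, n_pairs=20, n_aberrant=9,
                                  components=components)
print(max(posteriors, key=posteriors.get))  # deletion
```

Working in log space avoids underflow when the Binomial and Poisson terms are both very small, which is common for positions that fit one component far better than the others.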
6.
BMC Genomics ; 18(1): 964, 2017 Dec 12.
Article in English | MEDLINE | ID: mdl-29233090

ABSTRACT

BACKGROUND: DNA methylation at promoters is largely correlated with inhibition of gene expression. However, the role of DNA methylation at enhancers is not fully understood, although a crosstalk with chromatin marks is expected. Indeed, there are contradictory reports about positive and negative correlations between DNA methylation and H3K4me1, a chromatin hallmark of enhancers. RESULTS: We investigated the relationship between DNA methylation and active chromatin marks through genome-wide correlations, and found anti-correlation between H3K4me1 and H3K4me3 enrichment at low and intermediate DNA methylation loci. We hypothesized "seesaw" dynamics between H3K4me1 and H3K4me3 in the low and intermediate DNA methylation range, in which DNA methylation discriminates between enhancers and promoters, marked by H3K4me1 and H3K4me3, respectively. Low methylated regions are H3K4me3 enriched, while those with intermediate DNA methylation levels are progressively H3K4me1 enriched. Additionally, the enrichment of H3K27ac, which distinguishes active from primed enhancers, follows a plateau in the lower range of the intermediate DNA methylation level, corresponding to active enhancers, and decreases linearly in the higher range of the intermediate DNA methylation level. Thus, a decrease in DNA methylation smoothly switches the state of enhancers from primed to active. We summarize these observations into a rule of thumb of one-out-of-three methylation marks: "In each genomic region only one out of these three methylation marks {DNA methylation, H3K4me1, H3K4me3} is high. If it is the DNA methylation, the region is inactive. If it is H3K4me1, the region is an enhancer, and if it is H3K4me3, the region is a promoter". To test our model, we used available genome-wide datasets of H3K4 methyltransferase knockouts. Our analysis suggests that CXXC proteins, as readers of non-methylated CpGs, would regulate the "seesaw" mechanism that focuses H3K4me3 to unmethylated sites, while being repulsed from H3K4me1 decorated enhancers and CpG island shores. CONCLUSIONS: Our results show that DNA methylation discriminates promoters from enhancers through an H3K4me1-H3K4me3 seesaw mechanism, and suggest its possible function in the inheritance of chromatin marks after cell division. Our analyses suggest aberrant formation of promoter-like regions and ectopic transcription of hypomethylated regions of DNA. Such a mechanism can have important implications for biological processes in which abnormal DNA methylation status has been reported, such as cancer and aging.


Subject(s)
DNA Methylation , Enhancer Elements, Genetic , Histone Code , Promoter Regions, Genetic , Animals , Cytosine/metabolism , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , Gene Expression , Histones/metabolism , Mice , Protein Domains
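
The one-out-of-three rule of thumb quoted in this abstract can be written down as a small lookup. The sketch below assumes that the three signals have already been normalised to a comparable scale, and it simply picks the largest one; how "high" is defined is not stated in the abstract, so that choice and the function name `classify_region` are assumptions.

```python
def classify_region(dna_methylation, h3k4me1, h3k4me3):
    """Label a genomic region by whichever of the three marks is highest.

    Assumes all three inputs are normalised to the same scale (for
    example, 0 to 1), which is an illustrative simplification.
    """
    marks = {
        "inactive": dna_methylation,  # high DNA methylation
        "enhancer": h3k4me1,          # high H3K4me1
        "promoter": h3k4me3,          # high H3K4me3
    }
    return max(marks, key=marks.get)

# Invented example values for three regions.
print(classify_region(0.9, 0.1, 0.1))  # inactive
print(classify_region(0.4, 0.8, 0.2))  # enhancer
print(classify_region(0.1, 0.2, 0.9))  # promoter
```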
7.
PLoS One ; 12(9): e0184795, 2017.
Article in English | MEDLINE | ID: mdl-28938012

ABSTRACT

The common topological features of the gene regulatory networks of related species suggest that the network of one species can be reconstructed using additional information from the gene expression profiles of related species. We present an algorithm, named F-MAP, that reconstructs a gene regulatory network by applying knowledge about gene interactions from related species. Our algorithm sets up a Bayesian framework to estimate the precision matrix of one species' microarray gene expression dataset in order to infer the Gaussian graphical model of the network. The conjugate Wishart prior is used, and the information from related species is applied to estimate the hyperparameters of the prior distribution using factor analysis. Applying the proposed algorithm to six related species of Drosophila shows that the precision of the reconstructed networks is improved considerably compared to the precision of networks constructed by other Bayesian approaches.


Subject(s)
Algorithms , Gene Expression Profiling/methods , Gene Regulatory Networks , Animals , Bayes Theorem , Datasets as Topic , Drosophila , Drosophila Proteins/metabolism , Factor Analysis, Statistical , Gene Expression , Models, Molecular , Phylogeny , Species Specificity
8.
Cell J ; 19(3): 343-351, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28836397

ABSTRACT

OBJECTIVE: Cellular decision-making is a key process in which cells with similar genetic and environmental backgrounds make dissimilar decisions. This stochastic process, which happens in prokaryotic and eukaryotic cells including stem cells, causes cellular diversity and phenotypic variation. In addition, fitness predicts and describes changes in the genetic composition of populations throughout evolutionary history. Fitness may thus be defined as the ability to adapt and produce surviving offspring. Here, we present a mathematical model to predict the fitness of a cell and to address the fundamental issue of phenotypic variation. We study a basic decision-making scenario where a bacteriophage lambda reproduces in E. coli, using both the lytic and the lysogenic pathways. In the lytic pathway, the bacteriophage replicates itself within the host bacterium. This fast replication overcrowds and in turn destroys the host bacterium. In the lysogenic pathway, however, the bacteriophage inserts its DNA into the host genome, and is replicated simultaneously with the host genome. MATERIALS AND METHODS: In this prospective study, a mathematical predictive model was developed to estimate fitness as an index of surviving offspring. We then used experimental data to validate the predictive power of the proposed model. A mathematical model based on game theory was also generated to elucidate a rationale behind the cell's decision. RESULTS: Our findings indicate that a rational decision aimed at maximizing the life expectancy of offspring is almost identical to the bacteriophage behavior reported from experimental data. The results also showed that a stochastic decision on cell fate maximizes the expected number of surviving offspring. CONCLUSION: We present a mathematical framework for analyzing a basic phenotypic variation problem and explain how bacteriophages maximize offspring longevity based on this model. We also introduce a mathematical benchmark for other investigations of phenotypic variation in eukaryotes, including stem cell differentiation.

9.
Neuroimage ; 159: 289-301, 2017 10 01.
Article in English | MEDLINE | ID: mdl-28782679

ABSTRACT

In free visual exploration, each eye movement is immediately followed by dynamic reconfiguration of brain functional connectivity. We studied the task-dependency of this process in a combined visual search-change detection experiment. Participants viewed two (nearly) identical displays in succession. The first time, they had to find and remember multiple targets among distractors, so the ongoing task involved memory encoding. The second time, they had to determine whether a target had changed in orientation, so the ongoing task involved memory retrieval. From multichannel EEG recorded during 200 ms intervals time-locked to fixation onsets, we estimated the functional connectivity using a weighted phase lag index at the frequencies of the theta, alpha, and beta bands, and derived global and local measures of the functional connectivity graphs. We found differences between the two memory task conditions for several network measures, such as mean path length, radius, diameter, closeness and eccentricity, mainly in the alpha band. Both the local and the global measures indicated that encoding involved a more segregated mode of operation than retrieval. These differences arose immediately after fixation onset and persisted for the entire duration of the lambda complex, an evoked potential commonly associated with early visual perception. We concluded that encoding and retrieval differentially shape network configurations involved in early visual perception, affecting the way the visual input is processed at each fixation. These findings demonstrate that task requirements dynamically control the functional connectivity networks involved in early visual perception.


Subject(s)
Memory/physiology , Neural Pathways/physiology , Visual Perception/physiology , Adolescent , Adult , Behavior , Electroencephalography , Eye Movements/physiology , Female , Humans , Male , Nerve Net/physiology , Photic Stimulation , Young Adult
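
The weighted phase lag index used here to estimate functional connectivity has a compact standard definition: the absolute mean of the imaginary part of the cross-spectrum divided by the mean of its absolute value. The sketch below applies that definition to a list of per-epoch cross-spectrum values for one channel pair at one frequency. The input values are invented, and how the study computed its cross-spectra from the EEG is not covered.

```python
def weighted_phase_lag_index(cross_spectra):
    """wPLI from complex cross-spectrum values, one per epoch.

    Computes |mean(Im S)| / mean(|Im S|); the epoch count cancels, so
    sums are used. Returns 0.0 when every imaginary part is zero.
    """
    imaginary_parts = [value.imag for value in cross_spectra]
    denominator = sum(abs(part) for part in imaginary_parts)
    if denominator == 0:
        return 0.0
    return abs(sum(imaginary_parts)) / denominator

# Invented examples: a consistent phase lead gives 1, a lead that flips
# sign from epoch to epoch with equal magnitude gives 0.
consistent = [1 + 1j, 2 + 0.5j, 0.3 + 2j]
inconsistent = [1 + 1j, 1 - 1j]
print(weighted_phase_lag_index(consistent))    # 1.0
print(weighted_phase_lag_index(inconsistent))  # 0.0
```

Because only the imaginary part enters the ratio, zero-lag coupling, which is typical of volume conduction, does not contribute to the index.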
10.
BMC Bioinformatics ; 18(1): 30, 2016 Nov 03.
Article in English | MEDLINE | ID: mdl-27809781

ABSTRACT

BACKGROUND: Copy Number Variation (CNV) is considered to be a major source of large structural variations in the human genome. In recent years, many studies have applied Next Generation Sequencing (NGS) data to CNV detection. However, there is still a need for more accurate computational tools. RESULTS: In this study, mate pair NGS data are used for CNV detection in a Hidden Markov Model (HMM). The proposed HMM has position-specific emission probabilities, i.e. a Gaussian mixture distribution. Each component in the Gaussian mixture distribution captures a different type of aberration that is observed in the mate pairs after they are mapped to the reference genome. These aberrations may include any increase (decrease) in the insertion size or a change in the direction of mate pairs that are mapped to the reference genome. This HMM with Position-Specific Emission probabilities (PSE-HMM) is used for the genome-wide detection of deletions and tandem duplications. The performance of PSE-HMM is evaluated on a simulated dataset and also on real data from a Yoruban HapMap individual, NA18507. CONCLUSIONS: PSE-HMM is effective in taking observation dependencies into account and reaches a high accuracy in detecting genome-wide CNVs. MATLAB programs are available at http://bs.ipm.ir/softwares/PSE-HMM/ .


Subject(s)
Algorithms , DNA Copy Number Variations , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Data Accuracy , Genomics/methods , Humans , Probability
11.
Math Biosci ; 279: 53-62, 2016 09.
Article in English | MEDLINE | ID: mdl-27424951

ABSTRACT

MOTIVATION: The association of Copy Number Variation (CNV) with schizophrenia, autism, developmental disabilities and fatal diseases such as cancer has been verified. Recent developments in Next Generation Sequencing (NGS) have facilitated CNV studies. However, many of the current CNV detection tools are not capable of discriminating tandem duplications from non-tandem duplications. RESULTS: In this study, we propose MGP-HMM, a tool which, besides detecting genome-wide deletions, discriminates tandem duplications from non-tandem duplications. MGP-HMM takes mate pair abnormalities into account and predicts the digitized number of tandem or non-tandem copies. Abnormalities in the mate pair directions and insertion sizes, after being mapped to the reference genome, are elucidated using a Hidden Markov Model (HMM). For this purpose, a Mixture Gaussian density with time-dependent parameters is applied for emitting mate pair insertion sizes from HMM states. Indeed, depending on the observed abnormalities in mate pair insertion size or orientation, each component in the mixture density will have different parameters. MGP-HMM also applies a Poisson distribution for modeling read depth data. This parametric modeling of the mate pair reads enables us to estimate the length of CNVs precisely, which is an advantage over methods that rely only on the read depth approach for CNV detection. The hidden state of the proposed HMM is the digitized copy number of a genomic segment, and states correspond to the multipliers of the mixture Gaussian components. The accuracy of our model is validated on a set of real and simulated next generation sequencing data and is compared to other tools.


Subject(s)
DNA Copy Number Variations , Models, Statistical , Sequence Analysis , Humans
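
The idea of treating the digitized copy number as the hidden state of an HMM with Poisson-distributed read depth can be illustrated with a small Viterbi decoder. This sketch covers only the read-depth part: the mixture Gaussian emissions for mate pair insertion sizes are left out, and the transition scheme, the parameter values, and the function names are assumptions rather than the MGP-HMM design.

```python
import math

def log_poisson_pmf(k, lam):
    """Log-probability of observing count k under a Poisson mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def viterbi_copy_number(read_depths, copy_states, depth_per_copy, stay_prob):
    """Most likely copy-number path for a list of per-window read depths."""
    n_states = len(copy_states)
    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (n_states - 1))
    # A small floor keeps the Poisson mean positive for copy number 0.
    means = [max(copies * depth_per_copy, 0.1) for copies in copy_states]

    scores = [-math.log(n_states) + log_poisson_pmf(read_depths[0], mean)
              for mean in means]
    backpointers = []
    for depth in read_depths[1:]:
        new_scores, pointers = [], []
        for j in range(n_states):
            candidates = [scores[i] + (log_stay if i == j else log_switch)
                          for i in range(n_states)]
            best_i = max(range(n_states), key=candidates.__getitem__)
            new_scores.append(candidates[best_i] + log_poisson_pmf(depth, means[j]))
            pointers.append(best_i)
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=scores.__getitem__)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [copy_states[s] for s in path]

# Invented read depths with a dip in the middle windows.
depths = [30, 29, 31, 14, 16, 15, 30, 32]
path = viterbi_copy_number(depths, copy_states=[0, 1, 2, 3],
                           depth_per_copy=15.0, stay_prob=0.9)
print(path)
```

With these invented numbers the dip in the middle is long and deep enough that the decoder should prefer paying two state switches over explaining the low depths with two copies.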
12.
J Air Waste Manag Assoc ; 66(9): 912-21, 2016 09.
Article in English | MEDLINE | ID: mdl-27192035

ABSTRACT

The present study aimed to optimize the electrospinning parameters for polyacrylonitrile (PAN) nanofibers containing MgO nanoparticles to obtain a fiber diameter and mat porosity appropriate for air filtration. Optimization of applied voltage, solution concentration, and spinning distance was performed using response surface methodology. In total, 15 trials were done according to the prepared study design. Fiber diameter and porosity were measured using scanning electron microscopic (SEM) image analysis. For air filtration testing, the nanofiber mat was produced based on the suggested optimum conditions for electrospinning. According to the results, a lower solution concentration favored thinner fibers. A larger diameter gave a higher porosity. At a given spinning distance, there was a negative correlation between fiber diameter and applied voltage. Moreover, there were curvilinear relationships between porosity and both spinning distance and applied voltage at any concentration. It was also concluded that the developed filter medium could be comparable to the high-efficiency particulate air (HEPA) filter in terms of collection efficiency and pressure drop. The empirical models presented in this study can guide subsequent experiments to form uniform and continuous nanofibers for future application in air purification. IMPLICATIONS: High-efficiency filtration is becoming more important due to declining air quality. Effective filter media are increasingly needed in industries applying clean-air technologies, and the need to develop high-performance air filters is increasingly felt. Nanofibrous filter media, which are mostly fabricated via the electrospinning technique, have attracted considerable attention in the last decade. The present study aimed to develop, through experimental investigations, an electrospun PAN filter medium containing MgO nanoparticles (exploiting their special functionalities such as absorption and adsorption characteristics, antibacterial functionality, and their role as a pore-forming agent) for application in high-performance air filters.


Subject(s)
Acrylic Resins/chemistry , Air Filters , Magnesium Oxide/chemistry , Nanofibers/chemistry , Microscopy, Electrochemical, Scanning , Porosity
13.
BMC Syst Biol ; 9: 23, 2015 Jun 02.
Article in English | MEDLINE | ID: mdl-26033487

ABSTRACT

BACKGROUND: Understanding the mechanisms by which hundreds of diverse cell types develop from a single mammalian zygote has been a central challenge of developmental biology. Conrad H. Waddington, in his metaphoric "epigenetic landscape" visualized the early embryogenesis as a hierarchy of lineage bifurcations. In each bifurcation, a single progenitor cell type produces two different cell lineages. The tristable dynamical systems are used to model the lineage bifurcations. It is also shown that a genetic circuit consisting of two auto-activating transcription factors (TFs) with cross inhibitions can form a tristable dynamical system. RESULTS: We used gene expression profiles of pre-implantation mouse embryos at the single cell resolution to visualize the Waddington landscape of the early embryogenesis. For each lineage bifurcation we identified two clusters of TFs - rather than two single TFs as previously proposed - that had opposite expression patterns between the pair of bifurcated cell types. The regulatory circuitry among each pair of TF clusters resembled a genetic circuit of a pair of single TFs; it consisted of positive feedbacks among the TFs of the same cluster, and negative interactions among the members of the opposite clusters. Our analyses indicated that the tristable dynamical system of the two-cluster regulatory circuitry is more robust than the genetic circuit of two single TFs. CONCLUSIONS: We propose that a modular hierarchy of regulatory circuits, each consisting of two mutually inhibiting and auto-activating TF clusters, can form hierarchical lineage bifurcations with improved safeguarding of critical early embryogenesis against biological perturbations. Furthermore, our computationally fast framework for modeling and visualizing the epigenetic landscape can be used to obtain insights from experimental data of development at the single cell resolution.


Subject(s)
Embryonic Development , Models, Biological , Transcription Factors/metabolism , Animals , Blastocyst/cytology , Blastocyst/metabolism , Gene Expression Profiling , Mice , Single-Cell Analysis
14.
Biomed Res Int ; 2015: 165186, 2015.
Article in English | MEDLINE | ID: mdl-25692131

ABSTRACT

The evaluation of biological networks is considered essential to understanding complex biological systems. Graph clustering algorithms are widely used in protein-protein interaction (PPI) network analysis. The complexes produced by the clustering algorithms include noise proteins. The error rate due to noise proteins in PPI network research is about 40-90%. However, only 30-40% of the existing interactions in the PPI databases depend on a specific biological function. It is essential to eliminate the noise proteins and interactions from the complexes created via clustering methods. We have introduced new methods for weighting interactions in protein clusters and for removing noise interactions and proteins based on their weights. The coexpression and the sequence similarity of each pair of proteins are used as the edge weights in the network. The results showed that edge filtering based on the amount of coexpression acts similarly to node filtering via graph-based characteristics. Regarding the removal of noise edges, edge filtering has a significant advantage over the graph-based method. Edge filtering based on the amount of sequence similarity is able to remove both the noise proteins and the noise interactions.


Subject(s)
Databases, Protein , Gene Expression Regulation/physiology , Gene Regulatory Networks/physiology , Models, Genetic , Proteins/genetics , Proteins/metabolism , Humans , Sequence Homology, Amino Acid
15.
PLoS One ; 9(8): e103569, 2014.
Article in English | MEDLINE | ID: mdl-25090629

ABSTRACT

Decision making at a cellular level determines different fates for isogenic cells. However, it is not yet clear how rational decisions are encoded in the genome, how they are transmitted to their offspring, and whether they evolve and become optimized throughout generations. In this paper, we use a game theoretic approach to explain how rational decisions are made in the presence of cooperators and competitors. Our results suggest the existence of an internal switch that operates as a biased coin. The biased coin is, in fact, a biochemical bistable network of interacting genes that can flip to one of its stable states in response to different environmental stimuli. We present a framework to describe how the positions of attractors in such a gene regulatory network correspond to the behavior of a rational player in a competing environment. We evaluate our model by considering lysis/lysogeny decision making of bacteriophage lambda in E. coli.


Subject(s)
Bacteriophage lambda/genetics , Escherichia coli/cytology , Escherichia coli/virology , Genome, Viral , Models, Biological , Computer Simulation , Gene Regulatory Networks , Intracellular Space/metabolism , Lysogeny/genetics , Probability
16.
J Biopharm Stat ; 24(4): 715-31, 2014.
Article in English | MEDLINE | ID: mdl-24697665

ABSTRACT

In this article, we discuss an optimization approach to the sample size question, founded on maximizing the value of information in comparison studies with binary responses. The expected value of perfect information (EVPI) is calculated and the optimal sample size is obtained by maximizing the expected net gain of sampling (ENGS), the difference between the expected value of sample information (EVSI) and the cost of conducting the trial. The data are assumed to come from two independent binomial distributions, while the parameter of interest is the difference between the two success probabilities, [Formula: see text]. To formulate our prior knowledge on the parameters, a Dirichlet prior is used. Monte Carlo integration is used in the computation and optimization of ENGS. We also compare the results of this approach with existing Bayesian methods and show how the new approach reduces the computational complexity considerably.


Subject(s)
Bayes Theorem , Monte Carlo Method , Sample Size , Humans
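
The expected value of perfect information for a difference between two success probabilities can be approximated by Monte Carlo integration, as sketched below. Several simplifications are assumptions made for illustration and are not taken from the article: independent Beta priors stand in for the Dirichlet prior, the payoff of adopting the first treatment is taken to be the difference itself (and zero otherwise), and the function name `evpi_monte_carlo` is hypothetical.

```python
import random

def evpi_monte_carlo(prior_first, prior_second, n_draws=100_000, seed=0):
    """Monte Carlo estimate of EVPI for theta = p1 - p2.

    prior_first and prior_second are (alpha, beta) pairs of Beta priors.
    Assumed payoff: theta if the first treatment is adopted, 0 otherwise,
    so EVPI = E[max(theta, 0)] - max(E[theta], 0).
    """
    rng = random.Random(seed)
    sum_theta = 0.0
    sum_positive_part = 0.0
    for _ in range(n_draws):
        theta = rng.betavariate(*prior_first) - rng.betavariate(*prior_second)
        sum_theta += theta
        sum_positive_part += max(theta, 0.0)
    value_with_perfect_info = sum_positive_part / n_draws
    value_with_prior_only = max(sum_theta / n_draws, 0.0)
    return value_with_perfect_info - value_with_prior_only

# With two uniform priors the exact value under this payoff is 1/6.
estimate = evpi_monte_carlo((1.0, 1.0), (1.0, 1.0))
print(round(estimate, 2))
```

The expected net gain of sampling would replace the perfect-information term with the expected value after observing a sample of a given size and subtract the trial cost; that outer optimisation over sample size is not shown here.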
17.
Proteins ; 82(9): 1937-46, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24596179

ABSTRACT

Decomposition of structural domains is an essential task in classifying protein structures, predicting protein function, and many other proteomics problems. As the number of known protein structures in PDB grows exponentially, the need for accurate automatic domain decomposition methods becomes more essential. In this article, we introduce a bottom-up algorithm for assigning protein domains using a graph theoretical approach. This algorithm is based on a center-based clustering approach. For constructing initial clusters, members of an independent dominating set for the graph representation of a protein are considered as the centers. A distance matrix is then defined for these clusters. To obtain final domains, these clusters are merged using the compactness principle of domains and a method similar to the neighbor-joining algorithm considering some thresholds. The thresholds are computed using a training set consisting of 50 protein chains. The algorithm is implemented using C++ language and is named ProDomAs. To assess the performance of ProDomAs, its results are compared with seven automatic methods, against five publicly available benchmarks. The results show that ProDomAs outperforms other methods applied on the mentioned benchmarks. The performance of ProDomAs is also evaluated against 6342 chains obtained from ASTRAL SCOP 1.71. ProDomAs is freely available at http://www.bioinf.cs.ipm.ir/software/prodomas.


Subject(s)
Protein Structure, Tertiary , Proteins/chemistry , Proteins/classification , Algorithms , Amino Acid Sequence , Cluster Analysis , Computational Biology , Proteomics , Sequence Analysis, Protein
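
Choosing cluster centers from an independent dominating set can be illustrated with a simple greedy procedure: any maximal independent set of a graph is also a dominating set. The sketch below is a generic greedy construction on a toy graph, not the ProDomAs algorithm; the degree-based ordering and the function name are assumptions.

```python
def independent_dominating_set(adjacency):
    """Greedy independent dominating set of an undirected graph.

    adjacency maps each node to the set of its neighbours (symmetric).
    Nodes are visited from highest to lowest degree; a node is chosen
    only if neither it nor any neighbour has been chosen already.
    """
    chosen = []
    covered = set()
    by_degree = sorted(adjacency, key=lambda node: len(adjacency[node]), reverse=True)
    for node in by_degree:
        if node not in covered:
            chosen.append(node)
            covered.add(node)
            covered.update(adjacency[node])
    return chosen

# Toy residue-contact graph: a simple path a - b - c - d - e.
graph = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
centers = independent_dominating_set(graph)
print(centers)  # ['b', 'd']
```

Each remaining node is adjacent to at least one center, so every node can be assigned to an initial cluster before any merging step.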
18.
PLoS One ; 8(12): e80565, 2013.
Article in English | MEDLINE | ID: mdl-24376498

ABSTRACT

The profile hidden Markov model (PHMM) is widely used to assign protein sequences to their respective families. A major limitation of a PHMM is the assumption that, given the states, the observations (amino acids) are independent. To overcome this limitation, the dependency between amino acids in the multiple sequence alignment (MSA) that the PHMM represents can be appended to the PHMM. Because the amino acid sequences in an MSA are biologically related, the one-by-one dependency between two amino acids can be considered. In other words, based on the MSA, the dependency between an amino acid and the corresponding amino acid located above it can be combined with the PHMM. For this purpose, a new emission probability matrix which considers the one-by-one dependencies between amino acids is constructed. The parameters of a PHMM are of two types, transition and emission probabilities, which are usually estimated using an EM algorithm called the Baum-Welch algorithm. We have generalized the Baum-Welch algorithm using a similarity emission matrix constructed by integrating the new emission probability matrix with the common emission probability matrix. Then, the performance of the similarity emission is discussed by applying it to the top twenty protein families in the Pfam database. We show that using the similarity emission in the Baum-Welch algorithm significantly outperforms the common Baum-Welch algorithm in the task of assigning protein sequences to protein families.


Subject(s)
Algorithms , Sequence Homology, Amino Acid , Databases, Protein , Sequence Alignment
19.
Genomics ; 102(5-6): 507-14, 2013.
Article in English | MEDLINE | ID: mdl-24161398

ABSTRACT

Recent advances in the sequencing technologies have provided a handful of RNA-seq datasets for transcriptome analysis. However, reconstruction of full-length isoforms and estimation of the expression level of transcripts with a low cost are challenging tasks. We propose a novel de novo method named SSP that incorporates interval integer linear programming to resolve alternatively spliced isoforms and reconstruct the whole transcriptome from short reads. Experimental results show that SSP is fast and precise in determining different alternatively spliced isoforms along with the estimation of reconstructed transcript abundances. The SSP software package is available at http://www.bioinf.cs.ipm.ir/software/ssp.


Subject(s)
Programming, Linear , RNA Isoforms/analysis , Sequence Analysis, RNA/methods , Alternative Splicing , Gene Expression Profiling/methods , Programming, Linear/economics , Sequence Analysis, RNA/economics , Software , Transcriptome
20.
Int J Data Min Bioinform ; 8(1): 66-82, 2013.
Article in English | MEDLINE | ID: mdl-23865165

ABSTRACT

A Profile Hidden Markov Model (PHMM) is a standard form of Hidden Markov Model used for modeling protein and DNA sequence families based on multiple alignment. In this paper, we implement the Baum-Welch algorithm and the Bayesian Monte Carlo Markov Chain (BMCMC) method for estimating the parameters of a small artificial PHMM. In order to improve the accuracy of the parameter estimates, we classify the training data using the weighted values of sequences in the PHMM and then apply an algorithm for estimating the parameters of the PHMM. The results show that the BMCMC method performs better than Maximum Likelihood estimation.


Subject(s)
Algorithms , Cluster Analysis , Markov Chains , Sequence Analysis, DNA , Sequence Analysis, Protein , Base Sequence , Bayes Theorem , Likelihood Functions , Sequence Alignment
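
Both Baum-Welch and sampling-based estimation rely on evaluating the likelihood of a sequence under the current HMM parameters, which the forward algorithm does. The sketch below runs it on a plain two-state HMM with invented parameters; a profile HMM adds match, insert, and delete states with silent transitions, which this example does not model.

```python
def forward_likelihood(observations, initial, transition, emission):
    """Probability of an observation sequence under a discrete HMM.

    initial[s] is the start probability of state s, transition[i][j] the
    probability of moving from i to j, and emission[s][x] the probability
    that state s emits symbol x.
    """
    states = list(initial)
    alpha = {s: initial[s] * emission[s][observations[0]] for s in states}
    for symbol in observations[1:]:
        alpha = {
            j: emission[j][symbol] * sum(alpha[i] * transition[i][j] for i in states)
            for j in states
        }
    return sum(alpha.values())

# Invented two-state model over a two-letter alphabet.
initial = {"M": 0.6, "I": 0.4}
transition = {"M": {"M": 0.7, "I": 0.3}, "I": {"M": 0.4, "I": 0.6}}
emission = {"M": {"A": 0.9, "C": 0.1}, "I": {"A": 0.2, "C": 0.8}}

print(forward_likelihood("A", initial, transition, emission))
print(forward_likelihood("ACA", initial, transition, emission))
```

For long sequences the forward variables underflow, so practical implementations rescale them at each step or work with log probabilities.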