1.
BMC Genomics ; 25(1): 440, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38702606

ABSTRACT

BACKGROUND: Alzheimer's disease (AD) is a heritable neurodegenerative disease whose long asymptomatic phase makes early diagnosis pivotal. Blood U-p53 has recently emerged as a superior predictive biomarker for AD in the early stages. We hypothesized that genetic variants associated with blood U-p53 could reveal novel loci and pathways involved in the early stages of AD. RESULTS: We performed a genome-wide association study (GWAS) of blood U-p53 on 484 healthy and mildly cognitively impaired subjects from the ADNI cohort using 612,843 single nucleotide polymorphisms (SNPs). We performed a pathway analysis and prioritized candidate genes using an AD single-cell gene program. We fine-mapped the intergenic SNPs by leveraging a cell-type-specific enhancer-to-gene linking strategy using a brain single-cell multimodal dataset. We validated the candidate genes in an independent brain single-cell RNA-seq dataset and in the ADNI blood transcriptome dataset. The SNP rs279686, located between the AASS and FEZF1 genes, was the most significant (p-value = 4.82 × 10^-7). Suggestive pathways were related to the immune and nervous systems. Twenty-three candidate genes were prioritized at 27 suggestive loci. Fine-mapping of five intergenic loci yielded nine cell-specific candidate genes. Finally, 15 genes were validated in the independent single-cell RNA-seq dataset, and five were validated in the ADNI blood transcriptome dataset. CONCLUSIONS: We underlined the importance of performing a GWAS on an early-stage biomarker of AD and of leveraging functional omics datasets to pinpoint causal genes in AD. Our study prioritized nine genes (SORCS1, KIF5C, TMEFF2, TMEM63C, HLA-E, ATAT1, TUBB, ARID1B, and RUNX1) strongly implicated in the early stages of AD.
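
A quick piece of arithmetic may help explain why the lead SNP and its loci are described as "suggestive". The sketch below compares the reported p-value with a Bonferroni-corrected threshold. The 0.05 family-wise error rate and the use of a Bonferroni correction are assumptions made for illustration; the abstract does not say which significance thresholds the authors applied.

```python
# Illustrative arithmetic only. alpha and the Bonferroni rule are assumptions,
# not thresholds reported in the abstract.
n_snps = 612_843   # SNPs tested (from the abstract)
lead_p = 4.82e-7   # p-value of rs279686 (from the abstract)
alpha = 0.05       # assumed family-wise error rate

bonferroni_threshold = alpha / n_snps
reaches_bonferroni = lead_p < bonferroni_threshold

print(f"Bonferroni threshold: {bonferroni_threshold:.2e}")
print(f"Lead SNP passes Bonferroni: {reaches_bonferroni}")
```

Under these assumptions the lead SNP falls short of the corrected threshold, which is consistent with the abstract's wording of "suggestive" loci and pathways.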


Subject(s)
Alzheimer Disease; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Humans; Alzheimer Disease/genetics; Alzheimer Disease/blood; Aged; Male; Female; Genetic Predisposition to Disease; Biomarkers/blood; Aged, 80 and over
2.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38653489

ABSTRACT

There is growing interest in inferring context-specific gene regulatory networks from single-cell RNA sequencing (scRNA-seq) data. This involves identifying the regulatory relationships between transcription factors (TFs) and genes in individual cells, and then characterizing these relationships at the level of specific cell types or cell states. In this study, we introduce scGATE (single-cell gene regulatory gate) as a novel computational tool for inferring TF-gene interaction networks and reconstructing Boolean logic gates involving regulatory TFs using scRNA-seq data. In contrast to current Boolean models, scGATE eliminates the need for individual formulations and likelihood calculations for each Boolean rule (e.g. AND, OR, XOR). By employing a Bayesian framework, scGATE infers the Boolean rule after fitting the model to the data, which significantly reduces the time complexity of logic-based studies. We applied single-cell assay for transposase-accessible chromatin with sequencing (scATAC-seq) data and TF DNA-binding motifs to filter out TFs that are not relevant to gene regulation. By integrating single-cell clustering with these external cues, scGATE is able to infer context-specific networks. The performance of scGATE is evaluated using synthetic and real single-cell multi-omics data from mouse tissues and human blood, demonstrating its superiority over existing tools for reconstructing TF-gene networks. Additionally, scGATE provides a flexible framework for understanding the complex combinatorial and cooperative relationships among TFs regulating target genes by inferring Boolean logic gates among them.
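
To make the notion of a Boolean logic gate among TFs concrete, the sketch below scores the three rules named in the abstract (AND, OR, XOR) against binarized observations and picks the best match. This brute-force, rule-by-rule scoring is exactly the kind of per-rule evaluation the abstract says scGATE avoids; it is not scGATE's Bayesian method, and the `cells` data and function names are hypothetical.

```python
from itertools import product

# Candidate two-input Boolean rules named in the abstract.
GATES = {
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "XOR": lambda a, b: a != b,
}

def truth_table(gate):
    """Return the gate output for every (tf1, tf2) input combination."""
    return {(a, b): bool(gate(a, b)) for a, b in product([False, True], repeat=2)}

def best_gate(observations):
    """Pick the gate that agrees with the most (tf1, tf2, target) observations.

    Brute-force scoring used only for illustration; not scGATE's inference.
    """
    scores = {
        name: sum(bool(gate(a, b)) == t for a, b, t in observations)
        for name, gate in GATES.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical binarized cells: (TF1 active, TF2 active, target expressed).
cells = [
    (False, False, False),
    (True, False, True),
    (False, True, True),
    (True, True, False),
]
gate_name, gate_scores = best_gate(cells)
print(gate_name, gate_scores)
```

In this toy data the target is on only when exactly one TF is active, so the XOR rule agrees with every observation.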


Subject(s)
Gene Regulatory Networks; Single-Cell Analysis; Transcription Factors; Single-Cell Analysis/methods; Transcription Factors/metabolism; Transcription Factors/genetics; Animals; Mice; Computational Biology/methods; Bayes Theorem; Humans; Algorithms; Sequence Analysis, RNA/methods; Gene Expression Regulation; Multiomics
3.
Bioinformatics ; 39(2), 2023 Feb 03.
Article in English | MEDLINE | ID: mdl-36790055

ABSTRACT

MOTIVATION: The gene regulatory process resembles a logic system in which a target gene is regulated by a logic gate among its regulators. While various computational techniques have been developed for gene regulatory network (GRN) reconstruction, the study of logical relationships has received little attention. Here, we propose a novel tool called wpLogicNet that simultaneously infers both the directed GRN structures and the logic gates among genes or transcription factors (TFs) that regulate their target genes, based on continuous steady-state gene expression. RESULTS: wpLogicNet proposes a framework to infer the logic gates among any number of regulators, with a low time complexity. This distinguishes wpLogicNet from existing logic-based models, which are limited to inferring the gate between two genes or TFs. Our method applies a Bayesian mixture model to estimate the likelihood of the target gene profile and to infer the logic gate a posteriori. Furthermore, in structure-aware mode, wpLogicNet reconstructs the logic gates in TF-gene or gene-gene interaction networks with known structures. The predicted logic gates are validated on simulated datasets of TF-gene interaction networks from Escherichia coli. For directed-edge inference, the method is validated on datasets from E. coli and the DREAM project. The results show that, compared to other well-known methods, wpLogicNet is more precise in reconstructing the network and the logical relationships among genes. AVAILABILITY AND IMPLEMENTATION: The datasets and R package of wpLogicNet are available in the GitHub repository, https://github.com/CompBioIPM/wpLogicNet. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
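
One way to see why handling "any number of regulators" with low time complexity matters: the number of distinct Boolean functions grows doubly exponentially with the number of inputs. The snippet below just evaluates that general combinatorial fact; it is not taken from the abstract and says nothing about how wpLogicNet itself searches the space.

```python
def n_boolean_functions(n_regulators):
    """Count distinct Boolean functions of n binary inputs: 2 ** (2 ** n)."""
    return 2 ** (2 ** n_regulators)

# Growth of the rule space for 1 to 4 regulators.
for k in range(1, 5):
    print(k, n_boolean_functions(k))
```

Any method that formulates and scores each rule separately therefore becomes impractical quickly as regulators are added.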


Subject(s)
Algorithms; Gene Regulatory Networks; Bayes Theorem; Gene Expression Regulation; Transcription Factors/metabolism; Escherichia coli/genetics; Escherichia coli/metabolism
4.
BMC Bioinformatics ; 21(1): 318, 2020 Jul 20.
Article in English | MEDLINE | ID: mdl-32690031

ABSTRACT

BACKGROUND: Gene Regulatory Networks (GRNs) have previously been studied using Boolean/multi-state logics. While gene expression values are usually scaled into the range [0, 1], these GRN inference methods apply a threshold to discretize the data, which loses information. Most studies apply fuzzy logics to infer the logical gene-gene interactions from continuous data. However, all these approaches require an a priori known network structure. RESULTS: Here, by introducing a new probabilistic logic for continuous data, we propose a novel logic-based approach (called LogicNet) for the simultaneous reconstruction of the GRN structure and identification of the logics among the regulatory genes from continuous gene expression data. In contrast to previous approaches, LogicNet does not require an a priori known network structure to infer the logics. The proposed probabilistic logic is superior to the existing fuzzy logics and is more relevant to biological contexts. The performance of LogicNet is superior to that of several Mutual Information-based and regression-based tools for reconstructing GRNs. CONCLUSIONS: LogicNet reconstructs GRNs and logic functions without requiring prior knowledge of the network structure. Moreover, in another application, LogicNet can be applied to detect logic functions from known regulatory gene-target interactions. We also conclude that computational modeling of the logical interactions among the regulatory genes significantly improves GRN reconstruction accuracy.
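
The contrast between thresholded Boolean logic, fuzzy logic, and a probabilistic logic on values in [0, 1] can be sketched as follows. The abstract does not define LogicNet's probabilistic logic, so the product-based operators below are an assumption: they are the common textbook form for independent events, shown next to the standard min/max fuzzy operators. The expression values are hypothetical.

```python
def threshold(x, cutoff=0.5):
    """Discretize a scaled expression value; the cutoff is an assumed example."""
    return 1 if x >= cutoff else 0

def fuzzy_and(x, y):
    """Standard (Zadeh) fuzzy AND."""
    return min(x, y)

def fuzzy_or(x, y):
    """Standard (Zadeh) fuzzy OR."""
    return max(x, y)

def prob_and(x, y):
    """Product form: P(A and B) for independent events (assumed, not LogicNet's)."""
    return x * y

def prob_or(x, y):
    """P(A or B) for independent events (assumed, not LogicNet's)."""
    return x + y - x * y

# Hypothetical scaled expression levels of two regulators.
g1, g2 = 0.6, 0.7

print("thresholded:", threshold(g1), threshold(g2))
print("fuzzy AND/OR:", fuzzy_and(g1, g2), fuzzy_or(g1, g2))
print("probabilistic AND/OR:", round(prob_and(g1, g2), 2), round(prob_or(g1, g2), 2))
```

Thresholding maps both 0.6 and 0.9 to the same value of 1, which is the information loss the abstract refers to. The continuous operators keep the magnitude of each input.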


Subject(s)
Algorithms; Computational Biology/methods; Escherichia coli/genetics; Fuzzy Logic; Gene Regulatory Networks; Genes, Regulator; Models, Genetic; Computer Simulation; Escherichia coli/metabolism; Gene Expression Profiling
5.
Infect Genet Evol ; 85: 104426, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32561293

ABSTRACT

Human T-lymphotropic virus type 1 (HTLV-1) is a retrovirus that causes the neurological disorder HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) and/or adult T-cell leukemia/lymphoma (ATLL). Iran is one of the endemic regions of HTLV-1 in the Middle East. To infer the origin of the virus in Iran and to trace the movements of the human population and the routes by which the virus spread to this country, phylogenetic and phylodynamic analyses were performed. For this purpose, the long terminal repeat (LTR) region of HTLV-1 was used. New LTR sequences were obtained from 100 blood samples infected with HTLV-1. In addition, all Iranian LTR sequences reported so far were obtained from the GenBank database. Sequences were aligned, and maximum-likelihood and Bayesian tree topologies were explored. After identification of an Iranian-specific cluster, molecular-clock and coalescent models were used to estimate the time to the most recent common ancestor (tMRCA). Bayesian Skyline Plots (BSPs), representing the population dynamics of HTLV-1 strains back to the MRCA, were estimated using BEAST software. Phylogenetic analysis demonstrated that the Iranian, Kuwaiti, German, Israeli and southern Indian isolates are located within the widespread "transcontinental" subgroup A clade of HTLV-1 Cosmopolitan subtype a. Molecular clock analysis dated the tMRCA of the Iranian cluster to 1290 AD, with a 95% highest posterior density (HPD) interval of (918, 1517). BSPs indicated rapid exponential growth in the effective number of infections prior to the 15th century. Our results support the hypothesis of multiple introductions of HTLV-1 into Iran, with the majority of introductions occurring before the 15th century, at the time of the Mongol invasion of Iran. Our results further suggest that HTLV-1 introduction into Iran was facilitated by the commercial and migratory link known as the ancient Silk Road, which connected China to Antioch (now in Turkey).


Subject(s)
HTLV-I Infections/virology; Human T-lymphotropic virus 1/genetics; Adolescent; Adult; Base Sequence; Bayes Theorem; Blood/virology; DNA, Viral; Evolution, Molecular; Female; HTLV-I Infections/epidemiology; Human T-lymphotropic virus 1/isolation & purification; Humans; Iran; Male; Middle Aged; Phylogeny; Sequence Analysis, DNA; Terminal Repeat Sequences; Young Adult
6.
J Bioinform Comput Biol ; 16(4): 1850012, 2018 Aug.
Article in English | MEDLINE | ID: mdl-30051743

ABSTRACT

Previous studies show that the empirical distribution of the bacterial burst size varies even in a population of isogenic bacteria. Since bacteriophage progeny increase linearly with time, it is the variation in lysis time that produces the variation in burst size. Here, the burst size variation is modeled computationally by treating the lysis time decisions as a game. Each player in the game is a bacteriophage that has initially infected and lysed its host bacterium. The payoff of each burst size strategy is the average number of bacteria that are infected solely by that bacteriophage's progeny after lysis. To calculate the payoffs, a new version of the ball and bin model with time-dependent occupation probabilities (TDOP) is proposed. We show that a Nash equilibrium occurs for a range of mixed burst size strategies that are chosen and played stochastically by the bacteriophages. Moreover, we conclude that the burst size variations arise from each player choosing mixed lysis strategies. By choosing the lysis time, and therefore the burst size, stochastically, the released bacteriophage progeny infect a portion of the host bacteria in the environment and avoid extinction. The probability distribution of the mixed burst size strategies is also identified.
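
The payoff idea can be illustrated with the classical, time-independent balls-and-bins setting: progeny are balls, host bacteria are bins, and every progeny lands on a uniformly random host. The sketch computes the expected number of hosts hit by at least one of a player's progeny and by none of the rivals'. This is an assumed simplification of what "solely infected" means; it does not reproduce the paper's time-dependent occupation probabilities (TDOP), and all numbers are hypothetical.

```python
def expected_sole_infections(n_bacteria, own_progeny, rival_progeny):
    """Expected hosts infected only by one player's progeny.

    Classical uniform balls-and-bins model (an assumption for illustration):
    each progeny independently picks one of n_bacteria hosts at random.
    """
    miss = 1.0 - 1.0 / n_bacteria             # a given progeny misses a given host
    hit_by_own = 1.0 - miss ** own_progeny     # at least one own progeny lands on it
    missed_by_rivals = miss ** rival_progeny   # no rival progeny lands on it
    return n_bacteria * hit_by_own * missed_by_rivals

# Hypothetical scenario: 1000 hosts, rivals release 100 progeny in total.
for burst in (10, 50, 100):
    print(burst, round(expected_sole_infections(1000, burst, 100), 2))
```

In this static version a larger burst always helps. The game in the abstract is richer because a larger burst requires a later lysis time, which is where the time-dependent probabilities come in.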


Subject(s)
Bacteria/virology; Bacteriolysis/physiology; Models, Biological; Models, Statistical; Bacteria/cytology; Bacterial Physiological Phenomena; Bacteriophages; Game Theory
7.
Sci Rep ; 8(1): 4009, 2018 Mar 05.
Article in English | MEDLINE | ID: mdl-29507384

ABSTRACT

Currently, few tools are capable of detecting genome-wide Copy Number Variations (CNVs) based on sequencing of multiple samples. Although aberrations in mate pair insertion sizes provide additional hints for CNV detection based on multiple samples, the majority of current tools rely only on the depth of coverage. Here, we propose a new algorithm (MSeq-CNV) that detects common CNVs across multiple samples. MSeq-CNV applies a mixture density for modeling aberrations in depth of coverage and abnormalities in the mate pair insertion sizes. At each genomic position, each component in this mixture density applies a Binomial distribution for modeling the number of mate pairs with an aberrant insertion size and a Poisson distribution for emitting the read counts. MSeq-CNV is applied to simulated data and to real data from six HapMap individuals with high-coverage sequencing in the 1000 Genomes Project. These individuals include a CEU trio of European ancestry and a YRI trio of Nigerian ethnicity. The ancestry of these individuals is studied by clustering the identified CNVs. MSeq-CNV is also applied to detect CNVs in two samples with low-coverage sequencing in the 1000 Genomes Project and in six samples from the Simons Genome Diversity Project.
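
A mixture whose components pair a Binomial term (aberrant mate pairs) with a Poisson term (read count) can be sketched as below. The code computes the posterior weight of each component for one genomic position. Treating the two terms as independent within a component, along with the component names and every parameter value, is an assumption made for illustration; the abstract does not give MSeq-CNV's exact parameterization.

```python
import math

def log_poisson(count, rate):
    """Log of the Poisson probability mass function."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)

def log_binomial(k, n, p):
    """Log of the Binomial probability mass function, for 0 < p < 1."""
    return math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(1 - p)

def responsibilities(read_count, n_pairs, n_aberrant, components):
    """Posterior weight of each (weight, poisson_rate, aberrant_prob) component."""
    logs = [
        math.log(w) + log_poisson(read_count, rate) + log_binomial(n_aberrant, n_pairs, p)
        for w, rate, p in components
    ]
    top = max(logs)                       # subtract the max for numerical stability
    scaled = [math.exp(v - top) for v in logs]
    total = sum(scaled)
    return [s / total for s in scaled]

# Hypothetical components: deletion, normal, duplication.
components = [(0.1, 15.0, 0.4), (0.8, 30.0, 0.02), (0.1, 45.0, 0.4)]

# Hypothetical position: 14 reads, 8 of 20 mate pairs with aberrant insert size.
post = responsibilities(read_count=14, n_pairs=20, n_aberrant=8, components=components)
print([round(p, 4) for p in post])
```

Low depth combined with many aberrant pairs pushes nearly all posterior weight onto the first (deletion-like) component, showing how the two signals reinforce each other.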


Subject(s)
DNA Copy Number Variations; Sequence Analysis, DNA/standards; Algorithms; Gene Deletion; Genome, Human; HapMap Project; Heterozygote; Homozygote; Humans; Poisson Distribution; Sequence Analysis, DNA/methods
8.
Arch Virol ; 163(6): 1479-1488, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29442226

ABSTRACT

Previous local and national Iranian publications indicate that all Iranian hepatitis B virus (HBV) strains belong to HBV genotype D. The aim of this study was to analyze the evolutionary history of HBV infection in Iran for the first time, based on an intensive phylodynamic study. The evolutionary parameters, time to most recent common ancestor (tMRCA), and the population dynamics of infections were investigated using the Bayesian Monte Carlo Markov chain (BMCMC). The effective sample size (ESS) and sampling convergence were then monitored. After sampling from the posterior distribution of the nucleotide substitution rate and other evolutionary parameters, the point estimations (median) of these parameters were obtained. All Iranian HBV isolates were of genotype D, sub-type ayw2. The origin of HBV is regarded as having evolved first on the eastern border, before moving westward, where Isfahan province then hosted the virus. Afterwards, the virus moved to the south and west of the country. The tMRCA of HBV in Iran was estimated to be around 1894, with a 95% credible interval between the years 1701 and 1957. The effective number of infections increased exponentially from around 1925 to 1960. Conversely, from around 1992 onwards, the effective number of HBV infections has decreased at a very high rate. Phylodynamic inference clearly demonstrates a unique homogenous pattern of HBV genotype D compatible with a steady configuration of the decreased effective number of infections in the population in recent years, possibly due to the implementation of blood donation screening and vaccination programs. Adequate molecular epidemiology databases for HBV are crucial for infection prevention and treatment programs.


Subject(s)
DNA, Viral/genetics; Genotype; Hepatitis B virus/genetics; Hepatitis B/epidemiology; Phylogeny; Bayes Theorem; Evolution, Molecular; Genetic Variation; Hepatitis B/history; Hepatitis B/prevention & control; Hepatitis B/transmission; Hepatitis B virus/classification; Hepatitis B virus/isolation & purification; History, 18th Century; History, 19th Century; History, 20th Century; History, 21st Century; Humans; Immunization Programs/history; Immunization Programs/organization & administration; Iran/epidemiology; Markov Chains; Molecular Epidemiology; Monte Carlo Method; Mutation Rate; Sequence Analysis, DNA; Viral Hepatitis Vaccines/administration & dosage
9.
BMC Bioinformatics ; 18(1): 30, 2016 Nov 03.
Article in English | MEDLINE | ID: mdl-27809781

ABSTRACT

BACKGROUND: Copy Number Variation (CNV) is considered a major source of large structural variation in the human genome. In recent years, many studies have applied Next Generation Sequencing (NGS) data to CNV detection. However, there is still a need for more accurate computational tools. RESULTS: In this study, mate pair NGS data are used for CNV detection in a Hidden Markov Model (HMM). The proposed HMM has position-specific emission probabilities, given by a Gaussian mixture distribution. Each component in the Gaussian mixture distribution captures a different type of aberration observed in the mate pairs after they are mapped to the reference genome. These aberrations may include an increase or decrease in the insertion size, or a change in the direction of the mapped mate pairs. This HMM with Position-Specific Emission probabilities (PSE-HMM) is used for the genome-wide detection of deletions and tandem duplications. The performance of PSE-HMM is evaluated on a simulated dataset and on real data from a Yoruban HapMap individual, NA18507. CONCLUSIONS: PSE-HMM is effective in taking observation dependencies into account and reaches a high accuracy in detecting genome-wide CNVs. MATLAB programs are available at http://bs.ipm.ir/softwares/PSE-HMM/ .


Subject(s)
Algorithms; DNA Copy Number Variations; Genome, Human; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Data Accuracy; Genomics/methods; Humans; Probability
10.
PLoS One ; 11(9): e0162492, 2016.
Article in English | MEDLINE | ID: mdl-27611688

ABSTRACT

The high rate of hepatitis C virus (HCV) infection among transfusion-related risk groups, such as patients with inherited bleeding disorders, highlights the need to investigate the prevalent subtypes and their epidemic history in this group. In this study, 166 new HCV NS5B sequences isolated from patients with inherited bleeding disorders were analyzed together with 29 sequences from hemophiliacs obtained from a previous study on the diversity of HCV in Iran. The most prevalent subtype was 1a (65%), followed by 3a (18.7%), 1b (14.5%), 4 (1.2%) and 2k (0.6%). Subtypes 1a and 3a showed exponential expansion during the 20th century. Whereas the expansion of 3a started around 20 years earlier than that of 1a among the study patients, the epidemic growth of 1a was delayed by about 10 years compared with that found for this subtype in developed countries. Our results support the view that the spread of 3a reached a plateau 10 years before blood donors were screened for HCV, whereas 1a reached a plateau when the screening program was implemented. The differences observed in the epidemic behavior of HCV-1a and 3a may be associated with the different transmission routes of the two subtypes. Indeed, the expansion of 1a was more commonly linked to blood transfusion, while 3a was more strongly associated with drug use, especially injecting drug use (IDU), after 1960. Our findings also show that HCV transmission through blood products has been effectively controlled since the late 1990s. In conclusion, strategies such as standard surveillance programs and subsidized antiviral treatment appear essential both to prevent new HCV infections and to reduce current and future HCV disease among Iranian patients with inherited bleeding disorders.


Subject(s)
Hepatitis C/epidemiology; Adolescent; Adult; Aged; Blood Coagulation Disorders/epidemiology; Blood Coagulation Disorders/virology; Female; Genotype; Hepacivirus/pathogenicity; Hepatitis C/classification; Hepatitis C/virology; Humans; Iran/epidemiology; Male; Middle Aged; Phylogeny; Phylogeography; RNA, Viral/genetics; Viral Nonstructural Proteins/genetics; Young Adult
11.
Math Biosci ; 279: 53-62, 2016 Sep.
Article in English | MEDLINE | ID: mdl-27424951

ABSTRACT

MOTIVATION: Association of Copy Number Variation (CNV) with schizophrenia, autism, developmental disabilities and fatal diseases such as cancer is verified. Recent developments in Next Generation Sequencing (NGS) have facilitated the CNV studies. However, many of the current CNV detection tools are not capable of discriminating tandem duplication from non-tandem duplications. RESULTS: In this study, we propose MGP-HMM as a tool which besides detecting genome-wide deletions discriminates tandem duplications from non-tandem duplications. MGP-HMM takes mate pair abnormalities into account and predicts the digitized number of tandem or non-tandem copies. Abnormalities in the mate pair directions and insertion sizes, after being mapped to the reference genome, are elucidated using a Hidden Markov Model (HMM). For this purpose, a Mixture Gaussian density with time-dependent parameters is applied for emitting mate pair insertion sizes from HMM states. Indeed, depending on observed abnormalities in mate pair insertion size or its orientation, each component in the mixture density will have different parameters. MGP-HMM also applies a Poisson distribution for modeling read depth data. This parametric modeling of the mate pair reads enables us to estimate the length of CNVs precisely, which is an advantage over methods which rely only on read depth approach for the CNV detection. Hidden state of the proposed HMM is the digitized copy number of a genomic segment and states correspond to the multipliers of the mixture Gaussian components. The accuracy of our model is validated on a set of next generation sequencing real and simulated data and is compared to other tools.
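
The idea of an HMM whose hidden state is the digitized copy number, with Poisson-distributed read depth, can be sketched with a small Viterbi decoder. This covers only the read-depth part: the mate-pair Gaussian mixture emissions described in the abstract are left out, and the state set, the transition probabilities, and the expected depth per copy are all hypothetical values chosen for illustration.

```python
import math

def log_poisson(count, rate):
    """Log of the Poisson probability mass function."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)

def viterbi_copy_number(depths, states=(0, 1, 2, 3), diploid_rate=30.0, stay_prob=0.9):
    """Most likely copy-number path for a list of per-window read depths."""
    # Expected depth scales with copy number; a small floor keeps the rate positive at 0.
    rates = {cn: max(cn * diploid_rate / 2.0, 0.5) for cn in states}
    log_stay = math.log(stay_prob)
    log_move = math.log((1.0 - stay_prob) / (len(states) - 1))

    def step(prev, cur):
        return log_stay if prev == cur else log_move

    # Uniform prior over the initial state.
    score = {cn: -math.log(len(states)) + log_poisson(depths[0], rates[cn]) for cn in states}
    backpointers = []
    for depth in depths[1:]:
        new_score, pointers = {}, {}
        for cur in states:
            best_prev = max(states, key=lambda prev: score[prev] + step(prev, cur))
            pointers[cur] = best_prev
            new_score[cur] = score[best_prev] + step(best_prev, cur) + log_poisson(depth, rates[cur])
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(states, key=score.get)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical depths with a dip in the middle windows.
depths = [31, 28, 30, 14, 16, 15, 29, 32]
print(viterbi_copy_number(depths))
```

The transition penalty keeps the decoder from switching state on single noisy windows. A read-depth-only model like this localizes breakpoints only to window resolution, which is the limitation the abstract says mate-pair modeling addresses.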


Subject(s)
DNA Copy Number Variations; Models, Statistical; Sequence Analysis; Humans
12.
Int J Data Min Bioinform ; 8(1): 66-82, 2013.
Article in English | MEDLINE | ID: mdl-23865165

ABSTRACT

A Profile Hidden Markov Model (PHMM) is a standard form of Hidden Markov Model used for modeling protein and DNA sequence families based on multiple alignment. In this paper, we implement the Baum-Welch algorithm and the Bayesian Monte Carlo Markov Chain (BMCMC) method for estimating the parameters of a small artificial PHMM. To improve the accuracy of the parameter estimates, we classify the training data using the weighted values of sequences in the PHMM and then apply an algorithm for estimating the parameters of the PHMM. The results show that the BMCMC method performs better than Maximum Likelihood estimation.
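
Both estimation approaches revolve around the likelihood of the training sequences under the model, which the forward algorithm computes. The sketch below implements the forward recursion for a plain two-state HMM. It is not a profile HMM (there are no match, insert, or delete states) and it is not the full Baum-Welch procedure; the states, alphabet, and probabilities are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of an observation sequence under a discrete HMM.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][x]  : probability of state s emitting symbol x
    """
    n_states = len(start)
    # Initialization with the first symbol.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    # Recursion: sum over predecessor states, then emit the next symbol.
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    return sum(alpha)

# Hypothetical two-state model over the alphabet {A, C}.
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [{"A": 0.5, "C": 0.5}, {"A": 0.1, "C": 0.9}]

likelihood = forward_likelihood("ACC", start, trans, emit)
print(likelihood)
```

Baum-Welch iteratively re-estimates `start`, `trans`, and `emit` so that this likelihood does not decrease, while a Bayesian MCMC approach samples the parameters from their posterior instead of returning a single maximum-likelihood point.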


Subject(s)
Algorithms; Cluster Analysis; Markov Chains; Sequence Analysis, DNA; Sequence Analysis, Protein; Base Sequence; Bayes Theorem; Likelihood Functions; Sequence Alignment
13.
Math Biosci ; 221(2): 130-5, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19646454

ABSTRACT

Hidden Markov Models (HMMs) are practical tools that provide a probabilistic basis for protein secondary structure prediction. In these models, usually only the information on the left-hand side of an amino acid is considered. Accordingly, these models seem to be inefficient with respect to long-range correlations. In this work we discuss a Segmental Semi Markov Model (SSMM) in which the information on both sides of an amino acid is considered. It seems reasonable to assume that the information on both sides of an amino acid can provide a suitable tool for measuring dependencies. We handle these dependencies by dividing them into shorter dependencies. Each of these dependency models can be applied to estimate the probability of segments in structural classes. Several conditional probabilities concerning the dependency of an amino acid on the residues appearing on both of its sides are considered. Based on these conditional probabilities, a weighted model is obtained to calculate the probability of each segment in a structure. This results in a 2.27% increase in prediction accuracy in comparison with ordinary SSMMs. We also compare the performance of our model with that of the Segmental Semi Markov Model introduced by Schmidler et al. [C.S. Schmidler, J.S. Liu, D.L. Brutlag, Bayesian segmentation of protein secondary structure, J. Comp. Biol. 7(1/2) (2000) 233-248]. The calculations show that the overall prediction accuracy of our model is higher than that of the SSMM introduced by Schmidler.


Subject(s)
Markov Chains; Models, Molecular; Protein Structure, Secondary; Proteins/chemistry; Algorithms; Amino Acids/chemistry; Bayes Theorem; Models, Statistical
14.
Math Biosci ; 217(2): 145-50, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19046975

ABSTRACT

Prediction of protein secondary structure is an important step towards elucidating a protein's three-dimensional structure and function. This is a challenging problem in bioinformatics. Segmental semi Markov models (SSMMs) are among the best-studied methods in this field. However, incorporating evolutionary information into these methods is somewhat difficult. On the other hand, systems of multiple neural networks (NNs) are powerful tools for multi-class pattern classification that can easily take this sort of information into account. To overcome the weakness of SSMMs in prediction, in this work we consider an SSMM as a decision function on the outputs of three NNs that use multiple sequence alignment profiles. We consider four types of observations for the outputs of a neural network. The profile table related to each sequence is then reduced to a sequence of four observations. To predict the secondary structure of each amino acid, we need a decision function, for which we use an SSMM on the outputs of the three neural networks. The proposed SSMM has discriminative power and weights over different dependency models for the outputs of the neural networks. The results show that the prediction accuracy of our model, particularly for strands, is considerably increased.


Subject(s)
Markov Chains; Neural Networks, Computer; Protein Structure, Secondary; Proteins/chemistry; Sequence Alignment