Search | BVS Enfermagem Search Portal

1.

Ontogeny-related pharmacogene changes in the pediatric liver transcriptome.

Meier, Richard; Bi, Chengpeng; Gaedigk, Roger; Heruth, Daniel P; Ye, Shui Qing; Leeder, J Steven; Fridley, Brooke L.

Pharmacogenet Genomics ; 28(3): 86-94, 2018 03.

Article in English | MEDLINE | ID: mdl-29360682

ABSTRACT

OBJECTIVES: The majority of drug dosing studies are based on adult populations, with modification of the dosing for children based on size and weight. This rudimentary approach to dosing children is limited, as biologically a child can differ from an adult in far more aspects than just size and weight. Specifically, understanding the ontogeny of childhood liver development is critical in dosing drugs that are metabolized through the liver, as the rate of metabolism determines the duration and intensity of a drug's pharmacologic action. Therefore, we set out to determine pharmacogenes that change over childhood development, followed by a secondary agnostic analysis assessing changes transcriptome-wide. MATERIALS AND METHODS: A total of 47 human liver tissue samples, with between 10 and 13 samples in each of four age groups spanning childhood development, underwent paired-end sequencing. Kruskal-Wallis and Spearman's rank correlation tests were used to determine the association of gene expression levels with age. Gene set analysis based on the pathways in KEGG utilized the gamma method. Correction for multiple testing was completed using q-values. RESULTS: We found evidence for increased expression of 'very important pharmacogenes', for example, coagulation factor V (F5) (P=6.7×10^-7), angiotensin I converting enzyme (ACE) (P=6.4×10^-3), and solute carrier family 22 member 1 (SLC22A1) (P=7.0×10^-5) over childhood development. In contrast, we observed a significant decrease in expression of two alternative CYP3A7 transcripts (P=1.5×10^-5 and 3.0×10^-5) over development. The analysis of genome-wide changes detected transcripts in the following genes with significant changes in mRNA expression (P<1×10^-9 with false discovery rate<5×10^-5): ADCY1, PTPRD, CNDP1, DCAF12L1 and HIP1. Gene set analysis determined ontogeny-related transcriptomic changes in the renin-angiotensin pathway (P<0.002), with lower expression of the pathway, in general, observed in liver samples from younger participants. CONCLUSION: Considering that the renin-angiotensin pathway plays a central role in blood pressure and plasma sodium concentration, and our observation that ACE and PTPRD expression increased over the spectrum of childhood development, this finding could potentially impact the dosing of an entire class of drugs known as ACE-inhibitors in pediatric patients.
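The rank-correlation step named in the methods above can be illustrated with a minimal pure-Python sketch. It is not the authors' pipeline: the function names are hypothetical, and the code only shows how Spearman's rho is obtained by ranking both variables (ties receive the mean rank) and then computing a Pearson correlation on the ranks.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Calling `spearman_rho(ages, expression)` with a list of donor ages and a matching list of expression values for one gene gives a value near +1 when expression rises monotonically with age and near -1 when it falls. Significance testing and q-value correction are deliberately left out of this sketch.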

Subjects

High-Throughput Nucleotide Sequencing; Organic Cation Transporter 1/genetics; Renin-Angiotensin System/genetics; Transcriptome/genetics; Adolescent; Child; Child, Preschool; Cytochrome P-450 CYP3A/genetics; Factor V/genetics; Female; Gene Expression Regulation/drug effects; Humans; Infant; Infant, Newborn; Liver/drug effects; Liver/metabolism; Male; Peptidyl-Dipeptidase A/genetics

2.

Dynamics of Cytosine Methylation in the Proximal Promoters of CYP3A4 and CYP3A7 in Pediatric and Prenatal Livers.

Vyhlidal, Carrie A; Bi, Chengpeng; Ye, Shui Qing; Leeder, J Steven.

Drug Metab Dispos ; 44(7): 1020-6, 2016 07.

Article in English | MEDLINE | ID: mdl-26772622

ABSTRACT

Members of the human CYP3A family of metabolizing enzymes exhibit developmental changes in expression whereby CYP3A7 is expressed in fetal tissues, followed by a transition to expression of CYP3A4 in the first months of life. Despite knowledge about the general pattern of CYP3A activity in human development, the mechanisms that regulate developmental expression remain poorly understood. Epigenetic changes, including cytosine methylation, have been suggested to play a role in the regulation of CYP3A expression. The objective of this study was to investigate changes in cytosine methylation of the CYP3A4 and CYP3A7 genes in human pediatric and prenatal livers. The methylation status of cytosine-phospho-guanine dinucleotides was determined in 16 pediatric liver samples using methyl-seq and confirmed by bisulfite sequencing of 48 pediatric and 34 prenatal liver samples. Samples were separated by age into five groups (prenatal, < 1 year of age, 1.8-6 years, 7-11 years, and 12-17 years). Methyl-seq analysis revealed that cytosines in the proximal promoter of CYP3A7 are hypomethylated in neonates compared with adolescents (P < 0.001). In contrast, a cytosine 383 base pairs upstream of CYP3A4 is hypermethylated in liver samples from neonates compared with adolescents (P = 0.00001). Developmental changes in methylation of cytosines in the proximal promoters of CYP3A4 and CYP3A7 in pediatric livers were confirmed by bisulfite sequencing. In addition, the methylation status of cytosines in the CYP3A4 and CYP3A7 proximal promoters correlated with changes in developmental expression of mRNA for the two enzymes.

Subjects

Aging/genetics; Cytochrome P-450 CYP3A/genetics; Cytosine; DNA Methylation; Epigenesis, Genetic; Liver/enzymology; Promoter Regions, Genetic; Adolescent; Age Factors; Aging/metabolism; Child; Child, Preschool; Cytochrome P-450 CYP3A/metabolism; Female; Gene Expression Regulation, Developmental; Gene Expression Regulation, Enzymologic; Gestational Age; Humans; Infant; Infant, Newborn; Male; RNA, Messenger/genetics; RNA, Messenger/metabolism

3.

Characterization and visualization of tandem repeats at genome scale.

Dolzhenko, Egor; English, Adam; Dashnow, Harriet; De Sena Brandine, Guilherme; Mokveld, Tom; Rowell, William J; Karniski, Caitlin; Kronenberg, Zev; Danzi, Matt C; Cheung, Warren A; Bi, Chengpeng; Farrow, Emily; Wenger, Aaron; Chua, Khi Pin; Martínez-Cerdeño, Verónica; Bartley, Trevor D; Jin, Peng; Nelson, David L; Zuchner, Stephan; Pastinen, Tomi; Quinlan, Aaron R; Sedlazeck, Fritz J; Eberle, Michael A.

Nat Biotechnol ; 2024 Jan 02.

Article in English | MEDLINE | ID: mdl-38168995

ABSTRACT

Tandem repeat (TR) variation is associated with gene expression changes and numerous rare monogenic diseases. Although long-read sequencing provides accurate full-length sequences and methylation of TRs, there is still a need for computational methods to profile TRs across the genome. Here we introduce the Tandem Repeat Genotyping Tool (TRGT) and an accompanying TR database. TRGT determines the consensus sequences and methylation levels of specified TRs from PacBio HiFi sequencing data. It also reports reads that support each repeat allele. These reads can be subsequently visualized with a companion TR visualization tool. Assessing 937,122 TRs, TRGT showed a Mendelian concordance of 98.38%, allowing a single repeat unit difference. In six samples with known repeat expansions, TRGT detected all expansions while also identifying methylation signals and mosaicism and providing finer repeat length resolution than existing methods. Additionally, we released a database with allele sequences and methylation levels for 937,122 TRs across 100 genomes.
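The Mendelian concordance figure quoted above, with its tolerance of a single repeat unit, can be pictured with a small sketch. This is an illustration under stated assumptions, not TRGT's evaluation code: alleles are represented here simply as repeat-unit counts, and the function names and trio layout are hypothetical.

```python
def inherited_from(allele, parent_alleles, tolerance=1):
    """True if the allele is within `tolerance` repeat units of a parental allele."""
    return any(abs(allele - p) <= tolerance for p in parent_alleles)


def is_mendelian_consistent(child, mother, father, tolerance=1):
    """Check one locus of a trio; each argument is a pair of repeat-unit counts.

    The child is consistent if one allele can be assigned to the mother and
    the other to the father, in either order.
    """
    c1, c2 = child
    return (
        inherited_from(c1, mother, tolerance) and inherited_from(c2, father, tolerance)
    ) or (
        inherited_from(c1, father, tolerance) and inherited_from(c2, mother, tolerance)
    )


def concordance(trios, tolerance=1):
    """Fraction of (child, mother, father) loci that are Mendelian consistent."""
    consistent = sum(
        is_mendelian_consistent(c, m, f, tolerance) for c, m, f in trios
    )
    return consistent / len(trios)
```

For example, a child genotype of `(10, 20)` with parents `(10, 12)` and `(19, 30)` counts as consistent, because 20 lies within one unit of the paternal allele 19. Setting `tolerance=0` would require exact matches.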

4.

Direct haplotype-resolved 5-base HiFi sequencing for genome-wide profiling of hypermethylation outliers in a rare disease cohort.

Cheung, Warren A; Johnson, Adam F; Rowell, William J; Farrow, Emily; Hall, Richard; Cohen, Ana S A; Means, John C; Zion, Tricia N; Portik, Daniel M; Saunders, Christopher T; Koseva, Boryana; Bi, Chengpeng; Truong, Tina K; Schwendinger-Schreck, Carl; Yoo, Byunggil; Johnston, Jeffrey J; Gibson, Margaret; Evrony, Gilad; Rizzo, William B; Thiffault, Isabelle; Younger, Scott T; Curran, Tom; Wenger, Aaron M; Grundberg, Elin; Pastinen, Tomi.

Nat Commun ; 14(1): 3090, 2023 05 29.

Article in English | MEDLINE | ID: mdl-37248219

ABSTRACT

Long-read HiFi genome sequencing allows for accurate detection and direct phasing of single nucleotide variants, indels, and structural variants. Recent algorithmic development enables simultaneous detection of CpG methylation for analysis of regulatory element activity directly in HiFi reads. We present a comprehensive haplotype-resolved 5-base HiFi genome sequencing dataset from a rare disease cohort of 276 samples in 152 families to identify rare (~0.5%) hypermethylation events. We find that 80% of these events are allele-specific and predicted to cause loss of regulatory element activity. We demonstrate heritability of extreme hypermethylation including rare cis variants associated with short (~200 bp) and large hypermethylation events (>1 kb), respectively. We identify repeat expansions in proximal promoters predicting allelic gene silencing via hypermethylation and demonstrate allelic transcriptional events downstream. On average 30-40 rare hypermethylation tiles overlap rare disease genes per patient, providing indications for variation prioritization including a previously undiagnosed pathogenic allele in DIP2B causing global developmental delay. We propose that use of HiFi genome sequencing in unsolved rare disease cases will allow detection of unconventional disease alleles due to loss of regulatory element activity.

Subjects

DNA Methylation; Rare Diseases; Humans; Haplotypes; Rare Diseases/genetics; DNA Methylation/genetics; Sequence Analysis, DNA; Base Sequence; High-Throughput Nucleotide Sequencing; Nerve Tissue Proteins/genetics

5.

Pediatric growth patterns in youth-onset type 2 diabetes mellitus: Implications for physiologically-based pharmacokinetic models.

Hosey, Chelsea M; Halpin, Kelsee; Shakhnovich, Valentina; Bi, Chengpeng; Sweeney, Brooke; Yan, Yun; Leeder, J Steven.

Clin Transl Sci ; 15(4): 912-922, 2022 04.

Article in English | MEDLINE | ID: mdl-35297172

ABSTRACT

An accurate understanding of the changes in height and weight of children with age is critical to the development of models predicting drug concentrations in children (i.e., physiologically-based pharmacokinetic models). However, curves describing the growth of a typical population of children may not accurately characterize growth of children with various conditions, such as obesity. Therefore, to develop height and weight versus age growth curves for youth who were diagnosed with type 2 diabetes, we extracted data from electronic medical records. Robust nonlinear models were parameterized to the equations describing height and weight versus age as defined by the Centers for Disease Control and Prevention (CDC). CDC z-scores were calculated using an internal program. The growth curves and z-scores were compared to CDC norms. Youth with type 2 diabetes were increasingly heavier than CDC norms from early childhood. Except for a period around puberty, youth with type 2 diabetes were, on average, shorter than CDC norms, resulting in shorter average adult height. Deviations in growth were apparent in youth who develop type 2 diabetes; such deviations may be expected for other conditions as well, and disease-specific growth curves should be considered in model-informed drug development for pediatric conditions.
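The abstract does not describe the internal program used for z-scores. As a hedged illustration only, the sketch below uses the LMS formula that the CDC publishes for its growth charts, where L is the Box-Cox power, M the median, and S the generalized coefficient of variation for a given sex and age. The function name and any parameter values passed to it are hypothetical and are not taken from CDC tables or from the study.

```python
import math


def lms_zscore(x, L, M, S):
    """Z-score of measurement x from LMS parameters (Box-Cox power L, median M, CV S)."""
    if L == 0:
        # Limiting case of the Box-Cox transform as L approaches zero.
        return math.log(x / M) / S
    return ((x / M) ** L - 1) / (L * S)
```

A measurement equal to the median gives a z-score of zero. With the illustrative parameters L = 1, M = 16, and S = 0.125, a measurement of 20 gives a z-score of 2, since (20/16 - 1) / 0.125 = 2.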

Subjects

Diabetes Mellitus, Type 2; Adolescent; Adult; Body Height; Body Mass Index; Body Weight; Child; Child Development; Child, Preschool; Humans; Obesity

6.

Alternative Splicing of the SLCO1B1 Gene: An Exploratory Analysis of Isoform Diversity in Pediatric Liver.

van Groen, Bianca D; Bi, Chengpeng; Gaedigk, Roger; Staggs, Vincent S; Tibboel, Dick; de Wildt, Saskia N; Leeder, J Steven.

Clin Transl Sci ; 13(3): 509-519, 2020 05.

Article in English | MEDLINE | ID: mdl-31917523

ABSTRACT

The hepatic influx transporter OATP1B1 (SLCO1B1) plays an important role in the disposition of endogenous substrates and drugs prescribed to children. Alternative splicing increases the diversity of protein products from > 90% of human genes and may be triggered by developmental signals. As concentrations of several endogenous OATP1B1 substrates change during growth and development, with this exploratory study we investigated age-dependent alternative splicing of SLCO1B1 mRNA in 97 postmortem livers (fetus-adolescents). Twenty-seven splice variants were detected; 10 were confirmed by additional bioinformatic analyses and verified by quantitative polymerase chain reaction, and selected for detailed analysis based on relative abundance, association with age, and overlap with an adjacent gene. Two splice variants code for reference OATP1B1 protein, and eight code for truncated proteins. The expression of eight isoforms was associated with age. We conclude that alternative splicing of SLCO1B1 occurs frequently in children; although the functional consequences remain unknown, the data raise the possibility of a regulatory role for alternative splicing in mediating developmental changes in drug disposition.

Subjects

Alternative Splicing; Gene Expression Regulation, Developmental; Liver-Specific Organic Anion Transporter 1/genetics; Liver/metabolism; Aborted Fetus; Adolescent; Age Factors; Child; Child, Preschool; Humans; Infant; Infant, Newborn; Liver-Specific Organic Anion Transporter 1/metabolism; Netherlands; Organic Anion Transporters/genetics; Organic Anion Transporters/metabolism; Protein Isoforms/genetics; Protein Isoforms/metabolism; RNA-Seq; Solute Carrier Proteins/genetics; Solute Carrier Proteins/metabolism; Stillbirth

7.

DNA motif alignment by evolving a population of Markov chains.

Bi, Chengpeng.

BMC Bioinformatics ; 10 Suppl 1: S13, 2009 Jan 30.

Article in English | MEDLINE | ID: mdl-19208112

ABSTRACT

BACKGROUND: Deciphering cis-regulatory elements or de novo motif-finding in genomes still remains elusive although much algorithmic effort has been expended. Markov chain Monte Carlo (MCMC) methods such as Gibbs motif samplers have been widely employed to solve the de novo motif-finding problem through sequence local alignment. Nonetheless, the MCMC-based motif samplers still suffer from local maxima, as EM does. Therefore, as a prerequisite for finding good local alignments, these motif algorithms are often independently run a multitude of times, but without information exchange between different chains. Hence a new algorithm design that enables such information exchange would be worthwhile. RESULTS: This paper presents a novel motif-finding algorithm that evolves a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). It is progressively updated through a series of local alignments stochastically sampled. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distribution rather than individual ones, and adaptively evolves the population as a whole towards a global maximum. The alignment information exchange is accomplished by taking advantage of the pooled motif site distributions. A distinct method for running multiple independent Markov chains (IMC) without information exchange, dubbed the IMC motif algorithm, is also devised for comparison with its PMC counterpart. CONCLUSION: Experimental studies demonstrate that the performance could be improved if pooled information were used to run a population of motif samplers. The new PMC algorithm was able to improve the convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.

Subjects

DNA/chemistry; Markov Chains; Sequence Alignment/methods; Algorithms; DNA/genetics; Sequence Analysis, DNA/methods

8.

Proteomics of human liver membrane transporters: a focus on fetuses and newborn infants.

van Groen, Bianca D; van de Steeg, Evita; Mooij, Miriam G; van Lipzig, Marola M H; de Koning, Barbara A E; Verdijk, Robert M; Wortelboer, Heleen M; Gaedigk, Roger; Bi, Chengpeng; Leeder, J Steven; van Schaik, Ron H N; van Rosmalen, Joost; Tibboel, Dick; Vaes, Wouter H; de Wildt, Saskia N.

Eur J Pharm Sci ; 124: 217-227, 2018 Nov 01.

Article in English | MEDLINE | ID: mdl-30171984

ABSTRACT

BACKGROUND: Hepatic membrane transporters are involved in the transport of many endogenous and exogenous compounds, including drugs. We aimed to study the relation of age with absolute transporter protein expression in a cohort of 62 mainly fetus and newborn samples. METHODS: Protein expressions of BCRP, BSEP, GLUT1, MCT1, MDR1, MRP1, MRP2, MRP3, NTCP, OCT1, OATP1B1, OATP1B3, OATP2B1 and ATP1A1 were quantified with LC-MS/MS in isolated crude membrane fractions of snap-frozen post-mortem fetal and pediatric, and surgical adult liver samples. mRNA expression was quantified using RNA sequencing, and genetic variants with TaqMan assays. We explored relationships between protein expression and age (gestational age [GA], postnatal age [PNA], and postmenstrual age); between protein and mRNA expression; and between protein expression and genotype. RESULTS: We analyzed 36 fetal (median GA 23.4 weeks [range 15.3-41.3]), 12 premature newborn (GA 30.2 weeks [24.9-36.7], PNA 1.0 weeks [0.14-11.4]), 10 term newborn (GA 40.0 weeks [39.7-41.3], PNA 3.9 weeks [0.3-18.1]), 4 pediatric (PNA 4.1 years [1.1-7.4]) and 8 adult liver samples. A relationship with age was found for BCRP, BSEP, GLUT1, MDR1, MRP1, MRP2, MRP3, NTCP, OATP1B1 and OCT1, with the strongest relationship for postmenstrual age. For most transporters mRNA and protein expression were not correlated. No genotype-protein expression relationship was detected. DISCUSSION AND CONCLUSION: Various developmental patterns of protein expression of hepatic transporters emerged in fetuses and newborns up to four months of age. Postmenstrual age was the most robust factor predicting transporter expression in this cohort. Our data fill an important gap in current pediatric transporter ontogeny knowledge.

Subjects

Fetus/metabolism; Liver/metabolism; Membrane Transport Proteins/metabolism; Adult; Animals; Child; Child, Preschool; Dogs; HEK293 Cells; Humans; Infant; Infant, Newborn; Liver/embryology; Madin Darby Canine Kidney Cells; Membrane Transport Proteins/genetics; Proteomics; RNA, Messenger/metabolism

9.

Correlations between scaffold/matrix attachment region (S/MAR) binding activity and DNA duplex destabilization energy.

Bode, Jürgen; Winkelmann, Silke; Götze, Sandra; Spiker, Steven; Tsutsui, Ken; Bi, Chengpeng; A K, Prashanth; Benham, Craig.

J Mol Biol ; 358(2): 597-613, 2006 Apr 28.

Article in English | MEDLINE | ID: mdl-16516920

ABSTRACT

Scaffold or matrix-attachment regions (S/MARs) are thought to be involved in the organization of eukaryotic chromosomes and in the regulation of several DNA functions. Their characteristics are conserved between plants and humans, and a variety of biological activities have been associated with them. The identification of S/MARs within genomic sequences has proved to be unexpectedly difficult, as they do not appear to have consensus sequences or sequence motifs associated with them. We have shown that S/MARs do share a characteristic structural property: they have a markedly high predicted propensity to undergo strand separation when placed under negative superhelical tension. This result agrees with experimental observations that S/MARs contain base-unpairing regions (BURs). Here, we perform a quantitative evaluation of the association between the ease of stress-induced DNA duplex destabilization (SIDD) and S/MAR binding activity. We first use synthetic oligomers to investigate how the arrangement of localized unpairing elements within a base-unpairing region affects S/MAR binding. The organizational properties found in this way are applied to the investigation of correlations between specific measures of stress-induced duplex destabilization and the binding properties of naturally occurring S/MARs. For this purpose, we analyze S/MAR and non-S/MAR elements that have been derived from the human genome or from the tobacco genome. We find that S/MARs exhibit long regions of extensive destabilization. Moreover, quantitative measures of the SIDD attributes of these fragments calculated under uniform conditions are found to correlate very highly (r^2>0.8) with their experimentally measured S/MAR-binding strengths. These results suggest that duplex destabilization may be involved in the mechanisms by which S/MARs function. They suggest also that SIDD properties may be incorporated into an improved computational strategy to search genomic DNA sequences for sites having the necessary attributes to function as S/MARs, and even to estimate their relative binding strengths.

Subjects

DNA/metabolism; Matrix Attachment Regions; Nucleic Acid Heteroduplexes/metabolism; Antineoplastic Agents/chemistry; Antineoplastic Agents/metabolism; Chromatin/genetics; DNA/chemistry; Dimerization; Genome, Human; Genome, Plant; Humans; Interferon-beta/chemistry; Interferon-beta/metabolism; Nucleic Acid Conformation; Protein Binding

10.

SEAM: a Stochastic EM-type Algorithm for Motif-finding in biopolymer sequences.

Bi, Chengpeng.

J Bioinform Comput Biol ; 5(1): 47-77, 2007 Feb.

Article in English | MEDLINE | ID: mdl-17477491

ABSTRACT

Position weight matrix-based statistical modeling for the identification and characterization of motif sites in a set of unaligned biopolymer sequences is presented. This paper describes and implements a new algorithm, the Stochastic EM-type Algorithm for Motif-finding (SEAM), and redesigns and implements the EM-based motif-finding algorithm called deterministic EM (DEM) for comparison with SEAM, its stochastic counterpart. The gold standard example, cyclic adenosine monophosphate receptor protein (CRP) binding sequences, together with other biological sequences, is used to illustrate the performance of the new algorithm and compare it with other popular motif-finding programs. The convergence of the new algorithm is shown by simulation. The in silico experiments using simulated and biological examples illustrate the power and robustness of the new algorithm SEAM in de novo motif discovery.

Subjects

Algorithms; Artificial Intelligence; Biopolymers/chemistry; Proteins/chemistry; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Amino Acid Motifs; Amino Acid Sequence; Binding Sites; Data Interpretation, Statistical; Likelihood Functions; Markov Chains; Molecular Sequence Data; Protein Binding; Software; Stochastic Processes

11.

BIPAD: a web server for modeling bipartite sequence elements.

Bi, Chengpeng; Rogan, Peter K.

BMC Bioinformatics ; 7: 76, 2006 Feb 17.

Article in English | MEDLINE | ID: mdl-16503993

ABSTRACT

BACKGROUND: Many dimeric protein complexes bind cooperatively to families of bipartite nucleic acid sequence elements, which consist of pairs of conserved half-site sequences separated by intervening distances that vary among individual sites. RESULTS: We introduce the Bipad Server, a web interface to predict sequence elements embedded within unaligned sequences. Either a bipartite model, consisting of a pair of one-block position weight matrices (PWMs) with a gap distribution, or a single PWM for contiguous single-block motifs may be produced. The Bipad program performs multiple local alignment by entropy minimization and cyclic refinement using a stochastic greedy search strategy. The best models are refined by maximizing incremental information contents among a set of potential models with varying half-site and gap lengths. CONCLUSION: The web service generates information positional weight matrices, identifies binding site motifs, graphically represents the set of discovered elements as a sequence logo, and depicts the gap distribution as a histogram. Server performance was evaluated by generating a collection of bipartite models for distinct DNA binding proteins.

Subjects

Chromosome Mapping/methods; DNA-Binding Proteins/genetics; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Software; Transcription Factors/genetics; Algorithms; Base Sequence; Binding Sites; Computer Simulation; Internet; Models, Genetic; Molecular Sequence Data; Online Systems; Protein Binding; Sequence Homology, Nucleic Acid

12.

Bipartite pattern discovery by entropy minimization-based multiple local alignment.

Bi, Chengpeng; Rogan, Peter K.

Nucleic Acids Res ; 32(17): 4979-91, 2004.

Article in English | MEDLINE | ID: mdl-15388800

ABSTRACT

Many multimeric transcription factors recognize DNA sequence patterns by cooperatively binding to bipartite elements composed of half sites separated by a flexible spacer. We developed a novel bipartite algorithm, bipartite pattern discovery (Bipad), which produces a mathematical model based on information maximization or Shannon's entropy minimization principle, for discovery of bipartite sequence patterns. Bipad is a C++ program that applies greedy methods to search the bipartite alignment space and examines the upstream or downstream regions of co-regulated genes, looking for cis-regulatory bipartite patterns. An input sequence file with zero or one site per locus is required, and the left and right motif widths and a range of possible gap lengths must be specified. Bipad can run in either single-block or bipartite pattern search modes, and it is capable of comprehensively searching all four orientations of half-site patterns. Simulation studies showed that the accuracy of this motif discovery algorithm depends on sample size and motif conservation level, but results were independent of background composition. Bipad performed as well as or better than other pattern search algorithms in correctly identifying Escherichia coli cyclic AMP receptor protein and Bacillus subtilis sigma factor binding site sequences based on experimentally defined benchmarks. Finally, a new bipartite information weight matrix for vitamin D3 receptor/retinoid X receptor alpha (VDR/RXRalpha) binding sites was derived that comprehensively models the natural variability inherent in these sequence elements.
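The link between entropy minimization and information maximization mentioned above can be made concrete with a short sketch. It is not Bipad's objective function, which also models the gap distribution. It assumes a uniform background over four nucleotides, applies no small-sample correction, and uses hypothetical function names.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the symbols observed in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_content(sites, alphabet_size=4):
    """Total information (bits) of aligned, equal-length sites.

    Each column contributes log2(alphabet_size) minus its entropy, so an
    alignment with lower entropy carries more information.
    """
    max_entropy = math.log2(alphabet_size)
    return sum(max_entropy - column_entropy(col) for col in zip(*sites))
```

Four perfectly conserved DNA columns give 8 bits, while a column split evenly between two bases contributes only 1 bit. A search that lowers the summed column entropy of a candidate alignment therefore raises its information content by the same amount.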

Subjects

Algorithms; DNA/chemistry; DNA/metabolism; Regulatory Sequences, Nucleic Acid; Sequence Analysis, DNA/methods; Transcription Factors/metabolism; Binding Sites; Cyclic AMP Receptor Protein/metabolism; Entropy; Models, Genetic; Receptors, Calcitriol/metabolism; Receptors, Retinoic Acid/metabolism; Retinoid X Receptors; Sequence Alignment; Sigma Factor/metabolism

13.

The analysis of stress-induced duplex destabilization in long genomic DNA sequences.

Benham, Craig J; Bi, Chengpeng.

J Comput Biol ; 11(4): 519-43, 2004.

Article in English | MEDLINE | ID: mdl-15579230

ABSTRACT

We present a method for calculating predicted locations and extents of stress-induced DNA duplex destabilization (SIDD) as functions of base sequence and stress level in long DNA molecules. The base pair denaturation energies are assigned individually, so the influences of near neighbors, methylated bases, adducts, or lesions can be included. Sample calculations indicate that copolymeric energetics give results that are close to those derived when full near-neighbor energetics are used; small but potentially informative differences occur only in the calculated SIDD properties of moderately destabilized regions. The method presented here for analyzing long sequences calculates the destabilization properties within windows of fixed length N, with successive windows displaced by an offset distance d(o). The final values of the relevant destabilization parameters for each base pair are calculated as weighted averages of the values computed for each window in which that base pair appears. This approach implicitly assumes that the strength of the direct coupling between remote base pairs that is induced by the imposed stress attenuates with their separation distance. This strategy enables calculations of the destabilization properties of DNA sequences of any length, up to and including complete chromosomes. We illustrate its utility by calculating the destabilization properties of the entire E. coli genomic DNA sequence. A preliminary analysis of the results shows that promoters are associated with SIDD regions in a highly statistically significant manner, suggesting that SIDD attributes may prove useful in the computational prediction of promoter locations in prokaryotes.
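The windowing scheme described above (windows of fixed length N, successive windows displaced by an offset d(o), and per-base values combined as weighted averages) can be sketched as follows. The abstract does not give the weighting, so the triangular weights here, which favor positions near a window's center, are an assumption. `window_fn` is a hypothetical stand-in for the per-window SIDD calculation, which is not reproduced.

```python
def windowed_average(values, window_len, offset, window_fn):
    """Combine per-window results into one value per position.

    window_fn takes a window (a list slice) and returns one number per position
    in that window. Positions covered by several windows receive a weighted
    average, with weights growing toward the window center (an assumption).
    """
    if offset <= 0 or offset > window_len:
        raise ValueError("offset must be in 1..window_len so that windows overlap or abut")
    n = len(values)
    totals = [0.0] * n
    weights = [0.0] * n
    start = 0
    while True:
        end = min(start + window_len, n)
        result = window_fn(values[start:end])
        for k, v in enumerate(result):
            # Triangular weight: 1 at the window edges, largest in the middle.
            w = min(k, len(result) - 1 - k) + 1
            totals[start + k] += w * v
            weights[start + k] += w
        if end == n:
            break
        start += offset
    return [t / w for t, w in zip(totals, weights)]
```

With an identity `window_fn`, the output reproduces the input, which is a quick check that the averaging introduces no bias. Down-weighting window edges reflects the stated assumption that coupling between remote base pairs attenuates with distance, so values computed near an edge are the least reliable.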

Subjects

DNA/chemistry; DNA/genetics; Nucleic Acid Conformation; Sequence Analysis, DNA/methods; Base Pairing; Biomechanical Phenomena; Computational Biology; DNA, Bacterial/chemistry; DNA, Bacterial/genetics; DNA, Superhelical/chemistry; DNA, Superhelical/genetics; Drug Stability; Escherichia coli/genetics; Genome, Bacterial; Genomics/methods; Genomics/statistics & numerical data; Models, Biological; Nucleic Acid Denaturation; Sequence Analysis, DNA/statistics & numerical data; Thermodynamics

14.

Central nervous system lymphoma in immunocompetent patients: the North Shore-Long Island Jewish Health System experience.

Zhang, Xinmin; Chen, Qiang Hua; Farmer, Peter; Nasim, Mansoor; Demopoulos, Alexis; Devoe, Craig; Ranjan, Tulika; Eisenberg, Mark B; Schulder, Michael; Bi, Chengpeng; Li, Jian Yi.

J Clin Neurosci ; 20(1): 75-9, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23098391

ABSTRACT

Of the 74 immunocompetent patients diagnosed between July 2004 and June 2011 at the North Shore University Hospital and Long Island Jewish Medical Center with primary central nervous system lymphoma, 71 (95.9%) had diffuse large B-cell lymphomas (DLBCL). The median patient age was 68 years (range: 19-87 years) with a slight male preponderance (1.1:1). The overall median survival time was 21 months. For patients older than 70 years, the median survival time was 8 months while for those 70 years or younger, the median survival time was 27 months (p<0.01). Female patients had a worse prognosis than male patients (p<0.05, median survival time, 17 months compared to 23 months). We had enough data from 52 of these 71 patients to define the lymphomas as either germinal center B-cell-like (GCB) or activated B-cell-like (ABC) DLBCL. Of these 52 patients, 42 (80.8%) had ABC DLBCL while only 10 (19.2%) had GCB DLBCL. The patients in the GCB subgroup seemed to survive longer than the patients in the ABC subgroup, although the difference did not reach statistical significance. No statistically significant difference in overall survival was seen between patients with BCL-6 positive or negative DLBCL; or between patients with BCL-2 positive or negative DLBCL.

Subjects

Central Nervous System Neoplasms/ethnology; Central Nervous System Neoplasms/epidemiology; Immunocompetence; Lymphoma, Large B-Cell, Diffuse/ethnology; Lymphoma, Large B-Cell, Diffuse/epidemiology; Adult; Aged; Aged, 80 and over; Central Nervous System Neoplasms/mortality; DNA-Binding Proteins/metabolism; Female; Flow Cytometry; Follow-Up Studies; Humans; Jews; Kaplan-Meier Estimate; Lymphoma, Large B-Cell, Diffuse/mortality; Male; Middle Aged; Neprilysin/metabolism; New York City/epidemiology; New York City/ethnology; Proto-Oncogene Proteins c-bcl-6; Retrospective Studies; Young Adult

15.

Memetic algorithms for de novo motif-finding in biomedical sequences.

Bi, Chengpeng.

Artif Intell Med ; 56(1): 1-17, 2012 Sep.

Article in English | MEDLINE | ID: mdl-22613029

ABSTRACT

OBJECTIVES: The objectives of this study are to design and implement a new memetic algorithm for de novo motif discovery and to apply it to detect important signals hidden in various biomedical molecular sequences. METHODS AND MATERIALS: In this paper, memetic algorithms are developed and tested on de novo motif-finding problems. Several strategies are employed in the algorithm design, both to explore the multiple-sequence local alignment space efficiently and to uncover the molecular signals effectively. As a result, the implementation of the memetic motif-finding algorithm (MaMotif) has a number of key features, including a chromosome replacement operator, a chromosome alteration-aware local search operator, a truncated local search strategy, and a stochastic operation of local search imposed on individual learning. To test the new algorithm, we compare MaMotif with a few other similar algorithms using simulated and experimental data, including genomic DNA, primary microRNA sequences (let-7 family), and transmembrane protein sequences. RESULTS: The new memetic motif-finding algorithm is successfully implemented in C++ and exhaustively tested with various simulated and real biological sequences. In the simulation, MaMotif is the most time-efficient of the algorithms compared: it runs 2 times faster than the expectation maximization (EM) method and 16 times faster than the genetic algorithm-based EM hybrid. In both simulated and experimental testing, the new algorithm compares favorably with, or is superior to, the other algorithms. Notably, MaMotif successfully discovers transcription factor binding sites in chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) data, correctly uncovers the RNA splicing signals in gene expression, precisely finds the highly conserved helix motif in the transmembrane protein sequences, and correctly detects the palindromic segments in the primary microRNA sequences. CONCLUSIONS: The memetic motif-finding algorithm is effectively designed and implemented, and its applications demonstrate that it is not only time-efficient but also performs excellently when compared with other popular algorithms.
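
The general idea of a memetic algorithm (a genetic algorithm whose offspring are refined by local search) can be illustrated with a minimal Python sketch. It is not MaMotif: the abstract's specific operators (chromosome replacement, alteration-aware and truncated local search) are not reproduced, and the fitness function, pseudocounts, uniform background, mutation rate, and all function names are assumptions chosen for illustration. A chromosome is taken to be one motif start position per sequence; at least two sequences and a population of at least four are assumed.

```python
import math
import random

ALPHABET = "ACGT"

def fitness(seqs, starts, width):
    """Information content (bits) of the local alignment given by the start positions."""
    total = 0.0
    for col in range(width):
        counts = {a: 1.0 for a in ALPHABET}  # pseudocount of 1 per letter (assumption)
        for seq, s in zip(seqs, starts):
            counts[seq[s + col]] += 1.0
        n = sum(counts.values())
        for a in ALPHABET:
            p = counts[a] / n
            total += p * math.log2(p / 0.25)  # uniform background (assumption)
    return total

def local_search(seqs, starts, width, rng):
    """Individual learning: re-optimize the start position in one random sequence."""
    starts = list(starts)
    i = rng.randrange(len(seqs))
    best_pos, best_fit = starts[i], fitness(seqs, starts, width)
    for pos in range(len(seqs[i]) - width + 1):
        starts[i] = pos
        f = fitness(seqs, starts, width)
        if f > best_fit:
            best_pos, best_fit = pos, f
    starts[i] = best_pos
    return starts

def memetic_motif_search(seqs, width, pop_size=20, generations=30, seed=0):
    """Return the best chromosome found and the best fitness seen per generation."""
    rng = random.Random(seed)

    def random_chromosome():
        return [rng.randrange(len(s) - width + 1) for s in seqs]

    population = [random_chromosome() for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda c: fitness(seqs, c, width), reverse=True)
        history.append(fitness(seqs, population[0], width))
        survivors = population[: pop_size // 2]      # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(seqs))        # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:                   # random mutation of one gene
                j = rng.randrange(len(seqs))
                child[j] = rng.randrange(len(seqs[j]) - width + 1)
            children.append(local_search(seqs, child, width, rng))
        population = survivors + children
    best = max(population, key=lambda c: fitness(seqs, c, width))
    return best, history
```

Because the fitter half of the population is carried over unchanged, the best fitness recorded in `history` never decreases from one generation to the next. Local search is applied to every child here; the abstract instead describes a stochastic application of local search.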

Subjects

Algorithms ; Proteins/chemistry ; Base Sequence ; Binding Sites ; Chromatin Immunoprecipitation ; MicroRNAs/chemistry ; MicroRNAs/metabolism ; Molecular Sequence Data ; Proteins/metabolism ; Sequence Analysis, DNA/methods

16.

Allele drop-out in the MECP2 gene due to G-quadruplex and i-motif sequences when using polymerase chain reaction-based diagnosis for Rett syndrome.

Saunders, Carol J; Friez, Michael J; Patterson, Melanie; Nzabi, Masha; Zhao, Weiwei; Bi, Chengpeng.

Genet Test Mol Biomarkers ; 14(2): 241-7, 2010 Apr.

Article in English | MEDLINE | ID: mdl-20384458

ABSTRACT

Although few examples are formally documented, all polymerase chain reaction-based testing is theoretically vulnerable to allele drop-out (ADO), the failure to amplify one of the two alleles present in a cell. In a clinical setting, this can lead to a false-positive or false-negative diagnosis. We investigated the mechanisms leading to ADO in the MECP2 gene in two unrelated female patients undergoing testing for Rett syndrome. Both patients had two benign DNA variations, c.819G>T and c.1161C>T, that appeared homozygous due to ADO. Bioinformatics analyses indicate that this region of the MECP2 gene is rich in complex tertiary structures called G-quadruplexes and i-motifs, the disruption of which by the c.819G>T and c.1161C>T variants leads to preferential amplification of the variant allele. Other examples of ADO likely occur, and disruption of G-quadruplex and i-motif structures should be considered when this phenomenon is unexpected. We identify factors in both the polymerase chain reaction amplification and the sequencing steps that help overcome ADO.
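
The abstract does not say which bioinformatics tools were used. As an illustration only, the Python sketch below applies a widely used heuristic for putative G-quadruplex-forming sequences: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The cytosine analogue is used for putative i-motifs. The loop bounds, pattern names, and function name are assumptions, and a pattern hit is only a candidate, not evidence that the structure forms.

```python
import re

# Heuristic: four runs of >=3 G (or C) separated by loops of 1-7 nucleotides.
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")
IMOTIF_PATTERN = re.compile(r"C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}")

def find_putative_structures(seq):
    """Return (label, start, matched_text) for non-overlapping pattern hits."""
    seq = seq.upper()
    hits = []
    for label, pattern in (("G4", G4_PATTERN), ("i-motif", IMOTIF_PATTERN)):
        for m in pattern.finditer(seq):
            hits.append((label, m.start(), m.group()))
    return hits
```

For example, `find_putative_structures("TTGGGAGGGAGGGAGGGTT")` returns `[("G4", 2, "GGGAGGGAGGGAGGG")]`. A primer-design workflow could run such a scan over the amplicon and flag variants that fall inside a hit.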

Subjects

Methyl-CpG-Binding Protein 2/genetics ; Rett Syndrome/diagnosis ; Rett Syndrome/genetics ; Alleles ; Base Sequence ; Child ; DNA/chemistry ; DNA/genetics ; DNA Primers/genetics ; Female ; G-Quadruplexes ; Genetic Testing ; Homozygote ; Humans ; Molecular Sequence Data ; Nucleic Acid Conformation ; Polymerase Chain Reaction/methods ; Polymorphism, Single Nucleotide

17.

A Monte Carlo EM algorithm for de novo motif discovery in biomolecular sequences.

Bi, Chengpeng.

IEEE/ACM Trans Comput Biol Bioinform ; 6(3): 370-86, 2009.

Article in English | MEDLINE | ID: mdl-19644166

ABSTRACT

Motif discovery methods play pivotal roles in deciphering the genetic regulatory codes (i.e., motifs) in genomes as well as in locating conserved domains in protein sequences. The Expectation Maximization (EM) algorithm is one of the most popular methods used in de novo motif discovery. Based on the position weight matrix (PWM) updating technique, this paper presents a Monte Carlo version of the EM motif-finding algorithm that carries out stochastic sampling in local alignment space to overcome the conventional EM's main drawback of being trapped in a local optimum. The newly implemented algorithm is named the Monte Carlo EM Motif Discovery Algorithm (MCEMDA). MCEMDA starts from an initial model and then iteratively performs Monte Carlo simulation and parameter updates until convergence. A log-likelihood profiling technique, together with the top-k strategy, is introduced to cope with the phase-shift and multimodality issues in the motif discovery problem. A novel grouping motif alignment (GMA) algorithm is designed to select motifs by clustering a population of candidate local alignments, and it is successfully applied to subtle motif discovery. MCEMDA compares favorably with other popular PWM-based and word-enumerative motif algorithms when tested on simulated (l, d)-motif cases and on documented prokaryotic and eukaryotic DNA motif sequences. Finally, MCEMDA is applied to detect large blocks of conserved domains using protein benchmarks and shows excellent capacity when compared with other multiple sequence alignment methods.
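
The loop described above (start from an initial model, sample alignments, update the PWM) can be sketched in Python as follows. This is a simplified illustration, not MCEMDA itself: it assumes exactly one motif occurrence per sequence, a uniform background, a fixed pseudocount, and a fixed number of iterations instead of a convergence test. It also omits the log-likelihood profiling, top-k, and GMA components. All function names are hypothetical.

```python
import math
import random

ALPHABET = "ACGT"

def estimate_pwm(seqs, starts, width, pseudo=0.5):
    """Parameter update: column-wise letter frequencies of the current alignment."""
    pwm = []
    for col in range(width):
        counts = {a: pseudo for a in ALPHABET}  # pseudocount value is an assumption
        for seq, s in zip(seqs, starts):
            counts[seq[s + col]] += 1.0
        total = sum(counts.values())
        pwm.append({a: counts[a] / total for a in ALPHABET})
    return pwm

def position_weights(seq, pwm, background=0.25):
    """Motif-versus-background likelihood ratio for every possible start in seq."""
    width = len(pwm)
    weights = []
    for s in range(len(seq) - width + 1):
        w = 1.0
        for col in range(width):
            w *= pwm[col][seq[s + col]] / background
        weights.append(w)
    return weights

def log_likelihood(seqs, starts, pwm, background=0.25):
    """Log-likelihood ratio of the aligned sites under the PWM."""
    return sum(math.log(pwm[c][seq[s + c]] / background)
               for seq, s in zip(seqs, starts) for c in range(len(pwm)))

def mcem_motif(seqs, width, iterations=50, seed=0):
    """Return (best score, best start positions) seen during sampling."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]  # initial model
    pwm = estimate_pwm(seqs, starts, width)
    best = (log_likelihood(seqs, starts, pwm), list(starts))
    for _ in range(iterations):
        # Monte Carlo step: sample one start per sequence in proportion to its weight.
        starts = [rng.choices(range(len(w)), weights=w)[0]
                  for w in (position_weights(seq, pwm) for seq in seqs)]
        pwm = estimate_pwm(seqs, starts, width)
        score = log_likelihood(seqs, starts, pwm)
        if score > best[0]:
            best = (score, list(starts))
    return best
```

Sampling start positions instead of always taking the highest-weight one is what lets the search leave a local optimum. The best-scoring alignment is tracked separately because the sampled score can go down between iterations.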

Subjects

Algorithms ; DNA/chemistry ; Monte Carlo Method ; Proteins/chemistry ; Sequence Analysis/methods ; Amino Acid Motifs ; Amino Acid Sequence ; Animals ; Base Sequence ; Computer Simulation ; Databases, Genetic ; Markov Chains ; Models, Molecular ; Molecular Sequence Data ; Nucleic Acid Conformation ; Protein Structure, Secondary ; Transcription Factors

18.

Data augmentation algorithms for detecting conserved domains in protein sequences: a comparative study.

Bi, Chengpeng.

J Proteome Res ; 7(1): 192-201, 2008 Jan.

Article in English | MEDLINE | ID: mdl-18081244

ABSTRACT

Protein conserved domains are distinct units of molecular structure, usually associated with particular aspects of molecular function such as catalysis or binding. These conserved subsequences are often unobserved and thus need to be detected. Motif discovery methods can be used to find these unobserved domains given a set of sequences. This paper presents the data augmentation (DA) framework, which unifies a suite of motif-finding algorithms that maximize the same likelihood function by imputing the unobserved data. Data augmentation refers to methods that formulate iterative optimization by exploiting the unobserved data. Two categories of maximum likelihood-based motif-finding algorithms are illustrated under the DA framework. The first comprises deterministic algorithms, which maximize the likelihood function by iteratively performing an optimal local search in the alignment space. The second comprises stochastic algorithms, which iteratively draw motif location samples via Monte Carlo simulation while keeping track of the best solution found, that is, the one with the highest likelihood. As a result, four DA motif discovery algorithms are described, evaluated, and compared by aligning real and simulated protein sequences.

Subjects

Algorithms ; Amino Acid Motifs ; Conserved Sequence ; Structural Homology, Protein ; Information Storage and Retrieval ; Likelihood Functions ; Monte Carlo Method ; Protein Structure, Tertiary ; Sequence Alignment ; Stochastic Processes

19.

A comparative study on computational two-block motif detection: algorithms and applications.

Bi, Chengpeng; Leeder, J Steven; Vyhlidal, Carrie A.

Mol Pharm ; 5(1): 3-16, 2008.

Article in English | MEDLINE | ID: mdl-18076137

ABSTRACT

Since the completion of human genome sequencing, cataloging all genomic functional elements has been one of the challenging problems in bioinformatics. Deciphering cis-regulatory elements in the human genome remains elusive, although much effort has been expended. This paper reviews a suite of methods for two-block motif discovery, including mathematical modeling, de novo motif-finding based on multiple local alignment, and a genomic sequence scanning method for putative sites. We formulate a general method to address this challenge and compare two major existing algorithms (i.e., greedy local search and Gibbs sampling) implemented to solve the popular two-block structured motif discovery problem. We demonstrate how to use this suite of methods and apply them to human nuclear receptor response elements (i.e., protein binding sites of several relevant nuclear receptors: HNF4alpha, CAR/RXR, and PXR/RXR).
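
The scanning step for a two-block site (two conserved blocks separated by a variable-length spacer) can be illustrated with a short Python sketch. It uses simple consensus matching with a mismatch budget, not the paper's scoring model, which the abstract does not specify. The function names are hypothetical, and the half-site `AGGTCA` and spacer range in the usage note are illustrative assumptions.

```python
def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def scan_two_block(seq, block1, block2, min_gap, max_gap, max_mismatch=1):
    """Return (start, gap, site) for every window matching block1-gap-block2."""
    seq = seq.upper()
    n1, n2 = len(block1), len(block2)
    hits = []
    for start in range(len(seq) - n1 - n2 - min_gap + 1):
        if mismatches(seq[start:start + n1], block1) > max_mismatch:
            continue
        for gap in range(min_gap, max_gap + 1):
            second = start + n1 + gap          # start of the second block
            if second + n2 > len(seq):
                break
            if mismatches(seq[second:second + n2], block2) <= max_mismatch:
                hits.append((start, gap, seq[start:second + n2]))
    return hits
```

For example, `scan_two_block("CCAGGTCAAAGGTCATT", "AGGTCA", "AGGTCA", 0, 5, max_mismatch=0)` returns `[(2, 1, "AGGTCAAAGGTCA")]`, a direct repeat with a one-nucleotide spacer. Replacing the mismatch count with a position weight matrix score would be a natural refinement.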

Subjects

Algorithms ; Computational Biology ; Receptors, Cytoplasmic and Nuclear/chemistry ; Amino Acid Motifs ; Base Sequence ; Humans ; Molecular Sequence Data ; Sequence Homology, Nucleic Acid ; Software

20.

WebSIDD: server for predicting stress-induced duplex destabilized (SIDD) sites in superhelical DNA.

Bi, Chengpeng; Benham, Craig J.

Bioinformatics ; 20(9): 1477-9, 2004 Jun 12.

Article in English | MEDLINE | ID: mdl-15130924

ABSTRACT

SUMMARY: WebSIDD is a Web-based service designed to predict locations and extents of stress-induced duplex destabilization (SIDD) that occur in a double-stranded DNA molecule of specified base sequence, on which a specified level of superhelical stress is imposed. The algorithm calculates the approximate equilibrium statistical mechanical distribution of a population of identical molecules among its accessible states. The user inputs the DNA sequence, and the program outputs the calculated transition probability and destabilization energy of each base pair in the sequence. As options, the user can specify the temperature and the level of superhelicity. The values of all structural and energy parameters used in the calculation have been experimentally measured. WebSIDD should prove useful for finding SIDD-susceptible sites in genomic sequences, and correlating their occurrence with locations involved in regulatory and pathological processes. This strategy already has illuminated the roles of SIDD in diverse biological regulatory processes, including transcriptional initiation and termination, and the eukaryotic nuclear scaffold attachments that partition chromosomes into domains. AVAILABILITY: http://orange.genomecenter.ucdavis.edu/benham/sidd/index.html
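
The two per-base-pair outputs can be illustrated on a toy ensemble with the Python sketch below. It is not the WebSIDD algorithm: the accessible states and their free energies are supplied by the caller, and no superhelical energetics are modeled. The destabilization energy is taken here, as an assumption, to be the mean free energy of the states in which a base pair is open minus the ensemble mean. The function name is hypothetical.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)

def sidd_profile(state_energies, n_bp, temperature=310.0):
    """state_energies maps a frozenset of open base-pair indices to a free energy
    in kcal/mol. Returns per-base-pair opening probabilities and destabilization
    energies under a Boltzmann distribution over the supplied states."""
    beta = 1.0 / (R * temperature)
    g_min = min(state_energies.values())   # shift energies for numerical stability
    weights = {s: math.exp(-beta * (g - g_min)) for s, g in state_energies.items()}
    z = sum(weights.values())               # partition function (shifted)
    mean_g = sum(weights[s] * state_energies[s] for s in weights) / z
    prob, destab = [], []
    for i in range(n_bp):
        open_states = [s for s in weights if i in s]
        z_i = sum(weights[s] for s in open_states)
        prob.append(z_i / z)
        if z_i > 0:
            mean_g_i = sum(weights[s] * state_energies[s] for s in open_states) / z_i
            destab.append(mean_g_i - mean_g)
        else:
            destab.append(math.inf)          # base pair is never open in this ensemble
    return prob, destab
```

A base pair that is open in low-energy states gets a high opening probability and a destabilization energy near zero. One that opens only in costly states gets a large positive destabilization energy.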

Subjects

Algorithms ; DNA/chemistry ; DNA/genetics ; Internet ; Models, Chemical ; Nucleic Acid Conformation ; Sequence Analysis, DNA/methods ; Computing Methodologies ; DNA/analysis ; DNA Damage ; Models, Molecular ; Nucleic Acid Denaturation ; Online Systems ; Oxidative Stress/genetics ; Software ; Structure-Activity Relationship
