Results 1 - 18 of 18
1.
Nucleic Acids Res ; 48(W1): W275-W286, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32421805

ABSTRACT

A group of genes controlled as a unit, usually by the same repressor or activator gene, is known as a regulon. The ability to identify active regulons within a specific cell type, i.e., cell-type-specific regulons (CTSR), provides an extraordinary opportunity to pinpoint crucial regulators and target genes responsible for complex diseases. However, the identification of CTSRs from single-cell RNA-Seq (scRNA-Seq) data is computationally challenging. We introduce IRIS3, the first-of-its-kind web server for CTSR inference from scRNA-Seq data for human and mouse. IRIS3 is an easy-to-use server empowered by over 20 functionalities to support comprehensive interpretations and graphical visualizations of identified CTSRs. CTSR data can be used to reliably characterize and distinguish the corresponding cell type from others and can be combined with other computational or experimental analyses for biomedical studies. CTSRs can, therefore, aid in the discovery of major regulatory mechanisms and allow reliable construction of global transcriptional regulation networks encoded in a specific cell type. The broader impact of IRIS3 includes, but is not limited to, investigation of complex disease hierarchies and heterogeneity, causal gene regulatory network construction, and drug development. IRIS3 is freely accessible from https://bmbl.bmi.osumc.edu/iris3/ with no login requirement.


Subjects
RNA-Seq , Regulon , Single-Cell Analysis , Software , Animals , Brain/metabolism , Cluster Analysis , Mice
2.
Brief Bioinform ; 20(6): 2044-2054, 2019 11 27.
Article in English | MEDLINE | ID: mdl-30099484

ABSTRACT

Differential gene expression (DGE) analysis is one of the most common applications of RNA-sequencing (RNA-seq) data. This process allows for the elucidation of differentially expressed genes across two or more conditions and is widely used in many applications of RNA-seq data analysis. Interpretation of the DGE results can be nonintuitive and time consuming due to the variety of formats based on the tool of choice and the numerous pieces of information provided in these results files. Here we reviewed DGE results analysis from a functional point of view for various visualizations. We also provide an R/Bioconductor package, Visualization of Differential Gene Expression Results using R, which generates information-rich visualizations for the interpretation of DGE results from three widely used tools, Cuffdiff, DESeq2 and edgeR. The implemented functions are also tested on five real-world data sets, consisting of one human, one Malus domestica and three Vitis riparia data sets.


Subjects
Gene Expression , RNA Sequence Analysis , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods
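As a rough illustration of how DGE result tables are commonly read before plotting (for example, when coloring points in a volcano plot), the sketch below labels each gene as up-regulated, down-regulated, or not significant. The row layout, the gene names, and the cutoffs (absolute log2 fold change of at least 1, adjusted p-value below 0.05) are assumptions chosen for illustration; they are not taken from the package described in this abstract.

```python
from typing import NamedTuple


class DgeRow(NamedTuple):
    gene: str
    log2_fold_change: float
    adjusted_p: float


def classify(row: DgeRow, lfc_cutoff: float = 1.0, alpha: float = 0.05) -> str:
    """Label one gene the way a volcano plot would typically color it."""
    if row.adjusted_p >= alpha or abs(row.log2_fold_change) < lfc_cutoff:
        return "not significant"
    return "up" if row.log2_fold_change > 0 else "down"


# Hypothetical rows; Cuffdiff, DESeq2 and edgeR each name their columns differently.
rows = [
    DgeRow("geneA", 2.3, 0.001),
    DgeRow("geneB", -1.7, 0.020),
    DgeRow("geneC", 0.4, 0.010),
    DgeRow("geneD", 3.1, 0.300),
]

labels = {row.gene: classify(row) for row in rows}
for gene, label in labels.items():
    print(f"{gene}: {label}")
```

Requiring both a fold-change cutoff and a significance cutoff is a common convention, but the appropriate values depend on the experiment.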
3.
Brief Bioinform ; 19(5): 1069-1081, 2018 09 28.
Article in English | MEDLINE | ID: mdl-28334268

ABSTRACT

Transcription factors are proteins that bind to specific DNA sequences and play important roles in controlling the expression levels of their target genes. Hence, prediction of transcription factor binding sites (TFBSs) provides a solid foundation for inferring gene regulatory mechanisms and building regulatory networks for a genome. Chromatin immunoprecipitation sequencing (ChIP-seq) technology can generate large-scale experimental data for such protein-DNA interactions, providing an unprecedented opportunity to identify TFBSs (a.k.a. cis-regulatory motifs). The bottleneck, however, is the lack of robust mathematical models, as well as efficient computational methods for TFBS prediction to make effective use of massive ChIP-seq data sets in the public domain. The purpose of this study is to review existing motif-finding methods for ChIP-seq data from an algorithmic perspective and provide new computational insight into this field. The state-of-the-art methods were shown through summarizing eight representative motif-finding algorithms along with corresponding challenges, and introducing some important relative functions according to specific biological demands, including discriminative motif finding and cofactor motifs analysis. Finally, potential directions and plans for ChIP-seq-based motif-finding tools were showcased in support of future algorithm development.


Subjects
Algorithms , Gene Regulatory Networks , Software , Base Sequence , Binding Sites/genetics , Chromatin Immunoprecipitation/statistics & numerical data , Computational Biology/methods , DNA/genetics , DNA/metabolism , Humans , DNA Sequence Analysis/statistics & numerical data , Transcription Factors/metabolism
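Many motif-finding methods represent a binding site as a position weight matrix (PWM) and scan sequences for high-scoring windows. The sketch below shows that general idea only; it is not one of the eight algorithms reviewed in this abstract. The aligned sites, the scanned sequence, the pseudocount, and the uniform background frequency are all hypothetical.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites."""
    length = len(sites[0])
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2((column.count(base) + pseudocount) / total / background)
            for base in BASES
        })
    return pwm


def best_hit(sequence, pwm):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(pwm)
    scores = [
        (start, sum(pwm[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]
    return max(scores, key=lambda item: item[1])


# Hypothetical aligned binding sites and a hypothetical sequence to scan.
sites = ["TGACTC", "TGACTA", "TGAGTC", "TGACTC"]
pwm = build_pwm(sites)
start, score = best_hit("AAATGACTCAAGG", pwm)
print(start, round(score, 2))
```

The pseudocount keeps unseen bases from producing a log of zero; with ChIP-seq-scale data, the background model and scoring threshold would need more careful treatment.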
4.
Brief Bioinform ; 19(6): 1415-1429, 2018 11 27.
Article in English | MEDLINE | ID: mdl-28481971

ABSTRACT

Metagenomic and metatranscriptomic sequencing approaches are more frequently being used to link microbiota to important diseases and ecological changes. Many analyses have been used to compare the taxonomic and functional profiles of microbiota across habitats or individuals. While a large portion of metagenomic analyses focus on species-level profiling, some studies use strain-level metagenomic analyses to investigate the relationship between specific strains and certain circumstances. Metatranscriptomic analysis provides another important insight into activities of genes by examining gene expression levels of microbiota. Hence, combining metagenomic and metatranscriptomic analyses will help understand the activity or enrichment of a given gene set, such as drug-resistant genes among microbiome samples. Here, we summarize existing bioinformatics tools of metagenomic and metatranscriptomic data analysis, the purpose of which is to assist researchers in deciding the appropriate tools for their microbiome studies. Additionally, we propose an Integrated Meta-Function mapping pipeline to incorporate various reference databases and accelerate functional gene mapping procedures for both metagenomic and metatranscriptomic analyses.


Subjects
Computational Biology , Metagenome , Microbiota , Transcriptome , 16S Ribosomal RNA/genetics
5.
Bioinformatics ; 35(21): 4474-4477, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31116375

ABSTRACT

MOTIVATION: Metagenomic and metatranscriptomic analyses can provide an abundance of information related to microbial communities. However, straightforward analysis of these data does not provide optimal results; the data types must be integrated to thoroughly investigate these microbiomes and their environmental interactions. RESULTS: Here, we present MetaQUBIC, an integrated biclustering-based computational pipeline for gene module detection that integrates both metagenomic and metatranscriptomic data. Additionally, we used this pipeline to investigate 735 paired DNA and RNA human gut microbiome samples, resulting in a comprehensive hybrid gene expression matrix of 2.3 million cross-species genes in the 735 human fecal samples and 155 functionally enriched gene modules. We believe both the MetaQUBIC pipeline and the generated comprehensive human gut hybrid expression matrix will facilitate further investigations into multiple levels of microbiome studies. AVAILABILITY AND IMPLEMENTATION: The package is freely available at https://github.com/OSU-BMBL/metaqubic. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Gastrointestinal Microbiome , Metagenome , Feces , Humans , Metagenomics , Transcriptome
6.
PLoS Comput Biol ; 15(2): e1006792, 2019 02.
Article in English | MEDLINE | ID: mdl-30763315

ABSTRACT

Next-Generation Sequencing has made available substantial amounts of large-scale Omics data, providing unprecedented opportunities to understand complex biological systems. Specifically, the value of RNA-Sequencing (RNA-Seq) data has been confirmed in inferring how gene regulatory systems will respond under various conditions (bulk data) or cell types (single-cell data). RNA-Seq can generate genome-scale gene expression profiles that can be further analyzed using correlation analysis, co-expression analysis, clustering, differential gene expression (DGE), among many other studies. While these analyses can provide invaluable information related to gene expression, integration and interpretation of the results can prove challenging. Here we present a tool called IRIS-EDA, which is a Shiny web server for expression data analysis. It provides a straightforward and user-friendly platform for performing numerous computational analyses on user-provided RNA-Seq or Single-cell RNA-Seq (scRNA-Seq) data. Specifically, three commonly used R packages (edgeR, DESeq2, and limma) are implemented in the DGE analysis with seven unique experimental design functionalities, including a user-specified design matrix option. Seven discovery-driven methods and tools (correlation analysis, heatmap, clustering, biclustering, Principal Component Analysis (PCA), Multidimensional Scaling (MDS), and t-distributed Stochastic Neighbor Embedding (t-SNE)) are provided for gene expression exploration which is useful for designing experimental hypotheses and determining key factors for comprehensive DGE analysis. Furthermore, this platform integrates seven visualization tools in a highly interactive manner, for improved interpretation of the analyses. It is noteworthy that, for the first time, IRIS-EDA provides a framework to expedite submission of data and results to NCBI's Gene Expression Omnibus following the FAIR (Findable, Accessible, Interoperable and Reusable) Data Principles. 
IRIS-EDA is freely available at http://bmbl.sdstate.edu/IRIS/.


Subjects
Computational Biology/methods , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Cultured Cells , Cluster Analysis , Factual Databases , Humans , RNA/analysis , RNA/genetics , RNA/metabolism
7.
Bioinformatics ; 33(16): 2586-2588, 2017 Aug 15.
Article in English | MEDLINE | ID: mdl-28419194

ABSTRACT

MOTIVATION: Motif identification and analyses are important and have been long-standing computational problems in bioinformatics. Substantial efforts have been made in this field during the past several decades. However, the lack of intuitive and integrative web servers impedes the progress of making effective use of emerging algorithms and tools. RESULTS: Here we present an integrated web server, DMINDA 2.0, which contains: (i) five motif prediction and analyses algorithms, including a phylogenetic footprinting framework; (ii) 2125 species with complete genomes to support the above five functions, covering animals, plants and bacteria and (iii) bacterial regulon prediction and visualization. AVAILABILITY AND IMPLEMENTATION: DMINDA 2.0 is freely available at http://bmbl.sdstate.edu/DMINDA2. CONTACT: qin.ma@sdstate.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genomics/methods , Nucleotide Motifs , Nucleic Acid Regulatory Sequences , DNA Sequence Analysis/methods , Software , Algorithms , Bacteria/genetics , DNA , Eukaryota/genetics , Genome , Phylogeny
10.
Clin Pharmacol Ther ; 111(5): 1075-1083, 2022 05.
Article in English | MEDLINE | ID: mdl-35034348

ABSTRACT

The SLCO1B1 genotype is known to influence patient adherence to statin therapy, in part by increasing the risk for statin-associated musculoskeletal symptoms (SAMSs). The SLCO1B1*5 allele has previously been associated with simvastatin discontinuation and SAMSs. Prior analyses of the relationship between SLCO1B1*5 and atorvastatin muscle side effects have been inconclusive due to insufficient power. We now quantify the impact of SLCO1B1*5 on atorvastatin discontinuation and SAMSs in a large observational cohort using electronic medical record data from a single health care system. In our study cohort (n = 1,627 patients exposed to atorvastatin during the course of routine clinical care), 56% (n = 912 of 1,627 patients) discontinued atorvastatin and 18% (n = 303 of 1,627 patients) developed SAMSs. A univariate model revealed that SLCO1B1*5 increased the likelihood that patients would stop atorvastatin during routine care (odds ratio 1.2; 95% confidence interval (CI), 1.1-1.5; P = 0.04). A multivariate Cox proportional hazards model further demonstrated that this same variant was associated with time to atorvastatin discontinuation (hazard ratio 1.2; 95% CI, 1.1-1.4; P = 0.004). Additional time-to-event analyses also revealed that SLCO1B1*5 was associated with SAMSs (hazard ratio 1.4; 95% CI, 1.1-1.7; P = 0.02). Atorvastatin discontinuation was associated with SAMSs (odds ratio 1.67; P = 0.0001) in our cohort.


Subjects
Hydroxymethylglutaryl-CoA Reductase Inhibitors , Alleles , Atorvastatin/adverse effects , Humans , Hydroxymethylglutaryl-CoA Reductase Inhibitors/adverse effects , Liver-Specific Organic Anion Transporter 1/genetics , Muscles , Single Nucleotide Polymorphism , Pyrroles/therapeutic use , Simvastatin/adverse effects
11.
Trends Biotechnol ; 38(9): 1007-1022, 2020 09.
Article in English | MEDLINE | ID: mdl-32818441

ABSTRACT

Fast-developing single-cell multimodal omics (scMulti-omics) technologies enable the measurement of multiple modalities, such as DNA methylation, chromatin accessibility, RNA expression, protein abundance, gene perturbation, and spatial information, from the same cell. scMulti-omics can comprehensively explore and identify cell characteristics, while also presenting challenges to the development of computational methods and tools for integrative analyses. Here, we review these integrative methods and summarize the existing tools for studying a variety of scMulti-omics data. The various functionalities and practical challenges in using the available tools in the public domain are explored through several case studies. Finally, we identify remaining challenges and future trends in scMulti-omics modeling and analyses.


Subjects
Computational Biology , Genomics/trends , Proteomics/trends , Single-Cell Analysis/trends , Algorithms , DNA Methylation/genetics , Humans
12.
Math Biosci ; 310: 24-30, 2019 04.
Article in English | MEDLINE | ID: mdl-30768948

ABSTRACT

Chronic kidney disease (CKD) is prevalent across the world, and kidney function is well defined by an estimated glomerular filtration rate (eGFR). The progression of kidney disease can be predicted if the future eGFR can be accurately estimated using predictive analytics. In this study, we developed and validated a prediction model of eGFR by data extracted from a regional health system. This dataset includes demographic, clinical and laboratory information from primary care clinics. The model was built using Random Forest regression and evaluated using Goodness-of-fit statistics and discrimination metrics. After data preprocessing, the patient cohort for model development and validation contained 61,740 patients. The final model included eGFR, age, gender, body mass index (BMI), obesity, hypertension, and diabetes, which achieved a mean coefficient of determination of 0.95. The estimated eGFRs were used to classify patients into CKD stages with high macro-averaged and micro-averaged metrics. In conclusion, a model using real-world electronic medical records (EMR) data can accurately predict future kidney functions and provide clinical decision support.


Subjects
Disease Progression , Statistical Models , Health Care Outcome Assessment , Chronic Renal Insufficiency/diagnosis , Chronic Renal Insufficiency/epidemiology , Adult , Aged , Comorbidity , Electronic Health Records/statistics & numerical data , Female , Glomerular Filtration Rate , Humans , Male , Middle Aged , Prognosis
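The two evaluation steps mentioned in this abstract, a coefficient of determination for predicted eGFR and a classification of patients into CKD stages, can be sketched as follows. The stage boundaries used here are the standard KDIGO GFR categories, which is an assumption, since the abstract does not state which staging scheme the study applied. The observed and predicted values are hypothetical and are not data from the study.

```python
def ckd_stage(egfr: float) -> str:
    """Map an eGFR value (mL/min/1.73 m^2) to a KDIGO GFR category."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical observed and predicted eGFR values, not data from the study.
observed = [95.0, 72.0, 50.0, 38.0, 22.0, 10.0]
predicted = [92.0, 70.0, 47.0, 41.0, 25.0, 12.0]

r2 = r_squared(observed, predicted)
stage_matches = sum(ckd_stage(o) == ckd_stage(p) for o, p in zip(observed, predicted))
accuracy = stage_matches / len(observed)
print(f"R^2 = {r2:.3f}, stage agreement = {accuracy:.2f}")
```

Overall stage agreement corresponds to a micro-averaged accuracy; macro-averaged metrics would instead average per-stage scores so that rare stages count equally.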
13.
Hortic Res ; 6: 86, 2019.
Article in English | MEDLINE | ID: mdl-31666956

ABSTRACT

The CBF signal pathway is responsible for a significant portion of plant responses to low temperature and freezing. Overexpression of CBF genes in model organisms such as Arabidopsis thaliana enhances abiotic stress tolerance but also reduces growth. In addition to these effects, overexpression of the peach (Prunus persica [L.] Batsch) CBF1 gene in transgenic apple (Malus x domestica Borkh.) line T166 also results in early entry into and late exit from dormancy. Although the regulation of dormancy-induction and dormancy-release occur while the CBF regulon is operative in perennial, woody plants, how overexpression of CBF1 affects these dormancy-related changes in gene expression is incompletely understood. The objective of the present study was to characterize global changes in gene expression in peach CBF1-overexpressing and non-transformed apple bark tissues at different states of dormancy via RNA-seq. RNA-seq bioinformatics data was confirmed by RT-qPCR on a number of genes. Results indicate that the greatest number of significantly differentially expressed genes (DEGs) occurred in April when dormancy release and bud break normally occur but are delayed in Line T166. Genes involved in storage and inactivation of auxin, GA, and cytokinin were generally upregulated in T166 in April, while those for biosynthesis, uptake or signal transduction were generally downregulated in T166. Genes for cell division and cambial growth were also downregulated in T166 relative to the non-transformed line. These data suggest that overexpression of the peach CBF1 gene impacts growth hormone homeostasis and as a result the activation of growth in the spring, and most likely growth cessation in the fall as well.

14.
Microbiol Resour Announc ; 8(15)2019 Apr 11.
Article in English | MEDLINE | ID: mdl-30975803

ABSTRACT

We report here the improved draft genome sequence of Bacillus sp. strain YF23, a bacterium originally isolated from switchgrass (Panicum virgatum) plants and shown to exhibit plant growth-promoting activity. The genome comprised 5.82 Mbp, containing 5,933 genes, with 193 as RNA genes, and a GC content of 35.10%.
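For readers unfamiliar with the GC content statistic quoted in genome announcements such as this one, a minimal sketch of the calculation follows. The toy contig is hypothetical, and excluding ambiguous bases (such as N) from the denominator is one common convention, assumed here rather than taken from the announcement.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * gc / acgt


# Hypothetical toy contig; a real assembly would be read from a FASTA file.
contig = "ATGCGCATTANNATAT"
print(f"GC content: {gc_content(contig):.2f}%")
```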

15.
Microbiol Resour Announc ; 8(15)2019 Apr 11.
Article in English | MEDLINE | ID: mdl-30975811

ABSTRACT

We report here the improved draft genome sequence of Pseudomonas poae strain A2-S9, a bacterium that was originally isolated from switchgrass plants and exhibited the capacity for plant growth promotion. Its genome has a size of 6.68 Mbp and a GC content of 61.3%. The genome encodes 6,022 predicted protein-coding genes.

16.
Hortic Res ; 6: 64, 2019.
Article in English | MEDLINE | ID: mdl-31069086

ABSTRACT

Understanding how root systems modulate shoot system phenotypes is a fundamental question in plant biology and will be useful in developing resilient agricultural crops. Grafting is a common horticultural practice that joins the roots (rootstock) of one plant to the shoot (scion) of another, providing an excellent method for investigating how these two organ systems affect each other. In this study, we used the French-American hybrid grapevine 'Chambourcin' (Vitis L.) as a model to explore the rootstock-scion relationship. We examined leaf shape, ion concentrations, and gene expression in 'Chambourcin' grown ungrafted as well as grafted to three different rootstocks ('SO4', '1103P' and '3309C') across 2 years and three different irrigation treatments. We found that a significant amount of the variation in leaf shape could be explained by the interaction between rootstock and irrigation. For ion concentrations, the primary source of variation identified was the position of a leaf in a shoot, although rootstock and rootstock by irrigation interaction also explained a significant amount of variation for most ions. Lastly, we found rootstock-specific patterns of gene expression in grafted plants when compared to ungrafted vines. Thus, our work reveals the subtle and complex effect of grafting on 'Chambourcin' leaf morphology, ionomics, and gene expression.

17.
Front Genet ; 9: 313, 2018.
Article in English | MEDLINE | ID: mdl-30154828

ABSTRACT

One of the main benefits of using modern RNA-Sequencing (RNA-Seq) technology is the more accurate gene expression estimations compared with previous generations of expression data, such as the microarray. However, numerous issues can result in the possibility that an RNA-Seq read can be mapped to multiple locations on the reference genome with the same alignment scores, which occurs in plant, animal, and metagenome samples. Such a read is called a multiple-mapping read (MMR). The impact of these MMRs is reflected in gene expression estimation and all downstream analyses, including differential gene expression, functional enrichment, etc. Current analysis pipelines lack the tools to effectively test the reliability of gene expression estimations and are thus incapable of ensuring the validity of all downstream analyses. Our investigation into 95 RNA-Seq datasets from seven plant and animal species (totaling 1,951 GB) indicates that an average of roughly 22% of all reads are MMRs. Here we present a machine learning-based tool called GeneQC (Gene expression Quality Control), which can accurately estimate the reliability of each gene's expression level derived from an RNA-Seq dataset. The underlying algorithm is designed based on extracted genomic and transcriptomic features, which are then combined using elastic-net regularization and mixture model fitting to provide a clearer picture of mapping uncertainty for each gene. GeneQC allows researchers to determine reliable expression estimations and conduct further analysis on the gene expression that is of sufficient quality. This tool also enables researchers to investigate continued re-alignment methods to determine more accurate gene expression estimates for those with low reliability. Application of GeneQC reveals a high level of mapping uncertainty in plant samples and limited, severe mapping uncertainty in animal samples. GeneQC is freely available at http://bmbl.sdstate.edu/GeneQC/home.html.
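The definition of a multiple-mapping read given in this abstract (a read whose best alignment score is shared by more than one genomic location) can be illustrated with a short sketch. This is not the GeneQC algorithm, which the abstract describes as combining genomic and transcriptomic features through elastic-net regularization and mixture model fitting; it only counts MMRs. The record layout, read names, locations, and scores are hypothetical.

```python
from collections import defaultdict


def multi_mapping_fraction(alignments):
    """Fraction of reads whose best alignment score occurs at two or more locations.

    `alignments` is an iterable of (read_id, location, score) tuples.
    """
    scores_by_read = defaultdict(list)
    for read_id, _location, score in alignments:
        scores_by_read[read_id].append(score)

    multi = sum(
        1 for scores in scores_by_read.values() if scores.count(max(scores)) > 1
    )
    return multi / len(scores_by_read)


# Hypothetical alignment records; a real pipeline would parse them from SAM/BAM output.
alignments = [
    ("read1", "chr1:100", 60),
    ("read2", "chr1:500", 60),
    ("read2", "chr3:900", 60),  # same best score at two loci: a multi-mapping read
    ("read3", "chr2:250", 58),
    ("read3", "chr5:40", 42),   # lower-scoring secondary hit: still uniquely mapped
    ("read4", "chr4:10", 55),
]

fraction = multi_mapping_fraction(alignments)
print(f"{fraction:.0%} of reads are multi-mapping")
```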

18.
Genes (Basel) ; 9(6)2018 May 30.
Article in English | MEDLINE | ID: mdl-29849014

ABSTRACT

Regulons, which serve as co-regulated gene groups contributing to the transcriptional regulation of microbial genomes, have the potential to aid in understanding of underlying regulatory mechanisms. In this study, we designed a novel computational pipeline, regulon identification based on comparative genomics and transcriptomics analysis (RECTA), for regulon prediction related to the gene regulatory network under certain conditions. To demonstrate the effectiveness of this tool, we implemented RECTA on Lactococcus lactis MG1363 data to elucidate acid-response regulons. A total of 51 regulons were identified, 14 of which have computationally verified significance. Five of these 14 regulons were computationally predicted to be connected with acid stress response. Validated by literature, 33 genes in Lactococcus lactis MG1363 were found to have orthologous genes which were associated with six regulons. An acid response related regulatory network was constructed, involving two trans-membrane proteins, eight regulons (llrA, llrC, hllA, ccpA, NHP6A, rcfB, regulons #8 and #39), nine functional modules, and 33 genes with orthologous genes known to be associated with acid stress. The predicted response pathways could serve as promising candidates for better acid tolerance engineering in Lactococcus lactis. Our RECTA pipeline provides an effective way to construct a reliable gene regulatory network through regulon elucidation, and it can be effectively applied to other bacterial genomes where the elucidation of the transcriptional regulation network is needed.
