Search | BVS España Search Portal

A Model-Based Approach for Identifying Functional Intergenic Transcribed Regions and Noncoding RNAs.

Lloyd, John P; Tsai, Zing Tsung-Yeh; Sowers, Rosalie P; Panchy, Nicholas L; Shiu, Shin-Han.

Mol Biol Evol ; 35(6): 1422-1436, 2018 06 01.

Article in English | MEDLINE | ID: mdl-29554332

ABSTRACT

With advances in transcript profiling, the presence of transcriptional activities in intergenic regions has been well established. However, whether intergenic expression reflects transcriptional noise or activity of novel genes remains unclear. We identified intergenic transcribed regions (ITRs) in 15 diverse flowering plant species and found that the amount of intergenic expression correlates with genome size, a pattern that could be expected if intergenic expression is largely nonfunctional. To further assess the functionality of ITRs, we first built machine learning models using Arabidopsis thaliana as a model that accurately distinguish functional sequences (benchmark protein-coding and RNA genes) and likely nonfunctional ones (pseudogenes and unexpressed intergenic regions) by integrating 93 biochemical, evolutionary, and sequence-structure features. Next, by applying the models genome-wide, we found that 4,427 ITRs (38%) and 796 annotated ncRNAs (44%) had features significantly similar to benchmark protein-coding or RNA genes and thus were likely parts of functional genes. Approximately 60% of ITRs and ncRNAs were more similar to nonfunctional sequences and were likely transcriptional noise. The predictive framework established here provides not only a comprehensive look at how functional, genic sequences are distinct from likely nonfunctional ones, but also a new way to differentiate novel genes from genomic regions with noisy transcriptional activities.

Subject(s)

DNA, Intergenic , Genome Size , Genome, Plant , Models, Genetic , RNA, Untranslated , DNA Methylation , Machine Learning , Magnoliopsida , Phenotype , Transcription, Genetic

Inferring direction of associations between histone modifications using a neural processes-based framework.

Ganesan, Ananthakrishnan; Dermadi, Denis; Kalesinskas, Laurynas; Donato, Michele; Sowers, Rosalie; Utz, Paul J; Khatri, Purvesh.

iScience ; 26(1): 105756, 2023 Jan 20.

Article in English | MEDLINE | ID: mdl-36619977

ABSTRACT

Current technologies do not allow predicting interactions between histone post-translational modifications (HPTMs) at a system-level. We describe a computational framework, imputation-followed-by-inference, to predict directed association between two HPTMs using EpiTOF, a mass cytometry-based platform that allows profiling multiple HPTMs at a single-cell resolution. Using EpiTOF profiles of >55,000,000 peripheral mononuclear blood cells from 158 healthy human subjects, we show that neural processes (NP) have significantly higher accuracy than linear regression and k-nearest neighbors models to impute the abundance of an HPTM. Next, we infer the direction of association to show we recapitulate known HPTM associations and identify several previously unidentified ones in healthy individuals. Using this framework in an influenza vaccine cohort, we identify changes in associations between 6 pairs of HPTMs 30 days following vaccination, of which several have been shown to be involved in innate memory. These results demonstrate the utility of our framework in identifying directed interactions between HPTMs.

The Penn State Protein Ladder system for inexpensive protein molecular weight markers.

Santilli, Ryan T; Williamson, John E; Shibata, Yoshitaka; Sowers, Rosalie P; Fleischman, Andrew N; Tan, Song.

Sci Rep ; 11(1): 16703, 2021 08 18.

Article in English | MEDLINE | ID: mdl-34408191

ABSTRACT

We have created the Penn State Protein Ladder system to produce protein molecular weight markers easily and inexpensively (less than a penny a lane). The system includes plasmids which express 10, 15, 20, 30, 40, 50, 60, 80 and 100 kD proteins in E. coli. Each protein migrates appropriately on SDS-PAGE gels, is expressed at very high levels (10-50 mg per liter of culture), is easy to purify via histidine tags and can be detected directly on Western blots via engineered immunoglobulin binding domains. We have also constructed plasmids to express 150 and 250 kD proteins. For more efficient production, we have created two polycistronic expression vectors which coexpress the 10, 30, 50, 100 kD proteins or the 20, 40, 60, 80 kD proteins. 50 ml of culture is sufficient to produce 20,000 lanes of individual ladder protein or 3750 lanes of each set of coexpressed ladder proteins. These Penn State Protein Ladder expression plasmids also constitute useful reagents for teaching laboratories to demonstrate recombinant expression in E. coli and affinity protein purification, and to research laboratories desiring positive controls for recombinant protein expression and purification.

Subject(s)

Electrophoresis, Polyacrylamide Gel/standards , Escherichia coli/chemistry , Plasmids , Cloning, Molecular , Escherichia coli/genetics , Molecular Weight , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Reference Standards
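
The yield figures quoted in the abstract above can be cross-checked with a little arithmetic. The sketch below is not from the paper: it assumes, hypothetically, that all expressed protein is recovered after purification and that the 20,000-lane figure applies across the stated 10-50 mg per liter range. The function name `protein_per_lane_ng` is illustrative only.

```python
def protein_per_lane_ng(yield_mg_per_l, culture_ml, lanes):
    """Nanograms of protein available per gel lane.

    Assumes (hypothetically) 100% recovery of the expressed protein.
    """
    total_mg = yield_mg_per_l * culture_ml / 1000.0  # mg produced in the culture
    return total_mg * 1e6 / lanes                    # 1 mg = 1e6 ng

# Figures from the abstract: 10-50 mg per liter, 50 ml of culture, 20,000 lanes.
low = protein_per_lane_ng(10, 50, 20000)
high = protein_per_lane_ng(50, 50, 20000)
print(f"{low:.0f}-{high:.0f} ng per lane")
```

Under these assumptions each lane would receive roughly 25 to 125 ng of a given ladder protein; actual loading amounts depend on purification losses that the abstract does not report.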

Evolutionary characteristics of intergenic transcribed regions indicate rare novel genes and widespread noisy transcription in the Poaceae.

Lloyd, John P; Bowman, Megan J; Azodi, Christina B; Sowers, Rosalie P; Moghe, Gaurav D; Childs, Kevin L; Shiu, Shin-Han.

Sci Rep ; 9(1): 12122, 2019 08 20.

Article in English | MEDLINE | ID: mdl-31431676

ABSTRACT

Extensive transcriptional activity occurring in intergenic regions of genomes has raised the question whether intergenic transcription represents the activity of novel genes or noisy expression. To address this, we evaluated cross-species and post-duplication sequence and expression conservation of intergenic transcribed regions (ITRs) in four Poaceae species. Among 43,301 ITRs across the four species, 34,460 (80%) are species-specific. ITRs found across species tend to be more divergent in expression and have more recent duplicates compared to annotated genes. To assess if ITRs are functional (under selection), machine learning models were established in Oryza sativa (rice) that could accurately distinguish between phenotype genes and pseudogenes (area under curve-receiver operating characteristic = 0.94). Based on the models, 584 (8%) and 4391 (61%) rice ITRs are classified as likely functional and nonfunctional with high confidence, respectively. ITRs with conserved expression and ancient retained duplicates, features that were not part of the model, are frequently classified as likely-functional, suggesting these characteristics could serve as pragmatic rules of thumb for identifying candidate sequences likely to be under selection. This study also provides a framework to identify novel genes using comparative transcriptomic data to improve genome annotation that is fundamental for connecting genotype to phenotype in crop and model systems.

Subject(s)

DNA, Intergenic , Genes, Plant , Poaceae/genetics , Transcription, Genetic , Biological Evolution , Genome, Plant , Machine Learning , Models, Genetic , Phenotype , Pseudogenes , Species Specificity
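
The abstract above reports model performance as an area under the receiver operating characteristic curve (0.94). As a minimal sketch of what that metric measures, the code below computes it as the fraction of (positive, negative) pairs in which the positive example receives the higher score. This is not the authors' code; the scores and labels are hypothetical toy values, with 1 standing in for a phenotype gene and 0 for a pseudogene.

```python
def auc_roc(scores, labels):
    """Rank-based AUC-ROC: probability that a randomly chosen positive
    outscores a randomly chosen negative (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores, for illustration only.
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_auc = auc_roc(scores, labels)
print(toy_auc)
```

A value of 1.0 means every positive outranks every negative, and 0.5 corresponds to random ordering. The pairwise loop is quadratic, which is fine for illustration but not for genome-scale data.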
