Results 1 - 20 of 68
1.
Nat Commun ; 15(1): 4825, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38862542

ABSTRACT

Our previous research revealed a key microRNA signature that is associated with spaceflight that can be used as a biomarker and to develop countermeasure treatments to mitigate the damage caused by space radiation. Here, we expand on this work to determine the biological factors rescued by the countermeasure treatment. We performed RNA-sequencing and transcriptomic analysis on 3D microvessel cell cultures exposed to simulated deep space radiation (0.5 Gy of Galactic Cosmic Radiation) with and without the antagonists to three microRNAs: miR-16-5p, miR-125b-5p, and let-7a-5p (i.e., antagomirs). Significant reduction of inflammation and DNA double strand breaks (DSBs) activity and rescue of mitochondria functions are observed after antagomir treatment. Using data from astronaut participants in the NASA Twin Study, Inspiration4, and JAXA missions, we reveal the genes and pathways implicated in the action of these antagomirs are altered in humans. Our findings indicate a countermeasure strategy that can potentially be utilized by astronauts in spaceflight missions to mitigate space radiation damage.


Subject(s)
Astronauts , Cosmic Radiation , MicroRNAs , Space Flight , MicroRNAs/genetics , MicroRNAs/metabolism , Humans , Cosmic Radiation/adverse effects , DNA Breaks, Double-Stranded/radiation effects , Radiation Injuries/genetics , Radiation Injuries/prevention & control , Male , Mitochondria/radiation effects , Mitochondria/metabolism , Mitochondria/genetics , Female , Adult
2.
Eur J Hum Genet ; 32(1): 10-20, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37938797

ABSTRACT

COVID-19, the disease caused by SARS-CoV-2, has caused significant morbidity and mortality worldwide. The betacoronavirus continues to evolve with global health implications as we race to learn more to curb its transmission, evolution, and sequelae. The focus of this review, the second of a three-part series, is on the biological effects of the SARS-CoV-2 virus on post-acute disease in the context of tissue and organ adaptations and damage. We highlight the current knowledge and describe how virological, animal, and clinical studies have shed light on the mechanisms driving the varied clinical diagnoses and observations of COVID-19 patients. Moreover, we describe how investigations into SARS-CoV-2 effects have informed the understanding of viral pathogenesis and provide innovative pathways for future research on the mechanisms of viral diseases.


Subject(s)
COVID-19 , Animals , Humans , SARS-CoV-2
3.
medRxiv ; 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38076862

ABSTRACT

The orphan gene of SARS-CoV-2, ORF10, is the least studied gene in the virus responsible for the COVID-19 pandemic. Recent experimentation indicated ORF10 expression moderates innate immunity in vitro. However, whether ORF10 affects COVID-19 in humans remained unknown. We determine that the ORF10 sequence is identical to the Wuhan-Hu-1 ancestral haplotype in 95% of genomes across five variants of concern (VOC). Four ORF10 variants are associated with less virulent clinical outcomes in the human host: three of these affect ORF10 protein structure, one affects ORF10 RNA structural dynamics. RNA-Seq data from 2070 samples from diverse human cells and tissues reveals ORF10 accumulation is conditionally discordant from that of other SARS-CoV-2 transcripts. Expression of ORF10 in A549 and HEK293 cells perturbs immune-related gene expression networks, alters expression of the majority of mitochondrially-encoded genes of oxidative respiration, and leads to large shifts in levels of 14 newly-identified transcripts. We conclude ORF10 contributes to more severe COVID-19 clinical outcomes in the human host.

4.
Front Plant Sci ; 14: 1342494, 2023.
Article in English | MEDLINE | ID: mdl-38093992
5.
Sci Transl Med ; 15(708): eabq1533, 2023 08 09.
Article in English | MEDLINE | ID: mdl-37556555

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) viral proteins bind to host mitochondrial proteins, likely inhibiting oxidative phosphorylation (OXPHOS) and stimulating glycolysis. We analyzed mitochondrial gene expression in nasopharyngeal and autopsy tissues from patients with coronavirus disease 2019 (COVID-19). In nasopharyngeal samples with declining viral titers, the virus blocked the transcription of a subset of nuclear DNA (nDNA)-encoded mitochondrial OXPHOS genes, induced the expression of microRNA 2392, activated HIF-1α to induce glycolysis, and activated host immune defenses including the integrated stress response. In autopsy tissues from patients with COVID-19, SARS-CoV-2 was no longer present, and mitochondrial gene transcription had recovered in the lungs. However, nDNA mitochondrial gene expression remained suppressed in autopsy tissue from the heart and, to a lesser extent, kidney, and liver, whereas mitochondrial DNA transcription was induced and host-immune defense pathways were activated. During early SARS-CoV-2 infection of hamsters with peak lung viral load, mitochondrial gene expression in the lung was minimally perturbed but was down-regulated in the cerebellum and up-regulated in the striatum even though no SARS-CoV-2 was detected in the brain. During the mid-phase SARS-CoV-2 infection of mice, mitochondrial gene expression was starting to recover in mouse lungs. These data suggest that when the viral titer first peaks, there is a systemic host response followed by viral suppression of mitochondrial gene transcription and induction of glycolysis leading to the deployment of antiviral immune defenses. Even when the virus was cleared and lung mitochondrial function had recovered, mitochondrial function in the heart, kidney, liver, and lymph nodes remained impaired, potentially leading to severe COVID-19 pathology.


Subject(s)
COVID-19 , Cricetinae , Humans , Animals , Mice , COVID-19/pathology , SARS-CoV-2 , Rodentia , Genes, Mitochondrial , Lung/pathology
6.
Theranostics ; 12(8): 3946-3962, 2022.
Article in English | MEDLINE | ID: mdl-35664076

ABSTRACT

Rationale: Viral infections are complex processes based on an intricate network of molecular interactions. The infectious agent hijacks components of the cellular machinery for its profit, circumventing the natural defense mechanisms triggered by the infected cell. The successful completion of the replicative viral cycle within a cell depends on the function of viral components versus the cellular defenses. Non-coding RNAs (ncRNAs) are important cellular modulators, either promoting or preventing the progression of viral infections. Among these ncRNAs, the long non-coding RNA (lncRNA) family is especially relevant due to their intrinsic functional properties and ubiquitous biological roles. Specific lncRNAs have been recently characterized as modulators of the cellular response during infection of human host cells by single stranded RNA viruses. However, the role of host lncRNAs in the infection by human RNA coronaviruses such as SARS-CoV-2 remains uncharacterized. Methods: In the present work, we have performed a transcriptomic study of a cohort of patients with different SARS-CoV-2 viral load and analyzed the involvement of lncRNAs in supporting regulatory networks based on their interaction with RNA-binding proteins (RBPs). Results: Our results revealed the existence of a SARS-CoV-2 infection-dependent pattern of transcriptional up-regulation in which specific lncRNAs are an integral component. To determine the role of these lncRNAs, we performed a functional correlation analysis complemented with the study of the validated interactions between lncRNAs and RBPs. This combination of in silico functional association studies and experimental evidence allowed us to identify a lncRNA signature composed of six elements - NRIR, BISPR, MIR155HG, FMR1-IT1, USP30-AS1, and U62317.2 - associated with the regulation of SARS-CoV-2 infection. 
Conclusions: We propose a competition mechanism between the viral RNA genome and the regulatory lncRNAs in the sequestering of specific RBPs that modulates the interferon response and the regulation of RNA surveillance by nonsense-mediated decay (NMD).


Subject(s)
COVID-19 , RNA, Long Noncoding , COVID-19/genetics , Fragile X Mental Retardation Protein , Genome, Viral , Humans , Immunity , Mitochondrial Proteins/metabolism , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , RNA, Untranslated/genetics , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , SARS-CoV-2/genetics , Thiolester Hydrolases/metabolism
7.
Eur J Hum Genet ; 30(8): 889-898, 2022 08.
Article in English | MEDLINE | ID: mdl-35577935

ABSTRACT

COVID-19, the disease caused by SARS-CoV-2, has claimed approximately 5 million lives, with 257 million cases reported globally. This virus and disease have significantly affected people worldwide, directly or indirectly, with a virulent pathogen that continues to evolve as we race to learn how to prevent, control, or cure COVID-19. The focus of this review is on the SARS-CoV-2 virus' mechanism of infection and its proclivity for adapting and restructuring the intracellular environment to support viral replication. We highlight current knowledge and how scientific communities with expertise in viral, cellular, and clinical biology have contributed to increasing our understanding of SARS-CoV-2, and how these findings may help explain the widely varied clinical observations of COVID-19 patients.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , Virus Replication
8.
Cell Rep Med ; 3(2): 100522, 2022 02 15.
Article in English | MEDLINE | ID: mdl-35233546

ABSTRACT

The molecular mechanisms underlying the clinical manifestations of coronavirus disease 2019 (COVID-19), and what distinguishes them from common seasonal influenza virus and other lung injury states such as acute respiratory distress syndrome, remain poorly understood. To address these challenges, we combine transcriptional profiling of 646 clinical nasopharyngeal swabs and 39 patient autopsy tissues to define body-wide transcriptome changes in response to COVID-19. We then match these data with spatial protein and expression profiling across 357 tissue sections from 16 representative patient lung samples and identify tissue-compartment-specific damage wrought by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, evident as a function of varying viral loads during the clinical course of infection and tissue-type-specific expression states. Overall, our findings reveal a systemic disruption of canonical cellular and transcriptional pathways across all tissues, which can inform subsequent studies to combat the mortality of COVID-19 and to better understand the molecular dynamics of lethal SARS-CoV-2 and other respiratory infections.


Subject(s)
COVID-19/genetics , COVID-19/pathology , Lung/pathology , SARS-CoV-2 , Transcriptome/genetics , Adult , Aged , Aged, 80 and over , COVID-19/metabolism , COVID-19/virology , Case-Control Studies , Cohort Studies , Female , Gene Expression Regulation , Humans , Influenza, Human/genetics , Influenza, Human/pathology , Influenza, Human/virology , Lung/metabolism , Male , Middle Aged , Orthomyxoviridae , RNA-Seq/methods , Respiratory Distress Syndrome/genetics , Respiratory Distress Syndrome/microbiology , Respiratory Distress Syndrome/pathology , Viral Load
9.
bioRxiv ; 2022 Feb 22.
Article in English | MEDLINE | ID: mdl-35233572

ABSTRACT

Defects in mitochondrial oxidative phosphorylation (OXPHOS) have been reported in COVID-19 patients, but the timing and organs affected vary among reports. Here, we reveal the dynamics of COVID-19 through transcription profiles in nasopharyngeal and autopsy samples from patients and infected rodent models. While mitochondrial bioenergetics is repressed in the viral nasopharyngeal portal of entry, it is up-regulated in autopsy lung tissues from deceased patients. In most disease stages and organs, discrete OXPHOS functions are blocked by the virus, and this is countered by the host broadly up-regulating unblocked OXPHOS functions. No such rebound is seen in autopsy heart, resulting in severe repression of genes across all OXPHOS modules. Hence, targeted enhancement of mitochondrial gene expression may mitigate the pathogenesis of COVID-19.

10.
Nucleic Acids Res ; 50(7): e37, 2022 04 22.
Article in English | MEDLINE | ID: mdl-34928390

ABSTRACT

Proteins encoded by newly-emerged genes ('orphan genes') share no sequence similarity with proteins in any other species. They provide organisms with a reservoir of genetic elements to quickly respond to changing selection pressures. Here, we systematically assess the ability of five gene prediction pipelines to accurately predict genes in genomes according to phylostratal origin. BRAKER and MAKER are existing, popular ab initio tools that infer gene structures by machine learning. Direct Inference is an evidence-based pipeline we developed to predict gene structures from alignments of RNA-Seq data. The BIND pipeline integrates ab initio predictions of BRAKER and Direct Inference; MIND combines Direct Inference and MAKER predictions. We use highly-curated Arabidopsis and yeast annotations as gold-standard benchmarks, and cross-validate in rice. Each pipeline under-predicts orphan genes (as few as 11 percent, under one prediction scenario). Increasing RNA-Seq diversity greatly improves prediction efficacy. The combined methods (BIND and MIND) yield the best predictions overall, with BIND identifying 68% of annotated orphan genes and 99% of ancient genes and giving the highest sensitivity score regardless of dataset in Arabidopsis. We provide a lightweight, flexible, reproducible, and well-documented solution to improve gene prediction.


Subject(s)
Arabidopsis , Oryza , Arabidopsis/genetics , Genome , Oryza/genetics , RNA-Seq , Software
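
The abstract of entry 10 reports the fraction of annotated genes recovered in each age class (for example, 68% of annotated orphan genes versus 99% of ancient genes for BIND). A minimal sketch of how such a per-phylostratum sensitivity could be computed is given below. It is an illustration only, not the authors' evaluation code: the function name `sensitivity_by_stratum`, the gene IDs, and the stratum labels are hypothetical, and a predicted gene is assumed to "match" an annotated gene when the IDs are identical, which is a simplification of real structure-level comparison.

```python
from collections import defaultdict


def sensitivity_by_stratum(annotated, predicted):
    """Fraction of annotated genes recovered, per phylostratum.

    annotated: dict mapping gene ID -> phylostratum label (gold standard).
    predicted: set of gene IDs that a pipeline recovered.
    Matching by ID is an assumption made for this sketch.
    """
    totals = defaultdict(int)
    found = defaultdict(int)
    for gene, stratum in annotated.items():
        totals[stratum] += 1
        if gene in predicted:
            found[stratum] += 1
    # Sensitivity = true positives / all annotated genes in the stratum.
    return {stratum: found[stratum] / totals[stratum] for stratum in totals}


# Toy data: two ancient genes and two orphan genes in the gold standard.
annotated = {"g1": "ancient", "g2": "ancient", "g3": "orphan", "g4": "orphan"}
predicted = {"g1", "g2", "g3", "extra_prediction"}
print(sensitivity_by_stratum(annotated, predicted))
```

Predictions that are absent from the gold standard (such as `extra_prediction`) do not affect sensitivity; they would only matter for a precision-style measure.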
11.
Cell Rep ; 37(3): 109839, 2021 10 19.
Article in English | MEDLINE | ID: mdl-34624208

ABSTRACT

MicroRNAs (miRNAs) are small non-coding RNAs involved in post-transcriptional gene regulation that have a major impact on many diseases and provide an exciting avenue toward antiviral therapeutics. From patient transcriptomic data, we determined that a circulating miRNA, miR-2392, is directly involved with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) machinery during host infection. Specifically, we show that miR-2392 is key in driving downstream suppression of mitochondrial gene expression, increasing inflammation, glycolysis, and hypoxia, as well as promoting many symptoms associated with coronavirus disease 2019 (COVID-19) infection. We demonstrate that miR-2392 is present in the blood and urine of patients positive for COVID-19 but is not present in patients negative for COVID-19. These findings indicate the potential for developing a minimally invasive COVID-19 detection method. Lastly, using in vitro human and in vivo hamster models, we design a miRNA-based antiviral therapeutic that targets miR-2392, significantly reduces SARS-CoV-2 viability in hamsters, and may potentially inhibit a COVID-19 disease state in humans.


Subject(s)
COVID-19/genetics , COVID-19/immunology , MicroRNAs/genetics , SARS-CoV-2/genetics , Adult , Aged , Aged, 80 and over , Animals , Antiviral Agents/pharmacology , Biomarkers/metabolism , Cricetinae , Female , Ferrets , Gene Expression Regulation , Glycolysis , Healthy Volunteers , Humans , Hypoxia , Inflammation , Male , Mice , Middle Aged , Proteomics/methods , ROC Curve , Rats , COVID-19 Drug Treatment
12.
Front Genet ; 12: 722981, 2021.
Article in English | MEDLINE | ID: mdl-34484307

ABSTRACT

The "dark transcriptome" can be considered the multitude of sequences that are transcribed but not annotated as genes. We evaluated expression of 6,692 annotated genes and 29,354 unannotated open reading frames (ORFs) in the Saccharomyces cerevisiae genome across diverse environmental, genetic and developmental conditions (3,457 RNA-Seq samples). Over 30% of the highly transcribed ORFs have translation evidence. Phylostratigraphic analysis infers most of these transcribed ORFs would encode species-specific proteins ("orphan-ORFs"); hundreds have mean expression comparable to annotated genes. These data reveal unannotated ORFs most likely to be protein-coding genes. We partitioned a co-expression matrix by Markov Chain Clustering; the resultant clusters contain 2,468 orphan-ORFs. We provide the aggregated RNA-Seq yeast data with extensive metadata as a project in MetaOmGraph (MOG), a tool designed for interactive analysis and visualization. This approach enables reuse of public RNA-Seq data for exploratory discovery, providing a rich context for experimentalists to make novel, experimentally testable hypotheses about candidate genes.

13.
NAR Genom Bioinform ; 3(2): lqab049, 2021 Jun.
Article in English | MEDLINE | ID: mdl-34085037

ABSTRACT

The availability of terabytes of RNA-Seq data and the continuous emergence of new analysis tools enable unprecedented biological insight. There is a pressing requirement for a framework that allows for fast, efficient, manageable, and reproducible RNA-Seq analysis. We have developed a Python package, pyrpipe, that enables straightforward development of flexible, reproducible and easy-to-debug computational pipelines purely in Python, in an object-oriented manner. pyrpipe provides access to popular RNA-Seq tools, within Python, via high-level APIs. Pipelines can be customized by integrating new Python code, third-party programs, or Python libraries. Users can create checkpoints in the pipeline or integrate pyrpipe into a workflow management system, thus allowing execution on multiple computing environments and enabling efficient resource management. pyrpipe produces detailed analysis and benchmark reports that can be shared or included in publications. pyrpipe is implemented in Python and is compatible with Python versions 3.6 and higher. To illustrate the rich functionality of pyrpipe, we provide case studies using RNA-Seq data from GTEx, SARS-CoV-2-infected human cells, and Zea mays. All source code is freely available at https://github.com/urmi-21/pyrpipe; the package can be installed from the source, from PyPI (https://pypi.org/project/pyrpipe), or from bioconda (https://anaconda.org/bioconda/pyrpipe). Documentation is available at http://pyrpipe.rtfd.io.

14.
bioRxiv ; 2021 Aug 18.
Article in English | MEDLINE | ID: mdl-33948587

ABSTRACT

MicroRNAs (miRNAs) are small non-coding RNAs involved in post-transcriptional gene regulation that have a major impact on many diseases and provide an exciting avenue towards antiviral therapeutics. From patient transcriptomic data, we have discovered a circulating miRNA, miR-2392, that is directly involved with SARS-CoV-2 machinery during host infection. Specifically, we show that miR-2392 is key in driving downstream suppression of mitochondrial gene expression, increasing inflammation, glycolysis, and hypoxia as well as promoting many symptoms associated with COVID-19 infection. We demonstrate miR-2392 is present in the blood and urine of COVID-19 positive patients, but not detected in COVID-19 negative patients. These findings indicate the potential for developing a novel, minimally invasive, COVID-19 detection method. Lastly, using in vitro human and in vivo hamster models, we have developed a novel miRNA-based antiviral therapeutic that targets miR-2392, significantly reduces SARS-CoV-2 viability in hamsters and may potentially inhibit a COVID-19 disease state in humans.

15.
Sci Rep ; 11(1): 9905, 2021 05 10.
Article in English | MEDLINE | ID: mdl-33972602

ABSTRACT

The COVID-19 pandemic has affected African American populations disproportionately with respect to prevalence, and mortality. Expression profiles represent snapshots of combined genetic, socio-environmental (including socioeconomic and environmental factors), and physiological effects on the molecular phenotype. As such, they have potential to improve biological understanding of differences among populations, and provide therapeutic biomarkers and environmental mitigation strategies. Here, we undertook a large-scale assessment of patterns of gene expression between African Americans and European Americans, mining RNA-Seq data from 25 non-diseased and diseased (tumor) tissue-types. We observed the widespread enrichment of pathways implicated in COVID-19 and integral to inflammation and reactive oxygen stress. Chemokine CCL3L3 expression is up-regulated in African Americans. GSTM1, encoding a glutathione S-transferase that metabolizes reactive oxygen species and xenobiotics, is upregulated. The little-studied F8A2 gene is up to 40-fold more highly expressed in African Americans; F8A2 encodes HAP40 protein, which mediates endosome movement, potentially altering the cellular response to SARS-CoV-2. African American expression signatures, superimposed on single cell-RNA reference data, reveal increased number or activity of esophageal glandular cells and lung ACE2-positive basal keratinocytes. Our findings establish basal prognostic signatures that can be used to refine approaches to minimize risk of severe infection and improve precision treatment of COVID-19 for African Americans. To enable dissection of causes of divergent molecular phenotypes, we advocate routine inclusion of metadata on genomic and socio-environmental factors for human RNA-sequencing studies.


Subject(s)
Black or African American/genetics , COVID-19/genetics , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Neoplasms/genetics , White People/genetics , COVID-19/epidemiology , COVID-19/virology , Chemokine CCL3/genetics , Gene Regulatory Networks , Glutathione Transferase/genetics , Humans , Neoplasms/classification , Neoplasms/ethnology , Nuclear Proteins/genetics , Pandemics , Prognosis , RNA-Seq/methods , SARS-CoV-2/isolation & purification , SARS-CoV-2/physiology , Socioeconomic Factors , United States/epidemiology
16.
bioRxiv ; 2021 Mar 09.
Article in English | MEDLINE | ID: mdl-33758858

ABSTRACT

The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) virus has infected over 115 million people and caused over 2.5 million deaths worldwide. Yet, the molecular mechanisms underlying the clinical manifestations of COVID-19, as well as what distinguishes them from common seasonal influenza virus and other lung injury states such as Acute Respiratory Distress Syndrome (ARDS), remain poorly understood. To address these challenges, we combined transcriptional profiling of 646 clinical nasopharyngeal swabs and 39 patient autopsy tissues, matched with spatial protein and expression profiling (GeoMx) across 357 tissue sections. These results define both body-wide and tissue-specific (heart, liver, lung, kidney, and lymph nodes) damage wrought by the SARS-CoV-2 infection, evident as a function of varying viral load (high vs. low) during the course of infection and specific, transcriptional dysregulation in splicing isoforms, T cell receptor expression, and cellular expression states. In particular, cardiac and lung tissues revealed the largest degree of splicing isoform switching and cell expression state loss. Overall, these findings reveal a systemic disruption of cellular and transcriptional pathways from COVID-19 across all tissues, which can inform subsequent studies to combat the mortality of COVID-19, as well as to better understand the molecular dynamics of lethal SARS-CoV-2 infection and other viruses.

17.
Bioinformatics ; 37(18): 3019-3020, 2021 09 29.
Article in English | MEDLINE | ID: mdl-33576786

ABSTRACT

SUMMARY: Searching for open reading frames is a routine task and a critical step prior to annotating protein coding regions in newly sequenced genomes or de novo transcriptome assemblies. With the tremendous increase in genomic and transcriptomic data, faster tools are needed to handle large input datasets. These tools should be versatile enough to fine-tune search criteria and allow efficient downstream analysis. Here we present a new Python-based tool, orfipy, which allows the user to flexibly search for open reading frames in genomic and transcriptomic sequences. The search is rapid and is fully customizable, with a choice of FASTA and BED output formats. AVAILABILITY AND IMPLEMENTATION: orfipy is implemented in Python and is compatible with Python v3.6 and higher. Source code: https://github.com/urmi-21/orfipy. Installation: from the source, or via PyPI (https://pypi.org/project/orfipy) or bioconda (https://anaconda.org/bioconda/orfipy). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genomics , Software , Open Reading Frames , Genome , Transcriptome
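
To make the task described in entry 17 concrete, the following is a minimal pure-Python sketch of an open reading frame search over the six reading frames of a DNA sequence. It does not use orfipy and is not its implementation or API; the function names `find_orfs` and `revcomp`, the `min_len` parameter, and the choice of ATG as the only start codon are assumptions made for this illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=30, start_codons=("ATG",)):
    """Return (strand, start, end, orf) tuples for start-to-stop ORFs.

    Coordinates are 0-based, half-open, and relative to the strand that
    was scanned (for "-" they index the reverse complement, not the input).
    min_len counts nucleotides and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            start = None
            # Walk the frame codon by codon.
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in start_codons:
                    start = i  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start >= min_len:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


# Toy sequence with a single short ORF (ATG AAA TAG) on the forward strand.
print(find_orfs("CCATGAAATAGCC", min_len=9))
```

An ORF that is opened but never reaches a stop codon before the sequence ends is discarded in this sketch; a real tool may offer options to report such partial ORFs.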
18.
Nucleic Acids Res ; 48(4): e23, 2020 02 28.
Article in English | MEDLINE | ID: mdl-31956905

ABSTRACT

The diverse and growing omics data in public domains provide researchers with tremendous opportunity to extract hidden, yet undiscovered, knowledge. However, the vast majority of archived data remain unused. Here, we present MetaOmGraph (MOG), a free, open-source, standalone software for exploratory analysis of massive datasets. Researchers, without coding, can interactively visualize and evaluate data in the context of its metadata, honing-in on groups of samples or genes based on attributes such as expression values, statistical associations, metadata terms and ontology annotations. Interaction with data is easy via interactive visualizations such as line charts, box plots, scatter plots, histograms and volcano plots. Statistical analyses include co-expression analysis, differential expression analysis and differential correlation analysis, with significance tests. Researchers can send data subsets to R for additional analyses. Multithreading and indexing enable efficient big data analysis. A researcher can create new MOG projects from any numerical data; or explore an existing MOG project. MOG projects, with history of explorations, can be saved and shared. We illustrate MOG by case studies of large curated datasets from human cancer RNA-Seq, where we identify novel putative biomarker genes in different tumors, and microarray and metabolomics data from Arabidopsis thaliana. MOG executable and code: http://metnetweb.gdcb.iastate.edu/ and https://github.com/urmi-21/MetaOmGraph/.


Subject(s)
Big Data , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation/genetics , Software , Data Analysis , Data Interpretation, Statistical , Humans , Metadata/statistics & numerical data
19.
BMC Bioinformatics ; 20(1): 440, 2019 Aug 27.
Article in English | MEDLINE | ID: mdl-31455236

ABSTRACT

BACKGROUND: With every new genome that is sequenced, thousands of species-specific genes (orphans) are found, some originating from ultra-rapid mutations of existing genes, many others originating de novo from non-genic regions of the genome. If some of these genes survive across speciations, then extant organisms will contain a patchwork of genes whose ancestors first appeared at different times. Standard phylostratigraphy, the technique of partitioning genes by their age, is based solely on protein similarity algorithms. However, this approach relies on negative evidence, namely a failure to detect a homolog of a query gene. An alternative approach is to limit the search for homologs to syntenic regions. Then, genes can be positively identified as de novo orphans by tracing them to non-coding sequences in related species. RESULTS: We have developed a synteny-based pipeline in the R framework. Fagin determines the genomic context of each query gene in a focal species compared to homologous sequence in target species. We tested the fagin pipeline on two focal species, Arabidopsis thaliana (plus four target species in Brassicaceae) and Saccharomyces cerevisiae (plus six target species in Saccharomyces). Using microsynteny maps, fagin classified the homology relationship of each query gene against each target genome into three main classes, and further subclasses: AAic (has a coding syntenic homolog), NTic (has a non-coding syntenic homolog), and Unknown (has no detected syntenic homolog). fagin inferred over half the "Unknown" A. thaliana query genes, and about 20% for S. cerevisiae, as lacking a syntenic homolog because of local indels or scrambled synteny. CONCLUSIONS: fagin augments standard phylostratigraphy, and extends synteny-based phylostratigraphy with an automated, customizable, and detailed contextual analysis. By comparing synteny-based phylostrata to standard phylostrata, fagin systematically identifies those orphans and lineage-specific genes that are well-supported to have originated de novo. Analyzing within-species genomes should distinguish orphan genes that may have originated through rapid divergence from de novo orphans. Fagin also delineates whether a gene has no syntenic homolog because of technical or biological reasons. These analyses indicate that some orphans may be associated with regions of high genomic perturbation.


Subject(s)
Arabidopsis/genetics , Genes , Phylogeny , Saccharomyces cerevisiae/genetics , Software , Synteny/genetics , Base Sequence , Genome , Sequence Homology
20.
Bioinformatics ; 35(19): 3617-3627, 2019 10 01.
Article in English | MEDLINE | ID: mdl-30873536

ABSTRACT

MOTIVATION: The goal of phylostratigraphy is to infer the evolutionary origin of each gene in an organism. This is done by searching for homologs within increasingly broad clades. The deepest clade that contains a homolog of the protein(s) encoded by a gene is that gene's phylostratum. RESULTS: We have created a general R-based framework, phylostratr, to estimate the phylostratum of every gene in a species. The program fully automates analysis: selecting species for balanced representation, retrieving sequences, building databases, inferring phylostrata and returning diagnostics. Key diagnostics include: detection of genes with inferred homologs in old clades, but not intermediate ones; proteome quality assessments; false-positive diagnostics, and checks for missing organellar genomes. phylostratr allows extensive customization and systematic comparisons of the influence of analysis parameters or genomes on phylostrata inference. A user may: modify the automatically generated clade tree or use their own tree; provide custom sequences in place of those automatically retrieved from UniProt; replace BLAST with an alternative algorithm; or tailor the method and sensitivity of the homology inference classifier. We show the utility of phylostratr through case studies in Arabidopsis thaliana and Saccharomyces cerevisiae. AVAILABILITY AND IMPLEMENTATION: Source code available at https://github.com/arendsee/phylostratr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Phylogeny , Software , Genome , Saccharomyces cerevisiae
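
The definition in entry 20 (a gene's phylostratum is the deepest clade containing a homolog) can be illustrated with a short sketch. This is written in Python for illustration and is not phylostratr, which is an R package; the function name `phylostratum`, the toy lineage, and the species-to-clade mapping are all hypothetical. The sketch assumes homology hits have already been computed and that each target species is labeled with the clade it shares with the focal species.

```python
def phylostratum(lineage, shared_clade, hit_species):
    """Return the oldest clade in which a homolog was detected.

    lineage: clades of the focal species, ordered oldest to youngest,
        ending with the focal species itself.
    shared_clade: dict mapping target species -> the clade on `lineage`
        that is the most recent common ancestor with the focal species.
    hit_species: iterable of target species with a detected homolog.
    """
    rank = {clade: i for i, clade in enumerate(lineage)}
    # With no hits outside the focal species, the gene is species-specific.
    oldest = len(lineage) - 1
    for species in hit_species:
        oldest = min(oldest, rank[shared_clade[species]])
    return lineage[oldest]


# Toy lineage for Arabidopsis thaliana, oldest clade first (simplified).
lineage = ["cellular organisms", "Eukaryota", "Viridiplantae",
           "Brassicaceae", "Arabidopsis thaliana"]
shared_clade = {
    "Escherichia coli": "cellular organisms",
    "Saccharomyces cerevisiae": "Eukaryota",
    "Oryza sativa": "Viridiplantae",
    "Brassica rapa": "Brassicaceae",
}
print(phylostratum(lineage, shared_clade, ["Oryza sativa", "Brassica rapa"]))
print(phylostratum(lineage, shared_clade, []))
```

A gene with hits in an old clade but none in intermediate clades still receives the old phylostratum under this rule, which is the situation the abstract lists among its key diagnostics.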