Search | BVS Colombia Search Portal

Identifying genes with conserved splicing structure and orthologous isoforms in human, mouse and dog.

Guillaudeux, Nicolas; Belleannée, Catherine; Blanquart, Samuel.

BMC Genomics ; 23(1): 216, 2022 Mar 18.

Article in English | MEDLINE | ID: mdl-35303798

ABSTRACT

BACKGROUND: In eukaryote transcriptomes, a significant amount of transcript diversity comes from genes' capacity to generate different transcripts through alternative splicing. Identifying orthologous alternative transcripts across multiple species is of particular interest for genome annotators. However, there is no formal definition of transcript orthology based on the splicing structure conservation. Likewise there is no public dataset benchmark providing groups of orthologous transcripts sharing a conserved splicing structure.

RESULTS: We introduced a formal definition of splicing structure orthology and we predicted transcript orthologs in human, mouse and dog. Applying a selective strategy, we analyzed 2,167 genes and their 18,109 known transcripts and identified a set of 253 gene orthologs that shared a conserved splicing structure in all three species. We predicted 6,861 transcript CDSs (coding sequence), mainly for dog, an emergent model species. Each predicted transcript was an ortholog of a known transcript: both share the same CDS splicing structure. Evidence for the existence of the predicted CDSs was found in external data.

CONCLUSIONS: We generated a dataset of 253 gene triplets, structurally conserved and sharing all their CDSs in human, mouse and dog, which correspond to 879 triplets of spliced CDS orthologs. We have released the dataset both as an SQL database and as tabulated files. The data consists of the 879 CDS orthology groups with their detailed splicing structures, and the predicted CDSs, associated with their experimental evidence. The 6,861 predicted CDSs are provided in GTF files. Our data may contribute to compare highly conserved genes across three species, for comparative transcriptomics at the isoform level, or for benchmarking splice aligners and methods focusing on the identification of splicing orthologs. The data is available at https://data-access.cesgo.org/index.php/s/V97GXxOS66NqTkZ.
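Since the predicted CDSs are distributed as GTF files, a reader might start by grouping CDS records per transcript. The following is a minimal sketch, assuming standard nine-column, tab-separated GTF with a `transcript_id` attribute; the function names are hypothetical and the segment-length list is only a crude structural summary, not the authors' formal definition of splicing structure orthology.

```python
import re
from collections import defaultdict

# Matches GTF attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def cds_segments_by_transcript(gtf_text):
    """Map each transcript_id to its CDS segments, sorted by genomic start.

    Hypothetical helper: assumes standard 9-column GTF lines.
    """
    segments = defaultdict(list)
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "CDS":
            continue  # keep only well-formed CDS features
        attrs = dict(ATTR_RE.findall(fields[8]))
        transcript_id = attrs.get("transcript_id")
        if transcript_id is None:
            continue
        # GTF coordinates are 1-based and inclusive
        segments[transcript_id].append((int(fields[3]), int(fields[4])))
    return {tid: sorted(segs) for tid, segs in segments.items()}


def segment_lengths(segs):
    """Lengths of CDS segments in genomic order (inclusive coordinates)."""
    return [end - start + 1 for start, end in segs]
```

Segments are ordered by genomic coordinate, so for minus-strand transcripts the list would need to be reversed to follow transcript order. Comparing such length lists across species could serve as a first sanity check, but it does not replace the orthology groups provided in the released dataset.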

Subject(s)

Genome; RNA Splicing; Alternative Splicing; Animals; Dogs; Exons; Humans; Mice; Protein Isoforms/metabolism

Traceability, reproducibility and wiki-exploration for "à-la-carte" reconstructions of genome-scale metabolic models.

Aite, Méziane; Chevallier, Marie; Frioux, Clémence; Trottier, Camille; Got, Jeanne; Cortés, María Paz; Mendoza, Sebastián N; Carrier, Grégory; Dameron, Olivier; Guillaudeux, Nicolas; Latorre, Mauricio; Loira, Nicolás; Markov, Gabriel V; Maass, Alejandro; Siegel, Anne.

PLoS Comput Biol ; 14(5): e1006146, 2018 05.

Article in English | MEDLINE | ID: mdl-29791443

ABSTRACT

Genome-scale metabolic models have become the tool of choice for the global analysis of microorganism metabolism, and their reconstruction has attained high standards of quality and reliability. Improvements in this area have been accompanied by the development of some major platforms and databases, and an explosion of individual bioinformatics methods. Consequently, many recent models result from "à la carte" pipelines, combining the use of platforms, individual tools and biological expertise to enhance the quality of the reconstruction. Although very useful, introducing heterogeneous tools, that hardly interact with each other, causes loss of traceability and reproducibility in the reconstruction process. This represents a real obstacle, especially when considering less studied species whose metabolic reconstruction can greatly benefit from the comparison to good quality models of related organisms. This work proposes an adaptable workspace, AuReMe, for sustainable reconstructions or improvements of genome-scale metabolic models involving personalized pipelines. At each step, relevant information related to the modifications brought to the model by a method is stored. This ensures that the process is reproducible and documented regardless of the combination of tools used. Additionally, the workspace establishes a way to browse metabolic models and their metadata through the automatic generation of ad-hoc local wikis dedicated to monitoring and facilitating the process of reconstruction. AuReMe supports exploration and semantic query based on RDF databases. We illustrate how this workspace allowed handling, in an integrated way, the metabolic reconstructions of non-model organisms such as an extremophile bacterium or eukaryote algae. Among relevant applications, the latter reconstruction led to putative evolutionary insights of a metabolic pathway.

Subject(s)

Databases, Factual; Genomics; Information Storage and Retrieval; Internet; Metabolic Networks and Pathways/genetics; Antioxidants/metabolism; Genomics/methods; Genomics/standards; Information Storage and Retrieval/methods; Information Storage and Retrieval/standards; Microalgae/genetics; Microalgae/metabolism; Models, Theoretical; Reproducibility of Results
