Results 1 - 20 of 48
1.
BMC Biol ; 22(1): 52, 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38439107

ABSTRACT

BACKGROUND: Capsella bursa-pastoris, a cosmopolitan weed of hybrid origin, is an emerging model organism for the study of the early consequences of polyploidy, being a fast-growing annual and a close relative of Arabidopsis thaliana. The development of this model is hampered by the absence of a reference genome sequence. RESULTS: We present here a subgenome-resolved chromosome-scale assembly and a genetic map of the genome of Capsella bursa-pastoris. The assembly shows that the subgenomes are mostly colinear, with no massive deletions, insertions, or rearrangements in either of them. A subgenome-aware annotation reveals a lack of genome dominance: both subgenomes carry a similar number of genes. While most chromosomes can be unambiguously recognized as derived from either the paternal or the maternal parent, we also found a homeologous exchange between two chromosomes. It led to the emergence of two hybrid chromosomes; this event is shared between distant populations of C. bursa-pastoris. The whole-genome analysis of 119 samples belonging to C. bursa-pastoris and its parental species C. grandiflora/rubella and C. orientalis reveals introgression from C. orientalis but not from C. grandiflora/rubella. CONCLUSIONS: C. bursa-pastoris does not show genome dominance. In the earliest stages of evolution of this species, a homeologous exchange occurred; its presence in all present-day populations of C. bursa-pastoris indicates a single origin of this species. The evidence from whole-genome analysis challenges the current view that C. grandiflora/rubella was a direct progenitor of C. bursa-pastoris; we hypothesize that the progenitor was an extinct (or undiscovered) species sister to C. grandiflora/rubella.


Subject(s)
Arabidopsis, Capsella, Rubella, Capsella/genetics, Genomics, Polyploidy
2.
BMC Bioinformatics ; 25(1): 238, 2024 Jul 13.
Article in English | MEDLINE | ID: mdl-39003441

ABSTRACT

MOTIVATION: Alignment of reads to a reference genome sequence is one of the key steps in the analysis of human whole-genome sequencing data obtained through Next-generation sequencing (NGS) technologies. The quality of the subsequent steps of the analysis, such as the results of clinical interpretation of genetic variants or the results of a genome-wide association study, depends on the correct identification of the position of the read as a result of its alignment. The amount of human NGS whole-genome sequencing data is constantly growing. There are a number of human genome sequencing projects worldwide that have resulted in the creation of large-scale databases of genetic variants of sequenced human genomes. Such information about known genetic variants can be used to improve the quality of alignment at the read alignment stage when analysing sequencing data obtained for a new individual, for example, by creating a genomic graph. While existing methods for aligning reads to a linear reference genome have high alignment speed, methods for aligning reads to a genomic graph have greater accuracy in variable regions of the genome. The development of a read alignment method that takes into account known genetic variants in the linear reference sequence index allows combining the advantages of both sets of methods. RESULTS: In this paper, we present the minimap2_index_modifier tool, which enables the construction of a modified index of a reference genome using known single nucleotide variants and insertions/deletions (indels) specific to a given human population. The use of the modified minimap2 index improves variant calling quality without modifying the bioinformatics pipeline and without significant additional computational overhead. 
Using the PrecisionFDA Truth Challenge V2 benchmark data (for HG002 short-read data aligned to the GRCh38 linear reference (GCA_000001405.15) with parameters k = 27 and w = 14) it was demonstrated that the number of false negative genetic variants decreased by more than 9500, and the number of false positives decreased by more than 7000 when modifying the index with genetic variants from the Human Pangenome Reference Consortium.
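
The parameters k and w quoted above are the k-mer length and the minimizer window size of the index. As a rough illustration of why known variants matter at the indexing stage, the following minimal sketch computes (w, k)-minimizers for a reference string and for the same string carrying one substitution. It is a simplified teaching example under stated assumptions: it ranks k-mers lexicographically (minimap2 itself hashes k-mers and handles both strands), the function name `minimizers` and the toy sequences are hypothetical, and it does not reproduce how minimap2_index_modifier stores variant-derived entries, which the abstract does not describe.

```python
def minimizers(sequence, k, w):
    """Return sorted (position, k-mer) pairs chosen as (w, k)-minimizers.

    For every window of w consecutive k-mers, the lexicographically smallest
    k-mer is kept (leftmost on ties). This ordering is an assumption made for
    readability; real aligners typically rank k-mers by a hash value.
    """
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    selected = set()
    for start in range(len(kmers) - w + 1):
        best = min(range(start, start + w), key=lambda i: (kmers[i], i))
        selected.add((best, kmers[best]))
    return sorted(selected)


# Hypothetical toy sequences: the alternate allele differs by one base (A>C).
reference = "ACGTACGT"
alternate = "ACGTCCGT"

ref_minimizers = minimizers(reference, k=3, w=2)
alt_minimizers = minimizers(alternate, k=3, w=2)

# k-mers that are selected only when the variant is present.
new_kmers = {kmer for _, kmer in alt_minimizers} - {kmer for _, kmer in ref_minimizers}
```

In this toy case the substitution introduces minimizer k-mers that a reference-only index would not contain, which is the kind of gap a variant-aware index is meant to close.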


Subject(s)
Genetic Variation, Human Genome, Whole Genome Sequencing, Humans, Whole Genome Sequencing/methods, Genetic Variation/genetics, High-Throughput Nucleotide Sequencing/methods, Single Nucleotide Polymorphism/genetics, Sequence Alignment/methods, Software, Algorithms, Genome-Wide Association Study/methods
3.
PLoS Comput Biol ; 19(1): e1010743, 2023 01.
Article in English | MEDLINE | ID: mdl-36626392

ABSTRACT

Interspecific gene comparisons are the keystones for many areas of biological research and are especially important for the translation of knowledge from model organisms to economically important species. Currently they are hampered by the low resolution of methods based on sequence analysis and by the complex evolutionary history of eukaryotic genes. This is especially critical for plants, whose genomes are shaped by multiple whole genome duplications and subsequent gene loss. This requires the development of new methods for comparing the functions of genes in different species. Here, we report ISEEML (Interspecific Similarity of Expression Evaluated using Machine Learning), a novel machine learning-based algorithm for interspecific gene classification. In contrast to previous studies focused on sequence similarity, our algorithm focuses on functional similarity inferred from the comparison of gene expression profiles. We propose a novel metric for expression pattern similarity, the expression score (ES), that is suitable for species with differing morphologies. As a proof of concept, we compare detailed transcriptome maps of Arabidopsis thaliana (the model species), Zea mays (maize) and Fagopyrum esculentum (common buckwheat), which represent distant clades within flowering plants. The classifier achieved an AUC of 0.91; at an ES threshold of 0.5, the specificity was 94% and the sensitivity was 72%.
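
For readers unfamiliar with the reported metrics, the sketch below shows how sensitivity and specificity follow from applying a score threshold to labelled pairs. It is a generic illustration, not ISEEML code: the function name and the toy scores and labels are hypothetical, and the only value taken from the abstract is the 0.5 threshold.

```python
def sensitivity_specificity(scores, labels, threshold=0.5):
    """Compute (sensitivity, specificity) for a score threshold.

    scores: similarity scores in [0, 1]; labels: 1 for true pairs, 0 otherwise.
    A pair is predicted positive when its score is >= threshold.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if label and predicted:
            tp += 1
        elif label:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: four true pairs followed by four false pairs.
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1, 0.6, 0.3]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(toy_scores, toy_labels)
```

Raising the threshold generally trades sensitivity for specificity, which is consistent with the asymmetric pair of values reported at 0.5.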


Subject(s)
Arabidopsis, Transcriptome, Transcriptome/genetics, Arabidopsis/genetics, Biological Evolution, Plant Gene Expression Regulation/genetics, Zea mays/genetics
4.
Mol Biol Evol ; 37(8): 2279-2286, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32243532

ABSTRACT

The basidiomycete Schizophyllum commune has the highest level of genetic polymorphism known among living organisms. In a previous study, it was also found to have a moderately high per-generation mutation rate of 2×10⁻⁸, likely contributing to its high polymorphism. However, this rate has been measured only in an experiment on Petri dishes, and it is unclear how it translates to natural populations. Here, we used an experimental design that measures the rate of accumulation of de novo mutations in a linearly growing mycelium. We show that S. commune accumulates mutations at a rate of 1.24×10⁻⁷ substitutions per nucleotide per meter of growth, or ∼2.04×10⁻¹¹ per nucleotide per cell division. In contrast to what has been observed in a number of species with extensive vegetative growth, this rate does not decline in the course of propagation of a mycelium. As a result, even a moderate per-cell-division mutation rate in S. commune can translate into a very high per-generation mutation rate when the number of cell divisions between consecutive meioses is large.
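
The relationship between the three rates quoted in this abstract can be checked with simple arithmetic. The sketch below uses only the figures given above; the variable names are hypothetical, and the linear scaling with the number of divisions is an assumption that rests on the abstract's statement that the rate does not decline during propagation.

```python
# Figures quoted in the abstract (per nucleotide).
RATE_PER_METER = 1.24e-7        # substitutions per meter of growth
RATE_PER_DIVISION = 2.04e-11    # substitutions per cell division (approximate)
RATE_PER_GENERATION = 2e-8      # earlier Petri-dish estimate, per generation

# Implied number of cell divisions per meter of mycelial growth.
divisions_per_meter = RATE_PER_METER / RATE_PER_DIVISION

# Divisions between meioses needed to reach the earlier per-generation rate.
divisions_to_match = RATE_PER_GENERATION / RATE_PER_DIVISION


def per_generation_rate(n_divisions, per_division=RATE_PER_DIVISION):
    """Per-generation rate, assuming mutations accumulate linearly with divisions."""
    return n_divisions * per_division
```

Under these assumptions the two measured rates imply roughly 6,000 cell divisions per meter, and about 1,000 divisions between meioses would already reproduce the earlier per-generation estimate.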


Subject(s)
Mutation Rate, Schizophyllum/genetics, Mutation Accumulation, Mycorrhizae/genetics, Mycorrhizae/growth & development, Genetic Polymorphism, Schizophyllum/growth & development
5.
Nucleic Acids Res ; 47(21): e135, 2019 12 02.
Article in English | MEDLINE | ID: mdl-31511888

ABSTRACT

As the use of next-generation sequencing (NGS) for the diagnosis of Mendelian diseases expands, the performance of this method has to be improved in order to achieve higher quality. Typically, performance measures are designed in the context of each application and, therefore, account for a spectrum of clinically relevant variants. We present EphaGen, a new computational methodology for bioinformatics quality control (QC). Given a single NGS dataset in BAM format and a pre-compiled VCF file of targeted clinically relevant variants, it associates this dataset with a single arbiter parameter. Intrinsically, EphaGen estimates the probability of missing any variant from the defined spectrum within a particular NGS dataset. Such a performance measure closely resembles the diagnostic sensitivity of a given NGS dataset. Here we present case studies of the use of EphaGen in the context of BRCA1/2 and CFTR sequencing in a series of 14 runs across 43 blood samples and 504 publicly available NGS datasets. EphaGen is superior to conventional bioinformatics metrics such as coverage depth and coverage uniformity. We recommend using this software as a QC step in NGS studies in the clinical context. Availability: https://github.com/m4merg/EphaGen or https://hub.docker.com/r/m4merg/ephagen.
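
To make the idea of a "probability of missing a variant" concrete, here is a deliberately simple toy model. It is not EphaGen's algorithm, which the abstract does not specify: the binomial read-sampling model, the minimum of three supporting reads, the 0.5 allele fraction for a heterozygous variant, and all function names are assumptions chosen for illustration only.

```python
from math import comb


def detection_probability(depth, min_alt_reads=3, alt_fraction=0.5):
    """P(at least min_alt_reads variant reads) under Binomial(depth, alt_fraction).

    Toy assumption: a heterozygous variant is called once enough reads support it.
    """
    if depth < min_alt_reads:
        return 0.0
    p_too_few = sum(
        comb(depth, k) * alt_fraction**k * (1 - alt_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_too_few


def expected_sensitivity(variants):
    """Frequency-weighted detection probability over a variant spectrum.

    variants: iterable of (weight, depth) pairs, where weight is the relative
    frequency of the variant in the target spectrum (hypothetical input format).
    """
    variants = list(variants)
    total_weight = sum(weight for weight, _ in variants)
    return sum(w * detection_probability(d) for w, d in variants) / total_weight
```

For example, `expected_sensitivity([(0.7, 100), (0.3, 2)])` is pulled down to about 0.7 by the poorly covered second site, even though the mean depth looks adequate; this is the sort of case where a spectrum-aware measure can be more informative than average coverage.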


Subject(s)
Computational Biology/methods, High-Throughput Nucleotide Sequencing/methods, Single Nucleotide Polymorphism/genetics, Quality Control, Software, BRCA1 Protein/genetics, BRCA2 Protein/genetics, Breast Neoplasms/genetics, Cystic Fibrosis Transmembrane Conductance Regulator/genetics, Female, Human Genome, Genomics/methods, Humans, Mendelian Randomization Analysis/methods
6.
Int J Mol Sci ; 22(11)2021 May 27.
Article in English | MEDLINE | ID: mdl-34072144

ABSTRACT

Cysteine-rich peptides (CRPs) play an important role in plant physiology. However, their role in resistance induced by biogenic elicitors remains poorly understood. Using whole-genome transcriptome sequencing and our CRP search algorithm, we analyzed the repertoire of CRPs in tomato Solanum lycopersicum L. in response to Fusarium oxysporum infection and elicitors from F. sambucinum. We revealed 106 putative CRP transcripts belonging to different families of antimicrobial peptides (AMPs), signaling peptides (RALFs), and peptides with non-defense functions (Major pollen allergen of Olea europaea (Ole e 1 and 6), Maternally Expressed Gene (MEG), Epidermal Patterning Factor (EPF)), as well as pathogenesis-related proteins of families 1 and 4 (PR-1 and 4). We discovered a novel type of 10-Cys-containing hevein-like AMPs named SlHev1, which was up-regulated both by infection and elicitors. Transcript profiling showed that F. oxysporum infection and F. sambucinum elicitors changed the expression levels of different overlapping sets of CRP genes, suggesting the diversification of functions in CRP families. We showed that non-specific lipid transfer proteins (nsLTPs) and snakins mostly contribute to the response of tomato plants to the infection and the elicitors. The involvement of CRPs with non-defense function in stress reactions was also demonstrated. The results obtained shed light on the mode of action of F. sambucinum elicitors and the role of CRP families in the immune response in tomato.


Subject(s)
Cysteine, Disease Resistance/genetics, Peptides/genetics, Plant Diseases/genetics, Plant Diseases/microbiology, Solanum lycopersicum/genetics, Solanum lycopersicum/microbiology, Amino Acid Motifs, Amino Acid Sequence, Computational Biology/methods, Conserved Sequence, Cysteine/chemistry, Cysteine/genetics, Disease Resistance/immunology, Gene Expression Profiling, Solanum lycopersicum/immunology, Molecular Models, Peptides/chemistry, Plant Diseases/immunology, Plant Proteins/chemistry, Plant Proteins/genetics, Protein Conformation, Transcriptome
8.
BMC Genomics ; 21(1): 331, 2020 Apr 29.
Article in English | MEDLINE | ID: mdl-32349672

ABSTRACT

BACKGROUND: Salivary cell secretion (SCS) plays a critical role in blood feeding by medicinal leeches, making them of use for certain medical purposes even today. RESULTS: We annotated the Hirudo medicinalis genome and performed RNA-seq on salivary cells isolated from three closely related leech species, H. medicinalis, Hirudo orientalis, and Hirudo verbana. Differential expression analysis verified by proteomics identified salivary cell-specific genes, many of which encode previously unknown salivary components. However, the genes encoding known anticoagulants were found to be expressed not only in salivary cells. The function-related analysis of the unique salivary cell genes enabled an update of the concept of interactions between salivary proteins and components of haemostasis. CONCLUSIONS: Here we report a draft genome of Hirudo medicinalis and describe the identification of novel salivary proteins and new homologs of genes encoding known anticoagulants in the transcriptomes of three medicinal leech species. Our data provide new insights into the genetics of the blood-feeding lifestyle in leeches.


Subject(s)
Genome, Hirudo medicinalis/genetics, Salivary Proteins and Peptides/genetics, Animals, Anticoagulants/metabolism, Gene Expression Profiling, Gene Expression Regulation, Hirudo medicinalis/metabolism, Leeches/classification, Leeches/genetics, Leeches/metabolism, Proteomics, Saliva/metabolism, Salivary Proteins and Peptides/metabolism
9.
Mol Biol Evol ; 36(1): 127-140, 2019 01 01.
Article in English | MEDLINE | ID: mdl-30376122

ABSTRACT

The beginning of civilization was a turning point in human evolution. With increasing separation from the natural environment, mankind stimulated new adaptive reactions in response to new environmental factors. In this paper, we describe direct signs of these reactions in the European population during the past 6,000 years. By comparing whole-genome data between Late Neolithic/Bronze Age individuals and modern Europeans, we revealed biological pathways that are significantly differently enriched in nonsynonymous single nucleotide polymorphisms in these two groups and which therefore could be shaped by cultural practices during the past six millennia. They include metabolic transformations, immune response, signal transduction, physical activity, sensory perception, reproduction, and cognitive functions. We demonstrated that these processes were influenced by different types of natural selection. We believe that our study opens new perspectives for more detailed investigations about when and how civilization has been modifying human genomes.


Subject(s)
Civilization, Molecular Evolution, Human Genome, Single Nucleotide Polymorphism, White People/genetics, Humans, Metabolic Networks and Pathways, Genetic Selection
10.
Int J Mol Sci ; 21(8)2020 Apr 14.
Article in English | MEDLINE | ID: mdl-32295185

ABSTRACT

Accumulation of lipid-laden (foam) cells in the arterial wall is known to be the earliest step in the pathogenesis of atherosclerosis. There is almost no doubt that atherogenic modified low-density lipoproteins (LDL) are the main sources of accumulating lipids in foam cells. Atherogenic modified LDL are taken up by arterial cells, such as macrophages, pericytes, and smooth muscle cells in an unregulated manner bypassing the LDL receptor. The present study was conducted to reveal possible common mechanisms in the interaction of macrophages with associates of modified LDL and non-lipid latex particles of a similar size. To determine regulatory pathways that are potentially responsible for cholesterol accumulation in human macrophages after the exposure to naturally occurring atherogenic or artificially modified LDL, we used transcriptome analysis. Previous studies of our group demonstrated that any type of LDL modification facilitates the self-association of lipoprotein particles. The size of such self-associates hinders their interaction with a specific LDL receptor. As a result, self-associates are taken up by nonspecific phagocytosis bypassing the LDL receptor. That is why we used latex beads as a stimulator of macrophage phagocytotic activity. We revealed at least 12 signaling pathways that were regulated by the interaction of macrophages with the multiple-modified atherogenic naturally occurring LDL and with latex beads in a similar manner. Therefore, modified LDL was shown to stimulate phagocytosis through the upregulation of certain genes. We have identified at least three genes (F2RL1, EIF2AK3, and IL15) encoding inflammatory molecules and associated with signaling pathways that were upregulated in response to the interaction of modified LDL with macrophages. Knockdown of two of these genes, EIF2AK3 and IL15, completely suppressed cholesterol accumulation in macrophages. Correspondingly, the upregulation of EIF2AK3 and IL15 promoted cholesterol accumulation. 
These data confirmed our hypothesis of the following chain of events in atherosclerosis: LDL particles undergo atherogenic modification; this is accompanied by the formation of self-associates; large LDL associates stimulate phagocytosis; as a result of phagocytosis stimulation, pro-inflammatory molecules are secreted; these molecules cause or at least contribute to the accumulation of intracellular cholesterol. This chain of events may explain the relationship between cholesterol accumulation and inflammation. The primary sequence of events in this chain is related to inflammatory response rather than cholesterol accumulation.


Subject(s)
Cholesterol/metabolism, Foam Cells/metabolism, Lipid Metabolism, Signal Transduction, Biomarkers, Disease Susceptibility, Foam Cells/pathology, Gene Expression Profiling, Humans, Inflammation/etiology, Inflammation/metabolism, Inflammation/pathology, Inflammation Mediators/metabolism, Macrophages/metabolism, Macrophages/pathology, Biological Models
12.
BMC Plant Biol ; 19(Suppl 1): 49, 2019 Feb 15.
Article in English | MEDLINE | ID: mdl-30813912

ABSTRACT

BACKGROUND: A transcriptome map is a powerful tool for a variety of biological studies; transcriptome maps that include different organs, tissues, cells and stages of development are currently available for at least 30 plants. Some of them include samples subjected to environmental or biotic stresses. However, most studies explore only a limited set of organs and developmental stages (leaves or seedlings). In order to provide a broader view of organ-specific strategies of cold stress response, we used RNA-seq to study the expression changes that follow exposure to cold (+4 °C) in different aerial parts of the plant: cotyledons, hypocotyl, leaves, young flowers, mature flowers and seeds. RESULTS: The results on differential expression in leaves are congruent with current knowledge of stress response pathways, in particular the role of CBF genes. In other organs, both the nature and the dynamics of gene expression changes are different. We show that genes confined to narrow expression patterns under non-stress conditions are involved in the stress response. In particular, the genes that control cell wall modification in pollen are activated in leaves. In seeds, the predominant pattern is a change in lipid metabolism. CONCLUSIONS: Stress response is highly organ-specific; different pathways are involved in this process in each type of organ. The results were integrated with the previously published transcriptome map of Arabidopsis thaliana and used for an update of the public database TraVa: http://travadb.org/browse/Species=AthStress .


Subject(s)
Arabidopsis Proteins/metabolism, Arabidopsis/metabolism, Arabidopsis/genetics, Arabidopsis Proteins/genetics, Cold-Shock Response/genetics, Cold-Shock Response/physiology, Gene Expression Profiling, Plant Gene Expression Regulation, Transcriptome/genetics
13.
BMC Microbiol ; 19(1): 160, 2019 07 12.
Article in English | MEDLINE | ID: mdl-31299889

ABSTRACT

BACKGROUND: All living organisms experience physiological changes regulated by endogenous circadian rhythms. The main factor controlling the circadian clock is the duration of daylight. The aim of this research was to identify the impact of various lighting conditions on physiological parameters and gut microbiota composition in rats. Three groups of outbred rats were subjected to normal light-dark cycles (LD), darkness (DD) and constant lighting (LL). RESULTS: After 1 and 3 months we studied urinary catecholamine levels in rats; indicators of lipid peroxidation and antioxidant activity in the blood; protein levels of BMAL1, CLOCK and THRA in the hypothalamus; and the composition and functional activity of the gut microbiota (GM). Subjecting the rats to conditions promoting desynchronosis for 3 months caused disruptions in homeostasis. CONCLUSIONS: Changing the lighting conditions led to changes in almost all the physiological parameters that we studied. Catecholamines can be regarded as a synchronization super system of split-level circadian oscillators. We established a correlation between hypothalamic levels of Bmal1 and urinary catecholamine concentrations. The magnitude of changes in the GM taxonomic composition was different for LL/LD and DD/LD, but the direction of these changes was similar. The predicted functional properties of the GM, which characterize its metabolic activity, did not change as dramatically as the taxonomic composition. All differences may be viewed as a compensatory reaction to new environmental conditions, and the organism has adapted to those conditions.


Subject(s)
Catecholamines/urine, Circadian Clocks/physiology, Circadian Rhythm Signaling Peptides and Proteins/metabolism, Circadian Rhythm/physiology, Gastrointestinal Microbiome/physiology, Reactive Oxygen Species/metabolism, Animals, Darkness, Light, Male, Rats
14.
Plant J ; 91(2): 278-291, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28387959

ABSTRACT

Polyploidization and subsequent sub- and neofunctionalization of duplicated genes represent a major mechanism of plant genome evolution. Capsella bursa-pastoris, a widespread ruderal plant, is a recent allotetraploid and, thus, is an ideal model organism for studying early changes following polyploidization. We constructed a high-quality assembly of C. bursa-pastoris genome and a transcriptome atlas covering a broad sample of organs and developmental stages (available online at http://travadb.org/browse/Species=Cbp). We demonstrate that expression of homeologs is mostly symmetric between subgenomes, and identify a set of homeolog pairs with discordant expression. Comparison of promoters within such pairs revealed emerging asymmetry of regulatory elements. Among them there are multiple binding sites for transcription factors controlling the regulation of photosynthesis and plant development by light (PIF3, HY5) and cold stress response (CBF). These results suggest that polyploidization in C. bursa-pastoris enhanced its plasticity of response to light and temperature, and allowed substantial expansion of its distribution range.


Subject(s)
Capsella/genetics, Plant Gene Expression Regulation, Plant Genome, Polyploidy, Nucleic Acid Regulatory Sequences, Molecular Sequence Annotation
15.
Nucleic Acids Res ; 44(D1): D116-25, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26586801

ABSTRACT

Models of transcription factor (TF) binding sites provide a basis for a wide spectrum of studies in regulatory genomics, from reconstruction of regulatory networks to functional annotation of transcripts and sequence variants. While TFs may recognize different sequence patterns in different conditions, it is pragmatic to have a single generic model for each particular TF as a baseline for practical applications. Here we present the expanded and enhanced version of HOCOMOCO (http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco10), the collection of models of DNA patterns recognized by transcription factors. HOCOMOCO now provides position weight matrix (PWM) models for binding sites of 601 human TFs and, in addition, PWMs for 396 mouse TFs. Furthermore, we introduce the largest collection to date of dinucleotide PWM models, covering 86 human and 52 mouse TFs. The update is based on the analysis of massive ChIP-Seq and HT-SELEX datasets, with the validation of the resulting models on in vivo data. To facilitate practical application, all HOCOMOCO models are linked to gene and protein databases (Entrez Gene, HGNC, UniProt) and accompanied by precomputed score thresholds. Finally, we provide command-line tools for PWM and diPWM threshold estimation and motif finding in nucleotide sequences.
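
The combination of a PWM with a precomputed score threshold is typically used to scan sequences for candidate binding sites. The sketch below illustrates that general technique only; it is not one of the HOCOMOCO command-line tools, and the function name, the three-column toy matrix, and the threshold value are all hypothetical. It assumes the sequence contains only A, C, G and T.

```python
def scan_with_pwm(sequence, pwm, threshold):
    """Return (position, score) for each window scoring at or above threshold.

    pwm is a list of per-position dicts mapping each base to its weight; the
    score of a window is the sum of the weights of its bases.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical toy matrix that favours the motif "ACG".
TOY_PWM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]

toy_hits = scan_with_pwm("TTACGTACG", TOY_PWM, threshold=3.0)
```

A real scan would normally also score the reverse-complement strand and use a threshold derived from a target P-value rather than a hand-picked number.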


Subject(s)
Genetic Databases, Transcriptional Regulatory Elements, Transcription Factors/metabolism, Animals, Binding Sites, Chromatin Immunoprecipitation, Humans, Mice, Biological Models, DNA Sequence Analysis
16.
Plant J ; 88(6): 1058-1070, 2016 12.
Article in English | MEDLINE | ID: mdl-27549386

ABSTRACT

Arabidopsis thaliana is a long-established model species for plant molecular biology, genetics and genomics, and studies of A. thaliana gene function provide the basis for formulating hypotheses and designing experiments involving other plants, including economically important species. A comprehensive understanding of the A. thaliana genome and a detailed and accurate understanding of the expression of its associated genes is therefore of great importance for both fundamental research and practical applications. This goal relies on the development of new genetic and genomic resources, involving new methods of data acquisition and analysis. We present here a genome-wide analysis of A. thaliana gene expression profiles across different organs and developmental stages using high-throughput transcriptome sequencing. The expression of 25,706 protein-coding genes, as well as their stability and their spatiotemporal specificity, was assessed in 79 organs and developmental stages. A search for alternative splicing events identified 37,873 previously unreported splice junctions, approximately 30% of which occurred in intergenic regions. These potentially represent novel spliced genes that are not included in the TAIR10 database. These data are housed in an open-access web-based database, TraVA (Transcriptome Variation Analysis, http://travadb.org/), which allows visualization and analysis of gene expression profiles and differential gene expression between organs and developmental stages.


Subject(s)
Arabidopsis Proteins/metabolism, Arabidopsis/metabolism, Transcriptome/genetics, Alternative Splicing/genetics, Arabidopsis/genetics, Arabidopsis Proteins/genetics, Computational Biology, Gene Expression Profiling, Plant Gene Expression Regulation/genetics, Transcription Factors/genetics, Transcription Factors/metabolism
17.
BMC Evol Biol ; 17(Suppl 2): 258, 2017 12 28.
Article in English | MEDLINE | ID: mdl-29297306

ABSTRACT

BACKGROUND: The gray whale, Eschrichtius robustus (E. robustus), is the sole member of the family Eschrichtiidae, which is considered to be the most primitive in the class Cetacea. The gray whale is often described as a "living fossil". It is adapted to extreme marine conditions and has a high life expectancy (77 years). The assembly of a gray whale genome and transcriptome will enable further studies of whale evolution, longevity, and resistance to extreme environments. RESULTS: In this work, we report the first de novo assembly and primary analysis of the E. robustus genome and transcriptome based on kidney and liver samples. The presented draft genome assembly is 55% complete in terms of total genome length, but only 24% complete in terms of BUSCO complete gene groups, although 10,895 genes were identified. Transcriptome annotation and comparison with other whale species revealed robust expression of DNA repair and hypoxia-response genes, which is expected for whales. CONCLUSIONS: This preliminary study of the gray whale genome and transcriptome provides new data to better understand whale evolution and the mechanisms of adaptation to hypoxic conditions.


Subject(s)
Genome, Transcriptome/genetics, Whales/genetics, Animals, Gene Expression Regulation, Gene Library, Molecular Sequence Annotation, Phylogeny
18.
Microb Ecol ; 70(3): 819-34, 2015 Oct.
Article in English | MEDLINE | ID: mdl-25894918

ABSTRACT

In this study, we report the first completely annotated genome sequence of a Bifidobacterium longum subsp. longum strain of Russian origin, GT15. Comparative genomic analysis of this genome with other available completely annotated genome sequences of B. longum strains isolated in other countries revealed a high degree of conservation and synteny across the entire genomes. However, open reading frames for 35 genes were detected only in the B. longum GT15 genome and were absent from the genomes of other B. longum strains (not of Russian origin). These so-called unique genes (UGs) have a total length of 39,066 bp, with G + C content ranging from 37 to 65%. Interestingly, certain of these genes were detected in other B. longum strains of Russian origin. In our analysis, we examined genes for global regulatory systems: proteins of type II toxin-antitoxin (TA) systems, serine/threonine protein kinases (STPKs) of eukaryotic type, and genes of the WhiB-like family proteins. In addition, we performed an in silico analysis of the most significant probiotic genes and considered genes involved in epigenetic regulation and genes responsible for producing various neuromediators. This genome sequence may elucidate the biology of this probiotic strain as a promising candidate for practical (pharmaceutical) applications.
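
The G + C content quoted for the unique genes is the percentage of guanine and cytosine among the unambiguous bases of a sequence. A minimal sketch of that calculation follows; the function name is hypothetical, and skipping ambiguous bases such as N is an assumption about how such values are usually computed, not a detail taken from the study.

```python
def gc_content(sequence):
    """Return the G + C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; other symbols such as N
    are ignored. Raises ValueError if no unambiguous base is present.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)
```

Genes whose G + C content departs strongly from the genome-wide average are often examined as candidates for horizontal acquisition, which is one reason the range is worth reporting for strain-specific genes.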


Subject(s)
Bifidobacterium/genetics, Bacterial Chromosomes/genetics, Bacterial Genome, Bifidobacterium/metabolism, Chromosome Mapping, Bacterial Chromosomes/metabolism, Genetic Epigenesis, Molecular Sequence Data, Phylogeny, Russian Federation, DNA Sequence Analysis
19.
Nucleic Acids Res ; 41(Database issue): D195-202, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23175603

ABSTRACT

Transcription factor (TF) binding site (TFBS) models are crucial for computational reconstruction of transcription regulatory networks. In existing repositories, a TF often has several models (also called binding profiles or motifs), obtained from different experimental data. Having a single TFBS model for a TF is more pragmatic for practical applications. We show that integration of TFBS data from various types of experiments into a single model typically results in improved model quality, probably due to partial correction of source-specific technique bias. We present the Homo sapiens comprehensive model collection (HOCOMOCO, http://autosome.ru/HOCOMOCO/, http://cbrc.kaust.edu.sa/hocomoco/) containing carefully hand-curated TFBS models constructed by integration of binding sequences obtained by both low- and high-throughput methods. To construct position weight matrices to represent these TFBS models, we used ChIPMunk software in four computational modes, including a newly developed periodic positional prior mode associated with DNA helix pitch. We selected only one TFBS model per TF, unless there was clear experimental evidence for two rather distinct TFBS models. We assigned a quality rating to each model. HOCOMOCO contains 426 systematically curated TFBS models for 401 human TFs, where 172 models are based on more than one data source.
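
As background on what a position weight matrix is, the following sketch derives one from a set of already aligned binding sites using log-odds against a uniform background. This is the textbook construction only: it does not reproduce ChIPMunk, which also has to discover the alignment, and the function name, pseudocount, background frequency, and toy sites are assumptions made for illustration.

```python
from math import log


def build_pwm(aligned_sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned sites.

    Each column maps a base to ln(frequency / background), where frequency is
    the pseudocount-smoothed share of that base at the position.
    """
    width = len(aligned_sites[0])
    if any(len(site) != width for site in aligned_sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for position in range(width):
        counts = {base: pseudocount for base in "ACGT"}
        for site in aligned_sites:
            counts[site[position]] += 1
        total = sum(counts.values())
        pwm.append({base: log(counts[base] / total / background) for base in "ACGT"})
    return pwm


# Hypothetical aligned sites sharing a "CG" core.
toy_sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
toy_pwm = build_pwm(toy_sites)
```

Positions where every site agrees, such as the second column here, receive a strongly positive weight for the conserved base and negative weights for the others, which is what lets a PWM discriminate sites from background sequence.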


Subject(s)
Genetic Databases, Transcriptional Regulatory Elements, Transcription Factors/metabolism, Binding Sites, Humans, Internet, Genetic Models, Position-Specific Scoring Matrices
20.
Plant Methods ; 20(1): 128, 2024 Aug 17.
Article in English | MEDLINE | ID: mdl-39152473

ABSTRACT

BACKGROUND: As the genomes of many eukaryotic species, especially plants, are large and complex, their de novo sequencing and assembly is still a difficult task despite progress in sequencing technologies. An alternative to genome assembly is the assembly of the transcriptome, the set of RNA products of the expressed genes. While a number of de novo transcriptome assemblers exist, the challenges of transcriptomes (the existence of isoforms, the uneven expression levels across genes) complicate the generation of high-quality assemblies suitable for downstream analyses. RESULTS: We developed Trans2express, a web-based tool and a pipeline for de novo hybrid transcriptome assembly and postprocessing based on rnaSPAdes with a set of subsequent filtering steps. The pipeline was tested on Arabidopsis thaliana cDNA sequencing data obtained using Illumina and Oxford Nanopore Technologies platforms and on three non-model plant species. The comparison of structural characteristics of the transcriptome assembly with the reference Arabidopsis genome revealed the high quality of the assembled transcriptome, with 86.1% of Arabidopsis expressed genes assembled as a single contig. We tested the applicability of the transcriptome assembly for gene expression analysis. For both Arabidopsis and the non-model species, the results showed high congruence of gene expression levels and sets of differentially expressed genes between analyses based on the genome and those based on the transcriptome assembly. CONCLUSIONS: We present Trans2express, a protocol for de novo hybrid transcriptome assembly aimed at recovering a single transcript per gene. We expect this protocol to promote the characterization of transcriptomes and gene expression analysis in non-model plants, and the web-based tool to be of use to a wide range of plant biologists.
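
A statistic such as "genes assembled as a single contig" can be derived from a table that assigns each assembled contig to a reference gene. The sketch below shows one plausible way to compute it; it is not part of Trans2express, the function name and input format are hypothetical, and it counts only genes that received at least one contig, which may differ from the denominator the authors used.

```python
from collections import defaultdict


def single_contig_fraction(contig_gene_pairs):
    """Fraction of covered genes that are represented by exactly one contig.

    contig_gene_pairs: iterable of (contig_id, gene_id) assignments, e.g. taken
    from alignments of contigs to an annotated reference (assumed input).
    """
    contigs_per_gene = defaultdict(set)
    for contig_id, gene_id in contig_gene_pairs:
        contigs_per_gene[gene_id].add(contig_id)
    if not contigs_per_gene:
        raise ValueError("no contig-to-gene assignments given")
    single = sum(1 for contigs in contigs_per_gene.values() if len(contigs) == 1)
    return single / len(contigs_per_gene)


# Hypothetical assignments: gene g2 is split across two contigs.
toy_pairs = [("c1", "g1"), ("c2", "g2"), ("c3", "g2"), ("c4", "g3"), ("c5", "g4")]
toy_fraction = single_contig_fraction(toy_pairs)
```

Genes split across several contigs, or represented by several isoform contigs, lower this fraction, which is why it serves as a compact indicator of how close an assembly comes to one transcript per gene.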
