1.
Heliyon ; 10(15): e35456, 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39170392

ABSTRACT

Streptococcus suis (S. suis) is a Gram-positive bacterium and the main culprit behind zoonotic outbreaks, posing a serious threat to public health. The prevalent strains in China are mainly of sequence types (ST) 1 and 7, with few human infections caused by other sequence types reported. This study presents the first isolation of an ST25 strain from the blood of a septicemic patient. A 57-year-old febrile patient was admitted to a hospital in Hainan, China, and diagnosed with septicemia and hepatic dysfunction. A strain of S. suis was isolated from blood culture and confirmed to be serotype 2 and ST25 through 16S rRNA sequencing and whole-genome sequencing, and its genome was further analyzed for gene functions and the presence of drug resistance genes. The full-length genome of strain HN28 spans 2,280,124 bp and encodes a total of 2291 proteins. Genes annotated in the COG, GO, KEGG, CAZy, and PHI databases accounted for 75.38%, 69.14%, 55.35%, 4.58%, and 11.87% of the total predicted proteins, respectively. Virulence factor analysis revealed the presence of seven putative virulence genes in strain HN28. Analysis using the CARD database identified 51 resistance genes in HN28, alongside abundant exocytosis systems. These findings underscore the occurrence of S. suis infections in humans caused by less common STs, emphasizing the need for enhanced epidemiological investigations and monitoring of S. suis infections in the human population.

2.
Curr Protoc ; 4(8): e1120, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39126338

ABSTRACT

JBrowse 2 is a modular genome browser that can visualize many common genomic file formats. While JBrowse 2 supports a variety of different usages, it is particularly suited for deployment on websites, such as model organism databases or other web-based genomic data resources. This protocol provides detailed instructions for setting up JBrowse 2 on an Ubuntu Linux web server, loading a reference genome from a FASTA format file, and adding a gene annotation track from a GFF3 format file. By the end of the protocol, users will have a working JBrowse 2 instance that is accessible via the web. © 2024 The Author(s). Current Protocols published by Wiley Periodicals LLC. Basic Protocol: Setting up JBrowse 2 on your web server.


Subject(s)
Genomics , Genomics/methods , Software , Web Browser , Databases, Genetic , Internet , Genome/genetics , Humans , User-Computer Interface
3.
BMC Genomics ; 25(1): 775, 2024 Aug 09.
Article in English | MEDLINE | ID: mdl-39118001

ABSTRACT

BACKGROUND: Appropriate regulation of genes expressed in oocytes and embryos is essential for acquisition of developmental competence in mammals. Here, we hypothesized that several genes expressed in oocytes and pre-implantation embryos remain unknown. Our goal was to reconstruct the transcriptome of oocytes (germinal vesicle and metaphase II) and pre-implantation cattle embryos (blastocysts) using short-read and long-read sequences to identify putative new genes. RESULTS: We identified 274,342 transcript sequences and 3,033 of those loci do not match a gene present in official annotations and thus are potential new genes. Notably, 63.67% (1,931/3,033) of potential novel genes exhibited coding potential. Also noteworthy, 97.92% of the putative novel genes overlapped annotation with transposable elements. Comparative analysis of transcript abundance identified that 1,840 novel genes (recently added to the annotation) or potential new genes were differentially expressed between developmental stages (FDR < 0.01). We also determined that 522 novel or potential new genes (448 and 34, respectively) were upregulated at eight-cell embryos compared to oocytes (FDR < 0.01). In eight-cell embryos, 102 novel or putative new genes were co-expressed (|r| > 0.85, P < 1 × 10⁻⁸) with several genes annotated with gene ontology biological processes related to pluripotency maintenance and embryo development. CRISPR-Cas9 genome editing confirmed that the disruption of one of the novel genes highly expressed in eight-cell embryos reduced blastocyst development (ENSBTAG00000068261, P = 1.55 × 10⁻⁷). CONCLUSIONS: Our results revealed several putative new genes that need careful annotation. Many of the putative new genes have dynamic regulation during pre-implantation development and are important components of gene regulatory networks involved in pluripotency and blastocyst formation.


Subject(s)
Blastocyst , Embryonic Development , Gene Expression Regulation, Developmental , Oocytes , Animals , Cattle , Embryonic Development/genetics , Oocytes/metabolism , Blastocyst/metabolism , Transcriptome , Molecular Sequence Annotation , Gene Expression Profiling , Female
4.
Front Microbiol ; 15: 1410024, 2024.
Article in English | MEDLINE | ID: mdl-38962131

ABSTRACT

The Deinococcus genus is renowned for its remarkable resilience against environmental stresses, including ionizing radiation, desiccation, and oxidative damage. This resilience is attributed to its sophisticated DNA repair mechanisms and robust defense systems, enabling it to recover from extensive damage and thrive under extreme conditions. Central to Deinococcus research, the D. radiodurans strains ATCC BAA-816 and ATCC 13939 facilitate extensive studies into this remarkably resilient genus. This study focused on delineating genetic discrepancies between these strains by sequencing our laboratory's ATCC 13939 specimen (ATCC 13939K) and juxtaposing it with ATCC BAA-816. We uncovered 436 DNA sequence differences within ATCC 13939K, including 100 single nucleotide variations, 278 insertions, and 58 deletions, which could induce frameshifts altering protein-coding genes. Gene annotation revisions accounting for gene fusions and the reconciliation of gene lengths uncovered novel protein-coding genes and refined the functional categorizations of established ones. Additionally, the analysis pointed out genome structural variations due to insertion sequence (IS) elements, underscoring the D. radiodurans genome's plasticity. Notably, ATCC 13939K exhibited a loss of six ISDra2 elements relative to BAA-816, restoring genes fragmented by ISDra2, such as those encoding α/β hydrolase and serine protease, and revealing new open reading frames, including genes imperative for acetoin decomposition. This comparative genomic study offers vital insights into the metabolic capabilities and resilience strategies of D. radiodurans.

5.
Bio Protoc ; 14(13): e5023, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-39007158

ABSTRACT

In recent years, the increase in genome sequencing across diverse plant species has provided a significant advantage for phylogenomics studies, allowing the analysis of one of the most diverse gene families in plants: nucleotide-binding leucine-rich repeat receptors (NLRs). However, due to the sequence diversity of the NLR gene family, identifying key molecular features and functionally conserved sequence patterns is challenging through multiple sequence alignment. Here, we present a step-by-step protocol for a computational pipeline designed to identify evolutionarily conserved motifs in plant NLR proteins. In this protocol, we use a large-scale NLR dataset, including 1,862 NLR genes annotated from monocot and dicot species, to predict conserved sequence motifs, such as the MADA and EDVID motifs, within the coiled-coil (CC)-NLR subfamily. Our pipeline can be applied to identify molecular signatures that have remained conserved in the gene family over evolutionary time across plant species. Key features • Phylogenomics analysis of plant NLR immune receptor family. • Identification of functionally conserved sequence patterns among plant NLRs.

6.
PeerJ ; 12: e17651, 2024.
Article in English | MEDLINE | ID: mdl-38993980

ABSTRACT

Background: Genomic resource development for non-model organisms is rapidly progressing, seeking to uncover molecular mechanisms and evolutionary adaptations that enable organisms to thrive in diverse environments. Limited genomic data for bat species hinder insights into their evolutionary processes, particularly within the diverse Myotis genus of the Vespertilionidae family. In Mexico, 15 Myotis species exist, with three (M. vivesi, M. findleyi, and M. planiceps) being endemic and of conservation concern. Methods: We obtained samples of Myotis vivesi, M. findleyi, and M. planiceps for genomic analysis. Genomic DNA from each of the three species was extracted, sequenced, and assembled. Scaffolding was carried out with the ntJoin program, using the M. yumanensis genome as the reference. GapCloser was employed to fill gaps. Repeat elements were characterized, and gene prediction was done via ab initio and homology methods with the MAKER pipeline. Functional annotation involved InterProScan, BLASTp, and KEGG. Non-coding RNAs were annotated with INFERNAL and tRNAscan-SE. Orthologous genes were clustered using OrthoFinder, and a phylogenomic tree was reconstructed using IQ-TREE. Results: We present genome assemblies of these endemic species using Illumina NovaSeq 6000, each exceeding 2.0 Gb, with over 90% representing single-copy genes according to BUSCO analyses. Transposable elements, including LINEs and SINEs, constitute over 30% of each genome. Helitrons, consistent with Vespertilionids, were identified. Gene annotation yielded around 20,000 genes for each of the three assemblies, together with their associated functions. Comparative analysis of orthologs among eight Myotis species revealed 20,820 groups, with 4,789 being single-copy orthogroups. Non-coding RNA elements were annotated. Phylogenomic tree analysis supported the evolutionary relationships among chiropterans. These resources contribute significantly to understanding gene evolution and diversification patterns, and aid conservation efforts for these endangered bat species.


Subject(s)
Chiroptera , Genome , Genomics , Phylogeny , Animals , Mexico , Genome/genetics , Chiroptera/genetics , Genomics/methods
7.
Genome Biol Evol ; 16(7), 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38946321

ABSTRACT

Oecanthus is a genus of cricket known for its distinctive chirping and distributed across major zoogeographical regions worldwide. This study focuses on Oecanthus rufescens, and conducts a comprehensive examination of its genome through genome sequencing technologies and bioinformatic analysis. A high-quality chromosome-level genome of O. rufescens was successfully obtained, revealing significant features of its genome structure. The genome size is 877.9 Mb, comprising ten pseudo-chromosomes and 70 other sequences, with a GC content of 41.38% and an N50 value of 157,110,771 bp, indicating a high level of continuity. BUSCO assessment results demonstrate that the genome's integrity and quality are high (of which 96.8% are single-copy and 1.6% are duplicated). Comprehensive genome annotation was also performed, identifying approximately 310 Mb of repetitive sequences, accounting for 35.3% of the total genome sequence, and discovering 15,481 tRNA genes, 4,082 rRNA genes, and 1,212 other noncoding genes. Furthermore, 15,031 protein-coding genes were identified, with BUSCO assessment results showing that 98.4% (of which 96.3% are single-copy and 1.6% are duplicated) of the genes were annotated.


Subject(s)
Genome, Insect , Molecular Sequence Annotation , Animals , Chromosomes, Insect/genetics , Gryllidae/genetics , Orthoptera/genetics , Orthoptera/classification
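
The Oecanthus rufescens record above reports an N50 value and a GC content as assembly summary statistics. As a minimal sketch of how such figures are conventionally computed, the Python below defines two helper functions. The names `n50` and `gc_content` and the toy input values are assumptions made for illustration; they are not taken from the paper or from any particular assembler's output.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one sequence length")


def gc_content(sequence: str) -> float:
    """Return the GC percentage, ignoring ambiguous bases such as N."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


if __name__ == "__main__":
    # Hypothetical toy assembly (lengths in bp), not the O. rufescens data.
    toy_lengths = [80, 70, 50, 40, 30, 20]
    print("N50:", n50(toy_lengths))
    print("GC%:", gc_content("ATGCGCNNAT"))
```

In practice the lengths would come from the scaffolds or contigs of a FASTA file; published N50 values also depend on whether contigs or scaffolds are counted, which this sketch does not distinguish.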
8.
Microbiol Resour Announc ; 13(7): e0035724, 2024 Jul 18.
Article in English | MEDLINE | ID: mdl-38898546

ABSTRACT

As a noteworthy biocontrol fungus, Clonostachys chloroleuca currently lacks a high-quality reference genome. Here, we present the first high-quality genome assembly of C. chloroleuca strain Cc878 achieved through Oxford Nanopore Long-Read sequencing. The nuclear genome of Cc878 was assembled into four contigs, totaling 59.38 Mb.

9.
Genes (Basel) ; 15(5), 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38790180

ABSTRACT

Kohlrabi is an important swollen-stem cabbage variety belonging to the Brassicaceae family. However, few complete chloroplast genome sequences of this genus have been reported. Here, a complete chloroplast genome of 153,364 bp with a circular quadripartite structure was obtained. A total of 132 genes were identified, including 87 protein-coding genes, 37 transfer RNA genes and eight ribosomal RNA genes. The base composition analysis showed that the overall GC content of the complete chloroplast genome sequence was 36.36%. Relative synonymous codon usage (RSCU) analysis showed that most codons with values greater than 1 ended with A or U, while most codons with values less than 1 ended with C or G. Thirty-five scattered repeats were identified and most of them were distributed in the large single-copy (LSC) region. A total of 290 simple sequence repeats (SSRs) were found and 188 of them were distributed in the LSC region. Phylogenetic relationship analysis showed that five Brassica oleracea subspecies were clustered into one group and the kohlrabi chloroplast genome was closely related to that of B. oleracea var. botrytis. Our results provide a basis for chloroplast-dependent metabolic studies and new insight into the polyploidization of Brassicaceae species.


Subject(s)
Brassica , Genome, Chloroplast , Phylogeny , Genome, Chloroplast/genetics , Brassica/genetics , Microsatellite Repeats/genetics , Base Composition/genetics , Codon Usage , Chloroplasts/genetics , Whole Genome Sequencing/methods
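
The kohlrabi record above interprets RSCU values relative to 1. A minimal sketch of the usual definition follows: a codon's RSCU is its observed count divided by the mean count of all synonymous codons for the same amino acid, so a value above 1 means the codon is used more often than expected under uniform usage. The function name `rscu`, the use of the standard genetic code, and the toy sequence are assumptions for illustration; the paper's own software and settings are not stated in the abstract.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks stops).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds: str) -> dict[str, float]:
    """Compute RSCU per codon for one in-frame coding sequence.

    Codons of amino acids that never occur in the input are omitted,
    as are stop codons and codons containing ambiguous bases.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families, one per amino acid.
    families: dict[str, list[str]] = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values: dict[str, float] = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue
        for codon in family:
            # observed count / (total / family size)
            values[codon] = counts[codon] * len(family) / total
    return values


if __name__ == "__main__":
    # Hypothetical toy sequence: four alanine codons, three of them GCT.
    for codon, value in sorted(rscu("GCTGCTGCTGCC").items()):
        print(codon, round(value, 2))
```

A genome-wide analysis would pool counts over all protein-coding genes before normalizing; this sketch handles a single sequence to keep the definition visible.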
10.
BMC Genomics ; 25(1): 430, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38693501

ABSTRACT

BACKGROUND: Although multiple chicken genomes have been assembled and annotated, the numbers of protein-coding genes in chicken genomes and their variation among breeds are still uncertain due to the low quality of these genome assemblies and limited resources used in their gene annotations. To fill these gaps, we recently assembled genomes of four indigenous chicken breeds with distinct traits at the chromosome level. In this study, we annotated genes in each of these assembled genomes using a combination of RNA-seq- and homology-based approaches. RESULTS: We identified varying numbers (17,497-17,718) of protein-coding genes in the four indigenous chicken genomes, while recovering 51 of the 274 "missing" genes in birds in general, and 36 of the 174 "missing" genes in chickens in particular. Intriguingly, based on deeply sequenced RNA-seq data collected in multiple tissues in the four breeds, we found 571-627 protein-coding genes in each genome that were missing from the annotations of the reference chicken genomes (GRCg6a and GRCg7b/w). After removing redundancy, we ended up with a total of 1,420 newly annotated genes (NAGs). The NAGs tend to be found in subtelomeric regions of macro-chromosomes (chr1 to chr5, plus chrZ) and middle chromosomes (chr6 to chr13, plus chrW), as well as in micro-chromosomes (chr14 to chr39) and unplaced contigs, where G/C contents are high. Moreover, the NAGs have elevated G-quadruplex frequencies, while both G/C contents and G-quadruplex frequencies in their surrounding regions are also high. The NAGs showed tissue-specific expression, and we were able to verify 39 (92.9%) of 42 randomly selected ones in various tissues of the four chicken breeds using RT-qPCR experiments. Most of the NAGs were also encoded in the reference chicken genomes; thus, these genomes might harbor more genes than previously thought. CONCLUSION: The NAGs are widely distributed in wild, indigenous and commercial chickens, and they might play critical roles in chicken physiology. Counting these new genes, chicken genomes harbor more genes than originally thought.


Subject(s)
Chickens , Genome , Molecular Sequence Annotation , Animals , Chickens/genetics , Base Composition , Telomere/genetics , Chromosomes/genetics , Genomics/methods
11.
bioRxiv ; 2024 May 10.
Article in English | MEDLINE | ID: mdl-38766115

ABSTRACT

Dendroctonus frontalis, also known as southern pine beetle (SPB), represents the most damaging forest pest in the southeastern United States. Strategies to predict, monitor and suppress SPB outbreaks have had limited success. Genomic data are critical to inform on pest biology and to identify molecular targets to develop improved management approaches. Here, we produced a chromosome-level genome assembly of SPB using long-read sequencing data. Synteny analyses confirmed the conservation of the core coleopteran Stevens elements and validated the bona fide SPB X chromosome. Transcriptomic data were used to obtain 39,588 transcripts corresponding to 13,354 putative protein-coding loci. Comparative analyses of gene content across 14 beetles and 3 other insects revealed several losses of conserved genes in the Dendroctonus clade and gene gains in SPB and Dendroctonus that were enriched for loci encoding membrane proteins and extracellular matrix proteins. While lineage-specific gene losses contributed to the gene content reduction observed in Dendroctonus, we also showed that widespread misannotation of transposable elements represents a major cause of the apparent gene expansion in several non-Dendroctonus species. Our findings uncovered distinctive features of the SPB gene complement and disentangled the role of biological and annotation-related factors contributing to gene content variation across beetles.

12.
BMC Genom Data ; 25(1): 48, 2024 May 23.
Article in English | MEDLINE | ID: mdl-38783174

ABSTRACT

OBJECTIVES: Ottelia Pers. is in the Hydrocharitaceae family. Species in the genus are aquatic, and China is their centre of origin in Asia. Ottelia alismoides (L.) Pers., which is distributed worldwide, is a distinguishing element in China, while other species of this genus are endemic to China. However, O. alismoides is also considered endangered due to habitat loss and pollution in some Asian countries. Ottelia alismoides is the only submerged macrophyte that contains three carbon dioxide-concentrating mechanisms, i.e. bicarbonate (HCO₃⁻) use, crassulacean acid metabolism and the C4 pathway. In this study, we present its first genome assembly to help illustrate the various carbon metabolism mechanisms and to enable genetic conservation in the future. DATA DESCRIPTION: Using DNA and RNA extracted from one O. alismoides leaf, this work produced ~73.4 Gb HiFi reads, ~126.4 Gb whole genome sequencing short reads and ~21.9 Gb RNA-seq reads. The de novo genome assembly was 6,455,939,835 bp in length, with 11,923 scaffolds/contigs and an N50 of 790,733 bp. Genome assembly completeness assessment with Benchmarking Universal Single-Copy Orthologs revealed a score of 94.4%. The repetitive sequence in the assembly was 4,875,817,144 bp (75.5%). A total of 116,176 genes were predicted. The protein sequences were functionally annotated against multiple databases, facilitating comparative genomic analysis.


Subject(s)
Carbon , Genome, Plant , Hydrocharitaceae , Hydrocharitaceae/genetics , Hydrocharitaceae/metabolism , Carbon/metabolism , Molecular Sequence Annotation , Whole Genome Sequencing , China
13.
Front Genet ; 15: 1377130, 2024.
Article in English | MEDLINE | ID: mdl-38694873

ABSTRACT

Introduction: Nellore cattle (Bos taurus indicus) is the main beef cattle breed raised in Brazil. This breed is well adapted to tropical conditions and, more recently, has experienced intensive genetic selection for multiple performance traits. Over the past 43 years, an experimental breeding program has been developed in the Institute of Animal Science (IZ, Sertaozinho, SP, Brazil), which resulted in three differentially selected lines known as Nellore Control (NeC), Nellore Selection (NeS), and Nellore Traditional (NeT). The primary goal of this selection experiment was to determine the response to selection for yearling weight (YW) and residual feed intake (RFI) in Nellore cattle. The main objectives of this study were to: 1) identify copy number variations (CNVs) in Nellore cattle from three selection lines; 2) identify and characterize CNV regions (CNVRs) in these three lines; and 3) perform functional enrichment analyses of the CNVRs identified. Results: A total of 14,914 unique CNVs and 1,884 CNVRs were identified when considering all lines as a single population. The CNVRs were non-uniformly distributed across the chromosomes of the three selection lines included in the study. The NeT line had the highest number of CNVRs (n = 1,493), followed by the NeS (n = 823) and NeC (n = 482) lines. The CNVRs covered 23,449,890 bp (0.94%), 40,175,556 bp (1.61%), and 63,212,273 bp (2.54%) of the genome of the NeC, NeS, and NeT lines, respectively. Two CNVRs were shared by the three lines, and six, two, and four exclusive regions were identified for NeC, NeS, and NeT, respectively. All the exclusive regions overlap with important genes, such as SMARCD3, SLC15A1, and MAPK1. Key biological processes associated with the candidate genes were identified, including pathways related to growth and metabolism. Conclusion: This study revealed large variability in CNVs and CNVRs across three Nellore lines differentially selected for YW and RFI. Gene annotation and gene ontology analyses of the CNVRs exclusive to each line revealed specific genes and biological processes involved in the expression of growth and feed efficiency traits. These findings contribute to the understanding of the genetic mechanisms underlying the phenotypic differences among the three Nellore selection lines.

14.
Insects ; 15(2), 2024 Feb 17.
Article in English | MEDLINE | ID: mdl-38392552

ABSTRACT

Hermetia illucens is a species of great interest for numerous industrial applications. A high-quality reference genome is already available for H. illucens. However, the worldwide maintenance of numerous captive populations of H. illucens, each with its own genotypic and phenotypic characteristics, made it of interest to perform a de novo genome assembly on one population of H. illucens to define a chromosome-scale genome assembly. By combining the PacBio and the Omni-C proximity ligation technologies, a new H. illucens chromosome-scale genome of 888.59 Mb, with a scaffold N50 value of 162.19 Mb, was assembled. The final chromosome-scale assembly obtained a BUSCO completeness of 89.1%. By exploiting the Omni-C proximity ligation technology, topologically associated domains and other topological features that play a key role in the regulation of gene expression were identified. Further, 65.62% of genomic sequences were masked as repeated sequences, and 32,516 genes were annotated using the MAKER pipeline. The H. illucens Lsp-2 genes that were annotated were further characterized, and the three-dimensional organization of the encoded proteins was predicted. A new chromosome-scale genome assembly of good quality for H. illucens was assembled, and the genomic annotation phase was initiated. The availability of this new chromosome-scale genome assembly enables the further characterization, both genotypically and phenotypically, of a species of interest for several biotechnological applications.

15.
ArXiv ; 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38410643

ABSTRACT

This paper presents the Ensemble Nucleotide Byte-level Encoder-Decoder (ENBED) foundation model, analyzing DNA sequences at byte-level precision with an encoder-decoder Transformer architecture. ENBED uses a sub-quadratic implementation of attention to develop an efficient model capable of sequence-to-sequence transformations, generalizing previous genomic models with encoder-only or decoder-only architectures. We use Masked Language Modeling to pre-train the foundation model using reference genome sequences and apply it in the following downstream tasks: (1) identification of enhancers, promoters and splice sites, (2) recognition of sequences containing base call mismatches and insertion/deletion errors, an advantage over tokenization schemes involving multiple base pairs, which lose the ability to analyze with byte-level precision, (3) identification of biological function annotations of genomic sequences, and (4) generating mutations of the Influenza virus using the encoder-decoder architecture and validating them against real-world observations. In each of these tasks, we demonstrate significant improvement as compared to the existing state-of-the-art results.
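
The ENBED abstract contrasts byte-level input with tokenization schemes that group several base pairs into one token. The sketch below illustrates that contrast only: it is not ENBED's tokenizer, and the function names, the non-overlapping 3-mer scheme, and the toy sequences are all assumptions chosen for the example. It shows that after a single inserted base, byte-level tokens differ by one shifted symbol stream whose elements are still single bases, whereas every multi-base token downstream of the insertion changes identity.

```python
def byte_tokens(seq: str) -> list[int]:
    """Encode each nucleotide as one byte value (one token per base)."""
    return list(seq.encode("ascii"))


def kmer_tokens(seq: str, k: int = 3) -> list[str]:
    """Split a sequence into non-overlapping k-mers (last one may be shorter)."""
    return [seq[i:i + k] for i in range(0, len(seq), k)]


def shared_vocabulary(a: list, b: list) -> set:
    """Return the set of token types that appear in both token lists."""
    return set(a) & set(b)


if __name__ == "__main__":
    reference = "ACGGATTCAGTC"
    # Hypothetical read with one inserted T after the fourth base.
    with_insertion = "ACGGTATTCAGTC"

    # Byte level: both sequences use the same four token types.
    print(shared_vocabulary(byte_tokens(reference), byte_tokens(with_insertion)))

    # 3-mer level: tokens after the insertion fall into a different frame.
    print(kmer_tokens(reference))
    print(kmer_tokens(with_insertion))
```

The design point is granularity: a single-base error maps to a single-token change at byte level, which is what makes per-base mismatch or indel recognition expressible at all.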

16.
Brief Funct Genomics ; 23(4): 484-494, 2024 Jul 19.
Article in English | MEDLINE | ID: mdl-38422352

ABSTRACT

Massive gene expression analyses are widely used to find differentially expressed genes under specific conditions. The results of these experiments are often available in public databases that are undergoing a growth similar to that of molecular sequence databases in the past. This now allows novel secondary computational tools to emerge that use such information to gain new knowledge. If several genes have a similar expression profile across heterogeneous transcriptomics experiments, they could be functionally related. These associations are usually useful for the annotation of uncharacterized genes. In addition, the search for genes with opposite expression profiles is useful for finding negative regulators and proposing inhibitory compounds in drug repurposing projects. Here we present a new web application, Automatic and Serial Analysis of CO-expression (ASACO), which has the potential to discover positive and negative correlator genes to a given query gene, based on thousands of public transcriptomics experiments. In addition, examples of use are presented and compared with previously established knowledge. The results obtained suggest that ASACO is a useful tool to improve knowledge about genes associated with human diseases and noncoding genes. ASACO is available at http://www.bioinfocabd.upo.es/asaco/.


Subject(s)
Drug Repositioning , Drug Repositioning/methods , Humans , Software , Gene Expression Profiling/methods , Computational Biology/methods , Databases, Genetic , Transcriptome/genetics
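
The ASACO record above describes ranking genes as positive or negative correlators of a query gene across many experiments. The abstract does not state which correlation measure the tool uses, so the sketch below assumes Pearson correlation purely for illustration; the function names, gene labels, and expression values are likewise hypothetical and do not reflect ASACO's implementation or data.

```python
from math import sqrt
from typing import Mapping, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def rank_correlators(
    query: Sequence[float], profiles: Mapping[str, Sequence[float]]
) -> list[tuple[str, float]]:
    """Sort genes from most positively to most negatively correlated."""
    scores = {gene: pearson(query, values) for gene, values in profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical expression values across five experiments.
    query_profile = [1.0, 2.0, 3.0, 4.0, 5.0]
    candidates = {
        "geneA": [2.0, 4.1, 6.0, 8.2, 9.9],   # rises with the query
        "geneB": [5.0, 4.0, 3.1, 2.0, 1.0],   # falls as the query rises
        "geneC": [3.0, 1.0, 4.0, 1.0, 5.0],   # weakly related
    }
    for gene, r in rank_correlators(query_profile, candidates):
        print(gene, round(r, 3))
```

The top of the ranked list gives candidate positive correlators and the bottom gives candidate negative correlators; a real analysis over thousands of experiments would also need normalization across datasets, which is outside this sketch.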
17.
Genome Biol Evol ; 16(2), 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38368625

ABSTRACT

The clouded apollo (Parnassius mnemosyne) is a palearctic butterfly distributed over a large part of western Eurasia, but population declines and fragmentation have been observed in many parts of the range. The development of genomic tools can help to shed light on the genetic consequences of the decline and to make informed decisions about direct conservation actions. Here, we present a high-contiguity, chromosome-level genome assembly of a female clouded apollo butterfly and provide detailed annotations of genes and transposable elements. We find that the large genome (1.5 Gb) of the clouded apollo is extraordinarily repeat rich (73%). Despite that, the combination of sequencing techniques allowed us to assemble all chromosomes (nc = 29) to a high degree of completeness. The annotation resulted in a relatively high number of protein-coding genes (22,854) compared with other Lepidoptera, of which a large proportion (21,635) could be assigned functions based on homology with other species. A comparative analysis indicates that overall genome structure has been largely conserved, both within the genus and compared with the ancestral lepidopteran karyotype. The high-quality genome assembly and detailed annotation presented here will constitute an important tool for forthcoming efforts aimed at understanding the genetic consequences of fragmentation and decline, as well as for assessments of genetic diversity, population structure, inbreeding, and genetic load in the clouded apollo butterfly.


Subject(s)
Butterflies , Animals , Female , Butterflies/genetics , Conservation of Natural Resources , Genomics , DNA Transposable Elements , Chromosomes , Molecular Sequence Annotation
18.
Mol Cell Proteomics ; 23(2): 100719, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38242438

ABSTRACT

Although the human gene annotation has been continuously improved over the past 2 decades, numerous studies demonstrated the existence of a "dark proteome", consisting of proteins that were critical for biological processes but not included in widely used gene catalogs. The Genotype-Tissue Expression project generated more than 15,000 RNA-seq datasets from multiple tissues, which modeled 30 million transcripts in the human genome. To provide a resource of high-confidence novel proteins from the dark proteome, we screened 50,000 mass spectrometry runs from over 900 projects to identify proteins translated from the Genotype-Tissue Expression transcript model with proteomic support. We also integrated 3.8 million common genetic variants from the gnomAD database to improve peptide identification. As a result, we identified 170,529 novel peptides with proteomic evidence, of which 6048 passed the strictest standard we defined and were supported by PepQuery. We provided a user-friendly website (https://ncorf.genes.fun/) for researchers to check the evidence of novel peptides from their studies. The findings will improve our understanding of coding genes and facilitate genomic data interpretation in biomedical research.


Subject(s)
Proteogenomics , Humans , Proteogenomics/methods , Proteome/metabolism , Proteomics/methods , Peptides/genetics , Genome, Human
19.
Mitochondrial DNA B Resour ; 9(1): 128-132, 2024.
Article in English | MEDLINE | ID: mdl-38259357

ABSTRACT

The mitogenome of Bauhinia variegata was assembled and characterized in this study. The mitogenome size was 437,271 bp, and its GC content was 45.5%. A total of 36 protein-coding genes, 17 tRNAs and 3 rRNAs were annotated in the mitogenome. A total of 12 MTPTs, ranging from 71 bp to 3562 bp, were identified in the mitogenome and covered 1.46% (6373 bp) of the mitogenome. Phylogenetic analysis of 15 species of Leguminosae based on 23 core protein-coding genes showed that B. variegata was sister to Tylosema esculentum, another member of the subfamily Cercidoideae. The mitogenome of B. variegata provides a valuable genetic resource for further phylogenetic studies of this family.

20.
BMC Biotechnol ; 23(1): 51, 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38049781

ABSTRACT

BACKGROUND: Goat rumen microbial communities are regarded as one of the most promising biochemical reservoirs of multi-functional enzymes, which can enhance a wide array of bioprocesses such as the hydrolysis of cellulose and hemicellulose into fermentable sugars for biofuel and other value-added biochemical production. However, the limited understanding of rumen microbial genetic diversity and the absence of effective culture-based screening methods have impeded the full utilization of these enzymes. In this study, we applied a culture-independent metagenomic sequencing approach to isolate and identify microbial communities in goat rumen, and to clone and functionally characterize novel cellulase and xylanase genes from goat rumen bacterial communities. RESULTS: Bacterial DNA samples were extracted from goat rumen fluid. Three genomic libraries were sequenced using Illumina HiSeq 2000 for paired-end 100-bp (PE100) and Illumina HiSeq 2500 for paired-end 125-bp (PE125) reads. A total of 435 Gb of raw reads were generated. Taxonomic analysis using GraPhlAn revealed that Fibrobacter, Prevotella, and Ruminococcus are the most abundant bacterial genera in goat rumen. SPAdes assembly and Prodigal annotation were performed. The contigs were also annotated using the DOE-JGI pipeline. In total, 117,502 CAZymes, comprising endoglucanases, exoglucanases, beta-glucosidases, xylosidases, and xylanases, were detected in all three samples. Two genes with predicted cellulolytic/xylanolytic activities were cloned and expressed in E. coli BL21(DE3). The endoglucanase and xylanase activities of the recombinant proteins were confirmed using substrate plate assays and dinitrosalicylic acid (DNS) analysis. The 3D structures of endoglucanase A and endo-1,4-beta xylanase were predicted using SWISS-MODEL. Based on the 3D structure analysis, the two enzymes isolated from the goat rumen metagenome are unique, with only 56-59% similarity to homologous proteins in the Protein Data Bank (PDB); the structures of the enzymes also displayed greater stability and higher catalytic activity. CONCLUSIONS: In summary, this study provided database resources of bacterial metagenomes from goat rumen fluid, including gene sequences with annotated functions and methods for gene isolation and over-expression of cellulolytic enzymes, as well as a wealth of genes in the metabolic pathways affecting food and nutrition of ruminant animals.


Subject(s)
Cellulase , Cellulases , Animals , Cellulase/metabolism , Metagenome , Goats/genetics , Goats/metabolism , Goats/microbiology , Rumen/metabolism , Rumen/microbiology , Escherichia coli/genetics , Bacteria , Cellulases/genetics , Cellulose