Results 1 - 20 of 279
1.
Cell ; 186(9): 2018-2034.e21, 2023 04 27.
Article in English | MEDLINE | ID: mdl-37080200

ABSTRACT

Functional genomic strategies have become fundamental for annotating gene function and regulatory networks. Here, we combined functional genomics with proteomics by quantifying protein abundances in a genome-scale knockout library in Saccharomyces cerevisiae, using data-independent acquisition mass spectrometry. We find that global protein expression is driven by a complex interplay of (1) general biological properties, including translation rate, protein turnover, the formation of protein complexes, growth rate, and genome architecture, followed by (2) functional properties, such as the connectivity of a protein in genetic, metabolic, and physical interaction networks. Moreover, we show that functional proteomics complements current gene annotation strategies through the assessment of proteome profile similarity, protein covariation, and reverse proteome profiling. Thus, our study reveals principles that govern protein expression and provides a genome-spanning resource for functional annotation.


Subject(s)
Proteome; Proteomics; Proteomics/methods; Proteome/metabolism; Genomics/methods; Genome; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism
2.
Cell ; 167(2): 553-565.e12, 2016 Oct 06.
Article in English | MEDLINE | ID: mdl-27693354

ABSTRACT

Genome-metabolism interactions enable cell growth. To probe the extent of these interactions and delineate their functional contributions, we quantified the Saccharomyces amino acid metabolome and its response to systematic gene deletion. Over one-third of coding genes, in particular those important for chromatin dynamics, translation, and transport, contribute to biosynthetic metabolism. Specific amino acid signatures characterize genes of similar function. This enabled us to exploit functional metabolomics to connect metabolic regulators to their effectors, as exemplified by TORC1, whose inhibition in exponentially growing cells is shown to match an interruption in endomembrane transport. Providing orthogonal information compared to physical and genetic interaction networks, metabolomic signatures cluster more than half of the so far uncharacterized yeast genes and provide functional annotation for them. A major part of coding genes is therefore participating in gene-metabolism interactions that expose the metabolism regulatory network and enable access to an underexplored space in gene function.


Subject(s)
Amino Acids/biosynthesis; Metabolome; Saccharomyces cerevisiae Proteins/metabolism; Saccharomyces cerevisiae/metabolism; Transcription Factors/metabolism; Amino Acids/genetics; Chromatin/metabolism; Gene Deletion; Gene Expression Regulation, Fungal; Gene Regulatory Networks; Metabolome/genetics; Metabolomics/methods; Multigene Family; Phosphatidylinositol 3-Kinases/genetics; Phosphatidylinositol 3-Kinases/metabolism; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae Proteins/genetics; Transcription Factors/genetics; Transcription, Genetic
3.
Mol Cell Proteomics ; 23(2): 100719, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38242438

ABSTRACT

Although the human gene annotation has been continuously improved over the past 2 decades, numerous studies demonstrated the existence of a "dark proteome", consisting of proteins that were critical for biological processes but not included in widely used gene catalogs. The Genotype-Tissue Expression project generated more than 15,000 RNA-seq datasets from multiple tissues, which modeled 30 million transcripts in the human genome. To provide a resource of high-confidence novel proteins from the dark proteome, we screened 50,000 mass spectrometry runs from over 900 projects to identify proteins translated from the Genotype-Tissue Expression transcript model with proteomic support. We also integrated 3.8 million common genetic variants from the gnomAD database to improve peptide identification. As a result, we identified 170,529 novel peptides with proteomic evidence, of which 6048 passed the strictest standard we defined and were supported by PepQuery. We provided a user-friendly website (https://ncorf.genes.fun/) for researchers to check the evidence of novel peptides from their studies. The findings will improve our understanding of coding genes and facilitate genomic data interpretation in biomedical research.


Subject(s)
Proteogenomics; Humans; Proteogenomics/methods; Proteome/metabolism; Proteomics/methods; Peptides/genetics; Genome, Human
4.
BMC Genomics ; 25(1): 775, 2024 Aug 09.
Article in English | MEDLINE | ID: mdl-39118001

ABSTRACT

BACKGROUND: Appropriate regulation of genes expressed in oocytes and embryos is essential for acquisition of developmental competence in mammals. Here, we hypothesized that several genes expressed in oocytes and pre-implantation embryos remain unknown. Our goal was to reconstruct the transcriptome of oocytes (germinal vesicle and metaphase II) and pre-implantation cattle embryos (blastocysts) using short-read and long-read sequences to identify putative new genes. RESULTS: We identified 274,342 transcript sequences and 3,033 of those loci do not match a gene present in official annotations and thus are potential new genes. Notably, 63.67% (1,931/3,033) of potential novel genes exhibited coding potential. Also noteworthy, 97.92% of the putative novel genes overlapped annotation with transposable elements. Comparative analysis of transcript abundance identified that 1,840 novel genes (recently added to the annotation) or potential new genes were differentially expressed between developmental stages (FDR < 0.01). We also determined that 522 novel or potential new genes (448 and 34, respectively) were upregulated at eight-cell embryos compared to oocytes (FDR < 0.01). In eight-cell embryos, 102 novel or putative new genes were co-expressed (|r| > 0.85, P < 1 × 10⁻⁸) with several genes annotated with gene ontology biological processes related to pluripotency maintenance and embryo development. CRISPR-Cas9 genome editing confirmed that the disruption of one of the novel genes highly expressed in eight-cell embryos reduced blastocyst development (ENSBTAG00000068261, P = 1.55 × 10⁻⁷). CONCLUSIONS: Our results revealed several putative new genes that need careful annotation. Many of the putative new genes have dynamic regulation during pre-implantation development and are important components of gene regulatory networks involved in pluripotency and blastocyst formation.


Subject(s)
Blastocyst; Embryonic Development; Gene Expression Regulation, Developmental; Oocytes; Animals; Cattle; Embryonic Development/genetics; Oocytes/metabolism; Blastocyst/metabolism; Transcriptome; Molecular Sequence Annotation; Gene Expression Profiling; Female
5.
BMC Genomics ; 25(1): 430, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38693501

ABSTRACT

BACKGROUND: Although multiple chicken genomes have been assembled and annotated, the numbers of protein-coding genes in chicken genomes and their variation among breeds are still uncertain due to the low quality of these genome assemblies and limited resources used in their gene annotations. To fill these gaps, we recently assembled genomes of four indigenous chicken breeds with distinct traits at the chromosome level. In this study, we annotated genes in each of these assembled genomes using a combination of RNA-seq- and homology-based approaches. RESULTS: We identified varying numbers (17,497-17,718) of protein-coding genes in the four indigenous chicken genomes, while recovering 51 of the 274 "missing" genes in birds in general, and 36 of the 174 "missing" genes in chickens in particular. Intriguingly, based on deeply sequenced RNA-seq data collected in multiple tissues in the four breeds, we found 571-627 protein-coding genes in each genome that were missing in the annotations of the reference chicken genomes (GRCg6a and GRCg7b/w). After removing redundancy, we ended up with a total of 1,420 newly annotated genes (NAGs). The NAGs tend to be found in subtelomeric regions of macro-chromosomes (chr1 to chr5, plus chrZ) and middle chromosomes (chr6 to chr13, plus chrW), as well as in micro-chromosomes (chr14 to chr39) and unplaced contigs, where G/C contents are high. Moreover, the NAGs have elevated G-quadruplex frequencies, while both G/C contents and G-quadruplex frequencies in their surrounding regions are also high. The NAGs showed tissue-specific expression, and we were able to verify 39 (92.9%) of 42 randomly selected ones in various tissues of the four chicken breeds using RT-qPCR experiments. Most of the NAGs were also encoded in the reference chicken genomes; thus, these genomes might harbor more genes than previously thought.
CONCLUSION: The NAGs are widely distributed in wild, indigenous and commercial chickens, and they might play critical roles in chicken physiology. Counting these new genes, chicken genomes harbor more genes than originally thought.


Subject(s)
Chickens; Genome; Molecular Sequence Annotation; Animals; Chickens/genetics; Base Composition; Telomere/genetics; Chromosomes/genetics; Genomics/methods
6.
J Mol Evol ; 2024 Sep 13.
Article in English | MEDLINE | ID: mdl-39269459

ABSTRACT

Treacher Collins syndrome (TCS) is a genetic disorder affecting facial development, primarily caused by mutations in the TCOF1 gene. TCOF1, along with NOLC1, plays an important role in ribosomal RNA transcription and processing. Previously, a zebrafish model of TCS successfully recapitulated the main characteristics of the syndrome by knocking down the expression of a gene on chromosome 13 (coding for Uniprot ID B8JIY2), which was identified as the TCOF1 orthologue. However, database updates renamed this gene as nolc1 and the zebrafish database (ZFIN) identified a different gene on chromosome 14 as the TCOF1 orthologue (coding for Uniprot ID E7F9D9). NOLC1 and TCOF1 are large proteins with unstructured regions and repetitive sequences that complicate alignments and comparisons. The additional whole genome duplication of teleosts adds further difficulty. In this study, we present evidence supporting that NOLC1 and TCOF1 are paralogs, and that the zebrafish gene on chromosome 14 is a low-complexity LisH domain-containing factor that displays homology to NOLC1 but lacks essential sequence features to accomplish TCOF1 nucleolar functions. Our analysis also supports the idea that zebrafish, as has been suggested for other non-tetrapod vertebrates, lack the TCOF1 gene that is associated with tripartite nucleolus. Using BLAST searches in a group of teleost genomes, we identified fish-specific sequences similar to E7F9D9 zebrafish protein. We propose naming them "LisH-containing Low Complexity Proteins" (LLCP). Interestingly, the gene on chromosome 13 (nolc1) displays the sequence features, developmental expression patterns, and phenotypic impact of depletion that are characteristic of TCOF1 functions. These findings suggest that in teleost fish, the nucleolar functions described for both NOLC1 and TCOF1, mediated by their repeated motifs, are carried out by a single gene, nolc1. Our study, which is mainly based on computational tools available as free web-based algorithms, could help to solve similar conflicts regarding gene orthology in zebrafish.

7.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35534181

ABSTRACT

Proteogenomics refers to the integrated analysis of the genome and proteome that leverages mass-spectrometry (MS)-based proteomics data to improve genome annotations, understand gene expression control through proteoforms and find sequence variants to develop novel insights for disease classification and therapeutic strategies. However, proteogenomic studies often suffer from reduced sensitivity and specificity due to inflated database size. To control the error rates, proteogenomics depends on the target-decoy search strategy, the de-facto method for false discovery rate (FDR) estimation in proteomics. The proteogenomic databases constructed from three- or six-frame nucleotide database translation not only increase the search space and compute-time but also violate the equivalence of target and decoy databases. These searches result in poorer separation between target and decoy scores, leading to stringent FDR thresholds. Understanding these factors and applying modified strategies such as two-pass database search or peptide-class-specific FDR can result in a better interpretation of MS data without introducing additional statistical biases. Based on these considerations, a user can interpret the proteogenomics results appropriately and control false positives and negatives in a more informed manner. In this review, first, we briefly discuss the proteogenomic workflows and limitations in database construction, followed by various considerations that can influence potential novel discoveries in a proteogenomic study. We conclude with suggestions to counter these challenges for better proteogenomic data interpretation.
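
The target-decoy strategy and the peptide-class-specific FDR mentioned in this abstract can be illustrated with a minimal sketch. It assumes each peptide-spectrum match (PSM) is a `(score, is_decoy)` pair with higher scores being better, and that the FDR at a threshold is estimated as decoys/targets. The function names and the class labels are hypothetical and are not taken from the review or from any search engine.

```python
def target_decoy_qvalues(psms):
    """Estimate q-values for PSMs given as (score, is_decoy) pairs.

    Returns (score, is_decoy, q_value) tuples sorted by descending score.
    The FDR at each threshold is approximated as #decoys / #targets.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    fdrs = []
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this score threshold or any looser one.
    qvals = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvals.append(running_min)
    qvals.reverse()
    return [(s, d, q) for (s, d), q in zip(ranked, qvals)]


def class_specific_qvalues(psms_by_class):
    """Compute q-values separately per peptide class (e.g. known vs. novel)."""
    return {cls: target_decoy_qvalues(psms) for cls, psms in psms_by_class.items()}


# Toy data, for illustration only.
example = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
result = target_decoy_qvalues(example)
by_class = class_specific_qvalues({"known": example, "novel": [(5.0, False), (4.0, True)]})
```

Estimating the error rate within each class keeps the typically small novel-peptide class from inheriting the more favourable score distribution of the much larger known-peptide class, which is the motivation the abstract gives for class-specific FDR.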


Subject(s)
Proteogenomics; Databases, Protein; Nucleotides; Peptides/chemistry; Proteogenomics/methods; Proteome; Proteomics/methods
8.
Proteomics ; 23(7-8): e2200013, 2023 04.
Article in English | MEDLINE | ID: mdl-36349817

ABSTRACT

There are multiple reasons why the next generation of biological and medical studies require increasing numbers of samples. Biological systems are dynamic, and the effect of a perturbation depends on the genetic background and environment. As a consequence, many conditions need to be considered to reach generalizable conclusions. Moreover, human population and clinical studies only reach sufficient statistical power if conducted at scale and with precise measurement methods. Finally, many proteins remain without sufficient functional annotations, because they have not been systematically studied under a broad range of conditions. In this review, we discuss the latest technical developments in mass spectrometry (MS)-based proteomics that facilitate large-scale studies by fast and efficient chromatography, fast scanning mass spectrometers, data-independent acquisition (DIA), and new software. We further highlight recent studies which demonstrate how high-throughput (HT) proteomics can be applied to capture biological diversity, to annotate gene functions or to generate predictive and prognostic models for human diseases.


Subject(s)
Proteomics; Systems Biology; Humans; Proteomics/methods; Proteins/analysis; Mass Spectrometry/methods; Software
9.
BMC Genomics ; 24(1): 159, 2023 Mar 29.
Article in English | MEDLINE | ID: mdl-36991339

ABSTRACT

BACKGROUND: Tomato (Solanum lycopersicum) is both an important agricultural product and an excellent model system for studying plant-pathogen interactions. It is susceptible to bacterial wilt caused by Ralstonia solanacearum (Rs), and infection can result in severe yield and quality losses. To investigate which genes are involved in the resistance response to this pathogen, we sequenced the transcriptomes of both resistant and susceptible tomato inbred lines before and after Rs inoculation. RESULTS: In total, 75.02 Gb of high-quality reads were generated from 12 RNA-seq libraries. A total of 1,312 differentially expressed genes (DEGs) were identified, including 693 up-regulated and 621 down-regulated genes. Additionally, 836 unique DEGs were obtained when comparing two tomato lines, including 27 co-expression hub genes. A total of 1,290 DEGs were functionally annotated using eight databases, most of which were found to be involved in biological pathways such as DNA and chromatin activity, plant-pathogen interaction, plant hormone signal transduction, secondary metabolite biosynthesis, and defense response. Among the core-enriched genes in 12 key pathways related to resistance, 36 genotype-specific DEGs were identified. RT-qPCR integrated analysis revealed that multiple DEGs may play a significant role in tomato response to Rs. In particular, Solyc01g073985.1 (NLR disease resistance protein) and Solyc04g058170.1 (calcium-binding protein) in plant-pathogen interaction are likely to be involved in the resistance. CONCLUSION: We analyzed the transcriptomes of both resistant and susceptible tomato lines during control and inoculated conditions and identified several key genotype-specific hub genes involved in a variety of different biological processes. These findings lay a foundation for better understanding the molecular basis by which resistant tomato lines respond to Rs.


Subject(s)
Ralstonia solanacearum; Solanum lycopersicum; Solanum lycopersicum/genetics; Gene Expression Profiling; Ralstonia solanacearum/genetics; Transcriptome; Genotype; Plant Diseases/genetics; Plant Diseases/microbiology
10.
BMC Biotechnol ; 23(1): 51, 2023 12 04.
Article in English | MEDLINE | ID: mdl-38049781

ABSTRACT

BACKGROUND: Goat rumen microbial communities are perceived as one of the most promising biochemical reservoirs of multi-functional enzymes, which can enhance a wide array of bioprocesses such as the hydrolysis of cellulose and hemicellulose into fermentable sugars for biofuel and other value-added biochemical production. However, the limited understanding of rumen microbial genetic diversity and the absence of effective culture-based screening methods have impeded the full utilization of these potential enzymes. In this study, we applied a culture-independent metagenomic sequencing approach to isolate and identify microbial communities in goat rumen, and to clone and functionally characterize novel cellulase and xylanase genes from goat rumen bacterial communities. RESULTS: Bacterial DNA samples were extracted from goat rumen fluid. Three genomic libraries were sequenced using Illumina HiSeq 2000 for paired-end 100-bp (PE100) and Illumina HiSeq 2500 for paired-end 125-bp (PE125). A total of 435 Gb of raw reads were generated. Taxonomic analysis using GraPhlAn revealed that Fibrobacter, Prevotella, and Ruminococcus are the most abundant genera of bacteria in goat rumen. SPAdes assembly and Prodigal annotation were performed. The contigs were also annotated using the DOE-JGI pipeline. In total, 117,502 CAZymes, comprising endoglucanases, exoglucanases, beta-glucosidases, xylosidases, and xylanases, were detected in all three samples. Two genes with predicted cellulolytic/xylanolytic activities were cloned and expressed in E. coli BL21(DE3). The endoglucanase and xylanase enzymatic activities of the recombinant proteins were confirmed using substrate plate assay and dinitrosalicylic acid (DNS) analysis. The 3D structures of endoglucanase A and endo-1,4-beta xylanase were predicted using SWISS-MODEL. Based on the 3D structure analysis, the two enzymes isolated from the goat rumen metagenome are unique, with only 56-59% similarity to homologous proteins in the Protein Data Bank (PDB); the structures of the enzymes also displayed greater stability and higher catalytic activity. CONCLUSIONS: In summary, this study provided the database resources of bacterial metagenomes from goat rumen fluid, including gene sequences with annotated functions and methods for gene isolation and over-expression of cellulolytic enzymes, and a wealth of genes in the metabolic pathways affecting food and nutrition of ruminant animals.


Subject(s)
Cellulase; Cellulases; Animals; Cellulase/metabolism; Metagenome; Goats/genetics; Goats/metabolism; Goats/microbiology; Rumen/metabolism; Rumen/microbiology; Escherichia coli/genetics; Bacteria; Cellulases/genetics; Cellulose
11.
BMC Plant Biol ; 23(1): 570, 2023 Nov 16.
Article in English | MEDLINE | ID: mdl-37974117

ABSTRACT

BACKGROUND: Neltuma pallida is a tree that grows in arid soils in northwestern Peru. As a predominant species of the Equatorial Dry Forest ecoregion, it holds significant economic and ecological value for both people and the environment. Despite this, the species is severely threatened and there is a lack of genetic and genomic research, hindering the proposal of evidence-based conservation strategies. RESULTS: In this work, we conducted the assembly, annotation, analysis and comparison of the chloroplast genome of a N. pallida specimen with those of related species. The assembled chloroplast genome has a length of 162,381 bp with a typical quadripartite structure (LSC-IRA-SSC-IRB). The calculated GC content was 35.97%. However, this is variable between regions, with a higher GC content observed in the IRs. A total of 132 genes were annotated, of which 19 were duplicates and 22 contained at least one intron in their sequence. A substantial number of repetitive sequences of different types were identified in the assembled genome, predominantly tandem repeats (> 300). In particular, 142 microsatellite (SSR) markers were identified. The phylogenetic reconstruction showed that N. pallida grouped with the other Neltuma species and with Prosopis cineraria. The analysis of sequence divergence between the chloroplast genome sequences of N. pallida, N. juliflora, P. farcta and Strombocarpa tamarugo revealed a high degree of similarity. CONCLUSIONS: The N. pallida chloroplast genome was found to be similar to those of closely related species. With a size of 162,831 bp, it had the classical chloroplast quadripartite structure and GC content of 35.97%. Most of the 132 identified genes were protein-coding genes. Additionally, over 800 repetitive sequences were identified, including 142 SSR markers. In the phylogenetic analysis, N. pallida grouped with other Neltuma spp. and P. cineraria. Furthermore, the N. pallida chloroplast was highly conserved when compared with genomes of closely related species. These findings can be of great potential for further diversity studies and genetic improvement of N. pallida.
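
The GC content reported here, both genome-wide and per region, is a simple base-composition ratio. The sketch below shows one way to compute it, assuming the sequence is a plain string and that ambiguous bases such as N are excluded from the denominator. The toy sequence and the region coordinates are hypothetical and do not correspond to the N. pallida genome.

```python
def gc_content(seq: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        return 0.0
    return 100.0 * gc / (gc + at)


def region_gc(seq, regions):
    """GC content per named region.

    `regions` maps a region name to (start, end), 0-based and half-open.
    """
    return {name: gc_content(seq[start:end]) for name, (start, end) in regions.items()}


# Toy sequence and hypothetical region boundaries, for illustration only.
toy = "ATATATATGCGCGCGCATATATAT"
toy_regions = {"LSC": (0, 8), "IR": (8, 16), "SSC": (16, 24)}
per_region = region_gc(toy, toy_regions)
whole = gc_content(toy)
```

Computing the ratio per region rather than once for the whole genome is what exposes the pattern the abstract describes, where the inverted repeats are richer in GC than the single-copy regions.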


Subject(s)
Fabaceae; Genome, Chloroplast; Prosopis; Humans; Molecular Sequence Annotation; Prosopis/genetics; Genome, Chloroplast/genetics; Phylogeny; Fabaceae/genetics
12.
Virus Genes ; 59(5): 752-762, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37322310

ABSTRACT

Bacteriophages are an important source of novel genetic diversity. Sequencing of phage genomes can reveal new proteins with potential uses in phage therapy and help unravel the diversity of biological mechanisms by which phages take over the machinery of the host during infection. To expand the available collection of phage genomes, we have isolated, sequenced, and assembled the genome sequences of three phages that infect three pathogenic Escherichia coli strains: vB_EcoM_DE15, vB_EcoM_DE16, and vB_EcoM_DE17. Morphological characterization and genomic analysis indicated that all three phages were strictly lytic and free from integrases, virulence factors, toxins, and antimicrobial resistance genes. All three phages contained tRNAs; notably, vB_EcoM_DE17 contained 25 tRNAs. The genomic features of these phages indicate that natural phages are capable of lysing pathogenic E. coli and have great potential in the biocontrol of bacteria.


Subject(s)
Bacteriophages; Bacteriophages/genetics; Escherichia coli/genetics; Genome, Viral; Genomics; Bacteria
13.
Virus Genes ; 59(2): 290-300, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36607487

ABSTRACT

A lysogenic phage vB_EcoP_DE5 (hereafter designated DE5) was isolated from donkey-derived Escherichia coli. The bacteriophage was examined by transmission electron microscopy, and the result showed that DE5 belonged to the genus Kuravirus. DE5 was sensitive to changes in temperature and pH, and it could maintain its activity at pH 7 and below 60 °C. The whole genome sequencing revealed that DE5 had a double-stranded DNA genome of 77,305 bp with 42.09% G+C content. A total of 126 open reading frames (ORFs) were identified, including functional genes related to phage integration, DNA replication and modification, transcriptional regulation, structural and packaging proteins, and host cell lysis. One phage integrase gene, one autotransporter adhesin gene, and one tRNA gene were predicted in the whole genome, and no genes associated with drug resistance were identified. The phage DE5 integrase contained 187 amino acids and belonged to the small serine recombinase family. BLASTn analysis revealed that phage DE5 had a high sequence identity (96%) with E. coli phage SU10. Phylogenetic analysis showed that phage DE5 was a member of the genus Kuravirus. The whole genome sequencing of lysogenic phage DE5 enhanced our understanding of lysogenic phages and their therapeutic applications.


Subject(s)
Bacteriophages; Podoviridae; Bacteriophages/genetics; Escherichia coli/genetics; Phylogeny; Genome, Viral; Podoviridae/genetics; Whole Genome Sequencing; Integrases/genetics; Open Reading Frames
14.
Environ Res ; 239(Pt 1): 117315, 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-37805180

ABSTRACT

Chlorpyrifos (CP) is a pesticide widely used in agricultural production. However, excessive use of CP is risky for human health and the ecological environment. Microbial remediation has become a research hotspot of environmental pollution control. In this study, the effective CP-degrading strain H27 (Bacillus cereus) was screened from farmland soil, and the degradation ratio was more than 80%. Then, the degradation mechanism was discussed in terms of enzymes, pathways, products and genes, and the mechanism was improved in terms of cell motility, secretory transport system and biofilm formation. The key CP-degrading enzymes were mainly intracellular enzymes (IE), and the degradation ratio reached 49.6% within 30 min. The optimal pH for IE was 7.0, and the optimal temperature was 25 °C. Using DFT and HPLC‒MS analysis, it was found that degradation mainly involved oxidation, hydrolysis and other reactions, and 3 degradation pathways and 14 products were identified, among which TCP (3,5,6-trichloro-2-pyridinol) was the main primary degradation product in addition to small molecules such as CO2 and H2O. Finally, the whole genome of strain H27 was sequenced, and the related degrading genes and enzymes were investigated to improve the metabolic pathways. Strain H27 had perfect genes related to flagellar assembly and chemotaxis and tended to tolerate CP. Moreover, it can secrete esterase, phosphatase and other substances, which can form biofilms and degrade CP in the environment. In addition, CP enters the cell under the action of permeases or transporters, and it is metabolized by IE. The degradation mechanism of CP by strain H27 is speculated in this study, which provided a theoretical basis for enriching CP-degrading bacteria resources, improving degradation metabolic pathways and mechanisms, and applying strain H27 to environmental pollution remediation.


Subject(s)
Bacillus; Chlorpyrifos; Humans; Biodegradation, Environmental; Phosphoric Monoester Hydrolases; Whole Genome Sequencing
15.
Curr Genomics ; 24(4): 236-249, 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-38169762

ABSTRACT

Background: The species Pterodon emarginatus and P. pubescens, popularly known as white sucupira or faveira, are native to the Cerrado biome and have the potential for medicinal use and reforestation. They are sister species with evolutionary proximity. Objective: Considering that the chloroplast genome exhibits a conserved structure and genes, the analysis of its sequences can contribute to the understanding of evolutionary, phylogenetic, and diversity issues. Methods: The chloroplast genomes of P. emarginatus and P. pubescens were sequenced on the Illumina MiSeq platform. The genomes were assembled based on the de novo strategy. We performed the annotation of the genes and the repetitive regions of the genomes. The nucleotide diversity and phylogenetic relationships were analyzed using the gene sequences of these species and others of the Leguminosae family, whose genomes are available in databases. Results: The complete chloroplast genome of P. emarginatus is 159,877 bp, and that of P. pubescens is 159,873 bp. The genomes of both species have circular and quadripartite structures. A total of 127 genes were predicted in both species, including 110 single-copy genes and 17 duplicated genes in the inverted regions. 141 microsatellite regions were identified in P. emarginatus and 140 in P. pubescens. The nucleotide diversity estimates of the gene regions in twenty-one species of the Leguminosae family were 0.062 in LSC, 0.086 in SSC, and 0.036 in IR. The phylogenetic analysis demonstrated the proximity between the genera Pterodon and Dipteryx, both from the clade Dipterygeae. Ten pairs of primers with potential for the development of molecular markers were designed. Conclusion: The genetic information obtained on the chloroplast genomes of P. emarginatus and P. pubescens presented here reinforces the similarity and evolutionary proximity between these species, with a similarity percentage of 99.8%.

16.
Plant Dis ; 107(4): 1210-1213, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36265141

ABSTRACT

Fusarium oxysporum f. sp. cucumerinum, which causes root and vascular wilting, is one of the most devastating pathogens infecting cucumber. Here, we report the first genome resource with high-quality assembly for F. oxysporum f. sp. cucumerinum strain Race-4, which is primarily endemic to China. The genome was 59.11 Mb in size and consisted of 48 scaffolds with an N50 of 3.87 Mb, assembled from PacBio long reads (301.77× coverage), and encodes 14,898 proteins predicted with the support of RNA-seq data. Gene annotations identified pathogen-host interaction genes, fungal virulence factors, secreted proteins, transcription factors, and secondary metabolite biosynthesis genes. Moreover, functional genes reported in previous studies were also identified in the genome of Race-4. These genes and this genome resource may play important roles in understanding F. oxysporum f. sp. cucumerinum-cucumber interactions and will be useful for further research.
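
The scaffold N50 quoted in this abstract is a standard assembly-contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch follows, assuming scaffold lengths are available as a list of integers. The example lengths are invented and are not the Race-4 scaffolds.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Invented lengths, for illustration only.
toy_scaffolds = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_scaffolds)
```

For the toy list the total is 20, and the two longest scaffolds (6 and 5) already cover 11, so the N50 is 5.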


Subject(s)
Cucumis sativus; Fusarium; Cucumis sativus/microbiology; Fusarium/genetics; Virulence Factors; Host-Pathogen Interactions
17.
Int J Mol Sci ; 24(22)2023 Nov 13.
Article in English | MEDLINE | ID: mdl-38003462

ABSTRACT

Cordia subcordata trees or shrubs, belonging to the Boraginaceae family, have strong resistance and have adapted to their habitat on a tropical coral island in China, but their genetic background remains unclear owing to the lack of genome information. In this study, the genome was assembled using both short/long whole genome sequencing reads and Hi-C reads. The assembled genome was 475.3 Mb, with 468.7 Mb (99.22%) of the sequences assembled into 16 chromosomes. Repeat sequences accounted for 54.41% of the assembled genome. A total of 26,615 genes were predicted, and 25,730 genes were functionally annotated using different annotation databases. Based on its genome and those of 17 other species, phylogenetic analysis using 336 single-copy genes obtained from ortholog analysis showed that C. subcordata was a sister to Coffea eugenioides, and the divergence time was estimated to be 77 MYA between the two species. Gene family evolution analysis indicated that the significantly expanded gene families were functionally related to chemical defenses against diseases. These results can provide a reference for a deeper understanding of the genetic background of C. subcordata and can be helpful in exploring its adaptation mechanism on tropical coral islands in the future.


Subject(s)
Anthozoa; Cordia; Animals; Phylogeny; Anthozoa/genetics; Genome; Repetitive Sequences, Nucleic Acid; Molecular Sequence Annotation; Chromosomes
18.
World J Microbiol Biotechnol ; 39(11): 307, 2023 Sep 15.
Article in English | MEDLINE | ID: mdl-37713136

ABSTRACT

Esters have been identified as the primary volatile flavor compounds in Chinese Baijiu, exerting a significant influence on its quality and aroma. This study focused on the yeast strain Pichia kudriavzevii, renowned for its high capacity to produce esters. Whole genome sequences were annotated and analyzed using the GO, KEGG, KOG, CAZy, and Pfam databases to determine the genetic basis underlying the enhanced ester production capacity. Results showed that P. kudriavzevii gene function was concentrated in biosynthetic capacity, metabolic capacity, amino acid translocation capacity, glycoside hydrolysis capacity and transfer capacity. Additionally, acyltransferase and kinase were predicted as active sites contributing to the high ester production of P. kudriavzevii. We further compared the volatile composition of P. kudriavzevii and Saccharomyces cerevisiae by headspace solid-phase microextraction-gas chromatography-mass spectrometry (HS-SPME-GC-MS), revealing that P. kudriavzevii produced 3.5 times more esters than S. cerevisiae. Overall, our findings suggest that P. kudriavzevii has potential applications in the Baijiu brewing industry.


Subject(s)
Pichia , Saccharomyces cerevisiae , Pichia/genetics , Amino Acids , Esters
19.
BMC Bioinformatics ; 23(1): 107, 2022 Mar 30.
Article in English | MEDLINE | ID: mdl-35354358

ABSTRACT

BACKGROUND: RNA sequencing is currently the method of choice for genome-wide profiling of gene expression. A popular approach to quantify expression levels of genes from RNA-seq data is to map reads to a reference genome and then count mapped reads to each gene. Gene annotation data, which include chromosomal coordinates of exons for tens of thousands of genes, are required for this quantification process. There are several major sources of gene annotations that can be used for quantification, such as Ensembl and RefSeq databases. However, there is very little understanding of the effect that the choice of annotation has on the accuracy of gene expression quantification in an RNA-seq analysis. RESULTS: In this paper, we present results from our comparison of Ensembl and RefSeq human annotations on their impact on gene expression quantification using a benchmark RNA-seq dataset generated by the SEQC consortium. We show that the use of RefSeq gene annotation models led to better quantification accuracy, based on the correlation with ground truths including expression data from >800 real-time PCR validated genes, known titration ratios of gene expression and microarray expression data. We also found that the recent expansion of the RefSeq annotation has led to a decrease in its annotation accuracy. Finally, we demonstrated that the RNA-seq quantification differences observed between different annotations were not affected by the use of different normalization methods. CONCLUSION: In conclusion, our study found that the use of the conservative RefSeq gene annotation yields better RNA-seq quantification results than the more comprehensive Ensembl annotation. We also found that, surprisingly, the recent expansion of the RefSeq database, which was primarily driven by the incorporation of sequencing data into the gene annotation process, resulted in a reduction in the accuracy of RNA-seq quantification.


Subject(s)
Molecular Sequence Annotation , Base Sequence , Humans , RNA-Seq , Sequence Analysis, RNA/methods , Exome Sequencing
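Illustrative note on the quantification approach described in the abstract above (map reads to a reference genome, then count the mapped reads for each gene using exon coordinates from an annotation): the sketch below is a deliberately simplified version of that counting step. It is not the pipeline used in the cited study. The function name, the half-open coordinate convention, the rule of discarding reads that overlap more than one gene, and all gene names and coordinates are assumptions made for illustration.

```python
from collections import Counter


def count_reads_per_gene(reads, exons):
    """Count mapped reads per gene.

    reads: iterable of (chrom, start, end) alignments, half-open coordinates.
    exons: iterable of (gene, chrom, start, end) annotation records.
    A read is assigned to a gene if it overlaps any exon of that gene;
    reads overlapping exons of several genes are discarded as ambiguous.
    """
    counts = Counter()
    for chrom, r_start, r_end in reads:
        hits = {
            gene
            for gene, e_chrom, e_start, e_end in exons
            if e_chrom == chrom and r_start < e_end and e_start < r_end
        }
        if len(hits) == 1:
            counts[hits.pop()] += 1
    return counts


# Hypothetical annotation and alignments.
annotation = [
    ("geneA", "chr1", 100, 200),
    ("geneA", "chr1", 300, 400),
    ("geneB", "chr1", 380, 500),
]
alignments = [
    ("chr1", 120, 170),  # geneA, first exon
    ("chr1", 310, 360),  # geneA, second exon
    ("chr1", 390, 440),  # overlaps geneA and geneB: ambiguous, discarded
    ("chr1", 450, 500),  # geneB
    ("chr2", 120, 170),  # no annotated gene
]
print(dict(count_reads_per_gene(alignments, annotation)))
```

In this toy example, removing the geneB record from the annotation would reassign the ambiguous read to geneA, which gives a small-scale picture of how the choice of annotation can change the resulting counts.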
20.
Plant J ; 107(2): 613-628, 2021 07.
Article in English | MEDLINE | ID: mdl-33960539

ABSTRACT

Traditional crops have historically provided accessible and affordable nutrition to millions of rural dwellers but have been neglected, with most modern agricultural systems over-reliant on a small number of internationally traded crops. Traditional crops are typically well-adapted to local agro-ecological conditions and many are nutrient-dense. They can play a vital role in local food systems through enhanced nutrition (particularly where diets are dominated by starch crops), food security and livelihoods for smallholder farmers, and a climate-resilient and biodiverse agriculture. Using short-read, long-read and phased sequencing technologies, we generated a high-quality chromosome-level genome assembly for Amaranthus cruentus, an under-researched crop with micronutrient- and protein-rich leaves and gluten-free seed, but lacking improved varieties, with respect to productivity and quality traits. The 370.9 Mb genome demonstrates a shared whole genome duplication with a related species, Amaranthus hypochondriacus. Comparative genome analysis indicates chromosomal loss and fusion events following genome duplication that are common to both species, as well as fission of chromosome 2 in A. cruentus alone, giving rise to a haploid chromosome number of 17 (versus 16 in A. hypochondriacus). Genomic features potentially underlying the nutritional value of this crop include two A. cruentus-specific genes with a likely role in phytic acid synthesis (an anti-nutrient), expansion of ion transporter gene families, and identification of biosynthetic gene clusters conserved within the amaranth lineage. The A. cruentus genome assembly will underpin much-needed research and global breeding efforts to develop improved varieties for economically viable cultivation and realization of the benefits to global nutrition security and agrobiodiversity.


Subject(s)
Amaranthus/genetics , Chromosomes, Plant/genetics , Crops, Agricultural/genetics , Evolution, Molecular , Genome, Plant/genetics , Multigene Family/genetics , Nutritive Value/genetics , Amaranthus/metabolism , Chromosome Mapping , Genes, Plant/genetics , Phylogeny