Results 1 - 20 of 563
1.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38600664

ABSTRACT

Small open reading frames (smORFs) have been acknowledged to play various roles on essential biological pathways and affect human beings from diabetes to tumorigenesis. Predicting smORFs in silico is quite a prerequisite for processing the omics data. Here, we proposed the smORF-coding-potential-predicting framework, sOCP, which provides functions to construct a model for predicting novel smORFs in some species. The sOCP model constructed in human was based on in-frame features and the nucleotide bias around the start codon, and the small feature subset was proved to be competent enough and avoid overfitting problems for complicated models. It showed more advanced prediction metrics than previous methods and could correlate closely with experimental evidence in a heterogeneous dataset. The model was applied to Rattus norvegicus and exhibited satisfactory performance. We then scanned smORFs with ATG and non-ATG start codons from the human genome and generated a database containing about a million novel smORFs with coding potential. Around 72 000 smORFs are located on the lncRNA regions of the genome. The smORF-encoded peptides may be involved in biological pathways rare for canonical proteins, including glucocorticoid catabolic process and the prokaryotic defense system. Our work provides a model and database for human smORF investigation and a convenient tool for further smORF prediction in other species.
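
The genome scan for smORFs with ATG and non-ATG start codons described in this abstract can be illustrated with a minimal sketch. It only enumerates candidate ORFs on the forward strand; it is not the sOCP model and does no coding-potential scoring. The function name `find_smorfs`, the 100-codon upper bound, and the example start codon `CTG` are assumptions made for illustration and are not taken from the paper.

```python
# Minimal sketch: enumerate candidate smORFs on the forward strand only.
# Assumptions (not from the abstract): the 100-codon cap, the function name,
# and reporting every in-frame start codon, so nested candidates are included.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_smorfs(seq, start_codons=("ATG",), max_codons=100, min_codons=2):
    """Return (start, end, orf) tuples; end is exclusive and includes the stop codon."""
    seq = seq.upper()
    hits = []
    for frame in range(3):
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] not in start_codons:
                continue
            # Walk downstream codon by codon until the first in-frame stop.
            for j in range(i + 3, len(seq) - 2, 3):
                if seq[j:j + 3] in STOP_CODONS:
                    n_codons = (j - i) // 3  # coding codons, stop excluded
                    if min_codons <= n_codons <= max_codons:
                        hits.append((i, j + 3, seq[i:j + 3]))
                    break
    return hits


if __name__ == "__main__":
    # Default: ATG starts only.
    print(find_smorfs("ATGAAATAG"))
    # Allow an illustrative non-ATG start codon as well.
    print(find_smorfs("CTGAAATAG", start_codons=("ATG", "CTG")))
```

A real scan would also cover the reverse strand and pass each candidate to a trained classifier; both steps are left out here.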


Subject(s)
Human Genome, Peptides, Animals, Humans, Rats, Open Reading Frames, Peptides/genetics, Proteins/genetics
2.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38581418

ABSTRACT

Following the milestone success of the Human Genome Project, the 'Encyclopedia of DNA Elements (ENCODE)' initiative was launched in 2003 to unearth information about the numerous functional elements within the genome. This endeavor coincided with the emergence of numerous novel technologies, accompanied by the provision of vast amounts of whole-genome sequences, high-throughput data such as ChIP-Seq and RNA-Seq. Extracting biologically meaningful information from this massive dataset has become a critical aspect of many recent studies, particularly in annotating and predicting the functions of unknown genes. The core idea behind genome annotation is to identify genes and various functional elements within the genome sequence and infer their biological functions. Traditional wet-lab experimental methods still rely on extensive efforts for functional verification. However, early bioinformatics algorithms and software primarily employed shallow learning techniques; thus, the ability to characterize data and features learning was limited. With the widespread adoption of RNA-Seq technology, scientists from the biological community began to harness the potential of machine learning and deep learning approaches for gene structure prediction and functional annotation. In this context, we reviewed both conventional methods and contemporary deep learning frameworks, and highlighted novel perspectives on the challenges arising during annotation underscoring the dynamic nature of this evolving scientific landscape.


Subject(s)
Deep Learning, Humans, Genome, Algorithms, Software, Computational Biology/methods, Molecular Sequence Annotation
3.
Plant J ; 119(3): 1596-1612, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38831668

ABSTRACT

Genome annotation files play a critical role in dictating the quality of downstream analyses by providing essential predictions for gene positions and structures. These files are pivotal in decoding the complex information encoded within DNA sequences. Here, we generated experimental data resolving RNA 5'- and 3'-ends as well as full-length RNAs for cassava TME12 sticklings in ambient temperature and cold. We used these data to generate genome annotation files using the TranscriptomeReconstructoR (TR) tool. A careful comparison to high-quality genome annotations suggests that our new TR genome annotations identified additional genes, resolved the transcript boundaries more accurately and identified additional RNA isoforms. We enhanced existing cassava genome annotation files with the information from TR that maintained the different transcript models as RNA isoforms. The resultant merged annotation was subsequently utilized for comprehensive analysis. To examine the effects of genome annotation files on gene expression studies, we compared the detection of differentially expressed genes during cold using the same RNA-seq data but alternative genome annotation files. We found that our merged genome annotation that included cold-specific TR gene models identified about twice as many cold-induced genes. These data indicate that environmentally induced genes may be missing in off-the-shelf genome annotation files. In conclusion, TR offers the opportunity to enhance crop genome annotations with implications for the discovery of differentially expressed candidate genes during plant-environment interactions.


Subject(s)
Plant Genome, Manihot, Molecular Sequence Annotation, Manihot/genetics, Plant Genome/genetics, Transcriptome, Plant Gene Expression Regulation, Gene Expression Profiling, Plant RNA/genetics
4.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-36988160

ABSTRACT

Small open reading frames (smORFs) encoding proteins less than 100 amino acids (aa) are known to be important regulators of key cellular processes. However, their computational identification remains a challenge. Based on a comprehensive analysis of known prokaryotic small ORFs, we have developed the ProsmORF-pred resource which uses a machine learning (ML)-based method for prediction of smORFs in the prokaryotic genome sequences. ProsmORF-pred consists of two ML models, one for initiation site recognition in nucleic acid sequences upstream of putative start codons and the other uses translated amino acid sequences to decipher functional protein like sequences. The nucleotide sequence-based initiation site recognition model has been trained using longer ORFs (>100 aa) in the same genome while the ML model for identification of protein like sequences has been trained using annotated smORFs from Escherichia coli. Comprehensive benchmarking of ProsmORF-pred reveals that its performance is comparable to other state-of-the-art approaches on the annotated smORF set derived from 32 prokaryotic genomes. Its performance is distinctly superior to other tools like PRODIGAL and RANSEPS for prediction of newly identified smORFs which have a length range of 10-30 aa, where prediction of smORFs has been a major challenge. Apart from identification of smORFs in genomic sequences, ProsmORF-pred can also aid in functional annotation of the predicted smORFs based on sequence similarity and genomic neighbourhood similarity searches in ProsmORFDB, a well-curated database of known smORFs. ProsmORF-pred along with its backend database ProsmORFDB is available as a user-friendly web server (http://www.nii.ac.in/prosmorfpred.html).


Subject(s)
Genome, Proteins, Open Reading Frames, Proteins/genetics, Genomics, Amino Acid Sequence
5.
Proc Natl Acad Sci U S A ; 119(46): e2211197119, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-36343249

ABSTRACT

Advances in medicine and biotechnology rely on a deep understanding of biological processes. Despite the increasingly available types and amounts of omics data, significant knowledge gaps remain, with current approaches to identify and curate missing annotations being limited to a set of already known reactions. Here, we introduce Network Integrated Computational Explorer for Gap Annotation of Metabolism (NICEgame), a workflow to identify and curate nonannotated metabolic functions in genomes using the ATLAS of Biochemistry and genome-scale metabolic models (GEMs). To resolve gaps in GEMs, NICEgame provides alternative sets of known and hypothetical reactions, assesses their thermodynamic feasibility, and suggests candidate genes to catalyze these reactions. We identified metabolic gaps and applied NICEgame in the latest GEM of Escherichia coli, iML1515, and enhanced the E. coli genome annotation by resolving 47% of these gaps. NICEgame, applicable to any GEM and functioning from open-source software, should thus enhance all GEM-based predictions and subsequent biotechnological and biomedical applications.


Subject(s)
Escherichia coli, Metabolic Networks and Pathways, Escherichia coli/genetics, Escherichia coli/metabolism, Workflow, Software, Genome, Biological Models
6.
J Proteome Res ; 23(5): 1583-1592, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38651221

ABSTRACT

MD2 pineapple (Ananas comosus) is the second most important tropical crop that preserves crassulacean acid metabolism (CAM), which has high water-use efficiency and is fast becoming the most consumed fresh fruit worldwide. Despite the significance of environmental efficiency and popularity, until very recently, its genome sequence has not been determined and a high-quality annotated proteome has not been available. Here, we have undertaken a pilot proteogenomic study, analyzing the proteome of MD2 pineapple leaves using liquid chromatography-mass spectrometry (LC-MS/MS), which validates 1781 predicted proteins in the annotated F153 (V3) genome. In addition, a further 603 peptide identifications are found that map exclusively to an independent MD2 transcriptome-derived database but are not found in the standard F153 (V3) annotated proteome. Peptide identifications derived from these MD2 transcripts are also cross-referenced to a more recent and complete MD2 genome annotation, resulting in 402 nonoverlapping peptides, which in turn support 30 high-quality gene candidates novel to both pineapple genomes. Many of the validated F153 (V3) genes are also supported by an independent proteomics data set collected for an ornamental pineapple variety. The contigs and peptides have been mapped to the current F153 genome build and are available as bed files to display a custom gene track on the Ensembl Plants region viewer. These analyses add to the knowledge of experimentally validated pineapple genes and demonstrate the utility of transcript-derived proteomics to discover both novel genes and genetic structure in a plant genome, adding value to its annotation.


Subject(s)
Ananas, Plant Genome, Plant Proteins, Proteogenomics, Tandem Mass Spectrometry, Ananas/genetics, Ananas/chemistry, Proteogenomics/methods, Plant Proteins/genetics, Plant Proteins/metabolism, Liquid Chromatography, Proteome/genetics, Proteome/analysis, Molecular Sequence Annotation, Plant Leaves/genetics, Plant Leaves/chemistry, Peptides/genetics, Peptides/analysis, Peptides/chemistry
7.
Plant J ; 113(2): 262-276, 2023 01.
Article in English | MEDLINE | ID: mdl-36424853

ABSTRACT

The king protea (Protea cynaroides), an early-diverging eudicot, is the most iconic species from the Megadiverse Cape Floristic Region, and the national flower of South Africa. Perhaps best known for its iconic flower head, Protea is a key genus for the South African horticulture industry and cut-flower market. Ecologically, the genus and the family Proteaceae are important models for radiation and adaptation, particularly to soils with limited phosphorus bio-availability. Here, we present a high-quality chromosome-scale assembly of the P. cynaroides genome as the first representative of the fynbos biome. We reveal an ancestral whole-genome duplication event that occurred in the Proteaceae around the late Cretaceous that preceded the divergence of all crown groups within the family and its extant diversity in all Southern continents. The relatively stable genome structure of P. cynaroides is invaluable for comparative studies and for unveiling paleopolyploidy in other groups, such as the distantly related sister group Ranunculales. Comparative genomics in sequenced genomes of the Proteales shows loss of key arbuscular mycorrhizal symbiosis genes likely ancestral to the family, and possibly the order. The P. cynaroides genome empowers new research in plant diversification, horticulture and adaptation, particularly to nutrient-poor soils.


Subject(s)
Proteaceae, Proteaceae/genetics, Ecosystem, Genomics, South Africa, Soil
8.
BMC Genomics ; 25(1): 405, 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38658835

ABSTRACT

Graph-based pangenome is gaining more popularity than linear pangenome because it stores more comprehensive information of variations. However, traditional linear genome browser has its own advantages, especially the tremendous resources accumulated historically. With the fast-growing number of individual genomes and their annotations available, the demand for a genome browser to visualize genome annotation for many individuals together with a graph-based pangenome is getting higher and higher. Here we report a new pangenome browser PPanG, a precise pangenome browser enabling nucleotide-level comparison of individual genome annotations together with a graph-based pangenome. Nine rice genomes with annotations were provided by default as potential references, and any individual genome can be selected as the reference. Our pangenome browser provides unprecedented insights on genome variations at different levels from base to gene, and reveals how the structures of a gene could differ for individuals. PPanG can be applied to any species with multiple individual genomes available and it is available at https://cgm.sjtu.edu.cn/PPanG .


Subject(s)
Genomics, Genomics/methods, Oryza/genetics, Molecular Sequence Annotation, Plant Genome, Genetic Variation, Software, Web Browser, Genetic Databases, Nucleotides/genetics, Genome
9.
BMC Genomics ; 25(1): 95, 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38262915

ABSTRACT

BACKGROUND: Evolutionarily conserved in plants, the enzyme D-myo-inositol-3-phosphate synthase (MIPS; EC 5.5.1.4) regulates the initial, rate-limiting reaction in the phytic acid biosynthetic pathway. They are reported to be transcriptional regulators involved in various physiological functions in the plants, growth, and biotic/abiotic stress responses. Even though the genomes of most legumes are fully sequenced and available, an all-inclusive study of the MIPS family members in legumes is still ongoing. RESULTS: We found 24 MIPS genes in ten legumes: Arachis hypogea, Cicer arietinum, Cajanus cajan, Glycine max, Lablab purpureus, Medicago truncatula, Pisum sativum, Phaseolus vulgaris, Trifolium pratense and Vigna unguiculata. The total number of MIPS genes found in each species ranged from two to three. The MIPS genes were classified into five clades based on their evolutionary relationships with Arabidopsis genes. The structural patterns of intron/exon and the protein motifs that were conserved in each gene were highly group-specific. In legumes, MIPS genes were inconsistently distributed across their genomes. A comparison of genomes and gene sequences showed that this family was subjected to purifying selection and the gene expansion in MIPS family in legumes was mainly caused by segmental duplication. Through quantitative PCR, expression patterns of MIPS in response to various abiotic stresses, in the vegetative tissues of various legumes were studied. Expression pattern shows that MIPS genes control the development and differentiation of various organs, and have significant responses to salinity and drought stress. CONCLUSION: The MIPS genes in the genomes of legumes have been identified, characterized and their expression was analysed. The findings pave way for understanding their molecular functions and evolution, and lead to identify the putative MIPS genes associated with different cell and tissue development.


Subject(s)
Arabidopsis, Cajanus, Cicer, Phaseolus, Vegetables, Glycine max
10.
BMC Genomics ; 25(1): 619, 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38898442

ABSTRACT

Plant genomics plays a pivotal role in enhancing global food security and sustainability by offering innovative solutions for improving crop yield, disease resistance, and stress tolerance. As the number of sequenced genomes grows and the accuracy and contiguity of genome assemblies improve, structural annotation of plant genomes continues to be a significant challenge due to their large size, polyploidy, and rich repeat content. In this paper, we present an overview of the current landscape in crop genomics research, highlighting the diversity of genomic characteristics across various crop species. We also assessed the accuracy of popular gene prediction tools in identifying genes within crop genomes and examined the factors that impact their performance. Our findings highlight the strengths and limitations of BRAKER2 and Helixer as leading structural genome annotation tools and underscore the impact of genome complexity, fragmentation, and repeat content on their performance. Furthermore, we evaluated the suitability of the predicted proteins as a reliable search space in proteomics studies using mass spectrometry data. Our results provide valuable insights for future efforts to refine and advance the field of structural genome annotation.


Subject(s)
Agricultural Crops, Plant Genome, Molecular Sequence Annotation, Proteomics, Agricultural Crops/genetics, Proteomics/methods, Genomics/methods, Plant Proteins/genetics, Plant Proteins/metabolism
11.
BMC Genomics ; 25(1): 575, 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38849728

ABSTRACT

BACKGROUND: Staphylococcus shinii appears as an umbrella species encompassing several strains of Staphylococcus pseudoxylosus and Staphylococcus xylosus. Given its phylogenetic closeness to S. xylosus, S. shinii can be found in similar ecological niches, including the microbiota of fermented meats where the species may contribute to colour and flavour development. In addition to these conventional functionalities, a biopreservation potential based on the production of antagonistic compounds may be available. Such potential, however, remains largely unexplored in contrast to the large body of research that is available on the biopreservative properties of lactic acid bacteria. The present study outlines the exploration of the genetic basis of competitiveness and antimicrobial activity of a fermented meat isolate, S. shinii IMDO-S216. To this end, its genome was sequenced, de novo assembled, and annotated. RESULTS: The genome contained a single circular chromosome and eight plasmid replicons. Focus of the genomic exploration was on secondary metabolite biosynthetic gene clusters coding for ribosomally synthesized and posttranslationally modified peptides. One complete cluster was coding for a bacteriocin, namely lactococcin 972; the genes coding for the pre-bacteriocin, the ATP-binding cassette transporter, and the immunity protein were also identified. Five other complete clusters were identified, possibly functioning as competitiveness factors. These clusters were found to be involved in various responses such as membrane fluidity, iron intake from the medium, a quorum sensing system, and decreased sensitivity to antimicrobial peptides and competing microorganisms. The presence of these clusters was equally studied among a selection of multiple Staphylococcus species to assess their prevalence in closely-related organisms. CONCLUSIONS: Such factors possibly translate in an improved adaptation and competitiveness of S. shinii IMDO-S216 which are, in turn, likely to improve its fitness in a fermented meat matrix.


Subject(s)
Bacteriocins, Bacterial Genome, Staphylococcus, Staphylococcus/genetics, Staphylococcus/metabolism, Bacteriocins/genetics, Bacteriocins/metabolism, Fermentation, Genomics/methods, Secondary Metabolism/genetics, Meat/microbiology, Multigene Family, Phylogeny
12.
BMC Genomics ; 25(1): 265, 2024 Mar 09.
Article in English | MEDLINE | ID: mdl-38461236

ABSTRACT

BACKGROUND: Over the last decades, it was subject of many studies to investigate the genomic connection of milk production and health traits in dairy cattle. Thereby, incorporating functional information in genomic analyses has been shown to improve the understanding of biological and molecular mechanisms shaping complex traits and the accuracies of genomic prediction, especially in small populations and across-breed settings. Still, little is known about the contribution of different functional and evolutionary genome partitioning subsets to milk production and dairy health. Thus, we performed a uni- and a bivariate analysis of milk yield (MY) and eight health traits using a set of ~34,497 German Holstein cows with 50K chip genotypes and ~17 million imputed sequence variants divided into 27 subsets depending on their functional and evolutionary annotation. In the bivariate analysis, eight trait-combinations were observed that contrasted MY with each health trait. Two genomic relationship matrices (GRM) were included, one consisting of the 50K chip variants and one consisting of each set of subset variants, to obtain subset heritabilities and genetic correlations. In addition, 50K chip heritabilities and genetic correlations were estimated applying merely the 50K GRM. RESULTS: In general, 50K chip heritabilities were larger than the subset heritabilities. The largest heritabilities were found for MY, which was 0.4358 for the 50K and 0.2757 for the subset heritabilities. Whereas all 50K genetic correlations were negative, subset genetic correlations were both, positive and negative (ranging from -0.9324 between MY and mastitis to 0.6662 between MY and digital dermatitis). The subsets containing variants which were annotated as noncoding related, splice sites, untranslated regions, metabolic quantitative trait loci, and young variants ranked highest in terms of their contribution to the traits' genetic variance. We were able to show that linkage disequilibrium between subset variants and adjacent variants did not cause these subsets' high effect. CONCLUSION: Our results confirm the connection of milk production and health traits in dairy cattle via the animals' metabolic state. In addition, they highlight the potential of including functional information in genomic analyses, which helps to dissect the extent and direction of the observed traits' connection in more detail.


Subject(s)
Milk, Single Nucleotide Polymorphism, Animals, Female, Cattle/genetics, Phenotype, Genotype, Genomics/methods, Quantitative Trait Loci, Lactation/genetics
13.
Mol Biol Evol ; 40(3)2023 03 04.
Article in English | MEDLINE | ID: mdl-36857197

ABSTRACT

MitoFish, MitoAnnotator, and MiFish Pipeline are comprehensive databases of fish mitochondrial genomes (mitogenomes), accurate annotation software of fish mitogenomes, and a web platform for metabarcoding analysis of fish mitochondrial environmental DNA (eDNA), respectively. The MitoFish Suite currently receives over 48,000 visits worldwide every year; however, the performance and usefulness of the online platforms can still be improved. Here, we present essential updates on these platforms, including an enrichment of the reference data sets, an enhanced searching function, substantially faster genome annotation and eDNA analysis with the denoising of sequencing errors, and a multisample comparative analysis function. These updates have made our platform more intuitive, effective, and reliable. These updated platforms are freely available at http://mitofish.aori.u-tokyo.ac.jp/.


Subject(s)
Mitochondrial DNA, Mitochondrial Genome, Animals, Factual Databases, Mitochondria, Software
14.
Curr Issues Mol Biol ; 46(3): 2497-2513, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38534774

ABSTRACT

Phospholipases find versatile applications across industries, including detergent production, food modification, pharmaceuticals (especially in drug delivery systems), and cell signaling research. In this study, we present a strain of Bacillus paranthracis for the first time, demonstrating significant potential in the production of phosphatidylcholine-specific phospholipase C (PC-PLC). The investigation thoroughly examines the B. paranthracis PUMB_17 strain, focusing on the activity of PC-PLC and its purification process. Notably, the PUMB_17 strain displays extracellular PC-PLC production with high specific activity during the late exponential growth phase. To unravel the genetic makeup of PUMB_17, we employed nanopore-based whole-genome sequencing and subsequently conducted a detailed genome annotation. The genome comprises a solitary circular chromosome spanning 5,250,970 bp, featuring a guanine-cytosine ratio of 35.49. Additionally, two plasmids of sizes 64,250 bp and 5845 bp were identified. The annotation analysis reveals the presence of 5328 genes, encompassing 5186 protein-coding sequences, and 142 RNA genes, including 39 rRNAs, 103 tRNAs, and 5 ncRNAs. The aim of this study was to make a comprehensive genomic exploration that promises to enhance our understanding of the previously understudied and recently documented capabilities of Bacillus paranthracis and to shed light on a potential use of the strain in the industrial production of PC-PLC.
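
The guanine-cytosine figure quoted in this abstract is read here as GC content in percent, which is an assumption about the authors' wording. Under that reading, the calculation for any sequence can be sketched as follows; the function name `gc_content` and the choice to ignore ambiguous bases such as N are illustrative decisions, not details from the paper.

```python
# Minimal sketch: GC content as a percentage of unambiguous bases.
# Assumption: ambiguous symbols (for example N) are excluded from the denominator.
def gc_content(seq):
    """Return the percentage of G and C among the A, C, G, T bases in seq."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / acgt


if __name__ == "__main__":
    print(gc_content("GGCCAATT"))  # half of the bases are G or C
```

For a multi-replicon genome such as a chromosome plus plasmids, the same function could be applied to each replicon separately or to their concatenation, depending on which figure is wanted.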

15.
J Mol Evol ; 92(3): 338-357, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38809331

ABSTRACT

Brucellosis is a notifiable disease induced by a facultative intracellular Brucella pathogen. In this study, eight Brucella abortus and eighteen Brucella melitensis strains from Egypt were annotated and compared with RB51 and REV1 vaccines respectively. RAST toolkit in the BV-BRC server was used for annotation, revealing genome length of 3,250,377 bp and 3,285,803 bp, 3289 and 3323 CDS, 48 and 49 tRNA genes, the same number of rRNA (3) genes, 583 and 586 hypothetical proteins, 2697 and 2726 functional proteins for B. abortus and B. melitensis respectively. B. abortus strains exhibit a similar number of candidate genes, while B. melitensis strains showed some differences, especially in the SRR19520422 Faiyum strain. Also, B. melitensis clarified differences in antimicrobial resistance genes (KatG, FabL, MtrA, MtrB, OxyR, and VanO-type) in SRR19520319 Faiyum and (Erm C and Tet K) in SRR19520422 Faiyum strain. Additionally, the whole genome phylogeny analysis proved that all B. abortus strains were related to vaccinated animals and all B. melitensis strains of Menoufia clustered together and closely related to Gharbia, Dameitta, and Kafr Elshiek. The Bowtie2 tool identified 338 (eight B. abortus) and 4271 (eighteen B. melitensis) single nucleotide polymorphisms (SNPs) along the genomes. These variants had been annotated according to type and impact. Moreover, thirty candidate genes were predicted and submitted at GenBank (24 in B. abortus) and (6 in B. melitensis). This study contributes significant insights into genetic variation, virulence factors, and vaccine-related associations of Brucella pathogens, enhancing our knowledge of brucellosis epidemiology and evolution in Egypt.


Subject(s)
Brucella abortus, Brucella melitensis, Bacterial Genome, Genomics, Phylogeny, Brucella melitensis/genetics, Brucella abortus/genetics, Egypt, Genomics/methods, Animals, Brucellosis/microbiology, Brucella Vaccine/genetics, Bacterial Vaccines
16.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35108356

ABSTRACT

Bacterial genomes are massively sequenced, and they provide valuable data to better know the complete set of genes of a species. The analysis of thousands of bacterial strains can identify both shared genes and those appearing only in the pathogenic ones. Current computational gene finders facilitate this task but often miss some existing genes. However, the present availability of different genomes from the same species is useful to estimate the selective pressure applied on genes of complete pangenomes. It may assist in evaluating gene predictions either by checking the certainty of a new gene or annotating it as a gene under positive selection. Here, we estimated the selective pressure of 19 271 genes that are part of the pangenome of the human opportunistic pathogen Acinetobacter baumannii and found that most genes in this bacterium are subject to negative selection. However, 23% of them showed values compatible with positive selection. These latter were mainly uncharacterized proteins or genes required to evade the host defence system including genes related to resistance and virulence whose changes may be favoured to acquire new functions. Finally, we evaluated the utility of measuring selection pressure in the detection of sequencing errors and the validation of gene prediction.


Subject(s)
Acinetobacter baumannii, Bacterial Genome, Acinetobacter baumannii/genetics, Acinetobacter baumannii/metabolism, Bacteria/genetics, Base Sequence, Humans, Phylogeny, Virulence/genetics
17.
J Hered ; 115(1): 112-119, 2024 Feb 03.
Article in English | MEDLINE | ID: mdl-37988623

ABSTRACT

Snakeflies (Raphidioptera) are the smallest order of holometabolous insects that have kept their distinct and name-giving appearance since the Mesozoic, probably since the Jurassic, and possibly even since their emergence in the Carboniferous, more than 300 million years ago. Despite their interesting nature and numerous publications on their morphology, taxonomy, systematics, and biogeography, snakeflies have never received much attention from the general public, and only a few studies were devoted to their molecular biology. Due to this lack of molecular data, it is therefore unknown, if the conserved morphological nature of these living fossils translates to conserved genomic structures. Here, we present the first genome of the species and of the entire order of Raphidioptera. The final genome assembly has a total length of 669 Mbp and reached a high continuity with an N50 of 5.07 Mbp. Further quality controls also indicate a high completeness and no meaningful contamination. The newly generated data was used in a large-scaled phylogenetic analysis of snakeflies using shared orthologous sequences. Quartet score and gene concordance analyses revealed high amounts of conflicting signals within this group that might speak for substantial incomplete lineage sorting and introgression after their presumed re-radiation after the asteroid impact 66 million years ago. Overall, this reference genome will be a door-opening dataset for many future research applications, and we demonstrated its utility in a phylogenetic analysis that provides new insights into the evolution of this group of living fossils.
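
The N50 statistic quoted in this abstract as a measure of assembly continuity has a standard definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal sketch of that definition follows; the function name `n50` and the toy contig lengths are illustrative and are not data from the paper.

```python
# Minimal sketch of the standard N50 definition.
# The contig lengths used below are toy values, not taken from the assembly in the abstract.
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig that brings the running sum to half the total.
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths supplied")


if __name__ == "__main__":
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Comparing `running * 2` with `total` keeps the arithmetic in integers and avoids rounding issues when the total length is odd.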


Subject(s)
Fossils, Genome, Animals, Phylogeny, Genomics, Insects/genetics
18.
Antonie Van Leeuwenhoek ; 117(1): 91, 2024 Jun 22.
Article in English | MEDLINE | ID: mdl-38907751

ABSTRACT

A Gram-stain-negative, facultative anaerobe, rod-shaped strain JX-1T was isolated from UASB sludge treating landfill leachate in Wuhan, China. The isolate is capable of growing under conditions of pH 6.0-11.0 (optimum, pH 7.0-8.0), temperature 4-42 °C (optimum, 20-30 °C), 0-8.0% (w/v) NaCl (optimum, 5.0%), and ammonia nitrogen concentration of 200-5000 mg/L (optimum, 500 mg/L) on LB plates. The microorganism can utilize malic acid, D-galactose, L-rhamnose, inosine, and L-glutamic acid as carbon sources, but does not reduce nitrates and nitrites. The major fatty acids are C18:1ω7c/C18:1ω6c, iso-C15:0, and anteiso-C15:0. The respiratory quinones are Q9 (91.92%) and Q8 (8.08%). Polar lipids include aminolipid, aminophospholipid, diphosphatidylglycerol, glycolipid, phosphatidylethanolamine, phosphatidylglycerol, and phospholipid. Compared with other strains, strain JX-1T and Denitrificimonas caeni HY-14T have the highest values in terms of 16S rRNA gene sequence similarity (96.79%), average nucleotide identity (ANI; 76.06%), and average amino acid identity (AAI; 78.89%). Its digital DNA-DNA hybridization (dDDH) result is 20.3%. The genome of strain JX-1T, with a size of 2.78 Mb and 46.12 mol% G + C content, lacks genes for denitrification and dissimilatory nitrate reduction to ammonium (DNRA), but contains genes for ectoine synthesis as a secondary metabolite. The results of this polyphasic study allow genotypic and phenotypic differentiation of the analysed strain from the closest related species and confirm that the strain represents a novel species within the genus Denitrificimonas, for which the name Denitrificimonas halotolerans sp. nov. is proposed with JX-1T (= MCCC 1K08958T = KCTC 8395T) as the type strain.


Subject(s)
Base Composition , Phylogeny , 16S Ribosomal RNA , Sewage , Sewage/microbiology , 16S Ribosomal RNA/genetics , China , Fatty Acids/chemistry , Bacterial DNA/genetics , Bacterial Typing Techniques , Aeromonadaceae/genetics , Aeromonadaceae/classification , Aeromonadaceae/isolation & purification , Aeromonadaceae/metabolism , Phospholipids/analysis
19.
Curr Genomics ; 25(3): 226-235, 2024 May 31.
Article in English | MEDLINE | ID: mdl-39086996

ABSTRACT

Introduction: Nicotine degradation is a new strategy to block nicotine-induced pathology. The potential of human microbiota to degrade nicotine has not been explored. Aims: This study aimed to uncover the genomic potentials of human microbiota to degrade nicotine. Methods: To address this issue, we performed a systematic annotation of Nicotine-Degrading Enzymes (NDEs) from genomes and metagenomes of human microbiota. A total of 26,295 genomes and 1,596 metagenomes for human microbiota were downloaded from public databases and five types of NDEs were annotated with a custom pipeline. We found 959 NdhB, 785 NdhL, 987 NicX, three NicA1, and three NicA2 homologs. Results: Genomic classification revealed that six phylum-level taxa, including Proteobacteria, Firmicutes, Firmicutes_A, Bacteroidota, Actinobacteriota, and Chloroflexota, can produce NDEs, with Proteobacteria encoding all five types of NDEs studied. Analysis of NicX prevalence revealed differences among body sites. NicX homologs were found in gut and oral samples with a high prevalence but not found in lung samples. NicX was found in samples from both smokers and non-smokers, though the prevalence might be different. Conclusion: This study represents the first systematic investigation of NDEs from the human microbiota, providing new insights into the physiology and ecological functions of human microbiota and shedding new light on the development of nicotine-degrading probiotics for the treatment of smoking-related diseases.

20.
Int J Mol Sci ; 25(7)2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38612464

ABSTRACT

Immunodominant alloantigens in pig sperm membranes include 15 known gene products and a previously undiscovered Mr 20,000 sperm membrane-specific protein (SMA20). Here we characterize SMA20 and identify it as the unannotated pig ortholog of PMIS2. A composite SMA20 cDNA encoded a 126 amino acid polypeptide comprising two predicted transmembrane segments and an N-terminal alanine- and proline (AP)-rich region with no apparent signal peptide. The Northern blots showed that the composite SMA20 cDNA was derived from a 1.1 kb testis-specific transcript. A BLASTp search retrieved no SMA20 match from the pig genome, but it did retrieve a 99% match to the Pmis2 gene product in warthog. Sequence identity to predicted PMIS2 orthologs from other placental mammals ranged from no more than 80% overall in Cetartiodactyla to less than 60% in Primates, with the AP-rich region showing the highest divergence, including, in the extreme, its absence in most rodents, including the mouse. SMA20 immunoreactivity localized to the acrosome/apical head of methanol-fixed boar spermatozoa but not live, motile cells. Ultrastructurally, the SMA20 AP-rich domain immunolocalized to the inner leaflet of the plasma membrane, the outer acrosomal membrane, and the acrosomal contents of ejaculated spermatozoa. Gene name search failed to retrieve annotated Pmis2 from most mammalian genomes. Nevertheless, individual pairwise interrogation of loci spanning Atp4a-Haus5 identified Pmis2 in all placental mammals, but not in marsupials or monotremes. We conclude that the gene encoding sperm-specific SMA20/PMIS2 arose de novo in Eutheria after divergence from Metatheria, whereupon rapid molecular evolution likely drove the acquisition of a species-divergent function unique to fertilization in placental mammals.


Subject(s)
Placenta , Semen , Male , Female , Pregnancy , Swine , Animals , Mice , Complementary DNA , Spermatozoa , Eutheria , Alanine , Isoantigens/genetics , Fertilization/genetics