Results 1 - 20 of 3,658
1.
BMC Genom Data ; 25(1): 70, 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39009995

ABSTRACT

OBJECTIVES: Ants are ecologically dominant insects in most terrestrial ecosystems, with more than 14,000 extant species in about 340 genera recorded to date. However, genomic resources are still scarce for most species, especially for species endemic to East or Southeast Asia, limiting the study of phylogeny, speciation and adaptation of this evolutionarily successful animal lineage. Here, we assemble and annotate the genomes of Odontoponera transversa and Camponotus friedae, two ant species with a natural distribution in China, to facilitate future study of ant evolution. DATA DESCRIPTION: We obtained a total of 16 Gb and 51 Gb of PacBio HiFi data for O. transversa and C. friedae, respectively, which were assembled into draft genomes of 339 Mb for O. transversa and 233 Mb for C. friedae. Genome assessments by multiple metrics showed good completeness and high accuracy of the two assemblies. Gene annotations assisted by RNA-seq data yielded a comparable number of protein-coding genes in the two genomes (10,892 for O. transversa and 11,296 for C. friedae), while repeat annotations revealed a remarkable difference in repeat content between these two ant species (149.4 Mb for O. transversa versus 49.7 Mb for C. friedae). In addition, complete mitochondrial genomes for the two species were assembled and annotated.


Subjects
Ants; Genome, Insect; Animals; Ants/genetics; Ants/classification; Genome, Insect/genetics; Molecular Sequence Annotation; Phylogeny; Genomics/methods
2.
JMIR Med Inform ; 12: e59680, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38954456

ABSTRACT

BACKGROUND: Named entity recognition (NER) is a fundamental task in natural language processing. However, it is typically preceded by named entity annotation, which poses several challenges, especially in the clinical domain. For instance, determining entity boundaries is one of the most common sources of disagreements between annotators due to questions such as whether modifiers or peripheral words should be annotated. If unresolved, these can induce inconsistency in the produced corpora, yet, on the other hand, strict guidelines or adjudication sessions can further prolong an already slow and convoluted process. OBJECTIVE: The aim of this study is to address these challenges by evaluating 2 novel annotation methodologies, lenient span and point annotation, aiming to mitigate the difficulty of precisely determining entity boundaries. METHODS: We evaluate their effects through an annotation case study on a Japanese medical case report data set. We compare annotation time, annotator agreement, and the quality of the produced labeling and assess the impact on the performance of an NER system trained on the annotated corpus. RESULTS: We saw significant improvements in the labeling process efficiency, with up to a 25% reduction in overall annotation time and even a 10% improvement in annotator agreement compared to the traditional boundary-strict approach. However, even the best-achieved NER model presented some drop in performance compared to the traditional annotation methodology. CONCLUSIONS: Our findings demonstrate a balance between annotation speed and model performance. Although disregarding boundary information affects model performance to some extent, this is counterbalanced by significant reductions in the annotator's workload and notable improvements in the speed of the annotation process. These benefits may prove valuable in various applications, offering an attractive compromise for developers and researchers.
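The difference between boundary-strict and boundary-lenient agreement can be illustrated with a small sketch. The abstract does not define "lenient span" or "point annotation" precisely, so the definitions below are assumptions made for illustration: a lenient match is any overlap between two spans with the same label, and a point match is a single offset that falls inside a span with the same label. The `Span` type and all function names are hypothetical.

```python
from typing import Callable, List, NamedTuple


class Span(NamedTuple):
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str


def strict_match(a: Span, b: Span) -> bool:
    """Boundary-strict: label and both boundaries must be identical."""
    return a == b


def lenient_match(a: Span, b: Span) -> bool:
    """Assumed lenient rule: same label and any character overlap."""
    return a.label == b.label and a.start < b.end and b.start < a.end


def point_match(point: int, label: str, span: Span) -> bool:
    """Assumed point rule: the marked offset lies inside a span of the same label."""
    return label == span.label and span.start <= point < span.end


def agreement(spans_a: List[Span], spans_b: List[Span],
              match: Callable[[Span, Span], bool]) -> float:
    """F1-style agreement between two annotators under a given match rule."""
    if not spans_a or not spans_b:
        return 0.0
    hits_a = sum(any(match(a, b) for b in spans_b) for a in spans_a)
    hits_b = sum(any(match(b, a) for a in spans_a) for b in spans_b)
    precision = hits_a / len(spans_a)
    recall = hits_b / len(spans_b)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Two annotators disagree only on whether a trailing modifier belongs to the entity.
annotator_a = [Span(0, 5, "disease"), Span(10, 20, "drug")]
annotator_b = [Span(0, 7, "disease"), Span(10, 20, "drug")]
print(agreement(annotator_a, annotator_b, strict_match))   # 0.5
print(agreement(annotator_a, annotator_b, lenient_match))  # 1.0
```

Under these assumed rules, a boundary disagreement that halves strict agreement leaves lenient agreement untouched, which is the kind of effect the study measures.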

3.
Metabolomics ; 20(4): 73, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38980450

ABSTRACT

INTRODUCTION: During the Metabolomics 2023 conference, the Metabolomics Quality Assurance and Quality Control Consortium (mQACC) presented a QA/QC workshop for LC-MS-based untargeted metabolomics. OBJECTIVES: The Best Practices Working Group disseminated recent findings from community forums and discussed aspects to include in a living guidance document. METHODS: Presentations focused on reference materials, data quality review, metabolite identification/annotation and quality assurance. RESULTS: Live polling results and follow-up discussions offered a broad international perspective on QA/QC practices. CONCLUSIONS: Community input gathered from this workshop series is being used to shape the living guidance document, a continually evolving QA/QC best practices resource for metabolomics researchers.


Subjects
Mass Spectrometry; Metabolomics; Quality Control; Metabolomics/methods; Metabolomics/standards; Chromatography, Liquid/methods; Chromatography, Liquid/standards; Mass Spectrometry/methods; Humans; Consensus; Liquid Chromatography-Mass Spectrometry
4.
PeerJ ; 12: e17651, 2024.
Article in English | MEDLINE | ID: mdl-38993980

ABSTRACT

Background: Genomic resource development for non-model organisms is rapidly progressing, seeking to uncover the molecular mechanisms and evolutionary adaptations that enable organisms to thrive in diverse environments. Limited genomic data for bat species hinder insights into their evolutionary processes, particularly within the diverse Myotis genus of the Vespertilionidae family. In Mexico, 15 Myotis species exist, with three (M. vivesi, M. findleyi, and M. planiceps) being endemic and of conservation concern. Methods: We obtained samples of Myotis vivesi, M. findleyi, and M. planiceps for genomic analysis. Genomic DNA from each of the three species was extracted, sequenced, and assembled. Scaffolding was carried out against the M. yumanensis genome via a genome-referenced approach within the ntJoin program. GapCloser was employed to fill gaps. Repeat elements were characterized, and gene prediction was done via ab initio and homology methods with the MAKER pipeline. Functional annotation involved InterproScan, BLASTp, and KEGG. Non-coding RNAs were annotated with INFERNAL and tRNAscan-SE. Orthologous genes were clustered using Orthofinder, and a phylogenomic tree was reconstructed using IQ-TREE. Results: We present genome assemblies of these endemic species using Illumina NovaSeq 6000, each exceeding 2.0 Gb, with over 90% representing single-copy genes according to BUSCO analyses. Transposable elements, including LINEs and SINEs, constitute over 30% of each genome. Helitrons were identified, consistent with other vespertilionids. Gene annotation and functional assignment yielded approximately 20,000 genes for each of the three assemblies. Comparative analysis of orthologs among eight Myotis species revealed 20,820 groups, with 4,789 being single-copy orthogroups. Non-coding RNA elements were annotated. Phylogenomic tree analysis supported the evolutionary relationships among chiropterans.
These resources contribute significantly to understanding gene evolution and diversification patterns, and to aiding conservation efforts for these endangered bat species.


Subjects
Chiroptera; Genome; Genomics; Phylogeny; Animals; Mexico; Genome/genetics; Chiroptera/genetics; Genomics/methods
5.
Biom J ; 66(5): e202300182, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39001709

ABSTRACT

Spatial count data with an abundance of zeros arise commonly in disease mapping studies. Typically, these data are analyzed using zero-inflated models, which comprise a mixture of a point mass at zero and an ordinary count distribution, such as the Poisson or negative binomial. However, due to their mixture representation, conventional zero-inflated models are challenging to explain in practice because the parameter estimates have conditional latent-class interpretations. As an alternative, several authors have proposed marginalized zero-inflated models that simultaneously model the excess zeros and the marginal mean, leading to a parameterization that more closely aligns with ordinary count models. Motivated by a study examining predictors of COVID-19 death rates, we develop a spatiotemporal marginalized zero-inflated negative binomial model that directly models the marginal mean, thus extending marginalized zero-inflated models to the spatial setting. To capture the spatiotemporal heterogeneity in the data, we introduce region-level covariates, smooth temporal effects, and spatially correlated random effects to model both the excess zeros and the marginal mean. For estimation, we adopt a Bayesian approach that combines full-conditional Gibbs sampling and Metropolis-Hastings steps. We investigate features of the model and use the model to identify key predictors of COVID-19 deaths in the US state of Georgia during the 2021 calendar year.


Subjects
Bayes Theorem; Biometry; COVID-19; Models, Statistical; Humans; COVID-19/mortality; COVID-19/epidemiology; Georgia/epidemiology; Biometry/methods; Spatial Analysis; Binomial Distribution
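The contrast drawn in the abstract above between conventional and marginalized zero-inflated models can be made concrete with a short sketch. It assumes the usual mixture parameterization, with a zero-inflation probability `pi`, a negative binomial mean `mu` and a size (dispersion) parameter `r`; these symbol names are not taken from the abstract, and the spatial and temporal effects of the published model are not reproduced here. In the marginalized form, the overall mean `nu = (1 - pi) * mu` is the quantity parameterized directly.

```python
import math


def nb_pmf(y: int, mu: float, r: float) -> float:
    """Negative binomial probability of count y, with mean mu and size r."""
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu))
             + y * math.log(mu / (r + mu)))
    return math.exp(log_p)


def zinb_pmf(y: int, pi: float, mu: float, r: float) -> float:
    """Conventional ZINB: point mass at zero mixed with a negative binomial."""
    base = (1.0 - pi) * nb_pmf(y, mu, r)
    return pi + base if y == 0 else base


def marginalized_zinb_pmf(y: int, pi: float, nu: float, r: float) -> float:
    """Same distribution, parameterized by the marginal mean nu = (1 - pi) * mu."""
    mu = nu / (1.0 - pi)  # latent-class mean implied by the marginal mean
    return zinb_pmf(y, pi, mu, r)


# Illustrative values only: 30% excess zeros and a marginal mean of 2 deaths.
pi, nu, r = 0.3, 2.0, 1.5
mean = sum(y * marginalized_zinb_pmf(y, pi, nu, r) for y in range(2000))
print(round(mean, 6))  # the numerically recovered mean matches nu
```

The point of the marginalized form is that regression coefficients placed on `log(nu)` describe the whole population mean, rather than the mean of the latent "at-risk" class.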
6.
Genome Biol Evol ; 16(7)2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38946321

ABSTRACT

Oecanthus is a genus of cricket known for its distinctive chirping and distributed across major zoogeographical regions worldwide. This study focuses on Oecanthus rufescens, and conducts a comprehensive examination of its genome through genome sequencing technologies and bioinformatic analysis. A high-quality chromosome-level genome of O. rufescens was successfully obtained, revealing significant features of its genome structure. The genome size is 877.9 Mb, comprising ten pseudo-chromosomes and 70 other sequences, with a GC content of 41.38% and an N50 value of 157,110,771 bp, indicating a high level of continuity. BUSCO assessment results demonstrate that the genome's integrity and quality are high (of which 96.8% are single-copy and 1.6% are duplicated). Comprehensive genome annotation was also performed, identifying approximately 310 Mb of repetitive sequences, accounting for 35.3% of the total genome sequence, and discovering 15,481 tRNA genes, 4,082 rRNA genes, and 1,212 other noncoding genes. Furthermore, 15,031 protein-coding genes were identified, with BUSCO assessment results showing that 98.4% (of which 96.3% are single-copy and 1.6% are duplicated) of the genes were annotated.


Subjects
Genome, Insect; Molecular Sequence Annotation; Animals; Chromosomes, Insect/genetics; Gryllidae/genetics; Orthoptera/genetics; Orthoptera/classification
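Two of the assembly statistics quoted in the abstract above, N50 and GC content, are simple to compute from a list of sequences. The following is a minimal sketch using their standard definitions; the function names and the toy inputs are illustrative and have nothing to do with the O. rufescens data.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L cover at least half the assembly."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50 requires at least one sequence")


def gc_content(sequence: str) -> float:
    """Fraction of G and C among unambiguous bases (N and other codes are ignored)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return gc / (gc + at)


print(n50([10, 8, 6, 4, 2]))      # 8
print(gc_content("ACGTGGCCNN"))   # 0.75
```

A large N50 relative to genome size, as reported for the ten pseudo-chromosomes, means that most of the assembly sits in a few long sequences.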
7.
BioData Min ; 17(1): 22, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-38997749

ABSTRACT

BACKGROUND: The use of machine learning in medical diagnosis and treatment has grown significantly in recent years with the development of computer-aided diagnosis systems, often based on annotated medical radiology images. However, the lack of large annotated image datasets remains a major obstacle, as the annotation process is time-consuming and costly. This study aims to overcome this challenge by proposing an automated method for annotating a large database of medical radiology images based on their semantic similarity. RESULTS: An automated, unsupervised approach is used to create a large annotated dataset of medical radiology images originating from the Clinical Hospital Centre Rijeka, Croatia. The pipeline is built by data-mining three different types of medical data: images, DICOM metadata and narrative diagnoses. The optimal feature extractors are then integrated into a multimodal representation, which is then clustered to create an automated pipeline for labelling a precursor dataset of 1,337,926 medical images into 50 clusters of visually similar images. The quality of the clusters is assessed by examining their homogeneity and mutual information, taking into account the anatomical region and modality representation. CONCLUSIONS: The results indicate that fusing the embeddings of all three data sources together provides the best results for the task of unsupervised clustering of large-scale medical data and leads to the most concise clusters. Hence, this work marks the initial step towards building a much larger and more fine-grained annotated dataset of medical radiology images.

8.
Data Brief ; 55: 110545, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38952954

ABSTRACT

This dataset involves a collection of soybean market news through web scraping from a Brazilian website. The news articles gathered span from January 2015 to June 2023 and have undergone a labeling process to categorize them as relevant or non-relevant. The news labeling process was conducted under the guidance of an agricultural economics expert, who collaborated with a group of nine individuals. Ten parameters were considered to assist participants in the labeling process. The dataset comprises approximately 11,000 news articles and serves as a valuable resource for researchers interested in exploring trends in the soybean market. Importantly, this dataset can be utilized for tasks such as classification and natural language processing. It provides insights into labeled soybean market news and supports open science initiatives, facilitating further analysis within the research community.

9.
Med Image Anal ; 97: 103262, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38986351

ABSTRACT

Automatic image-based severity estimation is an important task in computer-aided diagnosis. Severity estimation by deep learning requires a large amount of training data to achieve a high performance. In general, severity estimation uses training data annotated with discrete (i.e., quantized) severity labels. Annotating discrete labels is often difficult in images with ambiguous severity, and the annotation cost is high. In contrast, relative annotation, in which the severity between a pair of images is compared, can avoid quantizing severity and thus makes it easier. We can estimate relative disease severity using a learning-to-rank framework with relative annotations, but relative annotation has the problem of the enormous number of pairs that can be annotated. Therefore, the selection of appropriate pairs is essential for relative annotation. In this paper, we propose a deep Bayesian active learning-to-rank that automatically selects appropriate pairs for relative annotation. Our method preferentially annotates unlabeled pairs with high learning efficiency from the model uncertainty of the samples. We prove the theoretical basis for adapting Bayesian neural networks to pairwise learning-to-rank and demonstrate the efficiency of our method through experiments on endoscopic images of ulcerative colitis on both private and public datasets. We also show that our method achieves a high performance under conditions of significant class imbalance because it automatically selects samples from the minority classes.
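The two ingredients named in the abstract above, pairwise learning-to-rank and uncertainty-driven pair selection, can be sketched in a few lines. This is not the authors' method: the loss is a generic RankNet-style logistic loss, and the acquisition rule (predictive entropy of the pair probability averaged over several stochastic forward passes) is an assumed stand-in for whatever criterion the paper uses. All names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pair_probability(score_i: float, score_j: float) -> float:
    """Probability that image i is more severe than image j, given scalar scores."""
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))


def pairwise_loss(score_i: float, score_j: float, i_is_more_severe: bool) -> float:
    """Logistic (cross-entropy) loss for one relative annotation."""
    p = pair_probability(score_i, score_j)
    return -math.log(p if i_is_more_severe else 1.0 - p)


def binary_entropy(p: float) -> float:
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def most_uncertain_pair(sampled_scores: Sequence[Dict[str, float]],
                        candidate_pairs: List[Tuple[str, str]]) -> Tuple[str, str]:
    """Pick the unlabeled pair whose averaged prediction is closest to a coin flip.

    sampled_scores holds one {image_id: score} mapping per stochastic forward
    pass (for example, Monte Carlo dropout samples of a Bayesian network).
    """
    def uncertainty(pair: Tuple[str, str]) -> float:
        i, j = pair
        mean_p = sum(pair_probability(s[i], s[j]) for s in sampled_scores)
        mean_p /= len(sampled_scores)
        return binary_entropy(mean_p)

    return max(candidate_pairs, key=uncertainty)


# Toy example: images "a" and "c" have nearly identical predicted severity.
samples = [{"a": 2.0, "b": 0.0, "c": 1.9}, {"a": 2.2, "b": 0.1, "c": 2.3}]
print(most_uncertain_pair(samples, [("a", "b"), ("a", "c")]))  # ('a', 'c')
```

Asking annotators about pairs like ("a", "c") spends labeling effort where the model cannot yet decide, instead of on pairs whose order it already predicts confidently.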

10.
Magn Reson Med ; 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38987985

ABSTRACT

PURPOSE: The transverse relaxation time $T_2$ holds significant relevance in clinical applications and research studies. Conventional $T_2$ mapping approaches rely on spin-echo sequences, which require lengthy acquisition times and involve high radiofrequency (RF) power deposition. An alternative gradient echo (GRE) phase-based $T_2$ mapping method, utilizing steady-state acquisitions at one small RF spoil phase increment, was recently demonstrated. Here, a modified magnitude- and phase-based $T_2$ mapping approach is proposed, which improves $T_2$ estimations by simultaneous fitting of $T_1$ and signal amplitude ($A \propto PD$) at three or more RF spoiling phase increments, instead of assuming a fixed $T_1$ value. METHODS: The feasibility of the magnitude-phase-based method was assessed by simulations, in phantom and in vivo experiments using skipped-CAIPI three-dimensional-echo-planar imaging (3D-EPI) for rapid GRE imaging. $T_2$, $T_1$ and PD estimations obtained by our method were compared to $T_2$ of the phase-based method and $T_1$ and PD of spoiled GRE-based multi-parameter mapping using a multi-echo version of the same sequence. RESULTS: The agreement of the proposed $T_2$ with ground truth and reference $T_2$ values was higher than that of phase-based $T_2$ in simulations and in phantom data. While phase-based $T_2$ overestimation increases with actual $T_2$ and $T_1$, the proposed method is accurate over a large range of physiologically meaningful $T_2$ and $T_1$ values. At the same time, precision is improved. In vivo results were in line with these observations.
CONCLUSION: Accurate magnitude-phase-based $T_2$ mapping is feasible in less than 5 min scan time for 1 mm nominal isotropic whole-head coverage at 3T and 7T.

11.
Magn Reson Med ; 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38988054

ABSTRACT

PURPOSE: To standardize $T_2$-weighted images from clinical Turbo Spin Echo (TSE) scans by generating corresponding $T_2$ maps with the goal of removing scanner- and/or protocol-specific heterogeneity. METHODS: The $T_2$ map is estimated by minimizing an objective function containing a data fidelity term in a Virtual Conjugate Coils (VCC) framework, where the signal evolution model is expressed as a linear constraint. The objective function is minimized by Projected Gradient Descent (PGD). RESULTS: The algorithm achieves accuracy comparable to methods with customized sampling schemes for accelerated $T_2$ mapping. The results are insensitive to the tunable parameters, and the relaxed background phase prior produces better $T_2$ maps compared to the strict real-value enforcement. It is worth noting that the algorithm works well with challenging $T_2$w-TSE data using typical clinical parameters. The observed normalized root mean square error ranges from 6.8% to 12.3% over grey and white matter, a clinically common level of quantitative map error. CONCLUSION: The novel methodological development creates an efficient algorithm that allows $T_2$ maps to be generated from TSE data with typical clinical parameters, such as high resolution, long echo train length, and low echo spacing. Reconstruction of $T_2$ maps from TSE data with typical clinical parameters has not been previously reported.

12.
Magn Reson Med ; 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38988040

ABSTRACT

PURPOSE: To explore the high signal-to-noise ratio (SNR) efficiency of interleaved multishot 3D-EPI with standard image reconstruction for fast and robust high-resolution whole-brain quantitative susceptibility (QSM) and $R_2^{\ast}$ mapping at 7 and 3T. METHODS: Single- and multi-TE segmented 3D-EPI is combined with conventional CAIPIRINHA undersampling for up to 72-fold effective gradient echo (GRE) imaging acceleration. Across multiple averages, scan parameters are varied (e.g., dual-polarity frequency-encoding) to additionally correct for $B_0$-induced artifacts, geometric distortions and motion retrospectively. A comparison to established GRE protocols is made. Resolutions range from 1.4 mm isotropic (1 multi-TE average in 36 s) up to 0.4 mm isotropic (2 single-TE averages in approximately 6 min) with whole-head coverage. RESULTS: Only 1-4 averages are needed for sufficient SNR with 3D-EPI, depending on resolution and field strength. Fast scanning and small voxels together with retrospective corrections result in substantially reduced image artifacts, which improves susceptibility and $R_2^{\ast}$ mapping. Additionally, much finer details are obtained in susceptibility-weighted image projections through significantly reduced partial voluming. CONCLUSION: Using interleaved multishot 3D-EPI, single-TE and multi-TE data can readily be acquired 10 times faster than with conventional, accelerated GRE imaging. Even 0.4 mm isotropic whole-head QSM within 6 min becomes feasible at 7T. At 3T, motion-robust 0.8 mm isotropic whole-brain QSM and $R_2^{\ast}$ mapping with no apparent distortion in less than 7 min becomes clinically feasible. Stronger gradient systems may allow for even higher effective acceleration rates through larger EPI factors while maintaining optimal contrast.

13.
Med Phys ; 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38980220

ABSTRACT

An Addendum to the AAPM's TG-51 protocol for the determination of absorbed dose to water is presented for electron beams with energies between 4 MeV and 22 MeV ($1.70\ \mathrm{cm} \le R_{50} \le 8.70\ \mathrm{cm}$). This updated formalism allows simplified calibration procedures, including the use of calibrated cylindrical ionization chambers in all electron beams without the use of a gradient correction. New $k_Q$ data are provided for electron beams based on Monte Carlo simulations. Implementation guidance is provided. Components of the uncertainty budget in determining absorbed dose to water at the reference depth are discussed. Specifications for a reference-class chamber in electron beams include chamber stability, settling, ion recombination behavior, and polarity dependence. Progress in electron beam reference dosimetry is reviewed. Although this report introduces some major changes (e.g., gradient corrections are implicitly included in the electron beam quality conversion factors), they serve to simplify the calibration procedure. Results for absorbed dose per linac monitor unit are expected to be up to approximately 2% higher using this Addendum compared to using the original TG-51 protocol.

14.
Anal Bioanal Chem ; 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38980330

ABSTRACT

Exhaled breath volatilomics is a powerful non-invasive tool for biomarker discovery in medical applications, but compound annotation is essential for pathophysiological insights and technology transfer. This study was aimed at investigating the interest of a hybrid approach combining real-time proton transfer reaction-time-of-flight mass spectrometry (PTR-TOF-MS) with comprehensive thermal desorption-two-dimensional gas chromatography coupled to time-of-flight mass spectrometry (TD-GCxGC-TOF-MS) to enhance the analysis and characterization of VOCs in clinical research, using COVID-19 as a use case. VOC biomarker candidates were selected from clinical research using PTR-TOF-MS fingerprinting in patients with COVID-19 and matched to the Human Breathomic Database. Corresponding analytical standards were analysed using both a liquid calibration unit coupled to PTR-TOF-MS and TD-GCxGC-TOF-MS, together with confirmation on new clinical samples with TD-GCxGC-TOF-MS. From 26 potential VOC biomarkers, 23 were successfully detected with PTR-TOF-MS. All VOCs were successfully detected using TD-GCxGC-TOF-MS, providing effective separation of highly chemically related compounds, including isomers, and enabling high-confidence annotation based on two-dimensional chromatographic separation and mass spectra. Four VOCs were identified with a level 1 annotation in the clinical samples. For future applications, the combination of real-time PTR-TOF-MS and comprehensive TD-GCxGC-TOF-MS, at least on a subset of samples from a whole study, would enhance the performance of VOC annotation, offering potential advancements in biomarker discovery for clinical research.

15.
BMC Genomics ; 25(1): 690, 2024 Jul 13.
Article in English | MEDLINE | ID: mdl-39003468

ABSTRACT

BACKGROUND: Heritability partitioning approaches estimate the contribution of different functional classes, such as coding or regulatory variants, to the genetic variance. This information allows a better understanding of the genetic architecture of complex traits, including complex diseases, but can also help improve the accuracy of genomic selection in livestock species. However, methods have mainly been tested on human genomic data, whereas livestock populations have specific characteristics, such as high levels of relatedness, small effective population size or long-range levels of linkage disequilibrium. RESULTS: Here, we used data from 14,762 cows, imputed at the whole-genome sequence level for 11,537,240 variants, to simulate traits in a typical livestock population and evaluate the accuracy of two state-of-the-art heritability partitioning methods, GREML and a Bayesian mixture model. In simulations where a single functional class had increased contribution to heritability, we observed that the estimators were unbiased but had low precision. When causal variants were enriched in variants with low (< 0.05) or high (> 0.20) minor allele frequency or low (below 1st quartile) or high (above 3rd quartile) linkage disequilibrium scores, it was necessary to partition the genetic variance into multiple classes defined on the basis of allele frequencies or LD scores to obtain unbiased results. When multiple functional classes had variable contributions to heritability, estimators showed higher levels of variation and confounding between certain categories was observed. In addition, estimators from small categories were particularly imprecise. However, the estimates and their ranking were still informative about the contribution of the classes. We also demonstrated that using methods that estimate the contribution of a single category at a time, a commonly used approach, results in an overestimation. 
Finally, we applied the methods to phenotypes for muscular development and height and estimated that, on average, variants in open chromatin regions had a higher contribution to the genetic variance (> 45%), while variants in coding regions had the strongest individual effects (> 25-fold enrichment on average). Conversely, variants in intergenic or intronic regions showed lower levels of enrichment (0.2 and 0.6-fold on average, respectively). CONCLUSIONS: Heritability partitioning approaches should be used cautiously in livestock populations, in particular for small categories. Two-component approaches that fit only one functional category at a time lead to biased estimators and should not be used.


Subjects
Linkage Disequilibrium; Livestock; Animals; Livestock/genetics; Cattle/genetics; Bayes Theorem; Models, Genetic; Gene Frequency; Polymorphism, Single Nucleotide; Quantitative Trait, Heritable; Genetic Variation; Genomics/methods; Phenotype
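The fold-enrichment figures reported in the abstract above follow from a simple ratio: the share of genetic variance attributed to a functional class divided by the share of variants that fall in that class. The sketch below assumes this commonly used definition; the class names and numbers are invented for illustration and are not the study's estimates.

```python
from typing import Dict


def fold_enrichment(variance_by_class: Dict[str, float],
                    variants_by_class: Dict[str, int]) -> Dict[str, float]:
    """Per-class enrichment = (share of genetic variance) / (share of variants)."""
    total_variance = sum(variance_by_class.values())
    total_variants = sum(variants_by_class.values())
    if total_variance <= 0 or total_variants <= 0:
        raise ValueError("totals must be positive")
    result = {}
    for name, variance in variance_by_class.items():
        variance_share = variance / total_variance
        variant_share = variants_by_class[name] / total_variants
        result[name] = variance_share / variant_share
    return result


# Hypothetical variance components and variant counts for three classes.
variance = {"coding": 2.0, "open_chromatin": 5.0, "other": 3.0}
variants = {"coding": 1_000, "open_chromatin": 20_000, "other": 79_000}
for name, value in fold_enrichment(variance, variants).items():
    print(f"{name}: {value:.2f}-fold")
```

The example also shows why small categories are hard to estimate: a class holding 1% of the variants turns a modest error in its variance component into a large swing in fold enrichment.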
16.
Bio Protoc ; 14(13): e5023, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-39007158

ABSTRACT

In recent years, the increase in genome sequencing across diverse plant species has provided a significant advantage for phylogenomics studies, allowing the analysis of one of the most diverse gene families in plants: nucleotide-binding leucine-rich repeat receptors (NLRs). However, due to the sequence diversity of the NLR gene family, identifying key molecular features and functionally conserved sequence patterns is challenging through multiple sequence alignment. Here, we present a step-by-step protocol for a computational pipeline designed to identify evolutionarily conserved motifs in plant NLR proteins. In this protocol, we use a large-scale NLR dataset, including 1,862 NLR genes annotated from monocot and dicot species, to predict conserved sequence motifs, such as the MADA and EDVID motifs, within the coiled-coil (CC)-NLR subfamily. Our pipeline can be applied to identify molecular signatures that have remained conserved in the gene family over evolutionary time across plant species.
Key features:
• Phylogenomics analysis of plant NLR immune receptor family.
• Identification of functionally conserved sequence patterns among plant NLRs.

17.
Environ Geochem Health ; 46(9): 321, 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39012543

ABSTRACT

Highly acidic citrus pomace (CP) is a byproduct of Pericarpium Citri Reticulatae production and causes significant environmental damage. In this study, a newly isolated acid-tolerant strain of Serratia sp. JS-043 was used to treat CP and evaluate the effect of reduced acid citrus pomace (RACP) in passivating heavy metals. The results showed that biological treatment could remove 97.56% of the citric acid in CP, the organic matter in the soil increased by 202.60%, and the catalase activity in the soil increased from 0 to 0.117 U g⁻¹. Adding RACP to soil can increase the stabilization of Cu, Zn, As, Co, and Pb. Specifically, through the metabolism of strain JS-043, RACP was also involved in the stabilization of Zn and Pb, and the residual fraction in the total pool of these metals increased by 10.73% and 10.54%, respectively. Finally, the genome sequence of Serratia sp. JS-043 was completed, and the genetic basis of its acid-resistant and acid-reducing characteristics was preliminarily revealed. JS-043 also contains many genes encoding proteins associated with heavy metal ion tolerance and transport. These findings suggest that JS-043 may be a high-potential strain to improve the quality of acidic organic wastes that can then be useful for soil bioremediation.


Subjects
Biodegradation, Environmental; Metals, Heavy; Serratia; Soil Microbiology; Soil Pollutants; Serratia/metabolism; Serratia/genetics; Metals, Heavy/metabolism; Soil Pollutants/metabolism; Hydrogen-Ion Concentration; Citrus
18.
RNA Biol ; 21(1): 52-74, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38989833

ABSTRACT

The aim of this study was to compare the circular transcriptome of divergent tissues in order to understand: i) the presence of circular RNAs (circRNAs) that are not exonic circRNAs, i.e. originated from backsplicing involving known exons and, ii) the origin of artificial circRNA (artif_circRNA), i.e. circRNA not generated in-vivo. CircRNA identification is mostly an in-silico process, and the analysis of data from the BovReg project (https://www.bovreg.eu/) provided an opportunity to explore new ways to identify reliable circRNAs. By considering 117 tissue samples, we characterized 23,926 exonic circRNAs, 337 circRNAs from 273 introns (191 ciRNAs, 146 intron circles), 108 circRNAs from small non-coding genes and nearly 36.6K circRNAs classified as other_circRNAs. Furthermore, for 63 of those samples we analysed in parallel data from total-RNAseq (ribosomal RNAs depleted prior to library preparation) with paired mRNAseq (library prepared with poly(A)-selected RNAs). The high number of circRNAs detected in mRNAseq, and the significant number of novel circRNAs, mainly other_circRNAs, led us to consider all circRNAs detected in mRNAseq as artificial. This study provided evidence of 189 false entries in the list of exonic circRNAs: 103 artif_circRNAs identified by total RNAseq/mRNAseq comparison using two circRNA tools, 26 probable artif_circRNAs, and 65 identified by deep annotation analysis. Extensive benchmarking was performed (including analyses with CIRI2 and CIRCexplorer-2) and confirmed 94% of the 23,737 reliable exonic circRNAs. Moreover, this study demonstrates the effectiveness of a panel of highly expressed exonic circRNAs (5-8%) in analysing the tissue specificity of the bovine circular transcriptome.


Subjects
Exons; RNA, Circular; RNA, Circular/genetics; Animals; Cattle; Introns; Computational Biology/methods; Transcriptome; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods
19.
Methods Mol Biol ; 2836: 19-34, 2024.
Article in English | MEDLINE | ID: mdl-38995533

ABSTRACT

Genome annotation has historically ignored small open reading frames (smORFs), which encode a class of proteins shorter than 100 amino acids, collectively referred to as microproteins. This cutoff was established to avoid thousands of false positives due to limitations of pure genomics pipelines. Proteogenomics, a computational approach that combines genomics, transcriptomics, and proteomics, makes it possible to accurately identify these short sequences by overlaying different levels of omics evidence. In this chapter, we showcase the use of µProteInS, a bioinformatics pipeline developed for the identification of unannotated microproteins encoded by smORFs in bacteria. The workflow covers all the steps from quality control and transcriptome assembly to the scoring and post-processing of mass spectrometry data. Additionally, we provide an example on how to apply the pipeline's machine learning method to identify high-confidence spectra and pinpoint the most reliable identifications from large datasets.


Subjects
Bacterial Proteins; Computational Biology; Open Reading Frames; Proteogenomics; Workflow; Open Reading Frames/genetics; Proteogenomics/methods; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Computational Biology/methods; Proteomics/methods; Machine Learning; Bacteria/genetics; Bacteria/metabolism; Software; Mass Spectrometry/methods; Micropeptides
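The 100-amino-acid cutoff discussed in the abstract above is easy to see in a naive genome scan. The sketch below is a generic six-frame ORF finder, not the µProteInS pipeline: it assumes ATG as the only start codon (bacteria also use alternatives such as GTG and TTG), it pairs each stop with the first in-frame ATG after the previous stop, and it uses no transcriptomic or proteomic evidence. The function names and the `min_aa`/`max_aa` parameters are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_smorfs(seq: str, min_aa: int = 10,
                max_aa: int = 99) -> List[Tuple[str, int, int, str]]:
    """Return (strand, start, end, nucleotides) for ORFs of min_aa..max_aa codons.

    Coordinates are 0-based, end-exclusive, and relative to the scanned strand
    (for "-", relative to the reverse complement of the input).
    """
    seq = seq.upper()
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(s) - 2, 3):
                codon = s[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos                      # first in-frame start codon
                elif start is not None and codon in STOP_CODONS:
                    n_aa = (pos - start) // 3        # codons before the stop
                    if min_aa <= n_aa <= max_aa:
                        found.append((strand, start, pos + 3, s[start:pos + 3]))
                    start = None                     # look for the next ORF
    return found


# Toy sequence with one 12-codon ORF on the forward strand.
toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
print(find_smorfs(toy))
```

Run on a real genome, a scan like this returns thousands of short candidates, most of them spurious, which is why the chapter layers transcriptome and mass spectrometry evidence on top of sequence alone.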
20.
Methods Mol Biol ; 2836: 285-298, 2024.
Article in English | MEDLINE | ID: mdl-38995546

ABSTRACT

The Gene Ontology (GO) project describes the functions of the gene products of organisms from all kingdoms of life in a standardized way, enabling powerful analyses of experiments involving genome-wide analysis. The scientific literature is used to convert experimental results into GO annotations that systematically classify gene products' functions. However, to address the fact that only a minor fraction of all genes has been characterized experimentally, multiple predictive methods to assign GO annotations have been developed since the inception of GO. Sequence homologies between novel genes and genes with known functions help to approximate the roles of these non-characterized genes. Here we describe the main sequence homology methods to produce annotations: pairwise comparison (BLAST), protein profile models (InterPro), and phylogenetic-based annotation (PAINT). Some of these methods can be implemented with genome analysis pipelines (BLAST and InterPro2GO), while PAINT is curated by the GO consortium.


Subjects
Computational Biology; Gene Ontology; Molecular Sequence Annotation; Molecular Sequence Annotation/methods; Computational Biology/methods; Phylogeny; Software; Sequence Homology; Databases, Genetic; Humans