Results 1 - 20 of 100
1.
Sci Rep ; 14(1): 21740, 2024 09 18.
Article in English | MEDLINE | ID: mdl-39289394

ABSTRACT

Kidney diseases pose a significant global health challenge, requiring precise diagnostic tools to improve patient outcomes. This study addresses this need by investigating three main categories of renal diseases: kidney stones, cysts, and tumors. Utilizing a comprehensive dataset of 12,446 CT whole abdomen and urogram images, this study developed an advanced AI-driven diagnostic system specifically tailored for kidney disease classification. The innovative approach of this study combines the strengths of traditional convolutional neural network architecture (AlexNet) with modern advancements in ConvNeXt architectures. By integrating AlexNet's robust feature extraction capabilities with ConvNeXt's advanced attention mechanisms, the paper achieved an exceptional classification accuracy of 99.85%. A key advancement in this study's methodology lies in the strategic amalgamation of features from both networks. This paper concatenated hierarchical spatial information and incorporated self-attention mechanisms to enhance classification performance. Furthermore, the study introduced a custom optimization technique inspired by the Adam optimizer, which dynamically adjusts the step size based on gradient norms. This tailored optimizer facilitated faster convergence and more effective weight updates, improving model performance. The model of this study demonstrated outstanding performance across various metrics, with an average precision of 99.89%, recall of 99.95%, and specificity of 99.83%. These results highlight the efficacy of the hybrid architecture and optimization strategy in accurately diagnosing kidney diseases. Additionally, the methodology of this paper emphasizes interpretability and explainability, which are crucial for the clinical deployment of deep learning models.


Subjects
Kidney Diseases , Neural Networks, Computer , Humans , Kidney Diseases/diagnosis , Kidney Diseases/diagnostic imaging , Tomography, X-Ray Computed/methods , Kidney Calculi/diagnosis , Kidney Calculi/diagnostic imaging , Deep Learning , Algorithms
2.
Virol J ; 21(1): 121, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38816844

ABSTRACT

BACKGROUND: During the pandemic, whole genome sequencing was critical to characterize SARS-CoV-2 for surveillance, clinical and therapeutical purposes. However, low viral loads in specimens often led to suboptimal sequencing, making lineage assignment and phylogenetic analysis difficult. We propose an alternative approach to sequencing these specimens that involves sequencing in triplicate and concatenation of the reads obtained using bioinformatics. This proposal is based on the hypothesis that the uncovered regions in each replicate differ and that concatenation would compensate for these gaps and recover a larger percentage of the sequenced genome. RESULTS: Whole genome sequencing was performed in triplicate on 30 samples with Ct > 32 and the benefit of replicate read concatenation was assessed. After concatenation: i) 28% of samples reached the standard quality coverage threshold (> 90% genome covered > 30x); ii) 39% of samples did not reach the coverage quality thresholds but coverage improved by more than 40%; and iii) SARS-CoV-2 lineage assignment was possible in 68.7% of samples where it had been impaired. CONCLUSIONS: Concatenation of reads from replicate sequencing reactions provides a simple way to access hidden information in the large proportion of SARS-CoV-2-positive specimens eliminated from analysis in standard sequencing schemes. This approach will enhance our potential to rule out involvement in outbreaks, to characterize reinfections and to identify lineages of concern for surveillance or therapeutical purposes.


Subjects
COVID-19 , Genome, Viral , Phylogeny , SARS-CoV-2 , Viral Load , Whole Genome Sequencing , SARS-CoV-2/genetics , SARS-CoV-2/classification , SARS-CoV-2/isolation & purification , Humans , COVID-19/virology , Viral Load/methods , Genome, Viral/genetics , Whole Genome Sequencing/methods , Computational Biology/methods , RNA, Viral/genetics , High-Throughput Nucleotide Sequencing/methods
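
The replicate-concatenation idea in the abstract above can be illustrated with a minimal Python sketch. It assumes that pooling the reads of several replicates is equivalent to summing their per-position depths, and it applies the quoted quality threshold (> 90% of the genome covered at > 30x). The function names and the ten-position toy genome are hypothetical and are not taken from the study's pipeline.

```python
from typing import List


def merge_depths(replicates: List[List[int]]) -> List[int]:
    """Sum per-position read depth across replicates (assumed equivalent to pooling reads)."""
    length = len(replicates[0])
    if any(len(r) != length for r in replicates):
        raise ValueError("all replicates must span the same reference length")
    return [sum(depths) for depths in zip(*replicates)]


def fraction_covered(depth: List[int], min_depth: int = 30) -> float:
    """Fraction of positions whose depth is strictly above min_depth."""
    return sum(d > min_depth for d in depth) / len(depth)


def passes_threshold(depth: List[int], min_depth: int = 30, min_fraction: float = 0.90) -> bool:
    """True when more than min_fraction of positions exceed min_depth."""
    return fraction_covered(depth, min_depth) > min_fraction


# Toy data: three replicates whose uncovered regions fall in different places.
rep_a = [40, 40, 0, 0, 12, 35, 35, 5, 0, 31]
rep_b = [0, 10, 45, 20, 12, 0, 35, 15, 33, 0]
rep_c = [5, 0, 0, 15, 12, 40, 0, 15, 0, 10]

pooled = merge_depths([rep_a, rep_b, rep_c])
for name, depth in [("A", rep_a), ("B", rep_b), ("C", rep_c), ("pooled", pooled)]:
    print(name, fraction_covered(depth), passes_threshold(depth))
```

In this toy case no single replicate meets the threshold, while the pooled depth does, which mirrors the hypothesis that gaps differ between replicates.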
3.
Mol Ecol Resour ; 24(7): e13964, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38666432

ABSTRACT

Phylogenetic studies now routinely require manipulating and summarizing thousands of data files. For most of these tasks, currently available software requires considerable computing resources and substantial knowledge of command-line applications. We develop an ultrafast and memory-efficient software, SEGUL, that performs common phylogenomic dataset manipulations and calculates statistics summarizing essential data features. Our software is available as standalone command-line interface (CLI) and graphical user interface (GUI) applications, and as a library for Rust, R and Python, with possible support of other languages. The CLI and library versions run natively on Windows, Linux and macOS, including Apple ARM Macs. The GUI version extends support to include mobile iOS, iPadOS and Android operating systems. SEGUL leverages the high performance of the Rust programming language to offer fast execution times and low memory footprints regardless of dataset size and platform choice. The inclusion of a GUI minimizes bioinformatics barriers to phylogenomics while SEGUL's efficiency reduces economic barriers by allowing analysis on inexpensive hardware. Our support for mobile operating systems further enables teaching phylogenomics where access to computing power is limited.


Subjects
Computational Biology , Phylogeny , Software , Computational Biology/methods , User-Computer Interface
4.
Sci Rep ; 14(1): 8071, 2024 04 05.
Article in English | MEDLINE | ID: mdl-38580700

ABSTRACT

Over recent years, researchers and practitioners have encountered massive and continuous improvements in the computational resources available for their use. This allowed the use of resource-hungry machine learning (ML) algorithms to become feasible and practical. Moreover, several advanced techniques are being used to boost the performance of such algorithms even further, which include various transfer learning techniques, data augmentation, and feature concatenation. Normally, the use of these advanced techniques highly depends on the size and nature of the dataset being used. In the case of fine-grained medical image sets, which have subcategories within the main categories in the image set, there is a need to find the combination of the techniques that work the best on these types of images. In this work, we utilize these advanced techniques to find the best combinations to build a state-of-the-art lumbar disc herniation computer-aided diagnosis system. We have evaluated the system extensively and the results show that the diagnosis system achieves an accuracy of 98% when it is compared with human diagnosis.


Subjects
Intervertebral Disc Displacement , Humans , Intervertebral Disc Displacement/diagnostic imaging , Diagnosis, Computer-Assisted/methods , Algorithms , Machine Learning , Computers
5.
J Proteome Res ; 23(3): 881-890, 2024 03 01.
Article in English | MEDLINE | ID: mdl-38327087

ABSTRACT

Clinical diagnostics and microbiology require high-throughput identification of microorganisms. Sample multiplexing prior to detection is an attractive means to reduce analysis costs and time-to-result. Recent studies have demonstrated the discriminative power of tandem mass spectrometry-based proteotyping. This technology can rapidly identify the most likely taxonomical position of any microorganism, even uncharacterized organisms. Here, we present a simplified label-free multiplexing method to proteotype isolates by tandem mass spectrometry that can identify six microorganisms in a single 20 min analytical run. The strategy involves the production of peptide fractions with distinct hydrophobicity profiles using spin column fractionation. Assemblages of different fractions can then be analyzed using mass spectrometry. Results are subsequently interpreted based on the hydrophobic characteristics of the peptides detected, which make it possible to link each taxon identified to the initial sample. The methodology was tested on 32 distinct sets of six organisms, including several worst-scenario assemblages (with differences in sample quantities or the presence of the same organisms in multiple fractions), and proved to be robust. These results pave the way for the deployment of tandem mass spectrometry-based proteotyping in microbiology laboratories.


Subjects
Chemical Fractionation , Tandem Mass Spectrometry , Chromatography, Liquid
6.
BMC Genomics ; 25(1): 122, 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38287261

ABSTRACT

BACKGROUND: Cancers exhibit complex transcriptomes with aberrant splicing that induces isoform-level differential expression compared to non-diseased tissues. Transcriptomic profiling using short-read sequencing has utility in providing a cost-effective approach for evaluating isoform expression, although short-read assembly displays limitations in the accurate inference of full-length transcripts. Long-read RNA sequencing (Iso-Seq), using the Pacific Biosciences (PacBio) platform, can overcome such limitations by providing full-length isoform sequence resolution which requires no read assembly and represents native expressed transcripts. A constraint of the Iso-Seq protocol is due to fewer reads output per instrument run, which, as an example, can consequently affect the detection of lowly expressed transcripts. To address these deficiencies, we developed a concatenation workflow, PacBio Full-Length Isoform Concatemer Sequencing (PB_FLIC-Seq), designed to increase the number of unique, sequenced PacBio long-reads thereby improving overall detection of unique isoforms. In addition, we anticipate that the increase in read depth will help improve the detection of moderate to low-level expressed isoforms. RESULTS: In sequencing a commercial reference (Spike-In RNA Variants; SIRV) with known isoform complexity we demonstrated a 3.4-fold increase in read output per run and improved SIRV recall when using the PB_FLIC-Seq method compared to the same samples processed with the Iso-Seq protocol. We applied this protocol to a translational cancer case, also demonstrating the utility of the PB_FLIC-Seq method for identifying differential full-length isoform expression in a pediatric diffuse midline glioma compared to its adjacent non-malignant tissue. 
Our data analysis revealed increased expression of extracellular matrix (ECM) genes within the tumor sample, including an isoform of the Secreted Protein Acidic and Cysteine Rich (SPARC) gene that was expressed 11,676-fold higher than in the adjacent non-malignant tissue. Finally, by using the PB_FLIC-Seq method, we detected several cancer-specific novel isoforms. CONCLUSION: This work describes a concatenation-based methodology for increasing the number of sequenced full-length isoform reads on the PacBio platform, yielding improved discovery of expressed isoforms. We applied this workflow to profile the transcriptome of a pediatric diffuse midline glioma and adjacent non-malignant tissue. Our findings of cancer-specific novel isoform expression further highlight the importance of long-read sequencing for characterization of complex tumor transcriptomes.


Subjects
Glioma , Transcriptome , Humans , Child , Gene Expression Profiling/methods , Protein Isoforms/genetics , Protein Isoforms/metabolism , RNA Splicing , Sequence Analysis, RNA , High-Throughput Nucleotide Sequencing/methods
7.
Plant J ; 117(2): 342-363, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37831618

ABSTRACT

Attenuated strains of the naturally occurring plant pathogen Agrobacterium tumefaciens can transfer virtually any DNA sequence of interest to model plants and crops. This has made Agrobacterium-mediated transformation (AMT) one of the most commonly used tools in agricultural biotechnology. Understanding AMT, and its functional consequences, is of fundamental importance given that it sits at the intersection of many fundamental fields of study, including plant-microbe interactions, DNA repair/genome stability, and epigenetic regulation of gene expression. Despite extensive research and use of AMT over the last 40 years, the extent of genomic disruption associated with integrating exogenous DNA into plant genomes using this method remains underappreciated. However, new technologies like long-read sequencing make this disruption more apparent, complementing previous findings from multiple research groups that have tackled this question in the past. In this review, we cover progress on the molecular mechanisms involved in Agrobacterium-mediated DNA integration into plant genomes. We also discuss localized mutations at the site of insertion and describe the structure of these DNA insertions, which can range from single copy insertions to large concatemers, consisting of complex DNA originating from different sources. Finally, we discuss the prevalence of large-scale genomic rearrangements associated with the integration of DNA during AMT with examples. Understanding the intended and unintended effects of AMT on genome stability is critical to all plant researchers who use this methodology to generate new genetic variants.


Subjects
Epigenesis, Genetic , Plants , Plants/genetics , Plants/microbiology , Agrobacterium tumefaciens/genetics , Genomics , DNA , Genomic Instability/genetics , Transformation, Genetic , DNA, Bacterial/genetics , Plants, Genetically Modified/genetics
8.
BMC Ecol Evol ; 23(1): 75, 2023 12 12.
Article in English | MEDLINE | ID: mdl-38087247

ABSTRACT

BACKGROUND: Despite recent advances, reliable tools to simultaneously handle different types of sequencing data (e.g., target capture, genome skimming) for phylogenomics are still scarce. Here, we evaluate the performance of the recently developed pipeline Captus in comparison with the well-known target capture pipelines HybPiper and SECAPR. As test data, we analyzed newly generated sequences for the genus Thladiantha (Cucurbitaceae) for which no well-resolved phylogeny estimate has been available so far, as well as simulated reads derived from the genome of Arabidopsis thaliana. RESULTS: Our pipeline comparisons are based on (1) the time needed for data assembly and locus extraction, (2) locus recovery per sample, (3) the number of informative sites in nucleotide alignments, and (4) the topology of the nuclear and plastid phylogenies. Additionally, the simulated reads derived from the genome of Arabidopsis thaliana were used to evaluate the accuracy and completeness of the recovered loci. In terms of computation time, locus recovery per sample, and informative sites, Captus outperforms HybPiper and SECAPR. The resulting topologies of Captus and SECAPR are identical for coalescent trees but differ when trees are inferred from concatenated alignments. The HybPiper phylogeny is similar to Captus in both methods. The nuclear genes recover a deep split of Thladiantha in two clades, but this is not supported by the plastid data. CONCLUSIONS: Captus is the best choice among the three pipelines in terms of computation time and locus recovery. Even though there is no significant topological difference between the Thladiantha species trees produced by the three pipelines, Captus yields a higher number of gene trees in agreement with the topology of the species tree (i.e., fewer genes in conflict with the species tree topology).


Subjects
Arabidopsis , Cucurbitaceae , Phylogeny , Cucurbitaceae/genetics , Arabidopsis/genetics , Genome
9.
Front Plant Sci ; 14: 1153505, 2023.
Article in English | MEDLINE | ID: mdl-37434602

ABSTRACT

An improved YOLOv5s model was proposed and validated on a new fruit dataset to solve the real-time detection task in a complex environment. With the incorporation of feature concatenation and an attention mechanism into the original YOLOv5s network, the improved YOLOv5s recorded 122 layers, 4.4 × 10^6 parameters, 12.8 GFLOPs, and 8.8 MB weight size, which are 45.5%, 30.2%, 14.1%, and 31.3% smaller than the original YOLOv5s, respectively. Meanwhile, the improved YOLOv5s obtained 93.4% mAP on the validation set, 96.0% mAP on the test set, and a speed of 74 fps on videos, which are 0.6%, 0.5%, and 10.4% higher than the original YOLOv5s model, respectively. In video tests, fruit tracking and counting with the improved YOLOv5s produced fewer missed and incorrect detections than the original YOLOv5s. Furthermore, the aggregated detection performance of the improved YOLOv5s outperformed GhostYOLOv5s, YOLOv4-tiny, YOLOv7-tiny, and other mainstream YOLO variants. Therefore, the improved YOLOv5s is lightweight with reduced computation costs, can better generalize against complex conditions, and is applicable for real-time detection in fruit picking robots and low-power devices.

10.
Sensors (Basel) ; 23(6)2023 Mar 20.
Article in English | MEDLINE | ID: mdl-36991967

ABSTRACT

This study proposes an electrocardiogram (ECG) signal stitching scheme to detect arrhythmias in drivers during driving. When the ECG is measured through the steering wheel during driving, the data are always exposed to noise caused by vehicle vibrations, bumpy road conditions, and the driver's steering wheel gripping force. The proposed scheme extracts stable ECG signals and transforms them into full 10 s ECG signals to classify arrhythmias using convolutional neural networks (CNN). Before the ECG stitching algorithm is applied, data preprocessing is performed. To extract the cycle from the collected ECG data, the R peaks are found and the TP interval segmentation is applied. An abnormal P peak is very difficult to find. Therefore, this study also introduces a P peak estimation method. Finally, 4 × 2.5 s ECG segments are collected. To classify arrhythmias with stitched ECG data, each time series' ECG signal is transformed via the continuous wavelet transform (CWT) and short-time Fourier transform (STFT), and transfer learning is performed for classification using CNNs. Finally, the parameters of the networks that provide the best performance are investigated. According to the classification accuracy, GoogleNet with the CWT image set shows the best results. The classification accuracy is 82.39% for the stitched ECG data, while it is 88.99% for the original ECG data.


Subjects
Deep Learning , Humans , Arrhythmias, Cardiac/diagnosis , Neural Networks, Computer , Algorithms , Electrocardiography
11.
Crit Rev Food Sci Nutr ; 63(32): 10995-11009, 2023.
Article in English | MEDLINE | ID: mdl-35730201

ABSTRACT

Enological evaluations capture the chemical and sensory space of wine using different techniques; many sensory methods as well as a variety of analytical chemistry techniques contribute to the amount of information generated. Data fusion, especially integrating data sets, is important when working with complex systems. The success reported when trying to integrate different modalities is generally low and has been attributed to the lack of statistically considerate strategies focusing on the data handling process. Multiple stages of data handling must be carefully considered when dealing with multi-modal data. In this review, the different stages in the data analysis process were examined. The study revealed misconceptions surrounding the process and elucidated rules for purpose-driven approaches by examining the complexities of each stage and the impact the decisions made at each stage have on the resulting models. The two major modeling approaches are either supervised (discrimination, classification, prediction) or unsupervised (exploration). Supervised approaches were emphatic on the pre-processing steps and prioritized increasing performance. Unsupervised approaches were mostly used for preliminary steps. The review found aspects often neglected when it came to the data collection and capturing which in the end contributed to the low success in combining sensory and chemistry data.


Subjects
Chemometrics , Wine
12.
Imeta ; 2(1): e87, 2023 Feb.
Article in English | MEDLINE | ID: mdl-38868339

ABSTRACT

Phylogenetic analysis has entered the genomics (multilocus) era. For less experienced researchers, conquering the large number of software programs required for a multilocus-based phylogenetic reconstruction can be somewhat daunting and time-consuming. PhyloSuite, a software with a user-friendly GUI, was designed to make this process more accessible by integrating multiple software programs needed for multilocus and single-gene phylogenies and further streamlining the whole process. In this protocol, we aim to explain how to conduct each step of the phylogenetic pipeline and tree-based analyses in PhyloSuite. We also present a new version of PhyloSuite (v1.2.3), wherein we fixed some bugs, made some optimizations, and introduced some new functions, including a number of tree-based analyses, such as signal-to-noise calculation, saturation analysis, and spurious species identification. The step-by-step protocol includes background information (i.e., what the step does), reasons (i.e., why to do the step), and operations (i.e., how to do it). This protocol will help researchers quick-start their way through the multilocus phylogenetic analysis, especially those interested in conducting organelle-based analyses.

13.
BMC Oral Health ; 22(1): 571, 2022 12 07.
Article in English | MEDLINE | ID: mdl-36476146

ABSTRACT

BACKGROUND: Assessing the time required for tooth extraction is the most important factor to consider before surgeries. The purpose of this study was to create a practical predictive model for assessing the time to extract the mandibular third molar tooth using deep learning. The accuracy of the model was evaluated by comparing the extraction time predicted by deep learning with the actual time required for extraction. METHODS: A total of 724 panoramic X-ray images and clinical data were used for artificial intelligence (AI) prediction of extraction time. Clinical data such as age, sex, maximum mouth opening, body weight, height, the time from the start of incision to the start of suture, and surgeon's experience were recorded. Data augmentation and weight balancing were used to improve learning abilities of AI models. Extraction time predicted by the concatenated AI model was compared with the actual extraction time. RESULTS: The final combined (CNN + MLP) model achieved an R value of 0.8315, an R-squared value of 0.6839, a p-value of less than 0.0001, and a mean absolute error (MAE) of 2.95 min with the test dataset. CONCLUSIONS: Our proposed model for predicting time to extract the mandibular third molar tooth performs well with a high accuracy in clinical practice.


Subjects
Artificial Intelligence , Deep Learning , Humans , Molar, Third/diagnostic imaging , Molar, Third/surgery , Tooth Extraction , Operative Time
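
The agreement metrics quoted in the abstract above (R, R-squared, MAE) can be computed as in the following sketch. It is illustrative only: the abstract does not state which definition of R-squared was used, so the code computes the Pearson correlation and the coefficient of determination (1 - SS_res / SS_tot) separately. The extraction times in the example are made-up numbers, not study data.

```python
import math
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical extraction times in minutes (actual vs. predicted).
actual = [10.0, 12.0, 15.0, 20.0]
predicted = [11.0, 11.0, 17.0, 19.0]

print("MAE:", mean_absolute_error(actual, predicted))
print("Pearson r:", pearson_r(actual, predicted))
print("R-squared:", r_squared(actual, predicted))
```

Keeping the two quantities separate matters because the square of Pearson's r and the coefficient of determination coincide only in special cases, such as a least-squares linear fit evaluated on its own training data.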
14.
Front Bioinform ; 2: 1074802, 2022.
Article in English | MEDLINE | ID: mdl-36568700

ABSTRACT

The reconstruction of phylogenomic trees containing multiple genes is best achieved by using a supermatrix. The advent of NGS technology made it easier and cheaper to obtain multiple gene data in one sequencing run. When numerous genes and organisms are used in the phylogenomic analysis, it is difficult to organize all information and manually align the gene sequences to further concatenate them. This study describes SPLACE, a tool to automatically SPLit, Align, and ConcatenatE the genes of all species of interest to generate a supermatrix file, and consequently, a phylogenetic tree, while handling possible missing data. In our findings, SPLACE was the only tool that could automatically align gene sequences and also handle missing data; and, it required only a few minutes to produce a supermatrix FASTA file containing 83 aligned and concatenated genes from the chloroplast genomes of 270 plant species. It is an open-source tool and is publicly available at https://github.com/reinator/splace.
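
To make the supermatrix idea in the abstract above concrete, the following Python sketch concatenates per-gene alignments species by species and pads genes that a species lacks. This is not SPLACE's code: the `?` padding character, the function names, and the toy sequences are assumptions chosen for illustration, and the alignment step itself is taken as already done.

```python
from typing import Dict, List


def build_supermatrix(gene_alignments: List[Dict[str, str]], missing: str = "?") -> Dict[str, str]:
    """Concatenate aligned genes per species, padding absent genes with `missing`."""
    species = sorted({name for aln in gene_alignments for name in aln})
    parts: Dict[str, List[str]] = {name: [] for name in species}
    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one gene alignment must have equal length")
        width = lengths.pop()
        for name in species:
            # A species without this gene gets a run of missing-data symbols.
            parts[name].append(aln.get(name, missing * width))
    return {name: "".join(chunks) for name, chunks in parts.items()}


def to_fasta(matrix: Dict[str, str]) -> str:
    """Serialize the supermatrix as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in matrix.items())


# Toy input: two already-aligned genes with incomplete species sampling.
gene1 = {"sp_A": "ATG-", "sp_B": "ATGC"}
gene2 = {"sp_A": "GG", "sp_C": "GA"}

print(to_fasta(build_supermatrix([gene1, gene2])))
```

Every row of the result has the same length (the sum of the gene alignment widths), which is the property that tree-inference programs generally expect from a concatenated alignment.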

15.
Life (Basel) ; 12(11)2022 Nov 11.
Article in English | MEDLINE | ID: mdl-36430983

ABSTRACT

Due to various reasons, the incidence rate of communicable diseases in humans is steadily rising, and timely detection and handling will reduce the disease distribution speed. Tuberculosis (TB) is a severe communicable illness caused by the bacterium Mycobacterium tuberculosis (M. tuberculosis), which predominantly affects the lungs and causes severe respiratory problems. Due to its significance, several clinical level detections of TB are suggested, including lung diagnosis with chest X-ray images. The proposed work aims to develop an automatic TB detection system to assist the pulmonologist in confirming the severity of the disease, decision-making, and treatment execution. The proposed system employs a pre-trained VGG19 with the following phases: (i) image pre-processing, (ii) mining of deep features, (iii) enhancing the X-ray images with chosen procedures and mining of the handcrafted features, (iv) feature optimization using Seagull-Algorithm and serial concatenation, and (v) binary classification and validation. The classification is executed with 10-fold cross-validation in this work, and the proposed work is investigated using MATLAB® software. The proposed research work was executed using the concatenated deep and handcrafted features, which provided a classification accuracy of 98.6190% with the SVM-Medium Gaussian (SVM-MG) classifier.

16.
Physiol Meas ; 43(10)2022 10 31.
Article in English | MEDLINE | ID: mdl-36195081

ABSTRACT

Objective. Due to the variability of human movements, muscle activations vary among trials and subjects. However, few studies investigated how data organization methods for addressing variability impact the extracted muscle synergies. Approach. Fifteen healthy subjects performed a large set of upper limb multi-directional point-to-point reaching movements. Then, the study extracted muscle synergies under different data settings and investigated how data structure prior to synergy extraction, namely concatenation, averaging, and single trial, the number of considered trials, and the number of reaching directions affected the number and components of muscle synergies. Main results. The results showed that the number and components of synergies were significantly affected by the data structure. The concatenation method identified the highest number of synergies, and the averaging method usually found a smaller number of synergies. When the concatenated trials or reaching directions was lower than a minimum value, the number of synergies increased with the increase of the number of trials or reaching directions; however, when the number of trials or reaching directions reached a threshold, the number of synergies was usually constant or with less variation even when novel directions and trials were added. Similarity analysis also showed a slight increase when the number of trials or reaching directions was lower than a threshold. This study recommends that at least five trials and four reaching directions and the concatenation method are considered in muscle synergies analysis during upper limb tasks. Significance. This study makes the researchers focus on the variability analysis induced by the diseases rather than the techniques applied for synergies analysis and promotes applications of muscle synergies in clinical scenarios.


Subjects
Movement , Muscle, Skeletal , Humans , Electromyography , Biomechanical Phenomena , Muscle, Skeletal/physiology , Movement/physiology , Upper Extremity
17.
Healthcare (Basel) ; 10(10)2022 Oct 18.
Article in English | MEDLINE | ID: mdl-36292519

ABSTRACT

The novel coronavirus 2019 (COVID-19) spread rapidly around the world and its outbreak has become a pandemic. Due to an increase in afflicted cases, the quantity of COVID-19 test kits available in hospitals has decreased. Therefore, an autonomous detection system is an essential tool for reducing infection risks and spreading of the virus. In the literature, various models based on machine learning (ML) and deep learning (DL) are introduced to detect many pneumonias using chest X-ray images. The cornerstone in this paper is the use of pretrained deep learning CNN architectures to construct an automated system for COVID-19 detection and diagnosis. In this work, we used the deep feature concatenation (DFC) mechanism to combine features extracted from input images using the two modern pre-trained CNN models, AlexNet and Xception. Hence, we propose COVID-AleXception: a neural network that is a concatenation of the AlexNet and Xception models for the overall improvement of the prediction capability of this pandemic. To evaluate the proposed model and build a dataset of large-scale X-ray images, there was a careful selection of multiple X-ray images from several sources. The COVID-AleXception model can achieve a classification accuracy of 98.68%, which shows the superiority of the proposed model over AlexNet and Xception that achieved a classification accuracy of 94.86% and 95.63%, respectively. The performance results of this proposed model demonstrate its pertinence to help radiologists diagnose COVID-19 more quickly.
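
Deep feature concatenation, as named in the abstract above, amounts to joining the feature vectors of two extractors into one longer vector before classification. The sketch below shows only that fusion step in plain Python. The short hard-coded vectors stand in for the outputs of the two CNNs, and the linear classifier with hand-set weights is a hypothetical placeholder, not the architecture of COVID-AleXception.

```python
from typing import List, Sequence


def concatenate_features(*feature_vectors: Sequence[float]) -> List[float]:
    """Join feature vectors end to end into a single fused vector."""
    fused: List[float] = []
    for vector in feature_vectors:
        fused.extend(vector)
    return fused


def linear_scores(features: Sequence[float],
                  weights: Sequence[Sequence[float]],
                  biases: Sequence[float]) -> List[float]:
    """One score per class: dot(weights[c], features) + biases[c]."""
    for row in weights:
        if len(row) != len(features):
            raise ValueError("each weight row must match the fused feature length")
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def predict(features: Sequence[float],
            weights: Sequence[Sequence[float]],
            biases: Sequence[float]) -> int:
    """Index of the highest-scoring class."""
    scores = linear_scores(features, weights, biases)
    return max(range(len(scores)), key=scores.__getitem__)


# Stand-ins for features from two pre-trained networks (values are made up).
features_net_a = [0.2, 0.8]
features_net_b = [0.5, 0.1, 0.4]

fused = concatenate_features(features_net_a, features_net_b)
weights = [[1.0, 0.0, 0.0, 0.0, 0.0],   # class 0 looks at the first fused feature
           [0.0, 1.0, 0.0, 0.0, 0.0]]   # class 1 looks at the second fused feature
print(fused, predict(fused, weights, [0.0, 0.0]))
```

The classifier sees a vector whose length is the sum of the two feature lengths, so its weight matrix must be sized for the fused dimension rather than for either network alone.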

18.
Entropy (Basel) ; 24(7)2022 Jun 25.
Article in English | MEDLINE | ID: mdl-35885098

ABSTRACT

Densely connected convolutional networks (DenseNet) behave well in image processing. However, for regression tasks, convolutional DenseNet may lose essential information from independent input features. To tackle this issue, we propose a novel DenseNet regression model where convolution and pooling layers are replaced by fully connected layers and the original concatenation shortcuts are maintained to reuse the feature. To investigate the effects of depth and input dimensions of the proposed model, careful validations are performed by extensive numerical simulation. The results give an optimal depth (19) and recommend a limited input dimension (under 200). Furthermore, compared with the baseline models, including support vector regression, decision tree regression, and residual regression, our proposed model with the optimal depth performs best. Ultimately, DenseNet regression is applied to predict relative humidity, and the outcome shows a high correlation with observations, which indicates that our model could advance environmental data science.
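
The architecture described in the abstract above, fully connected layers joined by DenseNet-style concatenation shortcuts, can be sketched as a forward pass in plain Python. This is a minimal sketch under stated assumptions: the layer width (`growth`), ReLU activation, random initialization, and single-output head are all hypothetical choices, since the abstract gives only the general design and the reported optimal depth.

```python
import random
from typing import List, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def relu(values: List[float]) -> List[float]:
    return [max(0.0, v) for v in values]


def dense(inputs: List[float], weights: List[List[float]], biases: List[float]) -> List[float]:
    """Fully connected layer: one dot product plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def init_layer(rng: random.Random, n_in: int, n_out: int) -> Layer:
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(n_features: int, growth: int, depth: int, seed: int = 0) -> Tuple[List[Layer], Layer]:
    """Create `depth` hidden layers whose input widths grow by `growth` each step."""
    rng = random.Random(seed)
    layers: List[Layer] = []
    width = n_features
    for _ in range(depth):
        layers.append(init_layer(rng, width, growth))
        width += growth  # the concatenation shortcut widens the next layer's input
    head = init_layer(rng, width, 1)  # single regression output
    return layers, head


def forward(x: List[float], layers: List[Layer], head: Layer) -> float:
    features = list(x)
    for weights, biases in layers:
        out = relu(dense(features, weights, biases))
        features = features + out  # reuse every earlier feature, DenseNet-style
    return dense(features, *head)[0]


layers, head = build_network(n_features=4, growth=3, depth=2)
print(forward([1.0, 2.0, 3.0, 4.0], layers, head))
```

Because each layer receives the raw inputs together with all earlier outputs, the original features are never discarded, which is the feature-reuse property the abstract attributes to the retained concatenation shortcuts.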

19.
Mol Biol Evol ; 39(6)2022 06 02.
Article in English | MEDLINE | ID: mdl-35642314

ABSTRACT

Traditionally, single-copy orthologs have been the gold standard in phylogenomics. Most phylogenomic studies identify putative single-copy orthologs using clustering approaches and retain families with a single sequence per species. This limits the amount of data available by excluding larger families. Recent advances have suggested several ways to include data from larger families. For instance, tree-based decomposition methods facilitate the extraction of orthologs from large families. Additionally, several methods for species tree inference are robust to the inclusion of paralogs and could use all of the data from larger families. Here, we explore the effects of using all families for phylogenetic inference by examining relationships among 26 primate species in detail and by analyzing five additional data sets. We compare single-copy families, orthologs extracted using tree-based decomposition approaches, and all families with all data. We explore several species tree inference methods, finding that identical trees are returned across nearly all subsets of the data and methods for primates. The relationships among Platyrrhini remain contentious; however, the species tree inference method matters more than the subset of data used. Using data from larger gene families drastically increases the number of genes available and leads to consistent estimates of branch lengths, nodal certainty and concordance, and inferences of introgression in primates. For the other data sets, topological inferences are consistent whether single-copy families or orthologs extracted using decomposition approaches are analyzed. Using larger gene families is a promising approach to include more data in phylogenomics without sacrificing accuracy, at least when high-quality genomes are available.


Subjects
Genome , Animals , Cluster Analysis , Phylogeny
20.
BMC Ecol Evol ; 22(1): 55, 2022 04 30.
Article in English | MEDLINE | ID: mdl-35501703

ABSTRACT

BACKGROUND: The genus Ligusticum belongs to Apiaceae, and its taxonomy has long been a major difficulty. A robust phylogenetic tree is the basis of accurate taxonomic classification of Ligusticum. We herein used 26 (including 14 newly sequenced) plastome-scale data to generate reliable phylogenetic trees to explore the phylogenetic relationships of Chinese Ligusticum. RESULTS: We found that these plastid genomes exhibited diverse plastome characteristics across all four currently identified clades in China, while the plastid protein-coding genes were conserved. The phylogenetic analyses by the concatenation and coalescent methods obtained a more robust molecular phylogeny than prior studies and showed the non-monophyly of Chinese Ligusticum. In the concatenation-based phylogeny analyses, the two datasets yielded slightly different topologies that may be primarily due to the discrepancy in the number of variable sites. CONCLUSIONS: Our plastid phylogenomics analyses emphasized that the current circumscription of the Chinese Ligusticum should be reduced, and the taxonomy of Ligusticum urgently needs revision. Wider taxon sampling including the related species of Ligusticum will be necessary to explore the phylogenetic relationships of this genus. Overall, our study provided new insights into the taxonomic classification of Ligusticum and would serve as a framework for future studies on taxonomy and delimitation of Ligusticum from the perspective of the plastid genome.


Subjects
Apiaceae , Genome, Plastid , Ligusticum , Evolution, Molecular , Phylogeny