Results 1 - 20 of 100
1.
Sci Rep ; 14(1): 21740, 2024 09 18.
Article in English | MEDLINE | ID: mdl-39289394

ABSTRACT

Kidney diseases pose a significant global health challenge, requiring precise diagnostic tools to improve patient outcomes. This study addresses this need by investigating three main categories of renal diseases: kidney stones, cysts, and tumors. Utilizing a comprehensive dataset of 12,446 CT whole abdomen and urogram images, this study developed an advanced AI-driven diagnostic system specifically tailored for kidney disease classification. The innovative approach of this study combines the strengths of traditional convolutional neural network architecture (AlexNet) with modern advancements in ConvNeXt architectures. By integrating AlexNet's robust feature extraction capabilities with ConvNeXt's advanced attention mechanisms, the paper achieved an exceptional classification accuracy of 99.85%. A key advancement in this study's methodology lies in the strategic amalgamation of features from both networks. This paper concatenated hierarchical spatial information and incorporated self-attention mechanisms to enhance classification performance. Furthermore, the study introduced a custom optimization technique inspired by the Adam optimizer, which dynamically adjusts the step size based on gradient norms. This tailored optimizer facilitated faster convergence and more effective weight updates, improving model performance. The model of this study demonstrated outstanding performance across various metrics, with an average precision of 99.89%, recall of 99.95%, and specificity of 99.83%. These results highlight the efficacy of the hybrid architecture and optimization strategy in accurately diagnosing kidney diseases. Additionally, the methodology of this paper emphasizes interpretability and explainability, which are crucial for the clinical deployment of deep learning models.


Subject(s)
Kidney Diseases; Neural Networks, Computer; Humans; Kidney Diseases/diagnosis; Kidney Diseases/diagnostic imaging; Tomography, X-Ray Computed/methods; Kidney Calculi/diagnosis; Kidney Calculi/diagnostic imaging; Deep Learning; Algorithms
2.
Virol J ; 21(1): 121, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38816844

ABSTRACT

BACKGROUND: During the pandemic, whole genome sequencing was critical to characterize SARS-CoV-2 for surveillance, clinical and therapeutical purposes. However, low viral loads in specimens often led to suboptimal sequencing, making lineage assignment and phylogenetic analysis difficult. We propose an alternative approach to sequencing these specimens that involves sequencing in triplicate and concatenation of the reads obtained using bioinformatics. This proposal is based on the hypothesis that the uncovered regions in each replicate differ and that concatenation would compensate for these gaps and recover a larger percentage of the sequenced genome. RESULTS: Whole genome sequencing was performed in triplicate on 30 samples with Ct > 32 and the benefit of replicate read concatenation was assessed. After concatenation: i) 28% of samples reached the standard quality coverage threshold (> 90% genome covered > 30x); ii) 39% of samples did not reach the coverage quality thresholds but coverage improved by more than 40%; and iii) SARS-CoV-2 lineage assignment was possible in 68.7% of samples where it had been impaired. CONCLUSIONS: Concatenation of reads from replicate sequencing reactions provides a simple way to access hidden information in the large proportion of SARS-CoV-2-positive specimens eliminated from analysis in standard sequencing schemes. This approach will enhance our potential to rule out involvement in outbreaks, to characterize reinfections and to identify lineages of concern for surveillance or therapeutical purposes.
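
The coverage reasoning behind replicate concatenation can be illustrated with a minimal sketch. It assumes per-position read depths are already available for each replicate, and it approximates pooling the reads by summing depths position by position. The function names and the toy depth values are hypothetical; only the threshold (> 90% of the genome covered at > 30x) is taken from the abstract.

```python
def pooled_depth(replicates):
    """Sum per-position depths across replicates (approximates pooling their reads)."""
    return [sum(depths) for depths in zip(*replicates)]


def fraction_covered(depth, min_depth=30):
    """Fraction of genome positions with depth strictly above min_depth."""
    return sum(d > min_depth for d in depth) / len(depth)


def passes_threshold(depth, min_fraction=0.90, min_depth=30):
    """Quality rule quoted in the abstract: > 90% of the genome covered at > 30x."""
    return fraction_covered(depth, min_depth) > min_fraction


# Toy 10-position genome (hypothetical values): each replicate drops out
# in a different region, which is the hypothesis the abstract describes.
rep_a = [40, 40, 40, 0, 0, 40, 40, 40, 40, 40]
rep_b = [0, 0, 35, 35, 35, 35, 35, 0, 35, 35]
rep_c = [20, 20, 0, 20, 20, 0, 20, 35, 0, 0]

pooled = pooled_depth([rep_a, rep_b, rep_c])
```

In this toy case no single replicate passes the threshold, while the pooled depths do, because the uncovered regions do not overlap.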


Subject(s)
COVID-19; Genome, Viral; Phylogeny; SARS-CoV-2; Viral Load; Whole Genome Sequencing; SARS-CoV-2/genetics; SARS-CoV-2/classification; SARS-CoV-2/isolation & purification; Humans; COVID-19/virology; Viral Load/methods; Genome, Viral/genetics; Whole Genome Sequencing/methods; Computational Biology/methods; RNA, Viral/genetics; High-Throughput Nucleotide Sequencing/methods
3.
Mol Ecol Resour ; 24(7): e13964, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38666432

ABSTRACT

Phylogenetic studies now routinely require manipulating and summarizing thousands of data files. For most of these tasks, currently available software requires considerable computing resources and substantial knowledge of command-line applications. We develop an ultrafast and memory-efficient software, SEGUL, that performs common phylogenomic dataset manipulations and calculates statistics summarizing essential data features. Our software is available as standalone command-line interface (CLI) and graphical user interface (GUI) applications, and as a library for Rust, R and Python, with possible support of other languages. The CLI and library versions run native on Windows, Linux and macOS, including Apple ARM Macs. The GUI version extends support to include mobile iOS, iPadOS and Android operating systems. SEGUL leverages the high performance of the Rust programming language to offer fast execution times and low memory footprints regardless of dataset size and platform choice. The inclusion of a GUI minimizes bioinformatics barriers to phylogenomics while SEGUL's efficiency reduces economic barriers by allowing analysis on inexpensive hardware. Our support for mobile operating systems further enables teaching phylogenomics where access to computing power is limited.


Subject(s)
Computational Biology; Phylogeny; Software; Computational Biology/methods; User-Computer Interface
4.
Sci Rep ; 14(1): 8071, 2024 04 05.
Article in English | MEDLINE | ID: mdl-38580700

ABSTRACT

Over recent years, researchers and practitioners have encountered massive and continuous improvements in the computational resources available for their use. This has made resource-hungry machine learning (ML) algorithms feasible and practical. Moreover, several advanced techniques are being used to boost the performance of such algorithms even further, including various transfer learning techniques, data augmentation, and feature concatenation. Normally, the use of these advanced techniques depends heavily on the size and nature of the dataset being used. In the case of fine-grained medical image sets, which have subcategories within the main categories in the image set, there is a need to find the combination of techniques that works best on these types of images. In this work, we utilize these advanced techniques to find the best combinations to build a state-of-the-art lumbar disc herniation computer-aided diagnosis system. We have evaluated the system extensively and the results show that the diagnosis system achieves an accuracy of 98% when it is compared with human diagnosis.


Subject(s)
Intervertebral Disc Displacement; Humans; Intervertebral Disc Displacement/diagnostic imaging; Diagnosis, Computer-Assisted/methods; Algorithms; Machine Learning; Computers
5.
J Proteome Res ; 23(3): 881-890, 2024 03 01.
Article in English | MEDLINE | ID: mdl-38327087

ABSTRACT

Clinical diagnostics and microbiology require high-throughput identification of microorganisms. Sample multiplexing prior to detection is an attractive means to reduce analysis costs and time-to-result. Recent studies have demonstrated the discriminative power of tandem mass spectrometry-based proteotyping. This technology can rapidly identify the most likely taxonomical position of any microorganism, even uncharacterized organisms. Here, we present a simplified label-free multiplexing method to proteotype isolates by tandem mass spectrometry that can identify six microorganisms in a single 20 min analytical run. The strategy involves the production of peptide fractions with distinct hydrophobicity profiles using spin column fractionation. Assemblages of different fractions can then be analyzed using mass spectrometry. Results are subsequently interpreted based on the hydrophobic characteristics of the peptides detected, which make it possible to link each taxon identified to the initial sample. The methodology was tested on 32 distinct sets of six organisms, including several worst-scenario assemblages (with differences in sample quantities or the presence of the same organisms in multiple fractions), and proved to be robust. These results pave the way for the deployment of tandem mass spectrometry-based proteotyping in microbiology laboratories.


Subject(s)
Chemical Fractionation; Tandem Mass Spectrometry; Chromatography, Liquid
6.
BMC Genomics ; 25(1): 122, 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38287261

ABSTRACT

BACKGROUND: Cancers exhibit complex transcriptomes with aberrant splicing that induces isoform-level differential expression compared to non-diseased tissues. Transcriptomic profiling using short-read sequencing has utility in providing a cost-effective approach for evaluating isoform expression, although short-read assembly displays limitations in the accurate inference of full-length transcripts. Long-read RNA sequencing (Iso-Seq), using the Pacific Biosciences (PacBio) platform, can overcome such limitations by providing full-length isoform sequence resolution which requires no read assembly and represents native expressed transcripts. A constraint of the Iso-Seq protocol is due to fewer reads output per instrument run, which, as an example, can consequently affect the detection of lowly expressed transcripts. To address these deficiencies, we developed a concatenation workflow, PacBio Full-Length Isoform Concatemer Sequencing (PB_FLIC-Seq), designed to increase the number of unique, sequenced PacBio long-reads thereby improving overall detection of unique isoforms. In addition, we anticipate that the increase in read depth will help improve the detection of moderate to low-level expressed isoforms. RESULTS: In sequencing a commercial reference (Spike-In RNA Variants; SIRV) with known isoform complexity we demonstrated a 3.4-fold increase in read output per run and improved SIRV recall when using the PB_FLIC-Seq method compared to the same samples processed with the Iso-Seq protocol. We applied this protocol to a translational cancer case, also demonstrating the utility of the PB_FLIC-Seq method for identifying differential full-length isoform expression in a pediatric diffuse midline glioma compared to its adjacent non-malignant tissue. 
Our data analysis revealed increased expression of extracellular matrix (ECM) genes within the tumor sample, including an isoform of the Secreted Protein Acidic and Cysteine Rich (SPARC) gene that was expressed 11,676-fold higher than in the adjacent non-malignant tissue. Finally, by using the PB_FLIC-Seq method, we detected several cancer-specific novel isoforms. CONCLUSION: This work describes a concatenation-based methodology for increasing the number of sequenced full-length isoform reads on the PacBio platform, yielding improved discovery of expressed isoforms. We applied this workflow to profile the transcriptome of a pediatric diffuse midline glioma and adjacent non-malignant tissue. Our findings of cancer-specific novel isoform expression further highlight the importance of long-read sequencing for characterization of complex tumor transcriptomes.


Subject(s)
Glioma; Transcriptome; Humans; Child; Gene Expression Profiling/methods; Protein Isoforms/genetics; Protein Isoforms/metabolism; RNA Splicing; Sequence Analysis, RNA; High-Throughput Nucleotide Sequencing/methods
7.
Plant J ; 117(2): 342-363, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37831618

ABSTRACT

Attenuated strains of the naturally occurring plant pathogen Agrobacterium tumefaciens can transfer virtually any DNA sequence of interest to model plants and crops. This has made Agrobacterium-mediated transformation (AMT) one of the most commonly used tools in agricultural biotechnology. Understanding AMT, and its functional consequences, is of fundamental importance given that it sits at the intersection of many fundamental fields of study, including plant-microbe interactions, DNA repair/genome stability, and epigenetic regulation of gene expression. Despite extensive research and use of AMT over the last 40 years, the extent of genomic disruption associated with integrating exogenous DNA into plant genomes using this method remains underappreciated. However, new technologies like long-read sequencing make this disruption more apparent, complementing previous findings from multiple research groups that have tackled this question in the past. In this review, we cover progress on the molecular mechanisms involved in Agrobacterium-mediated DNA integration into plant genomes. We also discuss localized mutations at the site of insertion and describe the structure of these DNA insertions, which can range from single copy insertions to large concatemers, consisting of complex DNA originating from different sources. Finally, we discuss the prevalence of large-scale genomic rearrangements associated with the integration of DNA during AMT with examples. Understanding the intended and unintended effects of AMT on genome stability is critical to all plant researchers who use this methodology to generate new genetic variants.


Subject(s)
Epigenesis, Genetic; Plants; Plants/genetics; Plants/microbiology; Agrobacterium tumefaciens/genetics; Genomics; DNA; Genomic Instability/genetics; Transformation, Genetic; DNA, Bacterial/genetics; Plants, Genetically Modified/genetics
8.
BMC Ecol Evol ; 23(1): 75, 2023 12 12.
Article in English | MEDLINE | ID: mdl-38087247

ABSTRACT

BACKGROUND: Despite recent advances, reliable tools to simultaneously handle different types of sequencing data (e.g., target capture, genome skimming) for phylogenomics are still scarce. Here, we evaluate the performance of the recently developed pipeline Captus in comparison with the well-known target capture pipelines HybPiper and SECAPR. As test data, we analyzed newly generated sequences for the genus Thladiantha (Cucurbitaceae) for which no well-resolved phylogeny estimate has been available so far, as well as simulated reads derived from the genome of Arabidopsis thaliana. RESULTS: Our pipeline comparisons are based on (1) the time needed for data assembly and locus extraction, (2) locus recovery per sample, (3) the number of informative sites in nucleotide alignments, and (4) the topology of the nuclear and plastid phylogenies. Additionally, the simulated reads derived from the genome of Arabidopsis thaliana were used to evaluate the accuracy and completeness of the recovered loci. In terms of computation time, locus recovery per sample, and informative sites, Captus outperforms HybPiper and SECAPR. The resulting topologies of Captus and SECAPR are identical for coalescent trees but differ when trees are inferred from concatenated alignments. The HybPiper phylogeny is similar to Captus in both methods. The nuclear genes recover a deep split of Thladiantha in two clades, but this is not supported by the plastid data. CONCLUSIONS: Captus is the best choice among the three pipelines in terms of computation time and locus recovery. Even though there is no significant topological difference between the Thladiantha species trees produced by the three pipelines, Captus yields a higher number of gene trees in agreement with the topology of the species tree (i.e., fewer genes in conflict with the species tree topology).


Subject(s)
Arabidopsis; Cucurbitaceae; Phylogeny; Cucurbitaceae/genetics; Arabidopsis/genetics; Genome
9.
Front Plant Sci ; 14: 1153505, 2023.
Article in English | MEDLINE | ID: mdl-37434602

ABSTRACT

An improved YOLOv5s model was proposed and validated on a new fruit dataset to solve the real-time detection task in a complex environment. With the incorporation of feature concatenation and an attention mechanism into the original YOLOv5s network, the improved YOLOv5s recorded 122 layers, 4.4 × 10^6 params, 12.8 GFLOPs, and 8.8 MB weight size, which are 45.5%, 30.2%, 14.1%, and 31.3% smaller than the original YOLOv5s, respectively. Meanwhile, the 93.4% mAP on the validation set, 96.0% mAP on the test set, and 74 fps on videos obtained with the improved YOLOv5s are 0.6%, 0.5%, and 10.4% higher than those of the original YOLOv5s model, respectively. On videos, fruit tracking and counting with the improved YOLOv5s produced fewer missed and incorrect detections than the original YOLOv5s. Furthermore, the aggregated detection performance of the improved YOLOv5s outperformed GhostYOLOv5s, YOLOv4-tiny, and YOLOv7-tiny, as well as other mainstream YOLO variants. Therefore, the improved YOLOv5s is lightweight with reduced computation costs, can better generalize against complex conditions, and is applicable for real-time detection in fruit picking robots and low-power devices.

10.
Sensors (Basel) ; 23(6)2023 Mar 20.
Article in English | MEDLINE | ID: mdl-36991967

ABSTRACT

This study proposes an electrocardiogram (ECG) signal stitching scheme to detect arrhythmias in drivers during driving. When the ECG is measured through the steering wheel during driving, the data are always exposed to noise caused by vehicle vibrations, bumpy road conditions, and the driver's steering wheel gripping force. The proposed scheme extracts stable ECG signals and transforms them into full 10 s ECG signals to classify arrhythmias using convolutional neural networks (CNN). Before the ECG stitching algorithm is applied, data preprocessing is performed. To extract the cycle from the collected ECG data, the R peaks are found and the TP interval segmentation is applied. An abnormal P peak is very difficult to find. Therefore, this study also introduces a P peak estimation method. Finally, 4 × 2.5 s ECG segments are collected. To classify arrhythmias with stitched ECG data, each time series' ECG signal is transformed via the continuous wavelet transform (CWT) and short-time Fourier transform (STFT), and transfer learning is performed for classification using CNNs. Finally, the parameters of the networks that provide the best performance are investigated. According to the classification accuracy, GoogleNet with the CWT image set shows the best results. The classification accuracy is 82.39% for the stitched ECG data, while it is 88.99% for the original ECG data.


Subject(s)
Deep Learning; Humans; Arrhythmias, Cardiac/diagnosis; Neural Networks, Computer; Algorithms; Electrocardiography
11.
Crit Rev Food Sci Nutr ; 63(32): 10995-11009, 2023.
Article in English | MEDLINE | ID: mdl-35730201

ABSTRACT

Enological evaluations capture the chemical and sensory space of wine using different techniques; many sensory methods as well as a variety of analytical chemistry techniques contribute to the amount of information generated. Data fusion, especially integrating data sets, is important when working with complex systems. The success reported when trying to integrate different modalities is generally low and has been attributed to the lack of statistically considerate strategies focusing on the data handling process. Multiple stages of data handling must be carefully considered when dealing with multi-modal data. In this review, the different stages in the data analysis process were examined. The study revealed misconceptions surrounding the process and elucidated rules for purpose-driven approaches by examining the complexities of each stage and the impact the decisions made at each stage have on the resulting models. The two major modeling approaches are either supervised (discrimination, classification, prediction) or unsupervised (exploration). Supervised approaches were emphatic on the pre-processing steps and prioritized increasing performance. Unsupervised approaches were mostly used for preliminary steps. The review found aspects often neglected when it came to the data collection and capturing which in the end contributed to the low success in combining sensory and chemistry data.


Subject(s)
Chemometrics; Wine
12.
Imeta ; 2(1): e87, 2023 Feb.
Article in English | MEDLINE | ID: mdl-38868339

ABSTRACT

Phylogenetic analysis has entered the genomics (multilocus) era. For less experienced researchers, conquering the large number of software programs required for a multilocus-based phylogenetic reconstruction can be somewhat daunting and time-consuming. PhyloSuite, a software package with a user-friendly GUI, was designed to make this process more accessible by integrating multiple software programs needed for multilocus and single-gene phylogenies and further streamlining the whole process. In this protocol, we aim to explain how to conduct each step of the phylogenetic pipeline and tree-based analyses in PhyloSuite. We also present a new version of PhyloSuite (v1.2.3), wherein we fixed some bugs, made some optimizations, and introduced some new functions, including a number of tree-based analyses, such as signal-to-noise calculation, saturation analysis, and spurious species identification. The step-by-step protocol includes background information (i.e., what the step does), reasons (i.e., why the step is done), and operations (i.e., how to do it). This protocol will help researchers quick-start their way through multilocus phylogenetic analysis, especially those interested in conducting organelle-based analyses.

13.
BMC Oral Health ; 22(1): 571, 2022 12 07.
Article in English | MEDLINE | ID: mdl-36476146

ABSTRACT

BACKGROUND: Assessing the time required for tooth extraction is the most important factor to consider before surgeries. The purpose of this study was to create a practical predictive model for assessing the time to extract the mandibular third molar tooth using deep learning. The accuracy of the model was evaluated by comparing the extraction time predicted by deep learning with the actual time required for extraction. METHODS: A total of 724 panoramic X-ray images and clinical data were used for artificial intelligence (AI) prediction of extraction time. Clinical data such as age, sex, maximum mouth opening, body weight, height, the time from the start of incision to the start of suture, and surgeon's experience were recorded. Data augmentation and weight balancing were used to improve learning abilities of AI models. Extraction time predicted by the concatenated AI model was compared with the actual extraction time. RESULTS: The final combined model (CNN + MLP) achieved an R value of 0.8315, an R-squared value of 0.6839, a p-value of less than 0.0001, and a mean absolute error (MAE) of 2.95 min with the test dataset. CONCLUSIONS: Our proposed model for predicting time to extract the mandibular third molar tooth performs well with a high accuracy in clinical practice.


Subject(s)
Artificial Intelligence; Deep Learning; Humans; Molar, Third/diagnostic imaging; Molar, Third/surgery; Tooth Extraction; Operative Time
14.
Front Bioinform ; 2: 1074802, 2022.
Article in English | MEDLINE | ID: mdl-36568700

ABSTRACT

The reconstruction of phylogenomic trees containing multiple genes is best achieved by using a supermatrix. The advent of NGS technology made it easier and cheaper to obtain multiple gene data in one sequencing run. When numerous genes and organisms are used in the phylogenomic analysis, it is difficult to organize all information and manually align the gene sequences to further concatenate them. This study describes SPLACE, a tool to automatically SPLit, Align, and ConcatenatE the genes of all species of interest to generate a supermatrix file, and consequently, a phylogenetic tree, while handling possible missing data. In our findings, SPLACE was the only tool that could automatically align gene sequences and also handle missing data; and, it required only a few minutes to produce a supermatrix FASTA file containing 83 aligned and concatenated genes from the chloroplast genomes of 270 plant species. It is an open-source tool and is publicly available at https://github.com/reinator/splace.
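
The supermatrix idea described in this abstract can be sketched in a few lines. This is not SPLACE's code; it is a minimal illustration that assumes the per-gene alignments already exist and that a species missing from a gene is padded with gap characters. The function names, species labels, and toy sequences are hypothetical.

```python
def build_supermatrix(alignments, missing="-"):
    """Concatenate per-gene alignments into one sequence per species.

    alignments: list of dicts mapping species name -> aligned sequence.
    A species absent from a gene gets a run of `missing` of that gene's length.
    """
    species = sorted({name for gene in alignments for name in gene})
    matrix = {name: [] for name in species}
    for gene in alignments:
        lengths = {len(seq) for seq in gene.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a gene must be aligned to equal length")
        (length,) = lengths
        for name in species:
            matrix[name].append(gene.get(name, missing * length))
    return {name: "".join(parts) for name, parts in matrix.items()}


def to_fasta(supermatrix):
    """Serialize the supermatrix as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in supermatrix.items())


# Toy alignments for two genes (hypothetical data); sp2 lacks the second gene.
gene_1 = {"sp1": "ATGC", "sp2": "ATGA", "sp3": "ATGT"}
gene_2 = {"sp1": "GG-CA", "sp3": "GGTCA"}
supermatrix = build_supermatrix([gene_1, gene_2])
```

Padding with gaps keeps every row the same length, so the concatenated alignment stays rectangular even when some genes are absent for some species.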

15.
Life (Basel) ; 12(11)2022 Nov 11.
Article in English | MEDLINE | ID: mdl-36430983

ABSTRACT

Due to various reasons, the incidence rate of communicable diseases in humans is steadily rising, and timely detection and handling will reduce the disease distribution speed. Tuberculosis (TB) is a severe communicable illness caused by the bacterium Mycobacterium tuberculosis (M. tuberculosis), which predominantly affects the lungs and causes severe respiratory problems. Due to its significance, several clinical level detections of TB are suggested, including lung diagnosis with chest X-ray images. The proposed work aims to develop an automatic TB detection system to assist the pulmonologist in confirming the severity of the disease, decision-making, and treatment execution. The proposed system employs a pre-trained VGG19 with the following phases: (i) image pre-processing, (ii) mining of deep features, (iii) enhancing the X-ray images with chosen procedures and mining of the handcrafted features, (iv) feature optimization using Seagull-Algorithm and serial concatenation, and (v) binary classification and validation. The classification is executed with 10-fold cross-validation in this work, and the proposed work is investigated using MATLAB® software. The proposed research work was executed using the concatenated deep and handcrafted features, which provided a classification accuracy of 98.6190% with the SVM-Medium Gaussian (SVM-MG) classifier.

16.
Physiol Meas ; 43(10)2022 10 31.
Article in English | MEDLINE | ID: mdl-36195081

ABSTRACT

Objective. Due to the variability of human movements, muscle activations vary among trials and subjects. However, few studies have investigated how data organization methods for addressing variability impact the extracted muscle synergies. Approach. Fifteen healthy subjects performed a large set of upper limb multi-directional point-to-point reaching movements. Then, the study extracted muscle synergies under different data settings and investigated how data structure prior to synergy extraction (namely concatenation, averaging, and single trial), the number of considered trials, and the number of reaching directions affected the number and components of muscle synergies. Main results. The results showed that the number and components of synergies were significantly affected by the data structure. The concatenation method identified the highest number of synergies, and the averaging method usually found a smaller number of synergies. When the number of concatenated trials or reaching directions was lower than a minimum value, the number of synergies increased with the number of trials or reaching directions; however, when the number of trials or reaching directions reached a threshold, the number of synergies was usually constant or varied less, even when novel directions and trials were added. Similarity analysis also showed a slight increase when the number of trials or reaching directions was lower than a threshold. This study recommends that at least five trials and four reaching directions and the concatenation method are considered in muscle synergies analysis during upper limb tasks. Significance. This study allows researchers to focus on the variability induced by diseases rather than by the techniques applied for synergies analysis, and promotes applications of muscle synergies in clinical scenarios.
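
The difference between the concatenation and averaging data structures can be made concrete with a small sketch. It assumes each trial is a muscles × time matrix of equal, time-normalized length; the function names and toy values are hypothetical, and the factorization step that would follow (commonly non-negative matrix factorization, though the abstract does not name one) is left out.

```python
def concatenate_trials(trials):
    """Join trials end to end along time: result is muscles x (trials * samples)."""
    n_muscles = len(trials[0])
    return [[x for trial in trials for x in trial[m]] for m in range(n_muscles)]


def average_trials(trials):
    """Average trials sample by sample: result is muscles x samples."""
    n_muscles = len(trials[0])
    n_samples = len(trials[0][0])
    return [
        [sum(trial[m][t] for trial in trials) / len(trials) for t in range(n_samples)]
        for m in range(n_muscles)
    ]


# Two toy trials (hypothetical envelopes), 2 muscles x 3 time samples each.
trial_1 = [[0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]
trial_2 = [[0.0, 0.0, 1.0], [0.5, 1.5, 0.5]]

concatenated = concatenate_trials([trial_1, trial_2])
averaged = average_trials([trial_1, trial_2])
```

Concatenation keeps the trial-to-trial differences in the data matrix, whereas averaging smooths them out before extraction, which is one plausible reading of why the two structures yield different numbers of synergies.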


Subject(s)
Movement; Muscle, Skeletal; Humans; Electromyography; Biomechanical Phenomena; Muscle, Skeletal/physiology; Movement/physiology; Upper Extremity
17.
Healthcare (Basel) ; 10(10)2022 Oct 18.
Article in English | MEDLINE | ID: mdl-36292519

ABSTRACT

The novel coronavirus 2019 (COVID-19) spread rapidly around the world and its outbreak has become a pandemic. Due to an increase in afflicted cases, the quantity of COVID-19 test kits available in hospitals has decreased. Therefore, an autonomous detection system is an essential tool for reducing infection risks and the spread of the virus. In the literature, various models based on machine learning (ML) and deep learning (DL) have been introduced to detect many pneumonias using chest X-ray images. The cornerstone of this paper is the use of pretrained deep learning CNN architectures to construct an automated system for COVID-19 detection and diagnosis. In this work, we used the deep feature concatenation (DFC) mechanism to combine features extracted from input images using the two modern pre-trained CNN models, AlexNet and Xception. Hence, we propose COVID-AleXception: a neural network that is a concatenation of the AlexNet and Xception models for the overall improvement of the prediction capability of this pandemic. To evaluate the proposed model and build a dataset of large-scale X-ray images, there was a careful selection of multiple X-ray images from several sources. The COVID-AleXception model can achieve a classification accuracy of 98.68%, which shows the superiority of the proposed model over AlexNet and Xception, which achieved classification accuracies of 94.86% and 95.63%, respectively. The performance results of this proposed model demonstrate its pertinence to help radiologists diagnose COVID-19 more quickly.

18.
Entropy (Basel) ; 24(7)2022 Jun 25.
Article in English | MEDLINE | ID: mdl-35885098

ABSTRACT

Densely connected convolutional networks (DenseNet) behave well in image processing. However, for regression tasks, convolutional DenseNet may lose essential information from independent input features. To tackle this issue, we propose a novel DenseNet regression model where convolution and pooling layers are replaced by fully connected layers and the original concatenation shortcuts are maintained to reuse the feature. To investigate the effects of depth and input dimensions of the proposed model, careful validations are performed by extensive numerical simulation. The results give an optimal depth (19) and recommend a limited input dimension (under 200). Furthermore, compared with the baseline models, including support vector regression, decision tree regression, and residual regression, our proposed model with the optimal depth performs best. Ultimately, DenseNet regression is applied to predict relative humidity, and the outcome shows a high correlation with observations, which indicates that our model could advance environmental data science.
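
The architecture described here, fully connected layers joined by DenseNet-style concatenation shortcuts, can be sketched as a forward pass. This is an illustration under stated assumptions, not the authors' model: the ReLU activation, layer width, random initialization, and all function names are hypothetical, and no training is shown.

```python
import random


def dense_layer(inputs, weights, biases):
    """Fully connected layer with ReLU: one output per weight row."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_network(n_inputs, depth, width, seed=0):
    """Random parameters; each layer sees the input plus all earlier layer outputs."""
    rng = random.Random(seed)
    layers = []
    fan_in = n_inputs
    for _ in range(depth):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(fan_in)] for _ in range(width)]
        layers.append((weights, [0.0] * width))
        fan_in += width  # the concatenation shortcut grows the next layer's input
    head = [rng.uniform(-0.1, 0.1) for _ in range(fan_in)]
    return layers, head


def forward(x, layers, head):
    """Return a scalar prediction and the size of the final concatenated feature vector."""
    features = list(x)
    for weights, biases in layers:
        # Concatenate rather than add: earlier features are reused unchanged.
        features = features + dense_layer(features, weights, biases)
    return sum(w * f for w, f in zip(head, features)), len(features)


layers, head = init_network(n_inputs=4, depth=3, width=2)
prediction, n_features = forward([0.2, -1.0, 0.5, 3.0], layers, head)
```

Because every layer appends its output to the running feature vector, the raw input features reach the regression head directly, which is the property the abstract credits with preserving information from independent inputs.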

19.
Mol Biol Evol ; 39(6)2022 06 02.
Article in English | MEDLINE | ID: mdl-35642314

ABSTRACT

Traditionally, single-copy orthologs have been the gold standard in phylogenomics. Most phylogenomic studies identify putative single-copy orthologs using clustering approaches and retain families with a single sequence per species. This limits the amount of data available by excluding larger families. Recent advances have suggested several ways to include data from larger families. For instance, tree-based decomposition methods facilitate the extraction of orthologs from large families. Additionally, several methods for species tree inference are robust to the inclusion of paralogs and could use all of the data from larger families. Here, we explore the effects of using all families for phylogenetic inference by examining relationships among 26 primate species in detail and by analyzing five additional data sets. We compare single-copy families, orthologs extracted using tree-based decomposition approaches, and all families with all data. We explore several species tree inference methods, finding that identical trees are returned across nearly all subsets of the data and methods for primates. The relationships among Platyrrhini remain contentious; however, the species tree inference method matters more than the subset of data used. Using data from larger gene families drastically increases the number of genes available and leads to consistent estimates of branch lengths, nodal certainty and concordance, and inferences of introgression in primates. For the other data sets, topological inferences are consistent whether single-copy families or orthologs extracted using decomposition approaches are analyzed. Using larger gene families is a promising approach to include more data in phylogenomics without sacrificing accuracy, at least when high-quality genomes are available.


Subject(s)
Genome; Animals; Cluster Analysis; Phylogeny
20.
BMC Ecol Evol ; 22(1): 55, 2022 04 30.
Article in English | MEDLINE | ID: mdl-35501703

ABSTRACT

BACKGROUND: The genus Ligusticum belongs to Apiaceae, and its taxonomy has long been a major difficulty. A robust phylogenetic tree is the basis of accurate taxonomic classification of Ligusticum. We herein used 26 (including 14 newly sequenced) plastome-scale data to generate reliable phylogenetic trees to explore the phylogenetic relationships of Chinese Ligusticum. RESULTS: We found that these plastid genomes exhibited diverse plastome characteristics across all four currently identified clades in China, while the plastid protein-coding genes were conserved. The phylogenetic analyses by the concatenation and coalescent methods obtained a more robust molecular phylogeny than prior studies and showed the non-monophyly of Chinese Ligusticum. In the concatenation-based phylogeny analyses, the two datasets yielded slightly different topologies that may be primarily due to the discrepancy in the number of variable sites. CONCLUSIONS: Our plastid phylogenomics analyses emphasized that the current circumscription of the Chinese Ligusticum should be reduced, and the taxonomy of Ligusticum urgently needs revision. Wider taxon sampling including the related species of Ligusticum will be necessary to explore the phylogenetic relationships of this genus. Overall, our study provided new insights into the taxonomic classification of Ligusticum and would serve as a framework for future studies on taxonomy and delimitation of Ligusticum from the perspective of the plastid genome.


Subject(s)
Apiaceae; Genome, Plastid; Ligusticum; Evolution, Molecular; Phylogeny