Results 1 - 20 of 62
1.
J Environ Manage ; 368: 122136, 2024 Aug 10.
Article in English | MEDLINE | ID: mdl-39128344

ABSTRACT

Environmental DNA (eDNA) metabarcoding is an emerging tool for monitoring biological communities in aquatic ecosystems. The selection of bioinformatic pipelines significantly impacts the results of biodiversity assessments. However, there is currently no consensus on the appropriate bioinformatic pipelines for fish community analysis in eDNA metabarcoding. In this study, we compared three bioinformatic pipelines (Uparse, DADA2, and UNOISE3) using real and mock (constructed with 15/30 known fish) communities to investigate the differences in biological interpretation during the data analysis process in eDNA metabarcoding. Performance evaluation and diversity analyses revealed that the choice of bioinformatic pipeline could impact the biological results of metabarcoding experiments. Among the three pipelines, the operational taxonomic unit (OTU)-based pipeline (Uparse) showed the best performance (sensitivity: 0.6250 ± 0.0166; compositional similarity: 0.4000 ± 0.0571), the highest richness (25-102) and minimal inter-group differences in alpha diversity. This suggests that the OTU-based pipeline is better suited to fish diversity monitoring than the ASV/ZOTU-based pipelines. Additionally, the Bray-Curtis distance matrix achieved the highest discriminative effect in the PCoA (43.3%-53.89%) and inter-group analysis (P < 0.01), indicating that it was better than other distance matrices at distinguishing compositional differences or specific genera of fish communities at different sampling sites. These findings provide new insights into fish community monitoring through eDNA metabarcoding in estuarine environments.
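The Bray-Curtis distance matrix mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not describe how the authors computed it, so the function names and the read counts below are hypothetical, and the formula is the standard Bray-Curtis dissimilarity (sum of absolute differences divided by the sum of all abundances).

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    # Two empty samples are treated as identical (an assumption of this sketch).
    return numerator / denominator if denominator else 0.0


def distance_matrix(samples):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(samples)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = bray_curtis(samples[i], samples[j])
    return d


# Hypothetical read counts for three sampling sites across four fish taxa.
sites = [
    [10, 0, 5, 5],
    [10, 0, 5, 5],
    [0, 20, 0, 0],
]
D = distance_matrix(sites)
```

Sites with identical composition get a dissimilarity of 0 and sites sharing no taxa get 1; a matrix like `D` is what an ordination such as PCoA would take as input.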

2.
Spectrochim Acta A Mol Biomol Spectrosc ; 322: 124783, 2024 Dec 05.
Article in English | MEDLINE | ID: mdl-38972098

ABSTRACT

The high dimensionality, redundancy, and non-linearity of near-infrared (NIR) spectral data, together with sample attributes such as producing area and grade, can all affect the similarity measure between samples. This paper proposes a t-distributed stochastic neighbor embedding algorithm based on the Sinkhorn distance (St-SNE) combined with multi-attribute data information. First, the Sinkhorn distance was introduced to address problems such as the asymmetry of the KL divergence and sparse data distribution in high-dimensional space, thereby constructing probability distributions that make the low-dimensional space similar to the high-dimensional space. In addition, to address the impact of multi-attribute features of samples on the similarity measure, a multi-attribute distance matrix was constructed using information entropy and then combined with the numerical matrix of spectral data to obtain a mixed data matrix. To validate the effectiveness of the St-SNE algorithm, dimensionality reduction projection was performed on NIR spectral data and compared with the PCA, LPP, and t-SNE algorithms. The results demonstrated that the St-SNE algorithm effectively distinguished samples with different attribute information and produced more distinct projection boundaries between sample categories in low-dimensional space. We then tested the classification performance of St-SNE for different attributes using the tobacco and mango datasets and compared it with the LPP, t-SNE, UMAP, and Fisher t-SNE algorithms. The results showed that the St-SNE algorithm had the highest classification accuracy for different attributes. Finally, we compared the results of searching for the sample most similar to the target tobacco for cigarette formulas, and experiments showed that St-SNE was more consistent with the experts' recommendations than the other algorithms. It can provide strong support for the maintenance and design of the product formula.
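For readers unfamiliar with the Sinkhorn distance, the following is a minimal sketch of the standard entropy-regularized optimal transport iteration. It is not the St-SNE algorithm and does not reproduce the authors' implementation; the function name, the regularization value `reg`, and the toy histograms are assumptions made for illustration.

```python
import math


def sinkhorn_distance(a, b, cost, reg=0.1, n_iter=500):
    """Entropy-regularized transport cost between histograms a and b.

    a, b  : probability vectors (each sums to 1, all entries positive here)
    cost  : cost[i][j] is the cost of moving mass from bin i to bin j
    reg   : regularization strength (hypothetical default; very small values
            can underflow exp() in this naive version)
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return total, plan


# Toy example: two identical two-bin histograms with unit off-diagonal cost.
a = [0.5, 0.5]
b = [0.5, 0.5]
cost = [[0.0, 1.0], [1.0, 0.0]]
distance, plan = sinkhorn_distance(a, b, cost)
```

Unlike the KL divergence, this quantity is symmetric when the cost matrix is symmetric, which is the property the abstract highlights.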

3.
J Cheminform ; 16(1): 52, 2024 May 12.
Article in English | MEDLINE | ID: mdl-38735985

ABSTRACT

Protein-ligand binding affinity plays a pivotal role in drug development, particularly in identifying potential ligands for target disease-related proteins. Accurate affinity predictions can significantly reduce both the time and cost involved in drug development. However, highly precise affinity prediction remains a research challenge. A key to improving affinity prediction is to capture interactions between proteins and ligands effectively. Existing deep-learning-based computational approaches use 3D grids, 4D tensors, molecular graphs, or proximity-based adjacency matrices, which are either resource-intensive or do not directly represent potential interactions. In this paper, we propose atomic-level distance features and attention mechanisms to better capture specific protein-ligand interactions based on donor-acceptor relations, hydrophobicity, and π-stacking atoms. We argue that distances encompass both short-range direct and long-range indirect interaction effects while attention mechanisms capture levels of interaction effects. On the well-known CASF-2016 dataset, our proposed method, named Distance plus Attention for Affinity Prediction (DAAP), significantly outperforms existing methods by achieving Correlation Coefficient (R) 0.909, Root Mean Squared Error (RMSE) 0.987, Mean Absolute Error (MAE) 0.745, Standard Deviation (SD) 0.988, and Concordance Index (CI) 0.876. The proposed method also shows substantial improvement, around 2% to 37%, on five other benchmark datasets. The program and data are publicly available on the website https://gitlab.com/mahnewton/daap. Scientific Contribution Statement: This study innovatively introduces distance-based features to predict protein-ligand binding affinity, capitalizing on unique molecular interactions. Furthermore, the incorporation of protein sequence features of specific residues enhances the model's proficiency in capturing intricate binding patterns. The predictive capabilities are further strengthened through the use of a deep learning architecture with attention mechanisms, and an ensemble approach, averaging the outputs of five models, is implemented to ensure robust and reliable predictions.

4.
Head Face Med ; 20(1): 34, 2024 May 18.
Article in English | MEDLINE | ID: mdl-38762519

ABSTRACT

BACKGROUND: We aimed to establish a novel method for automatically constructing the three-dimensional (3D) median sagittal plane (MSP) for patients with mandibular deviation, which can increase the efficiency of the aesthetic evaluation of treatment progress. We developed a Euclidean weighted Procrustes analysis (EWPA) algorithm for extracting the 3D facial MSP based on Euclidean distance matrix analysis, automatically assigning weights to facial anatomical landmarks. METHODS: Forty patients with mandibular deviation were recruited, and the Procrustes analysis (PA) algorithm based on the original mirror alignment and the EWPA algorithm developed in this study were used to construct the MSP of each patient's facial model as experimental groups 1 and 2, respectively. The expert-defined regional iterative closest point algorithm was used to construct the MSP as the reference group. The angle errors of the two experimental groups were compared to those of the reference group to evaluate their clinical suitability. RESULTS: The angle errors of the MSP constructed by the two EWPA and PA algorithms for the 40 patients were 1.39 ± 0.85°, 1.39 ± 0.78°, and 1.91 ± 0.80°, respectively. The two EWPA algorithms performed best in patients with moderate facial asymmetry, and in patients with severe facial asymmetry, the angle error was below 2°, which was a significant improvement over the PA algorithm. CONCLUSIONS: The clinical application of the EWPA algorithm based on 3D facial morphological analysis for constructing a 3D facial MSP for patients with mandibular deviated facial asymmetry deformity showed a significant improvement over the conventional PA algorithm and achieved the effect of a dental clinical expert-level diagnostic strategy.


Subjects
Algorithms , Facial Asymmetry , Three-Dimensional Imaging , Humans , Facial Asymmetry/diagnostic imaging , Male , Female , Three-Dimensional Imaging/methods , Anatomic Landmarks , Mandible/diagnostic imaging , Adolescent , Adult , Young Adult , Cephalometry/methods , Face/diagnostic imaging
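The angle error reported in the abstract above compares a constructed plane with a reference plane. A minimal sketch of one plausible way to measure it follows, assuming each plane is described by a normal vector and that the error is the angle between the two normals; the abstract does not state the exact definition, and the function name is hypothetical.

```python
import math


def plane_angle_deg(normal_a, normal_b):
    """Angle in degrees between two planes given their normal vectors."""
    dot = sum(a * b for a, b in zip(normal_a, normal_b))
    norm_a = math.sqrt(sum(a * a for a in normal_a))
    norm_b = math.sqrt(sum(b * b for b in normal_b))
    # abs() makes the result independent of which way each normal points;
    # min() guards against rounding slightly above 1.0.
    cos_theta = min(1.0, abs(dot) / (norm_a * norm_b))
    return math.degrees(math.acos(cos_theta))


# Hypothetical normals: a reference plane and a slightly tilted candidate.
reference_normal = (1.0, 0.0, 0.0)
candidate_normal = (1.0, 0.02, 0.0)
error_deg = plane_angle_deg(reference_normal, candidate_normal)
```

Under this definition, two planes whose normals point in opposite directions are treated as the same plane.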
5.
J Oral Maxillofac Pathol ; 28(1): 111-118, 2024.
Article in English | MEDLINE | ID: mdl-38800435

ABSTRACT

Aims: The study aims to identify sexually dimorphic features in the arch patterns based on tooth arrangement patterns and the maxillary and mandibular arches using Euclidean Distance Matrix Analysis (EDMA). Settings and Design: A total of 96 Nepalese subjects, aged 18 to 25, were assessed using casts and photographs. Materials and Methods: Thirteen landmarks representing the most facial portions of the proximal contact areas on the maxillary and mandibular casts were digitised. All 78 possible Euclidean distances between the 13 landmarks were calculated using the Analysis ToolPak of Microsoft Excel®. The male-to-female ratios of the corresponding distances were computed, and the ratios were compared to evaluate the arch form for variation between the genders in the Nepalese population. Statistical Analysis Used: Microsoft Excel Analysis ToolPak and SPSS 20.0 (IBM Chicago) were used to perform EDMA and an independent t-test to compare the significant differences between the two genders. Results: The maxillary arch's largest ratio (1.008179001) was discovered near the location of the right and left lateral incisors, indicating that the anterior region may have experienced the greatest change. The posterior-molar region is where the smallest ratio was discovered, suggesting less variation. At the intercanine region, female arches were wider than male ones; however, at the interpremolar and intermolar sections, they were similar in width. Females' maxillary arches were discovered to be bigger antero-posteriorly than those of males. The highest ratio (1.014336113) in the mandibular arch was discovered at the intermolar area, suggesting that males had a larger mandibular posterior arch morphology. At the intercanine area, the breadth of the arch form was greater in males and nearly the same in females at the interpremolar and intermolar regions. Female mandibular arch forms were also discovered to be longer than those of males from the anterior to the posterior. Conclusions: The male and female arches in the Nepalese population were inferred to be different in size and shape. With reference to the landmarks demonstrating such a shift, EDMA objectively established the presence of square arch forms in Nepalese males and tapering arch forms in Nepalese females.
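The core EDMA computation described in this abstract (all inter-landmark distances, then male-to-female ratios) can be sketched briefly. The study used Microsoft Excel and group data; the Python below is only an illustration, with hypothetical function names and made-up coordinates for a single pair of forms.

```python
import math
from itertools import combinations


def form_matrix(landmarks):
    """All pairwise Euclidean distances, keyed by landmark index pair."""
    return {
        (i, j): math.dist(landmarks[i], landmarks[j])
        for i, j in combinations(range(len(landmarks)), 2)
    }


def form_ratios(form_a, form_b):
    """Ratio of each distance in form_a to the same distance in form_b."""
    return {pair: form_a[pair] / form_b[pair] for pair in form_a}


# 13 landmarks yield C(13, 2) = 78 inter-landmark distances.
n_pairs = math.comb(13, 2)

# Hypothetical 2D coordinates (three landmarks only, for brevity).
male = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
female = [(0.0, 0.0), (2.0, 0.0), (0.0, 3.0)]
ratios = form_ratios(form_matrix(male), form_matrix(female))
```

A ratio above 1 for a landmark pair means that distance is larger in the first form; ratios that vary across pairs indicate a difference in shape rather than in size alone.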

6.
Protein J ; 43(2): 259-273, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38492188

ABSTRACT

The paper introduces a novel probability descriptor for genome sequence comparison, employing a generalized form of Jensen-Shannon divergence. This divergence metric stems from a one-parameter family, comprising fractions up to a maximum value of half. Utilizing this metric as a distance measure, a distance matrix is computed for the new probability descriptor, and phylogenetic trees are built from it via the neighbor-joining method. Initial exploration involves setting the parameter at half for various species. To assess the impact of parameter variation, trees are drawn at different parameter values (half, one-fourth, one-eighth). However, measurement scales decrease with parameter value increments, with higher similarity accuracy corresponding to lower scale values. Ultimately, the highest accuracy aligns with the maximum parameter value of half. Comparative analyses against previous methods, evaluating via Symmetric Distance (SD) values and rationalized perception, consistently favor the present approach's results. Notably, outcomes at the maximum parameter value exhibit the most accuracy, validating the method's efficacy against earlier approaches.


Subjects
Phylogeny , Genome , Algorithms , Sequence Alignment/methods , Genomics/methods
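The abstract above does not give the formula for its generalized Jensen-Shannon divergence. As a rough illustration only, the sketch below uses the weighted Jensen-Shannon divergence, which is one known one-parameter family that reduces to the ordinary divergence at a weight of one half; treating this as the paper's family is an assumption, and the function names are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def weighted_jsd(p, q, w=0.5):
    """Weighted Jensen-Shannon divergence with weight w on p and 1 - w on q.

    w = 0.5 gives the ordinary, symmetric Jensen-Shannon divergence.
    """
    mixture = [w * a + (1 - w) * b for a, b in zip(p, q)]
    return entropy(mixture) - w * entropy(p) - (1 - w) * entropy(q)


# Hypothetical descriptor distributions for two sequences.
p = [1.0, 0.0]
q = [0.0, 1.0]
at_half = weighted_jsd(p, q, w=0.5)
at_quarter = weighted_jsd(p, q, w=0.25)
```

Evaluating such a divergence for every pair of sequences yields the distance matrix that a neighbor-joining implementation would then turn into a tree.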
7.
Genome Biol Evol ; 15(12), 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-38085949

ABSTRACT

Phylogenetics is now fundamental in life sciences, providing insights into the earliest branches of life and the origins and spread of epidemics. However, finding suitable phylogenies from the vast space of possible trees remains challenging. To address this problem, for the first time, we perform both tree exploration and inference in a continuous space where the computation of gradients is possible. This continuous relaxation allows for major leaps across tree space in both rooted and unrooted trees, and is less susceptible to convergence to local minima. Our approach outperforms the current best methods for inference on unrooted trees and, in simulation, accurately infers the tree and root in ultrametric cases. The approach is effective in cases of empirical data with negligible amounts of data, which we demonstrate on the phylogeny of jawed vertebrates. Indeed, only a few genes with an ultrametric signal were generally sufficient for resolving the major lineages of vertebrates. Optimization is possible via automatic differentiation and our method presents an effective way forward for exploring the most difficult, data-deficient phylogenetic questions.


Subjects
Algorithms , Genetic Models , Phylogeny , Computer Simulation
8.
Genes (Basel) ; 14(7), 2023 Jul 14.
Article in English | MEDLINE | ID: mdl-37510350

ABSTRACT

Classically, genetic association studies have attempted to assess genetic polymorphisms related to human physiology and physical performance. However, the heterogeneity of some findings makes replication, validation, and confirmation essential for ensuring their applicability in sports sciences. Genetic distance matrix and molecular variance analyses may offer an alternative approach to comparing athletes' genomes with those from public databases. Thus, we performed a complete sequencing of 44 genomes from male Brazilian first-division soccer players under 20 years of age (U20_BFDSC). The performance-related SNP genotypes were obtained from players and from the "1000 Genomes" database (European, African, American, East Asian, and South Asian). Surprisingly, U20_BFDSC performance-related genotypes had significantly larger FST levels (p < 0.00001) than African populations, although studies using ancestry markers have shown an important similarity between Brazilian and African populations (12-24%). U20_BFDSC were genetically similar to professional athletes, showing the intense genetic selection pressure likely to occur before this maturation stage. Our study highlighted that performance-related genes might undergo selective pressure due to physical performance and environmental, cognitive, and sociocultural factors. This replicative study suggests that molecular variance and Wright's statistics can yield novel conclusions in exercise science.


Subjects
Athletic Performance , Soccer , Humans , Male , Adolescent , Soccer/physiology , Brazil , Athletic Performance/physiology , Athletes , Exercise
9.
Data Brief ; 47: 108970, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36875213

ABSTRACT

Phylogenetic trees provide insight into the evolutionary trajectories of species and molecules. However, because (2n-5)!! phylogenetic trees can be constructed from a dataset containing n sequences, determining the optimal tree by brute force is impractical owing to combinatorial explosion. Therefore, we developed a method for constructing a phylogenetic tree using a Fujitsu Digital Annealer, a quantum-inspired computer that solves combinatorial optimization problems at a high speed. Specifically, phylogenetic trees are generated by repeating the process of partitioning a set of sequences into two parts (i.e., the graph-cut problem). Here, the optimality of the solution (normalized cut value) obtained by the proposed method was compared with the existing methods using simulated and real data. The simulation dataset contained 32-3200 sequences, and the average branch length according to a normal distribution or the Yule model ranged from 0.125 to 0.750, covering a wide range of sequence diversity. In addition, the statistical information of the dataset is described in terms of two indices: transitivity and average p-distance. As phylogenetic tree construction methods are expected to continue to improve, we believe that this dataset can be used as a reference for comparison and confirmation of the validity of the results. Further interpretation of these analyses is explained in W. Onodera, N. Hara, S. Aoki, T. Asahi, N. Sawamura, Phylogenetic tree reconstruction via graph cut presented using a quantum-inspired computer, Mol. Phylogenet. Evol. 178 (2023) 107636.

10.
Mol Phylogenet Evol ; 178: 107636, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36208695

ABSTRACT

Phylogenetic trees are essential tools in evolutionary biology that present information on evolutionary events among organisms and molecules. For a dataset of n sequences, (2n-5)!! possible tree topologies exist, and determining the optimum topology using brute force is infeasible. Recently, a recursive graph cut on a graph-represented similarity matrix has proven accurate in reconstructing a phylogenetic tree containing distantly related sequences. However, identifying the optimum graph cut is challenging, and approximate solutions are currently utilized. Here, a phylogenetic tree was reconstructed with an improved graph cut using a quantum-inspired computer, the Fujitsu Digital Annealer (DA), and the algorithm was named the "Normalized-Minimum cut by Digital Annealer (NMcutDA) method". First, a criterion for the graph cut, the normalized cut value, was compared with existing clustering methods. Based on the cut, we verified that the simulated phylogenetic tree could be reconstructed with the highest accuracy when sequences were diverged. Moreover, for some actual data from the structure-based protein classification database, only NMcutDA could cluster sequences into correct superfamilies. In conclusion, NMcutDA reconstructed better phylogenetic trees than other methods by optimizing the graph cut. We anticipate that when the diversity of sequences is sufficiently high, NMcutDA can be utilized with high efficiency.


Subjects
Algorithms , Computers , Phylogeny , Cluster Analysis , Protein Databases
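To make the (2n-5)!! count in the abstract above concrete, here is a short sketch that evaluates the double factorial, which gives the number of unrooted binary tree topologies on n labeled sequences. The function name is hypothetical.

```python
def num_unrooted_topologies(n):
    """Number of unrooted binary tree topologies for n labeled leaves: (2n-5)!!"""
    if n < 3:
        raise ValueError("at least three sequences are required")
    count = 1
    # Multiply the odd numbers 2n-5, 2n-7, ..., 3 (the factor 1 is implicit).
    for k in range(2 * n - 5, 1, -2):
        count *= k
    return count


counts = {n: num_unrooted_topologies(n) for n in (4, 5, 10)}
```

The count grows from 3 topologies for four sequences to 2,027,025 for ten, which is why exhaustive search is ruled out even for small datasets.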
11.
Sensors (Basel) ; 22(21), 2022 Oct 31.
Article in English | MEDLINE | ID: mdl-36366045

ABSTRACT

Advances in neural networks have garnered growing interest in applications of machine vision in livestock management, but simpler landmark-based approaches suitable for small, early stage exploratory studies still represent a critical stepping stone towards these more sophisticated analyses. While such approaches are well-validated for calibrated images, the practical limitations of such imaging systems restrict their applicability in working farm environments. The aim of this study was to validate novel algorithmic approaches to improving the reliability of scale-free image biometrics acquired from uncalibrated images of minimally restrained livestock. Using a database of 551 facial images acquired from 108 dairy cows, we demonstrate that, using a simple geometric projection-based approach to metric extraction, a priori knowledge may be leveraged to produce more intuitive and reliable morphometric measurements than conventional informationally complete Euclidean distance matrix analysis. Where uncontrolled variations in image annotation, camera position, and animal pose could not be fully controlled through the design of morphometrics, we further demonstrate how modern unsupervised machine learning tools may be used to leverage the systematic error structures created by such lurking variables in order to generate bias correction terms that may subsequently be used to improve the reliability of downstream statistical analyses and dimension reduction.


Subjects
Livestock , Unsupervised Machine Learning , Animals , Female , Cattle , Reproducibility of Results , Neural Networks (Computer) , Mathematics
12.
J Bioinform Comput Biol ; 20(4): 2250012, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35798684

ABSTRACT

The evolutionary histories of genes may differ greatly from each other, which could be explained by evolutionary variations in horizontal gene transfers or biological recombinations. A phylogenetic tree would therefore represent the evolutionary history of each gene, which may present different patterns from the species tree that defines the main evolutionary patterns. In addition, phylogenetic trees of closely related species should be merged, thus minimizing the topological conflicts they present and obtaining consensus trees (in the case of homogeneous data) or supertrees (in the case of heterogeneous data). The traditional approaches are consensus tree inference (if the set of trees contains the same set of species) or supertrees (if the set of trees contains different, but overlapping sets of species). Consensus trees and supertrees are constructed to produce unique trees. However, these methods lose precision with respect to different evolutionary variability. Other approaches have been implemented to preserve this variability using the k-means algorithm or the k-medoids algorithm. Using a new method, we determine all possible consensus trees and supertrees that best represent the most significant evolutionary models in a set of phylogenetic trees, thereby increasing the precision of the results and decreasing the time required. Results: This paper presents in detail a new method for predicting the number of clusters in a Robinson and Foulds (RF) distance matrix using a convolutional neural network (CNN). We developed a new CNN approach (called CNNTrees) for multiple tree classification. This new strategy returns a number of clusters of the input phylogenetic trees for different-size sets of trees, which makes the new approach more stable and more robust. 
The paper provides an in-depth analysis of the relevant, but very difficult, problem of constructing alternative supertrees using phylogenies with different but overlapping sets of taxa. This new model will play an important role in the inference of Trees of Life (ToL). Availability and implementation: CNNTrees is available through a web server at https://tahirinadia.github.io/. The source code, data and information about installation procedures are also available at https://github.com/TahiriNadia/CNNTrees. Supplementary information: Supplementary data are available on GitHub platform. The evolutionary history of species is not unique, but is specific to sets of genes. Indeed, each gene has its own evolutionary history that differs considerably from one gene to another. For example, some individual genes or operons may be affected by specific horizontal gene transfer and recombination events. Thus, the evolutionary history of each gene must be represented by its own phylogenetic tree, which may exhibit different evolutionary patterns than the species tree that accounts for the major vertical descent patterns. The result of traditional consensus tree or supertree inference methods is a single consensus tree or supertree. In this paper, we present in detail a new method for predicting the number of clusters in a Robinson and Foulds (RF) distance matrix using a convolutional neural network (CNN). We developed a new CNN approach (CNNTrees) to construct multiple tree classification. This new strategy returns a number of clusters in the order of the input trees, which allows this new approach to be more stable and also more robust.


Subjects
Algorithms , Neural Networks (Computer) , Horizontal Gene Transfer , Phylogeny , Software
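The Robinson and Foulds (RF) distance matrix that serves as input in the abstract above can be illustrated with a small sketch. This is not CNNTrees code: it uses the rooted, clade-based variant of the RF distance on trees written as nested tuples, which is a simplifying assumption (the classical definition counts bipartitions of unrooted trees), and all names and trees are hypothetical.

```python
def clades(tree):
    """Return (leaf_set, clade_set) for a rooted tree given as nested tuples."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)  # the clade rooted at this internal node
    return leaves, found


def rf_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must share the same leaf set")
    return len(clades_a ^ clades_b)


# Hypothetical gene trees on four species.
t1 = (("A", "B"), ("C", "D"))
t2 = (("A", "C"), ("B", "D"))
t3 = ((("A", "B"), "C"), "D")
trees = [t1, t2, t3]
rf_matrix = [[rf_distance(a, b) for b in trees] for a in trees]
```

A clustering method would then group trees whose pairwise RF distances are small; estimating how many such groups exist is the question the abstract addresses.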
13.
Genes (Basel) ; 13(6), 2022 May 25.
Article in English | MEDLINE | ID: mdl-35741702

ABSTRACT

Recently, we have seen a growing volume of evidence linking the microbiome and human diseases or clinical outcomes, as well as evidence linking the microbiome and environmental exposures. Now comes the time to assess whether the microbiome mediates the effects of exposures on the outcomes, which will enable researchers to develop interventions to modulate outcomes by modifying microbiome compositions. Use of distance matrices is a popular approach to analyzing complex microbiome data that are high-dimensional, sparse, and compositional. However, the existing distance-based methods for mediation analysis of microbiome data, MedTest and MODIMA, only work well in limited scenarios. PERMANOVA is currently the most commonly used distance-based method for testing microbiome associations. Using the idea of inverse regression, here we extend PERMANOVA to test microbiome-mediation effects by including both the exposure and the outcome as covariates and basing the test on the product of their F statistics. This extension of PERMANOVA, which we call PERMANOVA-med, naturally inherits all the flexible features of PERMANOVA, e.g., allowing adjustment of confounders, accommodating continuous, binary, and multivariate exposure and outcome variables including survival outcomes, and providing an omnibus test that combines the results from analyzing multiple distance matrices. Our extensive simulations indicated that PERMANOVA-med always controlled the type I error and had compelling power over MedTest and MODIMA. Frequently, MedTest had diminished power and MODIMA had inflated type I error. Using real data on melanoma immunotherapy response, we demonstrated the wide applicability of PERMANOVA-med through 16 different mediation analyses, only 6 of which could be performed by MedTest and 4 by MODIMA.


Subjects
Microbiota , Environmental Exposure , Humans , Research Design
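The F statistic that PERMANOVA derives from a distance matrix can be sketched as follows. This is the basic one-factor pseudo-F only; it does not implement PERMANOVA-med, covariate adjustment, or the permutation step that turns the statistic into a p-value. The function name and the toy data are hypothetical.

```python
def pseudo_f(dist, groups):
    """One-factor PERMANOVA pseudo-F from a distance matrix and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    # Total sum of squares: squared distances over all pairs, divided by n.
    ss_total = sum(
        dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)
    ) / n
    # Within-group sum of squares: the same quantity computed inside each group.
    ss_within = 0.0
    for label in labels:
        idx = [i for i in range(n) if groups[i] == label]
        ss_within += sum(
            dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:]
        ) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


# Toy data: four samples on a line, two per group (hypothetical values).
points = [0.0, 2.0, 10.0, 12.0]
dist = [[abs(x - y) for y in points] for x in points]
groups = ["exposed", "exposed", "control", "control"]
f_stat = pseudo_f(dist, groups)
```

With Euclidean distances on one variable, this reduces to the ordinary ANOVA F statistic, which is a convenient sanity check before applying it to ecological distances.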
14.
Brief Bioinform ; 23(1), 2022 Jan 17.
Article in English | MEDLINE | ID: mdl-34553226

ABSTRACT

The development of single-cell ribonucleic acid (RNA) sequencing (scRNA-seq) technology has led to great opportunities for the identification of heterogeneous cell types in complex tissues. Clustering algorithms are of great importance to effectively identify different cell types. In addition, the definition of the distance between each two cells is a critical step for most clustering algorithms. In this study, we found that different distance measures have considerably different effects on clustering algorithms. Moreover, there is no specific distance measure that is applicable to all datasets. In this study, we introduce a new single-cell clustering method called SD-h, which generates an applicable distance measure for different kinds of datasets by optimally synthesizing commonly used distance measures. Then, hierarchical clustering is performed based on the new distance measure for more accurate cell-type clustering. SD-h was tested on nine frequently used scRNA-seq datasets and it showed great superiority over almost all the compared leading single-cell clustering algorithms.


Subjects
Algorithms , RNA , Cluster Analysis , Consensus , RNA Sequence Analysis/methods
15.
Rural Remote Health ; 21(4): 6128, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34598653

ABSTRACT

INTRODUCTION: Colombia's rural areas have suffered from government neglect, exacerbating their difficulties in relation to geographic isolation and meeting basic needs. These were some of the main reasons for guerrillas to initiate an armed conflict in the 1960s, trying to expand their forces and control through the rural and remote areas of the country. In this sense, it is necessary to construct new categories of rurality in Colombia, considering the armed conflict and the typology of isolation as variables that help policymakers and planners to make better decisions. METHODS: Based on 27 municipalities in the department of Caldas in Colombia, three accessibility measures were assessed to define isolated areas: geographical accessibility, access to health facilities and access to higher education. Health facilities were measured in three scenarios according to the flow of health care defined by the government. Higher education scenarios were defined according to Ministry of Education levels. Travel time was used as the attribute for calculating the isolation index of municipalities and was obtained from the Google Distance Matrix API using Python v3.7. As a measure of accessibility, a travel time limit was defined to delimit isolated areas. This variable was then added to the categories of rurality and armed conflict to produce the isolation typology by municipality. RESULTS: A strong correlation was found between all variables. Considering geographical accessibility, 20.3% of Caldas' population is isolated. The isolated population rises from 12.2% at the first level of health care to 43.2% of the population at the third level, and 39.5% of the inhabitants are far from universities. The municipalities highly affected by the armed conflict are more isolated in terms of travel time to health care and higher education facilities than those that were not affected. 
CONCLUSION: The isolation typology complements the Colombian rurality categories and can help governments make decisions about investments in road infrastructure, health, and education. In addition, some non-rural municipalities were found to be isolated, showing low accessibility to health and higher education, and the government should pay more attention to these areas. The government's neglect of municipalities highly affected by the armed conflict is shown by their continued isolation rates. The government should invest more and better in these areas taking into account this method of decision making. The typology of isolation could help the government to better plan care pathways for patients with complex health needs. In addition, it could help determine the investment for upgrading an existing hospital or building a new one, taking into account underserved areas. In terms of higher education, the isolation typology could help to understand where the community is underserved and initiate investment policies to improve access to higher education for its population.


Subjects
Delivery of Health Care , Rural Population , Armed Conflicts , Colombia , Health Services Accessibility , Humans , Time Factors
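The travel-time lookup described in the abstract above can be sketched without making a network call. The endpoint and the `rows` / `elements` / `duration` response layout follow the public Google Distance Matrix web service; the helper names, the API key placeholder, the place names, and the sample response values are all hypothetical, and the authors' actual script is not shown in the source.

```python
import json
from urllib.parse import urlencode

ENDPOINT = "https://maps.googleapis.com/maps/api/distancematrix/json"


def build_request_url(origins, destinations, api_key, mode="driving"):
    """Build a Distance Matrix request URL; multiple places are pipe-separated."""
    params = {
        "origins": "|".join(origins),
        "destinations": "|".join(destinations),
        "mode": mode,
        "key": api_key,
    }
    return f"{ENDPOINT}?{urlencode(params)}"


def travel_minutes(response_text):
    """Extract travel times in minutes; None where no route was found."""
    data = json.loads(response_text)
    return [
        [
            element["duration"]["value"] / 60  # the API reports seconds
            if element.get("status") == "OK"
            else None
            for element in row["elements"]
        ]
        for row in data["rows"]
    ]


url = build_request_url(
    ["Municipality A, Caldas, Colombia"],  # placeholder origin
    ["Hospital B, Caldas, Colombia", "University C, Caldas, Colombia"],
    api_key="YOUR_API_KEY",  # placeholder, not a real key
)

# Hand-written sample shaped like a real response (values are invented).
sample_response = json.dumps({
    "status": "OK",
    "rows": [{"elements": [
        {"status": "OK", "duration": {"value": 3600, "text": "1 hour"}},
        {"status": "ZERO_RESULTS"},
    ]}],
})
minutes = travel_minutes(sample_response)
```

Comparing each value in `minutes` against a chosen travel-time limit is then enough to flag a municipality as isolated for a given service.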
16.
Adv Protein Chem Struct Biol ; 127: 217-248, 2021.
Article in English | MEDLINE | ID: mdl-34340768

ABSTRACT

Protein structure characterization is fundamental to understand protein properties, such as folding process and protein resistance to thermal stress, up to unveiling organism pathologies (e.g., prion disease). In this chapter, we provide an overview on how the spectral properties of the networks reconstructed from the Protein Contact Map (PCM) can be used to generate informative observables. As a specific case study, we apply two different network approaches to an example protein dataset, for the aim of discriminating protein folding state, and for the reconstruction of protein 3D structure.


Subjects
Protein Databases , Protein Folding , Protein Interaction Maps , Proteins/chemistry , Proteins/metabolism , Animals , Humans , Protein Domains , Protein Stability
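A Protein Contact Map of the kind discussed in this chapter abstract is obtained by thresholding a residue-residue distance matrix. The sketch below assumes one coordinate per residue and an 8 Å cutoff, which is a common choice but not one stated in the abstract; the function names and coordinates are hypothetical, and the spectral analysis itself (eigenvalues of the resulting network) is left out because it would need a linear algebra library.

```python
import math


def contact_map(coords, threshold=8.0):
    """Binary adjacency matrix: 1 where two residues lie within the threshold."""
    n = len(coords)
    return [
        [
            1 if i != j and math.dist(coords[i], coords[j]) <= threshold else 0
            for j in range(n)
        ]
        for i in range(n)
    ]


def node_degrees(adjacency):
    """Number of contacts per residue, the simplest network observable."""
    return [sum(row) for row in adjacency]


# Hypothetical residue coordinates in angstroms.
residues = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
pcm = contact_map(residues)
degrees = node_degrees(pcm)
```

The resulting matrix is the adjacency matrix of the contact network, so graph descriptors such as degree or the spectrum can be computed from it directly.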
17.
Neuroimage Clin ; 31: 102715, 2021.
Article in English | MEDLINE | ID: mdl-34130192

ABSTRACT

Pinpointing brain dysconnectivity in idiopathic rapid eye movement sleep behaviour disorder (iRBD) can help prevent conversion from the prodromal phase to Parkinson's disease (PD). Recent neuroimaging investigations have reported disrupted brain white matter connectivity in both iRBD and PD. However, the intrinsic process by which the human brain structural network evolves from iRBD to PD remains largely unknown. To address this issue, 151 participants including iRBD, PD and age-matched normal controls were recruited to receive diffusion MRI scans and neuropsychological examinations. A connectome-wide association analysis was performed to detect reorganization of the brain structural network along with PD progression. Eight brain seed regions in both cortical and subcortical areas demonstrated significant structural pattern changes along with the progression of PD. Applying machine learning to the key connectivity related to these seed regions yielded better classification accuracy than the conventional network-based statistic. Our study shows that connectome-wide association analysis reveals the underlying structural connectivity patterns related to the progression of PD and provides a promising capability to predict prodromal PD patients.


Subjects
Connectome , Parkinson Disease , REM Sleep Behavior Disorder , Brain/diagnostic imaging , Humans , Magnetic Resonance Imaging , Parkinson Disease/diagnostic imaging
18.
Methods Mol Biol ; 2242: 77-89, 2021.
Article in English | MEDLINE | ID: mdl-33961219

ABSTRACT

By tracking pathogen outbreaks using whole genome sequencing, medical microbiology is currently being transformed into genomic epidemiology. This change in technology is leading to the rapid accumulation of large samples of closely related genome sequences. Summarizing such samples into phylogenies can be computationally challenging. Our program andi quickly computes accurate pairwise distances between up to thousands of bacterial genomes. Working under the UNIX command line, we show how andi can be used to transform genomes to phylogenies with support values ready to be printed or integrated into documents.


Subjects
DNA, Bacterial/genetics , Escherichia coli/genetics , Genome, Bacterial , Genomics , Phylogeny , Shigella/genetics , Databases, Genetic , Research Design , Software Design , Workflow
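
To make "pairwise distances between genomes" concrete, here is a minimal sketch that computes a Jukes-Cantor corrected distance matrix. It is not andi's algorithm: it assumes the sequences are already aligned and of equal length, whereas andi works on unaligned genomes. The function names and toy sequences are hypothetical.

```python
import math

def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance between two pre-aligned DNA sequences.

    Columns containing anything other than A, C, G, T are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable positions")
    # Proportion of mismatching positions (the p-distance).
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def distance_matrix(seqs):
    """Return (names, matrix) for a dict mapping name -> aligned sequence."""
    names = list(seqs)
    matrix = [[jukes_cantor_distance(seqs[a], seqs[b]) for b in names]
              for a in names]
    return names, matrix

# Hypothetical toy alignment.
names, dm = distance_matrix({
    "g1": "ACGTACGT",
    "g2": "ACGTACGA",
    "g3": "ACGTACGT",
})
```

A matrix like `dm` is the kind of input that distance-based tree-building methods such as neighbor joining consume.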
19.
Heliyon ; 7(2): e06199, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33644472

ABSTRACT

High-dimensional data are pervasive in the big data era. To avoid the curse of dimensionality, various dimensionality reduction (DR) algorithms have been proposed. To facilitate systematic comparison and assessment of DR quality, this paper reviews related metrics and develops an open-source Python package, pyDRMetrics. Supported metrics include reconstruction error, distance matrix, residual variance, ranking matrix, co-ranking matrix, trustworthiness, continuity, co-k-nearest neighbor size, LCMC (local continuity meta criterion), and rank-based local/global properties. pyDRMetrics provides a native Python class and a web-oriented API. A case study of mass spectra is conducted to demonstrate the package functions. A web GUI wrapper is also published to support user-friendly browser/server (B/S) applications.
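
One of the metrics listed above, trustworthiness, can be sketched in a few lines. The code below follows the standard rank-based definition and does not use or reproduce the pyDRMetrics API; the function names and toy data are assumptions for illustration only.

```python
import math

def _neighbour_ranks(points):
    """ranks[i][j] = rank of j among the neighbours of i (1 = nearest)."""
    n = len(points)
    ranks = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        row = [0] * n
        for rank, j in enumerate(order, start=1):
            row[j] = rank
        ranks.append(row)
    return ranks

def trustworthiness(high, low, k):
    """Penalise points that are k-neighbours in `low` but not in `high`.

    Returns a value in [0, 1]; 1 means no false neighbours were introduced.
    The normalisation used here requires 0 < k < n / 2.
    """
    n = len(high)
    if len(low) != n or not 0 < k < n / 2:
        raise ValueError("need equal-sized point sets and 0 < k < n / 2")
    r_high, r_low = _neighbour_ranks(high), _neighbour_ranks(low)
    penalty = 0
    for i in range(n):
        for j in range(n):
            if j != i and r_low[i][j] <= k and r_high[i][j] > k:
                penalty += r_high[i][j] - k
    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))

# Hypothetical data: six points on a line and a scrambled "embedding".
high = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
scrambled = [(0.0,), (5.0,), (1.0,), (4.0,), (2.0,), (3.0,)]
t_same = trustworthiness(high, high, k=1)
t_bad = trustworthiness(high, scrambled, k=1)
```

Continuity is the mirror-image quantity: calling `trustworthiness(low, high, k)` with the roles swapped penalises true neighbours that the embedding lost.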

20.
F1000Res ; 10: 1260, 2021.
Article in English | MEDLINE | ID: mdl-36204675

ABSTRACT

A Molecular Features Set (MFS) is the output of any of a vast diversity of bioinformatics pipelines. The lack of a "gold standard" for most experimental data modalities makes it difficult to provide a valid estimate of a particular MFS's quality. Yet this goal can be partially achieved by analyzing inner-sample Distance Matrices (DM) and their power to distinguish between phenotypes. The quality of a DM can be assessed by summarizing its power to quantify the differences between inner-phenotype and outer-phenotype distances. This estimate of DM quality can be construed as a measure of the MFS's quality. Here we propose Hobotnica, an approach to estimate the quality of MFSs by their ability to stratify data, and to assign them significance scores that allow for collating various signatures and comparing their quality for contrasting groups.


Subjects
Computational Biology , Phenotype
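
The general idea of scoring a distance matrix by how well it separates phenotypes can be sketched as follows. This is not the published Hobotnica measure: it is a simple stand-in that estimates the probability that a between-group distance exceeds a within-group distance, with a label-permutation p-value. All names and the toy matrix are hypothetical.

```python
import random

def separation_score(dist, labels):
    """Fraction of (between, within) distance pairs where between > within.

    Ties count as one half. 1.0 means perfect separation; about 0.5 means
    the matrix carries no group signal.
    """
    n = len(labels)
    within, between = [], []
    for i in range(n):
        for j in range(i + 1, n):
            bucket = within if labels[i] == labels[j] else between
            bucket.append(dist[i][j])
    if not within or not between:
        raise ValueError("need both within-group and between-group pairs")
    wins = sum((b > w) + 0.5 * (b == w) for b in between for w in within)
    return wins / (len(within) * len(between))

def permutation_p_value(dist, labels, n_perm=200, seed=0):
    """Share of label shufflings that score at least as high as observed."""
    rng = random.Random(seed)
    observed = separation_score(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if separation_score(dist, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (1 + hits) / (1 + n_perm)

# Hypothetical 4-sample matrix: samples 0-1 and 2-3 form tight groups.
dist = [
    [0, 1, 5, 5],
    [1, 0, 5, 5],
    [5, 5, 0, 1],
    [5, 5, 1, 0],
]
labels = ["A", "A", "B", "B"]
score = separation_score(dist, labels)
p_value = permutation_p_value(dist, labels)
```

With only four samples the permutation p-value cannot become small, so a realistic comparison of feature sets would need larger groups.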