1.
J Chem Inf Model ; 64(9): 3826-3840, 2024 May 13.
Article in English | MEDLINE | ID: mdl-38696451

ABSTRACT

Recent advances in computational methods provide the promise of dramatically accelerating drug discovery. While mathematical modeling and machine learning have become vital in predicting drug-target interactions and properties, there is untapped potential in computational drug discovery due to the vast and complex chemical space. This paper builds on our recently published computational fragment-based drug discovery (FBDD) method called fragment databases from screened ligand drug discovery (FDSL-DD). FDSL-DD uses in silico screening to identify ligands from a vast library, fragmenting them while attaching specific attributes based on predicted binding affinity and interaction with the target subdomain. In this paper, we further propose a two-stage optimization method that utilizes the information from prescreening to optimize computational ligand synthesis. We hypothesize that using prescreening information for optimization shrinks the search space and focuses on promising regions, thereby improving the optimization for candidate ligands. The first optimization stage assembles these fragments into larger compounds using genetic algorithms, followed by a second stage of iterative refinement to produce compounds with enhanced bioactivity. To show broad applicability, we apply the methodology to three diverse protein targets found in human solid cancers, bacterial antimicrobial resistance, and the SARS-CoV-2 virus. Combined, the proposed FDSL-DD and two-stage optimization approach yield high-affinity ligand candidates more efficiently than other state-of-the-art computational FBDD methods. We further show that a multiobjective optimization method accounting for drug-likeness can still produce potential candidate ligands with a high binding affinity.
Overall, the results demonstrate that integrating detailed chemical information with a constrained search framework can markedly improve the initial drug discovery process, offering a more precise and efficient route to developing new therapeutics.
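
The two-stage idea in this abstract (a genetic algorithm that assembles fragments, followed by iterative refinement) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the fragment names, the per-fragment scores, and the additive `score` function stand in for prescreening data and a docking program, and none of it is the authors' implementation.

```python
import random

# Hypothetical prescreening scores per fragment (more negative = better binding).
FRAGMENT_SCORES = {
    "frag_a": -1.2, "frag_b": -0.4, "frag_c": -2.1, "frag_d": -0.9, "frag_e": -1.7,
}

def score(candidate):
    """Toy additive stand-in for a docking score; lower is better."""
    return sum(FRAGMENT_SCORES[f] for f in candidate)

def genetic_stage(rng, size=3, pop_size=8, generations=20):
    """Stage 1: assemble fragments with selection, crossover, and mutation."""
    frags = list(FRAGMENT_SCORES)
    pop = [[rng.choice(frags) for _ in range(size)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score)                  # best (lowest) score first
        parents = pop[:pop_size // 2]        # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, size)
            child = a[:cut] + b[cut:]        # one-point crossover (new list)
            if rng.random() < 0.3:           # occasional point mutation
                child[rng.randrange(size)] = rng.choice(frags)
            children.append(child)
        pop = parents + children
    return min(pop, key=score)

def refine(candidate):
    """Stage 2: greedy single-fragment substitutions until nothing improves."""
    best = list(candidate)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            for f in FRAGMENT_SCORES:
                trial = best[:i] + [f] + best[i + 1:]
                if score(trial) < score(best):
                    best, improved = trial, True
    return best

rng = random.Random(0)
ga_best = genetic_stage(rng)
refined = refine(ga_best)
print(ga_best, score(ga_best))
print(refined, score(refined))
```

Restricting the fragment pool to prescreened entries is what keeps the search space small in this sketch. A real pipeline would replace `score` with docking and chemistry-aware fragment joining.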


Subject(s)
Drug Discovery, Ligands, Drug Discovery/methods, Humans, SARS-CoV-2/metabolism, Algorithms, COVID-19 Drug Treatment, COVID-19/virology
2.
Article in English | MEDLINE | ID: mdl-38822995

ABSTRACT

PURPOSE OF REVIEW: This review aims to explore the interface between artificial intelligence (AI) and chronic pain, seeking to identify areas of focus for enhancing current treatments and yielding novel therapies. RECENT FINDINGS: In the United States, the prevalence of chronic pain is estimated to be upwards of 40%. Its impact extends to increased healthcare costs, reduced economic productivity, and strain on healthcare resources. Addressing this condition is particularly challenging due to its complexity and the significant variability in how patients respond to treatment. Current options often struggle to provide long-term relief, with their benefits rarely outweighing the risks, such as dependency or other side effects. Currently, AI has impacted four key areas of chronic pain treatment and research: (1) predicting outcomes based on clinical information; (2) extracting features from text, specifically clinical notes; (3) modeling 'omic data to identify meaningful patient subgroups with potential for personalized treatments and improved understanding of disease processes; and (4) disentangling complex neuronal signals responsible for pain, which current therapies attempt to modulate. As AI advances, leveraging state-of-the-art architectures will be essential for improving chronic pain treatment. Current efforts aim to extract meaningful representations from complex data, paving the way for personalized medicine. The identification of unique patient subgroups should reveal targets for tailored chronic pain treatments. Moreover, enhancing current treatment approaches is achievable by gaining a more profound understanding of patient physiology and responses. This can be realized by leveraging AI on the increasing volume of data linked to chronic pain.

3.
BMC Genomics ; 24(1): 212, 2023 Apr 24.
Article in English | MEDLINE | ID: mdl-37095444

ABSTRACT

BACKGROUND: Early-onset renal cell carcinoma (eoRCC) is typically associated with pathogenic germline variants (PGVs) in RCC familial syndrome genes. However, most eoRCC patients lack PGVs in familial RCC genes and their genetic risk remains undefined. METHODS: Here, we analyzed biospecimens from 22 eoRCC patients that were seen at our institution for genetic counseling and tested negative for PGVs in RCC familial syndrome genes. RESULTS: Analysis of whole-exome sequencing (WES) data found enrichment of candidate pathogenic germline variants in DNA repair and replication genes, including multiple DNA polymerases. Induction of DNA damage in peripheral blood mononuclear cells (PBMCs) significantly elevated numbers of γH2AX foci, a marker of double-strand breaks, in PBMCs from eoRCC patients versus PBMCs from matched cancer-free controls. Knockdown of candidate variant genes in Caki RCC cells increased γH2AX foci. Immortalized patient-derived B cell lines bearing the candidate variants in DNA polymerase genes (POLD1, POLH, POLE, POLK) had DNA replication defects compared to control cells. Renal tumors carrying these DNA polymerase variants were microsatellite stable but had a high mutational burden. Direct biochemical analysis of the variant Pol δ and Pol η polymerases revealed defective enzymatic activities. CONCLUSIONS: Together, these results suggest that constitutional defects in DNA repair underlie a subset of eoRCC cases. Screening patient lymphocytes to identify these defects may provide insight into mechanisms of carcinogenesis in a subset of genetically undefined eoRCCs. Evaluation of DNA repair defects may also provide insight into the cancer initiation mechanisms for subsets of eoRCCs and lay the foundation for targeting DNA repair vulnerabilities in eoRCC.


Subject(s)
Renal Cell Carcinoma, Kidney Neoplasms, Humans, Genetic Predisposition to Disease, DNA Replication, Germ-Line Mutation, Germ Cells
4.
Bioinformatics ; 38(8): 2344-2347, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35157026

ABSTRACT

MOTIVATION: The analysis of mutational signatures is becoming increasingly common in cancer genetics, with emerging implications in cancer evolution, classification, treatment decision and prognosis. Recently, several packages have been developed for mutational signature analysis, with each using different methodology and yielding significantly different results. Because of the non-trivial differences in tools' refitting results, researchers may desire to survey and compare the available tools, in order to objectively evaluate the results for their specific research question, such as which mutational signatures are prevalent in different cancer types. RESULTS: Due to the need for effective comparison of refitting mutational signatures, we introduce user-friendly software that aggregates and visually presents results from different refitting packages. AVAILABILITY AND IMPLEMENTATION: MetaMutationalSigs is implemented in R and Python, can be installed using Docker, and is available at: https://github.com/EESI/MetaMutationalSigs.


Subject(s)
Neoplasms, Software, Humans, Mutation, Neoplasms/genetics
5.
PLoS Comput Biol ; 17(9): e1009345, 2021 09.
Article in English | MEDLINE | ID: mdl-34550967

ABSTRACT

Recurrent neural networks with memory and attention mechanisms are widely used in natural language processing because they can capture short and long term sequential information for diverse tasks. We propose an integrated deep learning model for microbial DNA sequence data, which exploits convolutional neural networks, recurrent neural networks, and attention mechanisms to predict taxonomic classifications and sample-associated attributes, such as the relationship between the microbiome and host phenotype, on the read/sequence level. In this paper, we develop this novel deep learning approach and evaluate its application to amplicon sequences. We apply our approach to short DNA reads and full sequences of 16S ribosomal RNA (rRNA) marker genes, which identify the heterogeneity of a microbial community sample. We demonstrate that our implementation of a novel attention-based deep network architecture, Read2Pheno, achieves read-level phenotypic prediction. Training Read2Pheno models will encode sequences (reads) into dense, meaningful representations: learned embedded vectors output from the intermediate layer of the network model, which can provide biological insight when visualized. The attention layer of Read2Pheno models can also automatically identify nucleotide regions in reads/sequences which are particularly informative for classification. As such, this novel approach can avoid pre/post-processing and manual interpretation required with conventional approaches to microbiome sequence classification. We further show, as proof-of-concept, that aggregating read-level information can robustly predict microbial community properties, host phenotype, and taxonomic classification, with performance at least comparable to conventional approaches. An implementation of the attention-based deep learning network is available at https://github.com/EESI/sequence_attention (a python package) and https://github.com/EESI/seq2att (a command line tool).


Subject(s)
Deep Learning, Microbiota/genetics, Neural Networks (Computer), 16S Ribosomal RNA/genetics, Algorithms, Computational Biology, Genetic Databases, Gastrointestinal Microbiome/genetics, Host Microbial Interactions/genetics, Humans, Inflammatory Bowel Diseases/microbiology, Natural Language Processing, Phenotype, Prevotella/classification, Prevotella/genetics, Prevotella/isolation & purification, Proof of Concept Study, 16S Ribosomal RNA/classification
6.
PLoS Comput Biol ; 16(9): e1008269, 2020 09.
Article in English | MEDLINE | ID: mdl-32941419

ABSTRACT

We propose an efficient framework for genetic subtyping of SARS-CoV-2, the novel coronavirus that causes the COVID-19 pandemic. Efficient viral subtyping enables visualization and modeling of the geographic distribution and temporal dynamics of disease spread. Subtyping thereby advances the development of effective containment strategies and, potentially, therapeutic and vaccine strategies. However, identifying viral subtypes in real-time is challenging: SARS-CoV-2 is a novel virus, and the pandemic is rapidly expanding. Viral subtypes may be difficult to detect due to rapid evolution; founder effects are more significant than selection pressure; and the clustering threshold for subtyping is not standardized. We propose to identify mutational signatures of available SARS-CoV-2 sequences using a population-based approach: an entropy measure followed by frequency analysis. These signatures, Informative Subtype Markers (ISMs), define a compact set of nucleotide sites that characterize the most variable (and thus most informative) positions in the viral genomes sequenced from different individuals. Through ISM compression, we find that certain distant nucleotide variants covary, including non-coding and ORF1ab sites covarying with the D614G spike protein mutation which has become increasingly prevalent as the pandemic has spread. ISMs are also useful for downstream analyses, such as spatiotemporal visualization of viral dynamics. By analyzing sequence data available in the GISAID database, we validate the utility of ISM-based subtyping by comparing spatiotemporal analyses using ISMs to epidemiological studies of viral transmission in Asia, Europe, and the United States. In addition, we show the relationship of ISMs to phylogenetic reconstructions of SARS-CoV-2 evolution, and therefore, ISMs can play an important complementary role to phylogenetic tree-based analysis, such as is done in the Nextstrain project. 
The developed pipeline dynamically generates ISMs for newly added SARS-CoV-2 sequences and updates the visualization of pandemic spatiotemporal dynamics, and is available on GitHub at https://github.com/EESI/ISM (Jupyter notebook), https://github.com/EESI/ncov_ism (command line tool) and via an interactive website at https://covid19-ism.coe.drexel.edu/.
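
The "entropy measure followed by frequency analysis" described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the toy alignment, the `threshold` value, and the function names are hypothetical, and the published pipeline handles details (for example ambiguous bases and gaps) that are ignored here.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (bits) of the symbols observed at one alignment position."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def informative_sites(aligned, threshold):
    """Indices of alignment columns whose entropy exceeds the threshold."""
    length = len(aligned[0])
    return [i for i in range(length)
            if column_entropy([seq[i] for seq in aligned]) > threshold]

def ism(seq, sites):
    """Compact marker: the nucleotides of one sequence at the informative sites."""
    return "".join(seq[i] for i in sites)

# Toy alignment (hypothetical); real input would be aligned viral genomes.
aligned = ["ACGTA", "ACGTA", "ATGTC", "ATGTA"]
sites = informative_sites(aligned, threshold=0.5)
ism_counts = Counter(ism(seq, sites) for seq in aligned)  # frequency analysis
print(sites, ism_counts.most_common())
```

Counting the resulting markers per region and per collection date is one simple way such subtypes could feed a spatiotemporal view.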


Subject(s)
Betacoronavirus/classification, Betacoronavirus/genetics, Coronavirus Infections, Genomics/methods, Pandemics, Viral Pneumonia, COVID-19, Coronavirus Infections/epidemiology, Coronavirus Infections/transmission, Coronavirus Infections/virology, Molecular Evolution, Genetic Markers/genetics, Viral Genome/genetics, Humans, Mutation/genetics, Phylogeny, Viral Pneumonia/epidemiology, Viral Pneumonia/transmission, Viral Pneumonia/virology, Viral RNA/genetics, SARS-CoV-2, Sequence Alignment, RNA Sequence Analysis, Spatio-Temporal Analysis
7.
BMC Bioinformatics ; 21(1): 412, 2020 Sep 21.
Article in English | MEDLINE | ID: mdl-32957925

ABSTRACT

BACKGROUND: It is a computational challenge for current metagenomic classifiers to keep up with the pace of training data generated from genome sequencing projects, such as the exponentially-growing NCBI RefSeq bacterial genome database. When new reference sequences are added to training data, statically trained classifiers must be rerun on all data, resulting in a highly inefficient process. The rich literature of "incremental learning" addresses the need to update an existing classifier to accommodate new data without sacrificing much accuracy compared to retraining the classifier with all data. RESULTS: We demonstrate how classification improves over time by incrementally training a classifier on progressive RefSeq snapshots and testing it on: (a) all known current genomes (as a ground truth set) and (b) a real experimental metagenomic gut sample. We demonstrate that as a classifier model's knowledge of genomes grows, classification accuracy increases. The proof-of-concept naïve Bayes implementation, when updated yearly, now runs in one quarter of the non-incremental time with no accuracy loss. CONCLUSIONS: It is evident that classification improves by having the most current knowledge at its disposal. Therefore, it is of utmost importance to make classifiers computationally tractable to keep up with the data deluge. The incremental learning classifier can be efficiently updated without the cost of reprocessing or the need to access the existing database, and therefore saves storage as well as computational resources.
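
Why a naïve Bayes classifier lends itself to incremental updates can be shown with a small sketch: the model is just per-class k-mer counts, so new genomes only add to the counts and earlier training data never needs to be reread. The class name, `k=3`, Laplace smoothing, and the uniform class prior are assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import Counter, defaultdict

class IncrementalKmerNB:
    """Toy k-mer naive Bayes whose state is only per-class k-mer counts."""

    def __init__(self, k=3):
        self.k = k
        self.counts = defaultdict(Counter)   # label -> k-mer counts

    def _kmers(self, seq):
        return [seq[i:i + self.k] for i in range(len(seq) - self.k + 1)]

    def update(self, label, seq):
        """Incremental step: add counts from a new genome; old data is untouched."""
        self.counts[label].update(self._kmers(seq))

    def classify(self, read):
        """Return the label with the highest smoothed log-likelihood (uniform prior)."""
        vocab = 4 ** self.k
        best_label, best_score = None, -math.inf
        for label, kmer_counts in self.counts.items():
            total = sum(kmer_counts.values())
            log_lik = sum(math.log((kmer_counts[m] + 1) / (total + vocab))
                          for m in self._kmers(read))
            if log_lik > best_score:
                best_label, best_score = label, log_lik
        return best_label

model = IncrementalKmerNB(k=3)
model.update("genus_A", "AAAAAAAAAA")        # first database snapshot
model.update("genus_B", "CCCCCCCCCC")
before = model.classify("GGGGG")             # genus_C is not known yet
model.update("genus_C", "GGGGGGGGGG")        # later snapshot: only new data is processed
after = model.classify("GGGGG")
print(before, after)
```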


Subject(s)
Gastrointestinal Microbiome/genetics, Bacterial Genome, Machine Learning, Metagenomics/methods, Algorithms, Bacteria/genetics, Bayes Theorem, Humans, Metagenome, DNA Sequence Analysis/methods
8.
PLoS Comput Biol ; 15(2): e1006721, 2019 02.
Article in English | MEDLINE | ID: mdl-30807567

ABSTRACT

Advances in high-throughput sequencing have increased the availability of microbiome sequencing data that can be exploited to characterize microbiome community structure in situ. We explore using word and sentence embedding approaches for nucleotide sequences since they may be a suitable numerical representation for downstream machine learning applications (especially deep learning). This work involves first encoding ("embedding") each sequence into a dense, low-dimensional, numeric vector space. Here, we use Skip-Gram word2vec to embed k-mers, obtained from 16S rRNA amplicon surveys, and then leverage an existing sentence embedding technique to embed all sequences belonging to specific body sites or samples. We demonstrate that these representations are meaningful, and hence the embedding space can be exploited as a form of feature extraction for exploratory analysis. We show that sequence embeddings preserve relevant information about the sequencing data such as k-mer context, sequence taxonomy, and sample class. Specifically, the sequence embedding space resolved differences among phyla, as well as differences among genera within the same family. Distances between sequence embeddings had similar qualities to distances between alignment identities, and embedding multiple sequences can be thought of as generating a consensus sequence. In addition, embeddings are versatile features that can be used for many downstream tasks, such as taxonomic and sample classification. Using sample embeddings for body site classification resulted in negligible performance loss compared to using OTU abundance data, and clustering embeddings yielded high fidelity species clusters. Lastly, the k-mer embedding space captured distinct k-mer profiles that mapped to specific regions of the 16S rRNA gene and corresponded with particular body sites. 
Together, our results show that embedding sequences results in meaningful representations that can be used for exploratory analyses or for downstream machine learning applications that require numeric data. Moreover, because the embeddings are trained in an unsupervised manner, unlabeled data can be embedded and used to bolster supervised machine learning tasks.
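
The two steps named in this abstract, splitting sequences into k-mers and combining k-mer vectors into one sequence vector, can be sketched as below. The k-mer vectors and frequencies are hypothetical toy values (in the paper they come from Skip-Gram word2vec training), and the frequency-weighted average is only one simple stand-in for the sentence embedding step, which the abstract does not name.

```python
def kmers(seq, k):
    """Overlapping k-mers, treated as the 'words' of a sequence 'sentence'."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def sequence_embedding(seq, kmer_vectors, kmer_freq, k, a=1e-3):
    """Weighted mean of k-mer vectors; frequent k-mers are down-weighted."""
    dim = len(next(iter(kmer_vectors.values())))
    total = [0.0] * dim
    used = 0
    for km in kmers(seq, k):
        if km not in kmer_vectors:           # skip k-mers without a trained vector
            continue
        weight = a / (a + kmer_freq.get(km, 0.0))
        for j, value in enumerate(kmer_vectors[km]):
            total[j] += weight * value
        used += 1
    return [t / used for t in total] if used else total

# Hypothetical 2-dimensional k-mer vectors and corpus frequencies.
toy_vectors = {"ACG": [1.0, 0.0], "CGT": [0.0, 1.0]}
toy_freq = {"ACG": 0.0, "CGT": 0.0}
embedding = sequence_embedding("ACGT", toy_vectors, toy_freq, k=3)
print(kmers("ACGTAC", 4), embedding)
```

A sample-level vector could be formed the same way by pooling the k-mers of every sequence in the sample.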


Subject(s)
16S Ribosomal RNA/genetics, 16S Ribosomal RNA/physiology, RNA Sequence Analysis/methods, Algorithms, Cluster Analysis, Computational Biology/methods, High-Throughput Nucleotide Sequencing/methods, Microbiota/genetics
9.
J Circadian Rhythms ; 18: 6, 2020 Oct 21.
Article in English | MEDLINE | ID: mdl-33133210

ABSTRACT

BACKGROUND: Circadian misalignment can impair healthcare shift workers' physical and mental health, resulting in sleep deprivation, obesity, and chronic disease. This multidisciplinary research team assessed eating patterns and sleep/physical activity of healthcare workers on three different shifts (day, night, and rotating-shift). To date, no study of real-world shift workers' daily eating and sleep has used largely objective measurement. METHOD: During this fourteen-day observational study, participants wore two devices (Actiwatch and Bite Technologies counter) to measure physical activity, sleep, light exposure, and eating time. Participants also reported food intake via food diaries on personal mobile devices. RESULTS: In fourteen (5 day-, 5 night-, and 4 rotating-shift) participants, no baseline difference in BMI was observed. Overall, rotating-shift workers consumed fewer calories and had less activity and sleep than day- and night-shift workers. For eating patterns, compared to night- and rotating-shift, day-shift workers ate more frequently during work days. Night workers, however, consumed more calories at work relative to day and rotating workers. For physical activity and sleep, night-shift workers had the highest activity and least sleep on work days. CONCLUSION: This pilot study used primarily objective measurement to examine shift workers' habits outside the laboratory. Although no association between BMI and eating patterns/activity/sleep was observed across groups, a small, homogeneous sample may have influenced this. Overall, shift work was associated with 1) increased calorie intake and higher-fat and -carbohydrate diets and 2) sleep deprivation. Future studies that objectively measure shift workers' real-world habits should include a larger, more diverse sample.

11.
Nucleic Acids Res ; 42(Database issue): D625-32, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24198250

ABSTRACT

POGO-DB (http://pogo.ece.drexel.edu/) provides an easy platform for comparative microbial genomics. POGO-DB allows users to compare genomes using pre-computed metrics that were derived from extensive computationally intensive BLAST comparisons of >2000 microbes. These metrics include (i) average protein sequence identity across all orthologs shared by two genomes, (ii) genomic fluidity (a measure of gene content dissimilarity), (iii) number of 'orthologs' shared between two genomes, (iv) pairwise identity of the 16S ribosomal RNA genes and (v) pairwise identity of an additional 73 marker genes present in >90% prokaryotes. Users can visualize these metrics against each other in a 2D plot for exploratory analysis of genome similarity and of how different aspects of genome similarity relate to each other. The results of these comparisons are fully downloadable. In addition, users can download raw BLAST results for all or user-selected comparisons. Therefore, we provide users with full flexibility to carry out their own downstream analyses, by creating easy access to data that would normally require heavy computational resources to generate. POGO-DB should prove highly useful for researchers interested in comparative microbiology and benefit the microbiome/metagenomic communities by providing the information needed to select suitable phylogenetic marker genes within particular lineages.
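
As an illustration of metric (ii), genomic fluidity for a pair of genomes is commonly defined as the number of gene families unique to either genome divided by the total number of gene families in both. The sketch below assumes each genome has already been reduced to a set of gene-family identifiers; the identifiers and the function name are hypothetical, and POGO-DB's own values come from its pre-computed BLAST comparisons.

```python
def genomic_fluidity(genes_a, genes_b):
    """Pairwise fluidity: unique gene families / total gene families (0 = identical content)."""
    a, b = set(genes_a), set(genes_b)
    unique = len(a - b) + len(b - a)   # families found in only one genome
    total = len(a) + len(b)            # families counted per genome
    return unique / total if total else 0.0

# Hypothetical gene-family identifiers for two genomes.
genome_1 = {"fam1", "fam2", "fam3"}
genome_2 = {"fam2", "fam3", "fam4"}
print(genomic_fluidity(genome_1, genome_2))
```

Values near 0 indicate nearly identical gene content, and 1 indicates no shared gene families.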


Subject(s)
Genetic Databases, Microbial Genes, Microbial Genome, Genomics, Internet, Phylogeny, 16S Ribosomal RNA/genetics, Protein Sequence Analysis
12.
BMC Bioinformatics ; 16: 358, 2015 Nov 04.
Article in English | MEDLINE | ID: mdl-26538306

ABSTRACT

BACKGROUND: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- and β-diversity. Feature subset selection, a sub-field of machine learning, can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can obtain the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. RESULTS: We have developed a new Python command line tool, which is compatible with the widely adopted BIOM format, for microbial ecologists that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. CONCLUSIONS: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.
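
Information-theoretic feature selection in its simplest form ranks each feature by its mutual information with the phenotype label. The sketch below shows only that basic filter on discretized (presence/absence) OTU data; the OTU names, labels, and function names are hypothetical, and this is not Fizzy's command line interface or its full set of selection criteria.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (bits) between two discrete sequences of equal length."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def rank_features(table, labels, top):
    """Names of the `top` features with the highest mutual information with the labels."""
    scores = {name: mutual_information(values, labels) for name, values in table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top]

# Hypothetical presence/absence table: one list per OTU, one entry per sample.
labels = ["gut", "gut", "skin", "skin"]
table = {
    "otu_1": [1, 1, 0, 0],   # perfectly separates the two body sites
    "otu_2": [1, 0, 1, 0],   # independent of body site
    "otu_3": [1, 1, 1, 0],   # partially informative
}
selected = rank_features(table, labels, top=2)
print(selected)
```

Criteria that also penalize redundancy between selected features would extend this ranking step rather than replace it.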


Subject(s)
Computational Biology/methods, Metagenomics/methods, Software, Algorithms, Genetic Databases, Humans, Microbiota/genetics, Vegetarians
13.
Emerg Infect Dis ; 20(1): 109-13, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24377497

ABSTRACT

A captive juvenile Bornean orangutan (Pongo pygmaeus) died from an unknown disseminated parasitic infection. Deep sequencing of DNA from infected tissues, followed by gene-specific PCR and sequencing, revealed a divergent species within the newly proposed genus Versteria (Cestoda: Taeniidae). Versteria may represent a previously unrecognized risk to primate health.


Subject(s)
Ape Diseases/parasitology, Cestoda/classification, Cestoda/genetics, Cestode Infections/veterinary, Pongo pygmaeus/parasitology, Animals, Ape Diseases/pathology, Helminth Genes, Phylogeny, Ribosomal RNA
14.
J Gen Virol ; 95(Pt 5): 1055-1066, 2014 May.
Article in English | MEDLINE | ID: mdl-24558222

ABSTRACT

A thorough characterization of the genetic diversity of viruses present in vector and vertebrate host populations is essential for the early detection of and response to emerging pathogenic viruses, yet genetic characterization of many important viral groups remains incomplete. The Simbu serogroup of the genus Orthobunyavirus, family Bunyaviridae, is an example. The Simbu serogroup currently consists of a highly diverse group of related arboviruses that infect both humans and economically important livestock species. Here, we report complete genome sequences for 11 viruses within this group, with a focus on the large and poorly characterized Manzanilla and Oropouche species complexes. Phylogenetic and pairwise divergence analyses indicated the presence of high levels of genetic diversity within these two species complexes, on a par with that seen among the five other species complexes in the Simbu serogroup. Based on previously reported divergence thresholds between species, the data suggested that these two complexes should actually be divided into at least five species. Together these five species formed a distinct phylogenetic clade apart from the rest of the Simbu serogroup. Pairwise sequence divergences among viruses of this clade and viruses in other Simbu serogroup species complexes were similar to levels of divergence among the other orthobunyavirus serogroups. The genetic data also suggested relatively high levels of natural reassortment, with three potential reassortment events present, including two well-supported events involving viruses known to infect humans.


Subject(s)
Viral Genome, Orthobunyavirus/classification, Orthobunyavirus/genetics, Phylogeny, Viral RNA/genetics, DNA Sequence Analysis, Cluster Analysis, Genetic Variation, Molecular Sequence Data
15.
J Virol ; 87(6): 3187-95, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23283959

ABSTRACT

Evolutionary insights into the phleboviruses are limited because of an imprecise classification scheme based on partial nucleotide sequences and scattered antigenic relationships. In this report, the serologic and phylogenetic relationships of the Uukuniemi group viruses and their relationships with other recently characterized tick-borne phleboviruses are described using full-length genome sequences. We propose that the viruses currently included in the Uukuniemi virus group be assigned to five different species as follows: Uukuniemi virus, EgAn 1825-61 virus, Fin V707 virus, Chizé virus, and Zaliv Terpenia virus would be classified into the Uukuniemi species; Murre virus, RML-105-105355 virus, and Sunday Canyon virus would be classified into a Murre virus species; and Grand Arbaud virus, Precarious Point virus, and Manawa virus would each be given individual species status. Although limited sequence similarity was detected between current members of the Uukuniemi group and Severe fever with thrombocytopenia syndrome virus (SFTSV) and Heartland virus, a clear serological reaction was observed between some of them, indicating that SFTSV and Heartland virus should be considered part of the Uukuniemi virus group. Moreover, based on the genomic diversity of the phleboviruses and given the low correlation observed between complement fixation titers and genetic distance, we propose a system for classification of the Bunyaviridae based on genetic as well as serological data. Finally, the recent descriptions of SFTSV and Heartland virus also indicate that the public health importance of the Uukuniemi group viruses must be reevaluated.


Subject(s)
Uukuniemi virus/classification, Viral Genome, Genotype, Viral RNA/genetics, DNA Sequence Analysis, Serotyping, Uukuniemi virus/genetics, Uukuniemi virus/immunology
16.
Bioinformatics ; 29(17): 2096-102, 2013 Sep 01.
Article in English | MEDLINE | ID: mdl-23786768

ABSTRACT

MOTIVATION: Many metagenomic studies compare hundreds to thousands of environmental and health-related samples by extracting and sequencing their 16S rRNA amplicons and measuring their similarity using beta-diversity metrics. However, one of the first steps--to classify the operational taxonomic units within the sample--can be a computationally time-consuming task because most methods rely on computing the taxonomic assignment of each individual read out of tens to hundreds of thousands of reads. RESULTS: We introduce Quikr: a QUadratic, K-mer-based, Iterative, Reconstruction method, which computes a vector of taxonomic assignments and their proportions in the sample using an optimization technique motivated from the mathematical theory of compressive sensing. On both simulated and actual biological data, we demonstrate that Quikr typically has less error and is typically orders of magnitude faster than the most commonly used taxonomic assignment technique (the Ribosomal Database Project's Naïve Bayesian Classifier). Furthermore, the technique is shown to be unaffected by the presence of chimeras, thereby allowing for the circumvention of the time-intensive step of chimera filtering. AVAILABILITY: The Quikr computational package (in MATLAB, Octave, Python and C) for the Linux and Mac platforms is available at http://sourceforge.net/projects/quikr/.


Subject(s)
Bacteria/classification, DNA Sequence Analysis/methods, Algorithms, Bacteria/genetics, Bacteria/isolation & purification, Bayes Theorem, Classification/methods, Metagenomics, Microbiota, Phylogeny, 16S Ribosomal RNA/genetics, Software
17.
J Mol Graph Model ; 127: 108669, 2024 03.
Article in English | MEDLINE | ID: mdl-38011826

ABSTRACT

Fragment-based drug design (FBDD) is one major drug discovery method employed in computer-aided drug discovery. Due to its inherent limitations, this process experiences long processing times and limited success rates. Here we present a new Fragment Databases from Screened Ligands Drug Design method (FDSL-DD) that intelligently incorporates information about fragment characteristics into a fragment-based design approach to the drug development process. The initial step of the FDSL-DD is the creation of a fragment database from a library of docked, drug-like ligands for a specific target, which deviates from the traditional in silico FBDD strategy, incorporating structure-based design screening techniques to combine the advantages of both approaches. Three different protein targets have been tested in this study to demonstrate the potential of the created fragment library and FDSL-DD. Utilizing the FDSL-DD led to an increase in binding affinity for each protein target. The most substantial increase was exhibited by the ligand designed for TIPE2, with a 3.6 kcal mol⁻¹ difference between the top ligand from the FDSL-DD and the top ligand from the high throughput virtual screening (HTVS). Using drug-like ligands in the initial HTVS allows for a greater search of chemical space, with higher efficiency in fragment selection, fewer grid boxes, and the potential to identify more interactions.


Subject(s)
Drug Design, Drug Discovery, Ligands, Drug Discovery/methods, High-Throughput Screening Assays, Factual Databases
18.
Brain Behav Immun ; 29: 62-69, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23261776

ABSTRACT

Complex Regional Pain Syndrome (CRPS) is a serious and painful condition involving the peripheral and central nervous systems. Full comprehension of the disorder's pathophysiology remains incomplete, but research implicates the immune system as a contributor to chronic pain. Because of the impact gastrointestinal bacteria have in the development and behavior of the immune system, this study compares the GI microbial communities of 16 participants with CRPS (5 of whom have intestinal discomforts) and 16 healthy controls using 454 sequencing technology. CRPS subjects were found to have significantly less diversity than their healthy counterparts. Statistical analysis of the phylogenetic classifications revealed significantly increased levels of Proteobacteria and decreased levels of Firmicutes in CRPS subjects. Clustering analysis showed significant separation between healthy controls and CRPS subjects. These results support the hypothesis that the GI microbial communities of CRPS participants differ from those of their healthy counterparts. These variations may hold the key to understanding how CRPS develops and provide information that could yield a potential treatment.


Subject(s)
Bacteria/genetics , Complex Regional Pain Syndromes/microbiology , Gastrointestinal Tract/microbiology , Adult , Analysis of Variance , Bacteria/classification , DNA/genetics , DNA/isolation & purification , Female , Genes, Bacterial/genetics , Humans , Male , Middle Aged , RNA, Ribosomal, 16S/biosynthesis , Sequence Analysis, DNA , Young Adult
19.
PeerJ ; 11: e14779, 2023.
Article in English | MEDLINE | ID: mdl-36785708

ABSTRACT

A major challenge for clustering algorithms is to balance the trade-off between homogeneity, i.e., the degree to which an individual cluster includes only related sequences, and completeness, i.e., the degree to which related sequences are kept together in one cluster rather than broken up into multiple clusters. Most algorithms are conservative in grouping sequences with other sequences. Remote homologs may fail to be clustered together and instead form unnecessarily distinct clusters. The resulting clusters have high homogeneity but completeness that is too low. We propose Complet+, a computationally scalable post-processing method to increase the completeness of clusters without an undue cost in homogeneity. Complet+ effectively merges closely related clusters of proteins that have verified structural relationships in the SCOPe classification scheme, improving the completeness of clustering results at little cost to homogeneity. Applying Complet+ to clusters obtained using MMseqs2's clusterupdate achieves an increased V-measure of 0.09 and 0.05 at the SCOPe superfamily and family levels, respectively. Complet+ also creates more biologically representative clusters, as shown by a substantial increase in Adjusted Mutual Information (AMI) and Adjusted Rand Index (ARI) metrics when comparing predicted clusters to biological classifications. Complet+ similarly improves clustering metrics when applied to other methods, such as CD-HIT and linclust. Finally, we show that Complet+ runtime scales linearly with respect to the number of clusters being post-processed on a COG dataset of over 3 million sequences. Code and supplementary information are available on GitHub: https://github.com/EESI/Complet-Plus.
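
To make the homogeneity/completeness trade-off concrete, the sketch below computes both quantities and their harmonic mean (the V-measure) using the standard entropy-based definitions. It is not the Complet+ code: the function names and the six-sequence toy labels are assumptions chosen for illustration, and the "merge" is done by hand only to show how joining two clusters of the same family affects the scores.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given) for two paired label lists."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(truth, predicted):
    """Return (homogeneity, completeness, V-measure)."""
    h_truth, h_pred = entropy(truth), entropy(predicted)
    homogeneity = 1.0 if h_truth == 0 else 1.0 - conditional_entropy(truth, predicted) / h_truth
    completeness = 1.0 if h_pred == 0 else 1.0 - conditional_entropy(predicted, truth) / h_pred
    total = homogeneity + completeness
    v = 0.0 if total == 0 else 2 * homogeneity * completeness / total
    return homogeneity, completeness, v


# Hypothetical toy data: family "A" has four sequences, family "B" has two.
truth = ["A", "A", "A", "A", "B", "B"]
split = [0, 0, 1, 1, 2, 2]    # conservative clustering: family A split in two
merged = [0, 0, 0, 0, 2, 2]   # the same clustering after merging clusters 0 and 1

h_split, c_split, v_split = v_measure(truth, split)
h_merged, c_merged, v_merged = v_measure(truth, merged)
print(f"split:  h={h_split:.3f} c={c_split:.3f} v={v_split:.3f}")
print(f"merged: h={h_merged:.3f} c={c_merged:.3f} v={v_merged:.3f}")
```

In this toy case the split clustering is perfectly homogeneous but incomplete, and merging the two clusters of family A raises completeness without lowering homogeneity, which is the situation a post-processing step of this kind aims for.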


Subject(s)
Algorithms , Proteins , Sequence Alignment , Amino Acid Sequence , Proteins/chemistry , Cluster Analysis
20.
ISME J ; 17(10): 1751-1764, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37558860

ABSTRACT

While genome sequencing has expanded our knowledge of symbiosis, role assignment within multi-species microbiomes remains challenging due to genomic redundancy and the uncertainties of in vivo impacts. Here, we address such questions for a specialized nitrogen (N) recycling microbiome of turtle ants, describing a new genus and species of gut symbiont, Ischyrobacter davidsoniae (Betaproteobacteria: Burkholderiales: Alcaligenaceae), and its in vivo physiological context. A re-analysis of amplicon sequencing data, with precisely assigned Ischyrobacter reads, revealed a seemingly ubiquitous distribution across the turtle ant genus Cephalotes, suggesting ≥50 million years since domestication. Through new genome sequencing, we also show that divergent I. davidsoniae lineages are conserved in their uricolytic and urea-generating capacities. With phylogenetically refined definitions of Ischyrobacter and separately domesticated Burkholderiales symbionts, our FISH microscopy revealed a distinct niche for I. davidsoniae, with dense populations at the anterior ileum. Given its position at the site of host N-waste delivery, in vivo metatranscriptomics and metabolomics further implicate I. davidsoniae within a symbiont-autonomous N-recycling pathway. While encoding much of this pathway, I. davidsoniae expressed only a subset of the requisite steps in mature adult workers, including the penultimate step deriving urea from allantoate. The remaining steps were expressed by other specialized gut symbionts. Collectively, this assemblage converts inosine, made by midgut symbionts, into urea and ammonia in the hindgut. With urea supporting host amino acid budgets and cuticle synthesis, and with the ancient nature of other active N-recyclers discovered here, I. davidsoniae emerges as a central player in a conserved and impactful, multipartite symbiosis.


Subject(s)
Ants , Nitrogen , Animals , Ants/physiology , Phylogeny , Symbiosis/genetics , Urea