1.
BMC Bioinformatics ; 21(1): 235, 2020 Jun 09.
Article in English | MEDLINE | ID: mdl-32517697

ABSTRACT

BACKGROUND: The number of applications of deep learning algorithms in bioinformatics is increasing as they usually achieve superior performance over classical approaches, especially when bigger training datasets are available. In deep learning applications, discrete data, e.g. words or n-grams in language, or amino acids or nucleotides in bioinformatics, are generally represented as a continuous vector through an embedding matrix. Recently, learning this embedding matrix directly from the data as part of the continuous iteration of the model to optimize the target prediction - a process called 'end-to-end learning' - has led to state-of-the-art results in many fields. Although usage of embeddings is well described in the bioinformatics literature, the potential of end-to-end learning for single amino acids, as compared to more classical manually-curated encoding strategies, has not been systematically addressed. To this end, we compared classical encoding matrices, namely one-hot, VHSE8 and BLOSUM62, to end-to-end learning of amino acid embeddings for two different prediction tasks using three widely used architectures, namely recurrent neural networks (RNN), convolutional neural networks (CNN), and the hybrid CNN-RNN. RESULTS: By using different deep learning architectures, we show that end-to-end learning is on par with classical encodings for embeddings of the same dimension even when limited training data is available, and might allow for a reduction in the embedding dimension without performance loss, which is critical when deploying the models to devices with limited computational capacities. We found that the embedding dimension is a major factor in controlling the model performance. Surprisingly, we observed that deep learning models are capable of learning from random vectors of appropriate dimension. CONCLUSION: Our study shows that end-to-end learning is a flexible and powerful method for amino acid encoding. Further, due to the flexibility of deep learning systems, amino acid encoding schemes should be benchmarked against random vectors of the same dimension to disentangle the information content provided by the encoding scheme from the distinguishability effect provided by the scheme.
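
The comparison recommended in this abstract, a fixed encoding scheme against random vectors of the same dimension, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the uniform sampling range, the seed, and the example peptide are all assumptions made for illustration, and only the Python standard library is used.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_table():
    """Map each residue to a 20-dimensional indicator vector."""
    n = len(AMINO_ACIDS)
    return {aa: [1.0 if i == j else 0.0 for j in range(n)]
            for i, aa in enumerate(AMINO_ACIDS)}


def random_table(dim, seed=0):
    """Map each residue to a fixed random vector (hypothetical baseline).

    The vectors carry no biochemical information; they only make the
    residues distinguishable from one another.
    """
    rng = random.Random(seed)
    return {aa: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for aa in AMINO_ACIDS}


def encode(sequence, table):
    """Turn a peptide string into a list of per-residue vectors."""
    return [table[aa] for aa in sequence]


peptide = "ACDK"  # hypothetical example peptide
one_hot = encode(peptide, one_hot_table())
baseline = encode(peptide, random_table(dim=20))

print(len(one_hot), len(one_hot[0]))    # 4 20
print(len(baseline), len(baseline[0]))  # 4 20
```

Both encodings yield inputs of identical shape, so the same network can be trained on each and the scores compared. In an end-to-end setting the lookup table would instead be a trainable parameter of the model, initialized randomly and updated during training.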


Subject(s)
Amino Acids/metabolism, Computational Biology/methods, Deep Learning/standards, Humans
2.
Bioinformatics ; 35(22): 4857-4859, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31225863

ABSTRACT

SUMMARY: Sequencing data resources have increased exponentially in recent years, as has interest in large-scale meta-analyses of integrated next-generation sequencing datasets. However, curation of integrated datasets that match a user's particular research priorities is currently a time-intensive and imprecise task. MetaSeek is a sequencing data discovery tool that enables users to flexibly search and filter on any metadata field to quickly find the sequencing datasets that meet their needs. MetaSeek automatically scrapes metadata from all publicly available datasets in the Sequence Read Archive, cleans and parses messy, user-provided metadata into a structured, standard-compliant database and predicts missing fields where possible. MetaSeek provides a web-based graphical user interface and interactive visualization dashboard, as well as a programmatic API to rapidly search, filter, visualize, save, share and download matching sequencing metadata. AVAILABILITY AND IMPLEMENTATION: The MetaSeek online interface is available at https://www.metaseek.cloud/. The MetaSeek database can also be accessed via API to programmatically search, filter and download all metadata. MetaSeek source code, metadata scrapers and documents are available at https://github.com/MetaSeek-Sequencing-Data-Discovery/metaseek/.


Subject(s)
Metadata, Software, Factual Databases, High-Throughput Nucleotide Sequencing
3.
Environ Microbiol ; 21(2): 557-571, 2019 02.
Article in English | MEDLINE | ID: mdl-30452115

ABSTRACT

The extent to which differences in microbial community structure result in variations in organic matter (OM) degradation is not well understood. Here, we tested the hypothesis that distinct marine microbial communities from North Atlantic surface and bottom waters would exhibit varying compositional succession and functional shifts in response to the same pool of complex high molecular weight organic matter (HMW-OM). We also hypothesized that microbial communities would produce a broader spectrum of enzymes upon exposure to HMW-OM, indicating a greater potential to degrade these compounds than reflected by initial enzymatic activities. Our results show that community succession in amended mesocosms was congruent with cell growth, increased bacterial production and, most notably, with substantial shifts in enzymatic activities. In all amended mesocosms, closely related taxa that were initially rare became dominant at time frames during which a broader spectrum of active enzymes was detected compared to initial timepoints, indicating a similar response among different communities. However, succession on the whole-community level, and the rates, spectra and progression of enzymatic activities, reveal robust differences among distinct communities from discrete water masses. These results underscore the crucial role of rare bacterial taxa in ocean carbon cycling and the importance of bacterial community structure for HMW-OM degradation.


Subject(s)
Bacteria/enzymology, Bacteria/metabolism, Organic Compounds/metabolism, Bacteria/classification, Carbon Cycle/physiology, Microbiota
4.
NPJ Microgravity ; 9(1): 90, 2023 Dec 13.
Article in English | MEDLINE | ID: mdl-38092777

ABSTRACT

The adverse effects of microgravity exposure on mammalian physiology during spaceflight necessitate a deep understanding of the underlying mechanisms to develop effective countermeasures. One such concern is muscle atrophy, which is partly attributed to the dysregulation of calcium levels due to abnormalities in SERCA pump functioning. To identify potential biomarkers for this condition, multi-omics data and physiological data available on the NASA Open Science Data Repository (osdr.nasa.gov) were used, and machine learning methods were employed. Specifically, we used multi-omics (transcriptomic, proteomic, and DNA methylation) data and calcium reuptake data collected from C57BL/6J mouse soleus and tibialis anterior tissues during several 30+ day-long missions on the International Space Station. The QLattice symbolic regression algorithm was introduced to generate highly explainable models that predict either experimental conditions or calcium reuptake levels based on multi-omics features. The list of candidate models established by QLattice was used to identify key features contributing to the predictive capability of these models, with Acyp1 and Rps7 proteins found to be the most predictive biomarkers related to the resilience of the tibialis anterior muscle in space. These findings could serve as targets for future interventions aiming to reduce the extent of muscle atrophy during space travel.

5.
Front Microbiol ; 13: 882333, 2022.
Article in English | MEDLINE | ID: mdl-36246226

ABSTRACT

Heterotrophic bacteria initiate the degradation of high molecular weight organic matter by producing an array of extracellular enzymes to hydrolyze complex organic matter into sizes that can be taken up into the cell. These bacterial communities differ spatially and temporally in composition, and potentially also in their enzymatic complements. Previous research has shown that particle-associated bacteria can be considerably more active than bacteria in the surrounding bulk water, but most prior studies of particle-associated bacteria have been focused on the upper ocean - there are few measurements of enzymatic activities of particle-associated bacteria in the mesopelagic and bathypelagic ocean, although the bacterial communities in the deep are dependent upon degradation of particulate organic matter to fuel their metabolism. We used a broad suite of substrates to compare the glucosidase, peptidase, and polysaccharide hydrolase activities of particle-associated and unfiltered seawater microbial communities in epipelagic, mesopelagic, and bathypelagic waters across 11 stations in the western North Atlantic. We concurrently determined bacterial community composition of unfiltered seawater and of samples collected via gravity filtration (>3 µm). Overall, particle-associated bacterial communities showed a broader spectrum of enzyme activities compared with unfiltered seawater communities. These differences in enzymatic activities were greater at offshore than at coastal locations, and increased with increasing depth in the ocean. The greater differences in enzymatic function measured on particles with depth coincided with increasing differences in particle-associated community composition, suggesting that particles act as 'specialty centers' that are essential for degradation of organic matter even at bathypelagic depths.

6.
ISME J ; 14(1): 178-188, 2020 01.
Article in English | MEDLINE | ID: mdl-31611653

ABSTRACT

SAR86 is an abundant and ubiquitous heterotroph in the surface ocean that plays a central role in the function of marine ecosystems. We hypothesized that despite its ubiquity, different SAR86 subgroups may be endemic to specific ocean regions and functionally specialized for unique marine environments. However, the global biogeographical distributions of SAR86 genes, and the manner in which these distributions correlate with marine environments, have not been investigated. We quantified SAR86 gene content across globally distributed metagenomic samples and modeled these gene distributions as a function of 51 environmental variables. We identified five distinct clusters of genes within the SAR86 pangenome, each with a unique geographic distribution associated with specific environmental characteristics. Gene clusters are characterized by the strong taxonomic enrichment of distinct SAR86 genomes and partial assemblies, as well as differential enrichment of certain functional groups, suggesting differing functional and ecological roles of SAR86 ecotypes. We then leveraged our models and high-resolution, remote sensing-derived environmental data to predict the distributions of SAR86 gene clusters across the world's oceans, creating global maps of SAR86 ecotype distributions. Our results reveal that SAR86 exhibits previously unknown, complex biogeography, and provide a framework for exploring geographic distributions of genetic diversity from other microbial clades.


Subject(s)
Gammaproteobacteria/classification, Ecotype, Gammaproteobacteria/genetics, Bacterial Genes, Metagenome, Oceans and Seas, Phylogeography
7.
Front Microbiol ; 4: 110, 2013.
Article in English | MEDLINE | ID: mdl-23717305

ABSTRACT

Short-chain alkanes play a substantial role in carbon and sulfur cycling in hydrocarbon-rich environments globally, yet few studies have examined the metabolism of ethane (C2), propane (C3), and butane (C4) in anoxic sediments in contrast to methane (C1). In hydrothermal vent systems, short-chain alkanes are formed over relatively short geological time scales via thermogenic processes and often exist at high concentrations. The sediment-covered hydrothermal vent systems at Middle Valley (MV, Juan de Fuca Ridge) are an ideal site for investigating the anaerobic oxidation of C1-C4 alkanes, given the elevated temperatures and dissolved hydrocarbon species characteristic of these metalliferous sediments. We examined whether MV microbial communities oxidized C1-C4 alkanes under mesophilic to thermophilic sulfate-reducing conditions. Here we present data from discrete temperature (25, 55, and 75°C) anaerobic batch reactor incubations of MV sediments supplemented with individual alkanes. Co-registered alkane consumption and sulfate reduction (SR) measurements provide clear evidence for C1-C4 alkane oxidation linked to SR over time and across temperatures. In these anaerobic batch reactor sediments, 16S ribosomal RNA pyrosequencing revealed that Deltaproteobacteria, particularly a novel sulfate-reducing lineage, were the likely phylotypes mediating the oxidation of C2-C4 alkanes. Maximum C1-C4 alkane oxidation rates occurred at 55°C, which reflects the mid-core sediment temperature profile and corroborates previous studies of rate maxima for the anaerobic oxidation of methane (AOM). Of the alkanes investigated, C3 was oxidized at the highest rate over time, then C4, C2, and C1, respectively. The implications of these results are discussed with respect to the potential competition between the anaerobic oxidation of C2-C4 alkanes and AOM for available oxidants and the influence on the fate of C1 derived from these hydrothermal systems.
