Results 1 - 20 of 14,285
1.
Cell ; 184(4): 1098-1109.e9, 2021 02 18.
Article in English | MEDLINE | ID: mdl-33606979

ABSTRACT

Bacteriophages drive evolutionary change in bacterial communities by creating gene flow networks that fuel ecological adaptations. However, the extent of viral diversity and its prevalence in the human gut remain largely unknown. Here, we introduce the Gut Phage Database, a collection of ∼142,000 non-redundant viral genomes (>10 kb) obtained by mining a dataset of 28,060 globally distributed human gut metagenomes and 2,898 reference genomes of cultured gut bacteria. Host assignment revealed that viral diversity is highest in the Firmicutes phylum and that ∼36% of viral clusters (VCs) are not restricted to a single species, creating gene flow networks across phylogenetically distinct bacterial species. Epidemiological analysis uncovered 280 globally distributed VCs found in at least 5 continents and a highly prevalent phage clade with features reminiscent of p-crAssphage. This high-quality, large-scale catalog of phage genomes will improve future virome studies and enable ecological and evolutionary analysis of human gut bacteriophages.


Subjects
Bacteriophages/genetics , Biodiversity , Gastrointestinal Microbiome , Databases, Nucleic Acid , Host Specificity , Humans , Phylogeography
2.
Cell ; 179(7): 1623-1635.e11, 2019 12 12.
Article in English | MEDLINE | ID: mdl-31835036

ABSTRACT

Marine bacteria and archaea play key roles in global biogeochemistry. To improve our understanding of this complex microbiome, we employed single-cell genomics and a randomized, hypothesis-agnostic cell selection strategy to recover 12,715 partial genomes from the tropical and subtropical euphotic ocean. A substantial fraction of known prokaryoplankton coding potential was recovered from a single, 0.4 mL ocean sample, which indicates that genomic information disperses effectively across the globe. Yet, we found each genome to be unique, implying limited clonality within prokaryoplankton populations. Light harvesting and secondary metabolite biosynthetic pathways were numerous across lineages, highlighting the value of single-cell genomics to advance the identification of ecological roles and biotechnology potential of uncultured microbial groups. This genome collection enabled functional annotation and genus-level taxonomic assignments for >80% of individual metagenome reads from the tropical and subtropical surface ocean, thus offering a model to improve reference genome databases for complex microbiomes.


Subjects
Metagenome , Microbiota , Seawater/microbiology , Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Energy Metabolism , Metagenomics/methods , Phylogeography , Plankton , Single-Cell Analysis/methods , Transcriptome
3.
Cell ; 176(4): 869-881.e13, 2019 02 07.
Article in English | MEDLINE | ID: mdl-30735636

ABSTRACT

Circular RNAs (circRNAs) are an intriguing class of RNA due to their covalently closed structure, high stability, and implicated roles in gene regulation. Here, we used an exome capture RNA sequencing protocol to detect and characterize circRNAs across >2,000 cancer samples. When compared against Ribo-Zero and RNase R, capture sequencing significantly enhanced the enrichment of circRNAs and preserved accurate circular-to-linear ratios. Using capture sequencing, we built the most comprehensive catalog of circRNA species to date: MiOncoCirc, the first database to be composed primarily of circRNAs directly detected in tumor tissues. Using MiOncoCirc, we identified candidate circRNAs to serve as biomarkers for prostate cancer and were able to detect circRNAs in urine. We further detected a novel class of circular transcripts, termed read-through circRNAs, that involved exons originating from different genes. MiOncoCirc will serve as a valuable resource for the development of circRNAs as diagnostic or therapeutic targets across cancer types.


Subjects
Gene Expression Profiling/methods , Neoplasms/genetics , RNA/genetics , Biomarkers, Tumor/genetics , Databases, Genetic , Gene Expression Regulation, Neoplastic/genetics , High-Throughput Nucleotide Sequencing , Humans , MicroRNAs/genetics , RNA/metabolism , RNA, Circular , Sequence Analysis, RNA/methods , Exome Sequencing/methods
4.
Cell ; 179(1): 268-281.e13, 2019 09 19.
Article in English | MEDLINE | ID: mdl-31495573

ABSTRACT

Neuronal cell types are the nodes of neural circuits that determine the flow of information within the brain. Neuronal morphology, especially the shape of the axonal arbor, provides an essential descriptor of cell type and reveals how individual neurons route their output across the brain. Despite the importance of morphology, few projection neurons in the mouse brain have been reconstructed in their entirety. Here we present a robust and efficient platform for imaging and reconstructing complete neuronal morphologies, including axonal arbors that span substantial portions of the brain. We used this platform to reconstruct more than 1,000 projection neurons in the motor cortex, thalamus, subiculum, and hypothalamus. Together, the reconstructed neurons constitute more than 85 meters of axonal length and are available in a searchable online database. Axonal shapes revealed previously unknown subtypes of projection neurons and suggest organizational principles of long-range connectivity.


Subjects
Brain/cytology , Brain/diagnostic imaging , Neurites/physiology , Pyramidal Tracts/physiology , Animals , Female , Mice , Mice, Inbred C57BL , Mice, Transgenic , Microscopy, Fluorescence, Multiphoton/methods , Software , Transfection
5.
Cell ; 169(1): 161-173.e12, 2017 03 23.
Article in English | MEDLINE | ID: mdl-28340341

ABSTRACT

Generating a precise cellular and molecular cartography of the human embryo is essential to our understanding of the mechanisms of organogenesis in normal and pathological conditions. Here, we have combined whole-mount immunostaining, 3DISCO clearing, and light-sheet imaging to start building a 3D cellular map of human development during the first trimester of gestation. We provide high-resolution 3D images of the developing peripheral nervous, muscular, vascular, cardiopulmonary, and urogenital systems. We found that the adult-like pattern of skin innervation is established before the end of the first trimester, showing important intra- and inter-individual variations in nerve branches. We also present evidence for a differential vascularization of the male and female genital tracts concomitant with sex determination. This work paves the way for a cellular and molecular reference atlas of human cells, which will be of paramount importance to understanding human development in health and disease.


Subjects
Embryo, Mammalian/cytology , Fetus/cytology , Human Development , Imaging, Three-Dimensional/methods , Immunohistochemistry/methods , Microscopy/methods , Embryonic Development , Humans , Organogenesis , Peripheral Nervous System/cytology , Peripheral Nervous System/growth & development
6.
Cell ; 167(4): 1125-1136.e8, 2016 11 03.
Article in English | MEDLINE | ID: mdl-27814509

ABSTRACT

Gut microbial dysbioses are linked to aberrant immune responses, which are often accompanied by abnormal production of inflammatory cytokines. As part of the Human Functional Genomics Project (HFGP), we investigate how differences in composition and function of gut microbial communities may contribute to inter-individual variation in cytokine responses to microbial stimulations in healthy humans. We observe microbiome-cytokine interaction patterns that are stimulus specific, cytokine specific, and cytokine and stimulus specific. Validation of two predicted host-microbial interactions reveals that TNFα and IFNγ production are associated with specific microbial metabolic pathways: palmitoleic acid metabolism and tryptophan degradation to tryptophol. Besides providing a resource of predicted microbially derived mediators that influence immune phenotypes in response to common microorganisms, these data can help to define principles for understanding disease susceptibility. The three HFGP studies presented in this issue lay the groundwork for further studies aimed at understanding the interplay between microbial, genetic, and environmental factors in the regulation of the immune response in humans.


Subjects
Cytokines/immunology , Gastrointestinal Microbiome , Inflammation/immunology , Microbiota , Adolescent , Adult , Aged , Bacteria/classification , Bacteria/immunology , Blood/immunology , Dysbiosis/immunology , Dysbiosis/microbiology , Feces/microbiology , Female , Fungi/classification , Fungi/immunology , Gene-Environment Interaction , Human Genome Project , Humans , Infections/immunology , Infections/microbiology , Leukocytes, Mononuclear/immunology , Male , Middle Aged
7.
Immunity ; 54(2): 355-366.e4, 2021 02 09.
Article in English | MEDLINE | ID: mdl-33484642

ABSTRACT

Definition of the specific germline immunoglobulin (Ig) alleles present in an individual is a critical first step to delineate the ontogeny and evolution of antigen-specific antibody responses. Rhesus and cynomolgus macaques are important animal models for pre-clinical studies, with four main sub-groups being used: Indian- and Chinese-origin rhesus macaques and Mauritian and Indonesian cynomolgus macaques. We applied the Ig gene inference tool IgDiscover and performed extensive Sanger sequencing-based genomic validation to define germline VDJ alleles in these 4 sub-groups, comprising 45 macaques in total. There was allelic overlap between Chinese- and Indian-origin rhesus macaques and also between the two macaque species, which is consistent with substantial admixture. The island-restricted Mauritian cynomolgus population displayed the lowest number of alleles of the sub-groups, yet maintained high individual allelic diversity. These comprehensive databases of germline IGH alleles for rhesus and cynomolgus macaques provide a resource toward the study of B cell responses in these important pre-clinical models.


Subjects
Genotype , Germ-Line Mutation/genetics , Immunoglobulin Heavy Chains/genetics , Alleles , Animals , Databases, Genetic , Disease Models, Animal , Epitopes , Immunity, Humoral , Macaca fascicularis , Macaca mulatta , Phylogeny , Polymorphism, Genetic , Species Specificity , V(D)J Recombination
8.
Mol Cell ; 76(2): 286-294, 2019 10 17.
Article in English | MEDLINE | ID: mdl-31626750

ABSTRACT

Stress granules and P-bodies are cytosolic biomolecular condensates that dynamically form by the phase separation of RNAs and proteins. They participate in translational control and buffer the proteome. Upon stress, global translation halts and mRNAs bound to the translational machinery and other proteins coalesce to form stress granules (SGs). Similarly, translationally stalled mRNAs devoid of translation initiation factors shuttle to P-bodies (PBs). Here, we review the cumulative progress made in defining the protein components that associate with mammalian SGs and PBs. We discuss the composition of SG and PB proteomes, supported by a new user-friendly database (http://rnagranuledb.lunenfeld.ca/) that curates current literature evidence for genes or proteins associated with SGs or PBs. As previously observed, the SG and PB proteomes are biased toward intrinsically disordered regions and have a high propensity to contain primary sequence features favoring phase separation. We also provide an outlook on how the various components of SGs and PBs may cooperate to organize and form membraneless organelles.


Subjects
Cytoplasmic Granules/metabolism , Proteome/metabolism , RNA, Messenger/metabolism , Animals , Humans
9.
Proc Natl Acad Sci U S A ; 121(25): e2401326121, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38857394

ABSTRACT

When wires are cut, the tool produces striations on the cut surface; as in other forms of forensic analysis, these striation marks are used to connect the evidence to the source that created them. Here, we argue that the practice of comparing two wire cut surfaces introduces complexities not present in better-investigated forensic examination of toolmarks such as those observed on bullets, as wire comparisons inherently require multiple distinct comparisons, increasing the expected false discovery rate. We call attention to the multiple comparison problem in wire examination and relate it to other situations in forensics that involve multiple comparisons, such as database searches.
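The multiple comparison effect this abstract describes can be illustrated with a short calculation. The sketch below is not the authors' method: it assumes, for illustration only, that each individual comparison has the same false positive rate and that comparisons are independent, in which case the chance of at least one false match across n comparisons is 1 - (1 - rate)^n. The function name and the example rates are hypothetical.

```python
def familywise_false_positive_rate(per_comparison_rate: float, n_comparisons: int) -> float:
    """Probability of at least one false positive across n comparisons.

    Assumption (not from the abstract): comparisons are independent and
    share the same per-comparison false positive rate.
    """
    if not 0.0 <= per_comparison_rate <= 1.0:
        raise ValueError("per_comparison_rate must be between 0 and 1")
    if n_comparisons < 0:
        raise ValueError("n_comparisons must be non-negative")
    # P(no false positive in any comparison) = (1 - rate) ** n
    return 1.0 - (1.0 - per_comparison_rate) ** n_comparisons


if __name__ == "__main__":
    # Hypothetical per-comparison rate of 1%; the counts are illustrative.
    for n in (1, 10, 100, 1000):
        rate = familywise_false_positive_rate(0.01, n)
        print(f"{n:>5} comparisons -> {rate:.3f}")
```

Under these assumptions, a per-comparison rate that looks small grows quickly with the number of comparisons, which is the same arithmetic that applies to a database search returning the best of many candidates.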

10.
RNA ; 30(3): 189-199, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38164624

ABSTRACT

Aptamers have emerged as next-generation research hotspots owing to their excellent performance and application potential in pharmacology, medicine, and analytical chemistry. Despite numerous aptamer investigations, the lack of comprehensive data integration has hindered the development of computational methods for aptamers and the reuse of aptamers. A public-access database named AptaDB, derived from experimentally validated data manually collected from the literature, was hence developed. It integrates comprehensive aptamer-related data in six key components: (i) experimentally validated aptamer-target interaction information, (ii) aptamer property information, (iii) structure information of aptamer, (iv) target information, (v) experimental activity information, and (vi) algorithmically calculated similar aptamers. AptaDB currently contains 1350 experimentally validated aptamer-target interactions, 1230 binding affinity constants, 1293 aptamer sequences, and more. It contains twice the number of entries found in other available aptamer databases. The collection and integration of the above information categories is unique among available aptamer databases, and AptaDB provides a user-friendly interface. AptaDB will also be continuously updated as aptamer research evolves. We expect that AptaDB will become a powerful source for aptamer rational design and a valuable tool for aptamer screening in the future. For access to AptaDB, please visit http://lmmd.ecust.edu.cn/aptadb/.


Subjects
Aptamers, Nucleotide , Oligonucleotides , Databases, Factual , Aptamers, Nucleotide/chemistry , SELEX Aptamer Technique
11.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38711369

ABSTRACT

Diet-drug interactions (DDIs) are pivotal in drug discovery and pharmacovigilance. DDIs can modify the systemic bioavailability/pharmacokinetics of drugs, posing a threat to public health and patient safety. Therefore, it is crucial to establish a platform to reveal the correlation between diets and drugs. Accordingly, we have established a publicly accessible online platform, known as Diet-Drug Interactions Database (DDID, https://bddg.hznu.edu.cn/ddid/), to systematically detail the correlation and corresponding mechanisms of DDIs. The platform comprises 1338 foods/herbs, encompassing flora and fauna, alongside 1516 widely used drugs and 23,950 interaction records. All interactions are meticulously scrutinized and assigned to one of five evaluation categories (positive, negative, no effect, harmful and possible). In addition, cross-links between foods/herbs, drugs and other databases are provided. In conclusion, DDID is a useful resource for comprehending the correlation between foods, herbs and drugs and holds promise to enhance drug utilization and research on drug combinations.


Subjects
Databases, Factual , Food-Drug Interactions , Humans , Diet
12.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-39038934

ABSTRACT

From the catalytic breakdown of nutrients to signaling, interactions between metabolites and proteins play an essential role in cellular function. An important case is cell-cell communication, where metabolites, secreted into the microenvironment, initiate signaling cascades by binding to intra- or extracellular receptors of neighboring cells. Protein-protein cell-cell communication interactions are routinely predicted from transcriptomic data. However, inferring metabolite-mediated intercellular signaling remains challenging, partially due to the limited size of intercellular prior knowledge resources focused on metabolites. Here, we leverage knowledge-graph infrastructure to integrate generalistic metabolite-protein with curated metabolite-receptor resources to create MetalinksDB. MetalinksDB is an order of magnitude larger than existing metabolite-receptor resources and can be tailored to specific biological contexts, such as diseases, pathways, or tissue/cellular locations. We demonstrate MetalinksDB's utility in identifying deregulated processes in renal cancer using multi-omics bulk data. Furthermore, we infer metabolite-driven intercellular signaling in acute kidney injury using spatial transcriptomics data. MetalinksDB is a comprehensive and customizable database of intercellular metabolite-protein interactions, accessible via a web interface (https://metalinks.omnipathdb.org/) and programmatically as a knowledge graph (https://github.com/biocypher/metalinks). We anticipate that by enabling diverse analyses tailored to specific biological contexts, MetalinksDB will facilitate the discovery of disease-relevant metabolite-mediated intercellular signaling processes.


Subjects
Signal Transduction , Humans , Cell Communication , Kidney Neoplasms/metabolism , Kidney Neoplasms/genetics , Acute Kidney Injury/metabolism , Acute Kidney Injury/genetics , Computational Biology/methods , Proteins/metabolism , Proteins/genetics , Software , Transcriptome
13.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38678388

ABSTRACT

Cyclic peptides offer a range of notable advantages, including potent antibacterial properties, high binding affinity and specificity to target molecules, and minimal toxicity, making them highly promising candidates for drug development. However, a comprehensive database that consolidates both synthetically derived and naturally occurring cyclic peptides is conspicuously absent. To address this void, we introduce CyclicPepedia (https://www.biosino.org/iMAC/cyclicpepedia/), a pioneering database that encompasses 8744 known cyclic peptides. This repository, structured as a composite knowledge network, offers a wealth of information encompassing various aspects of cyclic peptides, such as cyclic peptides' sources, categorizations, structural characteristics, pharmacokinetic profiles, physicochemical properties, patented drug applications, and a collection of crucial publications. Supported by a user-friendly knowledge retrieval system and calculation tools specifically designed for cyclic peptides, CyclicPepedia will be able to facilitate advancements in cyclic peptide drug development.


Subjects
Knowledge Bases , Peptides, Cyclic , Peptides, Cyclic/chemistry , Databases, Protein
14.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38446737

ABSTRACT

Accurately predicting the binding affinity between proteins and ligands is crucial in drug screening and optimization, but it is still a challenge in computer-aided drug design. The recent success of AlphaFold2 in predicting protein structures has brought new hope for deep learning (DL) models to accurately predict protein-ligand binding affinity. However, the current DL models still face limitations due to low-quality databases, inaccurate input representations and inappropriate model architectures. In this work, we review the computational methods, specifically DL-based models, used to predict protein-ligand binding affinity. We start with a brief introduction to protein-ligand binding affinity and the traditional computational methods used to calculate it. We then introduce the basic principles of DL models for predicting protein-ligand binding affinity. Next, we review the commonly used databases, input representations and DL models in this field. Finally, we discuss the potential challenges and future work in accurately predicting protein-ligand binding affinity via DL models.


Subjects
Deep Learning , Ligands , Databases, Factual , Drug Design , Drug Evaluation, Preclinical
15.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38546324

ABSTRACT

Enrichment analysis contextualizes biological features in pathways to facilitate a systematic understanding of high-dimensional data and is widely used in biomedical research. The emerging reporter score-based analysis (RSA) method shows more promising sensitivity, as it relies on P-values instead of raw values of features. However, RSA cannot be directly applied to multi-group and longitudinal experimental designs and is often misused due to the lack of a proper tool. Here, we propose the Generalized Reporter Score-based Analysis (GRSA) method for multi-group and longitudinal omics data. A comparison with other popular enrichment analysis methods demonstrated that GRSA had increased sensitivity across multiple benchmark datasets. We applied GRSA to microbiome, transcriptome and metabolome data and discovered new biological insights in omics studies. Finally, we demonstrated the application of GRSA beyond functional enrichment using a taxonomy database. We implemented GRSA in an R package, ReporterScore, integrating with a powerful visualization module and updatable pathway databases, which is available on the Comprehensive R Archive Network (https://cran.r-project.org/web/packages/ReporterScore). We believe that the ReporterScore package will be a valuable asset for broad biomedical research fields.


Subjects
Biomedical Research , Microbiota , Benchmarking , Databases, Factual , Metabolome
16.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38555474

ABSTRACT

As key oncogenic drivers in non-small-cell lung cancer (NSCLC), various mutations in the epidermal growth factor receptor (EGFR) with variable drug sensitivities have been a major obstacle for precision medicine. To achieve clinical-level drug recommendations, a platform for clinical patient case retrieval and reliable drug sensitivity prediction is highly desirable. Therefore, we built a database, D3EGFRdb, with the clinicopathologic characteristics and drug responses of 1339 patients with EGFR mutations via literature mining. On the basis of D3EGFRdb, we developed a deep learning-based prediction model, D3EGFRAI, for drug sensitivity prediction of new EGFR mutation-driven NSCLC. Model validations of D3EGFRAI showed prediction accuracies of 0.81 and 0.85 for patients from D3EGFRdb and our hospitals, respectively. Furthermore, mutation scanning of the crucial residues inside drug-binding pockets, which may occur in the future, was performed to explore their drug sensitivity changes. D3EGFR is the first platform to achieve clinical-level drug response prediction of all approved small molecule drugs for EGFR mutation-driven lung cancer and is freely accessible at https://www.d3pharma.com/D3EGFR/index.php.


Subjects
Carcinoma, Non-Small-Cell Lung , Deep Learning , Lung Neoplasms , Humans , Lung Neoplasms/drug therapy , Lung Neoplasms/genetics , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/pathology , ErbB Receptors/genetics , Mutation , Information Storage and Retrieval
17.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-39038936

ABSTRACT

Sequence database searches followed by homology-based function transfer form one of the oldest and most popular approaches for predicting protein functions, such as Gene Ontology (GO) terms. These searches are also a critical component in most state-of-the-art machine learning and deep learning-based protein function predictors. Although sequence search tools are the basis of homology-based protein function prediction, previous studies have scarcely explored how to select the optimal sequence search tools and configure their parameters to achieve the best function prediction. In this paper, we evaluate the effect of using different options from among popular search tools, as well as the impacts of search parameters, on protein function prediction. When predicting GO terms on a large benchmark dataset, we found that BLASTp and MMseqs2 consistently exceed the performance of other tools under default search parameters, including DIAMOND, one of the most popular tools for function prediction. However, with the correct parameter settings, DIAMOND can perform comparably to BLASTp and MMseqs2 in function prediction. Additionally, we developed a new scoring function to derive GO prediction from homologous hits that consistently outperforms previously proposed scoring functions. These findings enable the improvement of almost all protein function prediction algorithms with a few easily implementable changes in their sequence homolog-based component. This study emphasizes the critical role of search parameter settings in homology-based function transfer and should make an important contribution to the development of future protein function prediction algorithms.


Subjects
Databases, Protein , Proteins , Proteins/chemistry , Proteins/metabolism , Proteins/genetics , Computational Biology/methods , Gene Ontology , Algorithms , Sequence Analysis, Protein/methods , Software , Machine Learning
18.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-39007592

ABSTRACT

High-throughput DNA sequencing technologies decode tremendous amounts of microbial protein-coding gene sequences. However, accurately assigning protein functions to novel gene sequences remains a challenge. To this end, we developed FunGeneTyper, an extensible framework with two new deep learning models (i.e., FunTrans and FunRep), structured databases, and supporting resources for achieving highly accurate (Accuracy > 0.99, F1-score > 0.97) and fine-grained classification of antibiotic resistance genes (ARGs) and virulence factor genes. Using an experimentally confirmed dataset of ARGs comprising remote homologous sequences as the test set, our framework achieves by far the best performance in the discovery of new ARGs from human gut (F1-score: 0.6948), wastewater (0.6072), and soil (0.5445) microbiomes, beating the state-of-the-art bioinformatics tools and sequence alignment-based (F1-score: 0.0556-0.5065) and domain-based (F1-score: 0.2630-0.5224) annotation approaches. Furthermore, our framework is implemented as a lightweight, privacy-preserving, and plug-and-play neural network module, facilitating its versatility and accessibility to developers and users worldwide. We anticipate widespread utilization of FunGeneTyper (https://github.com/emblab-westlake/FunGeneTyper) for precise classification of protein-coding gene functions and the discovery of numerous valuable enzymes. This advancement will have a significant impact on various fields, including microbiome research, biotechnology, metagenomics, and bioinformatics.


Subjects
Deep Learning , Humans , Computational Biology/methods , Microbiota/genetics , Bacterial Proteins/genetics , Drug Resistance, Microbial/genetics , Software , High-Throughput Nucleotide Sequencing/methods , Virulence Factors/genetics
19.
Proc Natl Acad Sci U S A ; 120(29): e2220762120, 2023 07 18.
Article in English | MEDLINE | ID: mdl-37432995

ABSTRACT

Large datasets contribute new insights to subjects formerly investigated by exemplars. We used coevolution data to create a large, high-quality database of transmembrane β-barrels (TMBB). By applying simple feature detection on generated evolutionary contact maps, our method (IsItABarrel) achieves 95.88% balanced accuracy when discriminating among protein classes. Moreover, comparison with IsItABarrel revealed a high rate of false positives in previous TMBB algorithms. In addition to being more accurate than previous datasets, our database (available online) contains 1,938,936 bacterial TMBB proteins from 38 phyla, making it 17 and 2.2 times larger than the previous sets TMBB-DB and OMPdb, respectively. We anticipate that due to its quality and size, the database will serve as a useful resource where high-quality TMBB sequence data are required. We found that TMBBs can be divided into 11 types, three of which have not been previously reported. We find tremendous variance in proteome percentage among TMBB-containing organisms, with some using 6.79% of their proteome for TMBBs and others using as little as 0.27% of their proteome. The distribution of the lengths of the TMBBs is suggestive of previously hypothesized duplication events. In addition, we find that the C-terminal β-signal varies among different classes of bacteria though its consensus sequence is LGLGYRF. However, this β-signal is only characteristic of prototypical TMBBs. The ten non-prototypical barrel types have other C-terminal motifs, and it remains to be determined if these alternative motifs facilitate TMBB insertion or perform any other signaling function.


Subjects
Algorithms , Proteome , Humans , Bacterial Proteins/genetics , Biological Evolution , Consensus Sequence
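The abstract for this record reports a balanced accuracy figure rather than plain accuracy. As a reminder of what that metric measures, the sketch below computes balanced accuracy as the mean of per-class recall, which is the usual definition; it is not the IsItABarrel code, and the labels and counts in the example are made up for illustration.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def balanced_accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, predicted in zip(y_true, y_pred):
        total[truth] += 1
        if truth == predicted:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Hypothetical, imbalanced toy data: 2 barrels and 8 non-barrels.
    y_true = ["barrel"] * 2 + ["other"] * 8
    y_pred = ["barrel", "other"] + ["other"] * 8
    plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    print(f"plain accuracy:    {plain:.2f}")
    print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.2f}")
```

Averaging recall per class keeps a large majority class from masking poor performance on a rare one, which matters when true barrels are a small fraction of all proteins screened.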
20.
Proc Natl Acad Sci U S A ; 120(43): e2306815120, 2023 10 24.
Article in English | MEDLINE | ID: mdl-37844232

ABSTRACT

Recent global changes associated with anthropogenic activities are impacting ecological systems globally, giving rise to the Anthropocene. Critical reorganization of biological communities and biodiversity loss are expected to accelerate as anthropogenic global change continues. Long-term records offer context for understanding baseline conditions and those trajectories that are beyond the range of normal fluctuation seen over recent millennia: Are we causing changes that are fundamentally different from changes in the past? Using a rich dataset of late Quaternary pollen records, stored in the open-access and community-curated Neotoma database, we analyzed changes in biodiversity and community composition since the end Pleistocene in North America. We measured taxonomic richness, short-term taxonomic loss and gain, first/last appearances (FAD/LAD), and abrupt community change. For all analyses, we incorporated age-model uncertainty and accounted for differences in sample size to generate conservative estimates. The most prominent signals of elevated vegetation change were seen during the Pleistocene-Holocene transition and since 200 calendar years before present (cal YBP). During the Pleistocene-Holocene transition, abrupt changes and FADs were elevated, and from 200 to -50 cal YBP, we found increases in short-term taxonomic loss, FADs, LADs, and abrupt changes. Taxonomic richness declined from ~13,000 cal YBP until about 6,000 cal YBP and then increased until the present, reaching levels seen during the end Pleistocene. Regionally, patterns were highly variable. These results show that recent changes associated with anthropogenic impacts are comparable to the landscape changes that took place as we moved from a glacial to interglacial world.


Subjects
Biodiversity , Ecosystem , Pollen , North America , Biota