Results 1 - 20 of 72
1.
PLoS One ; 19(5): e0301172, 2024.
Article in English | MEDLINE | ID: mdl-38696408

ABSTRACT

Horizontal gene transfer (HGT) is a powerful evolutionary force that considerably shapes the structure of prokaryotic genomes and is associated with genomic islands (GIs). A GI is a DNA segment composed of transferred genes that can be found within a prokaryotic genome, obtained through HGT. Much research has focused on detecting GIs in genomes, but here we pursue a new course, which is identifying possible preferred locations of GIs in the prokaryotic genome. Here, we identify the locations of the GIs within prokaryotic genomes to examine patterns in those locations. Prokaryotic GIs were analyzed according to the genome structure that they are located in, whether it be a circular or a linear genome. The analytical investigations employed are: (1) studying the GI locations in relation to the origin of replication (oriC); (2) exploring the distances between GIs; and (3) determining the distribution of GIs across the genomes. For each of the investigations, the analysis was performed on all of the GIs in the data set. Moreover, to avoid bias caused by the distribution of the genomes represented, the GIs in one genome from each species and the GIs of the most frequent species are also analyzed. Overall, the results showed that there are preferred sites for the GIs in the genome. In the linear genomes, these sites are usually located in the oriC region and terminus region, while in the circular genomes, they are located solely in the terminus region. These results also showed that the distance distribution between the GIs is almost exponential, which proves that GIs have preferred sites within genomes. The oriC and terminus are preferred sites for the GIs, and a possible natural explanation for this could be connected to the content of the oriC region. Moreover, the content of the GIs in terms of their protein families was studied, and the results demonstrated that the majority of frequent protein families are close to identical in each section.


Subjects
Gene Transfer, Horizontal , Genomic Islands , Genome, Bacterial , Genome, Archaeal , Replication Origin/genetics , Prokaryotic Cells/metabolism
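The first two analyses listed in the abstract above (GI location relative to oriC, and distances between neighboring GIs) can be illustrated with a minimal Python sketch. It assumes a circular genome, represents each GI by a single coordinate, and uses hypothetical function names and toy coordinates; it is not the authors' pipeline.

```python
def circular_distance(position, ori_c, genome_length):
    """Shortest distance along a circular genome between a position and oriC."""
    d = abs(position - ori_c) % genome_length
    return min(d, genome_length - d)


def circular_gaps(positions, genome_length):
    """Distances between consecutive GIs on a circular genome.

    The last entry is the wrap-around gap from the final GI back to the first.
    """
    ordered = sorted(positions)
    if not ordered:
        return []
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    gaps.append(genome_length - ordered[-1] + ordered[0])
    return gaps


# Toy example (hypothetical coordinates on a 1,000-bp circular genome).
gi_positions = [900, 100, 400]
print([circular_distance(p, ori_c=50, genome_length=1000) for p in gi_positions])
print(circular_gaps(gi_positions, genome_length=1000))
```

A histogram of the gaps pooled over many genomes is the kind of distance distribution the abstract describes; the gaps of one circular genome always sum to its length.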
2.
Toxicology ; 501: 153708, 2024 01.
Article in English | MEDLINE | ID: mdl-38104655

ABSTRACT

With the aim of helping to set safe exposure limits for the general population, various techniques have been implemented to conduct risk assessments for chemicals and other environmental stressors; however, none of these tools facilitate the identification of completely new chemicals that are likely hazardous and elicit an adverse biological effect. Here, we detail a novel in silico, deep-learning framework that is designed to systematically generate structures for new chemical compounds that are predicted to be chemical hazards. To assess the utility of the framework, we applied the tool to four endpoints related to environmental toxicants and their impacts on human and animal health: (i) toxicity to honeybees, (ii) immunotoxicity, (iii) endocrine disruption via ER-α antagonism, and (iv) mutagenicity. In addition, we characterized the predicted potency of these compounds and examined their structural relationship to existing chemicals of concern. As part of the array of emerging new approach methodologies (NAMs), we anticipate that such a framework will be a significant asset to risk assessors and other environmental scientists when planning and forecasting. Though not in the scope of the present study, we expect that the methodology detailed here could also be useful in the de novo design of more environmentally-friendly industrial chemicals.


Subjects
Deep Learning , Humans , Animals , Prospective Studies , Hazardous Substances/toxicity , Receptors, Estrogen , Mutagens , Risk Assessment/methods
3.
Front Microbiol ; 14: 1254999, 2023.
Article in English | MEDLINE | ID: mdl-38029109

ABSTRACT

As the name of the genus Pantoea ("of all sorts and sources") suggests, this genus includes bacteria with a wide range of provenances, including plants, animals, soils, components of the water cycle, and humans. Some members of the genus are pathogenic to plants, and some are suspected to be opportunistic human pathogens; while others are used as microbial pesticides or show promise in biotechnological applications. During its taxonomic history, the genus and its species have seen many revisions. However, evolutionary and comparative genomics studies have started to provide a solid foundation for a more stable taxonomy. To move further toward this goal, we have built a 2,509-gene core genome tree of 437 public genome sequences representing the currently known diversity of the genus Pantoea. Clades were evaluated for being evolutionarily and ecologically significant by determining bootstrap support, gene content differences, and recent recombination events. These results were then integrated with genome metadata, published literature, descriptions of named species with standing in nomenclature, and circumscriptions of yet-unnamed species clusters, 15 of which we assigned names under the nascent SeqCode. Finally, genome-based circumscriptions and descriptions of each species and each significant genetic lineage within species were uploaded to the LINbase Web server so that newly sequenced genomes of isolates belonging to any of these groups could be precisely and accurately identified.

4.
Front Genet ; 14: 1219297, 2023.
Article in English | MEDLINE | ID: mdl-37811141

ABSTRACT

Antibiotic resistance is of crucial interest to both human and animal medicine. It has been recognized that increased environmental monitoring of antibiotic resistance is needed. Metagenomic DNA sequencing is becoming an attractive method to profile antibiotic resistance genes (ARGs), including a special focus on pathogens. A number of computational pipelines are available and under development to support environmental ARG monitoring; the pipeline we present here is promising for general adoption for the purpose of harmonized global monitoring. Specifically, ARGem is a user-friendly pipeline that provides full-service analysis, from the initial DNA short reads to the final visualization of results. The capture of extensive metadata is also facilitated to support comparability across projects and broader monitoring goals. The ARGem pipeline offers efficient analysis of a modest number of samples along with affordable computational components, though the throughput could be increased through cloud resources, based on the user's configuration. The pipeline components were carefully assessed and selected to satisfy tradeoffs, balancing efficiency and flexibility. It was essential to provide a step to perform short read assembly in a reasonable time frame to ensure accurate annotation of identified ARGs. Comprehensive ARG and mobile genetic element databases are included in ARGem for annotation support. ARGem further includes an expandable set of analysis tools that include statistical and network analysis and supports various useful visualization techniques, including Cytoscape visualization of co-occurrence and correlation networks. The performance and flexibility of the ARGem pipeline are demonstrated with analysis of aquatic metagenomes. The pipeline is freely available at https://github.com/xlxlxlx/ARGem.

5.
Sci Rep ; 13(1): 12102, 2023 07 26.
Article in English | MEDLINE | ID: mdl-37495642

ABSTRACT

Mass testing is essential for identifying infected individuals during an epidemic and allowing healthy individuals to return to normal social activities. However, testing capacity is often insufficient to meet global health needs, especially during newly emerging epidemics. Dorfman's method, a classic group testing technique, helps reduce the number of tests required by pooling the samples of multiple individuals into a single sample for analysis. Dorfman's method does not consider the time dynamics or limits on testing capacity involved in infection detection, and it assumes that individuals are infected independently, ignoring community correlations. To address these limitations, we present an adaptive group testing (AGT) strategy based on graph partitioning, which divides a physical contact network into subgraphs (groups of individuals) and assigns testing priorities based on the social contact characteristics of each subgraph. Our AGT aims to maximize the number of infected individuals detected and minimize the number of tests required. After each testing round (perhaps on a daily basis), the testing priority is increased for each neighboring group of known infected individuals. We also present an enhanced infectious disease transmission model that simulates the dynamic spread of a pathogen and evaluate our AGT strategy using the simulation results. When applied to 13 social contact networks, AGT demonstrates significant performance improvements compared to Dorfman's method and its variations. Our AGT strategy requires fewer tests overall, reduces disease spread, and retains robustness under changes in group size, testing capacity, and other parameters. Testing plays a crucial role in containing and mitigating pandemics by identifying infected individuals and helping to prevent further transmission in families and communities. 
By identifying infected individuals and helping to prevent further transmission in families and communities, our AGT strategy can have significant implications for public health, providing guidance for policymakers trying to balance economic activity with the need to manage the spread of infection.


Subjects
Communicable Diseases , Social Interaction , Humans , Computer Simulation
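The baseline that the abstract above compares against, Dorfman's two-stage pooling, can be sketched in a few lines of Python. The sketch assumes independent infections with a fixed prevalence and a perfectly accurate test, which are exactly the simplifications the abstract criticizes; the function names are hypothetical and this is not the AGT strategy itself.

```python
def expected_tests_per_person(prevalence, group_size):
    """Expected tests per individual under Dorfman's two-stage pooling.

    Each pool of `group_size` samples is tested once; if the pool is positive,
    every member is retested individually.
    """
    if group_size == 1:
        return 1.0  # no pooling: one test per person
    prob_pool_negative = (1.0 - prevalence) ** group_size
    return 1.0 / group_size + (1.0 - prob_pool_negative)


def optimal_group_size(prevalence, max_group_size=100):
    """Group size that minimizes the expected number of tests per person."""
    return min(
        range(1, max_group_size + 1),
        key=lambda k: expected_tests_per_person(prevalence, k),
    )


best = optimal_group_size(0.01)
print(best, round(expected_tests_per_person(0.01, best), 4))
```

At high prevalence the optimum collapses to a group size of 1, meaning pooling no longer saves tests under these assumptions.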
6.
G3 (Bethesda) ; 13(9)2023 08 30.
Article in English | MEDLINE | ID: mdl-37313728

ABSTRACT

De novo genes are genes that emerge as new genes in some species, such as primate de novo genes that emerge in certain primate species. Over the past decade, a great deal of research has been conducted regarding their emergence, origins, functions, and various attributes in different species, some of which has involved estimating the ages of de novo genes. However, limited by the number of species available for whole-genome sequencing, relatively few studies have focused specifically on the emergence time of primate de novo genes. Among those, even fewer investigate the association between primate gene emergence and environmental factors, such as paleoclimate (ancient climate) conditions. This study investigates the relationship between paleoclimate and human gene emergence at primate species divergence. Based on 32 available primate genome sequences, this study has revealed possible associations between temperature changes and the emergence of de novo primate genes. Overall, the findings indicate that de novo genes tended to emerge in the most recent 13 MY, while the temperature continued cooling, which is consistent with past findings. Furthermore, in the context of an overall trend of cooling temperature, new primate genes were more likely to emerge during local warming periods, where the warm temperature more closely resembled the environmental condition that preceded the cooling trend. Results also indicate that both primate de novo genes and human cancer-associated genes have later origins in comparison to random human genes. Future studies can examine in greater depth human de novo gene emergence from an environmental perspective, as well as species divergence from a gene emergence perspective.


Subjects
Evolution, Molecular , Primates , Animals , Humans , Primates/genetics , Genome
7.
PLoS One ; 18(3): e0281824, 2023.
Article in English | MEDLINE | ID: mdl-36961781

ABSTRACT

We present a method for detecting horizontal gene transfer (HGT) using partial orders (posets). The method requires a poset for each species/gene pair, where we have a set of species S, and a set of genes G. Given the posets, the method constructs a phylogenetic tree that is compatible with the set of posets; this is done for each gene. Also, the set of posets can be derived from the tree. The trees constructed for each gene are then compared and tested for contradicting information, where a contradiction suggests HGT.


Subjects
Evolution, Molecular , Gene Transfer, Horizontal , Phylogeny
8.
Sci Rep ; 13(1): 2395, 2023 Feb 10.
Article in English | MEDLINE | ID: mdl-36765153

ABSTRACT

The rapid growth of online social media usage in our daily lives has increased the importance of analyzing the dynamics of online social networks. However, the dynamic data of existing online social media platforms are not readily accessible. Hence, there is a necessity to synthesize networks emulating those of online social media for further study. In this work, we propose an epidemiology-inspired and community-based, time-evolving online social network generation algorithm (EpiCNet), to generate a time-evolving sequence of random networks that closely mirror the characteristics of real-world online social networks. Variants of the algorithm can produce both undirected and directed networks to accommodate different user interaction paradigms. EpiCNet utilizes compartmental models inspired by mathematical epidemiology to simulate the flow of individuals into and out of the online social network. It also employs an overlapping community structure to enable more realistic connections between individuals in the network. Furthermore, EpiCNet evolves the community structure and connections in the simulated online social network as a function of time and with an emphasis on the behavior of individuals. EpiCNet is capable of simulating a variety of online social networks by adjusting a set of tunable parameters that specify the individual behavior and the evolution of communities over time. The experimental results show that the network properties of the synthetic time-evolving online social network generated by EpiCNet, such as clustering coefficient, node degree, and diameter, match those of typical real-world online social networks such as Facebook and Twitter.

9.
Sci Rep ; 13(1): 344, 2023 01 07.
Article in English | MEDLINE | ID: mdl-36611105

ABSTRACT

Prokaryotic genomes evolve via horizontal gene transfer (HGT), mutations, and rearrangements. A noteworthy part of the HGT process is facilitated by genomic islands (GIs). While previous computational biology research has focused on developing tools to detect GIs in prokaryotic genomes, there has been little research investigating GI patterns and biological connections across species. We have pursued the novel idea of connecting GIs across prokaryotic and phage genomes via patterns of protein families. Such patterns are sequences of protein families frequently present in the genomes of multiple species. We combined the large data set from the IslandViewer4 database with protein families from Pfam while implementing a comprehensive strategy to identify patterns making use of HMMER, BLAST, and MUSCLE. We also implemented Python programs that link the analysis into a single pipeline. Research results demonstrated that related GIs often exist in species that are evolutionarily unrelated and in multiple bacterial phyla. Analysis of the discovered patterns led to the identification of biological connections among prokaryotes and phages. These connections suggest broad HGT connections across the bacterial kingdom and its associated phages. The discovered patterns and connections could provide the basis for additional analysis of HGT breadth and the patterns in pathogenic GIs.


Subjects
Bacteriophages , Genomic Islands , Genomic Islands/genetics , Bacteriophages/genetics , Prokaryotic Cells , Proteins/genetics , Bacteria/genetics , Computational Biology/methods , Gene Transfer, Horizontal , Genome, Bacterial
10.
Front Microbiol ; 13: 887310, 2022.
Article in English | MEDLINE | ID: mdl-35663905

ABSTRACT

Genomics has put prokaryotic rank-based taxonomy on a solid phylogenetic foundation. However, most taxonomic ranks were set long before the advent of DNA sequencing and genomics. In this concept paper, we thus ask the following question: should prokaryotic classification schemes besides the current phylum-to-species ranks be explored, developed, and incorporated into scientific discourse? Could such alternative schemes provide better solutions to the basic need of science and society for which taxonomy was developed, namely, precise and meaningful identification? A neutral genome-similarity based framework is then described that could allow alternative classification schemes to be explored, compared, and translated into each other without having to choose only one as the gold standard. Classification schemes could thus continue to evolve and be selected according to their benefits and based on how well they fulfill the need for prokaryotic identification.

11.
Microb Genom ; 8(5)2022 05.
Article in English | MEDLINE | ID: mdl-35584001

ABSTRACT

Early disease detection is a prerequisite for enacting effective interventions for disease control. Strains of the bacterial plant pathogen Xylella fastidiosa have recurrently spread to new crops in new countries causing devastating outbreaks. So far, investigation of outbreak strains and highly resolved phylogenetic reconstruction have required whole-genome sequencing of pure bacterial cultures, which are challenging to obtain due to the fastidious nature of X. fastidiosa. Here, we show that culture-independent metagenomic sequencing, using the Oxford Nanopore Technologies MinION long-read sequencer, can sensitively and specifically detect the causative agent of Pierce's disease of grapevine, X. fastidiosa subspecies fastidiosa. Using a DNA sample from a grapevine in Virginia, USA, it was possible to obtain a metagenome-assembled genome (MAG) of sufficient quality for phylogenetic reconstruction with SNP resolution. The analysis placed the MAG in a clade with isolates from Georgia, USA, suggesting introduction of X. fastidiosa subspecies fastidiosa to Virginia from the south-eastern USA. This proof of concept study, thus, revealed that metagenomic sequencing can replace culture-dependent genome sequencing for reconstructing transmission routes of bacterial plant pathogens.


Subjects
Metagenomics , Xylella , Disease Outbreaks , Phylogeny , Xylella/genetics
13.
PLoS One ; 16(12): e0261926, 2021.
Article in English | MEDLINE | ID: mdl-34962963

ABSTRACT

Gene regulatory network (GRN) inference can now take advantage of powerful machine learning algorithms to complement traditional experimental methods in building gene networks. However, the dynamical nature of embryonic development, representing the time-dependent interactions between thousands of transcription factors, signaling molecules, and effector genes, is one of the most challenging arenas for GRN prediction. In this work, we show that successful GRN predictions for a developmental network from gene expression data alone can be obtained with the Priors Enriched Absent Knowledge (PEAK) network inference algorithm. PEAK is a noise-robust method that models gene expression dynamics via ordinary differential equations and selects the best network based on information-theoretic criteria coupled with the machine learning algorithm Elastic Net. We test our GRN prediction methodology using two gene expression datasets for the purple sea urchin, Strongylocentrotus purpuratus, and cross-check our results against existing GRN models that have been constructed and validated by over 30 years of experimental results. Our results find a remarkably high degree of sensitivity in identifying known gene interactions in the network (maximum 81.58%). We also generate novel predictions for interactions that have not yet been described, which provide a resource for researchers to use to further complete the sea urchin GRN. Published ChIPseq data and spatial co-expression analysis further support a subset of the top novel predictions. We conclude that GRN predictions that match known gene interactions can be produced using gene expression data alone from developmental time series experiments.


Subjects
Computational Biology/methods , Gene Expression Profiling , Gene Expression Regulation, Developmental , Gene Regulatory Networks , Strongylocentrotus purpuratus/embryology , Strongylocentrotus purpuratus/genetics , Algorithms , Animals , Biochemical Phenomena , Chromatin Immunoprecipitation , Female , Machine Learning , Male , Sensitivity and Specificity , Systems Biology , Transcription Factors/genetics , Transcriptome
14.
J Comput Biol ; 28(11): 1063-1074, 2021 11.
Article in English | MEDLINE | ID: mdl-34665648

ABSTRACT

The functional profile of metagenomic samples enables improved understanding of microbial populations in the environment. Such analysis consists of assigning short sequencing reads to a particular functional category. Normally, manually curated databases are used for functional assignment, and genes are arranged into different classes. Sequence alignment has been widely used to profile metagenomic samples against curated databases. However, this method is time consuming and requires high computational resources. While several alignment-free methods based on k-mer composition have been developed in recent years, they still require large amounts of computer main memory. In this article, MetaMLP (Metagenomics Machine Learning Profiler), a machine learning method that represents sequences as numerical vectors (embeddings) and uses a simple one hidden layer neural network to profile functional categories, is developed. Unlike other methods, MetaMLP enables partial matching by using a reduced alphabet to build sequence embeddings from full and partial k-mers. MetaMLP is able to identify a slightly larger number of reads compared with DIAMOND (one of the fastest sequence alignment methods), as well as to perform accurate predictions with 0.99 precision and 0.99 recall. MetaMLP can process 100M reads in ∼10 minutes on a laptop computer, which is 50 times faster than DIAMOND.


Subjects
Computational Biology/methods , Metagenomics/methods , Sequence Alignment/methods , Algorithms , Data Curation , Databases, Genetic , Machine Learning , Sequence Analysis, DNA
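The reduced-alphabet k-mer idea mentioned in the abstract above can be illustrated with a short Python sketch. The amino-acid grouping below is a hypothetical example chosen for illustration, not the alphabet MetaMLP uses, and the sketch assumes protein sequences as input; it shows only how different sequences can collapse to the same k-mers, not the embedding or the neural network.

```python
# Hypothetical grouping of the 20 standard amino acids into 7 classes.
REDUCED_GROUPS = {
    "A": "AGST",
    "C": "C",
    "D": "DENQ",
    "F": "FWY",
    "H": "HKR",
    "I": "ILMV",
    "P": "P",
}
RESIDUE_TO_GROUP = {
    residue: label
    for label, residues in REDUCED_GROUPS.items()
    for residue in residues
}


def reduce_sequence(protein):
    """Rewrite a protein sequence in the reduced alphabet ('X' for unknown residues)."""
    return "".join(RESIDUE_TO_GROUP.get(residue, "X") for residue in protein.upper())


def kmers(sequence, k):
    """All overlapping k-mers of a sequence, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Two different peptides map to the same reduced k-mers.
print(kmers(reduce_sequence("MKVLA"), 3))
print(kmers(reduce_sequence("MRILS"), 3))
```

Because similar residues share a label, sequences that differ only by conservative substitutions yield identical k-mers, which is one way to obtain the partial matching the abstract refers to.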
15.
PeerJ ; 9: e10906, 2021.
Article in English | MEDLINE | ID: mdl-33828908

ABSTRACT

BACKGROUND: Computing genomic similarity between strains is a prerequisite for genome-based prokaryotic classification and identification. Genomic similarity was first computed as Average Nucleotide Identity (ANI) values based on the alignment of genomic fragments. Since this is computationally expensive, faster and computationally cheaper alignment-free methods have been developed to estimate ANI. However, these methods do not reach the level of accuracy of alignment-based methods. METHODS: Here we introduce LINflow, a computational pipeline that infers pairwise genomic similarity in a set of genomes. LINflow takes advantage of the speed of the alignment-free sourmash tool to identify the genome in a dataset that is most similar to a query genome and the precision of the alignment-based pyani software to precisely compute ANI between the query genome and the most similar genome identified by sourmash. This is repeated for each new genome that is added to a dataset. The sequentially computed ANI values are stored as Life Identification Numbers (LINs), which are then used to infer all other pairwise ANI values in the set. We tested LINflow on four sets, 484 genomes in total, and compared the needed time and the generated similarity matrices with other tools. RESULTS: LINflow is up to 150 times faster than pyani and pairwise ANI values generated by LINflow are highly correlated with those computed by pyani. However, because LINflow infers most pairwise ANI values instead of computing them directly, ANI values occasionally depart from the ANI values computed by pyani. In conclusion, LINflow is a fast and memory-efficient pipeline to infer similarity among a large set of prokaryotic genomes. Its ability to quickly add new genome sequences to an already computed similarity matrix makes LINflow particularly useful for projects when new genome sequences need to be regularly added to an existing dataset.
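The step in which stored LINs are used to infer pairwise ANI values can be sketched as follows, under loudly stated assumptions: each LIN position is taken to correspond to an ANI threshold, and two genomes that share a LIN prefix are taken to have an ANI at or above the threshold of the last shared position. The threshold values and function names below are hypothetical, and the interval returned is only an approximation, consistent with the abstract's note that inferred values occasionally depart from directly computed ones.

```python
# Hypothetical ANI thresholds (percent), one per LIN position, in increasing order.
THRESHOLDS = [70.0, 80.0, 90.0, 95.0, 99.0]


def shared_prefix_length(lin_a, lin_b):
    """Number of leading LIN positions on which two genomes agree."""
    n = 0
    for a, b in zip(lin_a, lin_b):
        if a != b:
            break
        n += 1
    return n


def inferred_ani_bounds(lin_a, lin_b, thresholds=THRESHOLDS):
    """Approximate (lower, upper) ANI bounds implied by two LINs.

    `lower` is None when the LINs already differ at the first position.
    """
    n = shared_prefix_length(lin_a, lin_b)
    lower = thresholds[n - 1] if n > 0 else None
    upper = thresholds[n] if n < len(thresholds) else 100.0
    return lower, upper


print(inferred_ani_bounds((0, 0, 1, 0, 0), (0, 0, 1, 2, 0)))
```

The appeal of this design is that one precise ANI computation per added genome is enough to place it, after which comparisons with every other genome reduce to cheap prefix lookups.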

16.
BMC Bioinformatics ; 22(1): 117, 2021 Mar 10.
Article in English | MEDLINE | ID: mdl-33691615

ABSTRACT

BACKGROUND: Metagenomics is gaining attention as a powerful tool for identifying how agricultural management practices influence human and animal health, especially in terms of potential to contribute to the spread of antibiotic resistance. However, the ability to compare the distribution and prevalence of antibiotic resistance genes (ARGs) across multiple studies and environments is currently impossible without a complete re-analysis of published datasets. This challenge must be addressed for metagenomics to realize its potential for helping guide effective policy and practice measures relevant to agricultural ecosystems, for example, identifying critical control points for mitigating the spread of antibiotic resistance. RESULTS: Here we introduce AgroSeek, a centralized web-based system that provides computational tools for analysis and comparison of metagenomic data sets tailored specifically to researchers and other users in the agricultural sector interested in tracking and mitigating the spread of ARGs. AgroSeek draws from rich, user-provided metagenomic data and metadata to facilitate analysis, comparison, and prediction in a user-friendly fashion. Further, AgroSeek draws from publicly-contributed data sets to provide a point of comparison and context for data analysis. To incorporate metadata into our analysis and comparison procedures, we provide flexible metadata templates, including user-customized metadata attributes to facilitate data sharing, while maintaining the metadata in a comparable fashion for the broader user community and to support large-scale comparative and predictive analysis. CONCLUSION: AgroSeek provides an easy-to-use tool for environmental metagenomic analysis and comparison, based on both gene annotations and associated metadata, with this initial demonstration focusing on control of antibiotic resistance in agricultural ecosystems. 
AgroSeek creates a space for metagenomic data sharing and collaboration to assist policy makers, stakeholders, and the public in decision-making. AgroSeek is publicly available at https://agroseek.cs.vt.edu/.


Subjects
Drug Resistance, Microbial/genetics , Environmental Microbiology , Genes, Bacterial , Metadata , Metagenomics , Ecosystem , Internet , Metagenome , Software
17.
Water Res ; 194: 116907, 2021 Apr 15.
Article in English | MEDLINE | ID: mdl-33610927

ABSTRACT

The emergence of next generation sequencing (NGS) is revolutionizing the potential to address complex microbiological challenges in the water industry. NGS technologies can provide holistic insight into microbial communities and their functional capacities in water and wastewater systems, thus eliminating the need to develop a new assay for each target organism or gene. However, several barriers have hampered wide-scale adoption of NGS by the water industry, including cost, need for specialized expertise and equipment, challenges with data analysis and interpretation, lack of standardized methods, and the rapid pace of development of new technologies. In this critical review, we provide an overview of the current state of the science of NGS technologies as they apply to water, wastewater, and recycled water. In addition, a systematic literature review was conducted in which we identified over 600 peer-reviewed journal articles on this topic and summarized their contributions to six key areas relevant to the water and wastewater fields: taxonomic classification and pathogen detection, functional and catabolic gene characterization, antimicrobial resistance (AMR) profiling, bacterial toxicity characterization, Cyanobacteria and harmful algal bloom identification, and virus characterization. For each application, we have presented key trends, noteworthy advancements, and proposed future directions. Finally, key needs to advance NGS technologies for broader application in water and wastewater fields are assessed.


Subjects
Cyanobacteria , High-Throughput Nucleotide Sequencing , Cyanobacteria/genetics , Harmful Algal Bloom , Wastewater , Water
18.
Commun Biol ; 4(1): 183, 2021 02 10.
Article in English | MEDLINE | ID: mdl-33568741

ABSTRACT

Biases in data used to train machine learning (ML) models can inflate their prediction performance and confound our understanding of how and what they learn. Although biases are common in biological data, systematic auditing of ML models to identify and eliminate these biases is not a common practice when applying ML in the life sciences. Here we devise a systematic, principled, and general approach to audit ML models in the life sciences. We use this auditing framework to examine biases in three ML applications of therapeutic interest and identify unrecognized biases that hinder the ML process and result in substantially reduced model performance on new datasets. Ultimately, we show that ML models tend to learn primarily from data biases when there is insufficient signal in the data to learn from. We provide detailed protocols, guidelines, and examples of code to enable tailoring of the auditing framework to other biomedical applications.


Subjects
Data Mining , Machine Learning , Proteins/metabolism , Proteome , Proteomics , Animals , Bias , Databases, Protein , Histocompatibility Antigens/metabolism , Humans , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/metabolism , Protein Binding , Protein Interaction Maps , Proteins/chemistry , Reproducibility of Results
19.
Insect Sci ; 28(4): 976-986, 2021 Aug.
Article in English | MEDLINE | ID: mdl-32537916

ABSTRACT

Planthoppers are the most notorious rice pests, because they transmit various rice viruses in a persistent-propagative manner. Protein-protein interactions (PPIs) between virus and vector are crucial for virus transmission by vector insects. However, the number of known PPIs for pairs of rice viruses and planthoppers is restricted by low-throughput research methods. In this study, we applied DeNovo, a virus-host sequence-based PPI predictor, to predict potential PPIs at a genome-wide scale between three planthoppers and five rice viruses. PPIs were identified at two different confidence thresholds, referred to as low and high modes. The number of PPIs for the five planthopper-virus pairs ranged from 506 to 1985 in the low mode and from 1254 to 4286 in the high mode. After eliminating the "one-to-many" redundant interacting information, the PPIs with unique planthopper proteins were reduced to 343-724 in the low mode and 758-1671 in the high mode. Homology analysis showed that 11 sets and 31 sets of homologous planthopper proteins were shared by all planthopper-virus interactions in the two modes, indicating that they are potential conserved vector factors essential for transmission of rice viruses. Ten PPIs between small brown planthopper (SBPH) and rice stripe virus (RSV) were verified using glutathione-S-transferase (GST)/His-pull down or co-immunoprecipitation assay. Five of the ten PPIs were proven positive, and three of the five SBPH proteins were confirmed to interact with RSV. The predicted PPIs provide new clues for further studies of the complicated relationship between rice viruses and their vector insects.


Subjects
Hemiptera/virology , Host Microbial Interactions , Oryza/virology , Plant Viruses , Animals , Hemiptera/genetics , Hemiptera/metabolism , Immunoprecipitation/methods , Insect Proteins/metabolism , Insect Vectors/genetics , Insect Vectors/metabolism , Insect Vectors/virology , Oryza/metabolism , Plant Diseases/virology , Plant Viruses/genetics , Plant Viruses/metabolism , Protein Interaction Maps , Tenuivirus/genetics , Tenuivirus/metabolism
20.
Nucleic Acids Res ; 48(W1): W529-W537, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32232369

ABSTRACT

High throughput DNA sequencing in combination with efficient algorithms could provide the basis for a highly resolved, genome phylogeny-based and digital prokaryotic taxonomy. However, current taxonomic practice continues to rely on cumbersome journal publications for the description of new species, which still constitute the smallest taxonomic units. In response, we introduce LINbase, a web server that allows users to genomically circumscribe any group of prokaryotes with measurable DNA similarity and that uses the individual isolate as smallest unit. Since LINbase leverages the concept of Life Identification Numbers (LINs), which are codes assigned to individual genomes based on reciprocal average nucleotide identity, we refer to groups circumscribed in LINbase as LINgroups. Users can associate with each LINgroup a name, a short description, and a URL to a peer-reviewed publication. As soon as a LINgroup is circumscribed, any user can immediately identify query genomes as members and submit comments about the LINgroup. Most genomes currently in LINbase were imported from GenBank, but users can upload their own genome sequences as well. In conclusion, LINbase combines the resolution of LINs with the power of crowdsourcing in support of a highly resolved, genome phylogeny-based digital taxonomy. LINbase is available at http://www.LINbase.org.


Subjects
Bacteria/classification , Genome, Bacterial , Software , Algorithms , Bacteria/genetics , Bacteria/isolation & purification , Genome, Archaeal , Genomics/methods , Internet , Phylogeny