ABSTRACT
BACKGROUND: Transcriptomics has identified at-arrival differentially expressed genes associated with bovine respiratory disease (BRD) development; however, their use as prediction molecules necessitates further evaluation. Therefore, we aimed to selectively analyze and corroborate at-arrival mRNA expression from multiple independent populations of beef cattle. In a nested case-control study, we evaluated the expression of 56 mRNA molecules from at-arrival blood samples of 234 cattle across seven populations via NanoString nCounter gene expression profiling. Analysis of mRNA was performed with nSolver Advanced Analysis software (p < 0.05), comparing cattle groups based on the diagnosis of clinical BRD within 28 days of facility arrival (n = 115 Healthy; n = 119 BRD); BRD was further stratified for severity based on frequency of treatment and/or mortality (Treated_1, n = 89; Treated_2+, n = 30). Gene expression homogeneity of variance, receiver operating characteristic (ROC) curve, and decision tree analyses were performed between severity cohorts. RESULTS: Increased expression of mRNAs involved in specialized pro-resolving mediator synthesis (ALOX15, HPGD), leukocyte differentiation (LOC100297044, GCSAML, KLF17), and antimicrobial peptide production (CATHL3, GZMB, LTF) was identified in Healthy cattle. BRD cattle possessed increased expression of CFB, and of mRNA related to granulocytic processes (DSG1, LRG1, MCF2L) and type I interferon activity (HERC6, IFI6, ISG15, MX1). Healthy and Treated_1 cattle were similar in terms of gene expression, while Treated_2+ cattle were the most distinct. ROC cutoffs were used to generate an at-arrival treatment decision tree, which classified 90% of Treated_2+ individuals. CONCLUSIONS: Increased expression of complement factor B, pro-inflammatory, and type I interferon-associated mRNA are hallmarks of the at-arrival expression patterns of cattle that develop severe clinical BRD.
Here, we corroborate at-arrival mRNA markers identified in previous transcriptome studies and generate a prediction model to be evaluated in future studies. Further research is necessary to evaluate these expression patterns in a prospective manner.
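The step from ROC cutoffs to a treatment decision rule can be sketched as follows. This is a minimal illustration only: the abstract does not say how cutoffs were chosen, so Youden's J statistic is an assumed criterion, the expression values and the MX1 cutoff are made-up numbers, and the function names are hypothetical. A chain of per-marker threshold splits stands in for the decision tree.

```python
from typing import Dict, List, Tuple


def youden_cutoff(values: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    A sample is called positive when its value is >= cutoff.
    Youden's J is an assumed criterion, not one stated in the abstract.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    best_cutoff, best_j = float("nan"), -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def flag_high_risk(sample: Dict[str, float], cutoffs: Dict[str, float]) -> bool:
    """Chain of threshold splits: flag only if every marker meets its cutoff."""
    return all(sample[gene] >= cut for gene, cut in cutoffs.items())


# Hypothetical normalized counts for one marker (illustrative numbers only).
expression = [2.1, 2.4, 2.2, 3.9, 4.1, 3.0, 4.4, 2.0]
severe = [0, 0, 0, 1, 1, 0, 1, 0]  # 1 = treated twice or more and/or died

cutoff, j_stat = youden_cutoff(expression, severe)
cutoffs = {"CFB": cutoff, "MX1": 5.0}  # the MX1 cutoff is hypothetical
flagged = flag_high_risk({"CFB": 4.2, "MX1": 5.6}, cutoffs)
```

Requiring every marker to exceed its cutoff favors specificity; a real tree would be fit to data and could branch differently at each node.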
Subjects
Bovine Respiratory Disease Complex; Cattle Diseases; Animals; Bovine Respiratory Disease Complex/diagnosis; Bovine Respiratory Disease Complex/genetics; Case-Control Studies; Cattle; Cattle Diseases/diagnosis; Prospective Studies; RNA, Messenger/genetics; Transcriptome
ABSTRACT
Microsatellites are common in the genomes of most eukaryotic species. Due to their high mutability, an adaptive role for microsatellites has been considered. However, little is known concerning the contribution of microsatellites towards phenotypic variation. We used populations of the common sunflower (Helianthus annuus) at two latitudes to quantify the effect of microsatellite allele length on phenotype at the level of gene expression. We conducted a common garden experiment with seed collected from sunflower populations in Kansas and Oklahoma followed by an RNA-Seq experiment on 95 individuals. The effect of microsatellite allele length on gene expression was assessed across 3,325 microsatellites that could be consistently scored. Our study revealed 479 microsatellites at which allele length significantly correlates with gene expression (eSTRs). When irregular allele sizes not conforming to the motif length were removed, the number of eSTRs rose to 2,379. The percentage of variation in gene expression explained by eSTRs ranged from 1% to 86% when controlling for population and allele-by-population interaction effects at the 479 eSTRs. Of these eSTRs, 70.4% are in untranslated regions (UTRs). A gene ontology (GO) analysis revealed that eSTRs are significantly enriched for GO terms associated with cis- and trans-regulatory processes. Our findings suggest that a substantial number of transcribed microsatellites can influence gene expression.
Subjects
Genetics, Population; Helianthus; Microsatellite Repeats; Alleles; Gene Expression; Helianthus/genetics; Kansas; Oklahoma
ABSTRACT
The mechanisms by which natural populations generate adaptive genetic variation are not well understood. Some studies propose that microsatellites can function as drivers of adaptive variation. Here, we tested a potentially adaptive role for transcribed microsatellites with natural populations of the common sunflower (Helianthus annuus L.) by assessing the enrichment of microsatellites in genes that show expression divergence across latitudes. Seeds collected from six populations at two distinct latitudes in Kansas and Oklahoma were planted and grown in a common garden. Morphological measurements from the common garden demonstrated that phenotypic variation among populations is largely explained by underlying genetic variation. An RNA-Seq experiment was conducted with 96 of the individuals grown in the common garden, and differentially expressed (DE) transcripts between the two latitudes were identified. A total of 825 DE transcripts were identified. DE transcripts and nondifferentially expressed (NDE) transcripts were then scanned for microsatellites. The abundance of different motif lengths and types in both groups was estimated. Our results indicate that DE transcripts are significantly enriched with mononucleotide repeats and significantly depauperate in trinucleotide repeats. Further, the standardized mononucleotide repeat motif A and dinucleotide repeat motif AG were significantly enriched within DE transcripts, while the motif types C, AT, ACC, and AAC in DE transcripts are significantly differentiated in microsatellite tract length between the two latitudes. The tract length differentiation at specific microsatellite motif types across latitudes and their enrichment within DE transcripts indicate a potential functional role for transcribed microsatellites in gene expression divergence in sunflower.
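Scanning transcripts for microsatellites and standardizing their motifs can be illustrated with a short regular-expression sketch. It is not the authors' pipeline: the minimum repeat counts, the restriction to perfect repeats of 1 to 3 nt, and the standardization rule (lexicographically smallest rotation of the motif or its reverse complement, a common convention) are all assumptions, and the test sequence is made up.

```python
import re

# Minimum number of repeats per motif length (illustrative assumptions).
MIN_REPEATS = {1: 10, 2: 6, 3: 4}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def standardize(motif):
    """Smallest rotation of the motif or of its reverse complement."""
    rev = motif.translate(_COMPLEMENT)[::-1]
    rotations = [m[i:] + m[:i] for m in (motif, rev) for i in range(len(m))]
    return min(rotations)


def find_microsatellites(seq):
    """Return sorted (start, standardized_motif, repeat_count) tuples
    for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # "AA" or "AAA" repeats are mononucleotide runs, counted once above.
            if motif_len > 1 and len(set(motif)) == 1:
                continue
            repeats = len(match.group(0)) // motif_len
            hits.append((match.start(), standardize(motif), repeats))
    return sorted(hits)


# Made-up transcript fragment with one mono-, di-, and trinucleotide repeat.
transcript = "GGC" + "A" * 12 + "CT" + "AG" * 7 + "TTG" + "ACC" * 5 + "G"
hits = find_microsatellites(transcript)
```

Counting the hits by motif length in the DE and NDE transcript sets would then give the abundance tables that an enrichment test compares.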
Subjects
Gene Expression Profiling; Gene Expression Regulation, Plant; Helianthus/genetics; Microsatellite Repeats/physiology; Adaptation, Biological; Genes, Plant; Genetic Variation; Helianthus/growth & development; Helianthus/metabolism; Kansas; Oklahoma; Phenotype; Sequence Analysis, RNA
ABSTRACT
High-throughput transcriptomics was used to identify Fibroporia radiculosa genes that were differentially regulated during colonization of wood treated with a copper-based preservative. The transcriptome was profiled at two time points while the fungus was growing on wood treated with micronized copper quat (MCQ). A total of 917 transcripts were differentially expressed. Fifty-eight of these genes were more highly expressed when the MCQ was protecting the wood from strength loss and had putative functions related to oxalate production/degradation, laccase activity, quinone biosynthesis, pectin degradation, ATP production, cytochrome P450 activity, signal transduction, and transcriptional regulation. Sixty-one genes were more highly expressed when the MCQ lost its effectiveness (>50% strength loss) and had functions related to oxalate degradation; cytochrome P450 activity; H2O2 production and degradation; degradation of cellulose, hemicellulose, and pectin; hexose transport; membrane glycerophospholipid metabolism; and cell wall chemistry. Ten of these differentially regulated genes were quantified by reverse transcriptase PCR for a more in-depth study (4 time points on wood with or without MCQ treatment). Our results showed that MCQ induced higher than normal levels of expression for four genes (putative annotations for isocitrate lyase, glyoxylate dehydrogenase, laccase, and oxalate decarboxylase 1), while four other genes (putative annotations for oxalate decarboxylase 2, aryl alcohol oxidase, glycoside hydrolase 5, and glycoside hydrolase 10) were repressed. The significance of these results is that we have identified several genes that appear to be coregulated, with putative functions related to copper tolerance and/or wood decay.
Subjects
Copper/toxicity; Gene Expression Profiling; Gene Expression Regulation, Fungal; Polyporaceae/drug effects; Polyporaceae/genetics; Stress, Physiological; Metabolic Networks and Pathways/genetics; Polyporaceae/growth & development; Real-Time Polymerase Chain Reaction; Reverse Transcriptase Polymerase Chain Reaction; Wood/microbiology
ABSTRACT
BACKGROUND: A wealth of clustering algorithms has been applied to gene co-expression experiments. These algorithms cover a broad range of approaches, from conventional techniques such as k-means and hierarchical clustering, to graphical approaches such as k-clique communities, weighted gene co-expression networks (WGCNA) and paraclique. Comparison of these methods to evaluate their relative effectiveness provides guidance to algorithm selection, development and implementation. Most prior work on comparative clustering evaluation has focused on parametric methods. Graph theoretical methods are recent additions to the tool set for the global analysis and decomposition of microarray co-expression matrices that have not generally been included in earlier methodological comparisons. In the present study, a variety of parametric and graph theoretical clustering algorithms are compared using well-characterized transcriptomic data at a genome scale from Saccharomyces cerevisiae. METHODS: For each clustering method under study, a variety of parameters were tested. Jaccard similarity was used to measure each cluster's agreement with every GO and KEGG annotation set, and the highest Jaccard score was assigned to the cluster. Clusters were grouped into small, medium, and large bins, and the Jaccard scores of the top five scoring clusters in each bin were averaged and reported as the best average top 5 (BAT5) score for the particular method. RESULTS: Clusters produced by each method were evaluated based upon the positive match to known pathways. This produces a readily interpretable ranking of the relative effectiveness of clustering on the genes. Methods were also tested to determine whether they were able to identify clusters consistent with those identified by other clustering methods.
CONCLUSIONS: Validation of clusters against known gene classifications demonstrates that for these data, graph-based techniques outperform conventional clustering approaches, suggesting that further development and application of combinatorial strategies is warranted.
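The scoring scheme described in METHODS can be sketched as below: each cluster receives its highest Jaccard score over all annotation sets, and the top-scoring clusters within one size bin are averaged. The gene and annotation names are invented for illustration, the function names are hypothetical, and the bin boundaries (which the abstract does not give) are left to the caller.

```python
def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two gene sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_annotation_score(cluster, annotation_sets):
    """Highest Jaccard score of a cluster against any annotation set."""
    return max(jaccard(cluster, genes) for genes in annotation_sets.values())


def average_top_scores(clusters, annotation_sets, top_n=5):
    """Average of the top_n cluster scores within one size bin."""
    scores = sorted(
        (best_annotation_score(c, annotation_sets) for c in clusters),
        reverse=True,
    )
    top = scores[:top_n]
    return sum(top) / len(top) if top else 0.0


# Invented annotation sets and clusters, all from one hypothetical size bin.
annotations = {
    "GO:example": {"g1", "g2", "g3", "g4"},
    "KEGG:example": {"g5", "g6"},
}
clusters = [{"g1", "g2", "g3"}, {"g5", "g6"}, {"g7", "g8"}]

bin_score = average_top_scores(clusters, annotations, top_n=2)
```

Taking the maximum of this bin score over all parameter settings tried for a method would give one reading of the "best" in BAT5; the abstract does not spell that step out.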
Subjects
Algorithms; Computational Biology/methods; Gene Expression Profiling/methods; Cluster Analysis; Genome, Fungal; Oligonucleotide Array Sequence Analysis/methods; Saccharomyces cerevisiae/genetics
ABSTRACT
The feasibility of short-read sequencing for genomic analysis was demonstrated for Fibroporia radiculosa, a copper-tolerant fungus that causes brown rot decay of wood. The effect of read quality on genomic assembly was assessed by filtering Illumina GAIIx reads from a single run of a paired-end library (75-nucleotide read length and 300-bp fragment size) at three different stringency levels and then assembling each data set with Velvet. A simple approach was devised to determine which filter stringency was "best." Venn diagrams identified the regions containing reads that were used in an assembly but were of low enough quality to be removed by a filter. By plotting base quality histograms of reads in this region, we judged whether a filter was too stringent or not stringent enough. Our best assembly had a genome size of 33.6 Mb, an N50 of 65.8 kb for a k-mer of 51, and a maximum contig length of 347 kb. Using GeneMark, 9,262 genes were predicted. TargetP and SignalP analyses showed that among the 1,213 genes with secreted products, 986 had motifs for signal peptides and 227 had motifs for signal anchors. Blast2GO analysis provided functional annotation for 5,407 genes. We identified 29 genes with putative roles in copper tolerance and 73 genes for lignocellulose degradation. A search for homologs of these 102 genes showed that F. radiculosa exhibited more similarity to Postia placenta than to Serpula lacrymans. Notable differences were found, however, and their involvements in copper tolerance and wood decay are discussed.
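For readers unfamiliar with the N50 statistic quoted above, the following sketch shows the standard definition: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the assembly size. The contig lengths in the example are invented and the function name is illustrative.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Invented contig lengths (kb); total is 290, so the halfway point is 145.
example_n50 = n50([80, 70, 50, 40, 30, 20])
```

Here the two longest contigs (80 + 70 = 150) are enough to pass the halfway point, so the N50 is 70.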
Subjects
Fungal Proteins/genetics; Genome, Fungal/genetics; Genomics/methods; Polyporaceae/genetics; Sequence Analysis, DNA/methods; Wood/microbiology; Computational Biology/methods; Copper/metabolism; Copper/pharmacology; Fungal Proteins/metabolism; Gene Expression Profiling; Genome Size; Lignin/metabolism; Polyporaceae/drug effects; Wood/metabolism
ABSTRACT
Bovine respiratory disease (BRD), the leading disease complex in beef cattle production systems, remains highly elusive regarding diagnostics and disease prediction. Previous research has employed cellular and molecular techniques to describe hematological and gene expression variation that coincides with BRD development. Here, we utilized weighted gene co-expression network analysis (WGCNA) to leverage total gene expression patterns from cattle at arrival and generate hematological and clinical trait associations to describe mechanisms that may predict BRD development. Gene expression counts of previously published RNA-Seq data from 23 cattle (2017; n = 11 Healthy, n = 12 BRD) were used to construct gene co-expression modules and correlation patterns with complete blood count (CBC) and clinical datasets. Modules were further evaluated for cross-populational preservation of expression with RNA-Seq data from 24 cattle in an independent population (2019; n = 12 Healthy, n = 12 BRD). Genes within well-preserved modules were subject to functional enrichment analysis for significant Gene Ontology terms and pathways. Genes which possessed high module membership and association with BRD development, regardless of module preservation ("hub genes"), were utilized for protein-protein physical interaction network and clustering analyses. Five well-preserved modules of co-expressed genes were identified. One module ("steelblue"), involved in alpha-beta T-cell complexes and Th2-type immunity, possessed significant correlation with increased erythrocytes, platelets, and BRD development. One module ("purple"), involved in mitochondrial metabolism and rRNA maturation, possessed significant correlation with increased eosinophils, fecal egg count per gram, and weight gain over time. Fifty-two interacting hub genes, stratified into 11 clusters, may possess transient function involved in BRD development not previously described in literature. 
This study identifies co-expressed genes and coordinated mechanisms associated with BRD, which necessitates further investigation in BRD-prediction research.
Subjects
Bovine Respiratory Disease Complex; Cattle Diseases; Respiration Disorders; Respiratory Tract Diseases; Cattle; Animals; Respiratory Tract Diseases/genetics; Respiratory System; Gene Regulatory Networks; Weight Gain/genetics; Bovine Respiratory Disease Complex/genetics
ABSTRACT
Bovine respiratory disease (BRD) is a multifactorial disease involving complex host immune interactions shaped by pathogenic agents and environmental factors. Advancements in RNA sequencing and associated analytical methods are improving our understanding of host response related to BRD pathophysiology. Supervised machine learning (ML) approaches present one such method for analyzing new and previously published transcriptome data to identify novel disease-associated genes and mechanisms. Our objective was to apply ML models to lung and immunological tissue datasets acquired from previous clinical BRD experiments to identify genes that classify disease with high accuracy. Raw mRNA sequencing reads from 151 bovine datasets (n = 123 BRD, n = 28 control) were downloaded from NCBI-GEO. Quality filtered reads were assembled in a HISAT2/Stringtie2 pipeline. Raw gene counts for ML analysis were normalized, transformed, and analyzed with MLSeq, utilizing six ML models. Cross-validation parameters (fivefold, repeated 10 times) were applied to 70% of the compiled datasets for ML model training and parameter tuning; optimized ML models were tested with the remaining 30%. Downstream analysis of significant genes identified by the top ML models, based on classification accuracy for each etiological association, was performed within WebGestalt and Reactome (FDR ≤ 0.05). Nearest shrunken centroid and Poisson linear discriminant analysis with power transformation models identified 154 and 195 significant genes for IBR and BRSV, respectively; from these genes, the two ML models discriminated IBR and BRSV with 100% accuracy compared to sham controls. Significant genes classified by the top ML models in IBR (154) and BRSV (195), but not BVDV (74), were related to type I interferon production and IL-8 secretion, specifically in lymphoid tissue and not homogenized lung tissue. 
Genes identified in Mannheimia haemolytica infections (97) were involved in activating classical and alternative pathways of complement. Novel findings, including expression of genes related to reduced mitochondrial oxygenation and ATP synthesis in consolidated lung tissue, were discovered. Genes identified in each analysis represent distinct genomic events relevant to understanding and predicting clinical BRD. Our analysis demonstrates the utility of ML with published datasets for discovering functional information to support the prediction and understanding of clinical BRD.
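The data partitioning described above (70% of datasets for training with fivefold cross-validation repeated 10 times, 30% held out for testing) can be sketched with the Python standard library. This is only an illustration of the resampling scheme: the study used the R package MLSeq, none of whose API is reproduced here, and the random seed, function names, and unstratified shuffling are assumptions.

```python
import random


def train_test_split(indices, test_fraction, rng):
    """Shuffle and hold out a fraction of samples for final testing."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def repeated_kfold(indices, k, repeats, rng):
    """Yield (training, validation) index lists for repeated k-fold CV."""
    for _ in range(repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]  # k near-equal folds
        for i in range(k):
            validation = folds[i]
            training = [x for j, fold in enumerate(folds) if j != i for x in fold]
            yield training, validation


rng = random.Random(0)              # fixed seed, chosen arbitrarily
samples = list(range(151))          # one index per dataset in the abstract
train, test = train_test_split(samples, 0.30, rng)
splits = list(repeated_kfold(train, k=5, repeats=10, rng=rng))
```

With 123 BRD and 28 control datasets, a stratified split that preserves the class ratio in every fold would be a sensible refinement; it is omitted here to keep the sketch short.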
Subjects
Bovine Respiratory Disease Complex/genetics; Computational Biology; Gene Expression Profiling; Gene Regulatory Networks; RNA-Seq; Supervised Machine Learning; Transcriptome; Animals; Bovine Respiratory Disease Complex/immunology; Bovine Respiratory Disease Complex/microbiology; Bovine Respiratory Disease Complex/virology; Cattle; Databases, Genetic; Host-Pathogen Interactions; Lung/immunology; Lung/microbiology; Lung/virology
ABSTRACT
BACKGROUND: Despite decades of extensive research, bovine respiratory disease (BRD) remains the most devastating disease in beef cattle production. Establishing a clinical diagnosis often relies upon visual detection of non-specific signs, leading to low diagnostic accuracy. Thus, post-weaned beef cattle are often metaphylactically administered antimicrobials at facility arrival, which poses concerns regarding antimicrobial stewardship and resistance. Additionally, there is a lack of high-quality research that addresses the gene-by-environment interactions that underlie why some cattle that develop BRD die while others survive. Therefore, it is necessary to decipher the underlying host genomic factors associated with BRD mortality versus survival to help determine BRD risk and severity. Using transcriptomic analysis of at-arrival whole blood samples from cattle that died of BRD, as compared to those that developed signs of BRD but lived (n = 3 DEAD, n = 3 ALIVE), we identified differentially expressed genes (DEGs) and associated pathways in cattle that died of BRD. Additionally, we evaluated unmapped reads, which are often overlooked within transcriptomic experiments. RESULTS: 69 DEGs (FDR<0.10) were identified between ALIVE and DEAD cohorts. Several DEGs possess immunological and proinflammatory function and associations with TLR4 and IL6. Biological processes, pathways, and disease phenotype associations related to type-I interferon production and antiviral defense were enriched in DEAD cattle at arrival. Unmapped reads aligned primarily to various ungulate assemblies, but failed to align to viral assemblies. CONCLUSION: This study further revealed increased proinflammatory immunological mechanisms in cattle that develop BRD. DEGs upregulated in DEAD cattle were predominantly involved in innate immune pathways typically associated with antiviral defense, although no viral genes were identified within unmapped reads. 
Our findings provide genomic targets for further analysis in cattle at highest risk of BRD, suggesting that mechanisms related to type I interferons and antiviral defense may be indicative of viral respiratory disease at arrival and contribute to eventual BRD mortality.
Subjects
Antiviral Agents/metabolism; Bovine Respiratory Disease Complex/pathology; Interferon Type I/metabolism; Transcriptome; Animals; Antiviral Agents/therapeutic use; Bovine Respiratory Disease Complex/drug therapy; Bovine Respiratory Disease Complex/metabolism; Bovine Respiratory Disease Complex/mortality; Cattle; Contig Mapping; Gene Expression Profiling; Male; Phenotype; Protein Interaction Maps/genetics; Toll-Like Receptor 4/metabolism
ABSTRACT
Bovine respiratory disease (BRD) remains the leading infectious disease in post-weaned beef cattle. The objective of this investigation was to contrast the at-arrival blood transcriptomes from cattle derived from two distinct populations that developed BRD in the 28 days following arrival versus cattle that did not. Forty-eight blood samples from two populations were selected for mRNA sequencing based on even distribution of development (n = 24) or lack of (n = 24) clinical BRD within 28 days following arrival; cattle which developed BRD were further stratified into BRD severity cohorts based on frequency of antimicrobial treatment: treated once (treated_1) or treated twice or more and/or died (treated_2+). Sequenced reads (~50 M/sample, 150 bp paired-end) were aligned to the ARS-UCD1.2 bovine genome assembly. One hundred and thirty-two unique differentially expressed genes (DEGs) were identified between groups stratified by disease severity (healthy, n = 24; treated_1, n = 13; treated_2+, n = 11) with edgeR (FDR ≤ 0.05). Differentially expressed genes in treated_1 relative to both healthy and treated_2+ were predicted to increase neutrophil activation, cellular cornification/keratinization, and antimicrobial peptide production. Differentially expressed genes in treated_2+ relative to both healthy and treated_1 were predicted to increase alternative complement activation, decrease leukocyte activity, and increase nitric oxide production. Receiver operating characteristic (ROC) curves generated from expression data for six DEGs identified in our current and previous studies (MARCO, CFB, MCF2L, ALOX15, LOC100335828 (aka CD200R1), and SLC18A2) demonstrated good-to-excellent (AUC: 0.800-0.899; ≥ 0.900) predictability for classifying disease occurrence and severity. This investigation identifies candidate biomarkers and functional mechanisms in at-arrival blood that predicted development and severity of BRD.
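The AUC bands quoted above can be made concrete with a small sketch. The area under the ROC curve equals the probability that a randomly chosen case scores higher than a randomly chosen control (ties counted as one half), which is how it is computed here. The scores and labels are invented, and the function names are illustrative; only the two band boundaries come from the abstract.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of case/control pairs ranked correctly (ties = 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > h else 0.5 if c == h else 0.0
        for c in cases
        for h in controls
    )
    return wins / (len(cases) * len(controls))


def grade(auc):
    """Bands as given in the abstract: 0.800-0.899 good, >= 0.900 excellent."""
    if auc >= 0.900:
        return "excellent"
    if auc >= 0.800:
        return "good"
    return "below good"


# Invented expression-derived scores; 1 = developed BRD, 0 = remained healthy.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]

auc = roc_auc(scores, labels)
band = grade(auc)
```

In this toy example eight of the nine case/control pairs are ranked correctly, so the AUC is 8/9 and falls in the "good" band.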
Subjects
Cattle Diseases/genetics; Cattle/genetics; Respiratory Tract Infections/genetics; Transcriptome; Animals; Biomarkers/metabolism; Cattle/physiology; Respiratory Tract Infections/veterinary
ABSTRACT
Bovine respiratory disease (BRD) is a multifactorial disease complex and the leading infectious disease in post-weaned beef cattle. Clinical manifestations of BRD are recognized in beef calves within a high-risk setting, commonly associated with weaning, shipping, and novel feeding and housing environments. However, the understanding of complex host immune interactions and genomic mechanisms involved in BRD susceptibility remains elusive. Utilizing high-throughput RNA-sequencing, we contrasted the at-arrival blood transcriptomes of 6 beef cattle that ultimately developed BRD against 5 beef cattle that remained healthy within the same herd, differentiating BRD diagnosis from production metadata and treatment records. We identified 135 differentially expressed genes (DEGs) using the differential gene expression tools edgeR and DESeq2. Thirty-six of the DEGs shared between these two analysis platforms were prioritized for investigation of their relevance to infectious disease resistance using WebGestalt, STRING, and Reactome. Biological processes related to inflammatory response, immunological defense, lipoxin metabolism, and macrophage function were identified. Production of specialized pro-resolving mediators (SPMs) and endogenous metabolism of angiotensinogen were increased in animals that resisted BRD. Protein-protein interaction modeling of gene products with significantly higher expression in cattle that naturally acquire BRD identified molecular processes involving microbial killing. Accordingly, identification of DEGs in whole blood at arrival revealed a clear distinction between calves that went on to develop BRD and those that resisted BRD. These results provide novel insight into host immune factors that are present at the time of arrival that confer protection from BRD.
Subjects
Cattle Diseases/diagnosis; Disease Resistance/genetics; Gene Expression Profiling/methods; Respiratory Tract Diseases/diagnosis; Angiotensinogen/metabolism; Animals; Case-Control Studies; Cattle; Cattle Diseases/blood; Cattle Diseases/genetics; Gene Expression Regulation; Gene Regulatory Networks; Protein Interaction Maps/genetics; RNA/chemistry; RNA/genetics; RNA/metabolism; Respiratory Tract Diseases/blood; Respiratory Tract Diseases/genetics; Sequence Analysis, RNA; Signal Transduction/genetics
ABSTRACT
BACKGROUND: Gene co-expression networks are often constructed by computing some measure of similarity between expression levels of gene transcripts and subsequently applying a high-pass filter to remove all but the most likely biologically-significant relationships. The selection of this expression threshold necessarily has a significant effect on any conclusions derived from the resulting network. Many approaches have been taken to choose an appropriate threshold, among them computing levels of statistical significance, accepting only the top one percent of relationships, and selecting an arbitrary expression cutoff. RESULTS: We apply spectral graph theory methods to develop a systematic method for threshold selection. Eigenvalues and eigenvectors are computed for a transformation of the adjacency matrix of the network constructed at various threshold values. From these, we use a basic spectral clustering method to examine the set of gene-gene relationships and select a threshold dependent upon the community structure of the data. This approach is applied to two well-studied microarray data sets from Homo sapiens and Saccharomyces cerevisiae. CONCLUSION: This method presents a systematic, data-based alternative to using more artificial cutoff values and results in a more conservative approach to threshold selection than some other popular techniques such as retaining only statistically-significant relationships or setting a cutoff to include a percentage of the highest correlations.
Subjects
Computational Biology/methods; Gene Regulatory Networks; Gene Expression Profiling/methods; Humans; Oligonucleotide Array Sequence Analysis/methods; Saccharomyces cerevisiae/genetics
ABSTRACT
BACKGROUND: This paper presents a framework for integrating disparate data sets to predict gene function. The algorithm constructs a graph, called an integrated similarity graph, by computing similarities based upon both gene expression and textual phenotype data. This integrated graph is then used to make predictions about whether individual genes should be assigned a particular annotation from the Gene Ontology. RESULTS: A combined graph was generated from publicly-available gene expression data and phenotypic information from Saccharomyces cerevisiae. This graph was used to assign annotations to genes, as were graphs constructed from gene expression data and textual phenotype information alone. While the F-measure appeared similar for all three methods, annotations based upon the integrated similarity graph exhibited a better overall precision than gene expression or phenotype information alone can generate. The integrated approach was also able to assign almost as many annotations as the gene expression method alone, and generated significantly more total and correct assignments than the phenotype information could provide. CONCLUSION: These results suggest that augmenting standard gene expression data sets with publicly-available textual phenotype data can help generate more precise functional annotation predictions while mitigating the weaknesses of a standard textual phenotype approach.
Subjects
Computational Biology/methods; Gene Expression; Genomics/methods; Phenotype; Algorithms; Saccharomyces cerevisiae/genetics
ABSTRACT
Aeromonas veronii is a gram-negative species abundant in aquatic environments that causes disease in humans as well as terrestrial and aquatic animals. In the current study, 41 publicly available A. veronii genomes were compared to investigate distribution of putative virulence genes, global dissemination of pathotypes, and potential mechanisms of virulence. The complete genome of A. veronii strain ML09-123 from an outbreak of motile Aeromonas septicemia in farm-raised catfish in the southeastern United States was included. Dissemination of A. veronii strain types was discovered in dispersed geographical locations. Isolate ML09-123 is highly similar to Chinese isolate TH0426, suggesting the two strains have a common origin and may represent a pathotype impacting aquaculture in both countries. Virulence of strain ML09-123 in catfish in a dose-dependent manner was confirmed experimentally. Subsystem category disposition showed the majority of genomes exhibit similar distribution of genomic elements. The type I secretion system (T1SS), type II secretion system (T2SS), type 4 pilus (T4P), and flagellum core elements are conserved in all A. veronii genomes, whereas the type III secretion system (T3SS), type V secretion system (T5SS), type VI secretion system (T6SS), and tight adherence (TAD) system demonstrate variable dispersal. Distribution of mobile elements is dependent on host and geographic origin, suggesting this species has undergone considerable genetic exchange. The data presented here lend insight into the genomic variation of A. veronii and identify a pathotype impacting aquaculture globally.
Subjects
Aeromonas veronii/genetics; Aeromonas veronii/pathogenicity; Genomics; Gram-Negative Bacterial Infections/genetics; Virulence Factors/genetics; Water Microbiology; Aeromonas veronii/isolation & purification; Animals; Aquaculture; Humans
ABSTRACT
Genes with common functions often exhibit correlated expression levels, which can be used to identify sets of interacting genes from microarray data. Microarrays typically measure expression across genomic space, creating a massive matrix of co-expression that must be mined to extract only the most relevant gene interactions. We describe a graph theoretical approach to extracting co-expressed sets of genes, based on the computation of cliques. Unlike the results of traditional clustering algorithms, cliques are not disjoint and allow genes to be assigned to multiple sets of interacting partners, consistent with biological reality. A graph is created by thresholding the correlation matrix to include only the correlations most likely to signify functional relationships. Cliques computed from the graph correspond to sets of genes for which significant edges are present between all members of the set, representing potential members of common or interacting pathways. Clique membership can be used to infer the function of poorly annotated genes, based on the known functions of better-annotated genes with which they share clique membership (i.e., "guilt-by-association"). We illustrate our method by applying it to microarray data collected from the spleens of mice exposed to low-dose ionizing radiation. Differential analysis is used to identify sets of genes whose interactions are impacted by radiation exposure. The correlation graph is also queried independently of the cliques to extract edges that are impacted by radiation. We present several examples of multiple gene interactions that are altered by radiation exposure and thus represent potential molecular pathways that mediate the radiation response.
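The two steps described above, thresholding a correlation matrix into a graph and enumerating its cliques, can be sketched as follows. The Bron-Kerbosch algorithm with pivoting is used here as a standard way to list maximal cliques; the abstract does not name the algorithm the authors used. The gene names, correlation values, and the 0.8 cutoff are invented for illustration.

```python
def threshold_graph(corr, genes, cutoff):
    """Adjacency sets with an edge wherever |correlation| >= cutoff."""
    adj = {g: set() for g in genes}
    for i, a in enumerate(genes):
        for j in range(i + 1, len(genes)):
            if abs(corr[i][j]) >= cutoff:
                adj[a].add(genes[j])
                adj[genes[j]].add(a)
    return adj


def maximal_cliques(adj):
    """List all maximal cliques (Bron-Kerbosch with pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))  # r cannot be extended: maximal
            return
        # Pivot on the vertex with the most neighbors among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


genes = ["g1", "g2", "g3", "g4", "g5"]
corr = [  # invented symmetric correlation matrix
    [1.00, 0.90, 0.85, 0.10, 0.20],
    [0.90, 1.00, 0.88, 0.30, 0.10],
    [0.85, 0.88, 1.00, -0.90, 0.82],
    [0.10, 0.30, -0.90, 1.00, 0.95],
    [0.20, 0.10, 0.82, 0.95, 1.00],
]

cliques = maximal_cliques(threshold_graph(corr, genes, cutoff=0.8))
```

Gene g3 ends up in both cliques, which is the non-disjoint membership the abstract contrasts with traditional clustering.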
Subjects
Algorithms , Computational Biology/methods , Gene Expression Regulation/radiation effects , Proteins/genetics , Proteins/metabolism , Animals , Cell Line , Computer Simulation , Dose-Response Relationship, Radiation , Humans , Mice , Oligonucleotide Array Sequence Analysis , Radiation Dosage , Radiation, Ionizing
ABSTRACT
The following sections are included: Bioinformatics is a Mature Discipline; The Golden Era of Bioinformatics Has Begun; No-Boundary Thinking in Bioinformatics; References.
Subjects
Computational Biology/trends , Humans
ABSTRACT
To address important challenges in bioinformatics, high-throughput data technologies are needed to interpret biological data efficiently and reliably. Clustering is widely used as a first step in interpreting high-dimensional biological data, such as the gene expression data measured by microarrays. A good clustering algorithm should be efficient, reliable, and effective, as demonstrated by its capability of determining biologically relevant clusters. This paper proposes a new minimum spanning tree-based heuristic, B-MST, that is guided by an innovative objective function: the tightness and separation index (TSI). The TSI presented here obtains biologically meaningful clusters by making use of co-expression network topology, and this paper develops a local search procedure to minimize the TSI value. The proposed B-MST is evaluated using (1) the adjusted Rand index (ARI), for microarray data sets with known object classes, and (2) gene ontology (GO) annotations, for data sets without documented object classes.
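As a rough illustration of the two generic ingredients named above, the sketch below pairs the classic minimum spanning tree clustering (build the tree, then cut its longest edges) with the standard adjusted Rand index. It does not reproduce B-MST or the TSI, whose definitions are not given here; the points, the class labels, and the choice of Euclidean distance are assumptions.

```python
from collections import Counter
from itertools import combinations
from math import comb, dist


def mst_clusters(points, k):
    """Run Kruskal's algorithm and stop once k components remain, which
    is equivalent to cutting the k-1 longest edges of the MST."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    components = len(points)
    for _, i, j in edges:
        if components == k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1
    roots = [find(i) for i in range(len(points))]
    relabel = {r: n for n, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]


def adjusted_rand_index(a, b):
    """Standard ARI; undefined (division by zero) when both labelings
    are trivial, e.g. every object in one cluster in both."""
    sum_ij = sum(comb(n, 2) for n in Counter(zip(a, b)).values())
    sum_a = sum(comb(n, 2) for n in Counter(a).values())
    sum_b = sum(comb(n, 2) for n in Counter(b).values())
    expected = sum_a * sum_b / comb(len(a), 2)
    max_index = (sum_a + sum_b) / 2
    return (sum_ij - expected) / (max_index - expected)


# Hypothetical expression profiles forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
known = ["a", "a", "a", "b", "b", "b"]  # assumed known object classes
labels = mst_clusters(points, k=2)
print(labels, adjusted_rand_index(labels, known))
```

An ARI of 1 indicates that the clustering matches the known classes exactly, while values near 0 indicate agreement no better than chance.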
Subjects
Computer Heuristics , Electronic Data Processing/methods , Gene Expression Regulation , Gene Ontology , Oligonucleotide Array Sequence Analysis
ABSTRACT
The distribution of microsatellites in exons and their association with gene ontology (GO) terms are explored to elucidate patterns of microsatellite evolution in the common sunflower, Helianthus annuus. The relative position, motif, size and level of impurity were estimated for each microsatellite in the unigene database available from the Compositae Genome Project (CGP), and statistical analyses were performed to determine if differences in microsatellite distributions and enrichment within certain GO terms were significant. There are more translated than untranslated microsatellites, implying that many bring about structural changes in proteins. However, the greatest density is observed within the UTRs, particularly 5'UTRs. Further, UTR microsatellites are purer and longer than coding region microsatellites. This suggests that UTR microsatellites are either younger and under more relaxed constraints, or that purifying selection limits impurities and directional selection favours their expansion. GOs associated with response to various environmental stimuli, including water deprivation and salt stress, were significantly enriched with microsatellites. This may suggest that these GOs are more labile in plant genomes, or that selection has favoured the maintenance of microsatellites in these genes over others. This study shows that the distribution of transcribed microsatellites in H. annuus is nonrandom, that the coding region microsatellites are under greater constraint than the UTR microsatellites, and that these sequences are enriched within genes that regulate plant responses to environmental stress and stimuli.
Subjects
Genes, Plant , Helianthus/genetics , Microsatellite Repeats , Transcriptome , 5' Untranslated Regions , Evolution, Molecular , Gene Ontology , Helianthus/metabolism , Open Reading Frames , Plant Proteins/genetics , Plant Proteins/metabolism , RNA, Plant/genetics , RNA, Plant/metabolism , Stress, Physiological
ABSTRACT
Aspergillus flavus is a pathogenic fungus that infects maize and produces aflatoxins, which are health hazards to humans and animals. Characterizing host defense mechanisms and prioritizing candidate resistance genes are important to the development of resistant maize germplasm. We investigated methods for analyzing the significance of, and relations among, maize candidate genes based on empirical gene expression data obtained by RT-qPCR from maize inbred lines. We optimized a pipeline of analysis tools chosen from various programs to provide rigorous statistical analysis and state-of-the-art data visualization. A network-based method was also explored to construct the empirical gene expression relational structures. Maize genes at the centers of the network were considered important candidate genes for maize DNA marker studies. The methods in this research can be used to analyze large RT-qPCR datasets and establish complex empirical gene relational structures across multiple experimental conditions.
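One simple way to make "genes at the centers of the network" concrete is to rank genes by degree centrality in a co-expression graph. The sketch below assumes that measure, which the text does not specify; the gene names and edges are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Fraction of the other genes each gene is linked to.
    Assumes at least one edge between two distinct genes."""
    neighbours = defaultdict(set)
    for g, h in edges:
        neighbours[g].add(h)
        neighbours[h].add(g)
    n = len(neighbours)
    return {g: len(nb) / (n - 1) for g, nb in neighbours.items()}


# Hypothetical co-expression edges between candidate genes.
edges = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneA", "geneD"),
    ("geneB", "geneC"),
    ("geneD", "geneE"),
]
centrality = degree_centrality(edges)
# Most central first; ties are broken alphabetically.
ranked = sorted(centrality, key=lambda g: (-centrality[g], g))
print(ranked)
```

Other centrality measures (betweenness, closeness, eigenvector) could be substituted and may rank genes differently, so the choice should follow the biological question.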
Subjects
Aspergillus flavus , Gene Expression Regulation, Plant , Host-Pathogen Interactions/genetics , Zea mays/genetics , Zea mays/microbiology , Aflatoxins , Biological Transport , Cluster Analysis , Gene Expression Profiling , Gene Regulatory Networks , Inbreeding , RNA, Plant , Zea mays/metabolism
ABSTRACT
BACKGROUND: The identification of novel genes by high-throughput studies of complex diseases is complicated by the large number of potential genes. However, since disease-associated genes tend to interact, one solution is to arrange them in modules based on co-expression data and known gene interactions. The hypothesis of this study was that such a module could be a) found and validated in allergic disease and b) used to find and validate one or more novel disease-associated genes. RESULTS: To test these hypotheses, an integrated analysis of a large number of gene expression microarray experiments from different forms of allergy was performed. This led to the identification of an experimentally validated reference gene that was used to construct a module of co-expressed and interacting genes. This module was validated in independent material by replicating the expression changes in allergen-challenged CD4+ cells. Moreover, the changes were reversed following treatment with corticosteroids. The module contained several novel disease-associated genes, of which the one with the highest number of interactions with known disease genes, IL7R, was selected for further validation. The expression levels of IL7R in allergen-challenged CD4+ cells decreased following challenge but increased after treatment. This suggested an inhibitory role, which was confirmed by functional studies. CONCLUSION: We propose that a module-based analytical strategy is generally applicable to find novel genes in complex diseases.