1.
Brief Bioinform ; 23(1), 2022 01 17.
Article in English | MEDLINE | ID: mdl-34850822

ABSTRACT

Gene co-expression networks (GCNs) provide multiple benefits to molecular research including hypothesis generation and biomarker discovery. Transcriptome profiles serve as input for GCN construction and are derived from increasingly larger studies with samples across multiple experimental conditions, treatments, time points, genotypes, etc. Such experiments with larger numbers of variables confound discovery of true network edges, exclude edges and inhibit discovery of context (or condition) specific network edges. To demonstrate this problem, a 475-sample dataset is used to show that up to 97% of GCN edges can be misleading because correlations are false or incorrect. False and incorrect correlations can occur when tests are applied without ensuring assumptions are met, and pairwise gene expression may not meet test assumptions if the expression of at least one gene in the pairwise comparison is a function of multiple confounding variables. The 'one-size-fits-all' approach to GCN construction is therefore problematic for large, multivariable datasets. Recently, the Knowledge Independent Network Construction toolkit has been used in multiple studies to provide a dynamic approach to GCN construction that ensures statistical tests meet assumptions and confounding variables are addressed. Additionally, it can associate experimental context for each edge of the network resulting in context-specific GCNs (csGCNs). To help researchers recognize such challenges in GCN construction, and the creation of csGCNs, we provide a review of the workflow.
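
The confounding problem described in this abstract can be illustrated with a toy calculation. The sketch below is not the Knowledge Independent Network Construction method; it only shows, with hypothetical expression values for two hypothetical genes, how pooling samples from two conditions can yield a strong positive correlation even though the genes are negatively correlated within each condition.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical expression values (illustrative only, not from the paper).
control = {"geneA": [1.0, 2.0, 3.0, 4.0], "geneB": [4.0, 3.0, 2.0, 1.0]}
treated = {"geneA": [11.0, 12.0, 13.0, 14.0], "geneB": [14.0, 13.0, 12.0, 11.0]}

# Within each condition the two genes are perfectly anti-correlated.
r_control = pearson(control["geneA"], control["geneB"])
r_treated = pearson(treated["geneA"], treated["geneB"])

# Pooling both conditions lets the condition effect dominate the correlation.
r_pooled = pearson(
    control["geneA"] + treated["geneA"],
    control["geneB"] + treated["geneB"],
)

print(f"control r = {r_control:.2f}")
print(f"treated r = {r_treated:.2f}")
print(f"pooled  r = {r_pooled:.2f}")
```

In this toy case the pooled coefficient is about 0.90 while each per-condition coefficient is -1, so a single network-wide test would report an edge with the wrong sign.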


Subjects
Gene Regulatory Networks , Transcriptome
2.
Brief Bioinform ; 22(6), 2021 11 05.
Article in English | MEDLINE | ID: mdl-34251419

ABSTRACT

Online, open access databases for biological knowledge serve as central repositories for research communities to store, find and analyze integrated, multi-disciplinary datasets. With increasing volumes, complexity and the need to integrate genomic, transcriptomic, metabolomic, proteomic, phenomic and environmental data, community databases face tremendous challenges in ongoing maintenance, expansion and upgrades. A common infrastructure framework using community standards shared by many databases can reduce development burden, provide interoperability, ensure use of common standards and support long-term sustainability. Tripal is a mature, open source platform built to meet this need. With ongoing improvement since its first release in 2009, Tripal provides full functionality for searching, browsing, loading and curating numerous types of data and is a primary technology powering at least 31 publicly available databases spanning plants, animals and human data, primarily storing genomics, genetics and breeding data. Tripal software development is managed by a shared, inclusive governance structure including both project management and advisory teams. Here, we report on the most important and innovative aspects of Tripal after 11 years of development, including integration of diverse types of biological data, successful collaborative projects across member databases, and support for implementing FAIR principles.


Subjects
Breeding , Computational Biology/methods , Databases, Genetic , Genomics/methods , Plants/genetics , Software , Crops, Agricultural/genetics , Genetic Variation , Phylogeny , Plants/metabolism , Proteomics , Web Browser
3.
BMC Bioinformatics ; 23(1): 156, 2022 May 02.
Article in English | MEDLINE | ID: mdl-35501696

ABSTRACT

BACKGROUND: Quantification of gene expression from RNA-seq data is a prerequisite for transcriptome analysis such as differential gene expression analysis and gene co-expression network construction. Individual RNA-seq experiments are growing larger, and combining multiple experiments from sequence repositories can result in datasets with thousands of samples. Processing hundreds to thousands of RNA-seq samples can result in challenges related to data management, access to sufficient computational resources, navigation of high-performance computing (HPC) systems, installation of required software dependencies, and reproducibility. Processing of larger and deeper RNA-seq experiments will become more common as sequencing technology matures. RESULTS: GEMmaker is an nf-core compliant Nextflow workflow that quantifies gene expression from small to massive RNA-seq datasets. GEMmaker ensures results are highly reproducible through the use of versioned containerized software that can be executed on a single workstation, institutional compute cluster, Kubernetes platform or the cloud. GEMmaker supports popular alignment and quantification tools providing results in raw and normalized formats. GEMmaker is unique in that it can scale to process thousands of locally or remotely stored samples without exceeding available data storage. CONCLUSIONS: Workflows that quantify gene expression are not new, and many already address issues of portability, reusability, and scale in terms of access to CPUs. GEMmaker provides these benefits and adds the ability to scale despite low data storage infrastructure. This allows users to process hundreds to thousands of RNA-seq samples even when data storage resources are limited. GEMmaker is freely available and fully documented with step-by-step setup and execution instructions.


Subjects
High-Throughput Nucleotide Sequencing , Software , High-Throughput Nucleotide Sequencing/methods , RNA-Seq , Reproducibility of Results , Sequence Analysis, RNA/methods
4.
BMC Genomics ; 23(1): 350, 2022 May 06.
Article in English | MEDLINE | ID: mdl-35524179

ABSTRACT

BACKGROUND: Lung cancer is the leading cause of cancer death in both men and women. The most common lung cancer subtype is non-small cell lung carcinoma (NSCLC) comprising about 85% of all cases. NSCLC can be further divided into three subtypes: adenocarcinoma (LUAD), squamous cell carcinoma (LUSC), and large cell lung carcinoma. Specific genetic mutations and epigenetic aberrations play an important role in the developmental transition to a specific tumor subtype. The elucidation of normal lung versus lung tumor gene expression patterns and regulatory targets yields biomarker systems that discriminate lung phenotypes (i.e., biomarkers) and provide a foundation for the discovery of normal and aberrant gene regulatory mechanisms. RESULTS: We built condition-specific gene co-expression networks (csGCNs) for normal lung, LUAD, and LUSC conditions. Then, we integrated normal lung tissue-specific gene regulatory networks (tsGRNs) to elucidate control-target biomarker systems for normal and cancerous lung tissue. We characterized co-expressed gene edges, possibly under common regulatory control, for relevance in lung cancer. CONCLUSIONS: Our approach demonstrates the ability to elucidate csGCN:tsGRN merged biomarker systems based on gene expression correlation and regulation. The biomarker systems we describe can be used to classify and further describe lung specimens. Our approach is generalizable and can be used to discover and interpret complex gene expression patterns for any condition or species.


Subjects
Adenocarcinoma of Lung , Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Adenocarcinoma of Lung/genetics , Adenocarcinoma of Lung/pathology , Biomarkers , Biomarkers, Tumor/genetics , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/pathology , Female , Gene Expression Regulation, Neoplastic , Humans , Lung/pathology , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Prognosis
5.
Nucleic Acids Res ; 47(D1): D1137-D1145, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30357347

ABSTRACT

The Genome Database for Rosaceae (GDR, https://www.rosaceae.org) is an integrated web-based community database resource providing access to publicly available genomics, genetics and breeding data and data-mining tools to facilitate basic, translational and applied research in Rosaceae. The volume of data in GDR has increased greatly over the last 5 years. The GDR now houses multiple versions of whole genome assembly and annotation data from 14 species, made available by recent advances in sequencing technology. Annotated and searchable reference transcriptomes, RefTrans, combining peer-reviewed published RNA-Seq as well as EST datasets, are newly available for major crop species. Significantly more quantitative trait loci, genetic maps and markers are available in MapViewer, a new visualization tool that better integrates with other pages in GDR. Pathways can be accessed through the new GDR Cyc Pathways databases, and synteny among the newest genome assemblies from eight species can be viewed through the new synteny browser, SynView. Collated single-nucleotide polymorphism diversity data and phenotypic data from publicly available breeding datasets are integrated with other relevant data. Also, the new Breeding Information Management System allows breeders to upload, manage and analyze their private breeding data within the secure GDR server with an option to release data publicly.


Subjects
Computational Biology/methods , Databases, Genetic , Genome, Plant/genetics , Genomics/methods , Rosaceae/genetics , Computational Biology/statistics & numerical data , Gene Expression Profiling/methods , Genes, Plant/genetics , Information Storage and Retrieval/methods , Internet , Plant Breeding/methods , Quantitative Trait Loci/genetics , Rosaceae/classification , Species Specificity , Synteny , Time Factors , User-Computer Interface
6.
Int J Mol Sci ; 21(6), 2020 Mar 20.
Article in English | MEDLINE | ID: mdl-32244875

ABSTRACT

Lentil (Lens culinaris Medikus) is an important source of protein for people in developing countries. Aphanomyces root rot (ARR) has emerged as one of the most devastating diseases affecting lentil production. In this study, we applied two complementary quantitative trait loci (QTL) analysis approaches to unravel the genetic architecture underlying this complex trait. A recombinant inbred line (RIL) population and an association mapping population were genotyped using genotyping by sequencing (GBS) to discover novel single nucleotide polymorphisms (SNPs). QTL mapping identified 19 QTL associated with ARR resistance, while association mapping detected 38 QTL and highlighted accumulation of favorable haplotypes in most of the resistant accessions. Seven QTL clusters were discovered on six chromosomes, and 15 putative genes were identified within the QTL clusters. To validate QTL mapping and genome-wide association study (GWAS) results, expression analysis of five selected genes was conducted on partially resistant and susceptible accessions. Three of the genes were differentially expressed at early stages of infection, two of which may be associated with ARR resistance. Our findings provide valuable insight into the genetic control of ARR, and genetic and genomic resources developed here can be used to accelerate development of lentil cultivars with high levels of partial resistance to ARR.


Subjects
Aphanomyces/physiology , Chromosome Mapping , Disease Resistance/genetics , Genome-Wide Association Study , Lens Plant/genetics , Lens Plant/microbiology , Plant Diseases/genetics , Quantitative Trait Loci/genetics , Data Analysis , Gene Expression Regulation, Plant , Genetics, Population , Haplotypes/genetics , Linkage Disequilibrium/genetics , Phenotype , Plant Diseases/microbiology
7.
Nucleic Acids Res ; 42(Database issue): D1229-36, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24203703

ABSTRACT

CottonGen (http://www.cottongen.org) is a curated and integrated web-based relational database providing access to publicly available genomic, genetic and breeding data for cotton. CottonGen supersedes CottonDB and the Cotton Marker Database, with enhanced tools for easier data sharing, mining, visualization and data retrieval of cotton research data. CottonGen contains annotated whole genome sequences, unigenes from expressed sequence tags (ESTs), markers, trait loci, genetic maps, genes, taxonomy, germplasm, publications and communication resources for the cotton community. Annotated whole genome sequences of Gossypium raimondii are available with aligned genetic markers and transcripts. These whole genome data can be accessed through genome pages, search tools and GBrowse, a popular genome browser. Most of the published cotton genetic maps can be viewed and compared using CMap, a comparative map viewer, and are searchable via map search tools. Search tools also exist for markers, quantitative trait loci (QTLs), germplasm, publications and trait evaluation data. CottonGen also provides online analysis tools such as NCBI BLAST and Batch BLAST.


Subjects
Databases, Genetic , Genome, Plant , Gossypium/genetics , Breeding , Expressed Sequence Tags , Genes, Plant , Genetic Markers , Genomics , Internet , Quantitative Trait Loci
8.
Nucleic Acids Res ; 42(Database issue): D1237-44, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24225320

ABSTRACT

The Genome Database for Rosaceae (GDR, http://www.rosaceae.org), the long-standing central repository and data mining resource for Rosaceae research, has been enhanced with new genomic, genetic and breeding data, and improved functionality. Whole genome sequences of apple, peach and strawberry are available to browse or download with a range of annotations, including gene model predictions, aligned transcripts, repetitive elements, polymorphisms, mapped genetic markers, mapped NCBI Rosaceae genes, gene homologs and association of InterPro protein domains, GO terms and Kyoto Encyclopedia of Genes and Genomes pathway terms. Annotated sequences can be queried using search interfaces and visualized using GBrowse. New expressed sequence tag unigene sets are available for major genera, and pathway data are available through FragariaCyc, AppleCyc and PeachCyc databases. Synteny among the three sequenced genomes can be viewed using GBrowse_Syn. New markers, genetic maps and extensively curated qualitative/Mendelian and quantitative trait loci are available. Phenotype and genotype data from breeding projects and genetic diversity projects are also included. Improved search pages are available for marker, trait locus, genetic diversity and publication data. New search tools for breeders enable selection comparison and assistance with breeding decision making.


Subjects
Databases, Genetic , Genome, Plant , Rosaceae/genetics , Breeding , Genes, Plant , Genetic Markers , Genetic Variation , Genomics , Internet , Quantitative Trait Loci
9.
BMC Genomics ; 16: 155, 2015 Mar 07.
Article in English | MEDLINE | ID: mdl-25886969

ABSTRACT

BACKGROUND: A high-throughput genotyping platform is needed to enable marker-assisted breeding in the allo-octoploid cultivated strawberry Fragaria × ananassa. Short-read sequences from one diploid and 19 octoploid accessions were aligned to the diploid Fragaria vesca 'Hawaii 4' reference genome to identify single nucleotide polymorphisms (SNPs) and indels for incorporation into a 90 K Affymetrix® Axiom® array. We report the development and preliminary evaluation of this array. RESULTS: About 36 million sequence variants were identified in a 19 member, octoploid germplasm panel. Strategies and filtering pipelines were developed to identify and incorporate markers of several types: di-allelic SNPs (66.6%), multi-allelic SNPs (1.8%), indels (10.1%), and ploidy-reducing "haploSNPs" (11.7%). The remaining SNPs included those discovered in the diploid progenitor F. iinumae (3.9%), and speculative "codon-based" SNPs (5.9%). In genotyping 306 octoploid accessions, SNPs were assigned to six classes with Affymetrix's "SNPolisher" R package. The highest quality classes, PolyHigh Resolution (PHR), No Minor Homozygote (NMH), and Off-Target Variant (OTV) comprised 25%, 38%, and 1% of array markers, respectively. These markers were suitable for genetic studies as demonstrated in the full-sib family 'Holiday' × 'Korona' with the generation of a genetic linkage map consisting of 6,594 PHR SNPs evenly distributed across 28 chromosomes with an average density of approximately one marker per 0.5 cM, thus exceeding our goal of one marker per cM. CONCLUSIONS: The Affymetrix IStraw90 Axiom array is the first high-throughput genotyping platform for cultivated strawberry and is commercially available to the worldwide scientific community. The array's high success rate is likely driven by the presence of naturally occurring variation in ploidy level within the nominally octoploid genome, and by effectiveness of the employed array design and ploidy-reducing strategies. 
This array enables genetic analyses including generation of high-density linkage maps, identification of quantitative trait loci for economically important traits, and genome-wide association studies, thus providing a basis for marker-assisted breeding in this high value crop.


Subjects
Fragaria/genetics , Genotyping Techniques/methods , Oligonucleotide Array Sequence Analysis/methods , Polymorphism, Single Nucleotide , Polyploidy , Chromosome Mapping , Hybridization, Genetic , INDEL Mutation , Sequence Analysis, DNA
10.
PLoS One ; 19(3): e0297015, 2024.
Article in English | MEDLINE | ID: mdl-38446822

ABSTRACT

Gene expression is highly impacted by the environment and can be reflective of past events that affected developmental processes. It is therefore expected that gene expression can serve as a signal of current or future phenotypic traits. In this paper we identify sets of genes, which we call Prognostic Transcriptomic Biomarkers (PTBs), that can predict firmness in Malus domestica (apple) fruits. In apples, all individuals of a cultivar are clones, and differences in fruit quality are due to the environment. The apple transcriptome responds to these differences in environment, which makes PTBs an attractive predictor of future fruit quality. PTBs have the potential to enhance supply chain efficiency, reduce crop loss, and provide higher and more consistent quality for consumers. However, several questions must be addressed. In this paper we ask which of two common modeling approaches, Random Forest or ElasticNet, outperforms the other. We ask whether PTBs with few genes are efficient at predicting traits, which is important because qPCR can only be performed on a few genes, and whether qPCR is a cost-effective assay as input for PTBs modeled using high-throughput RNA-seq. To do this, we conducted a pilot study using fruit texture in the 'Gala' variety of apples across several postharvest storage regimens. Fruit texture in 'Gala' apples is highly controllable by postharvest treatments and is therefore a good candidate to explore the use of PTBs. We find that the Random Forest model is more consistent than an ElasticNet model and is predictive of firmness (r² = 0.78) with as few as 15 genes. We also show that qPCR is reasonably consistent with RNA-seq in a follow-up experiment. Results are promising for PTBs, yet more work is needed to ensure that PTBs are robust across various environmental conditions and storage treatments.


Subjects
Malus , Humans , Malus/genetics , Fruit/genetics , Transcriptome , Pilot Projects , Gene Expression Profiling
11.
PLoS One ; 19(6): e0306187, 2024.
Article in English | MEDLINE | ID: mdl-38905271

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pone.0297015.].

12.
G3 (Bethesda) ; 14(3), 2024 03 06.
Article in English | MEDLINE | ID: mdl-38190814

ABSTRACT

Cultivated pear consists of several Pyrus species with Pyrus communis (European pear) representing a large fraction of worldwide production. As a relatively recently domesticated crop and perennial tree, pear can benefit from genome-assisted breeding. Additionally, comparative genomics within Rosaceae promises greater understanding of evolution within this economically important family. Here, we generate a fully phased chromosome-scale genome assembly of P. communis 'd'Anjou.' Using PacBio HiFi and Dovetail Omni-C reads, the genome is resolved into the expected 17 chromosomes, with each haplotype totaling nearly 540 Megabases and a contig N50 of nearly 14 Mb. Both haplotypes are highly syntenic to each other and to the Malus domestica 'Honeycrisp' apple genome. Nearly 45,000 genes were annotated in each haplotype, over 90% of which have direct RNA-seq expression evidence. We detect signatures of the known whole-genome duplication shared between apple and pear, and we estimate 57% of d'Anjou genes are retained in duplicate derived from this event. This genome highlights the value of generating phased diploid assemblies for recovering the full allelic complement in highly heterozygous crop species.
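
The contig N50 quoted in this abstract is a standard assembly statistic: the largest length L such that contigs of length L or more together cover at least half of the total assembly. A minimal sketch of that calculation follows; the contig lengths are hypothetical and are not taken from the 'd'Anjou' assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the largest length L such that contigs of length >= L
    together account for at least half of the total assembly length.
    """
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


# Hypothetical contig lengths in Mb (illustrative only).
toy_contigs_mb = [20, 14, 10, 6, 5, 3, 2]
print(n50(toy_contigs_mb))  # 14
```

Here the total is 60 Mb, and the two longest contigs (20 + 14 = 34 Mb) are the first to cover at least 30 Mb, so the N50 is 14 Mb.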


Subjects
Malus , Pyrus , Pyrus/genetics , Genome, Plant , Plant Breeding , Malus/genetics , Chromosomes
13.
Database (Oxford) ; 2023, 2023 11 15.
Article in English | MEDLINE | ID: mdl-37971715

ABSTRACT

Over the last couple of decades, there has been a rapid growth in the number and scope of agricultural genetics, genomics and breeding databases and resources. The AgBioData Consortium (https://www.agbiodata.org/) currently represents 44 databases and resources (https://www.agbiodata.org/databases) covering model or crop plant and animal GGB data, ontologies, pathways, genetic variation and breeding platforms (referred to as 'databases' throughout). One of the goals of the Consortium is to facilitate FAIR (Findable, Accessible, Interoperable, and Reusable) data management and the integration of datasets, which requires data sharing, along with structured vocabularies and/or ontologies. Two AgBioData working groups, focused on Data Sharing and Ontologies, respectively, conducted a Consortium-wide survey to assess the current status and future needs of the members in those areas. A total of 33 researchers responded to the survey, representing 37 databases. Results suggest that data-sharing practices by AgBioData databases are in a fairly healthy state, but it is not clear whether this is true for all metadata and data types across all databases, and that ontology use has not substantially changed since a similar survey was conducted in 2017. Based on our evaluation of the survey results, we recommend (i) providing training for database personnel in specific data-sharing techniques, as well as in ontology use; (ii) further study on what metadata is shared, and how well it is shared among databases; (iii) promoting an understanding of data sharing and ontologies in the stakeholder community; (iv) improving data sharing and ontologies for specific phenotypic data types and formats; and (v) lowering specific barriers to data sharing and ontology use, by identifying sustainability solutions, and the identification, promotion, or development of data standards. Combined, these improvements are likely to help AgBioData databases increase development efforts towards improved ontology use, and data sharing via programmatic means. Database URL: https://www.agbiodata.org/databases.


Subjects
Data Management , Plant Breeding , Animals , Genomics/methods , Databases, Factual , Information Dissemination
14.
BMC Plant Biol ; 12: 38, 2012 Mar 19.
Article in English | MEDLINE | ID: mdl-22429310

ABSTRACT

BACKGROUND: A century ago, Chestnut Blight Disease (CBD) devastated the American chestnut. Backcross breeding has been underway to introgress resistance from Chinese chestnut into surviving American chestnut genotypes. Development of genomic resources for the family Fagaceae has focused in this project on Castanea mollissima Blume (Chinese chestnut) and Castanea dentata (Marsh.) Borkh (American chestnut) to aid in the backcross breeding effort and in the eventual identification of blight resistance genes through genomic sequencing and map-based cloning. A previous study reported partial characterization of the transcriptomes from these two species. Here, further analyses of a larger dataset and assemblies including both 454 and capillary sequences were performed and defense-related genes with differential transcript abundance (GDTA) in canker versus healthy stem tissues were identified. RESULTS: Over one and a half million cDNA reads were assembled into 34,800 transcript contigs from American chestnut and 48,335 transcript contigs from Chinese chestnut. Chestnut cDNA showed higher coding sequence similarity to genes in other woody plants than in herbaceous species. The number of genes tagged, the length of coding sequences, and the numbers of tagged members within gene families showed that the cDNA dataset provides a good resource for studying the American and Chinese chestnut transcriptomes. In silico analysis of transcript abundance identified hundreds of GDTA in canker versus healthy stem tissues. A significant number of additional DTA genes involved in the defense response not reported in a previous study were identified here. These DTA genes belong to various pathways involving cell wall biosynthesis, reactive oxygen species (ROS), salicylic acid (SA), ethylene, jasmonic acid (JA), abscisic acid (ABA), and hormone signalling. DTA genes were also identified in the hypersensitive response and programmed cell death (PCD) pathways. These DTA genes are candidates for host resistance to the chestnut blight fungus, Cryphonectria parasitica. CONCLUSIONS: Our data allowed the identification of many genes and gene network candidates for host resistance to the chestnut blight fungus, Cryphonectria parasitica. The similar set of GDTAs in American chestnut and Chinese chestnut suggests that the variation in sensitivity to this pathogen between these species may be the result of different timing and amplitude of the response of the two to the pathogen infection. Resources developed in this study are useful for functional genomics, comparative genomics, resistance breeding and phylogenetics in the Fagaceae.


Subjects
Ascomycota/pathogenicity , Disease Resistance , Fagaceae/microbiology , Gene Expression Profiling/methods , Plant Diseases/immunology , Ascomycota/immunology , Breeding , Cloning, Molecular , Contig Mapping , DNA, Complementary/genetics , Databases, Genetic , Fagaceae/genetics , Fagaceae/immunology , Gene Expression Regulation, Plant , Gene Library , Genes, Plant , Inbreeding , Phylogeny , Plant Diseases/microbiology , Plant Stems/genetics , Plant Stems/immunology , Plant Stems/microbiology , Proteome/analysis , Proteome/genetics , RNA, Plant/analysis , RNA, Plant/genetics , Sequence Analysis, DNA , Sequence Homology , Species Specificity , Time Factors , Transcriptome
15.
Plant Physiol ; 156(3): 1244-56, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21606319

ABSTRACT

One major objective for plant biology is the discovery of molecular subsystems underlying complex traits. The use of genetic and genomic resources combined in a systems genetics approach offers a means for approaching this goal. This study describes a maize (Zea mays) gene coexpression network built from publicly available expression arrays. The maize network consisted of 2,071 loci that were divided into 34 distinct modules that contained 1,928 enriched functional annotation terms and 35 cofunctional gene clusters. Of note, 391 maize genes of unknown function were found to be coexpressed within modules along with genes of known function. A global network alignment was made between this maize network and a previously described rice (Oryza sativa) coexpression network. The IsoRankN tool was used, which incorporates both gene homology and network topology for the alignment. A total of 1,173 aligned loci were detected between the two grass networks, which condensed into 154 conserved subgraphs that preserved 4,758 coexpression edges in rice and 6,105 coexpression edges in maize. This study provides an early view into maize coexpression space and provides an initial network-based framework for the translation of functional genomic and genetic information between these two vital agricultural species.


Subjects
Conserved Sequence/genetics , Gene Expression Regulation, Plant , Gene Regulatory Networks/genetics , Oryza/genetics , Zea mays/genetics , Cluster Analysis , Phenotype
16.
BMC Genomics ; 12: 413, 2011 Aug 16.
Article in English | MEDLINE | ID: mdl-21846342

ABSTRACT

BACKGROUND: The fermented dried seeds of Theobroma cacao (cacao tree) are the main ingredient in chocolate. World cocoa production was estimated to be 3 million tons in 2010 with an annual estimated average growth rate of 2.2%. The cacao bean production industry is currently under threat from a rise in fungal diseases including black pod, frosty pod, and witches' broom. In order to address these issues, genome-sequencing efforts have been initiated recently to facilitate identification of genetic markers and genes that could be utilized to accelerate the release of robust T. cacao cultivars. However, problems inherent with assembly and resolution of distal regions of complex eukaryotic genomes, such as gaps, chimeric joins, and unresolvable repeat-induced compressions, have been unavoidably encountered with the sequencing strategies selected. RESULTS: Here, we describe the construction of a BAC-based integrated genetic-physical map of the T. cacao cultivar Matina 1-6 which is designed to augment and enhance these sequencing efforts. Three BAC libraries, each comprised of 10× coverage, were constructed and fingerprinted. 230 genetic markers from a high-resolution genetic recombination map and 96 Arabidopsis-derived conserved ortholog set (COS) II markers were anchored using pooled overgo hybridization. A dense tile path consisting of 29,383 BACs was selected and end-sequenced. The physical map consists of 154 contigs and 4,268 singletons. Forty-nine contigs are genetically anchored and ordered to chromosomes for a total span of 307.2 Mbp. The unanchored contigs (105) span 67.4 Mbp and therefore the estimated genome size of T. cacao is 374.6 Mbp. A comparative analysis with A. thaliana, V. vinifera, and P. trichocarpa suggests that comparisons of the genome assemblies of these distantly related species could provide insights into genome structure, evolutionary history, conservation of functional sites, and improvements in physical map assembly. 
A comparison between the two T. cacao cultivars Matina 1-6 and Criollo indicates a high degree of collinearity in their genomes, yet rearrangements were also observed. CONCLUSIONS: The results presented in this study are a stand-alone resource for functional exploitation and enhancement of Theobroma cacao but are also expected to complement and augment ongoing genome-sequencing efforts. This resource will serve as a template for refinement of the T. cacao genome through gap-filling, targeted re-sequencing, and resolution of repetitive DNA arrays.
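
The genome-size estimate in this abstract is simple arithmetic over the reported contig counts and spans, and can be checked directly. The sketch below uses only the figures stated in the abstract; the variable names are illustrative, and `Decimal` is used so the sums are exact.

```python
from decimal import Decimal

# Figures as reported in the abstract.
anchored_contigs = 49
unanchored_contigs = 105
anchored_span_mbp = Decimal("307.2")
unanchored_span_mbp = Decimal("67.4")

# Anchored plus unanchored contigs should equal the 154 contigs of the physical map.
total_contigs = anchored_contigs + unanchored_contigs

# The estimated genome size is the sum of the two spans.
estimated_genome_mbp = anchored_span_mbp + unanchored_span_mbp

# Share of the estimated genome that is genetically anchored (derived, not reported).
anchored_fraction = anchored_span_mbp / estimated_genome_mbp

print(total_contigs)          # 154
print(estimated_genome_mbp)   # 374.6
print(f"{anchored_fraction:.1%} of the estimate is anchored")
```

Both totals agree with the abstract (154 contigs, 374.6 Mbp); the anchored share of roughly 82% is derived here and is not stated in the source.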


Subjects
Cacao/genetics , Physical Chromosome Mapping/methods , Chromosomes, Artificial, Bacterial/genetics , Contig Mapping , Genetic Markers/genetics , Genome, Plant/genetics , Sequence Alignment , Sequence Tagged Sites
17.
BMC Genomics ; 12: 379, 2011 Jul 27.
Article in English | MEDLINE | ID: mdl-21794110

ABSTRACT

BACKGROUND: BAC-based physical maps provide for sequencing across an entire genome or a selected sub-genomic region of biological interest. Such a region can be approached with next-generation whole-genome sequencing and assembly as if it were an independent small genome. Using the minimum tiling path as a guide, specific BAC clones representing the prioritized genomic interval are selected, pooled, and used to prepare a sequencing library. RESULTS: This pooled BAC approach was taken to sequence and assemble a QTL-rich region, of ~3 Mbp and represented by twenty-seven BACs, on linkage group 5 of the Theobroma cacao cv. Matina 1-6 genome. Using various mixtures of read coverages from paired-end and linear 454 libraries, multiple assemblies of varied quality were generated. Quality was assessed by comparing the assembly of 454 reads with a subset of ten BACs individually sequenced and assembled using Sanger reads. A mixture of reads optimal for assembly was identified. We found, furthermore, that a quality assembly suitable for serving as a reference genome template could be obtained even with a reduced depth of sequencing coverage. Annotation of the resulting assembly revealed several genes potentially responsible for three T. cacao traits: black pod disease resistance, bean shape index, and pod weight. CONCLUSIONS: Our results, as with other pooled BAC sequencing reports, suggest that pooling portions of a minimum tiling path derived from a BAC-based physical map is an effective method to target sub-genomic regions for sequencing. While we focused on a single QTL region, other QTL regions of importance could be similarly sequenced allowing for biological discovery to take place before a high quality whole-genome assembly is completed.


Subjects
Cacao/genetics , Bacterial Artificial Chromosomes , Plant Genome , Quantitative Trait Loci , Genomic Library , Sequence Alignment , DNA Sequence Analysis
18.
Plant Physiol ; 154(1): 13-24, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20668062

ABSTRACT

Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes.


Subjects
Plant Gene Expression Regulation , Gene Regulatory Networks/genetics , Plant Genes/genetics , Oryza/genetics , Cluster Analysis , DNA Probes/metabolism , Genetic Loci/genetics , Internet , Mutation/genetics , Oligonucleotide Array Sequence Analysis , Phenotype
19.
BMC Genom Data ; 22(1): 17, 2021 05 27.
Article in English | MEDLINE | ID: mdl-34044788

ABSTRACT

BACKGROUND: Gene expression is potentially an important heritable quantitative trait that mediates between genetic variation and higher-level complex phenotypes through time- and condition-dependent regulatory interactions. Therefore, we sought to explore both the genomic and condition-specific characteristics of gene expression heritability within the context of chromosomal structure. RESULTS: Heritability was estimated for biological gene expression using a diverse, 84-line, Oryza sativa (rice) population under optimal and salt-stressed conditions. Overall, 5936 genes were found to have heritable expression regardless of condition and 1377 genes were found to have heritable expression only during salt stress. These genes with salt-specific heritable expression are enriched for functional terms associated with response to stimulus and transcription factor activity. Additionally, we discovered that highly and lowly expressed genes, and genes with heritable expression, are distributed differently along the chromosomes in patterns that follow previously identified high-throughput chromosomal conformation capture (Hi-C) A/B chromatin compartments. Furthermore, multiple genomic hotspots enriched for genes with salt-specific heritability were identified on chromosomes 1, 4, 6, and 8. These hotspots were found to contain genes functionally enriched for transcriptional regulation and to overlap with a previously identified major QTL for salt tolerance in rice. CONCLUSIONS: Investigating the heritability of traits, and in particular gene expression traits, is important for developing a basic understanding of how regulatory networks behave across a population. This work provides insights into spatial patterns of heritable gene expression at the chromosomal level.


Subjects
Plant Chromosomes/genetics , Plant Gene Expression Regulation , Plant Genome/genetics , Oryza/genetics , Salt Stress/genetics , Quantitative Trait Loci/genetics
20.
Front Plant Sci ; 12: 609684, 2021.
Article in English | MEDLINE | ID: mdl-34220875

ABSTRACT

Estimating maturity in pome fruits is a critical task that directs virtually all postharvest supply chain decisions. This is especially important for European pear (Pyrus communis) cultivars because losses due to spoilage and senescence must be minimized while ensuring proper ripening capacity is achieved (in part by satisfying a fruit chilling requirement). Reliable methods are lacking for accurate estimation of pear fruit maturity, and because ripening is maturity dependent, predicting ripening capacity is a challenge. In this study of the European pear cultivar 'd'Anjou', we sorted fruit at harvest based upon on-tree fruit position to build contrasts of maturity. Our sorting scheme showed clear contrasts of maturity between canopy positions, yet there was substantial overlap in the distribution of values for the index of absorbance difference (IAD), a non-destructive spectroscopic measurement that has been used as a proxy for pome fruit maturity. This presented an opportunity to explore a contrast of maturity that was more subtle than IAD could differentiate, and thus guided our subsequent transcriptome analysis of tissue samples taken at harvest and during storage. Using a novel approach that tests for condition-specific differences of co-expressed genes, we discovered genes with a phased character that mirrored our sorting scheme. The expression patterns of these genes are associated with fruit quality and ripening differences across the experiment. Functional profiles of these co-expressed genes are concordant with previous findings, and also offer new clues, and thus hypotheses, about genes involved in pear fruit quality, maturity, and ripening. This work may lead to new tools for enhanced postharvest management based on activity of gene co-expression modules, rather than individual genes. Further, our results indicate that modules may have utility within specific windows of time during postharvest management of 'd'Anjou' pear.
