ABSTRACT
Advances in genomics have expedited the improvement of several agriculturally important crops, but similar efforts in wheat (Triticum spp.) have been more challenging. This is largely owing to the size and complexity of the wheat genome [1], and the lack of genome-assembly data for multiple wheat lines [2,3]. Here we generated ten chromosome pseudomolecule and five scaffold assemblies of hexaploid wheat to explore the genomic diversity among wheat lines from global breeding programs. Comparative analysis revealed extensive structural rearrangements, introgressions from wild relatives and differences in gene content resulting from complex breeding histories aimed at improving adaptation to diverse environments, grain yield and quality, and resistance to stresses [4,5]. We provide examples outlining the utility of these genomes, including a detailed multi-genome-derived nucleotide-binding leucine-rich repeat protein repertoire involved in disease resistance and the characterization of Sm1 [6], a gene associated with insect resistance. These genome assemblies will provide a basis for functional gene discovery and breeding to deliver the next generation of modern wheat cultivars.
Subjects
Genetic Variation, Genome, Plant/genetics, Genomics, Internationality, Plant Breeding/methods, Triticum/genetics, Acclimatization/genetics, Animals, Centromere/genetics, Centromere/metabolism, Chromosome Mapping, Cloning, Molecular, DNA Copy Number Variations/genetics, DNA Transposable Elements/genetics, Edible Grain/genetics, Edible Grain/growth & development, Genes, Plant/genetics, Genetic Introgression, Haplotypes, Insects/pathogenicity, NLR Proteins/genetics, Plant Diseases/genetics, Plant Proteins/genetics, Polymorphism, Single Nucleotide/genetics, Polyploidy, Triticum/classification, Triticum/growth & development
ABSTRACT
Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome with a scaffold N50 of 88.8 kb that has a high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known genome rearrangements and identified one novel rearrangement. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.
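The scaffold N50 quoted in this abstract is a standard contiguity statistic: the length L such that scaffolds of length L or longer together hold at least half of the assembled bases. The sketch below shows the usual calculation on made-up lengths. The function name `n50` and the toy values are illustrative assumptions, not taken from the paper or its software.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of all bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths in kb (not data from the assembly).
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # prints 80: 100 + 80 = 180 >= 150 (half of 300)
```

Because N50 depends only on the length distribution, a higher value indicates a more contiguous assembly but says nothing by itself about correctness.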
Subjects
Contig Mapping/methods, Genome, Plant, Molecular Sequence Annotation/methods, Plant Proteins/genetics, Translocation, Genetic, Triticum/genetics, Algorithms, Contig Mapping/standards, Molecular Sequence Annotation/standards, Polymorphism, Genetic, Polyploidy
ABSTRACT
KEY MESSAGE: Genetic mapping of sensitivity to the Pyrenophora tritici-repentis effector ToxB allowed development of a diagnostic genetic marker, and investigation of wheat pedigrees allowed transmission of sensitive alleles to be tracked. Tan spot, caused by the necrotrophic fungal pathogen Pyrenophora tritici-repentis, is a major disease of wheat (Triticum aestivum). Secretion of the P. tritici-repentis effector ToxB is thought to play a part in mediating infection, causing chlorosis of plant tissue. Here, genetic analysis using an association mapping panel (n = 480) and a multiparent advanced generation intercross (MAGIC) population (n founders = 8, n progeny = 643) genotyped with a 90,000 feature single nucleotide polymorphism (SNP) array found ToxB sensitivity to be highly heritable (h² ≥ 0.9), controlled predominantly by the Tsc2 locus on chromosome 2B. Genetic mapping of Tsc2 delineated a 1921-kb interval containing 104 genes in the reference genome of ToxB-insensitive variety 'Chinese Spring'. This allowed development of a co-dominant genetic marker for Tsc2 allelic state, diagnostic for ToxB sensitivity in the association mapping panel. Phenotypic and genotypic analysis in a panel of wheat varieties that post-dated the association mapping panel further supported the diagnostic nature of the marker. Combining ToxB phenotype and genotypic data with wheat pedigree datasets allowed historic sources of ToxB sensitivity to be tracked, finding the variety 'Maris Dove' to likely be the historic source of sensitive Tsc2 alleles in the wheat germplasm surveyed. Exploration of the Tsc2 region gene space in the ToxB-sensitive line 'Synthetic W7984' identified candidate genes for future investigation. Additionally, a minor ToxB sensitivity QTL was identified on chromosome 2A. The resources presented here will be of immediate use for marker-assisted selection for ToxB insensitivity and the development of germplasm with additional genetic recombination within the Tsc2 region.
Subjects
Ascomycota, Disease Resistance/genetics, Host-Pathogen Interactions/genetics, Mycotoxins/toxicity, Plant Diseases/genetics, Triticum/genetics, Chromosome Mapping, Genetic Linkage, Genetic Markers, Genomics, Genotype, High-Throughput Nucleotide Sequencing, Phenotype, Plant Diseases/microbiology, Polymorphism, Single Nucleotide, Quantitative Trait Loci
ABSTRACT
Crop populations derived from experimental crosses enable the genetic dissection of complex traits and support modern plant breeding. Among these, multi-parent populations now play a central role. By mixing and recombining the genomes of multiple founders, multi-parent populations combine many commonly sought beneficial properties of genetic mapping populations. For example, they have high power and resolution for mapping quantitative trait loci, high genetic diversity and minimal population structure. Many multi-parent populations have been constructed in crop species, and their inbred germplasm and associated phenotypic and genotypic data serve as enduring resources. Their utility has grown from being a tool for mapping quantitative trait loci to a means of providing germplasm for breeding programmes. Genomics approaches, including de novo genome assemblies and gene annotations for the population founders, have allowed the imputation of rich sequence information into the descendant population, expanding the breadth of research and breeding applications of multi-parent populations. Here, we report recent successes from multi-parent populations in crops. We also propose an ideal genotypic, phenotypic and germplasm 'package' that multi-parent populations should feature to optimise their use as powerful community resources for crop research, development and breeding.
Subjects
Crops, Agricultural, Genomics, Plant Breeding, Chromosome Mapping, Crops, Agricultural/genetics, Genome, Plant, Quantitative Trait Loci
ABSTRACT
Using RNA sequencing technology and de novo transcriptome assembly, we compared representative sets of wild and domesticated accessions of common bean (Phaseolus vulgaris) from Mesoamerica. RNA was extracted at the first true-leaf stage, and de novo assembly was used to develop a reference transcriptome; the final data set consists of ~190,000 single nucleotide polymorphisms from 27,243 contigs in expressed genomic regions. A drastic reduction in nucleotide diversity (~60%) is evident for the domesticated form, compared with the wild form, and almost 50% of the contigs that are polymorphic were brought to fixation by domestication. In parallel, the effects of domestication decreased the diversity of gene expression (18%). While the coexpression networks for the wild and domesticated accessions demonstrate similar seminal network properties, they show distinct community structures that are enriched for different molecular functions. After simulating the demographic dynamics during domestication, we found that 9% of the genes were actively selected during domestication. We also show that selection induced a further reduction in the diversity of gene expression (26%) and was associated with 5-fold enrichment of differentially expressed genes. While there is substantial evidence of positive selection associated with domestication, in a few cases, this selection has increased the nucleotide diversity in the domesticated pool at target loci associated with abiotic stress responses, flowering time, and morphology.
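The reduction in nucleotide diversity reported in this abstract compares a diversity estimate between the wild and domesticated pools. A common estimator is π, the average number of pairwise differences per site. The sketch below computes π on invented aligned sequences and derives the relative loss of diversity. It is a toy illustration under stated assumptions: the function name, the sequences and the simple all-pairs estimator are hypothetical and are not the authors' pipeline.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site (pi) for aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites over every unordered pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Invented alignments (not data from the study).
wild = ["ACGTACGT", "ACGTTCGT", "ACCTACGA", "TCGTACGT"]
domesticated = ["ACGTACGT", "ACGTACGT", "ACGTTCGT", "ACGTACGT"]

pi_wild = nucleotide_diversity(wild)
pi_dom = nucleotide_diversity(domesticated)
loss = 1 - pi_dom / pi_wild  # fraction of diversity lost
print(pi_wild, pi_dom, loss)  # prints 0.25 0.0625 0.75
```

In this toy case the domesticated pool retains a quarter of the wild diversity, a 75% loss; the study's figure comes from genome-wide SNP data rather than a single short alignment.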
ABSTRACT
The grapevine (Vitis vinifera) cultivar Tannat is cultivated mainly in Uruguay for the production of high-quality red wines. Tannat berries have unusually high levels of polyphenolic compounds, producing wines with an intense purple color and remarkable antioxidant properties. We investigated the genetic basis of these important characteristics by sequencing the genome of the Uruguayan Tannat clone UY11 using Illumina technology, followed by a mixture of de novo assembly and iterative mapping onto the PN40024 reference genome. RNA sequencing data for genome reannotation were processed using a combination of reference-guided annotation and de novo transcript assembly, allowing 5901 previously unannotated or unassembled genes to be defined and resulting in the discovery of 1873 genes that were not shared with PN40024. Expression analysis showed that these cultivar-specific genes contributed substantially (up to 81.24%) to the overall expression of enzymes involved in the synthesis of phenolic and polyphenolic compounds that contribute to the unique characteristics of the Tannat berries. The characterization of the Tannat genome therefore indicated that the grapevine reference genome lacks many genes that appear to be relevant for the varietal phenotype.
Subjects
Genome, Plant, Polyphenols/biosynthesis, Vitis/genetics, Antioxidants/metabolism, Fruit/chemistry, Fruit/genetics, Molecular Sequence Annotation, Phenotype, Polyphenols/genetics, Reference Values, Sequence Analysis, RNA, Transcriptome, Uruguay, Vitis/metabolism
ABSTRACT
We developed a genome-wide transcriptomic atlas of grapevine (Vitis vinifera) based on 54 samples representing green and woody tissues and organs at different developmental stages as well as specialized tissues such as pollen and senescent leaves. Together, these samples expressed ~91% of the predicted grapevine genes. Pollen and senescent leaves had unique transcriptomes reflecting their specialized functions and physiological status. However, microarray and RNA-seq analysis grouped all the other samples into two major classes based on maturity rather than organ identity, namely, the vegetative/green and mature/woody categories. This division represents a fundamental transcriptomic reprogramming during the maturation process and was highlighted by three statistical approaches identifying the transcriptional relationships among samples (correlation analysis), putative biomarkers (O2PLS-DA approach), and sets of strongly and consistently expressed genes that define groups (topics) of similar samples (biclustering analysis). Gene coexpression analysis indicated that the mature/woody developmental program results from the reiterative coactivation of pathways that are largely inactive in vegetative/green tissues, often involving the coregulation of clusters of neighboring genes and global regulation based on codon preference. This global transcriptomic reprogramming during maturation has not been observed in herbaceous annual species and may be a defining characteristic of perennial woody plants.
Subjects
Gene Expression Regulation, Plant/genetics, Genes, Plant/genetics, Genome, Plant/genetics, Transcriptome, Vitis/genetics, Chromosomes, Plant/genetics, Cluster Analysis, Fruit/genetics, Fruit/growth & development, Fruit/physiology, Gene Expression, Gene Expression Profiling, Genetic Markers, Oligonucleotide Array Sequence Analysis, Organ Specificity, Plant Leaves/genetics, Plant Leaves/growth & development, Plant Leaves/physiology, Plant Stems/genetics, Plant Stems/growth & development, Plant Stems/physiology, Pollen/genetics, Pollen/growth & development, Pollen/physiology, RNA, Plant/genetics, RNA, Plant/metabolism, Species Specificity, Vitis/growth & development, Vitis/physiology
ABSTRACT
BACKGROUND: Grapevine berries undergo complex biochemical changes during fruit maturation, many of which are dependent upon the variety and its environment. In order to elucidate the varietal dependent developmental regulation of primary and specialized metabolism, berry skins of Cabernet Sauvignon and Shiraz were subjected to gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS) based metabolite profiling from pre-veraison to harvest. The generated dataset was augmented with transcript profiling using RNAseq. RESULTS: The analysis of the metabolite data revealed similar developmental patterns of change in primary metabolites between the two cultivars. Nevertheless, towards maturity the extent of change in the major organic acids and sugars (i.e. sucrose, trehalose, malate) and precursors of aromatic and phenolic compounds such as quinate and shikimate was greater in Shiraz compared to Cabernet Sauvignon. In contrast, distinct directional projections on the PCA plot of the two cultivars' samples towards maturation when using the specialized metabolite profiles were apparent, suggesting a cultivar-dependent regulation of the specialized metabolism. Generally, Shiraz displayed greater upregulation of the entire polyphenol pathway and specifically higher accumulation of piceid and coumaroyl anthocyanin forms than Cabernet Sauvignon from veraison onwards. Transcript profiling revealed coordinated increased transcript abundance for genes encoding enzymes of committing steps in the phenylpropanoid pathway. The anthocyanin metabolite profile showed F3'5'H-mediated delphinidin-type anthocyanin enrichment in both varieties towards maturation, consistent with the transcript data, indicating that the F3'5'H-governed branching step dominates the anthocyanin profile at late berry development. Correlation analysis confirmed the tightly coordinated metabolic changes during development, and suggested a source-sink relation between the central and specialized metabolism, stronger in Shiraz than Cabernet Sauvignon. RNAseq analysis also revealed that the two cultivars exhibited distinct patterns of changes in genes related to abscisic acid (ABA) biosynthesis enzymes. CONCLUSIONS: Compared with Cabernet Sauvignon, Shiraz showed a higher number of significant correlations between metabolites, which together with the relatively higher expression of flavonoid genes supports the evidence of increased accumulation of coumaroyl anthocyanins in that cultivar. Enhanced stress-related metabolism (e.g. trehalose, stilbene and ABA) in Shiraz berry skin is consistent with its relatively higher susceptibility to environmental cues.
Subjects
Fruit/metabolism, Metabolome, Transcriptome, Vitis/genetics, Anthocyanins/chemistry, Chromatography, Liquid, Flavonoids/chemistry, Fruit/genetics, Gas Chromatography-Mass Spectrometry, Polyphenols/chemistry, Vitis/classification, Vitis/metabolism, Wine
ABSTRACT
Medicago truncatula is one of the most studied model plants. Nevertheless, the genome of this legume remains incompletely determined. We used RNA-Seq to characterize the transcriptome during the early organogenesis of the nodule and during its functioning. We detected 37,333 expressed transcription units; to our knowledge, 1,670 had never been described before and were functionally annotated. We identified 7,595 new transcribed regions, mostly corresponding to 5' and 3' untranslated region extensions and new exons associated with 5,264 previously annotated genes. We also inferred 23,165 putative transcript isoforms from 6,587 genes and measured the abundance of transcripts for each isoform, which suggests an important role for alternative splicing in the generation of proteome diversity in M. truncatula. Finally, we carried out a differential expression analysis, which provided a comprehensive view of transcriptional reprogramming during nodulation. In particular, depletion of nitric oxide in roots inoculated with Sinorhizobium meliloti greatly increased our understanding of the role of this reactive species in the optimal establishment of the symbiotic interaction, revealing differential patterns of expression for 2,030 genes and pointing to the inhibition of the expression of defense genes.
Subjects
Medicago truncatula/microbiology, Nitric Oxide/metabolism, Sinorhizobium meliloti/growth & development, Symbiosis, Transcriptome, 3' Untranslated Regions, 5' Untranslated Regions, Alternative Splicing, Exons, Gene Expression Regulation, Plant, Genes, Plant, High-Throughput Nucleotide Sequencing, Introns, Medicago truncatula/genetics, Medicago truncatula/metabolism, Molecular Sequence Annotation, Plant Root Nodulation, Plant Roots/metabolism, Plant Roots/microbiology, RNA, Plant/genetics, Transcription Factors/genetics, Transcription Factors/metabolism, Transcription, Genetic
ABSTRACT
BACKGROUND: Plants such as grapevine (Vitis spp.) display significant inter-cultivar genetic and phenotypic variation. The genetic components underlying phenotypic diversity in grapevine must be understood in order to disentangle genetic and environmental factors. RESULTS: We have shown that cDNA sequencing by RNA-seq is a robust approach for the characterization of varietal diversity between a local grapevine cultivar (Corvina) and the PN40024 reference genome. We detected 15,161 known genes including 9463 with novel splice isoforms, and identified 2321 potentially novel protein-coding genes in non-annotated or unassembled regions of the reference genome. We also discovered 180 apparent private genes in the Corvina genome which were missing from the reference genome. CONCLUSIONS: The de novo assembly approach allowed a substantial amount of the Corvina transcriptome to be reconstructed, improving known gene annotations by robustly defining gene structures, annotating splice isoforms and detecting genes without annotations. The private genes we discovered are likely to be nonessential but could influence certain cultivar-specific characteristics. Therefore, the application of de novo transcriptome assembly should not be restricted to species lacking a reference genome because it can also improve existing reference genome annotations and identify novel, cultivar-specific genes.
Subjects
Gene Expression Profiling, Genetic Variation/genetics, Vitis/genetics, Fruit/genetics, Fruit/growth & development, Genes, Plant/genetics, Molecular Sequence Annotation, Molecular Sequence Data, Species Specificity, Vitis/growth & development
ABSTRACT
Lesion mimic mutants (LMMs) are a class of mutants in which hypersensitive cell death and defence responses are constitutively activated in the absence of pathogen attack. Various signalling molecules, such as salicylic acid (SA), reactive oxygen species (ROS), nitric oxide (NO), Ca²⁺, ethylene, and jasmonate, are involved in the regulation of multiple pathways controlling hypersensitive response (HR) activation, and LMMs are considered useful tools to understand the role played by the key elements of the HR cell death signalling cascade. Here the characterization of an Arabidopsis LMM lacking the function of the FZL gene is reported. This gene encodes a membrane-remodelling GTPase playing an essential role in the determination of thylakoid and chloroplast morphology. The mutant displayed alteration in chloroplast number, size, and shape, and the typical characteristics of an LMM, namely development of chlorotic lesions on rosette leaves and constitutive expression of genetic and biochemical markers associated with defence responses. The chloroplasts are a major source of ROS, and the characterization of this mutant suggests that their accumulation, triggered by damage to the chloroplast membranes, is a signal sufficient to start the HR signalling cascade, thus confirming the central role of the chloroplast in HR activation.
Subjects
Arabidopsis Proteins/genetics, Arabidopsis/genetics, Chloroplasts/genetics, GTP Phosphohydrolases/genetics, Genes, Plant/genetics, Mutation/genetics, Arabidopsis Proteins/metabolism, Ecotype, Environment, GTP Phosphohydrolases/metabolism, Gene Expression Regulation, Plant, Genetic Complementation Test, Genetic Markers, Phenotype, Real-Time Polymerase Chain Reaction, Temperature
ABSTRACT
With approximately 450 species, spiny Solanum species constitute the largest monophyletic group in the Solanaceae family, but a high-quality genome assembly from this group is presently missing. We obtained a chromosome-anchored genome assembly of eggplant (Solanum melongena), containing 34,916 genes, confirming that the diploid gene number in the Solanaceae is around 35,000. Comparative genomic studies with tomato (S. lycopersicum), potato (S. tuberosum) and pepper (Capsicum annuum) highlighted the rapid evolution of miRNA:mRNA regulatory pairs and R-type defense genes in the Solanaceae, and provided a genomic basis for the lack of steroidal glycoalkaloid compounds in the Capsicum genus. Using parsimony methods, we reconstructed the putative chromosomal complements of the key founders of the main Solanaceae clades and the rearrangements that led to the karyotypes of extant species and their ancestors. From 10% to 15% of the genes present in the four genomes were syntenic paralogs (ohnologs) generated by the pre-γ, γ and T paleopolyploidy events, and were enriched in transcription factors. Our data suggest that the basic gene network controlling fruit ripening is conserved in different Solanaceae clades, and that climacteric fruit ripening involves a differential regulation of relatively few components of this network, including CNR and ethylene biosynthetic genes.
Subjects
Chromosomes, Plant, Evolution, Molecular, Genome, Plant, Solanum melongena/genetics, Ethylenes/metabolism, Gene Regulatory Networks, MicroRNAs/genetics, Solanum melongena/metabolism
ABSTRACT
Next-generation sequencing technologies enable rapid and cheap genome-wide transcriptome analysis, providing vital information about gene structure, transcript expression, and alternative splicing. Key to this is the accurate identification of exon-exon junctions from RNA sequenced (RNA-seq) reads. A number of RNA-seq aligners capable of splitting reads across these splice junctions (SJs) have been developed; however, it has been shown that while they correctly identify most genuine SJs available in a given sample, they also often produce large numbers of incorrect SJs. Here, we describe the extent of this problem using popular RNA-seq mapping tools and present a new method, called Portcullis, to rapidly filter false SJs derived from spliced alignments. We show that Portcullis distinguishes between genuine and false-positive junctions to a high degree of accuracy across different species, samples, expression levels, error profiles, and read lengths. Portcullis is portable, efficient, and, to our knowledge, currently the only SJ prediction tool that reliably scales for use with large RNA-seq datasets and large, highly fragmented genomes, while delivering accurate SJs.
Subjects
RNA Splicing, RNA/metabolism, Software, Animals, Arabidopsis/genetics, Databases, Genetic, Drosophila/genetics, High-Throughput Nucleotide Sequencing, Humans, RNA/chemistry, RNA Splice Sites/genetics, Sequence Analysis, RNA
ABSTRACT
Background: The performance of RNA sequencing (RNA-seq) aligners and assemblers varies greatly across different organisms and experiments, and often the optimal approach is not known beforehand. Results: Here, we show that the accuracy of transcript reconstruction can be boosted by combining multiple methods, and we present a novel algorithm to integrate multiple RNA-seq assemblies into a coherent transcript annotation. Our algorithm can remove redundancies and select the best transcript models according to user-specified metrics, while solving common artifacts such as erroneous transcript chimerisms. Conclusions: We have implemented this method in an open-source Python3 and Cython program, Mikado, available on GitHub.
Subjects
Algorithms, Gene Expression Profiling/methods, Molecular Sequence Annotation/methods, Sequence Analysis, RNA/methods, Animals, High-Throughput Nucleotide Sequencing/methods, Humans, Plants/genetics, Software
ABSTRACT
Mycorrhizal fungi live in the roots of host plants and are crucial components of all forest ecosystems. A large-scale study of fungal genomics provides new insights into the evolution of mycorrhizae and a deep exploration of mycorrhizal diversity that helps to uncover the molecular and genetic details of fungal symbiotic relationships with plants.
Subjects
Genome, Fungal/genetics, Mycorrhizae/genetics, Selection, Genetic, Symbiosis/genetics, Virulence/genetics
ABSTRACT
AIMS: To investigate the feasibility of hepatitis B vaccination among heroin users, assessing adherence to the vaccination schedules and identifying factors associated with antibody response. DESIGN AND PARTICIPANTS: A large cohort study in nine public centres for drug users (PCDUs) in north-eastern Italy, with data collected between January 1989 and December 1998. A total of 1175 heroin users were selected and vaccinated with a recombinant vaccine using two schedules (0-1-6 months and 0-1-2 months). FINDINGS: Eighty-eight per cent of patients completed the vaccination series and a protective antibody response occurred in 77% of subjects. Completion of the vaccination series was not related to the length of the vaccination schedule or whether the patient was still in drug abuse treatment at the end of the series, but was related strongly to the number of patients enrolled at each PCDU (Spearman correlation = -0.93, P < 0.001). Four variables were significantly associated with lack of seroconversion in response to vaccination: older age (AOR = 0.91 per year, 95% CI 0.88-0.94, P < 0.001), 2-month vaccination schedule (AOR = 3.10, 95% CI 2.06-4.68, P < 0.001), HCV seropositivity (AOR = 0.69, 95% CI 0.47-0.99, P = 0.04), HIV seropositivity (AOR = 0.27, 95% CI 0.10-0.77, P = 0.01). CONCLUSIONS: A large-scale, multi-site hepatitis B vaccination programme for heroin users proved feasible and effective. The factors associated with a lack of antibody response may be useful in identifying patients who would benefit most from routine post-vaccination testing, with booster doses for non-responders. These results suggest that hepatitis B vaccination for drug users should become a routine public health practice.
Subjects
Hepatitis B Vaccines, Hepatitis B/prevention & control, Heroin Dependence, Patient Compliance/statistics & numerical data, Adult, Antibody Formation, Cohort Studies, Feasibility Studies, Female, Hepatitis B Vaccines/immunology, Heroin Dependence/immunology, Heroin Dependence/psychology, Humans, Italy/epidemiology, Male, Patient Compliance/psychology, Treatment Outcome, Vaccination/statistics & numerical data
ABSTRACT
BACKGROUND: Expansins are proteins that loosen plant cell walls in a pH-dependent manner, probably by increasing the relative movement among polymers, thus causing irreversible expansion. The expansin superfamily (EXP) comprises four distinct families: expansin A (EXPA), expansin B (EXPB), expansin-like A (EXLA) and expansin-like B (EXLB). There is experimental evidence that EXPA and EXPB proteins are required for cell expansion and developmental processes involving cell wall modification, whereas the exact functions of EXLA and EXLB remain unclear. The complete grapevine (Vitis vinifera) genome sequence has allowed the characterization of many gene families, but an exhaustive genome-wide analysis of expansin gene expression has not been attempted thus far. METHODOLOGY/PRINCIPAL FINDINGS: We identified 29 EXP superfamily genes in the grapevine genome, representing all four EXP families. Members of the same EXP family shared the same exon-intron structure, and phylogenetic analysis confirmed a closer relationship between EXP genes from woody species, i.e. grapevine and poplar (Populus trichocarpa), compared to those from Arabidopsis thaliana and rice (Oryza sativa). We also identified grapevine-specific duplication events involving the EXLB family. Global gene expression analysis confirmed a strong correlation among EXP genes expressed in mature and green/vegetative samples, respectively, as reported for other gene families in the recently published grapevine gene expression atlas. We also observed the specific co-expression of EXLB genes in woody organs, and the involvement of certain grapevine EXP genes in berry development and post-harvest withering. CONCLUSION: Our comprehensive analysis of the grapevine EXP superfamily confirmed and extended current knowledge about the structural and functional characteristics of this gene family, and also identified properties that are currently unique to grapevine expansin genes. Our data provide a model for the functional characterization of grapevine gene families by combining phylogenetic analysis with global gene expression profiling.