Results 1 - 20 of 76
1.
Mol Biol Evol ; 36(11): 2572-2590, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31350563

ABSTRACT

The influence that bacterial adaptation (or niche partitioning) within species has on gene spillover and transmission among bacterial populations occupying different niches is not well understood. Streptococcus agalactiae is an important bacterial pathogen that has a taxonomically diverse host range, making it an excellent model system to study these processes. Here, we analyze a global set of 901 genome sequences from nine diverse host species to advance our understanding of these processes. Bayesian clustering analysis delineated 12 major populations that closely aligned with niches. Comparative genomics revealed extensive gene gain/loss among populations and a large pan genome of 9,527 genes, which remained open and was strongly partitioned among niches. As a result, the biochemical characteristics of 11 populations were highly distinctive (significantly enriched). Positive selection was detected and biochemical characteristics of the dispensable genes under selection were enriched in ten populations. Despite the strong gene partitioning, phylogenomics detected gene spillover. In particular, tetracycline resistance (which likely evolved in the human-associated population) spread from humans to bovines, canines, seals, and fish, demonstrating how a gene selected in one host can ultimately be transmitted into another; biased transmission from humans to bovines was confirmed with a Bayesian migration analysis. Our findings show high bacterial genome plasticity acting in balance with selection pressure from distinct functional requirements of niches that is associated with an extensive and highly partitioned dispensable genome, likely facilitating continued and expansive adaptation.

2.
Plant J ; 89(4): 789-804, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27862469

ABSTRACT

The flowering plant Arabidopsis thaliana is a dicot model organism for research in many aspects of plant biology. A comprehensive annotation of its genome paves the way for understanding the functions and activities of all types of transcripts, including mRNA, the various classes of non-coding RNA, and small RNA. The TAIR10 annotation update had a profound impact on Arabidopsis research but was released more than 5 years ago. Maintaining the accuracy of the annotation continues to be a prerequisite for future progress. Using an integrative annotation pipeline, we assembled tissue-specific RNA-Seq libraries from 113 datasets and constructed 48 359 transcript models of protein-coding genes in eleven tissues. In addition, we annotated various classes of non-coding RNA including microRNA, long intergenic RNA, small nucleolar RNA, natural antisense transcript, small nuclear RNA, and small RNA using published datasets and in-house analytic results. Altogether, we identified 635 novel protein-coding genes, 508 novel transcribed regions, 5178 non-coding RNAs, and 35 846 small RNA loci that were formerly unannotated. Analysis of the splicing events and RNA-Seq based expression profiles revealed the landscapes of gene structures, untranslated regions, and splicing activities to be more intricate than previously appreciated. Furthermore, we present 692 uniformly expressed housekeeping genes, 43% of whose human orthologs are also housekeeping genes. This updated Arabidopsis genome annotation with a substantially increased resolution of gene models will not only further our understanding of the biological processes of this plant model but also of other species.


Subjects
Arabidopsis Proteins/genetics; Arabidopsis/genetics; Gene Expression Profiling; Gene Expression Regulation, Plant/genetics; Genome, Plant/genetics; RNA, Plant/genetics; Transcriptome/genetics
3.
Plant Cell Physiol ; 58(1): e4, 2017 01 01.
Article in English | MEDLINE | ID: mdl-28013278

ABSTRACT

ThaleMine (https://apps.araport.org/thalemine/) is a comprehensive data warehouse that integrates a wide array of genomic information of the model plant Arabidopsis thaliana. The data collection currently includes the latest structural and functional annotation from the Araport11 update, the Col-0 genome sequence, RNA-seq and array expression, co-expression, protein interactions, homologs, pathways, publications, alleles, germplasm and phenotypes. The data are collected from a wide variety of public resources. Users can browse gene-specific data through Gene Report pages, identify and create gene lists based on experiments or indexed keywords, and run GO enrichment analysis to investigate the biological significance of selected gene sets. Developed by the Arabidopsis Information Portal project (Araport, https://www.araport.org/), ThaleMine uses the InterMine software framework, which builds well-structured data, and provides powerful data query and analysis functionality. The warehoused data can be accessed by users via graphical interfaces, as well as programmatically via web-services. Here we describe recent developments in ThaleMine including new features and extensions, and discuss future improvements. InterMine has been broadly adopted by the model organism research community including nematode, rat, mouse, zebrafish, budding yeast, the modENCODE project, as well as being used for human data. ThaleMine is the first InterMine developed for a plant model. As additional new plant InterMines are developed by the legume and other plant research communities, the potential of cross-organism integrative data analysis will be further enabled.
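The programmatic access mentioned above can be sketched as follows. This is a minimal illustration, not the project's documented client: it assumes the usual InterMine web-service layout (a `service` root under the mine URL and a `query/results` endpoint that accepts a PathQuery XML string), and the gene symbol `FT` and the helper name `build_query_url` are hypothetical choices made for the example. The code only builds the request URL and does not contact any server.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr

# Assumption: the service root is the ThaleMine URL from the abstract plus "service".
SERVICE_ROOT = "https://apps.araport.org/thalemine/service"


def build_query_url(symbol: str, service_root: str = SERVICE_ROOT) -> str:
    """Return a URL for a gene lookup by symbol (no request is sent)."""
    # PathQuery XML: the columns to return and one equality constraint.
    path_query = (
        '<query model="genomic" view="Gene.primaryIdentifier Gene.symbol">'
        f'<constraint path="Gene.symbol" op="=" value={quoteattr(symbol)}/>'
        "</query>"
    )
    # urlencode percent-escapes the XML so it is safe inside a query string.
    params = urlencode({"query": path_query, "format": "json"})
    return f"{service_root}/query/results?{params}"


# "FT" is an illustrative gene symbol, not one taken from the abstract.
print(build_query_url("FT"))
```

Keeping URL construction separate from the HTTP call makes the query easy to inspect before anything is sent, which is useful when the availability of the service is uncertain.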


Subjects
Arabidopsis Proteins/genetics; Arabidopsis/genetics; Databases, Genetic; Gene Expression Profiling; Gene Expression Regulation, Plant/genetics; Arabidopsis Proteins/metabolism; Computational Biology/methods; Gene Ontology; Genomics/methods; Information Storage and Retrieval/methods; Internet; Protein Interaction Mapping/methods; Protein Interaction Maps/genetics; Reproducibility of Results; Sequence Analysis, RNA
4.
Plant Cell ; 26(5): 1925-1937, 2014 May.
Article in English | MEDLINE | ID: mdl-24876251

ABSTRACT

Polyploidization events are frequent among flowering plants, and the duplicate genes produced via such events contribute significantly to plant evolution. We sequenced the genome of wild radish (Raphanus raphanistrum), a Brassicaceae species that experienced a whole-genome triplication event prior to diverging from Brassica rapa. Despite substantial gene gains in these two species compared with Arabidopsis thaliana and Arabidopsis lyrata, ∼70% of the orthologous groups experienced gene losses in R. raphanistrum and B. rapa, with most of the losses occurring prior to their divergence. The retained duplicates show substantial divergence in sequence and expression. Based on comparison of A. thaliana and R. raphanistrum ortholog floral expression levels, retained radish duplicates diverged primarily via maintenance of ancestral expression level in one copy and reduction of expression level in others. In addition, retained duplicates differed significantly from genes that reverted to singleton state in function, sequence composition, expression patterns, network connectivity, and rates of evolution. Using these properties, we established a statistical learning model for predicting whether a duplicate would be retained postpolyploidization. Overall, our study provides new insights into the processes of plant duplicate loss, retention, and functional divergence and highlights the need for further understanding factors controlling duplicate gene fate.

5.
Nucleic Acids Res ; 43(Database issue): D1003-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25414324

ABSTRACT

The Arabidopsis Information Portal (https://www.araport.org) is a new online resource for plant biology research. It houses the Arabidopsis thaliana genome sequence and associated annotation. It was conceived as a framework that allows the research community to develop and release 'modules' that integrate, analyze and visualize Arabidopsis data that may reside at remote sites. The current implementation provides an indexed database of core genomic information. These data are made available through feature-rich web applications that provide search, data mining, and genome browser functionality, and also by bulk download and web services. Araport uses software from the InterMine and JBrowse projects to expose curated data from TAIR, GO, BAR, EBI, UniProt, PubMed and EPIC CoGe. The site also hosts 'science apps,' developed as prototypes for community modules that use dynamic web pages to present data obtained on-demand from third-party servers via RESTful web services. Designed for sustainability, the Arabidopsis Information Portal strategy exploits existing scientific computing infrastructure, adopts a practical mixture of data integration technologies and encourages collaborative enhancement of the resource by its user community.


Subjects
Arabidopsis/genetics; Databases, Genetic; Genome, Plant; Data Mining; Internet; Software
6.
Plant Cell Physiol ; 56(1): e1, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25432968

ABSTRACT

Medicago truncatula, a close relative of alfalfa (Medicago sativa), is a model legume used for studying symbiotic nitrogen fixation, mycorrhizal interactions and legume genomics. J. Craig Venter Institute (JCVI; formerly TIGR) has been involved in M. truncatula genome sequencing and annotation since 2002 and has maintained a web-based resource providing data to the community for this entire period. The website (http://www.MedicagoGenome.org) has seen major updates in the past year, where it currently hosts the latest version of the genome (Mt4.0), associated data and legacy project information, presented to users via a rich set of open-source tools. A JBrowse-based genome browser interface exposes tracks for visualization. Mutant gene symbols originally assembled and curated by the Frugoli lab are now hosted at JCVI and tie into our community annotation interface, Medicago EuCAP (to be integrated soon with our implementation of WebApollo). Literature pertinent to M. truncatula is indexed and made searchable via the Textpresso search engine. The site also implements MedicMine, an instance of InterMine that offers interconnectivity with other plant 'mines' such as ThaleMine and PhytoMine, and other model organism databases (MODs). In addition to these new features, we continue to provide keyword- and locus identifier-based searches served via a Chado-backed Tripal Instance, a BLAST search interface and bulk downloads of data sets from the iPlant Data Store (iDS). Finally, we maintain an E-mail helpdesk, facilitated by a JIRA issue tracking system, where we receive and respond to questions about the website and requests for specific data sets from the community.


Subjects
Computational Biology; Databases, Genetic; Genome, Plant/genetics; Medicago truncatula/genetics; User-Computer Interface; Information Storage and Retrieval; Internet
7.
BMC Genomics ; 15: 312, 2014 Apr 27.
Article in English | MEDLINE | ID: mdl-24767513

ABSTRACT

BACKGROUND: Medicago truncatula, a close relative of alfalfa, is a preeminent model for studying nitrogen fixation, symbiosis, and legume genomics. The Medicago sequencing project began in 2003 with the goal of deciphering sequences originating from the euchromatic portion of the genome. The initial sequencing approach was based on a BAC tiling path, culminating in a BAC-based assembly (Mt3.5) as well as an in-depth analysis of the genome published in 2011. RESULTS: Here we describe a further improved and refined version of the M. truncatula genome (Mt4.0) based on de novo whole genome shotgun assembly of a majority of Illumina and 454 reads using ALLPATHS-LG. The ALLPATHS-LG scaffolds were anchored onto the pseudomolecules on the basis of alignments to both the optical map and the genotyping-by-sequencing (GBS) map. The Mt4.0 pseudomolecules encompass ~360 Mb of actual sequence spanning 390 Mb, of which ~330 Mb align perfectly with the optical map, a drastic improvement over the BAC-based Mt3.5, which contained only 70% of the sequence (~250 Mb) of the current version. Most of the sequences and genes that previously resided on the unanchored portion of Mt3.5 have now been incorporated into the Mt4.0 pseudomolecules, with the exception of ~28 Mb of unplaced sequences. With regard to gene annotation, the genome has been re-annotated through our gene prediction pipeline, which integrates EST, RNA-seq, protein and gene prediction evidence. A total of 50,894 genes (31,661 high confidence and 19,233 low confidence) are included in Mt4.0, overlapping ~82% of the gene loci annotated in Mt3.5. Of the remaining genes, 14% of the Mt3.5 genes have been deprecated to an "unsupported" status and 4% are absent from the Mt4.0 predictions. CONCLUSIONS: Mt4.0 and its associated resources, such as genome browsers, BLAST-able datasets and gene information pages, can be found on the JCVI Medicago web site (http://www.jcvi.org/medicago). The assembly and annotation have been deposited in GenBank (BioProject: PRJNA10791). The heavily curated chromosomal sequences and associated gene models of Medicago will serve as a better reference for legume biology and comparative genomics.
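The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the abstract; the variable names and dictionary labels are illustrative, and the percentages are the approximate values the authors report.

```python
# Gene counts reported for the Mt4.0 annotation.
high_confidence = 31_661
low_confidence = 19_233
reported_total = 50_894

# The two confidence classes should add up to the reported total.
computed_total = high_confidence + low_confidence
print("total matches:", computed_total == reported_total)

# Reported fate of Mt3.5 gene loci, in percent (approximate in the abstract).
mt35_fate = {
    "overlapped by Mt4.0 genes": 82,
    "deprecated as unsupported": 14,
    "absent from Mt4.0": 4,
}
fate_sum = sum(mt35_fate.values())
print("Mt3.5 fates sum to", fate_sum, "percent")

# Mt3.5 sequence (~250 Mb) as a share of Mt4.0 sequence (~360 Mb);
# the abstract rounds this to 70%.
mt35_share = 250 / 360
print(f"Mt3.5 share of Mt4.0 sequence: {mt35_share:.1%}")
```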


Subjects
Genome, Plant; Medicago truncatula/genetics; Chromosomes, Artificial, Bacterial
8.
J Gen Virol ; 95(Pt 4): 836-848, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24394697

ABSTRACT

From 1 January 2009 to 31 May 2013, 15 287 respiratory specimens submitted to the Clinical Virology Laboratory at the Children's Hospital Colorado were tested for human coronavirus RNA by reverse transcription-PCR. Human coronaviruses HKU1, OC43, 229E and NL63 co-circulated during each of the respiratory seasons but with significant year-to-year variability, and cumulatively accounted for 7.4-15.6 % of all samples tested during the months of peak activity. A total of 79 (0.5 % prevalence) specimens were positive for human betacoronavirus HKU1 RNA. Genotypes HKU1 A and B were both isolated from clinical specimens and propagated on primary human tracheal-bronchial epithelial cells cultured at the air-liquid interface and were neutralized in vitro by human intravenous immunoglobulin and by polyclonal rabbit antibodies to the spike glycoprotein of HKU1. Phylogenetic analysis of the deduced amino acid sequences of seven full-length genomes of Colorado HKU1 viruses and the spike glycoproteins from four additional HKU1 viruses from Colorado and three from Brazil demonstrated remarkable conservation of these sequences with genotypes circulating in Hong Kong and France. Within genotype A, all but one of the Colorado HKU1 sequences formed a unique subclade defined by three amino acid substitutions (W197F, F613Y and S752F) in the spike glycoprotein and exhibited a unique signature in the acidic tandem repeat in the N-terminal region of the nsp3 subdomain. Elucidating the function of and mechanisms responsible for the formation of these varying tandem repeats will increase our understanding of the replication process and pathogenicity of HKU1 and potentially of other coronaviruses.
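The prevalence figure quoted above follows directly from the specimen counts. A minimal sketch, assuming prevalence here means positive specimens divided by all specimens tested; the helper name `prevalence_percent` is hypothetical.

```python
def prevalence_percent(positive: int, tested: int) -> float:
    """Return positives as a percentage of all specimens tested."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    return 100.0 * positive / tested


# Counts from the abstract: 79 HKU1-positive specimens out of 15 287 tested.
hku1 = prevalence_percent(79, 15_287)
print(f"HKU1 prevalence: {hku1:.1f} %")
```

Rounded to one decimal place, the result agrees with the 0.5 % stated in the abstract.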


Subjects
Coronaviridae Infections/epidemiology; Coronaviridae Infections/virology; Coronaviridae/classification; Coronaviridae/isolation & purification; Respiratory Tract Infections/epidemiology; Respiratory Tract Infections/virology; Antibodies, Neutralizing/blood; Antibodies, Viral/blood; Cells, Cultured; Cluster Analysis; Colorado; Coronaviridae/genetics; Genotype; Humans; Molecular Sequence Data; Phylogeny; RNA, Viral/genetics; Sequence Analysis, DNA; Spike Glycoprotein, Coronavirus/genetics; Spike Glycoprotein, Coronavirus/immunology; Virus Cultivation
9.
Mol Plant Microbe Interact ; 25(8): 1118-31, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22550957

ABSTRACT

Defensins are a class of small and diverse cysteine-rich proteins found in plants, insects, and vertebrates, which share a common tertiary structure and usually exert broad-spectrum antimicrobial activities. We used a bioinformatic approach to scan the Vitis vinifera genome and identified 79 defensin-like sequences (DEFL) corresponding to 46 genes and allelic variants, plus 33 pseudogenes and gene fragments. Expansion and diversification of grapevine DEFL has occurred after the split from the last common ancestor with the genera Medicago and Arabidopsis. Grapevine DEFL localization on the 'Pinot Noir' genome revealed the presence of several clusters likely evolved through local duplications. By sequencing reverse-transcription polymerase chain reaction products, we could demonstrate the expression of grapevine DEFL with no previously reported record of expression. Many of these genes are predominantly or exclusively expressed in tissues linked to plant reproduction, consistent with findings in other plant species, and some of them accumulated at fruit ripening. The transcripts of five DEFL were also significantly upregulated in tissues infected with Botrytis cinerea, a necrotrophic mold, suggesting a role of these genes in defense against this pathogen. Finally, three novel defensins were discovered among the identified DEFL. They inhibit B. cinerea conidia germination when expressed as recombinant proteins.


Subjects
Defensins/genetics; Multigene Family; Vitis/genetics; Amino Acid Sequence; Botrytis/pathogenicity; Disease Resistance/genetics; Gene Expression Regulation, Plant; Genome, Plant; Molecular Sequence Data; Phylogeny; Recombinant Proteins/genetics; Recombinant Proteins/metabolism; Vitis/microbiology
10.
BMC Genomics ; 13: 368, 2012 Aug 02.
Article in English | MEDLINE | ID: mdl-22857610

ABSTRACT

BACKGROUND: Soybean (Glycine max (L.) Merr.) resistance to any population of Heterodera glycines (I.), or Fusarium virguliforme (Aoki, O'Donnell, Homma & Lattanzi) required a functional allele at Rhg1/Rfs2. H. glycines, the soybean cyst nematode (SCN), was an ancient, endemic pest of soybean, whereas F. virguliforme, the causal agent of sudden death syndrome (SDS), was a recent, regional pest. This study examined the role of a receptor-like kinase (RLK), GmRLK18-1 (gene model Glyma_18_02680 at 1,071 kbp on chromosome 18 of the genome sequence), within the Rhg1/Rfs2 locus in causing resistance to SCN and SDS. RESULTS: A BAC (B73p06) encompassing the Rhg1/Rfs2 locus was sequenced from a resistant cultivar and compared to the sequences of two susceptible cultivars, from which 800 SNPs were found. Sequence alignments inferred that the resistance allele was an introgressed region of about 59 kbp at the center of which the GmRLK18-1 was the most polymorphic gene and encoded protein. Analyses were made of plants that were either heterozygous at, or transgenic (and so hemizygous at a new location) with, the resistance allele of GmRLK18-1. Those plants infested with either H. glycines or F. virguliforme showed that the allele for resistance was dominant. In the absence of Rhg4 the GmRLK18-1 was sufficient to confer nearly complete resistance to both root and leaf symptoms of SDS caused by F. virguliforme and provided partial resistance to three different populations of nematodes (mature female cysts were reduced by 30-50%). In the presence of Rhg4 the plants with the transgene were nearly classed as fully resistant to SCN (females reduced to 11% of the susceptible control) as well as SDS. A reduction in the rate of early seedling root development was also shown to be caused by the resistance allele of the GmRLK18-1. Field trials of transgenic plants showed an increase in foliar susceptibility to insect herbivory. CONCLUSIONS: The inference that soybean has adapted part of an existing pathogen recognition and defense cascade (H. glycines; SCN and insect herbivory) to a new pathogen (F. virguliforme; SDS) has broad implications for crop improvement. Stable resistance to many pathogens might be achieved by manipulating the genes encoding a small number of pathogen recognition proteins.


Subjects
Glycine max/metabolism; Plant Proteins/genetics; Alleles; Animals; Base Sequence; Death, Sudden; Female; Genes, Plant; Genetic Loci; Genetic Pleiotropy; Genotype; Molecular Sequence Data; Nematoda/pathogenicity; Plant Diseases/genetics; Plant Diseases/parasitology; Plant Leaves/genetics; Plant Leaves/growth & development; Plant Leaves/metabolism; Plant Proteins/metabolism; Plant Roots/genetics; Plant Roots/growth & development; Plant Roots/metabolism; Plants, Genetically Modified/genetics; Plants, Genetically Modified/metabolism; Polymorphism, Single Nucleotide; Signal Transduction/genetics; Glycine max/genetics; Glycine max/growth & development; Syndrome; Transgenes
11.
J Gen Virol ; 93(Pt 11): 2387-2398, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22837419

ABSTRACT

This study compared the complete genome sequences of 16 NL63 strain human coronaviruses (hCoVs) from respiratory specimens of paediatric patients with respiratory disease in Colorado, USA, and characterized the epidemiology and clinical characteristics associated with circulating NL63 viruses over a 3-year period. From 1 January 2009 to 31 December 2011, 92 of 9380 respiratory specimens were found to be positive for NL63 RNA by PCR, an overall prevalence of 1 %. NL63 viruses were circulating during all 3 years, but there was considerable yearly variation in prevalence and the month of peak incidence. Phylogenetic analysis comparing the genome sequences of the 16 Colorado NL63 viruses with those of the prototypical hCoV-NL63 and three other NL63 viruses from the Netherlands demonstrated that there were three genotypes (A, B and C) circulating in Colorado from 2005 to 2010, and evidence of recombination between virus strains was found. Genotypes B and C co-circulated in Colorado in 2005, 2009 and 2010, but genotype A circulated only in 2005 when it was the predominant NL63 strain. Genotype C represents a new lineage that has not been described previously. The greatest variability in the NL63 virus genomes was found in the N-terminal domain (NTD) of the spike gene (nt 1-600, aa 1-200). Ten different amino acid sequences were found in the NTD of the spike protein among these NL63 strains and the 75 partial published sequences of NTDs from strains found at different times throughout the world.


Subjects
Coronavirus NL63, Human/genetics; Genetic Variation; Genotype; Membrane Glycoproteins/genetics; Recombination, Genetic; Viral Envelope Proteins/genetics; Adolescent; Child; Child, Preschool; Colorado/epidemiology; Coronavirus Infections/epidemiology; Coronavirus Infections/virology; Female; Genome, Viral; Humans; Infant; Infant, Newborn; Male; Molecular Sequence Data; Phylogeny; Protein Structure, Tertiary; Spike Glycoprotein, Coronavirus; Time Factors
12.
Theor Appl Genet ; 124(4): 685-95, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22069119

ABSTRACT

The availability of genomic resources can facilitate progress in plant breeding through the application of advanced molecular technologies for crop improvement. This is particularly important in the case of less researched crops such as cassava, a staple and food security crop for more than 800 million people. Here, expressed sequence tags (ESTs) were generated from five drought-stressed and well-watered cassava varieties. Two cDNA libraries were developed: one from root tissue (CASR), the other from leaf, stem and stem meristem tissue (CASL). Sequencing generated 706 contigs and 3,430 singletons. These sequences were combined with those from two other EST sequencing initiatives and filtered based on sequence quality. Quality sequences were aligned using CAP3 and embedded in a Windows browser called HarvEST:Cassava, which is made available. HarvEST:Cassava consists of a Unigene set of 22,903 quality sequences. A total of 2,954 putative SNPs were identified. Of these, 1,536 SNPs from 1,170 contigs and 53 cassava genotypes were selected for SNP validation using Illumina's GoldenGate assay. As a result, 1,190 SNPs were validated technically and biologically. The location of validated SNPs on scaffolds of the cassava genome sequence (v.4.1) is provided. A diversity assessment of 53 cassava varieties reveals some sub-structure based on geographical origin, greater diversity in the Americas as opposed to Africa, and similar levels of diversity in West Africa and southern, eastern and central Africa. The resources presented allow for improved genetic dissection of economically important traits and the application of modern genomics-based approaches to cassava breeding and conservation.


Subjects
Genes, Plant/genetics; High-Throughput Nucleotide Sequencing; Manihot/genetics; Plant Roots/genetics; Polymorphism, Single Nucleotide/genetics; Africa; Chromosome Mapping; DNA, Complementary/genetics; DNA, Plant/genetics; Expressed Sequence Tags; Gene Library; Genotype; Manihot/growth & development; Phylogeny; Plant Roots/growth & development
13.
G3 (Bethesda) ; 12(3)2022 03 04.
Article in English | MEDLINE | ID: mdl-35100357

ABSTRACT

Many studies have highlighted the complex and diverse basis for heterosis in inbred crops. Despite the lack of a consensus model, it is vital that we turn our attention to understanding heterosis in undomesticated, heterozygous, and polyploid species, such as willow (Salix spp.). Shrub willow is a dedicated energy crop bred to be fast-growing and high yielding on marginal land without competing with food crops. A trend in willow breeding is the consistent pattern of heterosis in triploids produced from crosses between diploid and tetraploid species. Here, we test whether differentially expressed genes are associated with heterosis in triploid families derived from diploid Salix purpurea, diploid Salix viminalis, and tetraploid Salix miyabeana parents. Three biological replicates of shoot tips from all family progeny and parents were collected after 12 weeks in the greenhouse and RNA extracted for RNA-Seq analysis. This study provides evidence that nonadditive patterns of gene expression are correlated with nonadditive phenotypic expression in interspecific triploid hybrids of willow. Expression-level dominance was most correlated with heterosis for biomass yield traits and was highly enriched for processes involved in starch and sucrose metabolism. In addition, there was a global dosage effect of parent alleles in triploid hybrids, with expression proportional to copy number variation. Importantly, differentially expressed genes between family parents were most predictive of heterosis for both field and greenhouse collected traits. Altogether, these data will be used to progress models of heterosis to complement the growing genomic resources available for the improvement of heterozygous perennial bioenergy crops.


Subjects
Salix; Triploidy; DNA Copy Number Variations; Gene Expression Regulation, Plant; Humans; Hybrid Vigor/genetics; Hybridization, Genetic; Plant Breeding; Salix/genetics
14.
BMC Genomics ; 12: 1-11, 2011 Jul 06.
Article in English | MEDLINE | ID: mdl-21733171

ABSTRACT

BACKGROUND: Single nucleotide polymorphisms (SNPs) are the most common type of sequence variation among plants and are often functionally important. We describe the use of 454 technology and high resolution melting analysis (HRM) for high throughput SNP discovery in tetraploid alfalfa (Medicago sativa L.), a species with high economic value but limited genomic resources. RESULTS: The alfalfa genotypes selected from M. sativa subsp. sativa var. 'Chilean' and M. sativa subsp. falcata var. 'Wisfal', which differ in water stress sensitivity, were used to prepare cDNA from tissue of clonally-propagated plants grown under either well-watered or water-stressed conditions, and then pooled for 454 sequencing. Based on 125.2 Mb of raw sequence, a total of 54,216 unique sequences were obtained including 24,144 tentative consensus (TCs) sequences and 30,072 singletons, ranging from 100 bp to 6,662 bp in length, with an average length of 541 bp. We identified 40,661 candidate SNPs distributed throughout the genome. A sample of candidate SNPs were evaluated and validated using high resolution melting (HRM) analysis. A total of 3,491 TCs harboring 20,270 candidate SNPs were located on the M. truncatula (MT 3.5.1) chromosomes. Gene Ontology assignments indicate that sequences obtained cover a broad range of GO categories. CONCLUSIONS: We describe an efficient method to identify thousands of SNPs distributed throughout the alfalfa genome covering a broad range of GO categories. Validated SNPs represent valuable molecular marker resources that can be used to enhance marker density in linkage maps, identify potential factors involved in heterosis and genetic variation, and as tools for association mapping and genomic selection in alfalfa.


Subjects
Medicago sativa/genetics; Polymorphism, Single Nucleotide; Base Sequence; Expressed Sequence Tags; Genome; Genome-Wide Association Study; Genotype; Molecular Sequence Data; Phase Transition; Plant Roots/genetics; Plant Shoots/genetics; Sequence Alignment; Sequence Analysis, DNA; Tetraploidy
15.
BMC Plant Biol ; 11: 56, 2011 Mar 29.
Article in English | MEDLINE | ID: mdl-21447154

ABSTRACT

BACKGROUND: Pigeonpea [Cajanus cajan (L.) Millsp.] is an important legume crop of rainfed agriculture. Despite concerted research efforts directed at pigeonpea improvement, the stagnant productivity of pigeonpea during the last several decades may be attributed to the prevalence of various biotic and abiotic constraints, and the situation is exacerbated by the inadequate genomic resources available for any molecular breeding programme for accelerated crop improvement. With the objective of enhancing genomic resources for pigeonpea, this study reports, for the first time, the large-scale development of SSR markers from BAC-end sequences and their subsequent use for genetic mapping and hybridity testing in pigeonpea. RESULTS: A set of 88,860 BAC (bacterial artificial chromosome)-end sequences (BESs) was generated after constructing two BAC libraries using HindIII (34,560 clones) and BamHI (34,560 clones) restriction enzymes. Clustering based on sequence identity of BESs yielded a set of >52K non-redundant sequences, comprising 35 Mbp or >4% of the pigeonpea genome. These sequences were analyzed to develop annotation lists and subdivide the BESs into genome fractions (e.g., genes, retroelements, transposons and non-annotated sequences). Parallel analysis of BESs for microsatellites or simple sequence repeats (SSRs) identified 18,149 SSRs, from which a set of 6,212 SSRs was selected for further analysis. A total of 3,072 novel SSR primer pairs were synthesized and tested for length polymorphism on a set of 22 parental genotypes of 13 mapping populations segregating for traits of interest. In total, we identified 842 polymorphic SSR markers that will have utility in pigeonpea improvement. Based on these markers, the first SSR-based genetic map comprising 239 loci was developed for this previously uncharacterized genome. The utility of the developed SSR markers was also demonstrated by identifying a set of 42 markers each for two hybrids (ICPH 2671 and ICPH 2438) for genetic purity assessment in commercial hybrid breeding programmes. CONCLUSION: In summary, while the BAC libraries and BESs should be useful for genomics studies, the BES-SSR markers and the genetic map should be very useful for linking the genetic map with a future physical map as well as for molecular breeding in pigeonpea.


Subjects
Cajanus/genetics , Chimera/genetics , Bacterial Artificial Chromosomes/genetics , Microsatellite Repeats , Base Sequence , Chromosome Mapping , Genetic Markers , Genotype , Genetic Hybridization , Molecular Sequence Data
16.
Plant Biotechnol J ; 9(8): 922-31, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21615673

ABSTRACT

Chickpea (Cicer arietinum L.) is an important legume crop in the semi-arid regions of Asia and Africa. Gains in crop productivity have been low, however, particularly because of biotic and abiotic stresses. To help enhance crop productivity using molecular breeding techniques, next-generation sequencing technologies such as Roche/454 and Illumina/Solexa were used to determine the sequence of most gene transcripts and to identify drought-responsive genes and gene-based molecular markers. A total of 103,215 tentative unique sequences (TUSs) have been produced from 435,018 Roche/454 reads and 21,491 Sanger expressed sequence tags (ESTs). Putative functions were determined for 49,437 (47.8%) of the TUSs, and gene ontology assignments were determined for 20,634 (41.7%) of the TUSs. Comparison of the chickpea TUSs with the Medicago truncatula genome assembly (Mt 3.5.1 build) resulted in 42,141 aligned TUSs with putative gene structures (including 39,281 predicted intron/splice junctions). Alignment of ∼37 million Illumina/Solexa tags generated from drought-challenged root tissues of two chickpea genotypes against the TUSs identified 44,639 differentially expressed TUSs. The TUSs were also used to identify a diverse set of markers, including 728 simple sequence repeats (SSRs), 495 single nucleotide polymorphisms (SNPs), 387 conserved orthologous sequence (COS) markers, and 2,088 intron-spanning region (ISR) markers. This resource will be useful for basic and applied research for genome analysis and crop improvement in chickpea.


Subjects
Chromosome Mapping/methods , Cicer/genetics , Gene Expression Profiling/methods , Plant Genome , Africa , Asia , Cicer/metabolism , Cicer/physiology , Droughts , Energy Metabolism , Expressed Sequence Tags , Plant Gene Expression Regulation , Gene Library , Genetic Markers , Genotype , Introns , Medicago truncatula/genetics , Microsatellite Repeats , Plant Roots/genetics , Single Nucleotide Polymorphism , Sequence Alignment/methods , Physiological Stress , Transcription Factors/genetics
17.
BMC Bioinformatics ; 10: 309, 2009 Sep 23.
Article in English | MEDLINE | ID: mdl-19775460

ABSTRACT

BACKGROUND: SSWAP (Simple Semantic Web Architecture and Protocol; pronounced "swap") is an architecture, protocol, and platform for using reasoning to semantically integrate heterogeneous disparate data and services on the web. SSWAP was developed as a hybrid semantic web services technology to overcome limitations found in both pure web service technologies and pure semantic web technologies. RESULTS: There are currently over 2400 resources published in SSWAP. Approximately two dozen are custom-written services for QTL (Quantitative Trait Loci) and mapping data for legumes and grasses (grains). The remaining are wrappers to Nucleic Acids Research Database and Web Server entries. As an architecture, SSWAP establishes how clients (users of data, services, and ontologies), providers (suppliers of data, services, and ontologies), and discovery servers (semantic search engines) interact to allow for the description, querying, discovery, invocation, and response of semantic web services. As a protocol, SSWAP provides the vocabulary and semantics to allow clients, providers, and discovery servers to engage in semantic web services. The protocol is based on the W3C-sanctioned first-order description logic language OWL DL. As an open source platform, a discovery server running at http://sswap.info (as in to "swap info") uses the description logic reasoner Pellet to integrate semantic resources. The platform hosts an interactive guide to the protocol at http://sswap.info/protocol.jsp, developer tools at http://sswap.info/developer.jsp, and a portal to third-party ontologies at http://sswapmeet.sswap.info (a "swap meet"). 
CONCLUSION: SSWAP addresses the three basic requirements of a semantic web services architecture (i.e., a common syntax, shared semantic, and semantic discovery) while addressing three technology limitations common in distributed service systems: i.e., i) the fatal mutability of traditional interfaces, ii) the rigidity and fragility of static subsumption hierarchies, and iii) the confounding of content, structure, and presentation. SSWAP is novel by establishing the concept of a canonical yet mutable OWL DL graph that allows data and service providers to describe their resources, to allow discovery servers to offer semantically rich search engines, to allow clients to discover and invoke those resources, and to allow providers to respond with semantically tagged data. SSWAP allows for a mix-and-match of terms from both new and legacy third-party ontologies in these graphs.


Subjects
Computational Biology/methods , Information Dissemination/methods , Semantics , Software , Factual Databases , Information Storage and Retrieval , Internet , User-Computer Interface
18.
BMC Genomics ; 10: 539, 2009 Nov 18.
Article in English | MEDLINE | ID: mdl-19922648

ABSTRACT

BACKGROUND: The Brassica species, related to Arabidopsis thaliana, include an important group of crops and represent an excellent system for studying the evolutionary consequences of polyploidy. Previous studies have led to a proposed structure for an ancestral karyotype and models for the evolution of the B. rapa genome by triplication and segmental rearrangement, but these have not been validated at the sequence level. RESULTS: We developed computational tools to analyse the public collection of B. rapa BAC end sequence, in order to identify candidates for representing collinearity discontinuities between the genomes of B. rapa and A. thaliana. For each putative discontinuity, one of the BACs was sequenced and analysed for collinearity with the genome of A. thaliana. Additional BAC clones were identified and sequenced as part of ongoing efforts to sequence four chromosomes of B. rapa. Strikingly few of the 19 inter-chromosomal rearrangements corresponded to the set of collinearity discontinuities anticipated on the basis of previous studies. Our analyses revealed numerous instances of newly detected collinearity blocks. For B. rapa linkage group A8, we were able to develop a model for the derivation of the chromosome from the ancestral karyotype. We were also able to identify a rearrangement event in the ancestor of B. rapa that was not shared with the ancestor of A. thaliana, and is represented in triplicate in the B. rapa genome. In addition to inter-chromosomal rearrangements, we identified and analysed 32 BACs containing the end points of segmental inversion events. CONCLUSION: Our results show that previous studies of segmental collinearity between the A. thaliana, Brassica and ancestral karyotype genomes, although very useful, represent over-simplifications of their true relationships. The presence of numerous cryptic collinear genome segments and the frequent occurrence of segmental inversions mean that inference of the positions of genes in B. rapa based on the locations of orthologues in A. thaliana can be misleading. Our results will be of relevance to a wide range of plants that have polyploid genomes, many of which are being considered according to a paradigm of comprising conserved synteny blocks with respect to sequenced, related genomes.


Subjects
Brassica rapa/genetics , Molecular Evolution , Plant Genome/genetics , Genomics , Arabidopsis/genetics , Bacterial Artificial Chromosomes/genetics , Plant Chromosomes/genetics , Molecular Cloning , Plant DNA/genetics , Gene Rearrangement , Karyotyping , Reproducibility of Results , DNA Sequence Analysis
19.
BMC Genomics ; 10: 523, 2009 Nov 15.
Article in English | MEDLINE | ID: mdl-19912666

ABSTRACT

BACKGROUND: Chickpea (Cicer arietinum L.), an important grain legume crop of the world, is seriously challenged by terminal drought and salinity stresses. However, only a very limited number of molecular markers and candidate genes are available for undertaking molecular breeding in chickpea to tackle these stresses. This study reports the generation and analysis of a comprehensive resource of drought- and salinity-responsive expressed sequence tags (ESTs) and gene-based markers. RESULTS: A total of 20,162 (18,435 high quality) drought- and salinity-responsive ESTs were generated from ten different root tissue cDNA libraries of chickpea. Sequence editing, clustering and assembly analysis resulted in 6,404 unigenes (1,590 contigs and 4,814 singletons). Functional annotation of unigenes based on BLASTX analysis showed that 46.3% (2,965) had significant similarity (≤1E-05) to sequences in the non-redundant UniProt database. BLASTN analysis of unique sequences with ESTs of four legume species (Medicago, Lotus, soybean and groundnut) and three model plant species (rice, Arabidopsis and poplar) provided insights on conserved genes across legumes as well as novel transcripts for chickpea. Of 2,965 (46.3%) significant unigenes, only 2,071 (32.3%) unigenes could be functionally categorised according to Gene Ontology (GO) descriptions. A total of 2,029 sequences containing 3,728 simple sequence repeats (SSRs) were identified and 177 new EST-SSR markers were developed. Experimental validation of a set of 77 SSR markers on 24 genotypes revealed 230 alleles with an average of 4.6 alleles per marker and average polymorphism information content (PIC) value of 0.43. Besides SSR markers, 21,405 high confidence single nucleotide polymorphisms (SNPs) in 742 contigs (with ≥5 ESTs) were also identified. Recognition sites for restriction enzymes were identified for 7,884 SNPs in 240 contigs. Hierarchical clustering of 105 selected contigs provided clues about stress-responsive candidate genes, and their expression profile showed predominance in specific stress-challenged libraries. CONCLUSION: The generated set of chickpea ESTs serves as a resource of high quality transcripts for gene discovery and development of functional markers associated with abiotic stress tolerance that will be helpful to facilitate chickpea breeding. Mapping of gene-based markers in chickpea will also add more anchoring points to align genomes of chickpea and other legume species.
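
The polymorphism information content (PIC) reported in this abstract is commonly computed from the allele frequencies at a marker as PIC = 1 − Σ pᵢ² − Σᵢ<ⱼ 2pᵢ²pⱼ². The abstract does not say which formula or software the authors used, so the short Python sketch below assumes this standard definition; the function name `pic` is hypothetical.

```python
from itertools import combinations


def pic(allele_counts):
    """Polymorphism information content for one marker.

    allele_counts: observed counts (or frequencies) of each allele;
    values are normalised so that they sum to 1.
    """
    total = sum(allele_counts)
    freqs = [count / total for count in allele_counts]
    # Sum of squared allele frequencies.
    sum_sq = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1 - sum_sq - cross


# Two equally frequent alleles: 1 - 0.5 - 0.125
print(pic([12, 12]))  # 0.375
```

A marker-set average such as the one quoted above would then be the mean of `pic` over the individual markers.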


Subjects
Cicer/drug effects , Cicer/genetics , Droughts , Expressed Sequence Tags , Salinity , Physiological Stress/genetics , Cicer/metabolism , Gene Expression Profiling , Plant Gene Expression Regulation/drug effects , Genetic Markers/genetics , Genotype , Plant Roots/drug effects , Plant Roots/genetics , Plant Roots/metabolism , Plant Roots/physiology , Single Nucleotide Polymorphism/drug effects , Nucleic Acid Repetitive Sequences/drug effects , Sodium Chloride/pharmacology , Physiological Stress/drug effects
20.
BMC Plant Biol ; 9: 50, 2009 May 08.
Article in English | MEDLINE | ID: mdl-19426481

ABSTRACT

BACKGROUND: The Brassica species include an important group of crops and provide opportunities for studying the evolutionary consequences of polyploidy. They are related to Arabidopsis thaliana, for which the first complete plant genome sequence was obtained and their genomes show extensive, although imperfect, conserved synteny with that of A. thaliana. A large number of EST sequences, derived from a range of different Brassica species, are available in the public database, but no public microarray resource has so far been developed for these species. RESULTS: We assembled unigenes using approximately 800,000 EST sequences, mainly from three species: B. napus, B. rapa and B. oleracea. The assembly was conducted with the aim of co-assembling ESTs of orthologous genes (including homoeologous pairs of genes in B. napus from each of the A and C genomes), but resolving assemblies of paralogous, or paleo-homoeologous, genes (i.e. the genes related by the ancestral genome triplication observed in diploid Brassica species). 90,864 unique sequence assemblies were developed. These were incorporated into the BAC sequence annotation for the Brassica rapa Genome Sequencing Project, enabling the identification of cognate genomic sequences for a proportion of them. A 60-mer oligo microarray comprising 94,558 probes was developed using the unigene sequences. Gene expression was analysed in reciprocal resynthesised B. napus lines and the B. oleracea and B. rapa lines used to produce them. The analysis showed that significant expression could consistently be detected in leaf tissue for 35,386 unigenes. Expression was detected across all four genotypes for 27,355 unigenes, genome-specific expression patterns were observed for 7,851 unigenes and 180 unigenes displayed other classes of expression pattern. Principal component analysis (PCA) clearly resolved the individual microarray datasets for B. rapa, B. oleracea and resynthesised B. napus. 
Quantitative differences in expression were observed between the resynthesised B. napus lines for 98 unigenes, most of which could be classified into non-additive expression patterns, including 17 that showed cytoplasm-specific patterns. We further characterized the unigenes for which A genome-specific expression was observed and cognate genomic sequences could be identified. Ten of these unigenes were found to be Brassica-specific sequences, including two that originate from complex loci comprising gene clusters. CONCLUSION: We succeeded in developing a Brassica community microarray resource. Although expression can be measured for the majority of unigenes across species, there were numerous probes that reported in a genome-specific manner. We anticipate that some proportion of these will represent species-specific transcripts and the remainder will be the consequence of variation of sequences within the regions represented by the array probes. Our studies demonstrated that the datasets obtained from the arrays can be used for typical analyses, including PCA and the analysis of differential expression. We have also demonstrated that Brassica-specific transcripts identified in silico in the sequence assembly of public EST database accessions are indeed reported by the array. These would not be detectable using arrays designed using A. thaliana sequences.


Subjects
Brassica/genetics , Expressed Sequence Tags , Gene Expression Profiling , Plant Genome , Genetic Databases , Plant Gene Expression Regulation , Plant Genes , Genotype , Oligonucleotide Array Sequence Analysis , Principal Component Analysis , Plant RNA/genetics , DNA Sequence Analysis , Species Specificity