Results 1 - 20 of 115
1.
Proc Natl Acad Sci U S A ; 120(4): e2216822120, 2023 01 24.
Article in English | MEDLINE | ID: mdl-36652483

ABSTRACT

Clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins (CRISPR-Cas) systems have been developed as important tools for plant genome engineering. Here, we demonstrate that the hypercompact CasΦ nuclease is able to generate stably inherited gene edits in Arabidopsis, and that CasΦ guide RNAs can be expressed with either the Pol-III U6 promoter or a Pol-II promoter together with ribozyme-mediated RNA processing. Using the Arabidopsis fwa epiallele, we show that CasΦ displays higher editing efficiency when the target locus is not DNA methylated, suggesting that CasΦ is sensitive to chromatin environment. Importantly, two CasΦ protein variants, vCasΦ and nCasΦ, both showed much higher editing efficiency relative to the wild-type (WT) CasΦ enzyme. Consistently, vCasΦ and nCasΦ yielded offspring plants with inherited edits at much higher rates compared to WT CasΦ. Extensive genomic analysis of gene-edited plants showed no off-target editing, suggesting that CasΦ is highly specific. The hypercompact size, T-rich minimal protospacer adjacent motif (PAM), and wide range of working temperatures make CasΦ an excellent supplement to existing plant genome editing systems.


Subjects
Arabidopsis Proteins, Arabidopsis, Gene Editing, Arabidopsis/genetics, CRISPR-Cas Systems, Plants/genetics, Plant Genome/genetics, Transcription Factors/genetics, Homeodomain Proteins/genetics, Arabidopsis Proteins/genetics
2.
Plant J ; 115(1): 52-67, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36965091

ABSTRACT

In contrast to their conserved mammalian counterparts, plant long interspersed nuclear elements (LINEs) are highly variable, splitting into many low-copy families. Curiously, LINE families from the retrotransposable element (RTE) clade retain stronger sequence conservation and hence reach higher copy numbers. The cause of this RTE-typical property is not yet understood, but would help clarify why some transposable elements are removed quickly, whereas others persist in plant genomes. Here, we bring forward a detailed study of RTE LINE structure, diversity and evolution in plants. For this, we argue that the nightshade family is the ideal taxon to follow the evolutionary trajectories of RTE LINEs, given their high abundance, recent activity and partnership with non-autonomous elements. Using bioinformatic, cytogenetic and molecular approaches, we detect 4029 full-length RTE LINEs across the Solanaceae. We finely characterize and manually curate a core group of 458 full-length LINEs in allotetraploid tobacco, show an integration event after polyploidization and trace hybridization by RTE LINE composition of parental genomes. Finally, we reveal the role of the untranslated regions (UTRs) as causes for the unique RTE LINE amplification and evolution pattern in plants. On the one hand, we detected a highly conserved motif at the 3' UTR, suggesting strong selective constraints acting on the RTE terminus. On the other hand, we observed successive rounds of 5' UTR cycling, constantly rejuvenating the promoter sequences. This interplay between exchangeable promoters and conserved LINE bodies and 3' UTR likely allows RTE LINEs to persist and thrive in plant genomes.


Subjects
Nicotiana, Retroelements, Animals, Retroelements/genetics, Nicotiana/genetics, 3' Untranslated Regions, Plant Genome/genetics, Plants, Terminal Repeat Sequences/genetics, Molecular Evolution, Phylogeny, Mammals
3.
BMC Genomics ; 25(1): 515, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38796435

ABSTRACT

BACKGROUND: The short-read whole-genome sequencing (WGS) approach has been widely applied to investigate the genomic variation in the natural populations of many plant species. With the rapid advancements in long-read sequencing and genome assembly technologies, high-quality genome sequences are available for a group of varieties for many plant species. These genome sequences are expected to help researchers comprehensively investigate any type of genomic variants that are missed by the WGS technology. However, multiple genome alignment (MGA) tools designed by the human genome research community might be unsuitable for plant genomes. RESULTS: To fill this gap, we developed the AnchorWave-Cactus Multiple Genome Alignment (ACMGA) pipeline, which improved the alignment of repeat elements and could identify long (> 50 bp) deletions or insertions (INDELs). We conducted MGA using ACMGA and Cactus for 8 Arabidopsis (Arabidopsis thaliana) and 26 maize (Zea mays) de novo assembled genome sequences and compared them with the previously published short-read variant calling results. MGA identified more single nucleotide variants (SNVs) and long INDELs than did previously published WGS variant callings. Additionally, ACMGA detected significantly more SNVs and long INDELs in repetitive regions and the whole genome than did Cactus. Compared with the results of Cactus, the results of ACMGA were more similar to the previously published variants called using short reads. These two MGA pipelines identified numerous multi-allelic variants that were missed by the WGS variant calling pipeline. CONCLUSIONS: Aligning de novo assembled genome sequences could identify more SNVs and INDELs than mapping short reads. ACMGA combines the advantages of AnchorWave and Cactus and offers a practical solution for plant MGA by integrating global alignment, a 2-piece-affine-gap cost strategy, and the progressive MGA algorithm.


Subjects
Arabidopsis, Plant Genome, Zea mays, Arabidopsis/genetics, Zea mays/genetics, Sequence Alignment, INDEL Mutation, Genomics/methods, Single Nucleotide Polymorphism, Whole Genome Sequencing/methods, Software
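
Note on entry 3: the 2-piece-affine-gap cost strategy named in the abstract can be illustrated with a minimal sketch. The abstract gives no scoring parameters, so the function name and the penalty values below are assumptions chosen only for illustration and are not the settings used by ACMGA. The idea is that a gap is charged the cheaper of two affine penalties: one with a low opening cost and a high per-base cost (which favours short gaps), and one with a high opening cost and a low per-base cost (which keeps long INDELs affordable).

```python
def two_piece_affine_gap_cost(length, open1=4, extend1=2, open2=24, extend2=1):
    """Return the penalty for a gap of `length` bases.

    All four default penalties are hypothetical illustration values.
    """
    if length <= 0:
        return 0
    # Piece 1: cheap to open, expensive to extend (best for short gaps).
    short_piece = open1 + length * extend1
    # Piece 2: expensive to open, cheap to extend (best for long gaps).
    long_piece = open2 + length * extend2
    # The gap is charged whichever piece is cheaper.
    return min(short_piece, long_piece)


if __name__ == "__main__":
    for gap_length in (1, 20, 50):
        print(gap_length, two_piece_affine_gap_cost(gap_length))
```

With these illustrative values the two pieces cross at a gap length of 20 bases, so longer gaps grow at the lower per-base rate.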
4.
Plant Cell Environ ; 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39136390

ABSTRACT

Heavy and costly use of phosphorus (P) fertiliser is often needed to achieve high crop yields, but only a small amount of applied P fertiliser is available to most crop plants. Hakea prostrata (Proteaceae) is endemic to the P-impoverished landscape of southwest Australia and has several P-saving traits. We identified 16 members of the Phosphate Transporter 1 (PHT1) gene family (HpPHT1;1-HpPHT1;12d) in a long-read genome assembly of H. prostrata. Based on phylogenetics, sequence structure and expression patterns, we classified HpPHT1;1 as potentially involved in Pi uptake from soil and HpPHT1;8 and HpPHT1;9 as potentially involved in Pi uptake and root-to-shoot translocation. Three genes, HpPHT1;4, HpPHT1;6 and HpPHT1;8, lacked regulatory PHR1-binding sites (P1BS) in the promoter regions. Available expression data for HpPHT1;6 and HpPHT1;8 indicated they are not responsive to changes in P supply, potentially contributing to the high P sensitivity of H. prostrata. We also discovered a Proteaceae-specific clade of closely-spaced PHT1 genes that lacked conserved genetic architecture among genera, indicating an evolutionary hot spot within the genome. Overall, the genome assembly of H. prostrata provides a much-needed foundation for understanding the genetic mechanisms of novel adaptations to low P soils in southwest Australian plants.

5.
Proc Natl Acad Sci U S A ; 118(22)2021 06 01.
Article in English | MEDLINE | ID: mdl-34050013

ABSTRACT

Conventional methods of DNA sequence insertion into plants, using Agrobacterium-mediated transformation or microprojectile bombardment, result in the integration of the DNA at random sites in the genome. These plants may exhibit altered agronomic traits as a consequence of disruption or silencing of genes that serve a critical function. Also, genes of interest inserted at random sites are often not expressed at the desired level. For these reasons, targeted DNA insertion at suitable genomic sites in plants is a desirable alternative. In this paper we review approaches of targeted DNA insertion in plant genomes, discuss current technical challenges, and describe promising applications of targeted DNA insertion for crop genetic improvement.


Subjects
Agricultural Crops/genetics, Plant DNA/genetics, Gene Transfer Techniques, Plant Genome, Genetically Modified Plants/genetics, Genetic Transformation, Agrobacterium
6.
New Phytol ; 239(3): 868-874, 2023 08.
Article in English | MEDLINE | ID: mdl-37282668

ABSTRACT

The CRISPR-Cas-based genome editing field in plants is expanding rapidly. Editing plant promoters to obtain cis-regulatory alleles with altered expression levels or patterns of target genes is a highly promising topic. However, the primarily used CRISPR-Cas9 has significant limitations when editing noncoding sequences like promoters, which have unique structures and regulatory mechanisms, including A-T richness, repetitive redundancy, difficulty in identifying key regulatory regions, and a higher frequency of DNA structure, epigenetic modification, and protein binding accessibility issues. Researchers urgently require efficient and feasible editing tools and strategies to address these obstacles, enhance promoter editing efficiency, increase diversity in promoter polymorphism, and, most importantly, enable 'non-silent' editing events that achieve precise target gene expression regulation. This article provides insights into the key challenges and references for implementing promoter editing-based research in plants.


Subjects
CRISPR-Cas Systems, Gene Editing, CRISPR-Cas Systems/genetics, Plants/genetics, Genetic Promoter Regions/genetics, Nucleic Acid Regulatory Sequences, Plant Genome
7.
J Exp Bot ; 74(10): 2944-2955, 2023 05 19.
Article in English | MEDLINE | ID: mdl-36882965

ABSTRACT

The angiosperm genus Cuscuta lives as an almost achlorophyllous root- and leafless holoparasite and has therefore occupied scientists for more than a century. The 'evolution' of Cuscuta research started with early studies that established the phylogenetic framework for this unusual genus. It continued to produce groundbreaking cytological, morphological, and physiological insight throughout the second half of the 20th century and culminated in the last two decades in exciting discoveries regarding the molecular basis of Cuscuta parasitism that were facilitated by the modern 'omics' tools and traceable fluorescent marker technologies of the 21st century. This review will show how present activities are inspired by those past breakthroughs. It will describe significant milestones and recurring themes of Cuscuta research and connect these to the remaining as well as newly evolving questions and future directions in this research field that is expected to sustain its strong growth in the future.


Subjects
Cuscuta, Phylogeny
8.
Genomics ; 114(3): 110384, 2022 05.
Article in English | MEDLINE | ID: mdl-35533969

ABSTRACT

A promoter is a short DNA sequence near the start codon, responsible for initiating the transcription of a specific gene in the genome. The accurate recognition of promoters is important for achieving a better understanding of transcriptional regulation. Because of their importance in the process of biological transcriptional regulation, there is an urgent need to develop in silico tools to identify promoters and their types in a timely and accurate manner. A number of prediction methods have been developed in this regard; however, almost all of them are merely used for identifying promoters and their strength or sigma types. The TATA box region in TATA promoters influences the post-transcriptional processes; therefore, in the current study, we developed a two-layer predictor called "iProm-Zea" using a convolutional neural network (CNN) to identify TATA and TATA-less promoters. The first layer can be used to identify a given DNA sequence as a promoter or non-promoter. The second layer can be used to identify whether the recognized promoter is a TATA promoter. To find an optimal feature encoding scheme and model, we employed four feature encoding schemes on different machine learning and CNN algorithms, and based on the evaluation results, we selected a one-hot encoding scheme and a CNN model for iProm-Zea. The 5-fold cross-validation testing results demonstrated that the constructed predictor showed great potential for identifying promoters and classifying them as TATA and TATA-less promoters. Furthermore, we performed cross-species analysis of iProm-Zea to evaluate its performance in other species. Moreover, to make it easier for other experimental scientists to obtain the results they need, we established a freely accessible and user-friendly web server at http://nsclbio.jbnu.ac.kr/tools/iProm-Zea/.


Subjects
Computer Neural Networks, Zea mays, Zea mays/genetics, Genetic Promoter Regions, Base Sequence, Algorithms, TATA Box
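
Note on entry 8: the one-hot encoding scheme mentioned in the abstract can be sketched as follows. This is a generic illustration of one-hot encoding for DNA, not the iProm-Zea implementation. The column order A, C, G, T, the all-zero row for ambiguous bases such as N, and the function name are assumptions made for this example.

```python
# Each base maps to a 4-element indicator vector in the (assumed) order A, C, G, T.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}
# Assumption: unknown or ambiguous bases (for example N) become an all-zero row.
UNKNOWN = (0, 0, 0, 0)


def one_hot_encode(sequence):
    """Turn a DNA string into a list of 4-element tuples, one per base."""
    return [ONE_HOT.get(base, UNKNOWN) for base in sequence.upper()]


if __name__ == "__main__":
    for base, row in zip("TATAN", one_hot_encode("TATAN")):
        print(base, row)
```

The resulting L x 4 matrix (L being the sequence length) is the usual input shape for a one-dimensional convolutional layer.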
9.
Int J Mol Sci ; 24(10)2023 May 11.
Article in English | MEDLINE | ID: mdl-37239967

ABSTRACT

Genome editing is an important strategy to maintain global food security and achieve sustainable agricultural development. Among all genome editing tools, CRISPR-Cas is currently the most prevalent and offers the most promise. In this review, we summarize the development of CRISPR-Cas systems, outline their classification and distinctive features, delineate their natural mechanisms in plant genome editing and exemplify the applications in plant research. Both classical and recently discovered CRISPR-Cas systems are included, detailing the class, type, structures and functions of each. We conclude by highlighting the challenges that come with CRISPR-Cas and offer suggestions on how to tackle them. We believe the gene editing toolbox will be greatly enriched, providing new avenues for a more efficient and precise breeding of climate-resilient crops.


Subjects
Gene Editing, Plant Breeding, CRISPR-Cas Systems/genetics, Plant Genome, Agricultural Crops/genetics
10.
Plant Biotechnol J ; 20(6): 1031-1041, 2022 06.
Article in English | MEDLINE | ID: mdl-35332665

ABSTRACT

Genome phasing is a recently developed assembly method that separates heterozygous eukaryotic genomic regions and builds haplotype-resolved assemblies. Because differences between haplotypes are ignored in most published de novo genomes, assemblies are available as consensus genomes consisting of haplotype mixtures, thus increasing the need for genome phasing. Here, we review the operating principles and characteristics of several freely available and widely used phasing tools (TrioCanu, FALCON-Phase, and ALLHiC). An examination of downstream analyses using haplotype-resolved genome assemblies in plants indicated significant differences among haplotypes regarding chromosomal rearrangements, sequence insertions, and expression of specific alleles that contribute to the acquisition of the biological characteristics of plant species. Finally, we suggest directions for addressing the limitations of current genome-phasing methods. This review provides insights into the current progress, limitations, and future directions of de novo genome phasing, which will enable researchers to easily access and utilize genome phasing in studies involving highly heterozygous complex plant genomes.


Subjects
Plant Genome, Genomics, Alleles, Plant Genome/genetics, Haplotypes/genetics, Plants/genetics, DNA Sequence Analysis/methods
11.
Plant Cell Rep ; 41(4): 1163-1166, 2022 Apr.
Article in English | MEDLINE | ID: mdl-34977976

ABSTRACT

KEY MESSAGE: We re-annotated repeats of 459 plant genomes and released a new database: PlantRep (http://www.plantrep.cn/). PlantRep sheds light on repeat evolution and provides fundamental data for deep exploration of genomes.


Subjects
DNA Transposable Elements, Plant Genome, Molecular Evolution, Plant Genome/genetics, Nucleic Acid Repetitive Sequences/genetics
12.
Int J Mol Sci ; 23(22)2022 Nov 17.
Article in English | MEDLINE | ID: mdl-36430746

ABSTRACT

The nucleotide-binding and leucine-rich repeat (NB-LRR) genes, also known as resistance (R)-genes, play an important role in the activation of immune responses. In recent years, large-scale studies have been performed to highlight the diversification of plant NB-LRR repertoires. It is well known that, to provide new functionalities, NB-LRR sequences are subject to duplication, domain fusions and acquisition and other kinds of mutations. Although some mechanisms that govern NB-LRR protein domain adaptations have been uncovered, to retrace the plant-lineage-specific evolution routes of R protein structure, a multi-genome comparative analysis was performed. This study allowed us to define groups of genes sharing homology relationships across different species. It is worth noting that the most populated groups contained well-characterized R proteins. The arsenal profile of such groups was investigated in five botanical families, including important crop species, to underline specific adaptation signatures. In addition, the dissection of 70 NB domains of well-characterized R-genes revealed the NB core motifs from which the three main R protein classes have been diversified. The structural remodeling of domain segments shaped the specific NB-LRR repertoires observed in each plant species. This analysis provided new evolutionary and functional insights on NB protein domain shuffling. Taken together, such findings improved our understanding of the molecular adaptive selection mechanisms occurring at plant R loci.


Subjects
Plant Proteins, Plants, Humans, Protein Domains, Plant Proteins/metabolism, Plants/metabolism, Acclimatization
13.
Plant J ; 97(1): 182-198, 2019 01.
Article in English | MEDLINE | ID: mdl-30500991

ABSTRACT

Recent advances in genomics technologies have greatly accelerated the progress in both fundamental plant science and applied breeding research. Concurrently, high-throughput plant phenotyping is becoming widely adopted in the plant community, promising to alleviate the phenotypic bottleneck. While these technological breakthroughs are significantly accelerating quantitative trait locus (QTL) and causal gene identification, challenges to enable even more sophisticated analyses remain. In particular, care needs to be taken to standardize, describe and conduct experiments robustly while relying on plant physiology expertise. In this article, we review the state of the art regarding genome assembly and the future potential of pangenomics in plant research. We also describe the necessity of standardizing and describing phenotypic studies using the Minimum Information About a Plant Phenotyping Experiment (MIAPPE) standard to enable the reuse and integration of phenotypic data. In addition, we show how deep phenotypic data might yield novel trait-trait correlations and review how to link phenotypic data to genomic data. Finally, we provide perspectives on the golden future of machine learning and its potential in linking phenotypes to genomic features.


Subjects
Genetic Association Studies, Plant Genome/genetics, Genomics, Machine Learning, Phenomics, Plants/genetics, Phenotype, Quantitative Trait Loci/genetics
14.
BMC Genomics ; 21(1): 237, 2020 Mar 17.
Article in English | MEDLINE | ID: mdl-32183698

ABSTRACT

BACKGROUND: Plant genomes are rich in repetitive sequences, and transposable elements (TEs) are the most accumulated of them. This mobile fraction can be distinguished as Class I (retrotransposons) and Class II (transposons). Retrotransposons that are transposed using an intermediate RNA and that accumulate in a "copy-and-paste" manner were screened in three genomes of peppers (Solanaceae). The present study aimed to understand the genome relationships among Capsicum annuum, C. chinense, and C. baccatum, based on a comparative analysis of the function, diversity and chromosome distribution of TE lineages in the Capsicum karyotypes. Due to the great commercial importance of pepper in natura, as a spice or as an ornamental plant, these genomes have been widely sequenced, and all of the assemblies are available in the SolGenomics group. These sequences were used to compare all repetitive fractions from a cytogenomic point of view. RESULTS: The qualification and quantification of LTR-retrotransposons (LTR-RT) families were contrasted with molecular cytogenetic data, and the results showed a strong genome similarity between C. annuum and C. chinense as compared to C. baccatum. The Gypsy superfamily is more abundant than Copia, especially for Tekay/Del lineage members, including a high representation in C. annuum and C. chinense. On the other hand, C. baccatum accumulates more Athila/Tat sequences. The FISH results showed retrotransposons differentially scattered along chromosomes, except for CRM lineage sequences, which mainly have a proximal accumulation associated with heterochromatin bands. CONCLUSIONS: The results confirm a close genomic relationship between C. annuum and C. chinense in comparison to C. baccatum. Centromeric GC-rich bands may be associated with the accumulation regions of CRM elements, whereas terminal and subterminal AT- and GC-rich bands do not correspond to the accumulation of the retrotransposons in the three Capsicum species tested.


Subjects
Capsicum/classification, Capsicum/genetics, Genetic Variation, Plant Genome, Terminal Repeat Sequences, Plant Chromosomes/genetics, Genomics, Phylogeny, Nucleic Acid Repetitive Sequences, Retroelements
15.
Plant Cell Physiol ; 61(11): 1946-1953, 2020 Dec 23.
Article in English | MEDLINE | ID: mdl-32991731

ABSTRACT

Genome editing technology is important for plant science and crop breeding. Genome-edited plants prepared using general CRISPR-Cas9 methods usually contain foreign DNA, which is problematic for the production of genome-edited transgene-free plants for vegetative propagation or highly heterozygous hybrid cultivars. Here, we describe a method for highly efficient targeted mutagenesis in Nicotiana benthamiana through the expression of Cas9 and single-guide (sg)RNA using a potato virus X (PVX) vector. Following Agrobacterium-mediated introduction of virus vector cDNA, >60% of shoots regenerated without antibiotic selection carried targeted mutations, while ≤18% of shoots contained T-DNA. The PVX vector was also used to express a base editor consisting of modified Cas9 fused with cytidine deaminase to introduce targeted nucleotide substitution in regenerated shoots. We also report exogenous DNA-free genome editing by mechanical inoculation of virions comprising the PVX vector expressing Cas9. This simple and efficient virus vector-mediated delivery of CRISPR-Cas9 could facilitate transgene-free gene editing in plants.


Subjects
Gene Editing/methods, Nicotiana/genetics, Potexvirus/genetics, CRISPR-Associated Protein 9, CRISPR-Cas Systems, Genetic Vectors/genetics, Plant Genome/genetics, Site-Directed Mutagenesis/methods, Potexvirus/metabolism
16.
BMC Plant Biol ; 20(1): 234, 2020 May 25.
Article in English | MEDLINE | ID: mdl-32450802

ABSTRACT

Traditionally, generation of new plants with improved or desirable features has relied on laborious and time-consuming breeding techniques. Genome-editing technologies have led to a new era of genome engineering, enabling an effective, precise, and rapid engineering of the plant genomes. Clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (CRISPR/Cas9) has emerged as a new genome-editing tool, extensively applied in various organisms, including plants. The use of CRISPR/Cas9 allows generating transgene-free genome-edited plants ("null segregants") in a short period of time. In this review, we provide a critical overview of the recent advances in CRISPR/Cas9 derived technologies for inducing mutations at target sites in the genome and controlling the expression of target genes. We highlight the major breakthroughs in applying CRISPR/Cas9 to plant engineering, and challenges toward the production of null segregants. We also provide an update on the efforts of engineering Cas9 proteins, newly discovered Cas9 variants, and novel CRISPR/Cas systems for use in plants. The application of CRISPR/Cas9 and related technologies in plant engineering will not only facilitate molecular breeding of crop plants but also accelerate progress in basic research.


Subjects
CRISPR-Cas Systems/genetics, Gene Editing/methods, Plant Genome/genetics, Genetically Modified Plants/genetics
17.
Mol Biol Rep ; 47(3): 2315-2325, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31950325

ABSTRACT

Arabinogalactan Proteins (AGPs) are hydroxyproline-rich proteins containing a high proportion of carbohydrates, widely spread in the plant kingdom. AGPs have been suggested to play important roles in plant development processes, especially in sexual plant reproduction. Nevertheless, the functions of a large number of these molecules remain to be discovered. In this review, we discuss two revolutionary genetic techniques that are able to decode the roles of these glycoproteins in an easy and efficient way. RNA interference is a frequently used technique in plant biology that promotes gene silencing. The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/CRISPR-associated protein 9 (CRISPR/Cas9) system emerged a few years ago as a revolutionary genome-editing technique that has allowed null mutants to be obtained in a wide variety of organisms, including plants. The two techniques differ in several respects and, depending on the research objective, these differences may work as an advantage or a disadvantage. In the present work, we propose the use of the two techniques to obtain AGP mutants easily and quickly, helping to unravel the role of AGPs, surely a great asset for the future.


Subjects
CRISPR-Cas Systems, Gene Editing, Gene Expression Regulation, Mucoproteins/genetics, RNA Interference, Animals, Gene Silencing, Gene Targeting, Humans, Mucoproteins/metabolism, Plant Proteins/genetics, Plant Proteins/metabolism, Small Interfering RNA/genetics, Research
18.
Biosci Biotechnol Biochem ; 84(12): 2405-2414, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32856548

ABSTRACT

To evaluate crops generated by new breeding techniques, it is important to confirm the removal of recombinant DNAs (rDNAs) derived from foreign genes, including unintentionally introduced short rDNA(s). We attempted to develop a sensitive detection method for such short rDNAs using Southern blot analysis and performed a model study targeting single-copy endogenous genes in plants. To increase the detection sensitivity, the general protocol for Southern blot analysis was modified. In the model study, we used endogenous-gene-targeting probes in which complementary sequences were serially replaced by dummy sequences, and detected complementary sequences as short as 30 bp. We further evaluated the sensitivity using short rDNAs derived from GM sequences as pseudoinsertions, and the results demonstrated that rDNA insertions as small as 30 bp could be detected. The results suggested that unintentionally introduced rDNA insertions of 30 bp or more in length could be detected by the Southern blot analysis.


Subjects
Southern Blotting/methods, Plant DNA/genetics, Nucleic Acid Hybridization, Nucleic Acid Repetitive Sequences
19.
BMC Bioinformatics ; 20(Suppl 15): 482, 2019 Dec 24.
Article in English | MEDLINE | ID: mdl-31874598

ABSTRACT

BACKGROUND: Gene prediction is a key step in genome annotation. Ab initio gene prediction enables gene annotation of new genomes regardless of availability of homologous sequences. There exist a number of ab initio gene prediction tools and they have been widely used for gene annotation for various species. However, existing tools are not optimized for identifying genes with highly variable GC content. In addition, some genes in grass genomes exhibit a sharp 5'-3' decreasing GC content gradient, which is not carefully modeled by available gene prediction tools. Thus, there is still room to improve the sensitivity and accuracy for predicting genes with GC gradients. RESULTS: In this work, we designed and implemented a new hidden Markov model (HMM)-based ab initio gene prediction tool, which is optimized for finding genes with highly variable GC contents, such as the genes with negative GC gradients in grass genomes. We tested the tool on three datasets from Arabidopsis thaliana and Oryza sativa. The results showed that our tool can identify genes missed by existing tools due to the highly variable GC contents. CONCLUSIONS: GPRED-GC can effectively predict genes with highly variable GC contents without manual intervention. It provides a useful complementary tool to existing ones such as Augustus for more sensitive gene discovery. The source code is freely available at https://sourceforge.net/projects/gpred-gc/.


Subjects
Base Composition, Genome, Genomics, Molecular Sequence Annotation, Software
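
Note on entry 19: the 5'-3' decreasing GC content gradient described in the abstract can be made concrete with a small sketch that measures GC content in consecutive windows along a sequence. This is not part of GPRED-GC. The function names, the window and step sizes, and the toy sequence are all assumptions made for illustration.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc_profile(seq, window, step):
    """GC content of each full-length window, read from the 5' end to the 3' end."""
    return [
        gc_content(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence (hypothetical): GC-rich at the 5' end, AT-rich at the 3' end.
    toy_gene = "GGGGCCCCGCATATATAT"
    for index, value in enumerate(gc_profile(toy_gene, window=6, step=6)):
        print(f"window {index}: GC = {value:.2f}")
```

A profile whose values fall steadily from the first window to the last corresponds to the negative gradient the abstract refers to.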
20.
BMC Genomics ; 20(1): 532, 2019 Jun 28.
Article in English | MEDLINE | ID: mdl-31253093

ABSTRACT

BACKGROUND: MicroRNAs (miRNAs) play crucial roles in post-transcriptional regulation of eukaryotic gene expression and are involved in many aspects of plant development. Although several prediction tools are available for metazoan genomes, the number of tools dedicated to plants is relatively limited. RESULTS: Here, we present miRkwood, a user-friendly tool for the identification of miRNAs in plant genomes using small RNA sequencing data. Deep-sequencing data of Argonaute-associated small RNAs showed that miRkwood is able to identify a large diversity of plant miRNAs and limits false positive predictions. Moreover, it outperforms current tools such as ShortStack and, unlike ShortStack, miRkwood provides a quality score allowing users to rank miRNA predictions. CONCLUSION: miRkwood is a very efficient tool for the annotation of miRNAs in plant genomes. It is available as a web server, as a standalone version, as a Docker image and as a Galaxy tool: http://bioinfo.cristal.univ-lille.fr/mirkwood.


Subjects
Genomics/methods, MicroRNAs/genetics, Software, Base Sequence, Plant Genome/genetics, Inverted Repeat Sequences, Thermodynamics