Results 1 - 20 of 28
1.
Cell ; 186(11): 2313-2328.e15, 2023 05 25.
Article in English | MEDLINE | ID: mdl-37146612

ABSTRACT

Hybrid potato breeding will transform the crop from a clonally propagated tetraploid to a seed-reproducing diploid. Historical accumulation of deleterious mutations in potato genomes has hindered the development of elite inbred lines and hybrids. Utilizing a whole-genome phylogeny of 92 Solanaceae and its sister clade species, we employ an evolutionary strategy to identify deleterious mutations. The deep phylogeny reveals the genome-wide landscape of highly constrained sites, comprising ∼2.4% of the genome. Based on a diploid potato diversity panel, we infer 367,499 deleterious variants, of which 50% occur at non-coding and 15% at synonymous sites. Counterintuitively, diploid lines with relatively high homozygous deleterious burden can be better starting material for inbred-line development, despite showing less vigorous growth. Inclusion of inferred deleterious mutations increases genomic-prediction accuracy for yield by 24.7%. Our study generates insights into the genome-wide incidence and properties of deleterious mutations and their far-reaching consequences for breeding.


Subjects
Plant Breeding , Solanum tuberosum , Diploidy , Mutation , Phylogeny , Solanum tuberosum/genetics
2.
Proc Natl Acad Sci U S A ; 121(26): e2319811121, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38889146

ABSTRACT

Rational design of plant cis-regulatory DNA sequences without expert intervention or prior domain knowledge is still a daunting task. Here, we developed PhytoExpr, a deep learning framework capable of predicting both mRNA abundance and plant species using the proximal regulatory sequence as the sole input. PhytoExpr was trained over 17 species representative of major clades of the plant kingdom to enhance its generalizability. Via input perturbation, quantitative functional annotation of the input sequence was achieved at single-nucleotide resolution, revealing an abundance of predicted high-impact nucleotides in conserved noncoding sequences and transcription factor binding sites. Evaluation of maize HapMap3 single-nucleotide polymorphisms (SNPs) by PhytoExpr demonstrates an enrichment of predicted high-impact SNPs in cis-eQTL. Additionally, we provided two algorithms that harnessed the power of PhytoExpr in designing functional cis-regulatory variants, and de novo creation of species-specific cis-regulatory sequences through in silico evolution of random DNA sequences. Our model represents a general and robust approach for functional variant discovery in population genetics and rational design of regulatory sequences for genome editing and synthetic biology.


Subjects
Polymorphism, Single Nucleotide , Regulatory Sequences, Nucleic Acid , Zea mays , Regulatory Sequences, Nucleic Acid/genetics , Zea mays/genetics , Quantitative Trait Loci , Algorithms , Gene Expression Regulation, Plant , Deep Learning , Plants/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Models, Genetic , Genes, Plant , Binding Sites/genetics
3.
PLoS Biol ; 21(7): e3002191, 2023 07.
Article in English | MEDLINE | ID: mdl-37463141

ABSTRACT

We study natural DNA polymorphisms and associated phenotypes in the Arabidopsis relative Cardamine hirsuta. We observed strong genetic differentiation among several ancestry groups and broader distribution of Iberian relict strains in European C. hirsuta compared to Arabidopsis. We found synchronization between vegetative and reproductive development and a pervasive role for heterochronic pathways in shaping C. hirsuta natural variation. A single, fast-cycling ChFRIGIDA allele evolved adaptively allowing range expansion from glacial refugia, unlike Arabidopsis where multiple FRIGIDA haplotypes were involved. The Azores islands, where Arabidopsis is scarce, are a hotspot for C. hirsuta diversity. We identified a quantitative trait locus (QTL) in the heterochronic SPL9 transcription factor as a determinant of an Azorean morphotype. This QTL shows evidence for positive selection, and its distribution mirrors a climate gradient that broadly shaped the Azorean flora. Overall, we establish a framework to explore how the interplay of adaptation, demography, and development shaped diversity patterns of 2 related plant species.


Subjects
Arabidopsis , Cardamine , Arabidopsis/genetics , Cardamine/genetics , Genotype , Phenotype , Demography
4.
PLoS Genet ; 19(12): e1011086, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38134220

ABSTRACT

Structural differences between genomes are a major source of genetic variation that contributes to phenotypic differences. Transposable elements, mobile genetic sequences capable of increasing their copy number and propagating themselves within genomes, can generate structural variation. However, their repetitive nature makes it difficult to characterize fine-scale differences in their presence at specific positions, limiting our understanding of their impact on genome variation. Domesticated maize is a particularly good system for exploring the impact of transposable element proliferation as over 70% of the genome is annotated as transposable elements. High-quality transposable element annotations were recently generated for de novo genome assemblies of 26 diverse inbred maize lines. We generated base-pair resolved pairwise alignments between the B73 maize reference genome and the remaining 25 inbred maize line assemblies. From these data, we classified transposable elements as either shared or polymorphic in a given pairwise comparison. Our analysis uncovered substantial structural variation between lines, representing both simple and complex connections between TEs and structural variants. Putative insertions in SNP-depleted regions, which represent recently diverged identity-by-state blocks, suggest some TE families may still be active. However, our analysis reveals that within these recently diverged genomic regions, deletions of transposable elements likely account for more structural variation events and base pairs than insertions. These deletions are often large structural variants containing multiple transposable elements. Combined, our results highlight how transposable elements contribute to structural variation and demonstrate that deletion events are a major contributor to genomic differences.


Subjects
DNA Transposable Elements , Zea mays , Humans , DNA Transposable Elements/genetics , Zea mays/genetics , Genomics
5.
Proc Natl Acad Sci U S A ; 119(1), 2022 01 04.
Article in English | MEDLINE | ID: mdl-34934012

ABSTRACT

Millions of species are currently being sequenced, and their genomes are being compared. Many of them have more complex genomes than model systems and raise novel challenges for genome alignment. Widely used local alignment strategies often produce limited or incongruous results when applied to genomes with dispersed repeats, long indels, and highly diverse sequences. Moreover, alignment using many-to-many or reciprocal best hit approaches conflicts with well-studied patterns between species with different rounds of whole-genome duplication. Here, we introduce Anchored Wavefront alignment (AnchorWave), which performs whole-genome duplication-informed collinear anchor identification between genomes and performs base pair-resolved global alignment for collinear blocks using a two-piece affine gap cost strategy. This strategy enables AnchorWave to precisely identify multikilobase indels generated by transposable element (TE) presence/absence variants (PAVs). When aligning two maize genomes, AnchorWave successfully recalled 87% of previously reported TE PAVs. By contrast, other genome alignment tools showed low power for TE PAV recall. AnchorWave precisely aligns up to three times more of the genome as position matches or indels than the closest competitive approach when comparing diverse genomes. Moreover, AnchorWave recalls transcription factor-binding sites at a rate of 1.05- to 74.85-fold higher than other tools with significantly lower false-positive alignments. AnchorWave complements available genome alignment tools by showing obvious improvement when applied to genomes with dispersed repeats, active TEs, high sequence diversity, and whole-genome duplication variation.


Subjects
Genome, Plant , Polymorphism, Genetic , Sequence Alignment , Software , Zea mays/genetics
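
The two-piece affine gap cost mentioned in the AnchorWave abstract can be illustrated with a short sketch. Under this model a gap is scored with whichever of two affine functions is cheaper, so short gaps pay a low opening penalty while very long gaps (such as TE presence/absence variants) pay a low per-base extension penalty. The function name and all penalty values below are illustrative assumptions, not AnchorWave's actual parameters or code.

```python
def two_piece_gap_cost(length, open1=4, ext1=2, open2=24, ext2=1):
    """Cost of a gap of `length` bases under a two-piece affine model.

    Piece 1 (open1, ext1) is cheap to open but expensive to extend;
    piece 2 (open2, ext2) is expensive to open but cheap to extend.
    The default penalties are hypothetical values chosen for illustration.
    """
    if length <= 0:
        return 0
    short_piece = open1 + ext1 * length   # favoured for short gaps
    long_piece = open2 + ext2 * length    # favoured for long gaps
    return min(short_piece, long_piece)


if __name__ == "__main__":
    for gap_length in (1, 10, 20, 1000):
        print(gap_length, two_piece_gap_cost(gap_length))
```

With these assumed penalties the two pieces cross at a gap length of 20 bases; beyond that point each extra gap base costs 1 instead of 2, which is what keeps multikilobase indels from being prohibitively expensive.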
6.
BMC Genomics ; 25(1): 515, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38796435

ABSTRACT

BACKGROUND: The short-read whole-genome sequencing (WGS) approach has been widely applied to investigate the genomic variation in the natural populations of many plant species. With the rapid advancements in long-read sequencing and genome assembly technologies, high-quality genome sequences are available for a group of varieties for many plant species. These genome sequences are expected to help researchers comprehensively investigate any type of genomic variant that is missed by the WGS technology. However, multiple genome alignment (MGA) tools designed by the human genome research community might be unsuitable for plant genomes. RESULTS: To fill this gap, we developed the AnchorWave-Cactus Multiple Genome Alignment (ACMGA) pipeline, which improved the alignment of repeat elements and could identify long (>50 bp) deletions or insertions (INDELs). We conducted MGA using ACMGA and Cactus for 8 Arabidopsis (Arabidopsis thaliana) and 26 maize (Zea mays) de novo assembled genome sequences and compared them with the previously published short-read variant calling results. MGA identified more single nucleotide variants (SNVs) and long INDELs than did previously published WGS variant callings. Additionally, ACMGA detected significantly more SNVs and long INDELs in repetitive regions and the whole genome than did Cactus. Compared with the results of Cactus, the results of ACMGA were more similar to the previously published variants called using short reads. These two MGA pipelines identified numerous multi-allelic variants that were missed by the WGS variant calling pipeline. CONCLUSIONS: Aligning de novo assembled genome sequences could identify more SNVs and INDELs than mapping short reads. ACMGA combines the advantages of AnchorWave and Cactus and offers a practical solution for plant MGA by integrating global alignment, a 2-piece-affine-gap cost strategy, and the progressive MGA algorithm.


Subjects
Arabidopsis , Genome, Plant , Zea mays , Arabidopsis/genetics , Zea mays/genetics , Sequence Alignment , INDEL Mutation , Genomics/methods , Polymorphism, Single Nucleotide , Whole Genome Sequencing/methods , Software
7.
Genome Res ; 31(7): 1245-1257, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34045362

ABSTRACT

Thousands of species will be sequenced in the next few years; however, understanding how their genomes work, without an unlimited budget, requires both molecular and novel evolutionary approaches. We developed a sensitive sequence alignment pipeline to identify conserved noncoding sequences (CNSs) in the Andropogoneae tribe (multiple crop species descended from a common ancestor ∼18 million years ago). The Andropogoneae share similar physiology while being tremendously genomically diverse, harboring a broad range of ploidy levels, structural variation, and transposons. These contribute to the potential of Andropogoneae as a powerful system for studying CNSs and are factors we leverage to understand the function of maize CNSs. We found that 86% of CNSs were comprised of annotated features, including introns, UTRs, putative cis-regulatory elements, chromatin loop anchors, noncoding RNA (ncRNA) genes, and several transposable element superfamilies. CNSs were enriched in active regions of DNA replication in the early S phase of the mitotic cell cycle and showed different DNA methylation ratios compared to the genome-wide background. More than half of putative cis-regulatory sequences (identified via other methods) overlapped with CNSs detected in this study. Variants in CNSs were associated with gene expression levels, and CNS absence contributed to loss of gene expression. Furthermore, the evolution of CNSs was associated with the functional diversification of duplicated genes in the context of maize subgenomes. Our results provide a quantitative understanding of the molecular processes governing the evolution of CNSs in maize.

8.
New Phytol ; 242(5): 2115-2131, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38358006

ABSTRACT

Drought is one of the major environmental constraints for wheat production world-wide. As the progenitor and genetic reservoir of common wheat, emmer wheat is considered an invaluable gene pool for breeding drought-tolerant wheat. Combining GWAS and eGWAS analysis of 107 accessions, we identified 86 QTLs, 105,462 eQTLs, and 68 eQTL hotspots associated with drought tolerance (DT) in emmer wheat. A complex regulatory network composed of 185 upstream regulators and 2,432 downstream drought-responsive candidates was developed, of which TtOTS1 was found to negatively regulate DT by affecting root development. This study sheds light on the genetic basis underlying DT and provides genes and germplasm resources for breeding elite drought-tolerant wheat.


Subjects
Droughts , Genome-Wide Association Study , Quantitative Trait Loci , Triticum , Triticum/genetics , Triticum/physiology , Quantitative Trait Loci/genetics , Adaptation, Physiological/genetics , Gene Expression Regulation, Plant , Gene Regulatory Networks , Genes, Plant , Polymorphism, Single Nucleotide/genetics , Phenotype , Plant Roots/genetics , Plant Roots/physiology , Drought Resistance
9.
Plant Cell ; 33(6): 1863-1887, 2021 07 19.
Article in English | MEDLINE | ID: mdl-33751107

ABSTRACT

Plants recognize surrounding microbes by sensing microbe-associated molecular patterns (MAMPs) to activate pattern-triggered immunity (PTI). Despite their significance for microbial control, the evolution of PTI responses remains largely uncharacterized. Here, by employing comparative transcriptomics of six Arabidopsis thaliana accessions and three additional Brassicaceae species to investigate PTI responses, we identified a set of genes that commonly respond to the MAMP flg22 and genes that exhibit species-specific expression signatures. Variation in flg22-triggered transcriptome responses across Brassicaceae species was incongruent with their phylogeny, while expression changes were strongly conserved within A. thaliana. We found the enrichment of WRKY transcription factor binding sites in the 5'-regulatory regions of conserved and species-specific responsive genes, linking the emergence of WRKY-binding sites with the evolution of gene expression patterns during PTI. Our findings advance our understanding of the evolution of the transcriptome during biotic stress.


Subjects
Arabidopsis Proteins , Arabidopsis , Brassicaceae , Arabidopsis/metabolism , Arabidopsis Proteins/metabolism , Brassicaceae/genetics , Brassicaceae/metabolism , Gene Expression , Gene Expression Regulation, Plant/genetics , Plant Immunity/genetics
10.
Plant Cell ; 32(5): 1479-1500, 2020 05.
Article in English | MEDLINE | ID: mdl-32132131

ABSTRACT

Several pathways conferring environmental flowering responses in Arabidopsis (Arabidopsis thaliana) converge on developmental processes that mediate the floral transition in the shoot apical meristem. Many characterized mutations disrupt these environmental responses, but downstream developmental processes have been more refractory to mutagenesis. Here, we constructed a quintuple mutant impaired in several environmental pathways and showed that it possesses severely reduced flowering responses to changes in photoperiod and ambient temperature. RNA-sequencing (RNA-seq) analysis of the quintuple mutant showed that the expression of genes encoding gibberellin biosynthesis enzymes and transcription factors involved in the age pathway correlates with flowering. Mutagenesis of the quintuple mutant generated two late-flowering mutants, quintuple ems1 (qem1) and qem2. The mutated genes were identified by isogenic mapping and transgenic complementation. The qem1 mutant is an allele of the gibberellin 20-oxidase gene ga20ox2, confirming the importance of gibberellin for flowering in the absence of environmental responses. By contrast, qem2 is impaired in CHROMATIN REMODELING4 (CHR4), which has not been genetically implicated in floral induction. Using co-immunoprecipitation, RNA-seq, and chromatin immunoprecipitation sequencing, we show that CHR4 interacts with transcription factors involved in floral meristem identity and affects the expression of key floral regulators. Therefore, CHR4 mediates the response to endogenous flowering pathways in the inflorescence meristem to promote floral identity.


Subjects
Arabidopsis Proteins/metabolism , Arabidopsis/genetics , Arabidopsis/physiology , DNA-Binding Proteins/metabolism , Environment , Flowers/genetics , Flowers/physiology , Mutagenesis/genetics , Mutation/genetics , Arabidopsis Proteins/genetics , DNA Helicases , DNA-Binding Proteins/genetics , Gene Expression Regulation, Plant , Genetic Loci , Genome, Plant , Histones/metabolism , Meristem/genetics , Molecular Sequence Annotation , Phenotype , Polymorphism, Single Nucleotide/genetics , Protein Binding , Time Factors
11.
PLoS Genet ; 14(10): e1007699, 2018 10.
Article in English | MEDLINE | ID: mdl-30325920

ABSTRACT

Short insertions, deletions (INDELs) and larger structural variants have been increasingly employed in genetic association studies, but few improvements over SNP-based association have been reported. In order to understand why this might be the case, we analysed two publicly available datasets and observed that 63% of INDELs called in A. thaliana and 64% in D. melanogaster populations are misrepresented as multiple alleles with different functional annotations, i.e. where the same underlying variant is represented by inconsistent alignments leading to different variant calls. To address this issue, we have developed the software Irisas to reclassify and re-annotate these variants, which we then used for single-locus tests of association. We also integrated them to predict the functional impact of SNPs, INDELs, and structural variants for burden testing. Using both approaches, we re-analysed the genetic architecture of complex traits in A. thaliana and D. melanogaster. Heritability analysis using SNPs alone explained on average 27% and 19% of phenotypic variance for A. thaliana and D. melanogaster respectively. Our method explained an additional 11% and 3%, respectively. We also identified novel trait loci that previous SNP-based association studies failed to map, and which contain established candidate genes. Our study shows the value of the association test with INDELs and integrating multiple types of variants in association studies in plants and animals.


Subjects
Genetic Association Studies/methods , INDEL Mutation/genetics , Sequence Analysis, DNA/methods , Animals , Arabidopsis/genetics , Drosophila melanogaster/genetics , Genotype , Phenotype , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Software
12.
Plant Physiol ; 171(4): 2659-70, 2016 08.
Article in English | MEDLINE | ID: mdl-27288362

ABSTRACT

Seed dormancy controls the timing of germination, which regulates the adaptation of plants to their environment and influences agricultural production. The time of germination is under strong natural selection and shows variation within species due to local adaptation. The identification of genes underlying dormancy quantitative trait loci is a major scientific challenge, which is relevant for agricultural and ecological goals. In this study, we describe the identification of the DELAY OF GERMINATION18 (DOG18) quantitative trait locus, which was identified as a factor in natural variation for seed dormancy in Arabidopsis (Arabidopsis thaliana). DOG18 encodes a member of the clade A of the type 2C protein phosphatases family, which we previously identified as the REDUCED DORMANCY5 (RDO5) gene. DOG18/RDO5 shows a relatively high frequency of loss-of-function alleles in natural accessions restricted to northwestern Europe. The loss of dormancy in these loss-of-function alleles can be compensated for by genetic factors like DOG1 and DOG6, and by environmental factors such as low temperature. RDO5 does not have detectable phosphatase activity. Analysis of the phosphoproteome in dry and imbibed seeds revealed a general decrease in protein phosphorylation during seed imbibition that is enhanced in the rdo5 mutant. We conclude that RDO5 acts as a pseudophosphatase that inhibits dephosphorylation during seed imbibition.


Subjects
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Arabidopsis/physiology , Phosphoprotein Phosphatases/genetics , Plant Dormancy/genetics , Polymorphism, Genetic , Alleles , Arabidopsis Proteins/metabolism , Genetic Complementation Test , Geography , Haplotypes/genetics , Mutation/genetics , Phenotype , Phosphoprotein Phosphatases/metabolism , Physical Chromosome Mapping , Temperature
13.
Trends Plant Sci ; 29(3): 355-369, 2024 03.
Article in English | MEDLINE | ID: mdl-37749022

ABSTRACT

Genome alignment is one of the most foundational methods for genome sequence studies. With rapid advances in sequencing and assembly technologies, these newly assembled genomes present challenges for alignment tools to meet the increased complexity and scale. Plant genome alignment is technologically challenging because of frequent whole-genome duplications (WGDs) as well as chromosome rearrangements and fractionation, high nucleotide diversity, widespread structural variation, and high transposable element (TE) activity causing large proportions of repeat elements. We summarize classical pairwise and multiple genome alignment (MGA) methods, and highlight techniques that are widely used or are being developed by the plant research community. We also outline the remaining challenges for precise genome alignment and the interpretation of alignment results in plants.


Subjects
Genome, Plant , Plants , Plants/genetics , Genome, Plant/genetics , DNA Transposable Elements/genetics
14.
Genome Biol ; 24(1): 55, 2023 03 24.
Article in English | MEDLINE | ID: mdl-36964601

ABSTRACT

BACKGROUND: Transcription bridges genetic information and phenotypes. Here, we evaluated how changes in transcriptional regulation enable maize (Zea mays), a crop originally domesticated in the tropics, to adapt to temperate environments. RESULT: We generated 572 unique RNA-seq datasets from the roots of 340 maize genotypes. Genes involved in core processes such as cell division, chromosome organization and cytoskeleton organization showed lower heritability of gene expression, while genes involved in anti-oxidation activity exhibited higher expression heritability. An expression genome-wide association study (eGWAS) identified 19,602 expression quantitative trait loci (eQTLs) associated with the expression of 11,444 genes. A GWAS for alternative splicing identified 49,897 splicing QTLs (sQTLs) for 7614 genes. Genes harboring both cis-eQTLs and cis-sQTLs in linkage disequilibrium were disproportionately likely to encode transcription factors or were annotated as responding to one or more stresses. Independent component analysis of gene expression data identified loci regulating co-expression modules involved in oxidation reduction, response to water deprivation, plastid biogenesis, protein biogenesis, and plant-pathogen interaction. Several genes involved in cell proliferation, flower development, DNA replication, and gene silencing showed lower gene expression variation explained by genetic factors between temperate and tropical maize lines. A GWAS of 27 previously published phenotypes identified several candidate genes overlapping with genomic intervals showing signatures of selection during adaptation to temperate environments. CONCLUSION: Our results illustrate how maize transcriptional regulatory networks enable changes in transcriptional regulation to adapt to temperate regions.


Subjects
Transcriptome , Zea mays , Genome-Wide Association Study , Quantitative Trait Loci , Phenotype , Polymorphism, Single Nucleotide
15.
Nat Commun ; 14(1): 6072, 2023 09 28.
Article in English | MEDLINE | ID: mdl-37770474

ABSTRACT

Leaf rust, caused by Puccinia triticina Eriksson (Pt), is one of the most severe foliar diseases of wheat. Breeding for leaf rust resistance is a practical and sustainable method to control this devastating disease. Here, we report the identification of Lr47, a broadly effective leaf rust resistance gene introgressed into wheat from Aegilops speltoides. Lr47 encodes a coiled-coil nucleotide-binding leucine-rich repeat protein that is both necessary and sufficient to confer Pt resistance, as demonstrated by loss-of-function mutations and transgenic complementation. Lr47 introgression lines with no or reduced linkage drag are generated using the Pairing homoeologous1 mutation, and a diagnostic molecular marker for Lr47 is developed. The coiled-coil domain of the Lr47 protein is unable to induce cell death, nor does it have self-protein interaction. The cloning of Lr47 expands the number of leaf rust resistance genes that can be incorporated into multigene transgenic cassettes to control this devastating disease.


Subjects
Aegilops , Basidiomycota , Aegilops/genetics , Plant Breeding , Triticum/genetics , Basidiomycota/genetics , Cloning, Molecular , Plant Diseases/genetics , Disease Resistance/genetics
16.
Proteins ; 80(7): 1736-43, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22411607

ABSTRACT

Although functionally similar proteins across species have been widely studied, functionally similar proteins within species showing low sequence similarity have not been examined in detail. Identifying these proteins is important for understanding biological functions, the evolution of protein families, the progression of co-evolution, convergent evolution, and other questions that cannot be addressed by detecting functionally similar proteins across species. Here, we explored a graph-theoretic method for detecting functionally similar proteins within species. After representing protein-protein interaction networks as graphs, we split the graphs into subgraphs using the 1-hop method. Proteins with functional similarities in a species were detected using a modified shortest-path method to compare these subgraphs and to find the eligible optimal results. Using seven protein-protein interaction networks and this method, we identified some functionally similar proteins with low sequence similarity that cannot be detected by sequence alignment. By analyzing the results, we found that it is sometimes difficult to separate homologous from convergent evolution. Evaluation of our method by gene ontology term overlap showed that its precision was excellent.


Subjects
Protein Interaction Mapping/methods , Protein Interaction Maps , Proteins/chemistry , Proteins/metabolism , Proteomics/methods , Algorithms , Animals , Databases, Protein , Humans , Mice , Proteins/classification , Sequence Alignment/methods
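
The subgraph-splitting step in the abstract above can be sketched as follows. The abstract does not define the "1-hop method", so this sketch assumes it means taking a protein together with its direct interaction partners and keeping only the edges among them. The function name and the toy network are hypothetical.

```python
def one_hop_subgraph(adjacency, center):
    """Return the subgraph induced by `center` and its direct neighbours.

    `adjacency` maps each protein to an iterable of its interaction partners.
    The result maps each retained protein to a sorted list of retained partners.
    This is an assumed reading of "1-hop", not the paper's published procedure.
    """
    nodes = {center} | set(adjacency.get(center, ()))
    return {
        node: sorted(set(adjacency.get(node, ())) & nodes)
        for node in sorted(nodes)
    }


if __name__ == "__main__":
    # Toy undirected interaction network (hypothetical protein names).
    ppi = {
        "A": ["B", "C"],
        "B": ["A", "C", "D"],
        "C": ["A", "B"],
        "D": ["B"],
    }
    print(one_hop_subgraph(ppi, "A"))
```

Comparing such neighbourhoods between proteins, rather than comparing sequences, is what would allow functionally similar proteins with low sequence similarity to be paired.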
17.
Proteome Sci ; 10(1): 2, 2012 Jan 10.
Article in English | MEDLINE | ID: mdl-22230699

ABSTRACT

BACKGROUND: Studying the large-scale protein-protein interaction (PPI) network is important in understanding biological processes. The current research presents the first PPI map of swine, which aims to give new insights into understanding their biological processes. RESULTS: We used three methods, Interolog-based prediction of porcine PPI network, domain-motif interactions from structural topology-based prediction of porcine PPI network and motif-motif interactions from structural topology-based prediction of porcine PPI network, to predict porcine protein interactions among 25,767 porcine proteins. We predicted 20,213, 331,484, and 218,705 porcine PPIs respectively, merged the three results into 567,441 PPIs, constructed four PPI networks, and analyzed the topological properties of the porcine PPI networks. Our predictions were validated with Pfam domain annotations and GO annotations. Averages of 70, 10,495, and 863 interactions were related to the Pfam domain-interacting pairs in the iPfam database. For comparison, randomized networks were generated, and averages of only 4.24, 66.79, and 44.26 interactions were associated with Pfam domain-interacting pairs in the iPfam database. In GO annotations, we found that 52.68%, 75.54%, and 27.20% of the predicted PPIs shared GO terms, respectively. By contrast, none of the 10,000 randomized networks reached 52.68%, 75.54%, or 27.20% of PPI pairs sharing GO terms. Finally, we determined the accuracy and precision of the methods. The methods yielded accuracies of 0.92, 0.53, and 0.50 at precisions of about 0.93, 0.74, and 0.75, respectively. CONCLUSION: The results reveal that the predicted PPI networks are considerably reliable. The present research is an important pioneering work on protein function research. The porcine PPI data set, the confidence score of each interaction and a list of related data are available at (http://pppid.biositemap.com/).

18.
Plant Genome ; 15(2): e20204, 2022 06.
Article in English | MEDLINE | ID: mdl-35416423

ABSTRACT

Alignments of multiple genomes are a cornerstone of comparative genomics, but generating these alignments remains technically challenging and often impractical. We developed the msa_pipeline workflow (https://bitbucket.org/bucklerlab/msa_pipeline) to allow practical and sensitive multiple alignment of diverged plant genomes and calculation of conservation scores with minimal user inputs. As high repeat content and genomic divergence are substantial challenges in plant genome alignment, we also explored the effect of different masking approaches and parameters of the LAST aligner using genome assemblies of 33 grass species. Compared with conventional masking with RepeatMasker, a masking approach based on k-mers (nucleotide sequences of k length) increased the alignment rate of coding sequence and noncoding functional regions by 25 and 14%, respectively. We further found that default alignment parameters generally perform well, but parameter tuning can increase the alignment rate for noncoding functional regions by over 52% compared with default LAST settings. Finally, by increasing alignment sensitivity from the default baseline, parameter tuning can increase the number of noncoding sites that can be scored for conservation by over 76%. Overall, tuning of masking and alignment parameters can generate optimized multiple alignments to drive biological discovery in plants.


Subjects
Genome, Plant , Genomics , Base Sequence , Workflow
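
The k-mer-based masking idea described in the msa_pipeline abstract can be sketched in a few lines. This is a minimal illustration, assuming that a k-mer counts as repetitive when it occurs more than a chosen number of times and that masked bases are written in lowercase (soft-masking). The function name, the value of k, and the threshold are hypothetical and are not taken from msa_pipeline.

```python
from collections import Counter


def soft_mask_frequent_kmers(sequence, k=12, max_count=10):
    """Lowercase every base covered by a k-mer seen more than `max_count` times.

    Counts here come from the input sequence itself; a real workflow would
    count k-mers across a whole genome. `k` and `max_count` are illustrative.
    """
    sequence = sequence.upper()
    starts = range(len(sequence) - k + 1)
    counts = Counter(sequence[i:i + k] for i in starts)

    masked = [False] * len(sequence)
    for i in starts:
        if counts[sequence[i:i + k]] > max_count:
            for j in range(i, i + k):
                masked[j] = True

    return "".join(
        base.lower() if is_masked else base
        for base, is_masked in zip(sequence, masked)
    )


if __name__ == "__main__":
    # Tiny toy input with k=2 so the repeat is visible by eye.
    print(soft_mask_frequent_kmers("ACACACACGTTGCA", k=2, max_count=2))
```

Because masking is driven by observed k-mer frequency rather than a repeat library, this kind of approach needs no prior repeat annotation for the species being aligned.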
19.
Front Plant Sci ; 13: 1004387, 2022.
Article in English | MEDLINE | ID: mdl-36212364

ABSTRACT

The tea plant (Camellia sinensis) is an important economic crop that is becoming increasingly popular worldwide and is now planted in more than 50 countries. The tea green leafhopper is one of the major pests in tea plantations and can significantly reduce the yield and quality of tea during plant growth. In this study, we report a genome assembly for DuyunMaojian tea plants produced with a combination of Oxford Nanopore Technology PromethION™ and high-throughput chromosome conformation capture technology, and we used multi-omics to study how the tea plant responds to infestation with tea green leafhoppers. The final genome was 3.08 Gb. A total of 2.97 Gb of the genome was mapped to 15 pseudo-chromosomes, and the order and orientation could be confirmed for 2.79 Gb of it. The contig N50, scaffold N50 and GC content were 723.7 kb, 207.72 Mb and 38.54%, respectively. There were 2.67 Gb (86.77%) of repetitive sequences, 34,896 protein-coding genes, 104 miRNAs, 261 rRNAs, 669 tRNAs, and 6,502 pseudogenes. A comparative genomics analysis showed that DuyunMaojian was the most closely related to Shuchazao and Yunkang 10, followed by DASZ and tea-oil tree. The multi-omics results indicated that phenylpropanoid biosynthesis, α-linolenic acid metabolism, flavonoid biosynthesis and 50 differentially expressed genes, particularly peroxidase, played important roles in response to infestation with tea green leafhoppers (Empoasca vitis Göthe). This study of the tea plant helps illustrate the evolution of its genome, and discovering how the tea plant responds to infestation with tea green leafhoppers will contribute a theoretical foundation for breeding insect-resistant tea plants, which will ultimately increase the yield and quality of tea.
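
The contig and scaffold N50 values reported above follow the usual assembly-statistics definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation, with a hypothetical function name and toy lengths rather than the tea assembly's data:

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() needs at least one length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # arbitrary toy lengths
    print(n50(toy_contigs))
```

For the toy lengths above the total is 54, and the three longest contigs (10, 9, 8) sum to 27, so the N50 is 8.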

20.
Hortic Res ; 2022 Jan 19.
Article in English | MEDLINE | ID: mdl-35043206

ABSTRACT

Earliness and ripening behavior are important attributes of fruits on and off the vine, and affect quality and preference of both growers and consumers. Fruit ripening is a complex physiological process that involves metabolic shifts affecting fruit color, firmness, and aroma production. Melon is a promising model crop for the study of fruit ripening, as the full spectrum of climacteric behavior is represented across the natural variation. Using a recombinant inbred line (RIL) population derived from the parental lines "Dulce" (reticulatus, climacteric) and "Tam Dew" (inodorus, non-climacteric), which vary in earliness and ripening traits, we mapped QTLs for ethylene emission, fruit firmness, and days to flowering and maturity. To further annotate the main QTL intervals and identify candidate genes, we used Oxford Nanopore long-read sequencing in combination with Illumina short-read resequencing to assemble the parental genomes de novo. In addition to 2.5 million genome-wide SNPs and short InDels detected between the parents, we also highlight here the structural variation between these lines and the reference melon genome. Through a systematic multi-layered prioritization process, we identified 18 potential polymorphisms in candidate genes within multi-trait QTLs. The associations of selected SNPs with earliness and ripening traits were further validated across a panel of 177 diverse melon accessions and across a diallel population of 190 F1 hybrids derived from a core subset of 20 diverse parents. The combination of advanced genomic tools with diverse germplasm and targeted mapping populations is demonstrated as a way to leverage forward genetics strategies to dissect complex horticulturally important traits.
