1.
BMC Biol ; 18(1): 73, 2020 06 26.
Article in English | MEDLINE | ID: mdl-32591023

ABSTRACT

BACKGROUND: Copy number variations (CNVs) are an important type of structural variation in the genome that usually affect gene expression levels through a gene dosage effect. Understanding CNVs as part of genome evolution may provide insights into the genetic basis of important agricultural traits and contribute to future crop breeding. While available methods to detect CNVs utilizing next-generation sequencing technology have helped shed light on the prevalence and effects of CNVs, the complexity of crop genomes poses a major challenge and requires the development of additional tools. RESULTS: Here, we generated genomic and transcriptomic data of 93 rice (Oryza sativa L.) accessions and developed a comprehensive pipeline to call CNVs in this large-scale dataset. We analyzed the correlation between CNVs and gene expression levels and found that approximately 13% of the identified genes showed a significant correlation between their expression levels and copy numbers. Further analysis showed that about 36% of duplicate pairs were involved in pseudogenetic events while only 5% of them showed functional differentiation. Moreover, the offspring copy mainly contributed to the expression levels and seemed more likely to become a pseudogene, whereas the parent copy tended to maintain the function of the ancestral gene. CONCLUSION: We provide a high-accuracy CNV dataset that will contribute to functional genomics studies and molecular breeding in rice. We also show that the gene dosage effect of CNVs in rice is not exponential or linear. Our work demonstrates that the evolution of duplicated genes is asymmetric in both expression levels and gene fates, offering new insight into the evolution of duplicated genes.
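
The correlation test mentioned in RESULTS can be illustrated for a single gene as follows. This is a minimal sketch, not the authors' pipeline: the abstract does not say which correlation statistic was used, so Pearson's r is an assumption here, and the function name and the sample values are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences.

    Returns NaN when either sequence has zero variance, because the
    coefficient is undefined in that case.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: copy number and expression level of one gene
# across five accessions (illustrative values, not from the study).
copy_numbers = [1, 2, 2, 3, 4]
expression = [2.0, 4.0, 4.0, 6.0, 8.0]
r = pearson_r(copy_numbers, expression)
```

A gene whose copy number never varies across accessions yields NaN rather than a correlation, so such genes would need to be filtered out before any significance testing.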


Subjects
DNA Copy Number Variations , Molecular Evolution , Gene Duplication , Plant Genes , Oryza/genetics , Plant Genome , Transcriptome
2.
BMC Genomics ; 20(1): 955, 2019 Dec 09.
Article in English | MEDLINE | ID: mdl-31818249

ABSTRACT

BACKGROUND: The advent of third-generation sequencing (TGS) technologies opens the door to improved genome assembly. Long reads are promising for enhancing the quality of fragmented draft assemblies constructed from next-generation sequencing (NGS) technologies. To date, a few algorithms capable of improving draft assemblies have been released, including SSPACE-LongRead, OPERA-LG, SMIS, npScarf, DBG2OLC, Unicycler, and LINKS. Hybrid assembly on large genomes remains challenging, however. RESULTS: We develop a scalable and computationally efficient scaffolder, Long Reads Scaffolder (LRScaf, https://github.com/shingocat/lrscaf), that is capable of significantly boosting assembly contiguity using long reads. In this study, we summarise a comprehensive performance assessment for state-of-the-art scaffolders and LRScaf on seven organisms, i.e., E. coli, S. cerevisiae, A. thaliana, O. sativa, S. pennellii, Z. mays, and H. sapiens. LRScaf significantly improves the contiguity of draft assemblies, e.g., increasing the NGA50 value of CHM1 from 127.1 kbp to 9.4 Mbp using a 20-fold coverage PacBio dataset and the NGA50 value of NA12878 from 115.3 kbp to 12.9 Mbp using a 35-fold coverage Nanopore dataset. In addition, LRScaf achieves the best NGA50 on A. thaliana, S. pennellii, Z. mays, and H. sapiens. Moreover, LRScaf has the shortest run time compared with other scaffolders, and the peak RAM of LRScaf remains practical for large genomes (e.g., 20.3 and 62.6 GB on CHM1 and NA12878, respectively). CONCLUSIONS: The new algorithm, LRScaf, yields the best or, at least, moderate scaffold contiguity and accuracy in the shortest run time compared with other scaffolding algorithms. Furthermore, LRScaf provides a cost-effective way to improve contiguity of draft assemblies on large genomes.
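
The contiguity figures quoted above belong to the N50 family of metrics. The sketch below computes N50 and NG50 from a list of contig or scaffold lengths; it is a simplified illustration, not LRScaf code. NGA50 as reported in the abstract additionally requires aligning the assembly to a reference and splitting sequences at misassemblies, which is not attempted here. The function name and the example lengths are hypothetical.

```python
def nx50(lengths, genome_size=None):
    """Return N50 (genome_size is None) or NG50 (genome_size given).

    N50: the length L such that sequences of length >= L cover at least
    half of the total assembly size.
    NG50: the same, but relative to half of the reference genome size.
    Returns 0 if the assembly covers less than half of genome_size.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = genome_size if genome_size is not None else sum(lengths)
    target = total / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return 0

# Hypothetical scaffold lengths in kbp (illustrative only).
scaffolds = [80, 70, 50, 40, 30, 20]
n50 = nx50(scaffolds)                     # half of 290 is 145
ng50 = nx50(scaffolds, genome_size=400)   # half of 400 is 200
```

Because NG50 is measured against a fixed genome size, it allows assemblies of different total length to be compared on the same scale, which plain N50 does not.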


Subjects
Algorithms , Computational Biology/methods , Genome/genetics , Genomics/methods , Benchmarking , High-Throughput Nucleotide Sequencing , Nanopore Sequencing , DNA Sequence Analysis
3.
Natl Sci Rev ; 11(2): nwad229, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38213525

ABSTRACT

Error-correcting codes (ECCs) employed in state-of-the-art DNA digital storage (DDS) systems suffer from a trade-off between error-correcting capability and the proportion of redundancy. To address this issue, in this study, we introduce a soft-decision decoding approach into DDS by proposing a DNA-specific error prediction model and a series of novel strategies. We demonstrate the effectiveness of our approach through a proof-of-concept DDS system based on Reed-Solomon (RS) code, named Derrick. Derrick shows significant improvement in error-correcting capability without involving additional redundancy in both in vitro and in silico experiments, using various sequencing technologies such as Illumina, PacBio and Oxford Nanopore Technologies (ONT). Notably, in vitro experiments using ONT sequencing at a depth of 7× reveal that Derrick, compared with the traditional hard-decision decoding strategy, doubles the error-correcting capability of RS code, decreases the proportion of matrices with decoding failure by 229-fold, and amplifies the potential maximum storage volume by an impressive 32,388-fold. Also, Derrick surpasses 'state-of-the-art' DDS systems by comprehensively considering the information density and the minimum sequencing depth required for complete information recovery. Crucially, the soft-decision decoding strategy and key steps of Derrick are generalizable to other ECCs' decoding algorithms.
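
One standard reason soft information can raise the capability of an RS code without extra redundancy is the errors-and-erasures bound: an RS(n, k) code can decode whenever 2 × errors + erasures ≤ n − k, so a symbol flagged in advance as unreliable (an erasure) costs half as much as an error at an unknown position. The sketch below only checks that bound. Whether Derrick exploits it in exactly this way is an assumption, since the abstract does not describe the mechanism, and the RS(255, 223) parameters are illustrative rather than taken from the paper.

```python
def rs_decodable(n, k, errors, erasures):
    """Check the classical Reed-Solomon errors-and-erasures bound.

    n        -- codeword length in symbols
    k        -- number of data symbols (n - k parity symbols)
    errors   -- corrupted symbols at unknown positions
    erasures -- corrupted symbols whose positions are known
    """
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    if errors < 0 or erasures < 0:
        raise ValueError("counts must be non-negative")
    return 2 * errors + erasures <= n - k

# Illustrative parameters (an assumption, not the code used in the paper).
n, k = 255, 223                      # 32 parity symbols
hard_limit = (n - k) // 2            # max errors at unknown positions
soft_limit = n - k                   # max errors if every one is flagged
```

With these parameters, hard-decision decoding tolerates 16 corrupted symbols, while a decoder that is told the positions of all corrupted symbols tolerates 32, which matches the factor of two cited in the abstract.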

4.
Gigascience ; 13, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38626722

ABSTRACT

BACKGROUND: Most currently available reference genomes lack the sequence map of sex-limited (such as Y and W) chromosomes, which results in incomplete assemblies that hinder further research on sex chromosomes. Recent advancements in long-read sequencing and population sequencing have provided the opportunity to assemble sex-limited chromosomes without the traditional complicated experimental efforts. FINDINGS: We introduce the first computational method, Sorting long Reads of Y or other sex-limited chromosome (SRY), which achieves improved assembly results compared to flow sorting. Specifically, SRY outperforms flow sorting in the heterochromatic region and demonstrates comparable performance in other regions. Furthermore, SRY enhances the capabilities of hybrid assembly software, resulting in improved continuity and accuracy. CONCLUSIONS: Our method enables truly complete genome assembly and facilitates downstream research on sex-limited chromosomes.


Subjects
Genome , Sex Chromosomes , Sex Chromosomes/genetics , DNA Sequence Analysis/methods , High-Throughput Nucleotide Sequencing/methods
5.
Genomics Proteomics Bioinformatics ; 21(1): 203-215, 2023 02.
Article in English | MEDLINE | ID: mdl-35718271

ABSTRACT

Sika deer are known to prefer oak leaves, which are rich in tannins and toxic to most mammals; however, the genetic mechanisms underlying their unique ability to adapt to living in the jungle are still unclear. In identifying the mechanism responsible for the tolerance of a highly toxic diet, we have made a major advancement by explaining the genome of sika deer. We generated the first high-quality, chromosome-level genome assembly of sika deer and measured the correlation between tannin intake and RNA expression in 15 tissues through 180 experiments. Comparative genome analyses showed that the UGT and CYP gene families are functionally involved in the adaptation of sika deer to high-tannin food, especially the expansion of the UGT family 2 subfamily B of UGT genes. The first chromosome-level assembly and genetic characterization of the tolerance to a highly toxic diet suggest that the sika deer genome may serve as an essential resource for understanding evolutionary events and tannin adaptation. Our study provides a paradigm of comparative expressive genomics that can be applied to the study of unique biological features in non-model animals.


Subjects
Deer , Animals , Deer/genetics , Deer/metabolism , Tannins/metabolism , Genome , Genomics , Diet
6.
GigaByte ; 2021: gigabyte15, 2021.
Article in English | MEDLINE | ID: mdl-36824332

ABSTRACT

Long-read single-molecule sequencing has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. It has also been widely used to study structural variants, phase haplotypes and more. Here, we introduce SMARTdenovo, a single-molecule sequencing (SMS) assembler that follows the overlap-layout-consensus (OLC) paradigm. SMARTdenovo (RRID: SCR_017622) was designed to be a rapid assembler, which, unlike contemporaneous SMS assemblers, does not require highly accurate raw reads for error correction. It has performed well in the evaluation of congeneric assemblers and has been successfully used for various assembly projects. It is compatible with Canu for assembling high-quality genomes, and several of the assembly strategies in this program have been incorporated into subsequent popular assemblers. The assembler has been in use since 2015; here we provide information on the development of SMARTdenovo and how to implement its algorithms into current projects.
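
The overlap and layout stages of the OLC paradigm named above can be sketched on toy data. This is a deliberately naive illustration, not SMARTdenovo's algorithm: it assumes error-free reads from a single strand, uses exact suffix-prefix matching with a quadratic all-against-all search, and omits the consensus stage entirely. The function names and reads are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when the longest match is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_layout(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    seqs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(seqs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no remaining overlaps: the layout stays fragmented
        merged = seqs[best_i] + seqs[best_j][best_n:]
        seqs = [s for idx, s in enumerate(seqs) if idx not in (best_i, best_j)]
        seqs.append(merged)
    return seqs

# Toy error-free reads (illustrative only).
contigs = greedy_layout(["ATGGCGT", "GCGTACC", "TACCTTA"])
```

Real single-molecule reads carry high error rates, so practical assemblers replace the exact matching used here with approximate alignment and add a consensus step to polish the merged sequence.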

7.
Nat Plants ; 7(6): 748-756, 2021 06.
Article in English | MEDLINE | ID: mdl-34135482

ABSTRACT

Gymnosperms are a unique lineage of plants that currently lack a high-quality reference genome due to their large genome size and high repetitive sequence content. Here, we report a nearly complete genome assembly for Ginkgo biloba with a genome size of 9.87 Gb, an N50 contig size of 1.58 Mb and an N50 scaffold size of 775 Mb. We were able to accurately annotate 27,832 protein-coding genes in total, superseding the inaccurate annotation of 41,840 genes in a previous draft genome assembly. We found that expansion of the G. biloba genome, accompanied by the notable extension of introns, was mainly caused by the insertion of long terminal repeats rather than the recent occurrence of whole-genome duplication events, in contrast to the findings of a previous report. We also identified candidate genes in the central pair, intraflagellar transport and dynein protein families that are associated with the formation of the spermatophore flagellum, which has been lost in all seed plants except ginkgo and cycads. The newly obtained Ginkgo genome provides new insights into the evolution of the gymnosperm genome.


Subjects
Biological Evolution , Plant Genome , Ginkgo biloba/genetics , Plant Proteins/genetics , Cycadopsida/genetics , Cycadopsida/physiology , DNA Transposable Elements , Flowers/genetics , Introns , Phylogeny , Plant Leaves/genetics , Terminal Repeat Sequences
8.
Nat Commun ; 11(1): 4447, 2020 09 07.
Article in English | MEDLINE | ID: mdl-32895382

ABSTRACT

Tea is an economically important plant characterized by a large genome, high heterozygosity, and high species diversity. In this study, we assemble a 3.26-Gb high-quality chromosome-scale genome for the 'Longjing 43' cultivar of Camellia sinensis var. sinensis. Genomic resequencing of 139 tea accessions from around the world is used to investigate the evolution and phylogenetic relationships of tea accessions. We find that hybridization has increased the heterozygosity and wide-ranging gene flow among tea populations with the spread of tea cultivation. Population genetic and transcriptomic analyses reveal that during domestication, selection for disease resistance and flavor in C. sinensis var. sinensis populations has been stronger than that in C. sinensis var. assamica populations. This study provides resources for marker-assisted breeding of tea and sets the foundation for further research on tea genetics and evolution.


Subjects
Camellia sinensis/genetics , Disease Resistance/genetics , Molecular Evolution , Plant Genome/genetics , Plant Breeding , Domestication , Gene Expression Profiling , Genomics , Phylogeny , Single Nucleotide Polymorphism
9.
Natl Sci Rev ; 7(3): 686-701, 2020 Mar.
Article in English | MEDLINE | ID: mdl-34692087

ABSTRACT

Domesticated buffaloes have been integral to rice-paddy agro-ecosystems for millennia, yet relatively little is known about buffalo genomics. Here, we sequenced and assembled reference genomes for both swamp and river buffaloes and re-sequenced 230 individuals (132 swamp buffaloes and 98 river buffaloes) sampled from across Asia and Europe. Beyond the many actionable insights that our study revealed about the domestication, basic physiology and breeding of buffalo, we made the striking discovery that the divergent domestication traits between swamp and river buffaloes can be explained by recent selection on genes related to social behavior, digestive metabolism, strength and milk production.
