1.
Arch Virol ; 165(1): 227-231, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31659444

ABSTRACT

Three viral contig sequences, which represented the complete genome of a novel virus with three dsRNAs of 1,712 nucleotides (nt) (dsRNA1), 1,504 nt (dsRNA2) and 1,353 nt (dsRNA3), were found in tea-oil camellia plants by high-throughput sequencing analysis. The three dsRNAs were re-sequenced by RT-PCR cloning. The largest dsRNA, dsRNA1, had a single open reading frame (ORF) that encoded a putative 52.7-kDa protein of a putative viral RNA-dependent RNA polymerase (RdRp). DsRNA2 and dsRNA3 were predicted to encode putative capsid proteins (CPs) of 40.47 kDa and 40.59 kDa, respectively. The virus, which is provisionally named "tea-oil camellia deltapartitivirus 1", shared amino acid sequence identities of 36.09-69.18% with members of the genus Deltapartitivirus for RdRp. Phylogenetic analysis based on RdRp also placed the new virus and other deltapartitiviruses together in a group, suggesting that this virus should be considered a new member of the genus Deltapartitivirus.


Subjects
Camellia/virology, RNA Viruses/genetics, Whole Genome Sequencing/methods, Contig Mapping, Viral Genome, High-Throughput Nucleotide Sequencing/methods, Open Reading Frames, Phylogeny, RNA Viruses/classification, Double-Stranded RNA/genetics
2.
BMC Bioinformatics ; 20(Suppl 9): 367, 2019 Nov 22.
Article in English | MEDLINE | ID: mdl-31757198

ABSTRACT

MOTIVATION: Sequencing technologies allow the sequencing of microbial communities directly from the environment without prior culturing. Because assembly typically produces only genome fragments, also known as contigs, it is crucial to group them into putative species for further taxonomic profiling and downstream functional analysis. Taxonomic analysis of microbial communities requires contig clustering, a process referred to as binning, that is still one of the most challenging tasks when analyzing metagenomic data. The major problems are the lack of taxonomically related genomes in existing reference databases, the uneven abundance ratio of species, sequencing errors, and the limitations due to binning contigs of different lengths. RESULTS: In this context we present MetaCon, a novel tool for unsupervised metagenomic contig binning based on probabilistic k-mer statistics and coverage. MetaCon uses a signature based on k-mer statistics that accounts for the different probability of appearance of a k-mer in different species; in addition, contigs of different lengths are clustered in two separate phases. The effectiveness of MetaCon is demonstrated on both simulated and real datasets in comparison with state-of-the-art binning approaches such as CONCOCT, MaxBin and MetaBAT.


Subjects
Algorithms, Contig Mapping, Metagenome, Metagenomics, Probability, Statistics as Topic, Cluster Analysis, Genetic Databases, Microbiota/genetics
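
The composition-based idea behind contig binning can be illustrated with a minimal sketch. The code below computes a plain normalized k-mer frequency vector per contig and a distance between two such vectors. It is not MetaCon's signature, which the abstract describes as probabilistic and combined with coverage; the function names `kmer_signature` and `manhattan` and the default k = 4 are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product


def kmer_signature(seq, k=4):
    """Return the normalized frequency of every ACGT k-mer in seq.

    Illustrative only: a plain frequency vector, not the probabilistic
    signature used by any particular binning tool.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing other symbols (for example N) are ignored.
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length signatures."""
    return sum(abs(x - y) for x, y in zip(a, b))
```

Contigs whose signatures lie close together under such a distance are candidates for the same bin; a real binner would then feed the vectors, usually together with coverage, into a clustering step.
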
3.
BMC Genomics ; 20(1): 706, 2019 Sep 11.
Article in English | MEDLINE | ID: mdl-31510940

ABSTRACT

BACKGROUND: Accurate de novo genome assembly has become reality with the advancements in sequencing technology. With the ever-increasing number of de novo genome assembly tools, assessing the quality of assemblies has become of great importance in genome research. Although many quality metrics have been proposed and software tools for calculating those metrics have been developed, the existing tools do not produce a unified measure to reflect the overall quality of an assembly. RESULTS: To address this issue, we developed the de novo Assembly Quality Evaluation Tool (dnAQET) that generates a unified metric for benchmarking the quality assessment of assemblies. Our framework first calculates individual quality scores for the scaffolds/contigs of an assembly by aligning them to a reference genome. Next, it computes a quality score for the assembly using its overall reference genome coverage, the quality score distribution of its scaffolds and the redundancy identified in it. Using synthetic assemblies randomly generated from the latest human genome build, various builds of the reference genomes for five organisms and six de novo assemblies for sample NA24385, we tested dnAQET to assess its capability for benchmarking quality evaluation of genome assemblies. For synthetic data, our quality score increased with decreasing number of misassemblies and redundancy and increasing average contig length and coverage, as expected. For genome builds, dnAQET quality score calculated for a more recent reference genome was better than the score for an older version. To compare with some of the most frequently used measures, 13 other quality measures were calculated. The quality score from dnAQET was found to be better than all other measures in terms of consistency with the known quality of the reference genomes, indicating that dnAQET is reliable for benchmarking quality assessment of de novo genome assemblies. 
CONCLUSIONS: The dnAQET is a scalable framework designed to evaluate a de novo genome assembly based on the aggregated quality of its scaffolds (or contigs). Our results demonstrated that the dnAQET quality score is reliable for benchmarking quality assessment of genome assemblies. The dnAQET can help researchers to identify the most suitable assembly tools and to select the high-quality assemblies generated.


Subjects
Genomics/methods, Benchmarking, Contig Mapping, Software
4.
Gigascience ; 8(7)2019 07 01.
Article in English | MEDLINE | ID: mdl-31307060

ABSTRACT

BACKGROUND: Acer yangbiense is a newly described critically endangered endemic maple tree confined to Yangbi County in Yunnan Province in Southwest China. It was included in a programme for rescuing the most threatened species in China, focusing on "plant species with extremely small populations (PSESP)". FINDINGS: We generated 64, 94, and 110 Gb of raw DNA sequences using Pacific Biosciences Single-molecule Real-time, Illumina HiSeq X, and Hi-C mapping, respectively, and combined them to obtain a chromosome-level genome assembly of A. yangbiense. The final genome assembly is ∼666 Mb, with 13 chromosomes covering ∼97% of the genome and a scaffold N50 size of 45 Mb. Further, BUSCO analysis recovered 95.5% complete BUSCO genes. Repetitive elements account for 68.0% of the A. yangbiense genome. Genome annotation generated 28,320 protein-coding genes, assisted by a combination of prediction and transcriptome sequencing. In addition, a nearly 1:1 orthology ratio of dot plots of longer syntenic blocks revealed a similar evolutionary history between A. yangbiense and grape, indicating that the genome has not undergone a whole-genome duplication event after the core eudicot common hexaploidization. CONCLUSION: Here, we report a high-quality de novo genome assembly of A. yangbiense, the first genome for the genus Acer and the family Aceraceae. This will provide fundamental conservation genomics resources, as well as representing a new high-quality reference genome for the economically important Acer lineage and the wider order of Sapindales.


Subjects
Acer/genetics, Plant Chromosomes/genetics, Endangered Species, Plant Genome, China, Contig Mapping, Molecular Sequence Annotation, Whole Genome Sequencing
5.
Gigascience ; 8(7)2019 07 01.
Article in English | MEDLINE | ID: mdl-31289832

ABSTRACT

BACKGROUND: The blood clam, Scapharca (Anadara) broughtonii, is an economically and ecologically important marine bivalve of the family Arcidae. Efforts to study their population genetics, breeding, cultivation, and stock enrichment have been somewhat hindered by the lack of a reference genome. Herein, we report the complete genome sequence of S. broughtonii, a first reference genome of the family Arcidae. FINDINGS: A total of 75.79 Gb clean data were generated with the Pacific Biosciences and Oxford Nanopore platforms, which represented approximately 86× coverage of the S. broughtonii genome. De novo assembly of these long reads resulted in an 884.5-Mb genome, with a contig N50 of 1.80 Mb and scaffold N50 of 45.00 Mb. Genome Hi-C scaffolding resulted in 19 chromosomes containing 99.35% of bases in the assembled genome. Genome annotation revealed that nearly half of the genome (46.1%) is composed of repeated sequences, while 24,045 protein-coding genes were predicted and 84.7% of them were annotated. CONCLUSIONS: We report here a chromosomal-level assembly of the S. broughtonii genome based on long-read sequencing and Hi-C scaffolding. The genomic data can serve as a reference for the family Arcidae and will provide a valuable resource for the scientific community and aquaculture sector.


Subjects
Bivalvia/genetics, Chromosomes/genetics, Genome, Animals, Contig Mapping, Molecular Sequence Annotation, Whole Genome Sequencing
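
Several abstracts in this listing report contig or scaffold N50 values. As a reminder of what the statistic measures, here is a minimal sketch using the common definition: the length L such that sequences of length at least L together contain at least half of the assembled bases. The function name `n50` is an assumption for illustration, and individual assemblers or evaluation tools may apply extra filters (for example a minimum contig length) before computing it.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of
        # the total assembly size is the N50.
        if running >= half_total:
            return length
```

For example, for lengths 80, 70, 50, 40, 30 and 20 (total 290), the two longest sequences already sum to 150, which exceeds half the total, so the N50 is 70.
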
6.
Gigascience ; 8(7)2019 07 01.
Article in English | MEDLINE | ID: mdl-31289836

ABSTRACT

BACKGROUND: Mammalian X and Y chromosomes share a common evolutionary origin and retain regions of high sequence similarity. Similar sequence content can confound the mapping of short next-generation sequencing reads to a reference genome. It is therefore possible that the presence of both sex chromosomes in a reference genome can cause technical artifacts in genomic data and affect downstream analyses and applications. Understanding this problem is critical for medical genomics and population genomic inference. RESULTS: Here, we characterize how sequence homology can affect analyses on the sex chromosomes and present XYalign, a new tool that (1) facilitates the inference of sex chromosome complement from next-generation sequencing data; (2) corrects erroneous read mapping on the sex chromosomes; and (3) tabulates and visualizes important metrics for quality control such as mapping quality, sequencing depth, and allele balance. We find that sequence homology affects read mapping on the sex chromosomes and this has downstream effects on variant calling. However, we show that XYalign can correct mismapping, resulting in more accurate variant calling. We also show how metrics output by XYalign can be used to identify XX and XY individuals across diverse sequencing experiments, including low- and high-coverage whole-genome sequencing, and exome sequencing. Finally, we discuss how the flexibility of the XYalign framework can be leveraged for other uses including the identification of aneuploidy on the autosomes. XYalign is available open source under the GNU General Public License (version 3). CONCLUSIONS: Sex chromosome sequence homology causes the mismapping of short reads, which in turn affects downstream analyses. XYalign provides a reproducible framework to correct mismapping and improve variant calling on the sex chromosomes.


Subjects
Human X Chromosomes/genetics, Human Y Chromosomes/genetics, High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis/methods, Nucleic Acid Sequence Homology, Artifacts, Contig Mapping/methods, Contig Mapping/standards, Female, High-Throughput Nucleotide Sequencing/standards, Humans, Male, Sequence Alignment/methods, Sequence Alignment/standards, DNA Sequence Analysis/standards
7.
Gene ; 712: 143962, 2019 Sep 05.
Article in English | MEDLINE | ID: mdl-31288057

ABSTRACT

Veratrum nigrum is a protected plant of the Melanthiaceae family, able to synthesize unique steroidal alkaloids important for pharmacy. Transcriptomes from leaves, stems and rhizomes of in vitro maintained V. nigrum plants were sequenced and annotated for gene and marker discovery. Sequencing of samples derived from the different organs resulted in a total of 108,511 contigs with a mean length of 596 bp. Transcripts derived from leaf and stalk were annotated at 28% and 38% in the Nr nucleotide database, respectively. The sequencing revealed 949 unigenes related to lipid metabolism, including 73 transcripts involved in the biosynthesis of steroids and genus-specific steroid alkaloids. Additionally, 3203 candidate SSR markers were identified in unigenes, with an average density of one SSR locus every 6.2 kb of sequence. Unraveling the biochemical machinery of the pathway responsible for steroidal alkaloids will open the possibility to design and optimize biotechnological processes. The transcriptomic data provide valuable resources for biochemical, molecular genetics, comparative transcriptomics, functional genomics, ecological and evolutionary studies of V. nigrum.


Subjects
Alkaloids/biosynthesis, Plant Gene Expression Regulation, Steroids/biosynthesis, Transcriptome, Veratrum/metabolism, Contig Mapping, Complementary DNA/metabolism, Gene Library, Gene Ontology, Genetic Markers, High-Throughput Nucleotide Sequencing, Microsatellite Repeats, Molecular Sequence Annotation, Oligonucleotide Array Sequence Analysis, Plant Leaves/metabolism, Plant Proteins/metabolism, Plant Roots/metabolism, RNA Sequence Analysis
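
The abstract above reports candidate SSR (microsatellite) markers but does not name the detection tool or its thresholds. The following is a minimal sketch of how perfect tandem repeats could be located with a regular expression. The function names `find_ssrs` and `is_primitive`, the motif sizes (2 to 6 bp) and the minimum of five repeats are all assumptions chosen for illustration, not the settings used in the study.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=5, motif_sizes=range(2, 7)):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Thresholds are hypothetical; published pipelines typically use
    motif-size-specific minimum repeat counts.
    """
    seq = seq.upper()
    hits = []
    for size in motif_sizes:
        # One motif of `size` bases followed by at least
        # (min_repeats - 1) further copies of the same motif.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # e.g. AGAG is already reported as AG
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)
```

Dividing the total length of the searched sequences by the number of hits gives a density figure of the kind quoted in the abstract (one locus per so many kb).
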
8.
Genes Genomics ; 41(9): 1077-1083, 2019 09.
Article in English | MEDLINE | ID: mdl-31187446

ABSTRACT

BACKGROUND: With the advent of next-generation sequencing techniques, culture-independent metagenome approaches have made it possible to predict the presence of genes in environmental bacteria, most of which may be non-cultivable. Short reads obtained from deep sequencing can be assembled into long contigs, some of which include plasmids. Plasmids are circular double-stranded DNA molecules in bacteria and are known as one of the major carriers of antibiotic resistance genes. OBJECTIVE: Metagenomic analyses, especially those focused on plasmids, could help us predict dissemination mechanisms of antibiotic resistance genes in the environment. However, with the availability of a myriad of metagenomic assemblers, the selection of the most appropriate metagenome assembler for a plasmid metagenome study might be challenging. Therefore, in this study, we compared five open source assemblers to suggest the most effective way of conducting plasmid metagenome analysis. METHODS: IDBA-UD, MEGAHIT, SPAdes, SOAPdenovo2, and Velvet were compared for conducting plasmid metagenome analyses using two water samples. RESULTS: Our results clearly showed that the abundance and types of antibiotic resistance genes on plasmids varied depending on the selection of assembly tools. IDBA-UD and MEGAHIT demonstrated the overall best assembly statistics, with high N50 values and a higher proportion of longer contigs. CONCLUSION: These two assemblers also detected more diverse plasmids. Of the two, MEGAHIT showed more memory-efficient assembly; therefore, we suggest that the use of MEGAHIT for plasmid metagenome analysis may offer more diverse plasmids with fewer computing resources required. Here, we also summarized a fundamental plasmid metagenome workflow, especially for antibiotic resistance gene investigation.


Subjects
Contig Mapping/methods, Metagenomics/methods, DNA Sequence Analysis/methods, Software, Metagenome, Microbiota/genetics, Plasmids/genetics, Water Microbiology
9.
BMC Genomics ; 20(Suppl 5): 426, 2019 Jun 06.
Article in English | MEDLINE | ID: mdl-31167639

ABSTRACT

BACKGROUND: Closing gaps in draft genomes is an important post processing step in genome assembly. It leads to more complete genomes, which benefits downstream genome analysis such as annotation and genotyping. Several tools have been developed for gap closing. However, these tools don't fully utilize the information contained in the sequence data. For example, while it is known that many gaps are caused by genomic repeats, existing tools often ignore many sequence reads that originate from a repeat-related gap. RESULTS: We compare GAPPadder with GapCloser, GapFiller and Sealer on one bacterial genome, human chromosome 14 and the human whole genome with paired-end and mate-paired reads with both short and long insert sizes. Empirical results show that GAPPadder can close more gaps than these existing tools. Besides closing gaps on draft genomes assembled only from short sequence reads, GAPPadder can also be used to close gaps for draft genomes assembled with long reads. We show GAPPadder can close gaps on the bed bug genome and the Asian sea bass genome that are assembled partially and fully with long reads respectively. We also show GAPPadder is efficient in both time and memory usage. CONCLUSION: In this paper, we propose a new approach called GAPPadder for gap closing. The main advantage of GAPPadder is that it uses more information in sequence data for gap closing. In particular, GAPPadder finds and uses reads that originate from repeat-related gaps. We show that these repeat-associated reads are useful for gap closing, even though they are ignored by all existing tools. Other main features of GAPPadder include utilizing the information in sequence reads with different insert sizes and performing two-stage local assembly of gap sequences. The results show that our method can close more gaps than several existing tools. The software tool, GAPPadder, is available for download at https://github.com/Reedwarbler/GAPPadder .


Subjects
Contig Mapping/methods, Human Genome, Genomics/methods, High-Throughput Nucleotide Sequencing/methods, Repetitive Nucleic Acid Sequences, DNA Sequence Analysis/methods, Algorithms, Humans, Software
10.
Hum Genet ; 138(7): 757-769, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31168775

ABSTRACT

An ethnicity is characterized by genomic fragments, single nucleotide polymorphisms (SNPs), and structural variations specific to it. However, the widely used 'standard human reference genome' GRCh37/38 is based on Caucasians. Therefore, de novo-assembled reference genomes for specific ethnicities would have advantages for genetics and precision medicine applications, especially with the long-read sequencing techniques that facilitate genome assembly. In this study, we assessed the de novo-assembled Chinese Han reference genome HX1 vis-à-vis the standard GRCh38 for improving the quality of assembly and for ethnicity-specific applications. Surprisingly, all genomic sequencing datasets mapped better to GRCh38 than to HX1, even for the datasets of the Chinese Han population. This gap was mainly due to the massive structural misassembly of the HX1 reference genome rather than the SNPs between the ethnicities, and this misassembly could not be corrected by short-read whole-genome sequencing (WGS). For example, HX1 and the other de novo-assembled personal genomes failed to assemble the mitochondrial genome as a contig. We mapped 97.1% of dbSNP, 98.8% of ClinVar, and 97.2% of COSMIC variants to HX1. HX1-absent, non-synonymous ClinVar SNPs were involved in 140 genes and many important functions in various diseases, most of which were due to the assembly failure of essential exons. In contrast, the HX1-specific regions were scantly expressible, as shown in the cell lines and clinical samples of Chinese patients. Our results demonstrated that the de novo-assembled individual genome such as HX1 did not have advantages against the standard GRCh38 genome due to insufficient assembly quality, and that it is, therefore, not recommended for common use.


Subjects
Asian Continental Ancestry Group/genetics, Ethnic Groups/genetics, Human Genome, Genomics/standards, High-Throughput Nucleotide Sequencing/methods, Reference Standards, DNA Sequence Analysis/standards, Algorithms, Contig Mapping, Population Genetics, Humans, Single Nucleotide Polymorphism, Transcriptome
11.
Genome Res ; 29(8): 1352-1362, 2019 08.
Article in English | MEDLINE | ID: mdl-31160374

ABSTRACT

Predicting biosynthetic gene clusters (BGCs) is critically important for discovery of antibiotics and other natural products. While BGC prediction from complete genomes is a well-studied problem, predicting BGCs in fragmented genomic assemblies remains challenging. The existing BGC prediction tools often assume that each BGC is encoded within a single contig in the genome assembly, a condition that is violated for most sequenced microbial genomes where BGCs are often scattered through several contigs, making it difficult to reconstruct them. The situation is even more severe in shotgun metagenomics, where the contigs are often short, and the existing tools fail to predict a large fraction of long BGCs. While it is difficult to assemble BGCs in a single contig, the structure of the genome assembly graph often provides clues on how to combine multiple contigs into segments encoding long BGCs. We describe biosyntheticSPAdes, a tool for predicting BGCs in assembly graphs and demonstrate that it greatly improves the reconstruction of BGCs from genomic and metagenomics data sets.


Subjects
Bacterial Genes, Metagenome, Metagenomics/methods, Multigene Family, Software, Contig Mapping, Datasets as Topic, Dental Plaque/microbiology, Gingiva/microbiology, Humans, Internet, Mouth Mucosa/microbiology, Pharynx/microbiology, Protein Biosynthesis, Tongue/microbiology
12.
PLoS One ; 14(5): e0215077, 2019.
Article in English | MEDLINE | ID: mdl-31042716

ABSTRACT

Here, we present the genome of the industrial ethanol production strain Brettanomyces bruxellensis CBS 11270. The nuclear genome was found to be diploid, containing four chromosomes with sizes ranging from 2.2 to 4.0 Mbp. A 75 Kbp mitochondrial genome was also identified. Comparing the homologous chromosomes, we detected that 0.32% of nucleotides were polymorphic, i.e. formed single nucleotide polymorphisms (SNPs); 40.6% of them were found in coding regions (i.e. 0.13% of all nucleotides formed SNPs and were in coding regions). In addition, 8,538 indels were found. The total number of protein coding genes was 4,897; of these, 4,284 were annotated on chromosomes, and the mitochondrial genome contained 18 protein coding genes. Additionally, 595 annotated genes were on contigs not associated with chromosomes. A number of genes were duplicated, most of them as tandem repeats, including a six-gene cluster located on chromosome 3. There were also examples of interchromosomal gene duplications, including a duplication of a six-gene cluster, which was found on both chromosomes 1 and 4. Gene copy number analysis suggested loss of heterozygosity for 372 genes. This may reflect adaptation to relatively harsh but constant conditions of continuous fermentation. Analysis of gene topology showed that most of these losses occurred in clusters of more than one gene, the largest cluster comprising 33 genes. Comparative analysis against the wine isolate CBS 2499 revealed 88,534 SNPs and 8,133 indels. Moreover, when the scaffolds of the CBS 2499 genome assembly were aligned against the chromosomes of CBS 11270, many of them aligned completely, some had chunks aligned to different chromosomes, and some were in fact rearranged. Our findings indicate a highly dynamic genome within the species B. bruxellensis and a tendency towards reduction of gene number in long-term continuous cultivation.


Subjects
Brettanomyces/metabolism, Fungal Chromosomes/genetics, Ethanol/metabolism, Mitochondria/genetics, Brettanomyces/genetics, Contig Mapping, Molecular Evolution, Gene Dosage, Genetic Variation, Genome Size, Molecular Sequence Annotation, Phylogeny, Whole Genome Sequencing/methods
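
The percentages and gene counts quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the abstract; the function and constant names are illustrative assumptions.

```python
def coding_snp_percent(snp_percent, coding_share_percent):
    """Percent of all nucleotides that are SNPs located in coding regions."""
    return snp_percent * coding_share_percent / 100


# Figures taken from the abstract.
SNP_PERCENT = 0.32            # nucleotides that are polymorphic
CODING_SHARE_PERCENT = 40.6   # share of those SNPs in coding regions

GENES_ON_CHROMOSOMES = 4284
MITOCHONDRIAL_GENES = 18
GENES_ON_UNPLACED_CONTIGS = 595

if __name__ == "__main__":
    coding = coding_snp_percent(SNP_PERCENT, CODING_SHARE_PERCENT)
    print(f"SNPs in coding regions: {coding:.2f}% of all nucleotides")
    total = (GENES_ON_CHROMOSOMES + MITOCHONDRIAL_GENES
             + GENES_ON_UNPLACED_CONTIGS)
    print(f"Protein coding genes accounted for: {total}")
```

The product 0.32% × 40.6% rounds to the 0.13% stated in the abstract, and the three gene categories sum to the reported total of 4,897.
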
13.
Gigascience ; 8(5)2019 05 01.
Article in English | MEDLINE | ID: mdl-31077315

ABSTRACT

BACKGROUND: In recent years, massively parallel complementary DNA sequencing (RNA sequencing [RNA-Seq]) has emerged as a fast, cost-effective, and robust technology to study entire transcriptomes in various manners. In particular, for non-model organisms and in the absence of an appropriate reference genome, RNA-Seq is used to reconstruct the transcriptome de novo. Although the de novo transcriptome assembly of non-model organisms has been on the rise recently and new tools are frequently being developed, there is still a knowledge gap about which assembly software should be used to build a comprehensive de novo assembly. RESULTS: Here, we present a large-scale comparative study in which 10 de novo assembly tools are applied to 9 RNA-Seq data sets spanning different kingdoms of life. Overall, we built >200 single assemblies and evaluated their performance on a combination of 20 biological-based and reference-free metrics. Our study is accompanied by a comprehensive and extensible Electronic Supplement that summarizes all data sets, assembly execution instructions, and evaluation results. Trinity, SPAdes, and Trans-ABySS, followed by Bridger and SOAPdenovo-Trans, generally outperformed the other tools compared. Moreover, we observed species-specific differences in the performance of each assembler. No tool delivered the best results for all data sets. CONCLUSIONS: We recommend a careful choice and normalization of evaluation metrics to select the best assembling results as a critical step in the reconstruction of a comprehensive de novo transcriptome assembly.


Subjects
RNA Sequence Analysis/methods, Software, Transcriptome, Animals, Arabidopsis, Contig Mapping/methods, Contig Mapping/standards, Escherichia coli, Humans, Mice, RNA Sequence Analysis/standards
14.
BMC Genomics ; 20(1): 370, 2019 May 14.
Article in English | MEDLINE | ID: mdl-31088494

ABSTRACT

BACKGROUND: The club-legged grasshopper Gomphocerus sibiricus is a Gomphocerinae grasshopper with a promising future as a model species for studying the maintenance of colour-polymorphism, the genetics of sexual ornamentation and genome size evolution. However, limited molecular resources are available for this species. Here, we present a de novo transcriptome assembly as a reference resource for gene expression studies. We used high-throughput Illumina sequencing to generate 5,070,036 paired-end reads after quality filtering. We then combined the best-assembled contigs from three different de novo transcriptome assemblers (Trinity, SOAPdenovo-trans and Oases/Velvet) into a single assembly. RESULTS: This resulted in 82,251 contigs with an N50 of 1357 and a TransRate assembly score of 0.325, which compares favourably with other orthopteran transcriptome assemblies. Around 87% of the transcripts could be annotated using InterProScan 5, BLASTx and the dammit! annotation pipeline. We identified a number of genes involved in pigmentation and green pigment metabolism pathways. Furthermore, we identified 76,221 putative single nucleotide polymorphisms residing in 8400 contigs. We also assembled the mitochondrial genome and investigated levels of sequence divergence with other species from the genus Gomphocerus. Finally, we detected and assembled Wolbachia sequences, which revealed close sequence similarity to the strain pel wPip. CONCLUSIONS: Our study has generated a significant resource for uncovering genotype-phenotype associations in a species with an extraordinarily large genome, while also providing mitochondrial and Wolbachia sequences that will be useful for comparative studies.


Subjects
Gene Expression Profiling/methods, Grasshoppers/genetics, Mitochondria/genetics, RNA Sequence Analysis/methods, Animals, Contig Mapping, Female, Genetic Association Studies, Genome Size, High-Throughput Nucleotide Sequencing, Male, Molecular Sequence Annotation
15.
Gene ; 710: 30-38, 2019 Aug 20.
Article in English | MEDLINE | ID: mdl-31128222

ABSTRACT

Pelodera strongyloides is a generally free-living gonochoristic facultative nematode. The whole genomic sequence of P. strongyloides remains unknown, but 4 small subunit ribosomal RNA (ssrRNA) gene sequences are available. This project launched a de novo transcriptome assembly with 100 bp paired-end RNA-seq reads from normal, starved and wet-plate cultured animals. The Trinity assembly tool generated 104,634 transcript contigs with a contig N50 of 2195 bp and an average contig length of 1103 bp. Transcriptome BLASTX matching results of five nematodes (C. elegans, Strongyloides stercoralis, Necator americanus, Trichuris trichiura, and Pristionchus pacificus) were consistent with their evolutionary relationships. Sixteen genes were identified as homologous to key elements of the C. elegans RNA interference system, such as Dicer, Argonaute, RNA-dependent RNA polymerase and double-stranded RNA transport proteins. In starved samples, we observed up-regulation of cuticle-related genes and 3 dauer formation genes. Dauer morphology was captured with enlarged phasmid under light microscopy, and dauer and normal larvae counts in clumps had a Pearson's product-moment correlation of 0.805 with P-value = 0.0088. Our results demonstrate that P. strongyloides could be used for studying nematode-related human or pet parasitic diseases. The sequenced and assembled transcriptome reported here may be useful to understand the evolution of parasitism in Nematoda.


Subjects
Gene Expression Profiling/methods, Helminth Proteins/genetics, Rhabditoidea/genetics, Animals, Contig Mapping, Molecular Evolution, Gene Expression Regulation, Phylogeny, Rhabditoidea/anatomy & histology, RNA Sequence Analysis/methods
16.
Nat Commun ; 10(1): 1702, 2019 04 12.
Article in English | MEDLINE | ID: mdl-30979905

ABSTRACT

The ultimate goal for diploid genome determination is to completely decode homologous chromosomes independently, and several programs for phasing from consensus sequences have been developed. These methods work well for genomes with low heterozygosity, but many species have high heterozygosity. Additionally, there are highly divergent regions (HDRs), where the haplotype sequences differ considerably. Because HDRs are likely to direct various interesting biological phenomena, many genomic analysis targets fall within these regions. However, they cannot be accessed by existing phasing methods, and we have to adopt costly traditional methods. Here, we develop a de novo haplotype assembler, Platanus-allee ( http://platanus.bio.titech.ac.jp/platanus2 ), which initially constructs each haplotype sequence and then untangles the assembly graphs utilizing sequence links and synteny information. A comprehensive benchmark analysis reveals that Platanus-allee exhibits high recall and precision, particularly for HDRs. Using this approach, previously unknown HDRs are detected in the human genome, which may uncover novel aspects of genome variability.


Subjects
Alleles, Computational Biology/methods, Haplotypes, Heterozygote, Algorithms, Animals, Benchmarking, Butterflies, Caenorhabditis elegans, Contig Mapping, Genetic Variation, Humans, Poisson Distribution, Schistosoma japonicum, Software
17.
Gene ; 703: 35-49, 2019 Jun 30.
Article in English | MEDLINE | ID: mdl-30953708

ABSTRACT

The facultative air-breathing magur catfish (Clarias magur) frequently faces different environmental challenges, such as hyper-ammonia and desiccation stresses, in its natural habitats. All these stresses lead to higher accumulation of body ammonia, thereby causing various harmful effects to the fish due to its toxicity. Nonetheless, the mechanisms underlying ammonia-induced toxicity are not yet clear. In the present study, we used RNA sequencing and utilized a modified method for de novo assembly of the transcriptome to provide an exhaustive study on the transcriptomic alterations of magur catfish in response to high environmental ammonia (HEA; 25 mM NH4Cl). The final contig assembly produced a total of 311,076 unique transcripts (termed unigenes) with a GC content of 48.3% and an average length of 599 bp. A considerable number of SSR markers associated with these unigenes were also detected. A total of 279,156 transcripts were successfully annotated by using various databases. Comparative transcriptomic analysis revealed that a total of 3453 and 19,455 genes were differentially expressed in the liver and brain tissues, respectively, in ammonia-treated fish compared to the control. Enrichment analysis of the differentially expressed genes (DEGs) showed that several GO and KEGG pathway terms were significantly over-represented. Functional analysis of significantly elevated DEGs demonstrated that ammonia stress tolerance of the magur catfish was associated with quite a few pathways related to immune response, oxidative stress, and apoptosis, as well as a few transporter proteins involved in ammonia and urea transport. Both liver and brain tissues showed HEA-mediated oxidative damage with consequent activation of antioxidant machinery. However, elevated ROS levels led to an activation of inflammatory cytokines and thus innate immune response in the liver. Conversely, in the brain ROS-mediated irreversible cell damage activated apoptosis via both p53-Bax-Bcl2 and caspase-mediated pathways. The present study provides a novel understanding of the molecular responses of this air-breathing catfish against the ammonia-induced stressors, which could elucidate the underlying mechanisms of adaptation of this facultative air-breather living under various environmental constraints.


Subjects
Ammonia/toxicity , Catfishes/physiology , Fish Proteins/genetics , Gene Expression Profiling/methods , Adaptation, Physiological , Animals , Base Composition , Brain/drug effects , Brain/metabolism , Catfishes/genetics , Contig Mapping , Gene Expression Regulation/drug effects , Gene Regulatory Networks/drug effects , Liver/drug effects , Liver/metabolism , Reactive Oxygen Species/metabolism , Sequence Analysis, RNA/methods
18.
Mol Phylogenet Evol ; 135: 193-202, 2019 06.
Article in English | MEDLINE | ID: mdl-30914393

ABSTRACT

Holoparasitism has led to extreme plastome reduction. Plastomes in the legume holoparasite Pilostyles (Apodanthaceae) are the most reduced in both size and gene content known so far in Embryophytes. Here, we found that the plastome of Pilostyles boyacensis, the only American species sequenced so far, is reduced to seven functional genes: accD, rpl2, rrn16 (=16S), rrn23 (=23S), rps3, rps12 and a putative oxidoreductase (PbOx). An additional gene, not annotated in the genome, is actively transcribed between accD and rps12, and by synteny we predict that it corresponds to rps4. We present data on plastome assembly and transcriptomic data that confirm the transcriptional activity of all genes, and describe for the first time six transcript variants of a putative ORF likely having oxidoreductase activity. Our data show that such extreme reduction in P. boyacensis is similar but not identical to that reported in one Australian and one African species of the genus. Such intercontinental similarity suggests that the legume-Pilostyles holoparasitism was already in place during the main African-Australian-South American break-up. We compare plastome content and synteny between the three sequenced species, perform phylogenetic analyses across angiosperms of the six annotated plastome genes, and discuss the odd phylogenetic affinities of 16S and 23S, likely caused by HGT prior to the diversification of both legumes and Pilostyles.


Subjects
Genes, Plant , Genome, Plastid/genetics , Magnoliopsida/genetics , Africa , Amino Acid Sequence , Australia , Base Sequence , Contig Mapping , Molecular Sequence Annotation , Phylogeny , Synteny/genetics , Transcription, Genetic
19.
Nat Biotechnol ; 37(2): 186-192, 2019 02.
Article in English | MEDLINE | ID: mdl-30718869

ABSTRACT

Understanding gut microbiome functions requires cultivated bacteria for experimental validation and reference bacterial genome sequences to interpret metagenome datasets and guide functional analyses. We present the Human Gastrointestinal Bacteria Culture Collection (HBC), a comprehensive set of 737 whole-genome-sequenced bacterial isolates, representing 273 species (105 novel species) from 31 families found in the human gastrointestinal microbiota. The HBC increases the number of bacterial genomes derived from human gastrointestinal microbiota by 37%. The resulting global Human Gastrointestinal Bacteria Genome Collection (HGG) classifies 83% of genera by abundance across 13,490 shotgun-sequenced metagenomic samples, improves taxonomic classification by 61% compared to the Human Microbiome Project (HMP) genome collection and achieves subspecies-level classification for almost 50% of sequences. The improved resource of gastrointestinal bacterial reference sequences circumvents dependence on de novo assembly of metagenomes and enables accurate and cost-effective shotgun metagenomic analyses of human gastrointestinal microbiota.


Subjects
Genome, Bacterial , Metagenome , Metagenomics , Bacteria/classification , Computational Biology/methods , Contig Mapping , Gastrointestinal Microbiome , Genome, Human , Humans , Phylogeny , RNA, Ribosomal, 16S/metabolism , Sequence Analysis, DNA , Species Specificity
20.
Gene ; 691: 96-105, 2019 Apr 05.
Article in English | MEDLINE | ID: mdl-30630096

ABSTRACT

Vriesea carinata is an endemic bromeliad from the Brazilian Atlantic Forest. Its leaves have trichomes and a tank system that allow it to absorb water and nutrients. It belongs to the Bromeliaceae family, which includes several species highly enriched in cysteine-proteases (CysPs). These proteolytic enzymes regulate processes such as senescence, cell differentiation, pathogen-linked programmed cell death and mobilization of proteins. Despite their biological importance, there are no genomic resources for V. carinata that can help to identify and understand the molecular mechanisms involved in different biological processes. Thus, high-throughput transcriptome sequencing of V. carinata is necessary to generate sequences for the purpose of gene discovery and functional genomic studies. In the present study, we sequenced and assembled the V. carinata transcriptome to identify CysPs. A total of 43,232 contigs were assembled for the leaf tissue. BLAST analysis indicated that 23,803 contigs exhibited similarity to non-redundant Viridiplantae proteins. Of the contigs, 28.24% were classified into the COG database, and gene ontology categorized them into 61 functional groups. A metabolic pathway analysis with KEGG revealed 9679 contigs assigned to 31 metabolic pathways. Among 16 full-length CysPs identified, 11 were evaluated with respect to their expression patterns in the leaf apex, base and inflorescence tissues. The results showed differential expression levels of legumain, metacaspase, pyroglutamyl and papain-like CysPs depending on the leaf region. These results provide a global overview of V. carinata gene functions and expression activities of CysPs in those tissues.


Subjects
Bromeliaceae/genetics , Contig Mapping/methods , Cysteine Proteases/genetics , Gene Expression Profiling/methods , Gene Expression Regulation, Plant , High-Throughput Nucleotide Sequencing , Metabolic Networks and Pathways , Molecular Sequence Annotation , Multigene Family , Plant Leaves/genetics , Plant Proteins/genetics , Sequence Analysis, RNA