ABSTRACT
In the present study, single-molecule real-time sequencing technology was used to obtain a validated set of microsatellite markers for application in population genetics of the primitive fish Chitala chitala. Assembly of circular consensus sequencing reads resulted in 1164 sequences containing 2005 repetitive motifs. A total of 100 sequences were used for primer design, and amplification yielded a set of 28 validated polymorphic markers. These loci were used to genotype 72 samples from three distant riverine populations of India, namely Son, Satluj and Brahmaputra, to determine intraspecific genetic variation. The microsatellite loci exhibited a high level of polymorphism, with PIC values ranging from 0.281 to 0.901. The genetic parameters revealed that mean heterozygosity ranged from 0.6802 to 0.6826 and that the populations were genetically diverse (Fst 0.03-0.06). This indicates the potential of this microsatellite marker set for stock characterization of C. chitala in the wild. The newly developed loci were also assayed for cross-transferability in another notopterid fish, Notopterus notopterus.
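As a rough illustration of the per-locus statistics reported above, the sketch below computes expected heterozygosity and the polymorphism information content (PIC) from allele frequencies. It assumes the widely used Botstein-style PIC definition; the abstract does not state which formula or software the authors used, and the function names and example frequencies are hypothetical.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2) over the allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over allele pairs i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical locus with four alleles; frequencies must sum to 1.
example_freqs = [0.4, 0.3, 0.2, 0.1]
locus_he = expected_heterozygosity(example_freqs)
locus_pic = pic(example_freqs)
```

PIC is never larger than expected heterozygosity for the same locus, because it subtracts a further term for matings that are uninformative despite being heterozygous.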
Subject(s)
Fishes/genetics, High-Throughput Nucleotide Sequencing/methods, Microsatellite Repeats/genetics, Animals, Genetic Variation/genetics, Population Genetics/methods, Genotype, India, Genetic Polymorphism/genetics, DNA Sequence Analysis/methods
ABSTRACT
BACKGROUND: Interferon inducible transmembrane (IFITM) proteins are effectors of the immune system widely characterized for their role in restricting infection by diverse enveloped and non-enveloped viruses. The chicken IFITM (chIFITM) genes are clustered on chromosome 5 and to date four genes have been annotated, namely chIFITM1, chIFITM3, chIFITM5 and chIFITM10. However, due to poor assembly of this locus in the Gallus gallus v4 genome, accurate characterization has so far proven problematic. Recently, a new chicken reference genome assembly, Gallus gallus v5, was generated using Sanger, 454, Illumina and PacBio sequencing technologies, revealing considerable differences in the chIFITM locus compared with previous genome releases. METHODS: We re-sequenced the locus using both Illumina MiSeq and PacBio RS II sequencing technologies and mapped RNA-seq data from the European Nucleotide Archive (ENA) to this finalized chIFITM locus. Using SureSelect capture probes designed to the finalized chIFITM locus, we sequenced the locus of a different chicken breed, namely a White Leghorn, and of a turkey. RESULTS: We confirmed the Gallus gallus v5 consensus except for two insertions of 5 and 1 base pair within the chIFITM3 and B4GALNT4 genes, respectively, and a single base pair deletion within the B4GALNT4 gene. The pull-down revealed a single amino acid substitution of A63V in the CIL domain of IFITM2 compared to Red Jungle fowl and 13, 13 and 11 differences between IFITM1, 2 and 3 of chickens and turkeys, respectively. RNA-seq shows chIFITM2 and chIFITM3 expression in numerous tissue types of different chicken breeds and avian cell lines, while the expression of the putative chIFITM1 is limited to the testis, caecum and ileum tissues. CONCLUSIONS: Locus resequencing using these capture probes and RNA-seq based expression analysis will allow the further characterization of genetic diversity within Galliformes.
Subject(s)
Galliformes/genetics, Genetic Loci/genetics, Genetic Variation, RNA Sequence Analysis, Animals
ABSTRACT
Among the biotic factors that affect the productivity and quality of sugarcane, red rot disease, caused by the fungal pathogen Colletotrichum falcatum, is the most devastating, causing enormous losses to millers as well as cane growers. We present a highly contiguous genome assembly of C. falcatum pathotype Cf08, which is virulent to popular sugarcane varieties grown on more than 3 million hectares in sub-tropical India. By long-read sequencing on the PacBio RSII system, a 56.06 Mb assembly of 238 contigs, with an N50 of 0.51 Mb and an L50 of 34, was produced. The assembly achieved a BUSCO completeness score of 97.24% (including 4.1% fragmented) for the C. falcatum Cf08 nuclear genome, with greatly improved contiguity compared with an existing highly fragmented draft of the C. falcatum Cf671 genome (48.13 Mb). The Cf08 assembly had 54.14% GC content and possessed < 1% repetitive elements. A total of 18,635 protein-coding genes were predicted, compared with 12,270 for Cf671. Among 617 CAZymes predicted, glycoside hydrolases were predominant (298), and among 7264 genes associated with pathogenicity/virulence, 77 genes having effector functions were identified. The assembled genome showed similarity to the genomes of C. graminicola and C. higginsianum, the causal organisms of anthracnose in maize and in members of Brassicaceae, respectively. A total of 94 large sequences (> 100 kb) of Cf08 were mapped onto 10 of the 12 C. higginsianum chromosomes with 106 synteny blocks. The results discussed here provide an important tool for future studies of evolutionary and functional genomics in C. falcatum. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s13205-021-02695-x.
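The contiguity statistics quoted above (N50 and L50) follow a standard definition that can be sketched in a few lines. This is a minimal illustration on hypothetical contig lengths, not the assembler output or tooling used by the authors.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths.

    Contigs are sorted from longest to shortest and accumulated until the
    running total reaches at least half of the assembly size. N50 is the
    length of the contig that crosses that threshold and L50 is the number
    of contigs needed to get there.
    """
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("contig_lengths must contain at least one contig")


# Hypothetical toy assembly (lengths in bp), for illustration only.
toy_n50, toy_l50 = n50_l50([10, 8, 6, 4, 2])
```

A larger N50 together with a smaller L50 indicates that fewer, longer contigs cover half of the assembly, which is the sense in which the abstract describes the assembly as highly contiguous.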
ABSTRACT
Erysipelothrix rhusiopathiae is a common pathogen responsible for pig erysipelas. However, the molecular basis for the pathogenesis of E. rhusiopathiae remains to be elucidated. In this study, the complete genome sequence of E. rhusiopathiae strain WH13013, a pathogenic isolate from a diseased pig, was generated using a combined strategy of PacBio RSII and Illumina sequencing technologies. The strategy generated a single circular chromosome of approximately 1.78 Mb for the complete genome of WH13013, with an average GC content of 36.49%. The genome of WH13013 encoded 1633 predicted proteins, 55 tRNAs and 15 rRNAs. It contained four genomic islands, within which several resistance-associated genes were identified. Phylogenetic analysis revealed that WH13013 was closely related to many other sequenced virulent E. rhusiopathiae strains. A comprehensive comparative analysis of eight virulent E. rhusiopathiae strains, including WH13013, identified a total of 1184 core genes. A large proportion (approximately 75.31%) of these core genes participated in nutrient and energy uptake and metabolism, as well as other activities necessary for bacterial survival and adaptation. The core genes also included those encoding proteins that participate in the biosynthesis of, or are components of, the proposed virulence factors of E. rhusiopathiae, including the capsule (cpsA, cpsB, cpsC), neuraminidase (nanH), hyaluronidase (hylA, hylB, hylC), and surface proteins (spaA, rspA, rspB). The complete genome sequence of this virulent strain, WH13013, and this comprehensive comparative genome analysis will help in further studies of the genetic basis of the pathogenesis of E. rhusiopathiae.
ABSTRACT
Single-molecule real-time (SMRT) sequencing, also called third-generation sequencing, is a novel sequencing technique capable of generating extremely long contiguous sequence reads. While conventional short-read sequencing cannot evaluate the linkage of nucleotide substitutions distant from one another, SMRT sequencing can directly demonstrate linkage of nucleotide changes over a span of more than 20 kbp, and thus can be applied to directly examine the haplotypes of viruses or bacteria whose genome structures are changing in real time. In addition, an error correction method (circular consensus sequencing) has been established, in which repeated sequencing of a single-molecule DNA template can result in extremely high accuracy. The advantages of long-read sequencing enable accurate determination of the haplotypes of individual viral clones. SMRT sequencing has been applied in various studies of viral genomes, including determination of the full-length contiguous genome sequence of hepatitis C virus (HCV), targeted deep sequencing of the HCV NS5A gene, and assessment of heterogeneity among viral populations. Recently, the emergence of multidrug-resistant HCV has become a significant clinical issue and has also been demonstrated using SMRT sequencing. In this review, we introduce the novel third-generation PacBio RSII/Sequel systems, compare them with conventional next-generation sequencers, and summarize previous studies in which SMRT sequencing technology has been applied for HCV genome analysis. We also refer to another long-read sequencing platform, nanopore sequencing technology, and discuss the advantages, limitations and future perspectives of using these third-generation sequencers for HCV genome analysis.
Subject(s)
Viral Genome/genetics, Hepacivirus/genetics, Hepatitis C/diagnosis, High-Throughput Nucleotide Sequencing/instrumentation, Antiviral Agents/pharmacology, Antiviral Agents/therapeutic use, Multiple Viral Drug Resistance/genetics, Hepacivirus/isolation & purification, Hepatitis C/drug therapy, Hepatitis C/virology, High-Throughput Nucleotide Sequencing/methods, Humans, Viral RNA/genetics
ABSTRACT
Conventional mitochondrial-DNA (MT DNA) sequencing approaches use Sanger sequencing of 20-40 partially overlapping PCR fragments per individual, which is a time- and resource-consuming process. We have developed a high-throughput, accurate, fast, and cost-effective human MT DNA sequencing approach. In this setup we first generate long-range PCR products for two partially overlapping 7.7 and 9.2 kb MT DNA-specific amplicons, add sample-specific barcodes, and sequence these on the PacBio RSII system to obtain full-length MT DNA sequences for genotyping/haplotyping purposes.
Subject(s)
Mitochondrial DNA/genetics, Polymerase Chain Reaction/methods, DNA Sequence Analysis/methods, Humans
ABSTRACT
This is a de novo assembly and annotation of a complete mitochondrial genome from Pyrus pyrifolia in the family Rosaceae. The complete mitochondrial genome of P. pyrifolia was assembled from PacBio RSII P6-C4 sequencing reads. The circular genome was 458,873 bp in length, containing 39 protein-coding genes, 23 tRNA genes and three rRNA genes. The nucleotide composition was A (27.5%), T (27.3%), G (22.6%) and C (22.6%), with a GC content of 45.2%. Most protein-coding genes use the canonical start codon ATG, whereas nad1, cox1, matR and rps4 use ACG, mttB uses ATT, and rpl16 and rps19 use GTG. The stop codon is also common in all mitochondrial genes. The phylogenetic analysis showed that P. pyrifolia clustered with Malus in the family Rosaceae. Maximum-likelihood analysis suggests a clear relationship between Rosids and Asterids, which supports the traditional classification.
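The nucleotide composition and GC content reported above are simple per-base percentages. The sketch below shows one way to derive them from a sequence string; the function names and the toy sequence are hypothetical, and bases other than A, C, G and T are assumed to be excluded from the denominator.

```python
from collections import Counter


def base_composition(sequence):
    """Return the percentage of A, C, G and T among the unambiguous bases."""
    counts = Counter(sequence.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def gc_content(sequence):
    """GC content is the sum of the G and C percentages."""
    composition = base_composition(sequence)
    return composition["G"] + composition["C"]


# Toy sequence for illustration only; not taken from the P. pyrifolia genome.
toy_gc = gc_content("ATGCGCAT")

# The reported figures are internally consistent: G (22.6%) + C (22.6%) = 45.2%.
reported_gc = 22.6 + 22.6
```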
ABSTRACT
BACKGROUND: Cystic fibrosis (CF) is an autosomal recessive disease characterized by recurrent lung infections. Studies of the lung microbiome have shown an association between decreasing diversity and progressive disease. 454 pyrosequencing has frequently been used to study the lung microbiome in CF, but will no longer be supported. We sought to identify the benefits and drawbacks of using two state-of-the-art next generation sequencing (NGS) platforms, MiSeq and PacBio RSII, to characterize the CF lung microbiome. Each has its advantages and limitations. METHODS: Twelve samples of extracted bacterial DNA were sequenced on both the MiSeq and PacBio NGS platforms. DNA was amplified for the V4 region of the 16S rRNA gene and libraries were sequenced on the MiSeq platform, while the full 16S rRNA gene was sequenced on the PacBio RSII platform. Raw FASTQ files generated by the MiSeq and PacBio platforms were processed in mothur v1.35.1. RESULTS: There was extreme discordance in alpha-diversity of the CF lung microbiome between the two platforms. Because of its depth of coverage, sequencing of the V4 region of the 16S rRNA gene using MiSeq allowed for the observation of many more operational taxonomic units (OTUs) and higher Chao1 and Shannon indices than the PacBio RSII. Interestingly, several patients in our cohort had Escherichia, an unusual pathogen in CF. Also, likely because of its coverage of the complete 16S rRNA gene, only PacBio RSII was able to identify Burkholderia, an important CF pathogen. CONCLUSION: When comparing microbiome diversity in clinical samples from CF patients using 16S sequences, the MiSeq and PacBio NGS platforms may generate different results in microbial community composition and structure. It may be necessary to use different platforms when trying to correctly identify dominant pathogens versus measuring alpha-diversity estimates, and it would be important to use the same platform for comparisons to minimize errors in interpretation.
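For readers unfamiliar with the alpha-diversity measures named above, the sketch below computes the Shannon index and a Chao1 richness estimate from a vector of OTU counts. It assumes the natural-log Shannon index and the bias-corrected form of Chao1; the study itself used mothur, whose exact implementation is not reproduced here, and the example counts are hypothetical.

```python
import math


def shannon_index(otu_counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over OTUs with non-zero counts."""
    present = [count for count in otu_counts if count > 0]
    total = sum(present)
    return -sum((count / total) * math.log(count / total) for count in present)


def chao1(otu_counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs observed exactly once and exactly twice.
    """
    present = [count for count in otu_counts if count > 0]
    singletons = sum(1 for count in present if count == 1)
    doubletons = sum(1 for count in present if count == 2)
    return len(present) + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical OTU table row for one sample (reads per OTU).
sample_counts = [10, 5, 1, 1, 2, 2, 0]
sample_shannon = shannon_index(sample_counts)
sample_chao1 = chao1(sample_counts)
```

Because Chao1 depends on the number of rare OTUs, a platform that sequences more deeply can report more singletons and doubletons and hence a different richness estimate for the same sample, which is one reason the abstract recommends comparing like with like.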
Subject(s)
Bacteria/classification, Bacteria/genetics, Cystic Fibrosis/microbiology, High-Throughput Nucleotide Sequencing/methods, Lung/microbiology, Microbiota/genetics, Sputum/microbiology, Bacteria/pathogenicity, Base Sequence, Biodiversity, Classification, Computational Biology/methods, Bacterial DNA/genetics, Humans, Metagenome, Phylogeny, 16S Ribosomal RNA/genetics
ABSTRACT
INTRODUCTION: We present here a simple, phenotype-independent mutation assay using a PacBio RSII DNA sequencer employing single-molecule real-time (SMRT) sequencing technology. Salmonella typhimurium YG7108 was treated with the alkylating agent N-ethyl-N-nitrosourea (ENU) and grown through several generations to fix the induced mutations; the DNA was then extracted and the mutations were analyzed using the SMRT DNA sequencer. RESULTS: The ENU-induced base-substitution frequency was 15.4 per megabase pair, which is highly consistent with our previous results based on colony isolation and next-generation sequencing. The induced mutation spectrum (95% G:C → A:T, 5% A:T → G:C) is also consistent with the known ENU signature. The base-substitution frequency of the control was calculated to be less than 0.12 per megabase pair. A current limitation of the approach is the high frequency of artifactual insertion and deletion mutations it detects. CONCLUSIONS: Ultra-low frequency base-substitution mutations can be detected directly using the SMRT DNA sequencer, and this technology provides a phenotype-independent mutation assay.
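The frequencies above are expressed per megabase pair, and the spectrum as percentages of substitution classes. The sketch below shows that arithmetic under simple assumptions: the counts and the number of bases examined are hypothetical placeholders chosen only to illustrate the calculation, not data from the study.

```python
from collections import Counter


def substitutions_per_megabase(n_substitutions, bases_examined):
    """Base-substitution frequency scaled to events per 1,000,000 bases."""
    if bases_examined <= 0:
        raise ValueError("bases_examined must be positive")
    return n_substitutions / bases_examined * 1_000_000


def mutation_spectrum(substitution_calls):
    """Percentage of each substitution class among all observed substitutions."""
    counts = Counter(substitution_calls)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no substitutions observed")
    return {kind: 100.0 * n / total for kind, n in counts.items()}


# Hypothetical numbers: 154 substitutions observed in 10 Mb of consensus sequence.
example_frequency = substitutions_per_megabase(154, 10_000_000)

# Hypothetical calls: 19 G:C>A:T transitions and 1 A:T>G:C transition.
example_spectrum = mutation_spectrum(["G:C>A:T"] * 19 + ["A:T>G:C"])
```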