Results 1 - 19 of 19
1.
Vet Sci ; 9(10)2022 Oct 19.
Article in English | MEDLINE | ID: mdl-36288192

ABSTRACT

Avian metapneumoviruses (aMPV subtypes A-D) are respiratory and reproductive pathogens of poultry. Since aMPV-A was initially reported in Mexico in 2014, there have been no additional reports of its detection in the country. Using nontargeted next-generation sequencing (NGS) of FTA card-spotted respiratory samples from commercial chickens in Mexico, seven full genome sequences of aMPV-A (lengths of 13,288-13,381 nucleotides) were de novo assembled. Additionally, complete coding sequences of genes N (n = 2), P and M (n = 7 each), F and L (n = 1 each), M2 (n = 6), SH (n = 5) and G (n = 2) were reference-based assembled from another seven samples. The Mexican isolates phylogenetically group with, but in a distinct clade separate from, other aMPV-A strains. The genome and G-gene nt sequences of the Mexican aMPVs are closest to strain UK/8544/06 (97.22-97.47% and 95.07-95.83%, respectively). Various amino acid variations distinguish the Mexican isolates from each other, and other aMPV-A strains, most of which are in the G (n = 38), F (n = 12), and L (n = 19) proteins. Using our sequence data and publicly available aMPV-A data, we revised a previously published rRT-PCR test, which resulted in different cycling and amplification conditions for aMPV-A to make it more compatible with other commonly used rRT-PCR diagnostic cycling conditions. This is the first comprehensive sequence analysis of aMPVs in Mexico and demonstrates the value of nontargeted NGS to identify pathogens where targeted virus surveillance is likely not routinely performed.

2.
Plants (Basel) ; 11(14)2022 Jul 07.
Article in English | MEDLINE | ID: mdl-35890432

ABSTRACT

Soursop (Annona muricata L.) is a climacteric fruit with a short ripening period and postharvest shelf life, leading to rapid softening. In this study, transcriptome analysis of soursop fruits was performed to identify key gene families involved in ripening under postharvest storage conditions (Day 0, Day 3 stored at 28 ± 2 °C, Day 6 at 28 ± 2 °C, Day 3 at 15 ± 2 °C, Day 6 at 15 ± 2 °C, Day 9 at 15 ± 2 °C). The transcriptome analysis showed 224,074 assembled transcripts clustering into 95,832 unigenes, of which 21,494 had an ORF. RNA-seq analysis showed the highest number of differentially expressed genes on Day 9 at 15 ± 2 °C with 9291 genes (4772 up-regulated and 4519 down-regulated), recording the highest logarithmic fold change in pectin-related genes. Enrichment analysis presented significantly represented GO terms and KEGG pathways associated with molecular function, metabolic process, catalytic activity, biological process terms, as well as biosynthesis of secondary metabolites, plant hormone signal, starch and sucrose metabolism, plant-pathogen interaction, plant-hormone signal transduction, and MAPK-signaling pathways, among others. Network analysis revealed that pectinesterase genes directly regulate the loss of firmness in fruits stored at 15 ± 2 °C.

3.
Front Plant Sci ; 12: 697556, 2021.
Article in English | MEDLINE | ID: mdl-34490003

ABSTRACT

Melocactus glaucescens is an endangered cactus highly valued for its ornamental properties. In vitro shoot production of this species provides a sustainable alternative to overharvesting from the wild; however, its propagation could be improved if the genetic regulation underlying its developmental processes were known. The present study generated de novo transcriptome data, describing in vitro shoot organogenesis induction in M. glaucescens. Total RNA was extracted from explants before (control) and after shoot organogenesis induction (treated). A total of 14,478 unigenes (average length, 520 bases) were obtained using Illumina HiSeq 3000 (Illumina Inc., San Diego, CA, USA) sequencing and transcriptome assembly. Filtering for differential expression yielded 2,058 unigenes. Pairwise comparison of treated vs. control genes revealed that 1,241 (60.3%) unigenes exhibited no significant change, 226 (11%) were downregulated, and 591 (28.7%) were upregulated. Based on database analysis, more transcription factor families and unigenes appeared to be upregulated in the treated samples than in controls. Expression of WOUND INDUCED DEDIFFERENTIATION 1 (WIND1) and CALMODULIN (CaM) genes, both of which were upregulated in treated samples, was further validated by real-time quantitative PCR (RT-qPCR). Differences in gene expression patterns between control and treated samples indicate substantial changes in the primary and secondary metabolism of M. glaucescens after the induction of shoot organogenesis. These results help to clarify the molecular genetics and functional genomic aspects underlying propagation in the Cactaceae family.

4.
Front Plant Sci ; 12: 667060, 2021.
Article in English | MEDLINE | ID: mdl-33968119

ABSTRACT

Plukenetia volubilis L. (Malpighiales: Euphorbiaceae), also known as Sacha inchi, is considered a promising crop due to its high seed content of unsaturated fatty acids (UFAs), all of them highly valuable for food and cosmetic industries, but the genetic basis of oil biosynthesis of this non-model plant is still insufficiently understood. Here, we sequenced the total DNA of Sacha inchi by using Illumina and Nanopore technologies and approached a de novo reconstruction of the whole nucleotide sequence and the organization of its 164,111 bp chloroplast genome, displaying two copies of an inverted repeat sequence [inverted repeat A (IRA) and inverted repeat B (IRB)] of 28,209 bp each, separating a small single copy (SSC) region of 17,860 bp and a large single copy (LSC) region of 89,833 bp. We detected two large inversions in the chloroplast genome that were not present in the previously reported sequence and studied a promising cpDNA marker, useful in phylogenetic approaches. This chloroplast DNA (cpDNA) marker was used on a set of five distinct Colombian cultivars of P. volubilis from different geographical locations to reveal their phylogenetic relationships. Thus, we evaluated whether it has enough resolution to genotype cultivars, intending to crossbreed parents and follow the marker's trace down to the F1 generation. We finally elucidated, by using molecular and cytological methods on cut flower buds, that the inheritance mode of P. volubilis cpDNA is maternal and proposed that this occurs because it is physically excluded during pollen development. This de novo chloroplast genome will provide a valuable resource for studying this promising crop, allowing the determination of the organellar inheritance mechanism of some critical phenotypic traits and enabling the use of genetic engineering in breeding programs to develop new varieties.
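
As a quick consistency check on the region lengths reported in this abstract, a quadripartite plastome should add up to LSC + SSC + two inverted-repeat copies. The short Python sketch below illustrates that arithmetic; the function name `plastome_length` is hypothetical and only the four lengths come from the abstract.

```python
# Region lengths (bp) as reported in the abstract above.
LSC = 89_833   # large single copy
SSC = 17_860   # small single copy
IR = 28_209    # one inverted repeat copy (IRA and IRB have the same length)


def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


total = plastome_length(LSC, SSC, IR)
print(total)  # 164111, matching the reported 164,111 bp
```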

5.
Zoology (Jena) ; 146: 125923, 2021 06.
Article in English | MEDLINE | ID: mdl-33901836

ABSTRACT

Silks produced by webspinners (Order Embioptera) interact with water by transforming from fiber to film, which then becomes slippery and capable of shedding water. We chose to explore this mechanism by analyzing and comparing the silk protein transcripts of two species with overlapping distributions in Trinidad but from different taxonomic families. The transcript of one, Antipaluria urichi (Clothodidae), was partially characterized in 2009 providing a control for our methods to characterize a second species: Pararhagadochir trinitatis (Scelembiidae), a family that adds to the taxon sampling for this little known order of insects. Previous reports showed that embiopteran silk protein (dubbed Efibroin) consists of a protein core of repetitive motifs largely composed of glycine (Gly), serine (Ser), and alanine (Ala) and a highly conserved C-terminal region. Based on mRNA extracted from silk glands, Next Generation sequencing, and de novo assembly, P. trinitatis silk can be characterized by repetitive motifs of Gly-Ser followed periodically by Gly-Asparagine (Asn, an unusual amino acid for Efibroins) and by a lack of Ala which is otherwise common in Efibroins. The putative N-terminal domain, composed mostly of polar, charged and bulky amino acids, is ten amino acids long with cysteine in the 10th position, a feature likely related to stabilization of the silk fibers. The 29 amino acids of the C-terminus for P. trinitatis silk closely resemble that of other Efibroin sequences, which show 74% shared identity on average. Examination of hydropathicity of Efibroins of both P. trinitatis and An. urichi revealed that these proteins are largely hydrophilic despite having a thin lipid coating on each nano-fiber. We deduced that the hydrophilic quality differs for the two species: due to Ser and Asn for P. trinitatis silk and to previously undetected spacers in An. urichi silk.
Spacers are known from some spider and silkworm silks but this is the first report of such for Embioptera. Analysis of hydropathicity revealed the largely hydrophilic quality of these silks and this feature likely explains why water causes the transformation from fiber to film. We compared spun silk to the transcript and detected not insignificant differences between the two measurements, implying that as yet undetermined post-translational modifications of their silk may occur. In addition, we found evidence for codon bias in the nucleotides of the putative silk transcript for P. trinitatis, a feature also known for other embiopteran silk genes.


Subject(s)
Insecta/physiology , Silk/chemistry , Amino Acid Sequence , Animals , Ecosystem , Silk/physiology , Species Specificity , Trinidad and Tobago
6.
Heliyon ; 6(8): e04518, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32817888

ABSTRACT

Raspberry (Rubus sp.) is a berry fruit with an ongoing agricultural and commercial interest due to its high contents of flavonoids and nutrients beneficial for human health. The growing demand for raspberries is facing great challenges associated mainly with the dispersal of diseases, which produces a decrease in productivity and fruit quality. A broad range of genomic resources is available for other Rosaceae species; however, genomic resources for species of the Rubus genus are still limited. Here, we characterize the transcriptome of Rubus idaeus (Var. Amira) in order to 1) provide clues about the transcriptional changes of R. idaeus against tomato ringspot virus (ToRSV); and 2) generate genomic resources for this economically important species. We generate more than 200 million sequencing reads from two mRNA samples of raspberry, infected and not infected by ToRSV, using Illumina technology. After de novo assembly, we obtained 68,853 predicted protein-coding sequences of which 71.3% and 61.3% were annotated using Gene Ontology and Pfam databases, respectively. Moreover, we find 2,340 genes with differential expression between raspberries infected and not infected by ToRSV. Analysis of these genes shows functional enrichments of the oxidation-reduction process, cell wall biogenesis, terpene synthase activity, and lyase activity. These genes could be involved in the raspberry immune response through the interaction of different metabolic pathways; however, this statement needs further investigation. Up-regulation of genes encoding terpene synthases, multicopper oxidases, laccases, and beta-glucosidases suggests that these enzymes may be the predominant transcriptome immune response of R. idaeus against ToRSV. Furthermore, we identify thousands of molecular markers (i.e., SSRs and SNPs), increasing considerably the genomic resources currently available for raspberries. This study is the first report investigating the transcriptional changes of R. idaeus against ToRSV.

7.
Front Plant Sci ; 11: 729, 2020.
Article in English | MEDLINE | ID: mdl-32636853

ABSTRACT

Chloroplast genomes (plastomes) are frequently treated as highly conserved among land plants. However, many lineages of vascular plants have experienced extensive structural rearrangements, including inversions and modifications to the size and content of genes. Cacti are one of these lineages, containing the smallest plastome known for an obligately photosynthetic angiosperm, including the loss of one copy of the inverted repeat (∼25 kb) and the ndh gene suite, but only a few cacti from the subfamily Cactoideae have been sufficiently characterized. Here, we investigated the variation of plastome sequences across the second-major lineage of the Cactaceae, the subfamily Opuntioideae, to address (1) how variable is the content and arrangement of chloroplast genome sequences across the subfamily, and (2) how phylogenetically informative are the plastome sequences for resolving major relationships among the clades of Opuntioideae. Our de novo assembly of the Opuntia quimilo plastome recovered an organelle of 150,347 bp in length with both copies of the inverted repeat and the presence of all the ndh gene suite. An expansion of the large single copy unit and a reduction of the small single copy unit were observed, including translocations and inversion of genes, as well as the putative pseudogenization of some loci. Comparative analyses among all clades within Opuntioideae suggested that plastome structure and content vary across taxa of this subfamily, with putative independent losses of the ndh gene suite and pseudogenization of genes across disparate lineages, further demonstrating the dynamic nature of plastomes in Cactaceae. Our plastome dataset was robust in resolving three tribes with high support within Opuntioideae: Cylindropuntieae, Tephrocacteae and Opuntieae. However, conflicting topologies were recovered among major clades when exploring different assemblies of markers.
A plastome-wide survey for highly informative phylogenetic markers revealed previously unused regions for future use in Sanger-based studies, presenting a valuable dataset with primers designed for continued evolutionary studies across Cactaceae. These results bring new insights into the evolution of plastomes in cacti, suggesting that further analyses should be carried out to address how ecological drivers, physiological constraints and morphological traits of cacti may be related to the common rearrangements in plastomes that have been reported across the family.

8.
BMC Bioinformatics ; 21(1): 293, 2020 Jul 08.
Article in English | MEDLINE | ID: mdl-32640978

ABSTRACT

BACKGROUND: Spliced Leader trans-splicing is an important mechanism for the maturation of mRNAs in several lineages of eukaryotes, including several groups of parasites of great medical and economic importance. Nevertheless, its study across the tree of life is severely hindered by the problem of identifying the SL sequences that are being trans-spliced. RESULTS: In this paper we present SLFinder, a four-step pipeline meant to identify de novo candidate SL sequences making very few assumptions regarding the SL sequence properties. The pipeline takes transcriptomic de novo assemblies and a reference genome as input and allows user intervention at several points to account for unexpected features of the dataset. The strategy and its implementation were tested on real RNAseq data from species with and without SL trans-splicing. CONCLUSIONS: SLFinder is capable of identifying SL candidates with good precision in a reasonable amount of time. It is especially suitable for species with unknown SL sequences, generating candidate sequences for further refining and experimental validation.


Subject(s)
RNA, Spliced Leader/chemistry , Software , Trans-Splicing , Animals , Genomics , Mice , RNA-Seq
9.
Front Genet ; 11: 604, 2020.
Article in English | MEDLINE | ID: mdl-32582300

ABSTRACT

Pacu (Piaractus mesopotamicus) is a Neotropical fish of major importance for South American aquaculture. Septicemia caused by Aeromonas hydrophila bacteria is currently considered a substantial threat to pacu aquaculture and has provoked infectious disease outbreaks with high economic losses. The understanding of the molecular aspects of the progress of A. hydrophila infection and of the pacu immune response is scarce, which has limited the development of genomic selection for resistance to this infection. The present study aimed to generate information on the transcriptome of pacu in the face of A. hydrophila infection, and to compare the transcriptomic responses between two time-series groups of individuals from a disease resistance challenge: the peak mortality (HM) and mortality plateau (PM) groups. Nine RNA sequencing (RNA-Seq) libraries were prepared from liver tissue of challenged individuals, generating ∼160 million 150 bp paired-end reads. After quality trimming/cleanup, these reads were assembled de novo, generating 211,259 contigs. When the expression of genes from individuals of the HM group was compared to that of individuals from the control group, a total of 4,413 differentially expressed transcripts were found (2,000 upregulated and 2,413 downregulated candidate genes). Additionally, 433 transcripts were differentially expressed when individuals from the PM group were compared with those in the control group (155 upregulated and 278 downregulated candidate genes). The resulting differentially expressed transcripts were clustered into the following functional categories: cytokines and signaling, epithelial protection, antigen processing and presentation, apoptosis, phagocytosis, complement system cascades and pattern recognition receptors. The results reveal relevant differential gene expression in the HM and PM groups, which will contribute to a better understanding of the molecular defense mechanisms during A. hydrophila infection.

10.
BMC Genomics ; 21(1): 148, 2020 Feb 11.
Article in English | MEDLINE | ID: mdl-32046653

ABSTRACT

BACKGROUND: RNA-Seq is the preferred method to explore transcriptomes and to estimate differential gene expression. When an organism has a well-characterized and annotated genome, reads obtained from RNA-Seq experiments can be directly mapped to that genome to estimate the number of transcripts present and relative expression levels of these transcripts. However, for unknown genomes, de novo assembly of RNA-Seq reads must be performed to generate a set of contigs that represents the transcriptome. These contig sets contain multiple transcripts, including immature mRNAs, spliced transcripts and allele variants, as well as products of close paralogs or gene families that can be difficult to distinguish. Thus, tools are needed to select a set of less redundant contigs to represent the transcriptome for downstream analyses. Here we describe the development of Compacta to produce contig sets from de novo assemblies. RESULTS: Compacta is a fast and flexible computational tool that allows selection of a representative set of contigs from de novo assemblies. Using a graph-based algorithm, Compacta groups contigs into clusters based on the proportion of shared reads. The user can determine the minimum coverage of the contigs to be clustered, as well as a threshold for the proportion of shared reads in the clustered contigs, thus providing a dynamic range of transcriptome compression that can be adapted according to experimental aims. We compared the performance of Compacta against state of the art clustering algorithms on assemblies from Arabidopsis, mouse and mango, and found that Compacta yielded more rapid results and had competitive precision and recall ratios. We describe and demonstrate a pipeline to tailor Compacta parameters to specific experimental aims. CONCLUSIONS: Compacta is a fast and flexible algorithm for the determination of optimum contig sets that represent the transcriptome for downstream analyses.
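
The idea of grouping contigs by the proportion of reads they share can be illustrated with a minimal Python sketch. This is not Compacta's implementation: the function `cluster_contigs`, its parameters `min_shared` and `min_reads`, and the choice to measure sharing relative to the smaller contig are all assumptions made here for illustration. Contigs are linked when their shared-read proportion reaches the threshold, and clusters are the connected components of the resulting graph.

```python
from itertools import combinations


def cluster_contigs(reads_by_contig, min_shared=0.5, min_reads=1):
    """Group contigs whose shared-read proportion is >= min_shared.

    reads_by_contig maps a contig name to the set of read IDs mapped to it.
    Contigs with fewer than min_reads reads are dropped (a stand-in for a
    minimum-coverage filter). Sharing is measured against the smaller contig,
    which is an assumption of this sketch.
    """
    kept = {c: r for c, r in reads_by_contig.items() if len(r) >= min_reads}
    parent = {c: c for c in kept}

    def find(c):
        # Union-find root lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in combinations(kept, 2):
        shared = len(kept[a] & kept[b])
        smaller = min(len(kept[a]), len(kept[b]))
        if smaller and shared / smaller >= min_shared:
            parent[find(a)] = find(b)  # link the two components

    clusters = {}
    for c in kept:
        clusters.setdefault(find(c), []).append(c)
    return sorted(sorted(members) for members in clusters.values())


# Toy data: c1 and c2 share two of c2's three reads; c3 shares none.
reads = {"c1": {1, 2, 3, 4}, "c2": {3, 4, 5}, "c3": {9, 10}}
print(cluster_contigs(reads))                  # [['c1', 'c2'], ['c3']]
print(cluster_contigs(reads, min_shared=0.9))  # [['c1'], ['c2'], ['c3']]
```

Raising the threshold splits clusters and lowering it merges them, which is the kind of adjustable compression the abstract describes.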


Subject(s)
Contig Mapping/methods , RNA-Seq/methods , Software , Algorithms , Arabidopsis/genetics , Cluster Analysis
11.
Brief Bioinform ; 20(6): 2116-2129, 2019 11 27.
Article in English | MEDLINE | ID: mdl-30137230

ABSTRACT

MOTIVATION: With the recent advances in DNA sequencing technologies, the study of the genetic composition of living organisms has become more accessible for researchers. Several advances have been achieved because of it, especially in the health sciences. However, many challenges which emerge from the complexity of sequencing projects remain unsolved. Among them is the task of assembling DNA fragments from previously unsequenced organisms, which is classified as an NP-hard (nondeterministic polynomial time hard) problem, for which no efficient computational solution with reasonable execution time exists. However, several tools that produce approximate solutions have been used with results that have facilitated scientific discoveries, although there is ample room for improvement. As with other NP-hard problems, machine learning algorithms have been one of the approaches used in recent years in an attempt to find better solutions to the DNA fragment assembly problem, although still at a low scale. RESULTS: This paper presents a broad review of pioneering literature comprising artificial intelligence-based DNA assemblers, particularly the ones that use machine learning, to provide an overview of state-of-the-art approaches and to serve as a starting point for further study in this field.


Subject(s)
Genome , Machine Learning , Algorithms , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA
12.
BMC Genomics ; 19(1): 891, 2018 Dec 07.
Article in English | MEDLINE | ID: mdl-30526481

ABSTRACT

BACKGROUND: The most common infusion in southern Latin-American countries is prepared with dried leaves of Ilex paraguariensis A. St.-Hil., an aboriginal ancestral beverage known for its high polyphenol concentration, currently consumed in > 90% of homes in Argentina, Paraguay and Uruguay. The economy of entire provinces heavily relies on the production, collection and manufacture of Ilex paraguariensis, the fifth plant species with highest antioxidant activity. Polyphenols are associated with relevant health benefits including strong antioxidant properties. Despite its regional relevance and potential biotechnological applications, little is known about functional genomics and genetics underlying phenotypic variation of relevant traits. By generating tissue-specific transcriptomic profiles, we aimed to comprehensively annotate genes in the Ilex paraguariensis phenylpropanoid pathway and to evaluate differential expression profiles. RESULTS: In this study we generated a reliable transcriptome assembly based on a collection of 15 RNA-Seq libraries from different tissues of Ilex paraguariensis. A total of 554 million RNA-Seq reads were assembled into 193,897 transcripts, where 24,612 annotated full-length transcripts had a complete ORF. We assessed the transcriptome assembly quality, completeness and accuracy using BUSCO and TransRate; consistency was also evaluated by experimentally validating 11 predicted genes by PCR and sequencing. Functional annotation against the KEGG Pathway database identified 1395 unigenes involved in biosynthesis of secondary metabolites; 531 annotated transcripts corresponded to the phenylpropanoid pathway. The top 30 differentially expressed genes among tissues revealed genes involved in photosynthesis and stress response. These significant differences were then validated by qRT-PCR.
CONCLUSIONS: Our study is the first to provide data from whole genome gene expression profiles in different Ilex paraguariensis tissues, experimentally validating in-silico predicted genes key to the phenylpropanoid (antioxidant) pathway. Our results provide essential genomic data of potential use in breeding programs for polyphenol content. Further studies are necessary to assess if the observed expression variation in the phenylpropanoid pathway annotated genes is related to variations in leaves' polyphenol content at the population scale. These results set the current reference for Ilex paraguariensis genomic studies and provide a substantial contribution to research and biotechnological applications of phenylpropanoid secondary metabolites.


Subject(s)
Genome, Plant , Ilex paraguariensis/genetics , Organ Specificity/genetics , Sequence Analysis, RNA/methods , Transcriptome/genetics , Gene Expression Regulation, Plant , Gene Ontology , Genes, Plant , Molecular Sequence Annotation , Plant Leaves/genetics , Plant Roots/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Reproducibility of Results , Secondary Metabolism/genetics
13.
3 Biotech ; 8(4): 185, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29556439

ABSTRACT

To understand the physiological responses of the brown macroalga Macrocystis integrifolia during the marine tidal cycle, two RNA libraries were prepared from algal frond samples collected in the intertidal zone (0 m depth) and subtidal zone (10 m depth). Samples collected from the intertidal zone during low tide were considered abiotic-stressed (MI0), while samples collected from the subtidal zone were considered the control (MI10). Both RNA libraries were sequenced on Illumina NextSeq 500, which generated approx. 46.9 million and 47.7 million raw paired-end reads for MI0 and MI10, respectively. Among the representative transcripts (RTs), a total of 16,398 RTs (39.20%) from MI0 and 21,646 RTs (39.24%) from MI10 were successfully annotated. A total of 535 unigenes (271 upregulated and 264 downregulated) showed significantly altered expression between MI0 and MI10. In the abiotic-stressed condition (MI0), the relative expression levels of genes associated with antioxidant defenses (vanadium-dependent bromoperoxidase, glutathione S-transferase, lipoxygenase, serine/threonine-protein kinase, aspartate aminotransferase, HSPs), water transport (aquaporin), and photosynthesis (light-harvesting complex protein) were significantly upregulated, while in the control condition (MI10) most of the genes predominantly involved in energy metabolism (NADH-ubiquinone oxidoreductase/NADH dehydrogenase, NAD(P)H-nitrate reductase, long-chain acyl-CoA synthetase, UDP-N-acetylglucosamine pyrophosphorylase) were overexpressed.

14.
Genet. mol. biol ; 40(1): 168-180, Jan.-Mar. 2017. tab, graf
Article in English | LILACS | ID: biblio-892360

ABSTRACT

Red swamp crayfish is an important model organism for research on the invertebrate innate immunity mechanism. Its excellent disease resistance against bacteria, fungi, and viruses is well-known. However, the antiviral mechanisms of crayfish remain unclear. In this study, we obtained high-quality sequence reads from normal and white spot syndrome virus (WSSV)-challenged crayfish gills. For group normal (GN), 39,390,280 high-quality clean reads were randomly assembled to produce 172,591 contigs; whereas, 34,011,488 high-quality clean reads were randomly assembled to produce 182,176 contigs for group WSSV-challenged (GW). After GO annotations analysis, a total of 35,539 (90.01%), 14,931 (37.82%), 28,221 (71.48%), 25,290 (64.05%), 15,595 (39.50%), and 13,848 (35.07%) unigenes had significant matches with sequences in the Nr, Nt, Swiss-Prot, KEGG, COG and GO databases, respectively. Through the comparative analysis between GN and GW, 12,868 genes were identified as up-regulated DEGs, and 9,194 genes were identified as down-regulated DEGs. Ultimately, these DEGs were mapped into different signaling pathways, including three important signaling pathways related to innate immunity responses. These results could provide new insights into the crayfish antiviral immunity mechanism.

15.
Mitochondrial DNA B Resour ; 2(1): 337-338, 2017 Jun 01.
Article in English | MEDLINE | ID: mdl-33473819

ABSTRACT

The mitochondrial genome of Occidentarius platypogon was assembled from Illumina short reads, and consisted of 16,714 base pairs, with 13 protein-coding, two ribosomal RNA (rRNA), and 22 transfer RNA (tRNA) genes. Base composition is 30.7% A, 26.4% T, 28.5% C, and 14.4% G, with 42.9% GC content. Two start codon (ATG and GTG) and seven stop codon (TAA, ACT, CCT, TTA, CAT, AAT, and TAG) patterns were found in protein-coding genes. The control region presented the highest A + T (64%) and lowest G + C content (35.7%) among all mitochondrial regions.
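
The reported GC content is simply the sum of the C and G percentages (28.5% + 14.4% = 42.9%). As a minimal sketch of how such figures are computed from a sequence, the Python functions below (hypothetical names `base_composition` and `gc_content`, run on a toy sequence rather than the actual mitogenome) count the four bases and express them as percentages.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Percentage of A, C, G and T among the unambiguous bases of seq."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: 100 * counts[b] / total for b in "ACGT"}


def gc_content(seq: str) -> float:
    """GC content as a percentage: %G plus %C."""
    comp = base_composition(seq)
    return comp["G"] + comp["C"]


# Toy sequence, not real data: 2 A, 1 C, 1 G, 1 T.
toy = "AACGT"
print(base_composition(toy))  # {'A': 40.0, 'C': 20.0, 'G': 20.0, 'T': 20.0}
print(gc_content(toy))        # 40.0
```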

16.
G3 (Bethesda) ; 6(10): 3283-3295, 2016 10 13.
Article in English | MEDLINE | ID: mdl-27558666

ABSTRACT

Several fruit fly species of the Anastrepha fraterculus group are of great economic importance for the damage they cause to a variety of fleshy fruits. Some species in this group have diverged recently, with evidence of introgression, showing similar morphological attributes that render their identification difficult, reinforcing the relevance of identifying new molecular markers that may differentiate species. We investigated genes expressed in head tissues from two closely related species: A. obliqua and A. fraterculus, aiming to identify fixed single nucleotide polymorphisms (SNPs) and highly differentiated transcripts, which, considering that these species still experience some level of gene flow, could indicate potential candidate genes involved in their differentiation process. We generated multiple libraries from head tissues of these two species, at different reproductive stages, for both sexes. Our analyses indicate that the de novo transcriptome assemblies are fairly complete. We also produced a hybrid assembly to map each species' reads, and identified 67,470 SNPs in A. fraterculus, 39,252 in A. obliqua, and 6386 that were common to both species. We identified 164 highly differentiated unigenes that had a mean interspecific index ([Formula: see text]) of at least 0.94. We selected unigenes that had Ka/Ks higher than 0.5, or had three or more highly differentiated SNPs, as potential candidate genes for species differentiation. Among these candidates, we identified proteases, regulators of redox homeostasis, and an odorant-binding protein (Obp99c), among other genes. The head transcriptomes described here enabled the identification of thousands of genes hitherto unavailable for these species, and generated a set of candidate genes that are potentially important to genetically identify species and understand the speciation process in the presence of gene flow of A. obliqua and A. fraterculus.


Subject(s)
Gene Flow , Genes, Insect , Genetic Variation , Tephritidae/genetics , Transcriptome , Alleles , Animals , Computational Biology/methods , Evolution, Molecular , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Organ Specificity/genetics , Polymorphism, Single Nucleotide , Selection, Genetic , Species Specificity
17.
Mol Ecol Resour ; 15(1): 28-41, 2015 01.
Article in English | MEDLINE | ID: mdl-24916682

ABSTRACT

Restriction site-associated DNA sequencing (RADseq) provides researchers with the ability to record genetic polymorphism across thousands of loci for nonmodel organisms, potentially revolutionizing the field of molecular ecology. However, as with other genotyping methods, RADseq is prone to a number of sources of error that may have consequential effects for population genetic inferences, and these have received only limited attention in terms of the estimation and reporting of genotyping error rates. Here we use individual sample replicates, under the expectation of identical genotypes, to quantify genotyping error in the absence of a reference genome. We then use sample replicates to (i) optimize de novo assembly parameters within the program Stacks, by minimizing error and maximizing the retrieval of informative loci; and (ii) quantify error rates for loci, alleles and single-nucleotide polymorphisms. As an empirical example, we use a double-digest RAD data set of a nonmodel plant species, Berberis alpina, collected from high-altitude mountains in Mexico.
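
The replicate-based approach can be sketched in a few lines of Python: two replicates of the same individual are expected to have identical genotypes, so any mismatch at a locus called in both is counted as an error. This is only an illustration under simplifying assumptions. The function `replicate_error_rate` and the toy genotypes are hypothetical, and the paper's separate locus, allele and SNP error rates are not reproduced here.

```python
def replicate_error_rate(rep_a: dict, rep_b: dict) -> float:
    """Fraction of loci called in both replicates whose genotypes differ.

    Each replicate maps a locus name to a genotype (a tuple of alleles) or to
    None when the locus was not recovered. Allele order is ignored, so
    ('A', 'G') and ('G', 'A') count as the same genotype.
    """
    shared = [
        locus for locus in rep_a
        if rep_a[locus] is not None and rep_b.get(locus) is not None
    ]
    if not shared:
        raise ValueError("no loci genotyped in both replicates")
    mismatches = sum(
        1 for locus in shared if sorted(rep_a[locus]) != sorted(rep_b[locus])
    )
    return mismatches / len(shared)


# Toy genotypes for two replicates of one sample (not real data).
rep1 = {"L1": ("A", "A"), "L2": ("A", "G"), "L3": ("C", "T"), "L4": None}
rep2 = {"L1": ("A", "A"), "L2": ("G", "A"), "L3": ("C", "C"), "L4": ("T", "T")}

# L4 is missing in rep1 and skipped; only L3 differs among the 3 shared loci.
print(replicate_error_rate(rep1, rep2))
```

Repeating such a comparison across assembly parameter settings is one way to look for settings that lower the mismatch rate while keeping many shared loci.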


Subject(s)
Diagnostic Errors , Genetics, Population/methods , Genotyping Techniques/methods , Sequence Analysis, DNA/methods , Berberis/classification , Berberis/genetics , Genetic Variation , Genotype , Mexico
18.
Plant J ; 79(1): 162-72, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24773339

ABSTRACT

Many economically important crops have large and complex genomes that hamper their sequencing by standard methods such as whole genome shotgun (WGS). Large tracts of methylated repeats occur in plant genomes that are interspersed by hypomethylated gene-rich regions. Gene-enrichment strategies based on methylation profiles offer an alternative to sequencing repetitive genomes. Here, we have applied methyl filtration with McrBC endonuclease digestion to enrich for euchromatic regions in the sugarcane genome. To verify the efficiency of methylation filtration and the assembly quality of sequences subjected to the gene-enrichment strategy, we have compared assemblies using methyl-filtered (MF) and unfiltered (UF) libraries. The use of methyl filtration allowed a better assembly by filtering out 35% of the sugarcane genome and by producing 1.5× more scaffolds and 1.7× more assembled Mb in length compared with the unfiltered dataset. The coverage of sorghum coding sequences (CDS) by MF scaffolds was at least 36% higher than by the use of UF scaffolds. Using MF technology, we increased by 134× the coverage of gene regions of the monoploid sugarcane genome. The MF reads assembled into scaffolds that covered all genes of the sugarcane bacterial artificial chromosomes (BACs), 97.2% of sugarcane expressed sequence tags (ESTs), 92.7% of sugarcane RNA-seq reads and 98.4% of sorghum protein sequences. Analysis of MF scaffolds from encoded enzymes of the sucrose/starch pathway discovered 291 single-nucleotide polymorphisms (SNPs) in the wild sugarcane species, S. spontaneum and S. officinarum. A large number of microRNA genes was also identified in the MF scaffolds. The information achieved by the MF dataset provides a valuable tool for genomic research in the genus Saccharum and for improvement of sugarcane as a biofuel crop.


Subject(s)
Chromosomes, Plant/genetics , Genome, Plant/genetics , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Saccharum/genetics , Chromosomes, Artificial, Bacterial , Crops, Agricultural , DNA Methylation , DNA, Plant/genetics , Expressed Sequence Tags , Gene Library , MicroRNAs/genetics , Plant Leaves/genetics , Plant Proteins/genetics , Polymorphism, Single Nucleotide/genetics , RNA, Plant/genetics , Repetitive Sequences, Nucleic Acid/genetics , Sequence Analysis , Sorghum/genetics
19.
Stand Genomic Sci ; 9(1): 42-56, 2013 Oct 16.
Article in English | MEDLINE | ID: mdl-24501644

ABSTRACT

Lysinibacillus sphaericus strain OT4b.31 is a native Colombian strain having no larvicidal activity against Culex quinquefasciatus and is widely applied in the bioremediation of heavy-metal polluted environments. Strain OT4b.31 was placed between DNA homology groups III and IV. By gap-filling and alignment steps, we propose a 4,096,672 bp chromosomal scaffold. The whole genome (4,856,302 bp long, consisting of 94 contigs and 4,846 predicted protein-coding sequences) revealed differences in comparison to the L. sphaericus C3-41 genome, such as syntenial relationships, prophages and putative mosquitocidal toxins. Sphaericolysin B354, the coleopteran toxin Sip1A and heavy metal resistance clusters from nik, ars, czc, cop, chr, czr and cad operons were identified. Lysinibacillus sphaericus OT4b.31 has applications not only in bioremediation efforts, but also in the biological control of agricultural pests.
