ABSTRACT
Integrative taxonomy is central to modern taxonomy and systematic biology, including behavior, niche preference, distribution, morphological analysis, and DNA barcoding. However, decades of use demonstrate that these methods can face challenges when used in isolation, for instance, potential misidentifications due to phenotypic plasticity for morphological methods, and incorrect identifications because of introgression, incomplete lineage sorting, and horizontal gene transfer for DNA barcoding. Although researchers have advocated the use of integrative taxonomy, few detailed algorithms have been proposed. Here, we develop a convolutional neural network method (morphology-molecule network [MMNet]) that integrates morphological and molecular data for species identification. The newly proposed method (MMNet) worked better than four currently available alternative methods when tested with 10 independent data sets representing varying genetic diversity from different taxa. High accuracies were achieved for all groups, including beetles (98.1% of 123 species), butterflies (98.8% of 24 species), fishes (96.3% of 214 species), and moths (96.4% of 150 total species). Further, MMNet demonstrated a high degree of accuracy (>98%) in four data sets including closely related species from the same genus. The average accuracy on two modest subgenomic (single nucleotide polymorphism) data sets, each comprising eight putative subspecies, is 90%. Additional tests show that the success rate of species identification under this method most strongly depends on the amount of training data, and is robust to sequence length and image size. Analyses on the contribution of different data types (image vs. gene) indicate that both morphological and genetic data are important to the model, and that genetic data contribute slightly more.
The approaches developed here serve as a foundation for the future integration of multimodal information for integrative taxonomy, such as image, audio, video, 3D scanning, and biosensor data, to characterize organisms more comprehensively as a basis for improved investigation, monitoring, and conservation of biodiversity. [Convolutional neural network; deep learning; integrative taxonomy; single nucleotide polymorphism; species identification.].
Subject(s)
Butterflies , Animals , Biodiversity , Butterflies/genetics , DNA/genetics , DNA Barcoding, Taxonomic/methods , Neural Networks, Computer , Phylogeny
ABSTRACT
In this study, the complete mitochondrial genome of a white tussock moth, Laelia suffusa (Walker, 1855) (Lepidoptera: Erebidae, Lymantriinae), was sequenced and annotated. The genome sequence was 15,502 bp in length and comprised 13 protein-coding genes (PCGs), 2 rRNAs, 22 tRNAs, and a single noncoding control region (CR). The nucleotide composition of the genome was highly A + T biased, with A + T accounting for 79.04% of the whole genome and a slightly positive AT skew (0.015). Compared with the gene order of the basal species of Lepidoptera, a typical trnM rearrangement was detected in the mitogenome of L. suffusa: trnM was found ahead of trnI and trnQ rather than behind them. The 13 PCGs used ATN as their start codons, except for cox1, which used CGA. Of the 22 tRNAs, only one (trnS1) failed to fold into a typical cloverleaf secondary structure. The conserved motif 'ATAGA + poly-T' was detected at the start of the control region, as in other Lepidoptera species. In total, 10 overlapping regions and 19 intergenic spacers were identified, ranging from 1 to 41 and 2 to 73 bp, respectively. Phylogenetic analysis showed that Lymantriinae was a monophyletic group with a high support value and that L. suffusa was closely related to tribe Orgyiini (Erebidae, Lymantriinae). Moreover, the phylogenetic relationships of Noctuoidea (Lepidoptera) species were reconstructed using two datasets (13 PCGs and 37 genes), both of which supported the topology (Notodontidae + (Erebidae + (Nolidae + (Euteliidae + Noctuidae)))).
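The A + T content and AT skew reported above are simple base-composition statistics. The sketch below shows one common way to compute them, assuming the usual definition AT skew = (A - T) / (A + T); the function name and the toy sequence are illustrative and are not taken from the study.

```python
from collections import Counter


def at_content_and_skew(seq: str) -> tuple[float, float]:
    """Return (A+T fraction, AT skew) for a nucleotide sequence.

    Only unambiguous A, C, G, T bases are counted; anything else
    (N, gaps, IUPAC codes) is ignored. AT skew is (A - T) / (A + T).
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c
    if total == 0 or (a + t) == 0:
        raise ValueError("sequence contains no countable A/T bases")
    at_content = (a + t) / total
    at_skew = (a - t) / (a + t)
    return at_content, at_skew


if __name__ == "__main__":
    # Toy sequence for illustration only (not real mitogenome data).
    content, skew = at_content_and_skew("AAATTGC")
    print(f"A+T content: {content:.2%}, AT skew: {skew:.3f}")
```

A positive skew, as reported for L. suffusa, means A is slightly more frequent than T on the strand analysed.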
Subject(s)
Genome, Insect , Genome, Mitochondrial , Moths/genetics , Animals , Gene Order , Phylogeny , Sequence Analysis, DNA
ABSTRACT
BACKGROUND: Over the last decade, the rapid development of high-throughput sequencing platforms has accelerated species description and assisted morphological classification through DNA barcoding. However, current high-throughput DNA barcoding methods either cannot obtain full-length barcode sequences because of read length limitations (e.g. a maximum read length of 300 bp for Illumina's MiSeq system), or are hindered by a relatively high cost or low sequencing output (e.g. a maximum of eight million reads per cell for PacBio's SEQUEL II system). RESULTS: Pooled cytochrome c oxidase subunit I (COI) barcodes from individual specimens were sequenced on the MGISEQ-2000 platform using the single-end 400 bp (SE400) module. We present a bioinformatic pipeline, HIFI-SE, that takes reads generated from the 5' and 3' ends of the COI barcode region and assembles them into full-length barcodes. HIFI-SE is written in Python and includes four function modules: filter, assign, assembly and taxonomy. We applied HIFI-SE to a set of 845 samples (30 marine invertebrates, 815 insects) and recovered a total of 747 fully assembled COI barcodes as well as 70 Wolbachia and fungal symbiont sequences. Compared with their corresponding Sanger sequences (72 sequences available), nearly all samples (71/72) were correctly and accurately assembled, including 46 samples with a similarity score of 100% and 25 with ca. 99%. CONCLUSIONS: The HIFI-SE pipeline represents an efficient way to produce standard full-length barcodes, and the reasonable cost and high sensitivity of our method can yield considerably more DNA barcodes under the same budget. Our method thereby advances DNA-based species identification from diverse ecosystems and increases the number of relevant applications.
Subject(s)
DNA Barcoding, Taxonomic , Ecosystem , Animals , DNA , High-Throughput Nucleotide Sequencing , Insects
ABSTRACT
Understanding diversity patterns requires accounting for the roles of both historical and contemporary factors in the assembly of communities. Here, we compared diversity patterns of two moth assemblages sampled from Taihang and Yanshan mountains in Northern China and performed ancestral range reconstructions using the Multi-State Speciation and Extinction model, to track the origins of these patterns. Further, we estimated diversification rates of the two moth assemblages and explored the effects of contemporary ecological factors. From 7,788 specimens we identified 835 species belonging to 23 families, using both DNA barcode analysis and morphology. Moths in Yanshan mountains showed higher species diversity than in Taihang mountains. Ancestral range analysis indicated Yanshan as the origin, with significant historical dispersals from Yanshan to Taihang. Asymmetrical diversification, population expansion, along with frequent and considerable gene flow were detected between communities. Moreover, dispersal limitation or the joint effect of environment filtering and dispersal limitation were inferred as main driving forces shaping current diversity patterns. In summary, we demonstrate that a multiscale (community, population and species level) analysis incorporating both historical and contemporary factors can be useful in delineating factors contributing to community assembly and patterning in diversity.
Subject(s)
Biodiversity , Moths/classification , Animals , China , DNA Barcoding, Taxonomic , Gene Flow , Phylogeny
ABSTRACT
BACKGROUND: Pine moths (Lepidoptera; Bombycoidea; Lasiocampidae: Dendrolimus spp.) are among the most serious insect pests of forests, especially in southern China. Although COI barcodes (a standardized portion of the mitochondrial cytochrome c oxidase subunit I gene) can distinguish some members of this genus, the evolutionary relationships of the three morphospecies Dendrolimus punctatus, D. tabulaeformis and D. spectabilis have remained largely unresolved. We sequenced whole mitochondrial genomes of eight specimens, including D. punctatus wenshanensis. This is an unambiguous subspecies of D. punctatus, and was used as a reference for inferring the relationships of the other two morphospecies of the D. punctatus complex. We constructed phylogenetic trees from these data, including twelve published mitochondrial genomes of other Bombycoidea species, and examined the relationships of the Dendrolimus taxa using these trees and the genomic features of the mitochondrial genome. RESULTS: The eight fully sequenced mitochondrial genomes from the three morphospecies displayed genome structures similar to those of other Bombycoidea species in terms of gene content, base composition, level of overall AT-bias and codon usage. However, the Dendrolimus genomes possess a unique feature in the large ribosomal 16S RNA subunits (rrnL), which are more than 60 bp longer than those of other members of the superfamily and have a higher AC proportion. The eight mitochondrial genomes of Dendrolimus were highly conserved in many aspects, for example with identical stop codons and overlapping regions, but there were many differences in start codons, intergenic spacers, and numbers of mismatched base pairs in transfer RNA (tRNA) genes. Our results, based on phylogenetic trees, genetic distances, species delimitation and genomic features (such as intergenic spacers) of the mitochondrial genome, indicated that D. tabulaeformis is as close to D. punctatus as is D. punctatus wenshanensis, whereas D.
spectabilis evolved independently from D. tabulaeformis and D. punctatus. Whole mitochondrial DNA phylogenies showed that D. spectabilis formed a well-supported monophyletic clade, with a clear species boundary separating it from the other congeners examined here. However, D. tabulaeformis often clustered with D. punctatus and with the subspecies D. punctatus wenshanensis. Genetic distance analyses showed that the distance between D. tabulaeformis and D. punctatus is generally less than the intraspecific distance of D. punctatus and its subspecies D. punctatus wenshanensis. In the species delimitation analysis of Poisson Tree Processes (PTP), D. tabulaeformis, D. punctatus and D. punctatus wenshanensis clustered into a putative species separated from D. spectabilis. In comparison with D. spectabilis, D. tabulaeformis and D. punctatus also exhibit a similar structure in intergenic spacer characterization. These different types of evidence suggest that D. tabulaeformis is very close to D. punctatus and its subspecies D. punctatus wenshanensis, and is likely to be another subspecies of D. punctatus. CONCLUSIONS: Whole mitochondrial genomes possess relatively rich genetic information compared with the traditional use of single or multiple genes for phylogenetic purposes. They can be used to better infer phylogenetic relationships and degrees of relatedness of taxonomic groups, at least from the aspect of maternal lineage: caution should be taken due to the maternal-only inheritance of this genome. Our results indicate that D. spectabilis is an independent lineage, while D. tabulaeformis shows an extremely close relationship to D. punctatus.
Subject(s)
Genome, Mitochondrial , Mitochondria/genetics , Moths/genetics , Animals , Base Pair Mismatch , China , Codon, Initiator/genetics , DNA, Mitochondrial/analysis , DNA, Mitochondrial/isolation & purification , Evolution, Molecular , Mitochondrial Proteins/genetics , Mitochondrial Proteins/metabolism , Moths/classification , Phylogeny , RNA, Ribosomal, 16S/genetics , RNA, Ribosomal, 16S/metabolism , RNA, Transfer/genetics , RNA, Transfer/metabolism , Sequence Analysis, DNA
ABSTRACT
Phthorimaea operculella is a major potato pest of global importance, for which early warning and detection are essential. In this study, we analyzed the climatic niche conservatism of P. operculella during its invasion by comparing the overall climatic niche at three levels: the differences between the native range (South America) and the entire invaded region (excluding South America); the differences between the native range (South America) and five invaded continents (North America, Oceania, Asia, Africa, and Europe); and the differences between the native range (South America) and one invaded region (China). We constructed ecological niche models for its native range (South America) and invaded region (China). The results showed that the climatic niche of the pest has expanded to varying degrees in different regions, indicating that the pest could adapt well to new environments during the invasion. Almost all areas of South America are suitable for P. operculella. In China, its suitable area is mainly concentrated in Shandong, Hebei, Tianjin, Beijing, Henan, Hubei, Yunnan, Guizhou, Sichuan, Hainan, northern Guangxi, southern Hunan, Anhui, Guangdong, Jiangsu, southern Shanxi, and southern Shaanxi. With increasing greenhouse gas emissions and global temperature, its suitable area will decrease at low latitudes and increase gradually at high latitudes. Specifically, the northern boundary will extend to Liaoning, Jilin, and the southeastern region of Inner Mongolia, while the western boundary will extend to Sichuan and the southeastern Qinghai-Tibet Plateau. The suitable area in the southeastern Yunnan-Guizhou Plateau, Hainan Island, and the region south of the Yangtze River will gradually decrease. The total suitable habitat area for P. operculella in China is projected to increase under future climate conditions.
From 2081 to 2100, under the three greenhouse gas emission scenarios ssp126, ssp370, and ssp585, the suitable area is expected to increase by 27.78, 165.54, and 140.41 hm2, respectively. Therefore, it is crucial to strengthen vigilance and implement strict measures to prevent the further expansion of P. operculella.
Subject(s)
Ecosystem , Introduced Species , China , Animals , South America , Climate
ABSTRACT
Dogs were present in pre-Columbian America, presumably brought by early human migrants from Asia. Studies of free-ranging village/street dogs have indicated almost total replacement of these original dogs by European dogs, but the extent to which Arctic, North and South American breeds are descendants of the original population remains to be assessed. Using a comprehensive phylogeographic analysis, we traced the origin of the mitochondrial DNA lineages for Inuit, Eskimo and Greenland dogs, Alaskan Malamute, Chihuahua, xoloitzcuintli and perro sin pelo del Perú, by comparing them with extensive samples of East Asian (n = 984) and European dogs (n = 639), and with previously published pre-Columbian sequences. Evidence for a pre-Columbian origin was found for all these breeds, except Alaskan Malamute, for which results were ambiguous. No European influence was indicated for the Arctic breeds Inuit, Eskimo and Greenland dog, and North/South American breeds had at most 30% European female lineages, suggesting marginal replacement by European dogs. Genetic continuity through time was shown by the sharing of a unique haplotype between the Mexican breed Chihuahua and ancient Mexican samples. We also analysed free-ranging dogs, confirming limited pre-Columbian ancestry overall, but also identifying pockets of remaining populations with a high proportion of indigenous ancestry, and we provide the first DNA-based evidence that the Carolina dog, a free-ranging population in the USA, may have an ancient Asian origin.
Subject(s)
DNA, Mitochondrial/chemistry , Dogs/genetics , Phylogeny , Animals , Asia , Dogs/classification , Europe , Haplotypes , North America , South America , Species Specificity
ABSTRACT
We describe the mitogenome sequence of Leucoma chrysoscela (Collenette, 1934) collected in the Longtan National Forest Park, which is located in southeastern China. The assembled mitogenome is 15,508 bp in length and consists of 13 protein-coding genes (PCGs), 22 transfer RNA genes, 2 ribosomal RNA genes, and one A + T rich region. The most common start codon for the 13 PCGs is ATT and the most common termination codon is TAA. The overall G + C content is only 20.45% in the heavy strand. The phylogenetic analysis shows that L. chrysoscela is closely related to other species in the same subfamily, Lymantriinae.
ABSTRACT
Animals widely use minerals and organic components to construct biomaterials with excellent properties, such as teeth, bones, molluscan shells and eggshells. The larvae of the oriental moth, Monema (Cnidocampa) flavescens Walker, secrete silk proteins that combine closely with calcareous minerals to construct a hard cocoon, which is completely different from the mineral-free Bombyx mori cocoon. The cocoons of oriental moths are likely to be the hardest among the cocoons constructed by insect species. They were previously found to be composed mainly of calcium oxalates and Asx/Ser/Gly-rich cocoon proteins, but the types of calcium oxalates and cocoon proteins remained to be elucidated. In this study, we provide an in-depth characterization of the inorganic and organic components of the oriental moth cocoon. Microscopy and imaging technologies revealed that the cocoon is composed of mineral crystals, silk fibers and other organic matter. X-ray diffraction and infrared spectral analyses showed that the mineral crystals in the oriental moth cocoon were mainly CaC2H2O4·H2O. ICP-OES analysis suggested that the mineral crystals in the cocoons were mainly CaC2H2O4·H2O. LC-MS/MS-based proteomics allowed us to identify 467 proteins from the oriental moth cocoon, including 252 uncharacterized proteins, 87 enzymes, 36 small molecule binding proteins, and 5 silk proteins. Among the uncharacterized proteins, 25 were Asn-rich proteins, containing a high proportion of Asn residues (19.1%-41.4%). Among the 20 most abundant cocoon proteins, 9 were Asn-rich proteins. qPCR was used to investigate the expression patterns of the major cocoon protein-coding genes. Three fibroins and three Asn-rich proteins were expressed only in the silk gland and not in other tissues. The expression of Asn-rich proteins in the silk gland gradually increased from the anterior silk gland to the posterior silk gland.
These findings provide important references for understanding the formation mechanism and mechanical properties of mineralized hard cocoons constructed by oriental moths.
Subject(s)
Bombyx , Moths , Animals , Moths/metabolism , Chromatography, Liquid , Calcium/metabolism , Tandem Mass Spectrometry , Silk/metabolism , Bombyx/chemistry , Calcium Oxalate/metabolism
ABSTRACT
Interactions between plants and insects are among the most important life functions for all organisms in a natural community. Usually a large number of samples is required to identify insect diets in food web studies. Previously, Sanger sequencing and next generation sequencing (NGS) with short DNA barcodes were used, resulting in low species-level identification; moreover, Sanger sequencing becomes expensive for metabarcoding as the number of samples grows. Here, we present a fast and effective sequencing strategy, long-multiplex-metabarcoding obtained by Trinity assembly (SHMMT), to identify larvae of Lepidoptera and their diets at the same time without increasing the cost, using a single HiSeq run on the Illumina platform (COI for insects; rbcL, matK, ITS and trnL for plants). Sanger sequencing (for single individuals) and NGS (for polyphagous larvae) were used to verify the reliability of the SHMMT approach. We show that the SHMMT approach is fast and reliable: most high-quality sequences of the five DNA barcodes were recovered for 63 larval individuals (54 species) using Trinity assembly (full length for 100% of the COI gene and 98.3% of plant DNA barcodes, up-sized to 1015 bp). For larval diet identification, 95% of results were reliable; the other 5% failed because the guts were empty. The diets identified by the SHMMT approach were 100% consistent with the host plants that the larvae were feeding on during our collection. Our study demonstrates that the SHMMT approach is reliable and cost-effective for insect-plant network studies. This will facilitate insect-host plant studies, which generally involve a huge number of samples.
Subject(s)
Food Deprivation , Herbivory , Moths/physiology , Nicotiana , Pinus , Salix , Vitis , Animals , DNA Barcoding, Taxonomic , DNA, Plant/analysis , Diet , Larva/growth & development , Larva/physiology , Moths/growth & development
ABSTRACT
There is no generally accepted picture of where, when, and how the domestic dog originated. Previous studies of mitochondrial DNA (mtDNA) have failed to establish the time and precise place of origin because of a lack of phylogenetic resolution in the control region (CR) studied so far, and inadequate sampling. We therefore analyzed entire mitochondrial genomes for 169 dogs to obtain maximal phylogenetic resolution and the CR for 1,543 dogs across the Old World for a comprehensive picture of geographical diversity. As a result, a detailed picture of the origins of the dog can for the first time be suggested. We obtained evidence that the dog has a single origin in time and space and an estimation of the time of origin, number of founders, and approximate region, which also gives potential clues about the human culture involved. The analyses showed that dogs universally share a common homogeneous gene pool containing 10 major haplogroups. However, the full range of genetic diversity, all 10 haplogroups, was found only in southeastern Asia south of the Yangtze River, and diversity decreased following a gradient across Eurasia, through seven haplogroups in Central China and five in North China and Southwest (SW) Asia, down to only four haplogroups in Europe. The mean sequence distance to ancestral haplotypes indicates an origin 5,400-16,300 years ago (ya) from at least 51 female wolf founders. These results indicate that the domestic dog originated in southern China less than 16,300 ya, from several hundred wolves. The place and time coincide approximately with the origin of rice agriculture, suggesting that the dogs may have originated among sedentary hunter-gatherers or early farmers, and the numerous founders indicate that wolf taming was an important culture trait.
Subject(s)
DNA, Mitochondrial/genetics , Dogs/genetics , Phylogeny , Rivers , Wolves/genetics , Animals , Asia, Southeastern , China , Europe , Female , Gene Pool , Genome, Mitochondrial/genetics , Geography , Haplotypes/genetics , Locus Control Region/genetics , Molecular Sequence Data , Time Factors
ABSTRACT
Although the Masson pine moth, Dendrolimus punctatus, is one of the most destructive forest pest insects and is endemic to China, we still do not fully understand how its distribution range has varied in response to Quaternary climatic oscillations. Here, we sequenced one maternally inherited mitochondrial gene (COI) and biparentally inherited nuclear markers (ITS1 and ITS2) among 23 natural populations across the entire range of the species in China. A total of 51 mitotypes and 38 ribotypes were obtained from the mtDNA and ITS1 data, respectively. Furthermore, significant phylogeographical structure (NST > GST, p < 0.01) was detected. The spatial distribution of mitotypes implied that two distinct groups exist in the species: one in the southwestern part of the distribution, including 10 locations, and the other in the northeastern region of China. This suggests that each group was derived from ancestors that occupied different isolated refugia during previous periods, possibly the Last Glacial Maximum. Mismatch distribution and Bayesian population dynamics analyses suggested that the population size underwent a sudden expansion, which is consistent with the results of ecological niche modeling. As expected for a typical phytophagous insect, the history of population expansion was in accordance with that of the host plants, which provide abundant food resources and habitat. The intraspecific success rate of barcoding identification was lower than the interspecific rate, indicating a level of difficulty in barcoding individuals from different populations. However, it still provides an early insight into the pattern of genetic diversity within a species. OPEN RESEARCH BADGES: This article has been awarded Open Data and Open Materials badges. All materials and data are publicly accessible via the Open Science Framework at https://doi.org/10.5061/dryad.2df87g2. Learn more about the Open Practices badges from the Center for Open Science: https://osf.io/tvyxz/wiki.
ABSTRACT
BACKGROUND: Pine moths, Dendrolimus spp. (Lasiocampidae), are serious economic pests of conifer forests. Six closely related species (Dendrolimus punctatus, D. tabulaeformis, D. spectabilis, D. superans, D. houi, and D. kikuchii) occur in China and cause serious damage to conifers. The complete mitogenomes of the genus Dendrolimus are important for resolving its phylogenetic relationships and for providing theoretical support for pest control. METHODS: The complete mitogenomes of three species (D. superans, D. houi, and D. kikuchii) were sequenced from PCR products, with universal primers used to amplify the initial fragments. Phylogenetic analyses were carried out with 78 complete mitogenomes of lepidopteran species from 10 superfamilies. RESULTS: The complete mitochondrial genomes of these three species were 15,417, 15,381, and 15,377 bp in length, respectively. The phylogenetic analyses based on complete mitogenomes produced consistent results for the six Dendrolimus species: two major clades were formed, one containing D. spectabilis clustered with D. punctatus + D. tabulaeformis, with D. superans as the sister group to this three-taxon clade, and the other containing D. kikuchii and D. houi. Comparative analyses of the congeneric mitochondrial genomes showed that non-coding regions were more variable than the A+T rich region. The mitochondrial nucleotide diversity was more variable within than among genera; the concatenated tRNA region was the most conserved and the nd6 gene was the most variable.
ABSTRACT
Species identification through DNA barcoding or metabarcoding has become a key approach for biodiversity evaluation and ecological studies. However, the rapid accumulation of barcoding data has created some difficulties: for instance, global enquiries to a large reference library can take a very long time. Here we devise a two-step searching strategy to speed up the identification of such queries. It first uses a Hidden Markov Model (HMM) algorithm to narrow the searching scope to genus level and then determines the corresponding species using minimum genetic distance. Moreover, using a fuzzy membership function, our approach also estimates the credibility of assignment results for each query. To perform this task, we developed a new software pipeline, FuzzyID2, using Python and C++. Performance of the new method was assessed using eight empirical data sets ranging from 70 to 234,535 barcodes. Five data sets (four animal, one plant) deployed the conventional barcode approach, one used metabarcodes, and two were eDNA-based. The results showed mean accuracies of generic and species identification of 98.60% (with a minimum of 95.00% and a maximum of 100.00%) and 94.17% (with a range of 84.40%-100.00%), respectively. Tests with simulated NGS sequences based on realistic eDNA and metabarcode data demonstrated that FuzzyID2 achieved a significantly higher identification success rate than the commonly used BLAST method, and that the TIPP method tends to find many fewer species than either FuzzyID2 or BLAST. Furthermore, data sets with tens of thousands of barcodes need only a few seconds for each query assignment using FuzzyID2. Our approach provides an efficient and accurate species identification protocol for biodiversity-related projects with large DNA sequence data sets.
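The second step of the strategy, assigning a query to the species with the minimum genetic distance once the genus has been narrowed down, can be sketched as follows. This is an illustrative assumption-laden example, not FuzzyID2 code: it uses the uncorrected p-distance on pre-aligned sequences, the function names and toy references are hypothetical, and neither the HMM genus search nor the fuzzy membership function is reproduced.

```python
def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance between two aligned sequences of equal length.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            differences += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def assign_species(query: str, references: dict[str, list[str]]) -> tuple[str, float]:
    """Return (species, distance) for the reference closest to the query.

    `references` maps a species name to its aligned reference barcodes,
    e.g. the candidate species of a genus chosen in an earlier step.
    Ties are broken alphabetically by species name.
    """
    distance, species = min(
        (p_distance(query, seq), name)
        for name, seqs in references.items()
        for seq in seqs
    )
    return species, distance


if __name__ == "__main__":
    # Toy aligned barcodes for two hypothetical species of one genus.
    refs = {"sp_a": ["ACGTACGT"], "sp_b": ["TCGTTCGT"]}
    print(assign_species("ACGTACGA", refs))
```

Restricting the distance search to one genus is what keeps the per-query cost low: only a small slice of the reference library is compared exhaustively.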
Subject(s)
Fuzzy Logic , Markov Chains , Software , Classification/methods , DNA Barcoding, Taxonomic
ABSTRACT
DNA barcoding, based on a fragment of cytochrome c oxidase I (COI) mtDNA, is an effective molecular tool for identification, discovery, and biodiversity assessment for most animals. However, multiple gene markers coupled with more sophisticated analytical approaches may be necessary to clarify species boundaries in cases of cryptic diversity or morphological plasticity. Using 339 moths collected from mountains surrounding Beijing, China, we tested a pipeline consisting of two steps: (1) rapid morphospecies sorting and screening of the investigated fauna with standard COI barcoding approaches; (2) additional analyses with multiple molecular markers for those specimens whose morphospecies and COI barcode grouping were incongruent. In step 1, 124 morphospecies were delimited into 116 barcode units, with 90% of the conflicts being associated with specimens identified to the genus Hypena. In step 2, 55 individuals representing all 12 Hypena morphospecies were analysed using COI, COII, 28S, EF-1a, Wgl sequences or their combinations with the BPP (Bayesian Phylogenetics and Phylogeography) multigene species delimitation method. The multigene analyses supported the delimitation of 5 species, consistent with the COI analysis. We conclude that a two-step barcoding analysis pipeline is able to rapidly characterize insect biodiversity and help to elucidate species boundaries for taxonomic complexes without jeopardizing overall project efficiency by substantially increasing analytical costs.
Subject(s)
Biodiversity , DNA Barcoding, Taxonomic/methods , DNA, Mitochondrial/genetics , Moths/genetics , Animals , Bayes Theorem , China , Phylogeography , Species Specificity
ABSTRACT
Cuticular proteins (CPs) are vital components of the insect cuticle that support movement and protect insects from adverse environmental conditions. CPs are numerous and structurally diverse, so accurate annotation is the first step to interpreting their roles in insect growth. The rapid development of sequencing technology has simplified access to protein sequence information, especially for non-model species. Dendrolimus punctatus is a lepidopteran defoliator, and its periodic outbreaks cause severe damage to coniferous forests. A transcriptome of D. punctatus covering all developmental stages is available for the investigation of CPs. In this study, we identified 216 CPs from D. punctatus, including 147 from the CPR family, 4 from the TWDL family, 3 from the CPF/CPFL families, 22 from the CPAP families, 8 low complexity proteins, 1 CPCPC and 31 from other CP families. The putative CPs were compared with homologs in other species such as Bombyx mori, Manduca sexta and Drosophila melanogaster. We further identified five co-orthologous groups with highly similar CPR sequences in nine lepidopteran species, which occurred exclusively in the RR-2 subfamily rather than RR-1. We inferred that in Lepidoptera the difference in RR-2 numbers was maintained by homologs in co-orthologous groups, consistent with the observation in Drosophila and Anopheles that gene clusters were the model and source for the expansion of RR-2 genes. In combination with the variation in the number of members in each CP family among different species, these results indicate that the evolution of CPs was highly correlated with the adaptation of insects to their environment. Furthermore, we compared the amino acid composition of the different types of CPRs, and examined the expression patterns of CP genes in various developmental stages. The comprehensive overview of CPs from our study provides insight into their evolution and their association with insect development.
Subject(s)
Molecular Evolution , Insect Proteins/metabolism , Moths/metabolism , Animals , Insect Proteins/genetics , Moths/genetics , Multigene Family , Phylogeny
ABSTRACT
We analyzed the intraspecific gene genealogies of three Leptocarabus ground beetle species (L. seishinensis, L. semiopacus, L. koreanus) in South Korea using sequence data from the mitochondrial cytochrome oxidase subunit I (COI) and nuclear 28S rRNA (28S) genes, and compared phylogeographical patterns among the species. The COI data detected significant genetic differentiation among local populations of all three species, whereas the 28S data showed genetic differentiation only for L. seishinensis. In L. seishinensis, the clearest differentiation among local populations was between the northern and southern regions in the COI clades, whereas the 28S clade, which likely reflects relatively ancient events, revealed a range expansion across the northern and southern regions. Leptocarabus semiopacus had the shallowest differentiation of COI haplotypes, and some clades occurred across both the northern and southern regions. In L. koreanus, four divergent COI clades occurred in different regions, with partial overlaps. We discuss the differences in phylogeographical patterns among these Leptocarabus species, as well as between these and other groups of carabid beetles in South Korea.
Subject(s)
Beetles/classification , Beetles/genetics , Electron Transport Complex IV/genetics , Phylogeny , 28S Ribosomal RNA/genetics , Animals , DNA Primers/chemistry , Mitochondrial DNA/chemistry , Genetic Variation , Population Genetics , Geography , Japan , Korea , Molecular Sequence Data , Polymerase Chain Reaction , Sequence Alignment , Species Specificity
ABSTRACT
The pine moth Dendrolimus punctatus (Walker) is a common insect pest that causes serious damage to conifer forests in southern China. Extensive physiological and ecological studies of D. punctatus have been carried out, but the lack of genetic information has limited our understanding of the molecular mechanisms behind its development and resistance. Using an RNA-seq approach, we characterized the transcriptome of this pine moth and investigated its developmental expression profiles during the egg, larval, pupal, and adult stages. A total of 107.6 million raw reads were generated and assembled into 70,664 unigenes. More than 30% of the unigenes were annotated by homology searches against protein databases. To better understand the process of metamorphosis, we compared the four developmental stages pairwise and obtained 17,624 differentially expressed genes. Functional enrichment analysis of the differentially expressed genes showed a positive correlation with the specific physiological activities of each stage, and these results were confirmed by qRT-PCR experiments. This study provides a valuable genomic resource for D. punctatus covering all its developmental stages, and will promote future studies of biological processes at the molecular level.
Subject(s)
Developmental Gene Expression Regulation , Moths/genetics , Transcriptome , Animals , China , Female , Gene Expression Profiling , Gene Library , Genome , Insect Proteins/genetics , Larva/genetics , Male , Biological Metamorphosis , Molecular Sequence Annotation , Moths/growth & development , Pupa/genetics , RNA Sequence Analysis , Time Factors
ABSTRACT
With the recent development of molecular approaches to species delimitation, a growing number of cryptic species have been discovered in what had previously been thought to be single morpho-species. Molecular methods, such as DNA barcoding, have greatly enhanced our knowledge of taxonomy, but taxonomy remains incomplete and needs a formal species nomenclature and description to facilitate its use in other scientific fields. A previous study using DNA barcoding, geometric morphometrics, and mating tests revealed at least two cryptic species in the Encyrtus sasakii complex (Hymenoptera: Encyrtidae). To describe these two new species formally (Encyrtus eulecaniumiae sp. nov. and Encyrtus rhodococcusiae sp. nov.), a detailed morphometric study of Encyrtus spp. was performed in addition to the molecular analysis and evaluation of biological data. The morphometric analyses, a multivariate ratio analysis (MRA) and a geometric morphometric analysis (GMA), revealed a great number of differences between the species, but no reliable characteristics were found for diagnosing the cryptic species. We thus diagnosed these three Encyrtus species on the basis of characteristics derived from genetic markers (mitochondrial cytochrome c oxidase subunit I and nuclear 28S rRNA) and biological data. A formal nomenclature and description of the cryptic species is provided on the basis of an integrated taxonomy.