Results 1 - 20 of 38
1.
An Acad Bras Cienc ; 95(suppl 2): e20230173, 2023.
Article in English | MEDLINE | ID: mdl-38055564

ABSTRACT

An integrated approach combining facies, isotopic, and palynological analyses of upland lake sediments from the Serra Norte de Carajás, southeastern Amazonia, is presented in this work to refine the record of paleoclimate and paleohydrological changes during the late Quaternary. The sediments show a fining-upward deposition cycle typical of upland swamps/lakes. The organic matter is autochthonous in origin, mainly related to C3 terrestrial plants, macrophytes and algae. The pollen records of Hedyosmum during the Late Pleistocene suggest lower temperatures than those observed during the Holocene. In the transitional period between the Pleistocene and the Holocene, rainfall decreased, causing the retraction of the flooded area and favoring the development of marshy conditions. The Early and Middle Holocene were marked by higher temperatures and lower humidity. Afterward, the increased pollen concentration from canga and forest vegetation, macrophytes, palms, and algae suggested increased humidity in the Late Holocene. The relative contribution of forest pollen along the records indicated that drier conditions were not strong enough for an extensive expansion of canga over forested areas.


Subject(s)
Geologic Sediments , Lakes , Geologic Sediments/analysis , Plants , Pollen , Forests
2.
Sci Rep ; 13(1): 18464, 2023 Oct 27.
Article in English | MEDLINE | ID: mdl-37891221

ABSTRACT

In this paper we explore the reliability of contexts of machine learning (ML) models. Several evaluation procedures are commonly used to validate a model (precision, F1 score and others); however, these procedures are not linked to the evaluation of learning itself, but only to the number of correct answers presented by the model. This characteristic makes it impossible to assess whether a model was able to learn through elements that make sense in the context in which it is inserted. Therefore, the model could achieve good results in the training stage but poor results when it needs to generalize. When many different models achieve similar performance, the model with the highest number of hits in training is not necessarily the best. Therefore, we created a methodology based on Item Response Theory that allows us to identify whether an ML context is unreliable, providing an extra and different validation for ML models.
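The abstract does not say which Item Response Theory model the methodology uses. As a minimal sketch of the general idea, assuming the common two-parameter logistic (2PL) model with each ML model treated as a respondent and each test instance as an item, the probability of a correct answer depends on the respondent's ability and on the item's discrimination and difficulty. The function and parameter names below are illustrative, not taken from the paper.

```python
import math


def irt_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the 2PL IRT model.

    theta: ability of the respondent (here, hypothetically, an ML model)
    a:     discrimination of the item (here, a test instance)
    b:     difficulty of the item
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # A respondent whose ability equals the item difficulty has an even chance.
    print(irt_2pl(theta=0.0, a=1.0, b=0.0))
    # Higher ability raises the probability of a correct response.
    print(irt_2pl(theta=2.0, a=1.5, b=0.0))
```

Under this kind of model, two classifiers with the same number of hits can still differ in estimated ability if one of them only gets the easy, poorly discriminating instances right.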

3.
Neuroradiology ; 65(11): 1665-1668, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37311984

ABSTRACT

Chagas disease is an infection caused by Trypanosoma cruzi, a parasite endemic in Latin America. Acute involvement of the CNS by Chagas disease has been considered rare, but presumed reactivation of chronic disease in immunosuppressed patients has been the subject of recent reports. Our objective is to describe the clinical and imaging characteristics of four patients with Chagas disease and CNS involvement; inclusion required available MRI and a diagnosis confirmed by biopsy. The imaging findings were similar, highlighting the presence of focal cerebral lesions with hypointensity on T2-WI. These lesions assume a "bunch of acai berries" appearance, named after a fruit involved in the transmission of T. cruzi. The post-Gd T1-WI shows punctate enhancement. Knowledge of this pattern may be crucial to recognize this disease in immunocompromised patients from endemic areas.


Subject(s)
Central Nervous System Neoplasms , Chagas Disease , Euterpe , Trypanosoma cruzi , Humans , Euterpe/parasitology , Chagas Disease/diagnostic imaging , Chagas Disease/epidemiology , Chagas Disease/parasitology , Radiography
4.
Microbiol Resour Announc ; 11(6): e0014922, 2022 Jun 16.
Article in English | MEDLINE | ID: mdl-35575485

ABSTRACT

We report the draft genome sequence of the Firmicute strain Y002, a facultatively anaerobic, acidophilic bacterium that catalyzes the dissimilatory oxidation of iron and sulfur and the reduction of ferric iron. Analysis of the genome (2.9 Mb; G+C content, 46 mol%) provided insights into its ability to grow in extremely acidic geothermal environments.
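The G+C content quoted above is the share of guanine and cytosine among the called bases of the assembly. A minimal sketch of that calculation follows, assuming the sequence is already loaded as a plain string and that ambiguous bases such as N are excluded from the denominator; the function name is illustrative.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted, so runs of N
    in a draft assembly do not dilute the result.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / (gc + at)


if __name__ == "__main__":
    # Toy sequence, not taken from the Y002 genome.
    print(gc_content("GGCCAATT"))
```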

5.
PeerJ ; 10: e13300, 2022.
Article in English | MEDLINE | ID: mdl-35437474

ABSTRACT

Motivation: Since the identification of the novel coronavirus (SARS-CoV-2), the scientific community has made a huge effort to understand the virus biology and to develop vaccines. Next-generation sequencing strategies have been successful in understanding the evolution of infectious diseases as well as facilitating the development of molecular diagnostics and treatments. Thousands of genomes are being generated weekly to understand the genetic characteristics of this virus. Efficient pipelines are needed to analyze the vast amount of data generated. Here we present a new pipeline designed for genomic analysis and variant identification of the SARS-CoV-2 virus. Results: PipeCoV shows better performance when compared to well-established SARS-CoV-2 pipelines, with a lower content of Ns and higher genome coverage when compared to the Wuhan reference. It also provides a variant report not offered by other tested pipelines. Availability: https://github.com/alvesrco/pipecov.


Subject(s)
COVID-19 , Viruses , Humans , SARS-CoV-2/genetics , COVID-19/genetics , Genome, Viral/genetics , Genomics , Viruses/genetics
6.
PLoS One ; 17(3): e0265449, 2022.
Article in English | MEDLINE | ID: mdl-35298523

ABSTRACT

Ipomoea is a large, globally distributed pantropical genus whose importance goes beyond its economic value as a food resource or ornamental crop. This highly diverse genus has been the focus of a great number of studies, enriching plant genomics knowledge and challenging plant evolution models. In the Carajás mountain range, located in Eastern Amazon, the savannah-like ferruginous ecosystem known as canga harbors highly specialized plant and animal populations, and Ipomoea is substantially representative in such restrictive habitat. Thus, to provide genetic data and insights into whole plastome phylogenetic relationships among key Ipomoea species from Eastern Amazon with little to no previously available data, we present the complete plastome sequences of twelve lineages of the genus, including the canga microendemic I. cavalcantei, the closely related I. marabaensis, and their putative hybrids. The twelve plastomes presented similar gene content to most publicly available Ipomoea plastomes, and the putative hybrids were correctly placed as closely related to the two parental species. The cavalcantei-marabaensis group was consistently recovered across phylogenetic methods. The closer relationship of the I. carnea plastome with the cavalcantei-marabaensis group, as well as the branch formed by I. quamoclit, I. asarifolia and I. maurandioides, were probably a consequence of insufficient taxonomic representativity rather than true genetic closeness, reinforcing the importance of new plastome assemblies to resolve inconsistencies and boost statistical confidence, especially for South American clades of Ipomoea. The search for k-mers presenting high dispersion among the frequency distributions pointed to highly variable coding and intergenic regions, which may contribute to the genetic diversity observed at species level.
Our results contribute to the resolution of uncertain clades within Ipomoea and future phylogenomic studies, bringing unprecedented results to Ipomoea species with restricted distribution, such as I. cavalcantei.


Subject(s)
Ipomoea , Animals , DNA, Intergenic , Ecosystem , Genome, Plant , Ipomoea/genetics , Phylogeny
7.
Ecol Evol ; 11(19): 13348-13362, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34646474

ABSTRACT

The canga of the Serra dos Carajás, in Eastern Amazon, is home to a unique open plant community, harboring several endemic and rare species. Although a complete flora survey has been recently published, scarce to no genetic information is available for most plant species of the ironstone outcrops of the Serra dos Carajás. In this scenario, DNA barcoding appears as a fast and effective approach to assess the genetic diversity of the Serra dos Carajás flora, considering the growing need for robust biodiversity conservation planning in such an area with industrial mining activities. Thus, after testing eight different DNA barcode markers (matK, rbcL, rpoB, rpoC1, atpF-atpH, psbK-psbI, trnH-psbA, and ITS2), we chose rbcL and ITS2 as the most suitable markers for a broad application in the regional flora. Here we describe DNA barcodes for 1,130 specimens of 538 species, 323 genera, and 115 families of vascular plants from a highly diverse flora in the Amazon basin, with a total of 344 species being barcoded for the first time. In addition, we assessed the potential of using DNA metabarcoding of bulk samples for surveying plant diversity in the canga. Upon achieving the first comprehensive DNA barcoding effort directed to a complete flora in the Brazilian Amazon, we discuss the relevance of our results to guide future conservation measures in the Serra dos Carajás.

8.
Ecol Evol ; 11(15): 10119-10132, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34367563

ABSTRACT

The quillwort Isoëtes cangae is a critically endangered species occurring in a single lake in Serra dos Carajás, Eastern Amazon. Low genetic diversity and small effective population sizes (Ne) are expected for narrow endemic species (NES). Conservation biology studies centered on a single species have some limitations, but they are still useful considering the limited time and resources available for protection of species at risk of extinction. Here, we evaluated the genetic diversity, population structure, Ne, and minimum viable population (MVP) of I. cangae to provide information for effective conservation programs. Our analyses were based on 55 individuals collected from the Amendoim Lake and 35,638 neutral SNPs. Our results indicated a single panmictic population, moderate levels of genetic diversity, and Ne in the order of thousands, contrary to what is expected for NES. Negative FIS values were also found, suggesting that I. cangae is not at risk of inbreeding depression. Our findings imply that I. cangae contains enough genetic diversity to ensure evolutionary potential and that all individuals should be treated as one demographic unit. These results provide essential information to optimize ex situ conservation efforts and genetic diversity monitoring, which are currently applied to guide I. cangae conservation plans.
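The reasoning behind the FIS result can be made concrete with the standard definition of the inbreeding coefficient, FIS = 1 - Ho/He, where Ho is the observed and He the expected heterozygosity. A negative value means more heterozygotes than expected under random mating. The sketch below uses made-up heterozygosity values, not figures from the study, and the function name is illustrative.

```python
def f_is(observed_het: float, expected_het: float) -> float:
    """Inbreeding coefficient F_IS = 1 - Ho / He.

    Negative values indicate an excess of heterozygotes relative to
    Hardy-Weinberg expectations; positive values indicate a deficit,
    as produced by inbreeding.
    """
    if expected_het <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - observed_het / expected_het


if __name__ == "__main__":
    # Hypothetical values chosen only to show the sign of the result.
    print(f_is(observed_het=0.30, expected_het=0.25))  # heterozygote excess
    print(f_is(observed_het=0.20, expected_het=0.25))  # heterozygote deficit
```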

9.
BMC Bioinformatics ; 22(1): 87, 2021 Feb 25.
Article in English | MEDLINE | ID: mdl-33632132

ABSTRACT

BACKGROUND: Microbes perform a fundamental economic, social, and environmental role in our society. Metagenomics makes it possible to investigate microbes in their natural environments (the complex communities) and their interactions. The way they act is usually estimated by looking at the functions they play in those environments, and their responsibility is measured by their genes. The advances of next-generation sequencing technology have facilitated metagenomics research; however, they also create a heavy computational burden. Large and complex biological datasets are available as never before. Many gene predictors are available that can aid the gene annotation process, though they do not appropriately handle the complexities of metagenomic data. There is no standard metagenomic benchmark data for gene prediction. Thus, gene predictors may inflate their results by obfuscating low false discovery rates. RESULTS: We introduce geneRFinder, an ML-based gene predictor able to outperform state-of-the-art gene prediction tools across this benchmark by using only one pre-trained Random Forest model. Average prediction rates of geneRFinder differed in percentage terms by 54% and 64%, respectively, against Prodigal and FragGeneScan while handling high complexity metagenomes. The specificity rate of geneRFinder had the largest distance against FragGeneScan, 79 percentage points, and 66 more than Prodigal. According to McNemar's test, all percentage differences between the predictors' performances are statistically significant for all datasets at a 99% confidence level.
CONCLUSIONS: We provide geneRFinder, an approach for gene prediction in distinct metagenomic complexities, available at gitlab.com/r.lorenna/generfinder and https://osf.io/w2yd6/. We also provide a novel, comprehensive benchmark data set for gene prediction, which is based on The Critical Assessment of Metagenome Interpretation (CAMI) challenge and contains labeled data from gene regions, available at https://sourceforge.net/p/generfinder-benchmark.
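McNemar's test, cited in the results, compares two predictors evaluated on the same items by looking only at the discordant pairs: items one predictor got right and the other got wrong. The sketch below is a minimal version without continuity correction, assuming the chi-square approximation with one degree of freedom; the counts are hypothetical and do not come from the paper.

```python
import math


def mcnemar(b: int, c: int) -> tuple[float, float]:
    """McNemar's chi-square test on discordant pair counts.

    b: items predictor A classified correctly and predictor B incorrectly
    c: items predictor B classified correctly and predictor A incorrectly
    Returns (statistic, p_value) using the chi-square distribution with
    one degree of freedom and no continuity correction.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs to test")
    statistic = (b - c) ** 2 / (b + c)
    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    stat, p = mcnemar(b=30, c=10)  # hypothetical discordant counts
    print(stat, p, p < 0.01)
```

A p-value below 0.01 would correspond to the 99% level quoted in the abstract.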


Subject(s)
High-Throughput Nucleotide Sequencing , Metagenome , Metagenomics , Algorithms , Benchmarking , Molecular Sequence Annotation
10.
Mol Ecol Resour ; 21(1): 44-58, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32419278

ABSTRACT

Despite the importance of climate-adjusted provenancing to mitigate the effects of environmental change, climatic considerations alone are insufficient when restoring highly degraded sites. Here we propose a comprehensive landscape genomic approach to assist the restoration of moderately disturbed and highly degraded sites. To illustrate it we employ genomic data sets comprising thousands of single nucleotide polymorphisms from two plant species suitable for the restoration of iron-rich Amazonian Savannas. We first use a subset of neutral loci to assess genetic structure and determine the genetic neighbourhood size. We then identify genotype-phenotype-environment associations, map adaptive genetic variation, and predict adaptive genotypes for restoration sites. Whereas local provenances were found optimal to restore a moderately disturbed site, a mixture of genotypes seemed the most promising strategy to recover a highly degraded mining site. We discuss how our results can help define site-adjusted provenancing strategies, and argue that our methods can be more broadly applied to assist other restoration initiatives.


Subject(s)
Environmental Restoration and Remediation , Genomics , Genotype , Phenotype , Adaptation, Physiological , Genetic Association Studies , Polymorphism, Single Nucleotide
11.
Biota Neotrop. (Online, Ed. ingl.) ; 21(1): e20201004, 2021. graf
Article in English | LILACS-Express | LILACS | ID: biblio-1153210

ABSTRACT

Abstract: Pollen in honey samples of Melipona seminigra pernigra Moure & Kerr 1950, collected between 2017 and 2019 from experimental apiaries installed in campo rupestre on canga (CRC) vegetation of the Serra dos Carajás, was analyzed to evaluate the seasonal floral availability of undisturbed and mining-influenced areas. Around one hundred pollen types were identified, mainly belonging to Fabaceae, Myrtaceae and Euphorbiaceae (31, 6 and 5 species, respectively). The mining area presented the highest pollen richness, almost twice that identified in the undisturbed areas. 80% of the pollen types are rare, with concentrations ≤ 2,000 pollen grains/10 g, while the remaining types were the most abundant, frequent and the primary bee sources. The latter correspond mostly to native plant species such as Tapirira guianensis Aubl., Protium spp., Aparisthmium cordatum (A.Juss.) Baill., Mimosa acutistipula var. ferrea Barneby, Periandra mediterranea (Vell.) Taub., Miconia spp., Pleroma carajasense K.Rocha, Myrcia splendens (Sw.) DC., Serjania spp. and Solanum crinitum Lam. All pollen types were identified during both seasons, but higher concentration values are related to the dry period (June-September). The statistical analysis of the pollen data indicated no significant difference between undisturbed and mining-influenced areas, since the primary bee sources of this study are widely used in the revegetation of mined areas.


Abstract (translated from the Portuguese Resumo): The pollen content of honey samples collected in 2017 and 2019 from experimental apiaries of Melipona seminigra pernigra Moure & Kerr 1950, installed within campo rupestre vegetation on a canga outcrop in the Serra dos Carajás, southeastern Amazonia, was analyzed to understand the local variability of floral resources in natural and disturbed areas. Approximately 100 pollen types were identified, belonging mainly to the families Fabaceae, Myrtaceae and Euphorbiaceae (31, 6 and 5 species, respectively). Mining areas presented the highest pollen richness, almost twice that identified in undisturbed areas. 80% of the pollen types are rare, with concentrations ≤ 2,000 pollen grains/10 g, while the remaining types were the most abundant, frequent and the primary sources for the bees. The latter correspond mainly to native plants such as Tapirira guianensis Aubl., Protium spp., Aparisthmium cordatum (A.Juss.) Baill., Mimosa acutistipula var. ferrea Barneby, Periandra mediterranea (Vell.) Taub., Miconia spp., Pleroma carajasense K.Rocha, Myrcia splendens (Sw.) DC., Serjania spp. and Solanum crinitum Lam. All pollen types were identified during both seasons, but high concentrations are related to the dry period (June-September). The statistical analysis indicated no significant difference in the honey pollen data between natural areas and previously degraded areas, since the primary bee sources of this study are widely used in the revegetation of mined areas.

14.
BioData Min ; 12: 13, 2019.
Article in English | MEDLINE | ID: mdl-31320927

ABSTRACT

BACKGROUND: Fraudulent milk adulteration is a dangerous practice in the dairy industry that is harmful to consumers, since milk is one of the most consumed food products. Milk quality can be assessed by Fourier-transform infrared spectroscopy (FTIR), a simple and fast method for obtaining its compositional information. The spectral data produced by this technique can be explored using machine learning methods, such as neural networks and decision trees, in order to create models that represent the characteristics of pure and adulterated milk samples. RESULTS: Thousands of milk samples were collected; some of them were manually adulterated with five different substances and subjected to infrared spectroscopy. This technique produced spectral data on the composition of the milk samples, which were used for training different machine learning algorithms, such as deep and ensemble decision tree learners. The proposed method is used to predict the presence of adulterants as a binary classification problem and also to determine which of the five adulterants was found through multiclass classification. In deep learning, we propose a Convolutional Neural Network architecture that needs no preprocessing of spectral data. The classifiers evaluated show promising results, with classification accuracies up to 98.76%, outperforming commonly used classical learning methods. CONCLUSIONS: The proposed methodology uses machine learning techniques on milk spectral data. It is able to predict common adulterations that occur in the dairy industry. Both deep and ensemble tree learners were evaluated considering binary and multiclass classifications and the results were compared. The proposed neural network architecture is able to outperform the composition recognition made by the FTIR equipment and by commonly used methods in the dairy industry.

15.
Evol Appl ; 12(6): 1164-1177, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31293629

ABSTRACT

Habitat degradation and climate change are currently threatening wild pollinators, compromising their ability to provide pollination services to wild and cultivated plants. Landscape genomics offers powerful tools to assess the influence of landscape modifications on genetic diversity and functional connectivity, and to identify adaptations to local environmental conditions that could facilitate future bee survival. Here, we assessed range-wide patterns of genetic structure, genetic diversity, gene flow, and local adaptation in the stingless bee Melipona subnitida, a tropical pollinator of key biological and economic importance inhabiting one of the driest and hottest regions of South America. Our results reveal four genetic clusters across the species' full distribution range. All populations were found to be under a mutation-drift equilibrium, and genetic diversity was not influenced by the amount of remnant natural habitat. However, genetic relatedness was spatially autocorrelated, and isolation by landscape resistance explained range-wide relatedness patterns better than isolation by geographic distance, contradicting earlier findings for stingless bees. Specifically, gene flow was enhanced by increased thermal stability, higher forest cover, lower elevations, and less corrugated terrains. Finally, we detected genomic signatures of adaptation to temperature, precipitation, and forest cover, spatially distributed in latitudinal and altitudinal patterns. Taken together, our findings shed important light on the life history of M. subnitida and highlight the role of regions with large thermal fluctuations, deforested areas, and mountain ranges as dispersal barriers. Conservation actions such as restricting long-distance colony transportation, preserving local adaptations, and improving the connectivity between highlands and lowlands are likely to assure future pollination services.

16.
Sci Data ; 6: 190008, 2019 02 12.
Article in English | MEDLINE | ID: mdl-30747914

ABSTRACT

Microorganisms are useful environmental indicators, able to deliver essential insights into processes regarding mine land rehabilitation. To compare microbial communities from a chronosequence of mine land rehabilitation to pre-disturbance levels from reference sites covered by native vegetation, we sampled non-rehabilitated, rehabilitating and reference study sites from the Urucum Massif, Southwestern Brazil. From each study site, three composite soil samples were collected for chemical, physical, and metagenomics analysis. We used a paired-end library sequencing technology (NextSeq 500 Illumina); the reads were assembled using MEGAHIT. Coding DNA sequences (CDS) were identified using Kaiju in combination with non-redundant NCBI BLAST reference sequences containing archaea, bacteria, and viruses. Additionally, a functional classification was performed by EMG v2.3.2. Here, we provide the raw data and assembly (reads and contigs), followed by initial functional and taxonomic analysis, as a baseline for further studies of this kind. Further investigation is needed to fully understand the mechanisms of environmental rehabilitation in tropical regions, and we hope this collection inspires other researchers to explore it for hypothesis testing.


Subject(s)
Environmental Monitoring/methods , Metagenomics/methods , Microbiota , Soil Microbiology , Archaea/genetics , Bacteria/genetics , Brazil , High-Throughput Nucleotide Sequencing , Iron , Microbiota/genetics , Mining , Viruses/genetics
17.
Brief Bioinform ; 20(6): 2116-2129, 2019 11 27.
Article in English | MEDLINE | ID: mdl-30137230

ABSTRACT

MOTIVATION: With the recent advances in DNA sequencing technologies, the study of the genetic composition of living organisms has become more accessible for researchers. Several advances have been achieved because of it, especially in the health sciences. However, many challenges which emerge from the complexity of sequencing projects remain unsolved. Among them is the task of assembling DNA fragments from previously unsequenced organisms, which is classified as an NP-hard (nondeterministic polynomial time hard) problem, for which no efficient computational solution with reasonable execution time exists. However, several tools that produce approximate solutions have been used with results that have facilitated scientific discoveries, although there is ample room for improvement. As with other NP-hard problems, machine learning algorithms have been one of the approaches used in recent years in an attempt to find better solutions to the DNA fragment assembly problem, although still at a low scale. RESULTS: This paper presents a broad review of pioneering literature comprising artificial intelligence-based DNA assemblers, particularly the ones that use machine learning, to provide an overview of state-of-the-art approaches and to serve as a starting point for further study in this field.


Subject(s)
Genome , Machine Learning , Algorithms , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA
18.
Sci Rep ; 8(1): 14799, 2018 10 04.
Article in English | MEDLINE | ID: mdl-30287878

ABSTRACT

Freshwater fungi are key decomposers of organic material and play important roles in nutrient cycling, bio-remediation and ecosystem functioning. Although aquatic fungal communities respond to pollution, few studies have quantitatively assessed the effect of freshwater contamination on fungal diversity and composition, and knowledge is scarcer for tropical systems. Here we help fill this knowledge gap by studying a heavily contaminated South American river spanning a biodiversity hotspot. We collected 30 water samples scattered across a quality gradient over two seasons and analyzed them using Terminal Restriction Fragment Length Polymorphisms (T-RFLP) coupled with 454 Pyrosequencing. Using T-RFLP we identified 451 and 442 Operational Taxonomic Units (OTUs) in the dry and rainy seasons respectively, whereas Pyrosequencing revealed 48,553 OTUs, of which 11% were shared between seasons. Although 68% of all OTUs and 51% of all phyla remained unidentified, dominant fungal phyla included the Ascomycota, Basidiomycota, Chytridiomycota, Glomeromycota, Zygomycota and Neocallimastigomycota, while Calcarisporiella, Didymosphaeria, Mycosphaerella (Ascomycota) and Rhodotorula (Basidiomycota) were the most abundant genera. Fungal diversity was affected by pH and dissolved iron, while community composition was influenced by dissolved oxygen, pH, nitrate, biological oxygen demand, total aluminum, total organic carbon, total iron and seasonality. The presence of potentially pathogenic species was associated with high pH. Furthermore, geographic distance was positively associated with community dissimilarity, suggesting that local conditions allowed divergence among fungal communities. Overall, our findings raise potential concerns for human health and the functioning of tropical river ecosystems, and they call for improved water sanitation systems.


Subject(s)
Fungi/classification , Fungi/isolation & purification , Mycobiome , Rivers/microbiology , Water Pollutants, Chemical/analysis , Water Quality , Biological Oxygen Demand Analysis , High-Throughput Nucleotide Sequencing , Hydrogen-Ion Concentration , Polymorphism, Restriction Fragment Length , Seasons , South America , Tropical Climate
19.
PLoS One ; 13(8): e0201417, 2018.
Article in English | MEDLINE | ID: mdl-30089144

ABSTRACT

Isoetes are ancient quillworts, members of the only genus of the order Isoetales. The genus is slow evolving but resilient and widespread worldwide. Two recently described species occur in the Eastern Brazilian Amazon, Isoetes serracarajensis and Isoetes cangae. They are found in the ironstone grasslands known as Canga. While I. serracarajensis is present mostly in seasonal water bodies, I. cangae is known to occur in a single permanent lake at the South mountain range. In this work, we undertake an extensive morphological, physiological and genetic characterization of both species to establish species boundaries and better understand the morphological and genetic features of these two species. Our results indicate that the morphological differentiation of the species is subtle and requires a quantitative assessment of morphological elements of the megaspore for diagnosis. We did not detect differences in microspore output, but morphological peculiarities may establish a reproductive barrier. Additionally, genetic analysis using DNA barcodes and whole chloroplast genomes indicates that although the plants are genetically very similar, both approaches provide diagnostic characters. There was no indication of population structuring in I. serracarajensis. These results set the basis for a deeper understanding of the evolution of the Isoetes genus.


Subject(s)
DNA Barcoding, Taxonomic , Genome, Chloroplast , Lycopodiaceae , Lycopodiaceae/classification , Lycopodiaceae/genetics , Lycopodiaceae/growth & development , South America
20.
BMC Bioinformatics ; 19(1): 297, 2018 08 08.
Article in English | MEDLINE | ID: mdl-30089465

ABSTRACT

BACKGROUND: Taxonomic identification of plants and insects is a hard process that demands expert taxonomists and time, and species are often difficult to distinguish on morphology alone. DNA barcodes allow rapid species discovery and identification and have been widely used for taxonomic identification by targeting known gene regions that make it possible to discriminate these species. DNA barcode sequence analysis is usually carried out with processes and tools that still demand a high level of interaction with the user or researcher. To reduce such interaction as much as possible, we proposed PIPEBAR, a pipeline for the analysis of DNA chromatograms from Sanger platform sequencing, ensuring high-quality consensus sequences along with efficient running time. We also proposed a paired-end reads assembly tool, OverlapPER, which can be used in sequence with or independently of PIPEBAR. RESULTS: PIPEBAR is a command line tool to automate the processing of large numbers of trace files. It is as accurate as the proprietary Geneious tool and faster than the most popular software for barcoding analysis. It is 7 times faster than Geneious and 14 times faster than SeqTrace for processing hundreds of barcoding sequences. OverlapPER is a novel tool for accurately overlapping paired-end reads that accepts both substitution and indel errors and returns both overlapped and non-overlapped regions between a pair of reads. OverlapPER obtained the best results compared to currently used tools when merging 1,000,000 simulated paired-end reads. CONCLUSIONS: PIPEBAR and OverlapPER run on most operating systems and are freely available, along with supporting code and documentation, at https://sourceforge.net/projects/PIPEBAR/ and https://sourceforge.net/projects/overlapper-reads/.


Subject(s)
DNA Barcoding, Taxonomic/methods , Software , Base Sequence , Codon, Terminator/genetics , Consensus Sequence , Frameshift Mutation/genetics