ABSTRACT
Understanding the origins of biodiversity has been an aspiration since the days of early naturalists. The immense complexity of ecological, evolutionary, and spatial processes, however, has made this goal elusive to this day. Computer models serve progress in many scientific fields, but in the fields of macroecology and macroevolution, eco-evolutionary models are comparatively less developed. We present a general, spatially explicit, eco-evolutionary engine with a modular implementation that enables the modeling of multiple macroecological and macroevolutionary processes and feedbacks across representative spatiotemporally dynamic landscapes. Modeled processes can include species' abiotic tolerances, biotic interactions, dispersal, speciation, and evolution of ecological traits. Commonly observed biodiversity patterns, such as α, β, and γ diversity, species ranges, ecological traits, and phylogenies, emerge as simulations proceed. As an illustration, we examine alternative hypotheses expected to have shaped the latitudinal diversity gradient (LDG) during the Earth's Cenozoic era. Our exploratory simulations simultaneously produce multiple realistic biodiversity patterns, such as the LDG, current species richness, and range size frequencies, as well as phylogenetic metrics. The model engine is open source and available as an R package, enabling future exploration of various landscapes and biological processes, while outputs can be linked with a variety of empirical biodiversity patterns. This work represents a key step toward a numeric, interdisciplinary, and mechanistic understanding of the physical and biological processes that shape Earth's biodiversity.
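As a rough illustration of the α, β, and γ diversity patterns mentioned above, the sketch below computes them from species lists per site. It is not taken from the R package described in the abstract: the function name `diversity_partition`, the site names, and the species labels are hypothetical, and β is computed with Whittaker's multiplicative definition (γ divided by mean α), which is only one of several definitions in use.

```python
def diversity_partition(communities):
    """Compute alpha, gamma, and Whittaker's beta diversity.

    communities: dict mapping a site name to the set of species found there.
    (Hypothetical input format, chosen for illustration only.)
    """
    if not communities:
        raise ValueError("at least one site is required")

    # Alpha diversity: species richness within each site.
    alpha_per_site = {site: len(species) for site, species in communities.items()}
    mean_alpha = sum(alpha_per_site.values()) / len(alpha_per_site)

    # Gamma diversity: species richness of the pooled region.
    gamma = len(set().union(*communities.values()))

    # Whittaker's multiplicative beta: regional richness over mean local richness.
    if mean_alpha == 0:
        raise ValueError("beta is undefined when every site is empty")
    beta = gamma / mean_alpha

    return alpha_per_site, mean_alpha, gamma, beta


# Made-up example data: three sites along a latitudinal gradient.
sites = {
    "tropical": {"sp1", "sp2", "sp3", "sp4"},
    "temperate": {"sp3", "sp4"},
    "polar": {"sp4", "sp5"},
}
alpha_per_site, mean_alpha, gamma, beta = diversity_partition(sites)
print(alpha_per_site, gamma, round(beta, 3))
```

In this toy example the tropical site has the highest α, and β exceeds 1 because no single site contains the whole regional species pool.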
Subjects
Biological Evolution, Computer Simulation, Planet Earth, Biodiversity, Ecology, Empirical Research, Genetic Speciation
ABSTRACT
Environmental DNA (eDNA) metabarcoding provides an efficient approach for documenting biodiversity patterns in marine and terrestrial ecosystems. The complexity of these data prevents current methods from extracting and analyzing all the relevant ecological information they contain, and new methods may provide better dimensionality reduction and clustering. Here we present two new deep learning-based methods that combine different types of neural networks (NNs) to ordinate eDNA samples and visualize ecosystem properties in a two-dimensional space: the first is based on variational autoencoders and the second on deep metric learning. The strength of our new methods lies in the combination of two inputs: the number of sequences found for each molecular operational taxonomic unit (MOTU) detected and their corresponding nucleotide sequence. Using three different datasets, we show that our methods accurately represent several biodiversity indicators in a two-dimensional latent space: MOTU richness per sample, sequence α-diversity per sample, and Jaccard's and sequence β-diversity between samples. We show that our nonlinear methods are better at extracting features from eDNA datasets while avoiding the major biases associated with eDNA. Our methods outperform traditional dimension reduction methods such as Principal Component Analysis, t-distributed Stochastic Neighbour Embedding, Nonmetric Multidimensional Scaling and Uniform Manifold Approximation and Projection for dimension reduction. Our results suggest that NNs provide a more efficient way of extracting structure from eDNA metabarcoding data, thereby improving their ecological interpretation and thus biodiversity monitoring.
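Two of the indicators listed above, MOTU richness per sample and Jaccard's dissimilarity between samples, can be computed directly from read counts. The following is a minimal sketch under stated assumptions, not the authors' code: the function names, the MOTU identifiers, and the read counts are hypothetical, and a MOTU is treated as present whenever its read count is above zero.

```python
def motu_richness(counts):
    """Number of MOTUs detected in a sample (read count above zero)."""
    return sum(1 for n in counts.values() if n > 0)


def jaccard_dissimilarity(sample_a, sample_b):
    """Jaccard dissimilarity between two samples, based on MOTU presence/absence.

    Each sample is a dict mapping a MOTU identifier to its read count.
    Returns 0.0 for identical composition and 1.0 for no shared MOTUs.
    """
    present_a = {motu for motu, n in sample_a.items() if n > 0}
    present_b = {motu for motu, n in sample_b.items() if n > 0}
    union = present_a | present_b
    if not union:
        # Two empty samples: treated as identical by convention (an assumption).
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


# Made-up read counts for two samples.
sample_a = {"MOTU_1": 120, "MOTU_2": 3, "MOTU_3": 0}
sample_b = {"MOTU_2": 45, "MOTU_3": 10, "MOTU_4": 7}

richness_a = motu_richness(sample_a)
richness_b = motu_richness(sample_b)
dissimilarity = jaccard_dissimilarity(sample_a, sample_b)
print(richness_a, richness_b, dissimilarity)
```

Because Jaccard's index uses only presence or absence, it ignores read abundance, whereas the methods in the abstract take both the read counts and the nucleotide sequences as inputs.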
Subjects
Environmental DNA, Deep Learning, Ecosystem, Taxonomic DNA Barcoding/methods, Environmental Monitoring/methods, Biodiversity
ABSTRACT
High-throughput DNA sequencing is becoming an increasingly important tool to monitor and better understand biodiversity responses to environmental changes in a standardized and reproducible way. Environmental DNA (eDNA) from organisms can be captured in ecosystem samples and sequenced using metabarcoding, but processing large volumes of eDNA data and annotating sequences to recognized taxa remains computationally expensive. Speed and accuracy are two major bottlenecks in this critical step. Here, we evaluated the ability of convolutional neural networks (CNNs) to process short eDNA sequences and associate them with taxonomic labels. Using a unique eDNA data set collected in highly diverse tropical South America, we compared the speed and accuracy of CNNs with those of a well-known bioinformatic pipeline (OBITools) in processing a small region (60 bp) of the 12S ribosomal DNA targeting freshwater fishes. We found that the taxonomic labels from the CNNs were comparable to those from OBITools, with high correlation levels for the composition of the regional fish fauna. The CNNs enabled the processing of raw fastq files at a rate of approximately 1 million sequences per minute, which was about 150 times faster than with OBITools. Given the good performance of CNNs in the highly diverse ecosystem considered here, the development of more elaborate CNNs promises fast deployment for future biodiversity inventories using eDNA.
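A CNN needs short DNA sequences in numeric form, and one-hot encoding is a common way to provide it. The abstract does not state which encoding the authors used, so the sketch below is an assumption for illustration only: the function name `one_hot_encode`, the A/C/G/T column order, and the handling of ambiguous bases and padding as all-zero rows are hypothetical choices. The default length of 60 mirrors the 60 bp region mentioned above.

```python
BASES = "ACGT"  # Column order of the encoding (an arbitrary choice).


def one_hot_encode(sequence, length=60):
    """Encode a DNA sequence as a length x 4 matrix of 0/1 values.

    Sequences longer than `length` are truncated; shorter ones are padded.
    Ambiguous bases (for example N) and padding positions become all-zero rows.
    """
    seq = sequence.upper()[:length]
    matrix = []
    for position in range(length):
        row = [0, 0, 0, 0]
        if position < len(seq) and seq[position] in BASES:
            row[BASES.index(seq[position])] = 1
        matrix.append(row)
    return matrix


# Short made-up read, padded to six positions to keep the output readable.
encoded = one_hot_encode("ACGTN", length=6)
for row in encoded:
    print(row)
```

Each read becomes a fixed-size matrix, so a batch of reads can be stacked into a single array and passed to a convolutional layer.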