ABSTRACT
Genomic data are being produced and archived at a prodigious rate, and current studies could become historical baselines for future global genetic diversity analyses and monitoring programs. However, when we evaluated the potential utility of genomic data from wild and domesticated eukaryote species in the world's largest genomic data repository, we found that most archived genomic datasets (86%) lacked the spatiotemporal metadata necessary for genetic biodiversity surveillance. Labor-intensive scouring of a subset of published papers yielded geospatial coordinates and collection years for only 33% (39% if place names were considered) of these genomic datasets. Streamlined data input processes, updated metadata deposition policies, and enhanced scientific community awareness are urgently needed to preserve these irreplaceable records of today's genetic biodiversity and to plug the growing metadata gap.
Subjects
Biodiversity, Data Accuracy, Eukaryota/genetics, Genetic Variation, Genome, Genomics/methods, Population Dynamics
ABSTRACT
Genetic diversity within species represents a fundamental yet underappreciated level of biodiversity. Because genetic diversity can indicate species resilience to changing climate, its measurement is relevant to many national and global conservation policy targets. Many studies produce large amounts of genome-scale genetic diversity data for wild populations, but most (87%) do not include the associated spatial and temporal metadata necessary for them to be reused in monitoring programs or for acknowledging the sovereignty of nations or Indigenous peoples. We undertook a distributed datathon to quantify the availability of these missing metadata and to test the hypothesis that their availability decays with time. We also worked to remediate missing metadata by extracting them from associated published papers, online repositories, and direct communication with authors. Starting with 848 candidate genomic data sets (reduced representation and whole genome) from the International Nucleotide Sequence Database Collaboration, we determined that 561 contained mostly samples from wild populations. We successfully restored spatiotemporal metadata for 78% of these 561 data sets (n = 440 data sets with data on 45,105 individuals from 762 species in 17 phyla). Examining papers and online repositories was much more fruitful than contacting 351 authors, who replied to our email requests 45% of the time. Overall, 23% of our email queries to authors unearthed useful metadata. The probability of retrieving spatiotemporal metadata declined significantly as age of the data set increased. There was a 13.5% yearly decrease in metadata associated with published papers or online repositories and up to a 22% yearly decrease in metadata that were only available from authors. 
This rapid decay in metadata availability, mirrored in studies of other types of biological data, should motivate swift updates to data-sharing policies and researcher practices to ensure that the valuable context provided by metadata is not lost to conservation science forever.
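To get a feel for how quickly such a decline compounds, the sketch below treats the reported yearly decreases (13.5% and up to 22%) as constant multiplicative rates applied to a starting availability of 1.0. This is an illustrative assumption only: the abstract does not state whether the percentages refer to probabilities or odds, and the function name `remaining_fraction` and the chosen time horizons are hypothetical, not taken from the study.

```python
def remaining_fraction(yearly_decrease: float, years: int) -> float:
    """Fraction of the starting availability left after `years`.

    Assumes a constant multiplicative yearly decrease. This is a
    simplifying assumption for illustration, not the study's fitted model.
    """
    if not 0.0 <= yearly_decrease <= 1.0:
        raise ValueError("yearly_decrease must be between 0 and 1")
    if years < 0:
        raise ValueError("years must be non-negative")
    return (1.0 - yearly_decrease) ** years


if __name__ == "__main__":
    # Rates quoted in the abstract; the 22% figure is an upper bound ("up to").
    rates = {"papers/repositories": 0.135, "authors only": 0.22}
    for label, rate in rates.items():
        for years in (1, 5, 10):
            fraction = remaining_fraction(rate, years)
            print(f"{label:>20} after {years:2d} y: {fraction:.2f}")
```

Under this simplified reading, a little under half of the starting availability would remain after five years at 13.5% per year, and under a third at 22% per year.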
Importance of timely metadata curation for the global surveillance of genetic diversity
Abstract
Intraspecific genetic diversity represents a fundamental yet undervalued level of biodiversity. Genetic diversity can indicate a species' resilience to a changing climate, so its measurement is relevant to many global and national conservation policy targets. Many studies produce large amounts of genome-scale diversity data for wild populations, but most (87%) do not include the associated spatial and temporal metadata needed for the data to be reused in monitoring programs or to acknowledge the sovereignty of nations or Indigenous peoples. We carried out a distributed datathon to quantify the availability of these missing metadata and to test the hypothesis that this availability decays over time. We also worked to restore missing metadata by extracting them from associated published papers, online repositories, and direct communication with authors. Starting with 848 candidate genomic data sets (reduced representation and whole genome) from the International Nucleotide Sequence Database Collaboration, we determined that 561 contained mostly samples from wild populations. We successfully restored spatiotemporal metadata for 78% of these 561 data sets (n = 440 data sets with information on 45,105 individuals from 762 species in 17 phyla). Examining papers and online repositories was much more productive than contacting the 351 authors, who responded to our emails 45% of the time. Overall, 23% of our queries uncovered useful metadata.
The probability of retrieving spatiotemporal metadata declined significantly as the age of the data set increased. There was a 13.5% yearly decrease in metadata associated with published papers and online repositories and up to a 22% yearly decrease in metadata that were available only through communication with authors. This rapid decay in metadata availability, mirrored in studies of other types of biological data, should motivate prompt updates to data-sharing policies and researcher practices to ensure that the valuable context provided by metadata is not lost to conservation science forever.
Subjects
Conservation of Natural Resources, Metadata, Humans, Biodiversity, Probability, Genetic Variation
ABSTRACT
The response of ectotherms to temperature stress is complex, non-linear, and influenced by life stage and previous thermal exposure. Mortality is higher under constant low temperatures than under a fluctuating thermal regime (FTR) that maintains the same low temperature but adds a brief, daily pulse of increased temperature. Long-term exposure to FTR has been shown to increase transcription of genes involved in oxidative stress, immune function, and metabolic pathways, which may aid in recovery from chill injury and oxidative damage. Previous research suggests that the transcriptional response that protects against sub-lethal damage occurs rapidly under exposure to fluctuating temperatures. However, existing studies have examined gene expression only after a week or over many months. Here we characterize gene expression during a single temperature cycle under FTR. Development of pupating alfalfa leafcutting bees (Megachile rotundata) was interrupted at the red-eye stage, and the pupae were transferred to 6°C, exposed to a 1-h pulse at 20°C, and then returned to 6°C. RNA was collected before, during, and after the temperature pulse and compared with RNA from pupae maintained at a static 6°C. The warm pulse is sufficient to cause expression of transcripts associated with repair of cell membrane damage, modification of membrane composition, production of antifreeze proteins, restoration of ion homeostasis, and response to oxidative stress. This pattern of expression indicates that even brief exposure to warm temperatures has significant protective effects, persisting beyond the warm pulse, on insects exposed to stressful cold temperatures. Megachile rotundata's sensitivity to temperature fluctuations indicates that short exposures to temperature changes affect development and physiology. Genes associated with developmental patterning are expressed after the warm pulse, suggesting that 1 h at 20°C was enough to resume development in the pupae.
The greatest difference in gene expression occurred between pupae collected after the warm pulse and pupae held at constant low temperature. Although both groups were collected at the same time and temperature, the transcriptional response to one FTR cycle included multiple transcripts that were previously identified under long-term FTR exposure and are associated with recovery from chill injury, indicating that the effects of FTR occur rapidly and persist.