ABSTRACT
The early twenty-first century has witnessed massive expansions in the availability and accessibility of digital data in virtually all domains of the biodiversity sciences. Led by an array of asynchronous digitization activities spanning ecological, environmental, climatological, and biological collections data, these initiatives have resulted in a plethora of mostly disconnected and siloed data, leaving researchers with the tedious and time-consuming manual task of finding these data, connecting them in usable ways, integrating them into coherent data sets, and making them interoperable. The focus to date has been on elevating analog and physical records to digital replicas in local databases before folding them into ever-growing aggregations of essentially disconnected, discipline-specific information. In the present article, we propose a new interconnected network of digital objects on the Internet, the Digital Extended Specimen (DES) network, that transcends existing aggregator technology, augments the DES with third-party data through machine algorithms, and provides a platform for more efficient research and robust interdisciplinary discovery.
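To make the idea of a linked digital object concrete, the sketch below models a Digital Extended Specimen as a record that carries a persistent identifier and a list of machine- or human-asserted links to third-party data. The field names, identifier schemes and link types are illustrative assumptions, not part of any DES specification.

```python
# A minimal sketch of a Digital Extended Specimen (DES) as a linked digital
# object. Field names, identifier schemes and link types are illustrative
# assumptions, not the DES specification.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Link:
    relation: str      # e.g. "hasSequence", "collectedAt", "imageOf"
    target_pid: str    # persistent identifier of the linked third-party object
    source: str        # system or person that asserted the link

@dataclass
class DigitalExtendedSpecimen:
    pid: str                          # persistent identifier of the DES itself
    scientific_name: str
    physical_specimen_id: str         # catalogue number of the physical voucher
    links: List[Link] = field(default_factory=list)

    def attach(self, relation: str, target_pid: str, source: str) -> None:
        """Record a machine- or human-asserted link to third-party data."""
        self.links.append(Link(relation, target_pid, source))

# Example: a matching algorithm links a specimen to a sequence record.
des = DigitalExtendedSpecimen("pid:des/123", "Quercus robur", "HERB-0042")
des.attach("hasSequence", "pid:genbank/AB123456", "sequence-matching-bot")
```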
ABSTRACT
BACKGROUND: Making forecasts about biodiversity and giving support to policy relies increasingly on large collections of data held electronically, and on substantial computational capability and capacity to analyse, model, simulate and predict using such data. However, the physically distributed nature of data resources and of expertise in advanced analytical tools creates many challenges for the modern scientist. Across the wider biological sciences, presenting such capabilities on the Internet (as "Web services") and using scientific workflow systems to compose them for particular tasks is a practical way to carry out robust "in silico" science. However, use of this approach in biodiversity science and ecology has thus far been quite limited. RESULTS: BioVeL is a virtual laboratory for data analysis and modelling in biodiversity science and ecology, freely accessible via the Internet. BioVeL includes functions for accessing and analysing data through curated Web services; for performing complex in silico analysis through exposure of R programs, workflows, and batch processing functions; for on-line collaboration through sharing of workflows and workflow runs; for experiment documentation through reproducibility and repeatability; and for computational support via seamless connections to supporting computing infrastructures. We developed and improved more than 60 Web services with significant potential in many different kinds of data analysis and modelling tasks. We composed reusable workflows using these Web services, also incorporating R programs. We deployed these tools into an easy-to-use and accessible 'virtual laboratory', freely available via the Internet, and applied the workflows in several diverse case studies. We opened the virtual laboratory for public use and, through a programme of external engagement, actively encouraged scientists and third-party application and tool developers to try out the services and contribute to the activity. CONCLUSIONS: Our work shows that we can deliver an operational, scalable and flexible Internet-based virtual laboratory to meet new demands for data processing and analysis in biodiversity science and ecology. In particular, we have successfully integrated existing and popular tools and practices from different scientific disciplines for use in biodiversity and ecological research.
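The core pattern described here, composing curated Web services (and wrapped R programs) into a reproducible workflow, can be sketched as a simple pipeline of HTTP calls. The endpoint URLs and payload fields below are hypothetical placeholders, not BioVeL's actual service interfaces.

```python
# A minimal sketch of the workflow pattern BioVeL exposes: chain curated Web
# services (here two hypothetical REST endpoints) into one reproducible run.
# The URLs and payload fields are illustrative assumptions, not BioVeL's API.
import json
import urllib.request

def call_service(url: str, payload: dict) -> dict:
    """POST a JSON payload to a Web service and return the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def run_workflow(species_name: str) -> dict:
    # Step 1: fetch occurrence records from a (hypothetical) data service.
    occurrences = call_service("https://example.org/occurrences", {"name": species_name})
    # Step 2: pass them to a (hypothetical) niche-modelling service wrapping an R program.
    return call_service("https://example.org/enm", {"records": occurrences})

# A workflow run is then just a documented function call that can be re-executed:
# suitability = run_workflow("Mnemiopsis leidyi")
```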
Subjects
Biodiversity, Ecology/methods, Ecology/instrumentation, Internet, Biological Models, Software, Workflow
ABSTRACT
The Taverna workflow tool suite (http://www.taverna.org.uk) is designed to combine distributed Web Services and/or local tools into complex analysis pipelines. These pipelines can be executed on local desktop machines or through larger infrastructure (such as supercomputers, Grids or cloud environments), using the Taverna Server. In bioinformatics, Taverna workflows are typically used in the areas of high-throughput omics analyses (for example, proteomics or transcriptomics), or for evidence gathering methods involving text mining or data mining. Through Taverna, scientists have access to several thousand different tools and resources that are freely available from a large range of life science institutions. Once constructed, the workflows are reusable, executable bioinformatics protocols that can be shared, reused and repurposed. A repository of public workflows is available at http://www.myexperiment.org. This article provides an update to the Taverna tool suite, highlighting new features and developments in the workbench and the Taverna Server.
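The remote-execution pattern mentioned above, submitting a workflow document to a server and polling it until it finishes, can be sketched as follows. The server URL, endpoint paths and status strings are illustrative assumptions and do not reproduce the documented Taverna Server REST API.

```python
# A minimal sketch of submitting a workflow document to a remote execution
# server and polling for completion. Endpoint paths and status values are
# illustrative assumptions, not the Taverna Server REST API.
import time
import urllib.request

SERVER = "https://example.org/workflow-server"   # hypothetical server

def submit(workflow_path: str) -> str:
    """Upload a workflow definition and return the URL of the new run."""
    with open(workflow_path, "rb") as fh:
        req = urllib.request.Request(f"{SERVER}/runs", data=fh.read(),
                                     headers={"Content-Type": "application/xml"})
    with urllib.request.urlopen(req) as resp:
        return resp.headers["Location"]           # server reports where the run lives

def wait_for(run_url: str, poll_seconds: int = 10) -> str:
    """Poll the run's status resource until it reports a terminal state."""
    while True:
        with urllib.request.urlopen(f"{run_url}/status") as resp:
            status = resp.read().decode().strip()
        if status in ("Finished", "Failed"):
            return status
        time.sleep(poll_seconds)
```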
Subjects
Computational Biology, Software, Data Mining, Gene Expression Profiling, Internet, Phylogeny, Proteomics, Search Engine, Workflow
ABSTRACT
Biodiversity informatics plays a central enabling role in the research community's efforts to address scientific conservation and sustainability issues. Great strides have been made in the past decade establishing a framework for sharing data, in which taxonomy and systematics has been perceived as the most prominent discipline involved. To some extent this is inevitable, given the use of species names as the pivot around which information is organised. To address the urgent questions around conservation, land use, environmental change, sustainability, food security and ecosystem services that are facing governments worldwide, we need to understand how the ecosystem works. So, we need a systems approach to understanding biodiversity that moves significantly beyond taxonomy and species observations. Such an approach needs to look at the whole system to address species interactions, both with their environment and with other species. It is clear that some barriers to progress are sociological, essentially persuading people to use the technological solutions that are already available. This is best addressed by developing more effective systems that deliver immediate benefit to the user, hiding the majority of the technology behind simple user interfaces. An infrastructure should be a space in which activities take place and, as such, should be effectively invisible. This community consultation paper positions the role of biodiversity informatics for the next decade, presenting the actions needed to link the various biodiversity infrastructures invisibly and to facilitate understanding that can support both business and policy-makers. The community considers the goal in biodiversity informatics to be full integration of the biodiversity research community, including citizen science, through a commonly shared, sustainable e-infrastructure across all sub-disciplines that reliably serves science and society alike.
Subjects
Biodiversity, Computational Biology/instrumentation, Computational Biology/methods, Animals, Ecosystem, Humans, Information Dissemination
ABSTRACT
BACKGROUND: More and more herbaria are digitising their collections. Images of specimens are made available online to facilitate access to them and allow extraction of information from them. Transcription of the data written on specimens is critical for general discoverability and enables incorporation into large aggregated research datasets. Different methods, such as crowdsourcing and artificial intelligence, are being developed to optimise transcription, but herbarium specimens pose difficulties in data extraction for many reasons. NEW INFORMATION: To provide developers of transcription methods with a means of optimisation, we have compiled a benchmark dataset of 1,800 herbarium specimen images with corresponding transcribed data. These images originate from nine different collections and include specimens that reflect the multiple potential obstacles that transcription methods may encounter, such as differences in language, text format (printed or handwritten), specimen age and nomenclatural type status. We are making these specimens available with a Creative Commons Zero licence waiver and with permanent online storage of the data. By doing this, we are minimising the obstacles to the use of these images for transcription training. This benchmark dataset of images may also be used where a defined and documented set of herbarium specimens is needed, such as for the extraction of morphological traits, handwriting recognition and colour analysis of specimens.
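A benchmark like this is typically consumed by pairing each image with its transcribed ground truth and scoring a candidate transcription method against it. The sketch below assumes a hypothetical CSV layout (image_file, transcription) and a crude character-level similarity score; neither reflects the published dataset's actual structure.

```python
# A minimal sketch of pairing benchmark images with their transcriptions to
# score a transcription method. The CSV layout (image_file, transcription) is
# a hypothetical stand-in for whatever structure the published dataset uses.
import csv
from difflib import SequenceMatcher

def load_benchmark(csv_path: str) -> list[tuple[str, str]]:
    """Read (image file, ground-truth transcription) pairs from a CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        return [(row["image_file"], row["transcription"]) for row in csv.DictReader(fh)]

def similarity(predicted: str, reference: str) -> float:
    """Crude character-level agreement between prediction and ground truth."""
    return SequenceMatcher(None, predicted, reference).ratio()

def evaluate(csv_path: str, transcribe) -> float:
    """Average similarity of transcribe(image_file) against the benchmark."""
    pairs = load_benchmark(csv_path)
    return sum(similarity(transcribe(img), ref) for img, ref in pairs) / len(pairs)
```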
ABSTRACT
Much biodiversity data is collected worldwide, but it remains challenging to assemble the scattered knowledge for assessing biodiversity status and trends. The concept of Essential Biodiversity Variables (EBVs) was introduced to structure biodiversity monitoring globally, and to harmonize and standardize biodiversity data from disparate sources to capture a minimum set of critical variables required to study, report and manage biodiversity change. Here, we assess the challenges of a 'Big Data' approach to building global EBV data products across taxa and spatiotemporal scales, focusing on species distribution and abundance. The majority of currently available data on species distributions derives from incidentally reported observations or from surveys where presence-only or presence-absence data are sampled repeatedly with standardized protocols. Most abundance data come from opportunistic population counts or from population time series using standardized protocols (e.g. repeated surveys of the same population from single or multiple sites). Enormous complexity exists in integrating these heterogeneous, multi-source data sets across space, time, taxa and different sampling methods. Integration of such data into global EBV data products requires correcting biases introduced by imperfect detection and varying sampling effort, dealing with different spatial resolution and extents, harmonizing measurement units from different data sources or sampling methods, applying statistical tools and models for spatial inter- or extrapolation, and quantifying sources of uncertainty and errors in data and models. To support the development of EBVs by the Group on Earth Observations Biodiversity Observation Network (GEO BON), we identify 11 key workflow steps that will operationalize the process of building EBV data products within and across research infrastructures worldwide. These workflow steps take multiple sequential activities into account, including identification and aggregation of various raw data sources, data quality control, taxonomic name matching and statistical modelling of integrated data. We illustrate these steps with concrete examples from existing citizen science and professional monitoring projects, including eBird, the Tropical Ecology Assessment and Monitoring network, the Living Planet Index and the Baltic Sea zooplankton monitoring. The identified workflow steps are applicable to both terrestrial and aquatic systems and a broad range of spatial, temporal and taxonomic scales. They depend on clear, findable and accessible metadata, and we provide an overview of current data and metadata standards. Several challenges remain to be solved for building global EBV data products: (i) developing tools and models for combining heterogeneous, multi-source data sets and filling data gaps in geographic, temporal and taxonomic coverage, (ii) integrating emerging methods and technologies for data collection such as citizen science, sensor networks, DNA-based techniques and satellite remote sensing, (iii) solving major technical issues related to data product structure, data storage, execution of workflows and the production process/cycle as well as approaching technical interoperability among research infrastructures, (iv) allowing semantic interoperability by developing and adopting standards and tools for capturing consistent data and metadata, and (v) ensuring legal interoperability by endorsing open data or data that are free from restrictions on use, modification and sharing. 
Addressing these challenges is critical for biodiversity research and for assessing progress towards conservation policy targets and sustainable development goals.
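Two of the workflow steps named above, data quality control and taxonomic name matching, can be illustrated with a minimal sketch. The toy checklist, record fields and filtering rules are assumptions for illustration only.

```python
# A minimal sketch of two of the workflow steps named above: basic quality
# control of occurrence records and taxonomic name matching against a
# reference checklist. The checklist and records are illustrative assumptions.
SYNONYMS = {                       # accepted name per known name (toy checklist)
    "Larus ridibundus": "Chroicocephalus ridibundus",
    "Chroicocephalus ridibundus": "Chroicocephalus ridibundus",
}

def quality_control(record: dict) -> bool:
    """Keep only records with plausible coordinates and a collection year."""
    ok_lat = -90 <= record.get("lat", 999) <= 90
    ok_lon = -180 <= record.get("lon", 999) <= 180
    return ok_lat and ok_lon and record.get("year") is not None

def harmonize(records: list[dict]) -> list[dict]:
    """Drop bad records and rewrite names to the accepted checklist name."""
    clean = []
    for rec in filter(quality_control, records):
        accepted = SYNONYMS.get(rec["name"])
        if accepted:                       # unmatched names would be flagged for review
            clean.append({**rec, "name": accepted})
    return clean

# harmonize([{"name": "Larus ridibundus", "lat": 59.3, "lon": 18.1, "year": 2015}])
```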
Assuntos
Distribuição Animal/fisiologia , Biodiversidade , Monitoramento Ambiental/métodos , Animais , Modelos BiológicosRESUMO
Marine biological invasions have increased with the development of global trading, causing the homogenization of communities and the decline of biodiversity. A main vector is ballast water exchange from shipping. This study evaluates the use of ecological niche modelling (ENM) to predict the spread of 18 non-indigenous species (NIS) along shipping routes and their potential habitat suitability (hot/cold spots) in the Baltic Sea and Northeast Atlantic. Results show that, contrary to current risk assessment methods, temperature and sea ice concentration determine habitat suitability for 61% of species, rather than salinity (11%). We show high habitat suitability for NIS in the Skagerrak and Kattegat, a transitional area for NIS entering or leaving the Baltic Sea. As many cases of NIS introduction in the marine environment are associated with shipping pathways, we explore how ENM can be used to provide valuable information on the potential spread of NIS for ballast water risk assessment.
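As a rough illustration of the modelling approach, the sketch below fits a niche model relating presence/background points to temperature, sea-ice concentration and salinity and predicts habitat suitability for new cells. Logistic regression and the simulated data stand in for the study's actual ENM method and environmental layers.

```python
# A minimal sketch of an ecological niche model: relate presence/background
# points to temperature, sea-ice concentration and salinity, then predict
# habitat suitability for new grid cells (e.g. cells along a shipping route).
# Logistic regression stands in for the study's ENM method; data are made up.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Columns: sea-surface temperature (deg C), sea-ice concentration (0-1), salinity (PSU)
env = rng.uniform([0, 0.0, 5], [20, 1.0, 35], size=(500, 3))
presence = (env[:, 0] > 8) & (env[:, 1] < 0.2)        # toy "true" niche

model = LogisticRegression().fit(env, presence)

grid = np.array([[12.0, 0.05, 20.0],                   # warm, ice-free, brackish
                 [ 2.0, 0.60, 34.0]])                  # cold, icy, fully marine
suitability = model.predict_proba(grid)[:, 1]           # probability of suitable habitat
```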
Assuntos
Organismos Aquáticos/crescimento & desenvolvimento , Espécies Introduzidas , Modelos Biológicos , Navios , Distribuição Animal , Animais , Biodiversidade , Ecossistema , Mar do Norte , Medição de Risco , SalinidadeRESUMO
OBJECTIVES: To examine the evidence base for telemonitoring designed for patients who have chronic obstructive pulmonary disease and heart failure, and to assess whether telemonitoring fulfils the principles of monitoring and is ready for implementation in routine settings. DESIGN: Qualitative data collection using interviews and participation in a multi-path mapping process. PARTICIPANTS: Twenty-six purposively selected informants completed semi-structured interviews, and 24 individuals with expertise in the relevant clinical and informatics domains, drawn from academia, industry, policy and provider organizations, participated in a multi-path mapping workshop. RESULTS: The evidence base for the effectiveness of telemonitoring is weak and inconsistent, with insufficient cost-effectiveness studies. When considered against an accepted definition of monitoring, telemonitoring is found wanting. Telemonitoring has not so far been able to ensure that the technologies fit into the life world of the patient and into the clinical and organizational milieu of health service delivery systems. CONCLUSIONS: To develop effective telemonitoring for patients with chronic disease, more attention needs to be given to agreeing on the central aim of early detection and, to ensure potential implementation, to engaging a wide range of stakeholders in the design process, especially patients and clinicians.
Assuntos
Progressão da Doença , Insuficiência Cardíaca/fisiopatologia , Doença Pulmonar Obstrutiva Crônica/fisiopatologia , Telemetria/métodos , Doença Crônica , Difusão de Inovações , Pessoal de Saúde/psicologia , Humanos , Pesquisa Qualitativa , Consulta Remota/métodos , Reino UnidoRESUMO
Patients with chronic disease may suffer frequent acute deteriorations and an associated increased risk of hospitalisation. Earlier detection of these deteriorations could enable successful intervention, improving patients' well-being and reducing costs; however, current telemonitoring systems do not achieve this effectively. We conducted a qualitative study using stakeholder interviews to define current standards of care and user requirements for improved early-detection telemonitoring. We determined that early detection is not a concept that has informed technology or service design and that telemonitoring is driven by the available technology rather than by users' needs. We have described a set of requirements questions to inform the design and implementation of telemonitoring systems and suggested the research needed to develop successful early-detection telemonitoring. User-centred design and genuine interdisciplinary approaches are needed to create solutions that are fit for purpose, sustainable, and address the real needs of patients, clinicians and healthcare organisations.
Assuntos
Doença Crônica , Progressão da Doença , Diagnóstico Precoce , Monitorização Ambulatorial/métodos , Telemedicina/métodos , Doença Crônica/psicologia , Pessoal de Saúde , Humanos , Comunicação Interdisciplinar , Entrevistas como Assunto , Aceitação pelo Paciente de Cuidados de Saúde , Assistência Centrada no Paciente , Relações Médico-Paciente , Consulta Remota/métodos , Consulta Remota/normas , Telemedicina/normas , Reino UnidoRESUMO
OBJECTIVE: To propose a research agenda that addresses technological and other knowledge gaps in developing telemonitoring solutions for patients with chronic diseases, with particular focus on detecting deterioration early enough to intervene effectively. DESIGN: A mixed-methods approach incorporating a literature review, key-informant and focus-group interviews to gain an in-depth, multidisciplinary understanding of current approaches, and a roadmapping process to synthesise a research agenda. RESULTS: Counter to intuition, the research agenda for early detection of deterioration in patients with chronic diseases is less about advances in sensor technology and much more about the problems of clinical specification, translation, and interfacing. The ultimate aim of telemonitoring is not fully agreed among the actors (patients, clinicians, technologists, and service providers). This leads to unresolved issues such as: (1) How are sensors used by patients as part of daily routines? (2) What are the indicators of early deterioration and how might they be used to trigger alerts? (3) How should alerts lead to appropriate levels of response across different agencies and sectors? CONCLUSION: Attempts to use telemonitoring to improve the care of patients with chronic diseases over the last two decades have so far failed to produce systems that are embedded in routine clinical practice. Attempts at implementation have paid insufficient attention to understanding patient and clinical needs and the complex dynamics and accountabilities that arise at the level of service models. A suggested way forward is to co-design technology and services collaboratively with all stakeholders.
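Question (2) above asks which indicators of early deterioration might trigger alerts. One candidate form such an indicator could take is a patient-specific moving baseline with a deviation threshold, sketched below; the signal, window length and threshold are illustrative assumptions, not a validated clinical rule.

```python
# A minimal sketch of one possible alerting rule: compare the latest reading
# against a patient-specific moving baseline. The signal, window length and
# threshold are illustrative assumptions, not a validated clinical rule.
from collections import deque
from statistics import mean, stdev

class BaselineAlert:
    def __init__(self, window: int = 14, threshold_sd: float = 2.0):
        self.history = deque(maxlen=window)    # recent readings define "normal"
        self.threshold_sd = threshold_sd

    def update(self, reading: float) -> bool:
        """Return True if the new reading deviates markedly from the baseline."""
        alert = False
        if len(self.history) >= 5:             # need a few readings before alerting
            baseline, spread = mean(self.history), stdev(self.history)
            alert = abs(reading - baseline) > self.threshold_sd * max(spread, 1e-6)
        self.history.append(reading)
        return alert

# Example with daily weight (kg) for a heart-failure patient:
# monitor = BaselineAlert()
# alerts = [monitor.update(w) for w in [81.0, 80.8, 81.1, 80.9, 81.0, 84.2]]
```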
Assuntos
Doença Crônica , Pesquisa sobre Serviços de Saúde , Monitorização Fisiológica , Telemedicina , Comportamento Cooperativo , Sistemas de Apoio a Decisões Clínicas , Medicina Baseada em Evidências , Humanos , Projetos de PesquisaRESUMO
The research aim underpinning the Healthcare@Home (HH) information system described here was to enable 'near real time' risk analysis for disease early detection and prevention. To this end, we are implementing a family of prototype web services to 'push' or 'pull' individuals' health-related data via a system of clinical hubs, mobile communication devices and/or dedicated home-based network computers. We are examining more efficient methods for ethical use of such data in timeline-based (i.e. 'longitudinal') data analysis systems. A consistent data collation infrastructure is being created for use along the 'patient path', accessible wherever patients happen to be. This 'patient-centred' infrastructure can be applied in the evaluation of disease progression risk (in the light of clinical understanding of disease processes). In this paper we describe the requirements for making multi-data trend management 'scale up', together with some requirements of an 'end-to-end' functioning data collection system. A Service-Oriented Architecture (SOA) approach is used to maximise benefits from (1) clinical evidence and (2) computational models of disease progression that can be made available elsewhere on the SOA. We discuss the implications of this so-called 'closed loop' approach for improving healthcare intervention outcomes, patient safety, decision support, objective measurement of service quality, and for providing inputs for quantitative healthcare (predictive) modelling.
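The push/pull service pattern described here can be sketched as a small Web service that accepts timestamped measurements and returns a simple longitudinal trend per patient. The endpoints, fields and trend rule are illustrative assumptions, not the Healthcare@Home implementation; Flask is used only for brevity.

```python
# A minimal sketch of the "push" and "pull" sides of such a service-oriented
# design: a Web service accepts a timestamped measurement and returns a simple
# longitudinal trend for that patient. Endpoints, fields and the trend rule
# are illustrative assumptions, not the Healthcare@Home implementation.
from collections import defaultdict
from flask import Flask, request, jsonify

app = Flask(__name__)
timelines = defaultdict(list)                    # patient_id -> [(timestamp, value)]

@app.route("/measurements", methods=["POST"])
def push_measurement():
    m = request.get_json()
    timelines[m["patient_id"]].append((m["timestamp"], float(m["value"])))
    return jsonify({"stored": len(timelines[m["patient_id"]])}), 201

@app.route("/risk/<patient_id>", methods=["GET"])
def pull_risk(patient_id):
    values = [v for _, v in sorted(timelines[patient_id])]
    # Toy trend rule: flag a sustained rise over the last three readings.
    rising = len(values) >= 3 and values[-1] > values[-2] > values[-3]
    return jsonify({"readings": len(values), "rising_trend": rising})

# For a quick local test: app.run(debug=True)
```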