Search | BVS Regional Portal

The benefits and struggles of FAIR data: the case of reusing plant phenotyping data.

Papoutsoglou, Evangelia A; Athanasiadis, Ioannis N; Visser, Richard G F; Finkers, Richard.

Sci Data ; 10(1): 457, 2023 07 13.

Article in English | MEDLINE | ID: mdl-37443110

ABSTRACT

Plant phenotyping experiments are conducted under a variety of experimental parameters and settings for diverse purposes. The data they produce is heterogeneous, complicated, often poorly documented and, as a result, difficult to reuse. Meeting societal needs (nutrition, crop adaptation and stability) requires more efficient methods toward data integration and reuse. In this work, we examine what "making data FAIR" entails, and investigate the benefits and the struggles not only of reusing FAIR data, but also making data FAIR using genotype by environment and QTL by environment interactions for developmental traits in potato as a case study. We assume the role of a scientist discovering a phenotypic dataset on a FAIR data point, verifying the existence of related datasets with environmental data, acquiring both and integrating them. We report and discuss the challenges and the potential for reusability and reproducibility of FAIRifying existing datasets, using metadata standards such as MIAPPE, that were encountered in this process.

Subject(s)

Plant Breeding , Plants , Genotype , Phenotype , Plants/genetics , Reproducibility of Results , Datasets as Topic

Extracting knowledge networks from plant scientific literature: potato tuber flesh color as an exemplary trait.

Singh, Gurnoor; Papoutsoglou, Evangelia A; Keijts-Lalleman, Frederique; Vencheva, Bilyana; Rice, Mark; Visser, Richard G F; Bachem, Christian W B; Finkers, Richard.

BMC Plant Biol ; 21(1): 198, 2021 Apr 24.

Article in English | MEDLINE | ID: mdl-33894758

ABSTRACT

BACKGROUND: Scientific literature carries a wealth of information crucial for research, but only a fraction of it is present as structured information in databases and therefore can be analyzed using traditional data analysis tools. Natural language processing (NLP) is often and successfully employed to support humans by distilling relevant information from large corpora of free text and structuring it in a way that lends itself to further computational analyses. For this pilot, we developed a pipeline that uses NLP on biological literature to produce knowledge networks. We focused on the flesh color of potato, a well-studied trait with known associations, and we investigated whether these knowledge networks can assist us in formulating new hypotheses on the underlying biological processes. RESULTS: We trained an NLP model based on a manually annotated corpus of 34 full-text potato articles, to recognize relevant biological entities and relationships between them in text (genes, proteins, metabolites and traits). This model detected the number of biological entities with a precision of 97.65% and a recall of 88.91% on the training set. We conducted a time series analysis on 4023 PubMed abstracts of plant genetics-based articles which focus on 4 major Solanaceous crops (tomato, potato, eggplant and capsicum), to determine that the networks contained both previously known and contemporaneously unknown leads to subsequently discovered biological phenomena relating to flesh color. A novel time-based analysis of these networks indicates a connection between our trait and a candidate gene (zeaxanthin epoxidase) already two years prior to explicit statements of that connection in the literature. CONCLUSIONS: Our time-based analysis indicates that network-assisted hypothesis generation shows promise for knowledge discovery, data integration and hypothesis generation in scientific research.

Subject(s)

Data Mining , Natural Language Processing , Plant Tubers/physiology , Solanum tuberosum/physiology , Color , Pigments, Biological

Enabling reusability of plant phenomic datasets with MIAPPE 1.1.

Papoutsoglou, Evangelia A; Faria, Daniel; Arend, Daniel; Arnaud, Elizabeth; Athanasiadis, Ioannis N; Chaves, Inês; Coppens, Frederik; Cornut, Guillaume; Costa, Bruno V; Cwiek-Kupczynska, Hanna; Droesbeke, Bert; Finkers, Richard; Gruden, Kristina; Junker, Astrid; King, Graham J; Krajewski, Pawel; Lange, Matthias; Laporte, Marie-Angélique; Michotey, Célia; Oppermann, Markus; Ostler, Richard; Poorter, Hendrik; Ramírez-González, Ricardo; Ramsak, Ziva; Reif, Jochen C; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Scholz, Uwe; Tardieu, François; Uauy, Cristobal; Usadel, Björn; Visser, Richard G F; Weise, Stephan; Kersey, Paul J; Miguel, Célia M; Adam-Blondon, Anne-Françoise; Pommier, Cyril.

New Phytol ; 227(1): 260-273, 2020 07.

Article in English | MEDLINE | ID: mdl-32171029

ABSTRACT

Enabling data reuse and knowledge discovery is increasingly critical in modern science, and requires an effort towards standardising data publication practices. This is particularly challenging in the plant phenotyping domain, due to its complexity and heterogeneity. We have produced the MIAPPE 1.1 release, which enhances the existing MIAPPE standard in coverage, to support perennial plants, in structure, through an explicit data model, and in clarity, through definitions and examples. We evaluated MIAPPE 1.1 by using it to express several heterogeneous phenotyping experiments in a range of different formats, to demonstrate its applicability and the interoperability between the various implementations. Furthermore, the extended coverage is demonstrated by the fact that one of the datasets could not have been described under MIAPPE 1.0. MIAPPE 1.1 marks a major step towards enabling plant phenotyping data reusability, thanks to its extended coverage, and especially the formalisation of its data model, which facilitates its implementation in different formats. Community feedback has been critical to this development, and will be a key part of ensuring adoption of the standard.

Subject(s)

Phenomics , Plants , Plants/genetics

BrAPI-an application programming interface for plant breeding applications.

Selby, Peter; Abbeloos, Rafael; Backlund, Jan Erik; Basterrechea Salido, Martin; Bauchet, Guillaume; Benites-Alfaro, Omar E; Birkett, Clay; Calaminos, Viana C; Carceller, Pierre; Cornut, Guillaume; Vasques Costa, Bruno; Edwards, Jeremy D; Finkers, Richard; Yanxin Gao, Star; Ghaffar, Mehmood; Glaser, Philip; Guignon, Valentin; Hok, Puthick; Kilian, Andrzej; König, Patrick; Lagare, Jack Elendil B; Lange, Matthias; Laporte, Marie-Angélique; Larmande, Pierre; LeBauer, David S; Lyon, David A; Marshall, David S; Matthews, Dave; Milne, Iain; Mistry, Naymesh; Morales, Nicolas; Mueller, Lukas A; Neveu, Pascal; Papoutsoglou, Evangelia; Pearce, Brian; Perez-Masias, Ivan; Pommier, Cyril; Ramírez-González, Ricardo H; Rathore, Abhishek; Raquel, Angel Manica; Raubach, Sebastian; Rife, Trevor; Robbins, Kelly; Rouard, Mathieu; Sarma, Chaitanya; Scholz, Uwe; Sempéré, Guilhem; Shaw, Paul D; Simon, Reinhard; Soldevilla, Nahuel.

Bioinformatics ; 35(20): 4147-4155, 2019 10 15.

Article in English | MEDLINE | ID: mdl-30903186

ABSTRACT

MOTIVATION: Modern genomic breeding methods rely heavily on very large amounts of phenotyping and genotyping data, presenting new challenges in effective data management and integration. Recently, the size and complexity of datasets have increased significantly, with the result that data are often stored on multiple systems. As analyses of interest increasingly require aggregation of datasets from diverse sources, data exchange between disparate systems becomes a challenge. RESULTS: To facilitate interoperability among breeding applications, we present the public plant Breeding Application Programming Interface (BrAPI). BrAPI is a standardized web service API specification. The development of BrAPI is a collaborative, community-based initiative involving a growing global community of over a hundred participants representing several dozen institutions and companies. Development of such a standard is recognized as critical to a number of important large breeding system initiatives as a foundational technology. The focus of the first version of the API is on providing services for connecting systems and retrieving basic breeding data including germplasm, study, observation, and marker data. A number of BrAPI-enabled applications, termed BrAPPs, have been written, that take advantage of the emerging support of BrAPI by many databases. AVAILABILITY AND IMPLEMENTATION: More information on BrAPI, including links to the specification, test suites, BrAPPs, and sample implementations is available at https://brapi.org/. The BrAPI specification and the developer tools are provided as free and open source.

Subject(s)

Plant Breeding , Software , User-Computer Interface , Genomics
