A corpus of CO₂ electrocatalytic reduction process extracted from the scientific literature.

Wang, Ludi; Gao, Yang; Chen, Xueqing; Cui, Wenjuan; Zhou, Yuanchun; Luo, Xinying; Xu, Shuaishuai; Du, Yi; Wang, Bin

Affiliation

Wang L; Laboratory of Big Data Knowledge, Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China.
Gao Y; CAS Key Laboratory of Nanosystem and Hierarchical Fabrication, National Center for Nanoscience and Technology (NCNST), Beijing, 100190, China.
Chen X; Laboratory of Big Data Knowledge, Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China.
Cui W; University of Chinese Academy of Sciences, Beijing, 100049, China.
Zhou Y; Laboratory of Big Data Knowledge, Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China.
Luo X; University of Chinese Academy of Sciences, Beijing, 100049, China.
Xu S; Laboratory of Big Data Knowledge, Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China.
Du Y; University of Chinese Academy of Sciences, Beijing, 100049, China.
Wang B; CAS Key Laboratory of Nanosystem and Hierarchical Fabrication, National Center for Nanoscience and Technology (NCNST), Beijing, 100190, China.

Sci Data; 10(1): 175, 2023 Mar 29.

Article in English | MEDLINE | ID: mdl-36991006

ABSTRACT

The electrocatalytic CO₂ reduction process has gained enormous attention for both environmental protection and chemicals production. In this context, the design of new electrocatalysts with high activity and selectivity can draw inspiration from the abundant scientific literature. An annotated and verified corpus built from this large body of literature can assist the development of natural language processing (NLP) models, which can offer insight into the underlying mechanisms. To facilitate data mining in this direction, this article presents a benchmark corpus of 6,086 records manually extracted from 835 electrocatalytic publications, along with an extended corpus of 145,179 records. The corpus provides nine types of knowledge, obtained by either annotation or extraction: material, regulation method, product, faradaic efficiency, cell setup, electrolyte, synthesis method, current density, and voltage. Machine learning algorithms can be applied to the corpus to help scientists find new and effective electrocatalysts. Furthermore, researchers familiar with NLP can use this corpus to design domain-specific named entity recognition (NER) models.
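
As a rough illustration of how the nine knowledge types might be held in memory, the sketch below defines one record type and a simple filter. The class name `CorpusRecord`, the helper `filter_by_product`, the field names, the field types, and the unit of faradaic efficiency are all assumptions made for this example; the abstract does not describe the published file format, and the sample records are invented placeholders rather than corpus data.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CorpusRecord:
    """Hypothetical record holding the nine knowledge types named in the abstract.

    Field names and types are assumptions, not the published schema.
    """
    material: Optional[str] = None
    regulation_method: Optional[str] = None
    product: Optional[str] = None
    faradaic_efficiency: Optional[float] = None  # assumed to be a percentage
    cell_setup: Optional[str] = None
    electrolyte: Optional[str] = None
    synthesis_method: Optional[str] = None
    current_density: Optional[str] = None  # kept as text; units may vary
    voltage: Optional[str] = None          # kept as text; reference may vary


def filter_by_product(records: List[CorpusRecord], product: str,
                      min_fe: float) -> List[CorpusRecord]:
    """Keep records for `product` with faradaic efficiency of at least `min_fe`.

    Records without a faradaic efficiency value are skipped.
    """
    return [
        r for r in records
        if r.product == product
        and r.faradaic_efficiency is not None
        and r.faradaic_efficiency >= min_fe
    ]


# Placeholder records, invented for illustration only (not taken from the corpus).
records = [
    CorpusRecord(material="catalyst A", product="CO", faradaic_efficiency=90.0),
    CorpusRecord(material="catalyst B", product="CO", faradaic_efficiency=40.0),
    CorpusRecord(material="catalyst C", product="formate", faradaic_efficiency=95.0),
    CorpusRecord(material="catalyst D", product="CO"),  # efficiency missing
]

selected = filter_by_product(records, "CO", 80.0)
print([r.material for r in selected])
```

Every field is optional in this sketch because a literature-derived record may not report all nine types; whether the actual corpus allows missing values is not stated in the abstract.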

Full text: 1 | Collections: 01-international | Database: MEDLINE | Language: English | Journal: Sci Data | Publication year: 2023 | Document type: Article | Country of affiliation: China