Results 1 - 20 of 7,049
1.
Nature ; 612(7939): 323-327, 2022 12.
Article in English | MEDLINE | ID: mdl-36450984

ABSTRACT

Newly generated excitatory synapses in the mammalian cortex lack sufficient AMPA-type glutamate receptors to mediate neurotransmission, resulting in functionally silent synapses that require activity-dependent plasticity to mature. Silent synapses are abundant in early development, during which they mediate circuit formation and refinement, but they are thought to be scarce in adulthood. However, adults retain a capacity for neural plasticity and flexible learning that suggests that the formation of new connections is still prevalent. Here we used super-resolution protein imaging to visualize synaptic proteins at 2,234 synapses from layer 5 pyramidal neurons in the primary visual cortex of adult mice. Unexpectedly, about 25% of these synapses lack AMPA receptors. These putative silent synapses were located at the tips of thin dendritic protrusions, known as filopodia, which were more abundant by an order of magnitude than previously believed (comprising about 30% of all dendritic protrusions). Physiological experiments revealed that filopodia do indeed lack AMPA-receptor-mediated transmission, but they exhibit NMDA-receptor-mediated synaptic transmission. We further showed that functionally silent synapses on filopodia can be unsilenced through Hebbian plasticity, recruiting new active connections into a neuron's input matrix. These results challenge the model that functional connectivity is largely fixed in the adult cortex and demonstrate a new mechanism for flexible control of synaptic wiring that expands the learning capabilities of the mature brain.


Subjects
Mammals, Records, Animals, Mice
2.
Proc Natl Acad Sci U S A ; 120(25): e2219564120, 2023 06 20.
Article in English | MEDLINE | ID: mdl-37307470

ABSTRACT

The daily activities of ≈8 billion people occupy exactly 24 h per day, placing a strict physical limit on what changes can be achieved in the world. These activities form the basis of human behavior, and because of the global integration of societies and economies, many of these activities interact across national borders. Yet, there is no comprehensive overview of how the finite resource of time is allocated at the global scale. Here, we estimate how all humans spend their time using a generalized, physical outcome-based categorization that facilitates the integration of data from hundreds of diverse datasets. Our compilation shows that most waking hours are spent on activities intended to achieve direct outcomes for human minds and bodies (9.4 h/d), while 3.4 h/d are spent modifying our inhabited environments and the world beyond. The remaining 2.1 h/d are devoted to organizing social processes and transportation. We distinguish activities that vary strongly with GDP per capita, including the time allocated to food provision and infrastructure, vs. those that do not vary consistently, such as meals and transportation time. Globally, the time spent directly extracting materials and energy from the Earth system is small, on the order of 5 min per average human day, while the time directly dealing with waste is on the order of 1 min per day, suggesting a large potential scope to modify the allocation of time to these activities. Our results provide a baseline quantification of the temporal composition of global human life that can be expanded and applied to multiple fields of research.


Subjects
Planet Earth, Head, Humans, Meals, Records, Transportation
3.
Proc Natl Acad Sci U S A ; 120(24): e2207029120, 2023 06 13.
Article in English | MEDLINE | ID: mdl-37279275

ABSTRACT

The question of how cooperation evolves and is maintained among nonkin is central to the biological, social, and behavioral sciences. Previous research has focused on explaining how cooperation in social dilemmas can be maintained by direct and indirect reciprocity among the participants of the social dilemma. However, in complex human societies, both modern and ancient, cooperation is frequently maintained by means of specialized third-party enforcement. We provide an evolutionary-game-theoretic model that explains how specialized third-party enforcement of cooperation (specialized reciprocity) can emerge. A population consists of producers and enforcers. First, producers engage in a joint undertaking represented by a prisoner's dilemma. They are paired randomly and receive no information about their partner's history, which precludes direct and indirect reciprocity. Then, enforcers tax producers and may punish their clients. Finally, the enforcers are randomly paired and may try to grab resources from each other. In order to sustain producer cooperation, enforcers must punish defecting producers, but punishing is costly to enforcers. We show that the threat of potential intraenforcer conflict can incentivize enforcers to engage in costly punishment of producers, provided they are sufficiently informed to maintain a reputation system. That is, the "guards" are guarded by the guards themselves. We demonstrate the key mechanisms analytically and corroborate our results with numerical simulations.


Subjects
Cooperative Behavior, Psychological Models, Humans, Punishment, Biological Evolution, Records, Game Theory
4.
Pharmacol Rev ; 75(4): 714-738, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36931724

ABSTRACT

Natural language processing (NLP) is an area of artificial intelligence that applies information technologies to process the human language, understand it to a certain degree, and use it in various applications. This area has rapidly developed in the past few years and now employs modern variants of deep neural networks to extract relevant patterns from large text corpora. The main objective of this work is to survey the recent use of NLP in the field of pharmacology. As our work shows, NLP is a highly relevant information extraction and processing approach for pharmacology. It has been used extensively, from intelligent searches through thousands of medical documents to finding traces of adversarial drug interactions in social media. We split our coverage into five categories to survey modern NLP: methodology, commonly addressed tasks, relevant textual data, knowledge bases, and useful programming libraries. We split each of the five categories into appropriate subcategories, describe their main properties and ideas, and summarize them in a tabular form. The resulting survey presents a comprehensive overview of the area, useful to practitioners and interested observers. SIGNIFICANCE STATEMENT: The main objective of this work is to survey the recent use of NLP in the field of pharmacology in order to provide a comprehensive overview of the current state in the area after the rapid developments that occurred in the past few years. The resulting survey will be useful to practitioners and interested observers in the domain.


Subjects
Artificial Intelligence, Natural Language Processing, Humans, Information Storage and Retrieval, Electronic Health Records, Records
5.
Proc Natl Acad Sci U S A ; 119(52): e2213633119, 2022 12 27.
Article in English | MEDLINE | ID: mdl-36538478

ABSTRACT

Understanding the nature and formation of band gaps associated with the propagation of electromagnetic, electronic, or elastic waves in disordered materials as a function of system size presents fundamental and technological challenges. In particular, a basic question is whether band gaps in disordered systems exist in the thermodynamic limit. To explore this issue, we use a two-stage ensemble approach to study the formation of complete photonic band gaps (PBGs) for a sequence of increasingly large systems spanning a broad range of two-dimensional photonic network solids with varying degrees of local and global order, including hyperuniform and nonhyperuniform types. We discover that the gap in the density of states exhibits exponential tails and the apparent PBGs rapidly close as the system size increases for nearly all disordered networks considered. The only exceptions are sufficiently stealthy hyperuniform cases for which the band gaps remain open and the band tails exhibit a desirable power-law scaling reminiscent of the PBG behavior of photonic crystals in the thermodynamic limit.


Subjects
Electronics, Memory, Photons, Records, Thermodynamics
6.
Natl Vital Stat Rep ; 71(7): 1-20, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36301230

ABSTRACT

Objectives: This report presents data on fetal cause of death by maternal age, maternal race and Hispanic origin, fetal sex, period of gestation, birthweight, and plurality.


Subjects
Fetal Death, Hispanic or Latino, Pregnancy, Female, Humans, United States/epidemiology, Fetal Death/etiology, Maternal Age, Birth Weight, Records
7.
PLoS Comput Biol ; 19(9): e1011444, 2023 09.
Article in English | MEDLINE | ID: mdl-37695793

ABSTRACT

Different genes form complex networks within cells to carry out critical cellular functions, while network alterations in this process can potentially introduce downstream transcriptome perturbations and phenotypic variations. Therefore, developing efficient and interpretable methods to quantify network changes and pinpoint driver genes across conditions is crucial. We propose a hierarchical graph representation learning method, called iHerd. Given a set of networks, iHerd first hierarchically generates a series of coarsened sub-graphs in a data-driven manner, representing network modules at different resolutions (e.g., the level of signaling pathways). Then, it sequentially learns low-dimensional node representations at all hierarchical levels via efficient graph embedding. Lastly, iHerd projects separate gene embeddings onto the same latent space in its graph alignment module to calculate a rewiring index for driver gene prioritization. To demonstrate its effectiveness, we applied iHerd on a tumor-to-normal GRN rewiring analysis and cell-type-specific GCN analysis using single-cell multiome data of the brain. We showed that iHerd can effectively pinpoint novel and well-known risk genes in different diseases. Distinct from existing models, iHerd's graph coarsening for hierarchical learning allows us to successfully classify network driver genes into early and late divergent genes (EDGs and LDGs), emphasizing genes with extensive network changes across and within signaling pathway levels. This unique approach for driver gene classification can provide us with deeper molecular insights. The code is freely available at https://github.com/aicb-ZhangLabs/iHerd. All other relevant data are within the manuscript and supporting information files.


Subjects
Deep Learning, Brain, Learning, Records
8.
PLoS Comput Biol ; 19(3): e1010879, 2023 03.
Article in English | MEDLINE | ID: mdl-36893146

ABSTRACT

Clinical trial data-sharing is seen as an imperative for research integrity and is becoming increasingly encouraged or even required by funders, journals, and other stakeholders. However, early experiences with data-sharing have been disappointing because they are not always conducted properly. Health data is indeed sensitive and not always easy to share in a responsible way. We propose 10 rules for researchers wishing to share their data. These rules cover the majority of elements to be considered in order to start the commendable process of clinical trial data-sharing:
Rule 1: Abide by local legal and regulatory data protection requirements
Rule 2: Anticipate the possibility of clinical trial data-sharing before obtaining funding
Rule 3: Declare your intent to share data in the registration step
Rule 4: Involve research participants
Rule 5: Determine the method of data access
Rule 6: Remember there are several other elements to share
Rule 7: Do not proceed alone
Rule 8: Deploy optimal data management to ensure that the data shared is useful
Rule 9: Minimize risks
Rule 10: Strive for excellence.


Subjects
Information Dissemination, Records, Humans, Research Personnel
9.
PLoS Comput Biol ; 19(3): e1010944, 2023 03.
Article in English | MEDLINE | ID: mdl-36913405

ABSTRACT

We introduce a self-describing serialized format for bulk biomedical data called the Portable Format for Biomedical (PFB) data. The Portable Format for Biomedical data is based upon Avro and encapsulates a data model, a data dictionary, the data itself, and pointers to third party controlled vocabularies. In general, each data element in the data dictionary is associated with a third party controlled vocabulary to make it easier for applications to harmonize two or more PFB files. We also introduce an open source software development kit (SDK) called PyPFB for creating, exploring and modifying PFB files. We describe experimental studies showing the performance improvements when importing and exporting bulk biomedical data in the PFB format versus using JSON and SQL formats.


Subjects
Software, Controlled Vocabulary, Records
10.
PLoS Comput Biol ; 19(9): e1011477, 2023 09.
Article in English | MEDLINE | ID: mdl-37669275

ABSTRACT

Here, we introduce Trackplot, a Python package for generating publication-quality visualization by a programmable and interactive web-based approach. Compared to the existing versions of programs generating sashimi plots, Trackplot offers a versatile platform for visually interpreting genomic data from a wide variety of sources, including gene annotation with functional domain mapping, isoform expression, isoform structures identified by scRNA-seq and long-read sequencing, as well as chromatin accessibility and architecture without any preprocessing, and also offers a broad degree of flexibility for formats of output files that satisfy the requirements of major journals. The Trackplot package is an open-source software which is freely available on Bioconda (https://anaconda.org/bioconda/trackplot), Docker (https://hub.docker.com/r/ygidtu/trackplot), PyPI (https://pypi.org/project/trackplot/) and GitHub (https://github.com/ygidtu/trackplot), and a built-in web server for local deployment is also provided.


Subjects
Chromatin, Genomics, Molecular Sequence Annotation, Records, Software
11.
PLoS Comput Biol ; 19(9): e1011454, 2023 09.
Article in English | MEDLINE | ID: mdl-37669309

ABSTRACT

Sedimentation velocity analytical ultracentrifugation (SV-AUC) is an indispensable tool for the study of particle size distributions in the biopharmaceutical industry, for example, to characterize protein therapeutics and vaccine products. In particular, the diffusion-deconvoluted sedimentation coefficient distribution analysis, in the software SEDFIT, has found widespread applications due to its relatively high resolution and sensitivity. However, a lack of suitable software compatible with Good Manufacturing Practices (GMP) has hampered the use of SV-AUC in this regulatory environment. To address this, we have created an interface for SEDFIT so that it can serve as an automatically spawned module with controlled data input through command line parameters and output of key results in files. The interface can be integrated in custom GMP compatible software, and in scripts that provide documentation and meta-analyses for replicate or related samples, for example, to streamline analysis of large families of experimental data, such as binding isotherm analyses in the study of protein interactions. To test and demonstrate this approach we provide a MATLAB script, mlSEDFIT.


Subjects
Commerce, Documentation, Diffusion, Records, Software
12.
PLoS Comput Biol ; 19(8): e1011393, 2023 08.
Article in English | MEDLINE | ID: mdl-37643178

ABSTRACT

Forecast evaluation is essential for the development of predictive epidemic models and can inform their use for public health decision-making. Common scores to evaluate epidemiological forecasts are the Continuous Ranked Probability Score (CRPS) and the Weighted Interval Score (WIS), which can be seen as measures of the absolute distance between the forecast distribution and the observation. However, applying these scores directly to predicted and observed incidence counts may not be the most appropriate due to the exponential nature of epidemic processes and the varying magnitudes of observed values across space and time. In this paper, we argue that transforming counts before applying scores such as the CRPS or WIS can effectively mitigate these difficulties and yield epidemiologically meaningful and easily interpretable results. Using the CRPS on log-transformed values as an example, we list three attractive properties: Firstly, it can be interpreted as a probabilistic version of a relative error. Secondly, it reflects how well models predicted the time-varying epidemic growth rate. And lastly, using arguments on variance-stabilizing transformations, it can be shown that under the assumption of a quadratic mean-variance relationship, the logarithmic transformation leads to expected CRPS values which are independent of the order of magnitude of the predicted quantity. Applying a transformation of log(x + 1) to data and forecasts from the European COVID-19 Forecast Hub, we find that it changes model rankings regardless of stratification by forecast date, location or target types. Situations in which models missed the beginning of upward swings are more strongly emphasised while failing to predict a downturn following a peak is less severely penalised when scoring transformed forecasts as opposed to untransformed ones. We conclude that appropriate transformations, of which the natural logarithm is only one particularly attractive option, should be considered when assessing the performance of different models in the context of infectious disease incidence.
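
The effect of scoring on the log(x + 1) scale can be illustrated with a minimal sketch. It assumes a forecast represented by a handful of samples and uses the common sample-based CRPS estimate E|X - y| - 0.5 E|X - X'|; the function names and the numbers are hypothetical and are not taken from the paper or from any forecast hub tooling.

```python
import math
from typing import Sequence


def crps_from_samples(samples: Sequence[float], observed: float) -> float:
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|."""
    n = len(samples)
    term_obs = sum(abs(x - observed) for x in samples) / n
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread


def log_crps(samples: Sequence[float], observed: float) -> float:
    """CRPS after applying log(x + 1) to the forecast samples and the observation."""
    return crps_from_samples([math.log1p(x) for x in samples], math.log1p(observed))


# Hypothetical forecasts that overshoot by the same relative amount
# in a low-incidence and a high-incidence setting.
low_samples, low_obs = [110.0, 120.0, 130.0], 100.0
high_samples, high_obs = [11000.0, 12000.0, 13000.0], 10000.0

natural_low = crps_from_samples(low_samples, low_obs)
natural_high = crps_from_samples(high_samples, high_obs)
logged_low = log_crps(low_samples, low_obs)
logged_high = log_crps(high_samples, high_obs)

print(f"natural scale: {natural_low:.2f} vs {natural_high:.2f}")
print(f"log scale:     {logged_low:.4f} vs {logged_high:.4f}")
```

On the natural scale the high-incidence forecast is penalised about 100 times more than the low-incidence one, whereas on the log scale the two scores are nearly equal, which is the sense in which the transformed score behaves like a relative error.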


Subjects
COVID-19, Epidemics, Humans, COVID-19/epidemiology, Public Health, Probability, Records
13.
BMC Med Res Methodol ; 24(1): 55, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38429658

ABSTRACT

BACKGROUND: Research Electronic Data CAPture (REDCap) is a web application for creating and managing online surveys and databases. Clinical data management is an essential process before performing any statistical analysis to ensure the quality and reliability of study information. Processing REDCap data in R can be complex and often benefits from automation. While there are several R packages available for specific tasks, none offer an expansive approach to data management. RESULTS: REDCapDM is an R package for accessing and managing REDCap data. It imports data from REDCap to R using either an API connection or the files in R format exported directly from REDCap. It has several functions for data processing and transformation, and it helps to generate and manage queries to clarify or resolve discrepancies found in the data. CONCLUSION: The REDCapDM package is a valuable tool for data scientists and clinical data managers who use REDCap and R. It assists in tasks such as importing, processing, and quality-checking data from their research studies.


Subjects
Data Management, Software, Humans, Reproducibility of Results, Surveys and Questionnaires, Records
14.
J Biomed Inform ; 149: 104551, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38000765

ABSTRACT

The development and deployment of machine learning (ML) models for biomedical research and healthcare currently lacks standard methodologies. Although tools for model replication are numerous, without a unifying blueprint it remains difficult to scientifically reproduce predictive ML models for any number of reasons (e.g., assumptions regarding data distributions and preprocessing, unclear test metrics, etc.) and ultimately, questions around generalizability and transportability are not readily answered. To facilitate scientific reproducibility, we built upon the Predictive Model Markup Language (PMML) to capture essential information. As a key component of the PREdictive Model Index and Exchange REpository (PREMIERE) platform, we present the Automated Metadata Pipeline (AMP) for conversion of a given predictive ML model into an extended PMML file that autocompletes an ML-based checklist, assessing model elements for interoperability and reproducibility. We demonstrate this pipeline on multiple test cases with three different ML algorithms and health-related datasets, providing a foundation for future predictive model reproducibility, sharing, and comparison.


Subjects
Biomedical Research, Reproducibility of Results, Algorithms, Records, Metadata
15.
J Biomed Inform ; 150: 104605, 2024 02.
Article in English | MEDLINE | ID: mdl-38331082

ABSTRACT

OBJECTIVE: Physicians and clinicians rely on data contained in electronic health records (EHRs), as recorded by health information technology (HIT), to make informed decisions about their patients. The reliability of HIT systems in this regard is critical to patient safety. Consequently, better tools are needed to monitor the performance of HIT systems for potential hazards that could compromise the collected EHRs, which in turn could affect patient safety. In this paper, we propose a new framework for detecting anomalies in EHRs using sequences of clinical events. This new framework, EHR-Bidirectional Encoder Representations from Transformers (BERT), is motivated by the gaps in the existing deep-learning related methods, including high false negatives, sub-optimal accuracy, higher computational cost, and the risk of information loss. EHR-BERT is an innovative framework rooted in the BERT architecture, meticulously tailored to navigate the hurdles in the contemporary BERT method, thus enhancing anomaly detection in EHRs for healthcare applications. METHODS: The EHR-BERT framework was designed using the Sequential Masked Token Prediction (SMTP) method. This approach treats EHRs as natural language sentences and iteratively masks input tokens during both training and prediction stages. This method facilitates the learning of EHR sequence patterns in both directions for each event and identifies anomalies based on deviations from the normal execution models trained on EHR sequences. RESULTS: Extensive experiments on large EHR datasets across various medical domains demonstrate that EHR-BERT markedly improves upon existing models. It significantly reduces the number of false positives and enhances the detection rate, thus bolstering the reliability of anomaly detection in electronic health records. This improvement is attributed to the model's ability to minimize information loss and maximize data utilization effectively.
CONCLUSION: EHR-BERT showcases immense potential in decreasing medical errors related to anomalous clinical events, positioning itself as an indispensable asset for enhancing patient safety and the overall standard of healthcare services. The framework effectively overcomes the drawbacks of earlier models, making it a promising solution for healthcare professionals to ensure the reliability and quality of health data.


Subjects
Electronic Health Records, Health Information Systems, Humans, Reproducibility of Results, Records, Health Personnel
16.
Proc Natl Acad Sci U S A ; 118(30), 2021 07 27.
Article in English | MEDLINE | ID: mdl-34301899

ABSTRACT

Individuals with depression are prone to maladaptive patterns of thinking, known as cognitive distortions, whereby they think about themselves, the world, and the future in overly negative and inaccurate ways. These distortions are associated with marked changes in an individual's mood, behavior, and language. We hypothesize that societies can undergo similar changes in their collective psychology that are reflected in historical records of language use. Here, we investigate the prevalence of textual markers of cognitive distortions in over 14 million books for the past 125 y and observe a surge of their prevalence since the 1980s, to levels exceeding those of the Great Depression and both World Wars. This pattern does not seem to be driven by changes in word meaning, publishing and writing standards, or the Google Books sample. Our results suggest a recent societal shift toward language associated with cognitive distortions and internalizing disorders.


Subjects
Cognition Disorders/epidemiology, Language/history, Records/statistics & numerical data, Female, Germany/epidemiology, 19th Century History, 20th Century History, 21st Century History, Humans, Male, Spain/epidemiology, United States/epidemiology
17.
J Child Sex Abus ; 33(1): 102-125, 2024.
Article in English | MEDLINE | ID: mdl-37994404

ABSTRACT

This critique alerts practicing professionals of the multiple misleading statements in the recently published article entitled, "A compendium of risk and needs tools for assessing male youths at-risk to and/or who have engaged in sexually abusive behaviors." This critique corrects the erroneous information contained in Jung and Thomas' article, providing current accurate information related to the important distinct differences of available standardized risk assessment tools used in forensic settings with youths who have engaged in sexually abusive behaviors. Erroneous statements by other researchers and authors in the field are also discussed. Forensic cases are distinctively different from others seen in clinical settings, requiring specific knowledge and skill set, a notable distinction not often mentioned in research literature.


Subjects
Child Sexual Abuse, Child, Humans, Male, Adolescent, Risk Assessment, Aggression, Records, Sexual Behavior
18.
Bull World Health Organ ; 101(10): 637-648, 2023 Oct 01.
Article in English | MEDLINE | ID: mdl-37772197

ABSTRACT

Objective: To evaluate the precision and dependability of road traffic mortality data recorded in the World Health Organization Mortality Database and investigate how uncorrected data influence vital mortality statistics used in traffic safety programmes worldwide. Methods: We assessed country and territory-specific data quality from 2015 to 2020 by calculating the proportions of five types of nonspecific cause of death codes related to road traffic mortality. We compared age-adjusted road traffic mortality and changes in the average annual mortality rate before and after correcting the deaths with nonspecific codes. We generated road traffic mortality projections with both corrected and uncorrected codes, and redistributed the data using the proportionate method. Findings: We analysed data from 124 countries and territories with at least one year of mortality data from 2015 to 2020. The number of countries and territories reporting more than 20% of deaths with ill-defined or unknown cause was 2; countries reporting injury deaths with undetermined intent was 3; countries reporting unspecified unintentional injury deaths was 21; countries reporting unspecified transport crash deaths was 3; and countries reporting unspecified unintentional road traffic deaths was 30. After redistributing deaths with nonspecific codes, road traffic mortality changed by greater than 50% in 7% (5/73) to 18% (9/51) of countries and territories. Conclusion: Nonspecific codes led to inaccurate mortality estimates in many countries. We recommend that injury researchers and policy-makers acknowledge the potential pitfalls of relying on raw or uncorrected road traffic mortality data and instead use corrected data to ensure more accurate estimates when improving road traffic safety programmes.
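
The abstract does not spell out the proportionate method, so the following is only a generic sketch of proportional redistribution: deaths carrying a nonspecific code are reassigned to the specific causes in proportion to each cause's share of the specifically coded deaths. The cause names, the counts, and the function name are hypothetical.

```python
from typing import Dict


def redistribute_proportionally(
    specific: Dict[str, float], nonspecific: float
) -> Dict[str, float]:
    """Add nonspecific deaths to each specific cause in proportion to its share."""
    total = sum(specific.values())
    if total <= 0:
        raise ValueError("need at least one specifically coded death")
    return {
        cause: count + nonspecific * count / total
        for cause, count in specific.items()
    }


# Hypothetical country-year: 1,000 specifically coded injury deaths
# plus 200 deaths recorded under a nonspecific code.
coded = {"road_traffic": 300.0, "other_transport": 100.0, "other_injury": 600.0}
corrected = redistribute_proportionally(coded, 200.0)

# Relative change in the road traffic count after correction, in percent.
change_pct = (corrected["road_traffic"] - coded["road_traffic"]) / coded["road_traffic"] * 100

print(corrected)
print(f"road traffic deaths change by {change_pct:.0f}%")
```

In this toy case the road traffic count rises from 300 to 360. The total number of deaths is preserved, and the size of the correction depends only on how large the nonspecific share is relative to the specifically coded deaths.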


Subjects
Vital Statistics, Wounds and Injuries, Humans, Traffic Accidents, Factual Databases, World Health Organization, Records
19.
Bull World Health Organ ; 101(3): 179-190, 2023 Mar 01.
Article in English | MEDLINE | ID: mdl-36865603

ABSTRACT

Objective: To describe the changes in tuberculosis case notifications by the private sector after implementation of the Joint Effort for Elimination of Tuberculosis project in India in 2018. Methods: We retrieved data from the project recorded in India's national tuberculosis surveillance system. We analysed data on 95 project districts in six states (Andhra Pradesh, Himachal Pradesh, Karnataka, Punjab including Chandigarh, Telangana and West Bengal) to assess changes in the number of tuberculosis notifications, private provider notifiers and microbiological confirmations of cases from 2017 (baseline) to 2019. We compared case notification rates in districts where the project was implemented with the rates in districts where it was not. Findings: From 2017 to 2019, tuberculosis notifications increased by 138.1% (from 44 695 to 106 404), and case notification rates more than doubled from 20 to 44 per 100 000 population. The number of private notifiers increased by over threefold, from 2912 to 9525, during this period. The number of microbiologically confirmed pulmonary and extra-pulmonary tuberculosis cases notified increased by more than two times (from 10 780 to 25 384) and nearly three times (from 1477 to 4096), respectively. The districts where the project was implemented showed a 150.3% increase in case notification rates per 100 000 population from 2017 to 2019 (from 16.8 to 41.9) while in non-project districts, this increase was only 89.8% (from 6.1 to 11.6). Conclusion: The substantial increase in tuberculosis notifications demonstrates the value of the project in engaging the private sector. Scaling up these interventions is important to consolidate and extend these gains towards tuberculosis elimination.
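
The percentage and fold changes quoted in the findings can be rechecked from the counts given in the abstract. The short sketch below does only that arithmetic; the function names are illustrative.

```python
def percent_increase(baseline: float, followup: float) -> float:
    """Relative change from baseline to follow-up, in percent."""
    return (followup - baseline) / baseline * 100.0


def fold_change(baseline: float, followup: float) -> float:
    """Ratio of the follow-up value to the baseline value."""
    return followup / baseline


# Counts quoted in the abstract (2017 baseline, 2019 follow-up).
notification_increase = percent_increase(44_695, 106_404)
notifier_fold = fold_change(2_912, 9_525)
pulmonary_fold = fold_change(10_780, 25_384)
extrapulmonary_fold = fold_change(1_477, 4_096)

print(f"notifications: +{notification_increase:.1f}%")  # +138.1%
print(f"private notifiers: {notifier_fold:.2f}-fold")
print(f"confirmed pulmonary cases: {pulmonary_fold:.2f}-fold")
print(f"confirmed extra-pulmonary cases: {extrapulmonary_fold:.2f}-fold")
```

Applying the same function to the rounded rates in the abstract (16.8 to 41.9 and 6.1 to 11.6 per 100 000) gives roughly 149% and 90%, slightly off the reported 150.3% and 89.8%, presumably because the published percentages were computed from unrounded rates.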


Subjects
Extrapulmonary Tuberculosis, Tuberculosis, Humans, India/epidemiology, Tuberculosis/epidemiology, Tuberculosis/prevention & control, Private Sector, Records
20.
PLoS Comput Biol ; 18(9): e1010356, 2022 09.
Article in English | MEDLINE | ID: mdl-36107931

ABSTRACT

The ubiquitous use of computational work for data generation, processing, and modeling has increased the importance of digital documentation in improving research quality and impact. Computational notebooks are files that contain descriptive text, as well as code and its outputs, in a single, dynamic, and visually appealing file that is easier to understand by nonspecialists. Traditionally used by data scientists when producing reports and informing decision-making, the use of this tool in research publication is not common, despite its potential to increase research impact and quality. For a single study, the content of such documentation partially overlaps with that of classical lab notebooks and that of the scientific manuscript reporting the study. Therefore, to minimize the amount of work required to manage all the files related to these contents and optimize their production, we present a starter kit to facilitate the implementation of computational notebooks in the research process, including publication. The kit contains the template of a computational notebook integrated into a research project that employs R, Python, or Julia. Using examples of ecological studies, we show how computational notebooks also foster the implementation of principles of Open Science, such as reproducibility and traceability. The kit is designed for beginners, but at the end we present practices that can be gradually implemented to develop a fully digital research workflow. Our hope is that such a minimalist yet effective starter kit will encourage researchers to adopt this practice in their workflow, regardless of their computational background.


Subjects
Documentation, Records, Reproducibility of Results, Workflow