Results 1 - 12 of 12
3.
PeerJ ; 6: e4375, 2018.
Article in English | MEDLINE | ID: mdl-29456894

ABSTRACT

Despite growing interest in Open Access (OA) to scholarly literature, there is an unmet need for large-scale, up-to-date, and reproducible studies assessing the prevalence and characteristics of OA. We address this need using oaDOI, an open online service that determines OA status for 67 million articles. We use three samples, each of 100,000 articles, to investigate OA in three populations: (1) all journal articles assigned a Crossref DOI, (2) recent journal articles indexed in Web of Science, and (3) articles viewed by users of Unpaywall, an open-source browser extension that lets users find OA articles using oaDOI. We estimate that at least 28% of the scholarly literature is OA (19M in total) and that this proportion is growing, driven particularly by growth in Gold and Hybrid. The most recent year analyzed (2015) also has the highest percentage of OA (45%). Because of this growth, and the fact that readers disproportionately access newer articles, we find that Unpaywall users encounter OA quite frequently: 47% of articles they view are OA. Notably, the most common mechanism for OA is not Gold, Green, or Hybrid OA, but rather an under-discussed category we dub Bronze: articles made free-to-read on the publisher website, without an explicit Open license. We also examine the citation impact of OA articles, corroborating the so-called open-access citation advantage: accounting for age and discipline, OA articles receive 18% more citations than average, an effect driven primarily by Green and Hybrid OA. We encourage further research using the free oaDOI service, as a way to inform OA policy and practice.

5.
PeerJ ; 1: e175, 2013.
Article in English | MEDLINE | ID: mdl-24109559

ABSTRACT

Background. Attribution to the original contributor upon reuse of published data is important both as a reward for data creators and to document the provenance of research findings. Previous studies have found that papers with publicly available datasets receive a higher number of citations than similar studies without available data. However, few previous analyses have had the statistical power to control for the many variables known to predict citation rate, which has led to uncertain estimates of the "citation benefit". Furthermore, little is known about patterns in data reuse over time and across datasets. Method and Results. Here, we look at citation rates while controlling for many known citation predictors and investigate the variability of data reuse. In a multivariate regression on 10,555 studies that created gene expression microarray data, we found that studies that made data available in a public repository received 9% (95% confidence interval: 5% to 13%) more citations than similar studies for which the data was not made available. Date of publication, journal impact factor, open access status, number of authors, first and last author publication history, corresponding author country, institution citation history, and study topic were included as covariates. The citation benefit varied with date of dataset deposition: a citation benefit was most clear for papers published in 2004 and 2005, at about 30%. Authors published most papers using their own datasets within two years of their first publication on the dataset, whereas data reuse papers published by third-party investigators continued to accumulate for at least six years. To study patterns of data reuse directly, we compiled 9,724 instances of third party data reuse via mention of GEO or ArrayExpress accession numbers in the full text of papers. The level of third-party data use was high: for 100 datasets deposited in year 0, we estimated that 40 papers in PubMed reused a dataset by year 2, 100 by year 4, and more than 150 data reuse papers had been published by year 5. Data reuse was distributed across a broad base of datasets: a very conservative estimate found that 20% of the datasets deposited between 2003 and 2007 had been reused at least once by third parties. Conclusion. After accounting for other factors affecting citation rate, we find a robust citation benefit from open data, although a smaller one than previously reported. We conclude there is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data. Other factors that may also contribute to the citation benefit are considered. We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003.
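
The abstract above describes finding data reuse through mentions of GEO or ArrayExpress accession numbers in full text. A minimal sketch of that kind of matching follows. It is not the authors' pipeline: the function name `find_accessions`, the sample sentence, and the exact patterns are illustrative assumptions. The patterns assume GEO accessions take the form GSE, GDS, GSM, or GPL followed by digits, and ArrayExpress accessions the form `E-` plus a four-letter code, a hyphen, and digits.

```python
import re

# Assumed GEO accession shapes: series (GSE), dataset (GDS), sample (GSM),
# platform (GPL), each followed by digits.
GEO_PATTERN = re.compile(r"\b(?:GSE|GDS|GSM|GPL)\d+\b")

# Assumed ArrayExpress accession shape: "E-", a four-letter code, "-", digits.
ARRAYEXPRESS_PATTERN = re.compile(r"\bE-[A-Z]{4}-\d+\b")


def find_accessions(full_text: str) -> dict[str, list[str]]:
    """Return the unique accession numbers mentioned in a text, sorted."""
    return {
        "geo": sorted(set(GEO_PATTERN.findall(full_text))),
        "arrayexpress": sorted(set(ARRAYEXPRESS_PATTERN.findall(full_text))),
    }


# Hypothetical sentence, not taken from any real paper.
sample = "We reanalysed GSE1234 and GSE1234 together with E-MEXP-567 (platform GPL96)."
print(find_accessions(sample))
```

Duplicate mentions collapse to a single entry, so each paper counts at most once per dataset. A real study would also need to separate third-party reuse from the depositing authors' own papers, which this sketch does not attempt.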

6.
PLoS One ; 6(7): e18657, 2011.
Article in English | MEDLINE | ID: mdl-21765886

ABSTRACT

Many initiatives encourage investigators to share their raw datasets in hopes of increasing research efficiency and quality. Despite these investments of time and money, we do not have a firm grasp of who openly shares raw research data, who doesn't, and which initiatives are correlated with high rates of data sharing. In this analysis I use bibliometric methods to identify patterns in the frequency with which investigators openly archive their raw gene expression microarray datasets after study publication. Automated methods identified 11,603 articles published between 2000 and 2009 that describe the creation of gene expression microarray data. Associated datasets in best-practice repositories were found for 25% of these articles, increasing from less than 5% in 2001 to 30%-35% in 2007-2009. Accounting for sensitivity of the automated methods, approximately 45% of recent gene expression studies made their data publicly available. First-order factor analysis on 124 diverse bibliometric attributes of the data creation articles revealed 15 factors describing authorship, funding, institution, publication, and domain environments. In multivariate regression, authors were most likely to share data if they had prior experience sharing or reusing data, if their study was published in an open access journal or a journal with a relatively strong data sharing policy, or if the study was funded by a large number of NIH grants. Authors of studies on cancer and human subjects were least likely to make their datasets available. These results suggest research data sharing levels are still low and increasing only slowly, and data is least available in areas where it could make the biggest impact. Let's learn from those with high rates of sharing to embrace the full potential of our research output.


Subjects
Archives; Cooperative Behavior; Information Dissemination; Research/statistics & numerical data; Databases, Genetic; Humans; Multivariate Analysis; Odds Ratio; Periodicals as Topic
7.
PLoS One ; 6(4): e18537, 2011 Apr 22.
Article in English | MEDLINE | ID: mdl-21526115

ABSTRACT

Human functional magnetic resonance imaging (fMRI) informs the understanding of the neural basis of mental function and is a key domain of ethical enquiry. It raises questions about the practice and implications of research, and reflexively informs ethics through the empirical investigation of moral judgments. It is at the centre of debate surrounding the importance of neuroscience findings for concepts such as personhood and free will, and the extent of their practical consequences. Here, we map the landscape of fMRI and neuroethics, using citation analysis to uncover salient topics. We find that this landscape is sparsely populated: despite previous calls for debate, there are few articles that discuss both fMRI and ethical, legal, or social implications (ELSI), and even fewer direct citations between the two literatures. Recognizing that practical barriers exist to integrating ELSI discussion into the research literature, we argue nonetheless that the ethical challenges of fMRI, and controversy over its conceptual and practical implications, make this essential.


Subjects
Magnetic Resonance Imaging/ethics; Neurosciences/ethics; Cluster Analysis; Databases as Topic; Humans; Neurosciences/legislation & jurisprudence; Periodicals as Topic
8.
J Biomed Discov Collab ; 5: 7-20, 2010 Mar 28.
Article in English | MEDLINE | ID: mdl-20349403

ABSTRACT

BACKGROUND: The ability to locate publicly available gene expression microarray datasets effectively and efficiently facilitates the reuse of these potentially valuable resources. Centralized biomedical databases allow users to query dataset metadata descriptions, but these annotations are often too sparse and diverse to allow complex and accurate queries. In this study we examined the ability of PubMed article identifiers to locate publicly available gene expression microarray datasets, and investigated whether the retrieved datasets were representative of publicly available datasets found through statements of data sharing in the associated research articles. RESULTS: In a recent article, Ochsner and colleagues identified 397 studies that had generated gene expression microarray data. Their search of the full text of each publication for statements of data sharing revealed 203 publicly available datasets, including 179 in the Gene Expression Omnibus (GEO) or ArrayExpress databases. Our scripted search of GEO and ArrayExpress for PubMed identifiers of the same 397 studies returned 160 datasets, including six not found by the original search for data sharing statements. As a proportion of datasets found by either method, the search for data sharing statements identified 91.4% of the 209 publicly available datasets, compared to only 76.6% found by our search carried out using PubMed identifiers. Searching GEO or ArrayExpress alone retrieved 63.2% and 46.9% of all available datasets, respectively. There was no difference in the type of datasets found by PubMed identifier searches in terms of research theme or the technology used. However, the studies identified were more likely to have larger sample sizes, were more frequently cited, and published in higher impact journals. CONCLUSIONS: Searching database entries using PubMed identifiers can identify the majority of publicly available datasets, but caution is required when this method is used to collect data for policy evaluation since studies in low impact journals are disproportionately excluded. We urge authors of all datasets to complete the citation fields for their dataset submissions once publication details are known, thereby ensuring their work has maximum visibility and can contribute to subsequent studies.
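
The comparison above expresses each search method's yield as a share of the datasets found by either method. The sketch below shows that calculation on made-up data. The function name `coverage_percent` and the dataset identifiers are hypothetical; only the counts in the last line (160 datasets out of 209) come from the abstract.

```python
def coverage_percent(found: set[str], all_known: set[str]) -> float:
    """Share of all known datasets that one search method retrieved, in percent."""
    if not all_known:
        raise ValueError("all_known must not be empty")
    return round(100 * len(found & all_known) / len(all_known), 1)


# Hypothetical dataset identifiers, for illustration only.
by_statement = {"ds1", "ds2", "ds3", "ds4"}   # found via data sharing statements
by_pubmed_id = {"ds2", "ds3", "ds5"}          # found via PubMed identifier search
either = by_statement | by_pubmed_id          # denominator: found by either method

print(coverage_percent(by_statement, either))  # 80.0
print(coverage_percent(by_pubmed_id, either))  # 60.0

# Counts reported in the abstract for the PubMed identifier search.
print(round(100 * 160 / 209, 1))  # 76.6
```

Using the union as the denominator means neither method is treated as ground truth, which is why both percentages can fall below 100.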

9.
J Informetr ; 4(2): 148-156, 2010 Apr.
Article in English | MEDLINE | ID: mdl-21339841

ABSTRACT

The public sharing of primary research datasets potentially benefits the research community but is not yet common practice. In this pilot study, we analyzed whether data sharing frequency was associated with funder and publisher requirements, journal impact factor, or investigator experience and impact. Across 397 recent biomedical microarray studies, we found investigators were more likely to publicly share their raw dataset when their study was published in a high-impact journal and when the first or last authors had high levels of career experience and impact. We estimate the USA's National Institutes of Health (NIH) data sharing policy applied to 19% of the studies in our cohort; being subject to the NIH data sharing plan requirement was not found to correlate with increased data sharing behavior in multivariate logistic regression analysis. Studies published in journals that required a database submission accession number as a condition of publication were more likely to share their data, but this trend was not statistically significant. These early results will inform our ongoing larger analysis, and hopefully contribute to the development of more effective data sharing initiatives.

10.
AMIA Annu Symp Proc ; : 1097, 2008 Nov 06.
Article in English | MEDLINE | ID: mdl-18998885

ABSTRACT

Repurposing research data holds many benefits for the advancement of biomedicine, yet is very difficult to measure and evaluate. We propose a data reuse registry to maintain links between primary research datasets and studies that reuse this data. Such a resource could help recognize investigators whose work is reused, illuminate aspects of reusability, and evaluate policies designed to encourage data sharing and reuse.


Subjects
Biomedical Research/methods; Database Management Systems; Databases, Factual; Information Storage and Retrieval/methods; Registries; Research Design; Pennsylvania
11.
AMIA Annu Symp Proc ; : 596-600, 2008 Nov 06.
Article in English | MEDLINE | ID: mdl-18998887

ABSTRACT

Many policies and projects now encourage investigators to share their raw research data with other scientists. Unfortunately, it is difficult to measure the effectiveness of these initiatives because data can be shared in such a variety of mechanisms and locations. We propose a novel approach to finding shared datasets: using NLP techniques to identify declarations of dataset sharing within the full text of primary research articles. Using regular expression patterns and machine learning algorithms on open access biomedical literature, our system was able to identify 61% of articles with shared datasets with 80% precision. A simpler version of our classifier achieved higher recall (86%), though lower precision (49%). We believe our results demonstrate the feasibility of this approach and hope to inspire further study of dataset retrieval techniques and policy evaluation.
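
To make the reported precision and recall figures concrete, here is a small sketch of a regular-expression classifier for data sharing declarations, scored against hand labels. It covers only the pattern-matching half of the approach described above and is not the authors' system: the patterns, the five example sentences, and their labels are all hypothetical.

```python
import re

# Hypothetical patterns for data sharing declarations; not the authors' patterns.
SHARING_PATTERNS = [
    re.compile(
        r"\b(?:deposited|submitted|available)\b.{0,80}"
        r"\b(?:GEO|Gene Expression Omnibus|ArrayExpress)\b",
        re.IGNORECASE | re.DOTALL,
    ),
    re.compile(r"\baccession (?:number|no\.?|code)s?\b", re.IGNORECASE),
]


def declares_sharing(text: str) -> bool:
    """Predict whether a passage declares that a dataset was shared."""
    return any(pattern.search(text) for pattern in SHARING_PATTERNS)


def precision_recall(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up passages paired with made-up "truly shared" labels.
examples = [
    ("The microarray data have been deposited in the Gene Expression Omnibus "
     "under accession number GSE0000.", True),
    ("Raw data are available from ArrayExpress.", True),
    ("Data are available on request from the authors.", False),
    ("We used the accession number of a previously published dataset.", False),
    ("All expression profiles can be downloaded from our laboratory website.", True),
]

predicted = [declares_sharing(text) for text, _ in examples]
actual = [label for _, label in examples]
print(precision_recall(predicted, actual))
```

The fourth passage is a false positive (it mentions an accession number for someone else's data) and the fifth a false negative (sharing outside a recognised repository). Loosening the patterns to catch more such cases raises recall while admitting more false positives, the same trade-off the abstract reports between its two classifier variants.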


Subjects
Cooperative Behavior; Information Dissemination/methods; Information Storage and Retrieval/methods; Natural Language Processing; Pattern Recognition, Automated/methods; Periodicals as Topic/classification; Subject Headings; Algorithms; Artificial Intelligence
12.
PLoS One ; 2(3): e308, 2007 Mar 21.
Article in English | MEDLINE | ID: mdl-17375194

ABSTRACT

BACKGROUND: Sharing research data provides benefit to the general scientific community, but the benefit is less obvious for the investigator who makes his or her data available. PRINCIPAL FINDINGS: We examined the citation history of 85 cancer microarray clinical trial publications with respect to the availability of their data. The 48% of trials with publicly available microarray data received 85% of the aggregate citations. Publicly available data was significantly (p = 0.006) associated with a 69% increase in citations, independently of journal impact factor, date of publication, and author country of origin using linear regression. SIGNIFICANCE: This correlation between publicly available data and increased literature impact may further motivate investigators to share their detailed research data.


Subjects
Biomedical Research/trends; Information Dissemination/methods; Biomedical Research/economics; Clinical Trials as Topic; Cost-Benefit Analysis; Humans; Internet; Journal Impact Factor; Neoplasms/therapy; Oligonucleotide Array Sequence Analysis; Periodicals as Topic/statistics & numerical data; Publications/statistics & numerical data; Regression Analysis