1.
Proc Natl Acad Sci U S A ; 121(6): e2306549121, 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38300861

ABSTRACT

Understanding and predicting the emergence and evolution of cultural tastes manifested in consumption patterns is of central interest to social scientists, analysts of culture, and purveyors of content. Prior research suggests that taste preferences relate to personality traits, values, shifts in mood, and immigration destination. Understanding everyday patterns of listening and the function music plays in life has remained elusive, however, despite speculation that musical nostalgia may compensate for local disruption. Using more than one hundred million streams of four million songs by tens of thousands of international listeners from a global music service, we show that breaches in personal routine are systematically associated with personal musical exploration. As people visited new cities and countries, their preferences diversified, converging toward their travel destinations. As people experienced the very different disruptions associated with COVID-19 lockdowns, their preferences diversified further. Personal explorations did not tend to veer toward the global listening average, but away from it, toward distinctive regional musical content. Exposure to novel music explored during periods of routine disruption showed a persistent influence on listeners' future consumption patterns. Across all of these settings, musical preference reflected rather than compensated for life's surprises, leaving a lasting legacy on tastes. We explore the relationship between these findings and global patterns of behavior and cultural consumption.


Subject(s)
Music , Humans , Affect , Forecasting
2.
Nature ; 582(7813): E16, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32499659

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

3.
Nature ; 575(7781): 190-194, 2019 11.
Article in English | MEDLINE | ID: mdl-31666706

ABSTRACT

Human achievements are often preceded by repeated attempts that fail, but little is known about the mechanisms that govern the dynamics of failure. Here, building on previous research relating to innovation [1-7], human dynamics [8-11] and learning [12-17], we develop a simple one-parameter model that mimics how successful future attempts build on past efforts. Solving this model analytically suggests that a phase transition separates the dynamics of failure into regions of progression or stagnation and predicts that, near the critical threshold, agents who share similar characteristics and learning strategies may experience fundamentally different outcomes following failures. Above the critical point, agents exploit incremental refinements to systematically advance towards success, whereas below it, they explore disjoint opportunities without a pattern of improvement. The model makes several empirically testable predictions, demonstrating that those who eventually succeed and those who do not may initially appear similar, but can be characterized by fundamentally distinct failure dynamics in terms of the efficiency and quality associated with each subsequent attempt. We collected large-scale data from three disparate domains and traced repeated attempts by investigators to obtain National Institutes of Health (NIH) grants to fund their research, innovators to successfully exit their startup ventures, and terrorist organizations to claim casualties in violent attacks. We find broadly consistent empirical support across all three domains, which systematically verifies each prediction of our model. Together, our findings unveil detectable yet previously unknown early signals that enable us to identify failure dynamics that will lead to ultimate success or failure. Given the ubiquitous nature of failure and the paucity of quantitative approaches to understand it, these results represent an initial step towards the deeper understanding of the complex dynamics underlying failure.


Subject(s)
Achievement , Entrepreneurship/statistics & numerical data , Financing, Organized/statistics & numerical data , Learning , Science , Security Measures/statistics & numerical data , Terrorism/statistics & numerical data , Datasets as Topic , Entrepreneurship/economics , Financing, Organized/economics , Humans , Inventions , Investments/economics , Models, Theoretical , National Institutes of Health (U.S.) , Research Personnel/psychology , Research Personnel/standards , Research Personnel/statistics & numerical data , Science/economics , Security Measures/economics , United States
4.
Nature ; 566(7744): 378-382, 2019 02.
Article in English | MEDLINE | ID: mdl-30760923

ABSTRACT

One of the most universal trends in science and technology today is the growth of large teams in all areas, as solitary researchers and small teams diminish in prevalence [1-3]. Increases in team size have been attributed to the specialization of scientific activities [3], improvements in communication technology [4,5], or the complexity of modern problems that require interdisciplinary solutions [6-8]. This shift in team size raises the question of whether and how the character of the science and technology produced by large teams differs from that of small teams. Here we analyse more than 65 million papers, patents and software products that span the period 1954-2014, and demonstrate that across this period smaller teams have tended to disrupt science and technology with new ideas and opportunities, whereas larger teams have tended to develop existing ones. Work from larger teams builds on more-recent and popular developments, and attention to their work comes immediately. By contrast, contributions by smaller teams search more deeply into the past, are viewed as disruptive to science and technology and succeed further into the future, if at all. Observed differences between small and large teams are magnified for higher-impact work, with small teams known for disruptive work and large teams for developing work. Differences in topic and research design account for a small part of the relationship between team size and disruption; most of the effect occurs at the level of the individual, as people move between smaller and larger teams. These results demonstrate that both small and large teams are essential to a flourishing ecology of science and technology, and suggest that, to achieve this, science policies should aim to support a diversity of team sizes.


Subject(s)
Diffusion of Innovation , Group Processes , Interdisciplinary Research/organization & administration , Science/organization & administration , Science/statistics & numerical data , Technology/organization & administration , Technology/statistics & numerical data , Cooperative Behavior , Databases, Factual , Interdisciplinary Research/statistics & numerical data , Interdisciplinary Research/trends , Nobel Prize , Patents as Topic/statistics & numerical data , Research Support as Topic , Science/trends , Software/supply & distribution , Technology/trends
5.
Proc Natl Acad Sci U S A ; 118(41)2021 10 12.
Article in English | MEDLINE | ID: mdl-34607941

ABSTRACT

In many academic fields, the number of papers published each year has increased significantly over time. Policy measures aim to increase the quantity of scientists, research funding, and scientific output, which is measured by the number of papers produced. These quantitative metrics determine the career trajectories of scholars and evaluations of academic departments, institutions, and nations. Whether and how these increases in the numbers of scientists and papers translate into advances in knowledge is unclear, however. Here, we first lay out a theoretical argument for why too many papers published each year in a field can lead to stagnation rather than advance. The deluge of new papers may deprive reviewers and readers of the cognitive slack required to fully recognize and understand novel ideas. Competition among many new ideas may prevent the gradual accumulation of focused attention on a promising new idea. Then, we show data supporting the predictions of this theory. When the number of papers published per year in a scientific field grows large, citations flow disproportionately to already well-cited papers; the list of most-cited papers ossifies; new papers are unlikely to ever become highly cited, and when they do, it is not through a gradual, cumulative process of attention gathering; and newly published papers become unlikely to disrupt existing work. These findings suggest that the progress of large scientific fields may be slowed, trapped in existing canon. Policy measures shifting how scientific work is produced, disseminated, consumed, and rewarded may be called for to push fields into new, more fertile areas of study.

6.
Gastroenterology ; 162(4): 1197-1209.e13, 2022 04.
Article in English | MEDLINE | ID: mdl-34973296

ABSTRACT

BACKGROUND & AIMS: Barrett's esophagus (BE) is a risk factor for esophageal adenocarcinoma, but how it evolves is poorly understood. We investigated BE gland phenotype distribution, the clonal nature of phenotypic change, and how phenotypic diversity plays a role in progression. METHODS: Using immunohistochemistry and histology, we analyzed the distribution and the diversity of gland phenotype between and within biopsy specimens from patients with nondysplastic BE and those who had progressed to dysplasia or had developed postesophagectomy BE. Clonal relationships were determined by the presence of shared mutations between distinct gland types using laser capture microdissection sequencing of the mitochondrial genome. RESULTS: We identified 5 different gland phenotypes in a cohort of 51 nondysplastic patients where biopsy specimens were taken at the same anatomic site (1.0-2.0 cm superior to the gastroesophageal junction). Here, we observed the same number of glands with 1 and 2 phenotypes, but 3 phenotypes were rare. We showed a common ancestor between parietal cell-containing, mature gastric (oxyntocardiac) and goblet cell-containing, intestinal (specialized) gland phenotypes. Similarly, we have shown a clonal relationship between cardiac-type glands and specialized and mature intestinal glands. Using the Shannon diversity index as a marker of gland diversity, we observed significantly increased phenotypic diversity in patients with BE adjacent to dysplasia and predysplasia compared to nondysplastic BE and postesophagectomy BE, suggesting that diversity develops over time. CONCLUSIONS: We showed that the range of BE phenotypes represents an evolutionary process and that changes in gland diversity may play a role in progression. Furthermore, we showed a common ancestry between gastric and intestinal-type glands in BE.
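The abstract above uses the Shannon diversity index as its marker of gland diversity. As a rough illustration only, the index H = -sum(p_i * ln p_i) over phenotype proportions p_i can be sketched as follows. The function name and the example gland labels are assumptions made for this sketch and are not taken from the study's data or code; the natural logarithm is also an assumption, since the abstract does not state the base.

```python
import math
from collections import Counter


def shannon_diversity(labels):
    """Shannon diversity index H = -sum(p_i * ln(p_i)) over category proportions.

    `labels` is any iterable of category labels (here, hypothetical gland
    phenotypes observed in one biopsy). Returns 0.0 for empty input.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    h = 0.0
    for n in counts.values():
        p = n / total
        h -= p * math.log(p)
    return h


# Hypothetical biopsies: phenotype labels are illustrative, not study data.
uniform_biopsy = ["specialized"] * 6
mixed_biopsy = ["specialized", "cardiac", "oxyntocardiac", "specialized", "cardiac"]

print(shannon_diversity(uniform_biopsy))  # a single phenotype gives zero diversity
print(shannon_diversity(mixed_biopsy))    # a mixture gives a positive value
```

A biopsy containing one phenotype scores zero, and the score rises as phenotypes become more numerous and more evenly represented, which is the property the abstract relies on when comparing patient groups.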


Subject(s)
Barrett Esophagus , Esophageal Neoplasms , Barrett Esophagus/pathology , Esophageal Neoplasms/pathology , Esophagogastric Junction/pathology , Humans , Phenotype
7.
Proc Natl Acad Sci U S A ; 115(13): 3308-3313, 2018 03 27.
Article in English | MEDLINE | ID: mdl-29531061

ABSTRACT

Assessing scholarly influence is critical for understanding the collective system of scholarship and the history of academic inquiry. Influence is multifaceted, and citations reveal only part of it. Citation counts exhibit preferential attachment and follow a rigid "news cycle" that can miss sustained and indirect forms of influence. Building on dynamic topic models that track distributional shifts in discourse over time, we introduce a variant that incorporates features, such as authorship, affiliation, and publication venue, to assess how these contexts interact with content to shape future scholarship. We perform in-depth analyses on collections of physics research (500,000 abstracts; 102 years) and scholarship generally (JSTOR repository: 2 million full-text articles; 130 years). Our measure of document influence helps predict citations and shows how outcomes, such as winning a Nobel Prize or affiliation with a highly ranked institution, boost influence. Analysis of citations alongside discursive influence reveals that citations tend to credit authors who persist in their fields over time and discount credit for works that are influential over many topics or are "ahead of their time." In this way, our measures provide a way to acknowledge diverse contributions that take longer and travel farther to achieve scholarly appreciation, enabling us to correct citation biases and enhance sensitivity to the full spectrum of scholarly impact.

8.
Proc Natl Acad Sci U S A ; 115(50): 12630-12637, 2018 12 11.
Article in English | MEDLINE | ID: mdl-30530667

ABSTRACT

Rapid research progress in science and technology (S&T) and continuously shifting workforce needs exert pressure on each other and on the educational and training systems that link them. Higher education institutions aim to equip new generations of students with skills and expertise relevant to workforce participation for decades to come, but their offerings sometimes misalign with commercial needs and new techniques forged at the frontiers of research. Here, we analyze and visualize the dynamic skill (mis-)alignment between academic push, industry pull, and educational offerings, paying special attention to the rapidly emerging areas of data science and data engineering (DS/DE). The visualizations and computational models presented here can help key decision makers understand the evolving structure of skills so that they can craft educational programs that serve workforce needs. Our study uses millions of publications, course syllabi, and job advertisements published between 2010 and 2016. We show how courses mediate between research and jobs. We also discover responsiveness in the academic, educational, and industrial system in how skill demands from industry are as likely to drive skill attention in research as the converse. Finally, we reveal the increasing importance of uniquely human skills, such as communication, negotiation, and persuasion. These skills are currently underexamined in research and undersupplied through education for the labor market. In an increasingly data-driven economy, the demand for "soft" social skills, like teamwork and communication, increases with greater demand for "hard" technical skills and tools.


Subject(s)
Data Science/education , Employment , Research , Expert Testimony , Humans , Job Description , Social Skills , Surveys and Questionnaires , Workforce
9.
Gastroenterology ; 153(5): 1230-1239, 2017 11.
Article in English | MEDLINE | ID: mdl-28734832

ABSTRACT

BACKGROUND & AIMS: Little is known about the causes of heartburn in patients with gastro-esophageal reflux disease. Visible epithelial damage is seldom associated with symptom severity, evidenced by the significant symptom burden in patients with nonerosive reflux disease (NERD) compared with patients with erosive reflux disease (ERD) or Barrett's esophagus (BE). We studied the distribution of mucosal nerve fibers in patients with NERD, ERD, and BE, and compared the results with those of healthy subjects. METHODS: We performed a prospective study of 13 patients with NERD, 11 patients with ERD, and 16 patients with BE undergoing endoscopic evaluation in the United Kingdom or Greece. Biopsies were obtained from the proximal and distal esophageal mucosa of patients with NERD, from the distal esophageal mucosa of patients with ERD, and the distal-most squamous epithelium of patients with BE. These were examined for the presence and location of nerve fibers that reacted with a labeled antibody against calcitonin gene-related peptide (CGRP), a marker of nociceptive sensory nerves. The results were compared with those from 10 healthy volunteers (controls). RESULTS: The distribution of CGRP-positive nerves did not differ significantly between the distal esophageal mucosa of controls (median, 25.5 cell layers to surface; interquartile range [IQR], 21.4-28.8) vs patients with ERD (median, 23 cell layers to surface; IQR, 16-27.5), or patients with BE (median, 21.5 cell layers to surface; IQR, 16.1-27.5). However, CGRP-positive nerves were significantly more superficial in mucosa from patients with NERD-both distal (median, 9.5 cell layers to surface; IQR, 1.5-13.3; P < .0001 vs ERD, BE, and controls) and proximal (median, 5.0 cell layers to surface; IQR, 2.5-9.3 vs median 10.4 cell layers to surface; IQR, 8.0-16.9; P = .0098 vs controls). 
CONCLUSIONS: Proximal and distal esophageal mucosa of patients with NERD have more superficial afferent nerves compared with controls or patients with ERD or BE. Acid hypersensitivity in patients with NERD might be partially explained by the increased proximity of their afferent nerves to the esophageal lumen, and therefore greater exposure to noxious substances in refluxate.


Subject(s)
Barrett Esophagus/pathology , Esophageal Mucosa/innervation , Gastroesophageal Reflux/pathology , Heartburn/pathology , Hyperalgesia/pathology , Sensory Receptor Cells/pathology , Adult , Aged , Barrett Esophagus/physiopathology , Biomarkers/analysis , Biopsy , Calcitonin Gene-Related Peptide/analysis , Case-Control Studies , Female , Gastroesophageal Reflux/physiopathology , Greece , Heartburn/physiopathology , Humans , Hyperalgesia/physiopathology , Immunohistochemistry , Male , Middle Aged , Prospective Studies , Sensory Receptor Cells/chemistry , United Kingdom , Young Adult
10.
Proc Natl Acad Sci U S A ; 112(47): 14569-74, 2015 Nov 24.
Article in English | MEDLINE | ID: mdl-26554009

ABSTRACT

A scientist's choice of research problem affects his or her personal career trajectory. Scientists' combined choices affect the direction and efficiency of scientific discovery as a whole. In this paper, we infer preferences that shape problem selection from patterns of published findings and then quantify their efficiency. We represent research problems as links between scientific entities in a knowledge network. We then build a generative model of discovery informed by qualitative research on scientific problem selection. We map salient features from this literature to key network properties: an entity's importance corresponds to its degree centrality, and a problem's difficulty corresponds to the network distance it spans. Drawing on millions of papers and patents published over 30 years, we use this model to infer the typical research strategy used to explore chemical relationships in biomedicine. This strategy generates conservative research choices focused on building up knowledge around important molecules. These choices become more conservative over time. The observed strategy is efficient for initial exploration of the network and supports scientific careers that require steady output, but is inefficient for science as a whole. Through supercomputer experiments on a sample of the network, we study thousands of alternatives and identify strategies much more efficient at exploring mature knowledge networks. We find that increased risk-taking and the publication of experimental failures would substantially improve the speed of discovery. We consider institutional shifts in grant making, evaluation, and publication that would help realize these efficiencies.
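The mapping described in the abstract above (an entity's importance as its degree centrality, a problem's difficulty as the network distance it spans) can be illustrated with a small sketch. The toy graph, entity names, and function names below are hypothetical and chosen only for illustration; they are not the authors' model or data.

```python
from collections import deque


def degree_centrality(graph):
    """Normalized degree centrality: neighbors / (number of nodes - 1)."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def network_distance(graph, source, target):
    """Shortest-path length between two nodes via breadth-first search.

    Returns None when the target cannot be reached from the source.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Hypothetical knowledge network: nodes are scientific entities, edges are
# published links between them (undirected adjacency sets).
knowledge = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

importance = degree_centrality(knowledge)
# A candidate research problem is a not-yet-published link, e.g. A-D.
difficulty = network_distance(knowledge, "A", "D")
print(importance["C"], importance["D"], difficulty)
```

In this toy network, entity C is the most connected, and the unlinked pair A and D sits two steps apart, so under the abstract's mapping a study linking them would count as a more distant (harder) problem than one linking direct neighbors.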


Subject(s)
Research , Science , Humans , Publications , Qualitative Research , Risk-Taking
11.
Adv Exp Med Biol ; 908: 27-40, 2016.
Article in English | MEDLINE | ID: mdl-27573766

ABSTRACT

Barrett's esophagus (BO) is a preneoplastic condition described as the replacement of the stratified squamous epithelium of the distal esophagus with one that histologically presents as a diverse mixture of metaplastic glands resembling gastric or intestinal-type columnar epithelium. The clonal origins of BO are still unclear. More recently, we have begun to investigate the relationship between the various metaplastic gland phenotypes observed in BO, how they evolve, and the cancer risk they bestow. Studies have revealed that glands along the BO segment are clonal units containing a single stem cell clone that can give rise to all the differentiated epithelial cell types in glands. Clonal lineage tracing analysis has revealed that Barrett's glands are capable of bifurcation and this facilitates clonal expansion and competition. In fact, BO in some patients appears to consist of multiple, independently initiated clones that compete with each other for space and possibly resources. This chapter discusses the concepts of clonal competition and expansion in BO and sets out to query what we know about the role of gland diversity and phenotypic evolution within this complex columnar metaplasia.


Subject(s)
Barrett Esophagus/pathology , Clonal Evolution , Esophageal Neoplasms/pathology , Esophagus/pathology , Barrett Esophagus/metabolism , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Cell Lineage/genetics , Clone Cells/metabolism , Clone Cells/pathology , Esophageal Neoplasms/genetics , Esophagus/metabolism , Humans , Intestinal Mucosa/metabolism , Intestinal Mucosa/pathology , Mucins/genetics , Mucins/metabolism
12.
PLoS Comput Biol ; 10(9): e1003799, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25255227

ABSTRACT

Synonymous relationships among biomedical terms are extensively annotated within specialized terminologies, implying that synonymy is important for practical computational applications within this field. It remains unclear, however, whether text mining actually benefits from documented synonymy and whether existing biomedical thesauri provide adequate coverage of these linguistic relationships. In this study, we examine the impact and extent of undocumented synonymy within a very large compendium of biomedical thesauri. First, we demonstrate that missing synonymy has a significant negative impact on named entity normalization, an important problem within the field of biomedical text mining. To estimate the amount of synonymy currently missing from thesauri, we develop a probabilistic model for the construction of synonym terminologies that is capable of handling a wide range of potential biases, and we evaluate its performance using the broader domain of near-synonymy among general English words. Our model predicts that over 90% of these relationships are currently undocumented, a result that we support experimentally through "crowd-sourcing." Finally, we apply our model to biomedical terminologies and predict that they are missing the vast majority (>90%) of the synonymous relationships they intend to document. Overall, our results expose the dramatic incompleteness of current biomedical thesauri and suggest the need for "next-generation," high-coverage lexical terminologies.


Subject(s)
Computational Biology/methods , Data Mining , Models, Statistical , Vocabulary, Controlled , Humans , Models, Theoretical
13.
J Transl Med ; 12: 124, 2014 May 12.
Article in English | MEDLINE | ID: mdl-24886400

ABSTRACT

BACKGROUND: Methicillin-resistant Staphylococcus aureus (MRSA) has been a deadly pathogen in healthcare settings since the 1960s, but MRSA epidemiology has changed since 1990 with new genetically distinct strain types circulating among previously healthy people outside healthcare settings. Community-associated (CA) MRSA strains primarily cause skin and soft tissue infections, but may also cause life-threatening invasive infections. First seen in Australia and the U.S., it is a growing problem around the world. The U.S. has had the most widespread CA-MRSA epidemic, with strain type USA300 causing the great majority of infections. Individuals with either asymptomatic colonization or infection may transmit CA-MRSA to others, largely by skin-to-skin contact. Control measures have focused on hospital transmission. Limited public health education has focused on care for skin infections. METHODS: We developed a fine-grained agent-based model for Chicago to identify where to target interventions to reduce CA-MRSA transmission. An agent-based model allows us to represent heterogeneity in population behavior, locations and contact patterns that are highly relevant for CA-MRSA transmission and control. Drawing on nationally representative survey data, the model represents variation in sociodemographics, locations, behaviors, and physical contact patterns. Transmission probabilities are based on a comprehensive literature review. RESULTS: Over multiple 10-year runs with one-hour ticks, our model generates temporal and geographic trends in CA-MRSA incidence similar to those observed in Chicago from 2001 to 2010. On average, a majority of transmission events occurred in households, and colonized rather than infected agents were the source of the great majority (over 95%) of transmission events. The key finding is that infected people are not the primary source of spread. Rather, the far greater number of colonized individuals must be targeted to reduce transmission.
CONCLUSIONS: Our findings suggest that current paradigms in MRSA control in the United States cannot be very effective in reducing the incidence of CA-MRSA infections. Furthermore, the control measures that have focused on hospitals are unlikely to have much population-wide impact on CA-MRSA rates. New strategies need to be developed, as the incidence of CA-MRSA is likely to continue to grow around the world.


Subject(s)
Methicillin-Resistant Staphylococcus aureus/isolation & purification , Models, Theoretical , Staphylococcal Infections/transmission , Disease Outbreaks , Humans , Staphylococcal Infections/epidemiology , Staphylococcal Infections/microbiology
14.
Nat Hum Behav ; 8(4): 644-656, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38366103

ABSTRACT

Human languages vary widely in how they encode information within circumscribed semantic domains (for example, time, space, colour, human body parts and activities), but little is known about the global structure of semantic information and nothing about its relation to human communication. We first show that across a sample of ~1,000 languages, there is broad variation in how densely languages encode information into words. Second, we show that this language information density is associated with a denser configuration of semantic information. Finally, we trace the relationship between language information density and patterns of communication, showing that informationally denser languages tend towards faster communication but conceptually narrower conversations or expositions within which topics are discussed at greater depth. These results highlight an important source of variation across the human communicative channel, revealing that the structure of language shapes the nature and texture of human engagement, with consequences for human behaviour across levels of society.


Subject(s)
Communication , Language , Semantics , Humans
15.
Nat Hum Behav ; 7(10): 1682-1696, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37443269

ABSTRACT

Artificial intelligence (AI) models trained on published scientific findings have been used to invent valuable materials and targeted therapies, but they typically ignore the human scientists who continually alter the landscape of discovery. Here we show that incorporating the distribution of human expertise by training unsupervised models on simulated inferences that are cognitively accessible to experts dramatically improves (by up to 400%) AI prediction of future discoveries beyond models focused on research content alone, especially when relevant literature is sparse. These models succeed by predicting human predictions and the scientists who will make them. By tuning human-aware AI to avoid the crowd, we can generate scientifically promising 'alien' hypotheses unlikely to be imagined or pursued without intervention until the distant future, which hold promise to punctuate scientific advance beyond questions currently pursued. By accelerating human discovery or probing its blind spots, human-aware AI enables us to move towards and beyond the contemporary scientific frontier.

16.
J Biol Chem ; 286(27): 23659-66, 2011 Jul 08.
Article in English | MEDLINE | ID: mdl-21566119

ABSTRACT

Life scientists today cannot hope to read everything relevant to their research. Emerging text-mining tools can help by identifying topics and distilling statements from books and articles with increased accuracy. Researchers often organize these statements into ontologies, consistent systems of reality claims. Like scientific thinking and interchange, however, text-mined information (even when accurately captured) is complex, redundant, sometimes incoherent, and often contradictory: it is rooted in a mixture of only partially consistent ontologies. We review work that models scientific reason and suggest how computational reasoning across ontologies and the broader distribution of textual statements can assess the certainty of statements and the process by which statements become certain. With the emergence of digitized data regarding networks of scientific authorship, institutions, and resources, we explore the possibility of accounting for social dependences and cultural biases in reasoning models. Computational reasoning is starting to fill out ontologies and flag internal inconsistencies in several areas of bioscience. In the not too distant future, scientists may be able to use statements and rich models of the processes that produced them to identify underexplored areas, resurrect forgotten findings and ideas, deconvolute the spaghetti of underlying ontologies, and synthesize novel knowledge and hypotheses.


Subject(s)
Computational Biology/methods , Models, Biological , Animals , Computational Biology/trends , Humans
17.
PLoS Comput Biol ; 7(9): e1002191, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21980276

ABSTRACT

The use of structured knowledge representations-ontologies and terminologies-has become standard in biomedicine. Definitions of ontologies vary widely, as do the values and philosophies that underlie them. In seeking to make these views explicit, we conducted and summarized interviews with a dozen leading ontologists. Their views clustered into three broad perspectives that we summarize as mathematics, computer code, and Esperanto. Ontology as mathematics puts the ultimate premium on rigor and logic, symmetry and consistency of representation across scientific subfields, and the inclusion of only established, non-contradictory knowledge. Ontology as computer code focuses on utility and cultivates diversity, fitting ontologies to their purpose. Like computer languages C++, Prolog, and HTML, the code perspective holds that diverse applications warrant custom designed ontologies. Ontology as Esperanto focuses on facilitating cross-disciplinary communication, knowledge cross-referencing, and computation across datasets from diverse communities. We show how these views align with classical divides in science and suggest how a synthesis of their concerns could strengthen the next generation of biomedical ontologies.


Subject(s)
Computational Biology , Language , Mathematical Concepts , Programming Languages , Computer Simulation , Data Mining , Humans , Knowledge , Software , User-Computer Interface , Vocabulary, Controlled
18.
PLoS Comput Biol ; 7(1): e1001055, 2011 Jan 13.
Article in English | MEDLINE | ID: mdl-21249231

ABSTRACT

A scientific ontology is a formal representation of knowledge within a domain, typically including central concepts, their properties, and relations. With the rise of computers and high-throughput data collection, ontologies have become essential to data mining and sharing across communities in the biomedical sciences. Powerful approaches exist for testing the internal consistency of an ontology, but not for assessing the fidelity of its domain representation. We introduce a family of metrics that describe the breadth and depth with which an ontology represents its knowledge domain. We then test these metrics using (1) four of the most common medical ontologies with respect to a corpus of medical documents and (2) seven of the most popular English thesauri with respect to three corpora that sample language from medicine, news, and novels. Here we show that our approach captures the quality of ontological representation and guides efforts to narrow the breach between ontology and collective discourse within a domain. Our results also demonstrate key features of medical ontologies, English thesauri, and discourse from different domains. Medical ontologies have a small intersection, as do English thesauri. Moreover, dialects characteristic of distinct domains vary strikingly as many of the same words are used quite differently in medicine, news, and novels. As ontologies are intended to mirror the state of knowledge, our methods to tighten the fit between ontology and domain will increase their relevance for new areas of biomedical science and improve the accuracy and power of inferences computed across them.


Subject(s)
Information Storage and Retrieval
19.
PLoS Comput Biol ; 7(10): e1002132, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21998558

ABSTRACT

Computational models in biomedicine rely on biological and clinical assumptions. The selection of these assumptions contributes substantially to modeling success or failure. Assumptions used by experts at the cutting edge of research, however, are rarely explicitly described in scientific publications. One can directly collect and assess some of these assumptions through interviews and surveys. Here we investigate diversity in expert views about a complex biological phenomenon, the process of cancer metastasis. We harvested individual viewpoints from 28 experts in clinical and molecular aspects of cancer metastasis and summarized them computationally. While experts predominantly agreed on the definition of individual steps involved in metastasis, no two expert scenarios for metastasis were identical. We computed the probability that any two experts would disagree on k or fewer metastatic stages and found that any two randomly selected experts are likely to disagree about several assumptions. Considering the probability that two or more of these experts review an article or a proposal about metastatic cascades, the probability that they will disagree with elements of a proposed model approaches 1. This diversity of conceptions has clear consequences for advance and deadlock in the field. We suggest that strong, incompatible views are common in biomedicine but largely invisible to biomedical experts themselves. We built a formal Markov model of metastasis to encapsulate expert convergence and divergence regarding the entire sequence of metastatic stages. This model revealed stages of greatest disagreement, including the points at which cancer enters and leaves the bloodstream. The model provides a formal probabilistic hypothesis against which researchers can evaluate data on the process of metastasis. This would enable subsequent improvement of the model through Bayesian probabilistic update. Practically, we propose that model assumptions and hunches be harvested systematically and made available for modelers and scientists.
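
The two computations mentioned in this abstract can be sketched in a few lines, under simplifying assumptions that are not taken from the article: each expert scenario is treated as an ordered list of stage labels, a first-order Markov model is estimated by counting stage-to-stage transitions, and disagreement between two experts is counted as the number of positions at which their fixed-length answer vectors differ. The function names, stage labels, and data below are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def transition_probabilities(sequences):
    """Estimate first-order Markov transition probabilities by counting
    consecutive stage pairs across all expert sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for current, nexts in counts.items()
    }


def prob_disagree_at_most(answers, k):
    """Empirical probability that two distinct experts differ on k or
    fewer items, taken over all unordered pairs of experts."""
    pairs = list(combinations(answers, 2))
    if not pairs:
        raise ValueError("need at least two experts")
    hits = sum(
        1 for a, b in pairs
        if sum(x != y for x, y in zip(a, b)) <= k
    )
    return hits / len(pairs)


if __name__ == "__main__":
    # Hypothetical expert scenarios; labels are illustrative only.
    scenarios = [
        ["primary", "intravasation", "circulation"],
        ["primary", "invasion", "intravasation"],
        ["primary", "intravasation", "circulation"],
    ]
    print(transition_probabilities(scenarios))

    # Hypothetical yes/no answers of three experts to three assumptions.
    answers = [(1, 0, 1), (1, 1, 1), (0, 0, 1)]
    print(prob_disagree_at_most(answers, 1))
```

Transitions on which the estimated probabilities are spread across several successors are the ones where the hypothetical experts diverge most, which is the kind of signal the abstract describes at the points of entry to and exit from the bloodstream.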


Subject(s)
Models, Biological , Neoplasm Metastasis , Computational Biology , Disease Progression , Expert Testimony , Humans , Markov Chains
20.
PLoS One ; 17(12): e0273994, 2022.
Article in English | MEDLINE | ID: mdl-36508452

ABSTRACT

Peer review is an important part of science, aimed at providing expert and objective assessment of a manuscript. Because of many factors, including time constraints, unique expertise needs, and deference, many journals ask authors to suggest peer reviewers for their own manuscript. Previous studies have reported differing effects of this practice, and their findings may be inconclusive because of their sample sizes. In this article, we analyze the association between author-suggested reviewers and review invitation, review scores, acceptance rates, and subjective review quality using a large dataset of close to 8K manuscripts from 46K authors and 21K reviewers from the Neuroscience section of the journal PLOS ONE. We found that all-author-suggested review panels increase the chances of acceptance by 20 percentage points compared with all-editor-suggested panels, even though author-suggested reviewers agree to review less often. While PLOS ONE has since ended the practice of asking for suggested reviewers, many other journals still ask for them and perhaps should consider the results presented here.


Subject(s)
Neurosciences , Peer Review , Cross-Sectional Studies , Peer Review, Research