Practicing open science can benefit the career prospects of individual researchers or labs through higher quality work and increased chances of publication. However, being an outspoken advocate of open science might also indirectly benefit individual scientific careers, in the form of status in a scientific community, decisions for tenure, and eligibility for certain kinds of funding. Therefore, it may be profitable for individual labs to appear to engage in open science practices without actually putting in the associated effort, or while doing only the bare minimum. In this article, we explore two types of academic behavior through a dynamic computational model (cf. Smaldino & McElreath, 2016) of an academic community that rewards open science: (
Keywords: computational model, cultural evolution, metascience, open science, reform
Labs that practice open science (e.g., preregistration, registered reports, sharing data, materials, and code, and open access publishing) and advocate open science (e.g., through social media) thrive in a scientific community that values open science. At the same time, “quick-and-dirty” science is still prevalent, as evidenced by high false positive and false discovery rates. Based on the specific assumptions of our model, our results suggest that labs that practice and advocate open science dominate a scientific community that values open science. These results are encouraging to those who feel practicing open science “is not worth it”: in addition to benefits to science at large, our results suggest engaging with open science can benefit individual researchers if open science is sufficiently rewarded.
Initiatives to improve science often follow times of crisis. For example, the open science (OS) movement originated from a crisis in psychology, referred to as the replication crisis or the crisis of credibility, confidence, or reproducibility (Spellman et al., 2018; Pashler & Wagenmakers, 2012; Baker, 2016; Ioannidis, 2005; Open Science Collaboration, 2015; Simmons et al., 2011; Wagenmakers et al., 2012; Fiedler, 2011). Broadly speaking, the OS movement aims to make the scientific process more transparent, accessible, and reproducible. Practices associated with this movement (OS practices) include preregistration and the use of registered reports to reduce researcher degrees of freedom (Munafò et al., 2017; Chambers, 2013), protocol, data, and code sharing to improve reproducibility and replicability (National Academies of Sciences, Engineering, and Medicine, 2019), and the use of preprints and open access publishing to increase the dissemination and accessibility of research findings (McKiernan et al., 2016; Mikki, 2017). Incentives such as “OS badges” rewarding openly sharing data or material (Kidwell et al., 2016), and preregistration ( advertise and identify “trustworthy” research (Schneider et al., 2020). More recently, additional badges for open access publication, open code, open source, and open science grants have been proposed (Guzman-Ramirez et al., 2023).
There is widespread agreement that adopting OS practices has advantages for both science at large and the individual researcher (Allen & Mehler, 2019; McKiernan et al., 2016; Markowetz, 2015). Researchers are encouraged to use OS practices to advance their career by increasing their citation count, generating media attention, attracting potential collaborators, and getting job and funding opportunities (McKiernan et al., 2016). Moreover, policy decisions aim to recognize and reward the use of open science (see, e.g., However, incentivising OS practices might bring along secondary, unintended problems. As the traditional publish-or-perish culture may have inspired questionable research practices like
In this article, we explore the benefit of practicing and advocating OS in a scientific community that rewards OS. To this end, we extended a computational model by Smaldino & McElreath (2016), who demonstrated that the current incentive structure in science, which rewards numerous and highly cited publications, could lead to low quality studies. Their results imply that in order to be successful, labs should favor a “quick-and-dirty” approach to conducting studies even though that would lead to a high false positive rate and a high false discovery rate. We extended this computational model by including four different lab types that are a factorial combination of practicing OS (yes/no) and advocating OS (yes/no). In this work, we (
We highlight that this work is exploratory and meant to be a proof of principle. While we ground our operationalizations and the selection of parameter values in the existing literature and our personal experiences as OS researchers, we do not claim that our results fully capture the complexity and the individuality of OS labs. Rather they are a simplification of reality and aim to illustrate how the landscape of science might change under different conditions.
The original methods to build the evolutionary model are reported in Smaldino & McElreath (2016), and a detailed explanation is provided in Box 1. In short, the model starts with a population of
Variation in these four characteristics leads to variation in fitness of the labs, which determines which labs “die” (e.g., a principal investigator no longer has any students or funding and as a result decides to leave academia) and which labs “reproduce” (e.g., a prolific PhD-student from a successful lab starts a lab of their own) to create offspring labs. Survival of the labs depends on payoffs that they receive for publishing research projects. At each time step, each lab either initiates a new investigation or not. The new investigation can be either a replication study or not. Results of a new investigation can be negative (
Box 1: Evolution in Smaldino and McElreath (2016)
Evolution Characteristics
Evolution takes place over many time steps in the model (i.e., 100,000 in Figure 3 and 1,000,000 in Figures 4 and 5 of Smaldino & McElreath, 2016). At time step
Probability and Type of Investigation
The probability that a lab launches a new investigation ( If the lab initiates a new study at a given time step, it is a replication study with probability
Probability of Obtaining a Positive Result
If the new study is a novel study, the underlying hypothesis is true with probability
Probability of Publishing and Payoff
A positive novel finding will be published with probability If the new study is a replication study, a hypothesis is randomly chosen from the literature (i.e., from the collection of studies that were already conducted by other labs). If the underlying hypothesis of the original study is true, the lab observes a positive replication finding with probability A positive replication finding will be published with probability
Evolution Dynamics
At each time step, the mean of After every time step, an evolution step takes place in which one lab “dies” and one lab is “born”. To determine the dying lab, The offspring lab inherits the characteristics from the reproducing lab. However, the inherited characteristics are allowed to mutate. All characteristics ( Lastly, if labs publish novel studies, they are added to the literature. The size of the literature is limited to
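The death-and-birth step described in Box 1 can be illustrated with a minimal sketch. Several details are assumptions made purely for illustration, because the box text above does not state them: the tournament size `sample_size`, the rule that the oldest sampled lab dies, the rule that the sampled lab with the highest payoff reproduces, the Gaussian mutation with standard deviation `mutation_sd`, and the use of a single hypothetical trait `effort` in place of the full set of lab characteristics.

```python
import random
from dataclasses import dataclass


@dataclass
class Lab:
    effort: float        # hypothetical stand-in for the inherited lab characteristics
    payoff: float = 0.0  # accumulated payoff from publications
    age: int = 0         # number of time steps the lab has existed


def evolution_step(labs, sample_size=10, mutation_sd=1.0, rng=random):
    """Run one death-birth step and return a population of the same size."""
    # Death (assumed rule): the oldest lab in a random sample dies.
    candidates = rng.sample(range(len(labs)), sample_size)
    dying = max(candidates, key=lambda i: labs[i].age)
    survivors = [lab for i, lab in enumerate(labs) if i != dying]

    # Birth (assumed rule): the highest-payoff lab in a random sample reproduces.
    parents = rng.sample(survivors, min(sample_size, len(survivors)))
    parent = max(parents, key=lambda lab: lab.payoff)

    # The offspring inherits the parent's trait with a small mutation,
    # clamped at an assumed lower bound of 1.0.
    child_effort = max(1.0, parent.effort + rng.gauss(0.0, mutation_sd))
    child = Lab(effort=child_effort)

    for lab in survivors:
        lab.age += 1
    return survivors + [child]
```

Because exactly one lab is removed and one is added, the population size stays constant across steps, which matches the description of one lab dying and one being born per evolution step.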
We extended the model of Smaldino & McElreath (2016) by differentiating between labs that do or do not practice OS and between labs that do or do not advocate OS, yielding four types of labs (see Table 1).
Table 1. The four different types of labs
| | Practicing OS: yes | Practicing OS: no |
|---|---|---|
| Advocating OS: yes | Practice; advocate | Not practice; advocate |
| Advocating OS: no | Practice; not advocate | Not practice; not advocate |
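The factorial design behind Table 1 can be written down compactly. The following is a minimal sketch, not the authors' implementation; the names `LabType`, `LAB_TYPES`, and `initial_population` are hypothetical and chosen only for this illustration.

```python
from itertools import product
from typing import NamedTuple


class LabType(NamedTuple):
    practices_os: bool
    advocates_os: bool

    def label(self) -> str:
        practice = "Practice" if self.practices_os else "Not practice"
        advocate = "advocate" if self.advocates_os else "not advocate"
        return f"{practice}; {advocate}"


# All four combinations of practicing (yes/no) and advocating (yes/no).
LAB_TYPES = [LabType(p, a) for p, a in product([True, False], repeat=2)]


def initial_population(n_labs: int) -> list:
    """Return a population in which the four lab types are equally represented."""
    if n_labs % len(LAB_TYPES) != 0:
        raise ValueError("n_labs must be divisible by the number of lab types")
    per_type = n_labs // len(LAB_TYPES)
    return [lab_type for lab_type in LAB_TYPES for _ in range(per_type)]
```

Representing the lab type as two booleans keeps the practice and advocacy dimensions independent, so that effects attached to one dimension can be applied without reference to the other.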
The four lab types were initially represented in equal proportions (i.e., at time step
We made the following assumptions about the impact of practicing OS on the survival of labs:
Practicing OS leads to higher workload (
The practice of OS requires more work and time compared to closed science (Hostler, 2023). New skills and knowledge need to be acquired and the research process involves additional steps, such as preregistration, data and code cleaning, and additional administration (e.g., drafting openness agreements; Hostler, 2023). Indeed, practicing OS is associated with an increase in workload, work-related stress, and longer time to completion of a research project (Sarafoglou et al., 2022; Toth et al., 2021). Given that “increasing effort decreases the productivity of a lab, because it takes longer to perform rigorous research” (Smaldino & McElreath, 2016, p. 6), we reasoned that labs that practice OS should have a higher
Practicing OS increases the probability of publishing negative novel findings (
The proportion of published findings with statistically non-significant results is higher for registered reports (60.5%; Allen & Mehler, 2019) or preregistered studies (52%; Toth et al., 2021) compared to traditional research, with estimates ranging from
Practicing OS leads to papers that are rewarded more (
Citation advantages have been observed for several OS practices. In a systematic review, Langham-Putrow et al. (2021) identified
We found a
We made the following assumptions about the impact of advocating OS on the survival of labs:
Advocating OS leads to spending more time advocating (e.g., on Twitter) and less time doing research (
Advocating OS might lead to less available time for doing research because some proportion of the work time is spent on profiling oneself (e.g., posting on Twitter). Therefore, labs that advocate OS had a higher
Advocating OS leads to papers that are rewarded more (
We assume that publications from labs that advocate OS are rewarded more because they might be read and cited more often. For example, papers that are shared on Twitter (as done by many OS advocates) have a citation advantage over papers that are not shared (Ladeiras-Lopes et al., 2020; Luc et al., 2021). We did not find studies specifically focusing on the citation advantage for sharing OS papers on Twitter or other platforms, so we decided to use a heuristic of equating the payoff advantage for labs that advocate OS with the payoff advantage for labs that practice OS (see above; i.e.,
The parameter values of our model extension are summarized in Table 2. To make sure our results are robust and not contingent on specific choices for parameter values, we included two additional parameter values for each parameter and ran the factorial combination of each of these as sensitivity analyses (see Table 2). The parameter values of the sensitivity analyses are always
Table 2. Parameters for the four lab types.
Par. | Practice; advocate | Practice; not advocate | Not practice; advocate | Not practice; not advocate
Parameter values for main analyses are shown in bold font; parameter values for sensitivity analyses are shown in regular font.
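A factorial sensitivity analysis of this kind can be set up as a grid over parameter values. The sketch below is illustrative only. The three values for γ are taken from the figure captions in the appendix; assigning the same three values to δ is an assumption based on the stated heuristic of equating the two payoff advantages; the function name `factorial_grid` is hypothetical.

```python
from itertools import product

# Values of the payoff advantage for advocating OS, as listed in the figure captions.
GAMMA_VALUES = (1.242, 1.483, 1.725)
# Assumption for illustration: the payoff advantage for practicing OS uses the same values.
DELTA_VALUES = (1.242, 1.483, 1.725)


def factorial_grid(param_values):
    """Return every combination of the given parameter values as a list of dicts."""
    names = list(param_values)
    return [
        dict(zip(names, combination))
        for combination in product(*(param_values[name] for name in names))
    ]


grid = factorial_grid({"gamma": GAMMA_VALUES, "delta": DELTA_VALUES})
```

Each dictionary in `grid` describes one simulation condition; additional parameters can be included by adding further keys to the input mapping, at the cost of a multiplicatively larger grid.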
As a primary result, we collected data on which lab type(s) survive over time and which die out in a world where everyone “plays the game”. Specifically, we investigated in what proportions the lab types are present over time: Which lab type(s) is/are most successful within the academic community? As a secondary result, we collected similar data as Smaldino & McElreath (2016) about the mean
To reduce the computation time of the simulations, we used a maximum literature size of
We ran an additional set of simulations that incorporated a few more changes to the computational model. First, we reasoned that the payoff for negative novel studies (i.e.,
Second, the results of the previous simulations suggest that the characteristics of the scientific community have not yet reached a steady state after
Looking at the development of the lab characteristics in the
Power W, false discovery rate FDR, false positive rate α, and replication probability r averaged over labs and simulation runs over all 1,000,000 time steps of the first set of simulations. The two panels differentiate between simulation runs with two qualitatively different patterns of characteristics.
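To make the relation between the quantities in this caption concrete, the following sketch computes the false discovery rate among positive findings from power, false positive rate, and the base rate of true hypotheses, using the standard definition. It is a simplification: it ignores replications and publication probabilities, and the `base_rate` argument and the numerical values in the usage note are hypothetical, not values taken from the model.

```python
def false_discovery_rate(power, alpha, base_rate):
    """Share of positive findings that are false, given power, alpha, and base rate."""
    true_positives = power * base_rate            # true hypotheses correctly detected
    false_positives = alpha * (1.0 - base_rate)   # false hypotheses wrongly "detected"
    total_positives = true_positives + false_positives
    if total_positives == 0:
        raise ValueError("no positive findings: FDR is undefined")
    return false_positives / total_positives
```

For example, with a hypothetical power of 0.8, a false positive rate of 0.05, and a base rate of 0.1, `false_discovery_rate(0.8, 0.05, 0.1)` is about 0.36. If power drops and the false positive rate rises, as in a quick-and-dirty regime, most positive findings are false.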
Figure 1 shows the proportions of the four lab types over the first
Cumulative proportions of lab types during the first 3,000 out of 1,000,000 time steps of the first set of simulations. The two panels represent simulation runs with two qualitatively different patterns of characteristics.
The observed behavior indicates that practicing OS is more important than advocating OS, but that doing both is most advantageous. This advantage of practicing over advocating holds across the entire range of parameter values we investigated (see Figures 5, 6, and 7 in Appendix 8.1). The explanation for this is that there is an additional advantage for labs that practice OS, which is the higher probability of publishing a negative novel finding (i.e.,
Figure 2 shows the development of characteristics across lab types over the whole range of
Power W, false discovery rate FDR, false positive rate α, and replication probability r averaged over labs and simulation runs over all 1,000,000 time steps of the second set of simulations. The two panels differentiate between simulation runs with two qualitatively different patterns of characteristics.
The right panel displays entirely different characteristics. Here, both
We further explored the proportions of lab types and the characteristics of the scientific community with some slight parameter modifications: Changing
Figure 4 shows the characteristics of the scientific community. As in the first set of simulations (see Figure 2), the characteristics reflect those of the labs that practice and advocate OS through most of the time. As in the first set of simulations, we observed two qualitatively different patterns of characteristics of the labs. The explanation for this is the same as for the first set of simulations (see previous section).
Science is not just about the academic work: it is ultimately a joint enterprise of people who depend on their academic position for their livelihood. As such, doing well, or at least doing better than others, on whatever metric is used to evaluate one’s success becomes important to people. Smaldino & McElreath (2016) demonstrate that if all people do is “play the game”, the scientific work their field produces over time gradually degenerates to low-effort, quick-and-dirty work with a high proportion of false positives.
The OS movement should restrict the feasibility of some of the quick-and-dirty strategies to which researchers might, inadvertently, fall prey. For instance, preregistering one’s work makes it very difficult to employ
In an incentive structure that values OS practices, practicing OS while also advocating OS is most advantageous. Our simulation results suggest that labs that follow OS practices and engage as “OS advocates” on Twitter or related social media platforms have a survival advantage. The cost associated with both practicing and preaching, in terms of a slower “rate of completion” of research projects, is outweighed by the increase in payoff for publications. Within the simulation, only labs that both practiced and advocated OS persisted; all other types quickly vanished from the scientific landscape (i.e., they were less successful in terms of attracting attention and gathering citations).
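The trade-off between a slower rate of completion and a higher payoff per publication can be illustrated with back-of-the-envelope arithmetic. The sketch below is not the simulation model. It assumes, purely for illustration, that costs and payoff advantages combine multiplicatively, that practicing OS reduces productivity by a hypothetical 10%, and that advocating costs two hours of an assumed 40-hour week (5%). The payoff advantage of 1.483 is one of the values listed in the figure captions. The sketch ignores the higher probability of publishing negative findings for labs that practice OS, so it cannot reproduce the ranking between practice-only and advocate-only labs.

```python
def relative_payoff_rate(practices, advocates,
                         practice_cost=0.10,   # hypothetical productivity loss
                         advocate_cost=0.05,   # 2 of an assumed 40 weekly hours
                         delta=1.483,          # payoff advantage for practicing OS
                         gamma=1.483):         # payoff advantage for advocating OS
    """Payoff per unit time relative to a lab that neither practices nor advocates."""
    productivity = 1.0
    reward = 1.0
    if practices:
        productivity *= 1.0 - practice_cost
        reward *= delta
    if advocates:
        productivity *= 1.0 - advocate_cost
        reward *= gamma
    return productivity * reward


rates = {
    (practices, advocates): relative_payoff_rate(practices, advocates)
    for practices in (True, False)
    for advocates in (True, False)
}
```

Under these assumed numbers, every lab type that engages with OS earns more per unit time than the baseline, and doing both earns the most, because each payoff advantage more than compensates for its associated loss in productivity.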
In our model, advocating OS did not have the same advantages as practicing OS. This was true even in the condition where advocating was worth more than practicing in terms of publication payoff (
Practicing and incentivizing OS practices did not eliminate quick-and-dirty science in our simulation. While we observed slightly lower values for the false discovery rate and false positive rate compared to Smaldino & McElreath (2016) and the replication (Kohrt et al., 2022), the values were still considerably high (
Although practicing and advocating OS translated to career advantages in our model, some factors cast doubt on the extent to which these findings translate to the real world. Our operationalization of practicing OS involved (slightly) more work per project and a substantial increase in payoff per publication. Although our parameter settings were grounded to some extent in previous literature, the exact sizes are no more than rough estimates. Perhaps an increase in workload of, say,
For our operationalization of advocating OS, we assumed that scientists spend two hours of their working week on their social media of choice, in lieu of working, to build and maintain their OS profile. Perhaps two hours is unrealistic and ten hours gets closer to the truth. Or perhaps there is no difference at all in hours spent working between scientists who are active on social media and those who are not: time on social media could be spent entirely during free time.
Similarly, it can be argued that the payoff advantage for labs that practice and/or advocate OS is too high and that it is unrealistic for the payoff advantage to remain constant throughout evolution. It may be more realistic for the payoff advantage to diminish over time as an increasing share of conducted studies are OS studies.
An additional limitation is that we did not consider the potential consequences of disclosing specific research data. Opening up access to such data may facilitate the identification of inaccuracies and provide a basis for heightened scrutiny from the broader research community (Allen & Mehler, 2019). Theoretically, this increased transparency could adversely affect the sustainability of a research laboratory, particularly if (unintentional) errors are uncovered and become subject to public discourse.
Another, more general, limitation of our setup was the generic classification of scientists as OS practitioners versus non-practitioners and advocates versus non-advocates. In the real world, these categories (and the payoffs they entail) will rarely be so black-and-white. In addition, there will be individual differences in academic success that are tangential to the four categories specified here due to field of interest, background, social network, and even luck. In our simulation, these natural sources of variation were all completely equated. As such, the results of this study should be thought of more as a proof of concept in a drastically simplified representation of what in reality is a very complicated academic ecosphere.
Lastly, our modeling approach to investigate what lab types survive in a community that values open science is only one of many. Different approaches may shed light on the conditions under which labs with different strategies flourish. For instance, using a game theory approach would model the individual labs as rational agents with (potentially) different strategies (e.g., affinity to OS, payoff, etc.). In such an approach, the individual labs would play the “science game”, which would allow the computation of an equilibrium distribution of different types of labs.
In our simulation, labs that practice and advocate OS thrive in a scientific community that values OS. At the same time, “quick-and-dirty” science is still prevalent, as evidenced by high false positive and false discovery rates. These results are encouraging to those who feel practicing open science “is not worth it”: in addition to benefits to science at large, our results suggest engaging with OS benefits the individual researcher as well.
We are grateful to Joyce M. Hoek, Jasmine Muradchanian, and Ymkje Anna de Vries for interesting and inspiring discussions.
A transparency documentation of our research process, the code for the simulations, and the data of the simulations can be found online at
For all sensitivity analyses, we did not differentiate between simulation runs with two qualitatively different patterns of characteristics (see Results section). Instead, we averaged over all simulation runs. Each of the following Figures contains various parameter combinations. One additional parameter (i.e., the payoff advantage for advocating OS
Figures 5, 6, 7, 11, 12, and 13 clearly demonstrate that the lab proportions are robust against specific choices of parameter combinations. In all cases, the “practice; advocate” lab type wins and suppresses the other lab types. Although there is more variation in the community characteristics for different parameter combinations (see Figures 8, 9, 10, 14, 15, and 16), the overall trends are still quite robust.
Cumulative proportions of lab types during the first 3,000 out of 1,000,000 time steps of the first set of simulations with γ = 1.242 and different parameter combinations. Note that panels show an average of all 50 simulation runs. γ and δ are the payoff advantages for advocating and practicing OS, respectively.
Cumulative proportions of lab types during the first 3,000 out of 1,000,000 time steps of the first set of simulations with γ = 1.483 and different parameter combinations. Note that panels show an average of all 50 simulation runs. γ and δ are the payoff advantages for advocating and practicing OS, respectively.
Cumulative proportions of lab types during the first 3,000 out of 1,000,000 time steps of the first set of simulations with γ = 1.725 and different parameter combinations. Note that panels show an average of all 50 simulation runs. γ and δ are the payoff advantages for advocating and practicing OS, respectively.
Power W, false discovery rate FDR, false positive rate α, and replication probability r averaged over all labs over all 1,000,000 time steps of the first set of simulations with γ = 1.242 and different parameter combinations. Note that panels show an average of all 50 simulation runs. γ and δ are the payoff advantages for advocating and practicing OS, respectively.
Power W, false discovery rate FDR, false positive rate α, and replication probability r averaged over all labs over all 1,000,000 time steps of the first set of simulations with γ = 1.483 and different parameter combinations. Note that panels show an average of all 50 simulation runs. γ and δ are the payoff advantages for advocating and practicing OS, respectively.
Power W, false discovery rate FDR, false positive rate α, and replication probability r averaged over all labs over all 1,000,000 time steps of the first set of simulations with γ = 1.725 and different parameter combinations. Note that panels show an average of all 50 simulation runs. γ and δ are the payoff advantages for advocating and practicing OS, respectively.
Cumulative proportions of lab types during the first 3,000 out of 1,000,000 time steps of the second set of simulations with γ = 1.242 and different parameter combinations. Note that panels show an average of all 50 simulation runs. γ and δ are the payoff advantages for advocating and practicing OS, respectively.
Cumulative proportions of lab types during the first 3,000 out of 1,000,000 time steps of the second set of simulations with γ = 1.483 and different parameter combinations. Note that panels show an average of all 50 simulation runs. γ and δ are the payoff advantages for advocating and practicing OS, respectively.
Cumulative proportions of lab types during the first 3,000 out of 1,000,000 time steps of the second set of simulations with γ = 1.725 and different parameter combinations. Note that panels show an average of all 50 simulation runs. γ and δ are the payoff advantages for advocating and practicing OS, respectively.
Power W, false discovery rate FDR, false positive rate α, and replication probability r averaged over all labs over all 1,000,000 time steps of the second set of simulations with γ = 1.242 and different parameter combinations. Note that panels show an average of all 50 simulation runs. γ and δ are the payoff advantages for advocating and practicing OS, respectively.
Power W, false discovery rate FDR, false positive rate α, and replication probability r averaged over all labs over all 1,000,000 time steps of the second set of simulations with γ = 1.483 and different parameter combinations. Note that panels show an average of all 50 simulation runs. γ and δ are the payoff advantages for advocating and practicing OS, respectively.
Power W, false discovery rate FDR, false positive rate α, and replication probability r averaged over all labs over all 1,000,000 time steps of the second set of simulations with γ = 1.725 and different parameter combinations. Note that panels show an average of all 50 simulation runs. γ and δ are the payoff advantages for advocating and practicing OS, respectively.
For these additional sensitivity analyses, we did not differentiate between simulation runs with two qualitatively different patterns of characteristics (see Results section). Instead, we averaged over all simulation runs. Figure 17 shows the lab proportions for the first set of simulations as a function of
Cumulative proportions of lab types as a function of V0− during the first 3,000 out of 1,000,000 time steps of the first set of simulations. The panels correspond to different values of V0−. All other parameter values are fixed at the values that were used in the main analyses (see Table 2). Note that panels show an average of all 50 simulation runs.
Power W, false discovery rate FDR, false positive rate α, and replication probability r averaged over all labs as a function of V0− over all 1,000,000 time steps of the first set of simulations. The panels correspond to different values of V0−. All other parameter values are fixed at the values that were used in the main analyses (see Table 2). Note that panels show an average of all 50 simulation runs.
