Anatomy of open access publishing: a study of longitudinal development and internal structure
© Laakso and Björk; licensee BioMed Central Ltd. 2012
Received: 27 July 2012
Accepted: 28 September 2012
Published: 22 October 2012
Open access (OA) is a revolutionary way of providing access to the scholarly journal literature made possible by the Internet. The primary aim of this study was to measure the volume of scientific articles published in full immediate OA journals from 2000 to 2011, while observing longitudinal internal shifts in the structure of OA publishing concerning revenue models, publisher types and relative distribution among scientific disciplines. The secondary aim was to measure the share of OA articles of all journal articles, including articles made OA by publishers with a delay and individual author-paid OA articles in subscription journals (hybrid OA), as these subsets of OA publishing have mostly been ignored in previous studies.
Stratified random sampling of journals in the Directory of Open Access Journals (n = 787) was performed. The annual publication volumes spanning 2000 to 2011 were retrieved from major publication indexes and through manual data collection.
An estimated 340,000 articles were published by 6,713 full immediate OA journals during 2011. OA journals requiring article-processing charges have become increasingly common, publishing 166,700 articles in 2011 (49% of all OA articles). This growth is related to the growth of commercial publishers, who, despite only a marginal presence a decade ago, have grown to become key actors on the OA scene, responsible for 120,000 of the articles published in 2011. Publication volume has grown within all major scientific disciplines; however, biomedicine has seen a particularly rapid 16-fold growth between 2000 (7,400 articles) and 2011 (120,900 articles). Over the past decade, OA journal publishing has steadily increased its relative share of all scholarly journal articles by about one percentage point annually. Approximately 17% of the 1.66 million articles published during 2011 and indexed in the most comprehensive article-level index of scholarly articles (Scopus) are available OA through journal publishers, most articles immediately (12%) but some within 12 months of publication (5%).
OA journal publishing is disrupting the dominant subscription-based model of scientific publishing, having rapidly grown in relative annual share of published journal articles during the last decade.
Keywords: Open access, scientific publishing
Open access (OA) has expanded the possibilities for disseminating one's own research and accessing that of others [1, 2]. OA, in the context of scholarly publishing, is a term widely used to refer to unrestricted online access to articles published in scholarly journals. There are two distinct ways for scholarly articles to become available OA, either directly provided by the journal publisher (gold OA), or indirectly by being uploaded and made freely available somewhere else on the Web (green OA). Both options increase the potential readership of any article to over a billion individuals with Internet access and indirectly speed up the spread of new research ideas. While the majority of OA journals do not charge authors anything for the services provided, a growing minority of professionally operating journals charge authors fees ranging from 20 to 3800 USD, with an estimated average of 900 USD.
OA is closely related to developments in other media content delivery businesses, and its ethos is well aligned with the fundamental openness principle of science itself as well as the ideologies behind Wikipedia and open source software. However, what makes scientific publishing distinct is the influence journal prestige and rankings have on journal selection for authors submitting article manuscripts. There are also vested interests in preserving the status quo of the current subscription market among stakeholders, with dominant publishers seeing OA as a potential threat to the bottom line. Friction caused by these and other factors can be argued to slow down the process of OA adoption because journals are not direct substitutes for each other and subscription-based journal copyright agreements can prohibit parallel distribution of published content. However, following in the footsteps of the National Institutes of Health in the US, public research funders in the UK have recently launched strategies to increase OA to publicly funded research. While the ultimate goal of increasing access to publicly funded research is known and widely accepted, it is difficult to reach compromises that balance the long- and short-term interests of the stakeholders involved.
Important changes in policy facilitating growth of OA happen on many levels, influencing research publishing both upstream and downstream. The examples from the public funders in the US and UK are merely the most ambitious movements so far: public and private research funders large and small, universities, publishers and research institutes all contribute to forming the evolving OA landscape. The problem that has persisted with OA since the start is the lack of readily available data on how this particular subset of journal publishing is developing over time, an aspect which is described in closer detail in the Methods section. Policymakers should have an interest in knowing how common OA is today, how fast the share of OA has increased and what proportion of journal articles are currently OA. The purpose of this study is to provide answers to these types of questions.
Aim of the study
This study focuses on measuring the longitudinal development of gold OA publication volume for the years 2000 to 2011, as a whole and by subtype: full immediate journal OA, delayed OA and hybrid OA. As will be described in more detail further on, earlier studies have mostly ignored the subset of delayed OA journals. This is partly because there is no comprehensive index of such journals similar to the service the Directory of Open Access Journals (DOAJ) provides for immediate OA journals, and partly because opinion is divided on whether delayed OA is a valid form of OA. However, the subset of delayed OA journals is both substantial in volume and populated with many high-quality journals; five of the 10 most-cited journals within Thomson Reuters Web of Knowledge in the period from 1999 to 2009 are currently delayed OA while the others are subscription-access only. Hybrid OA is the term commonly used for describing individual articles being provided openly within subscription-only journals through an optional author payment; it is only recently that this type of OA has been properly studied.
The chosen research aim is related to some existing areas of OA research that warrant mention to clarify the specific contribution of this study. Green OA is not part of the scope of this study as that is a wholly different research problem and one that requires its own set of methods, as different versions of articles are scattered around on the Web. Furthermore, this study does not extensively discuss or evaluate the pros or cons of OA, since there is already a well-developed body of literature focusing on issues such as relationships between OA and readership, citation or impact [9–12]. In summary, the aim is to provide comprehensive and up-to-date quantitative measurement of gold OA journals and articles. The results and data of this study can then potentially act as a foundation for more targeted research enquiries.
Researchers have applied different methods to cope with the lack of readily available quantitative data to study the OA phenomenon, ranging from labor-intensive manual article-counting [13–15] to automated Web-crawling [16, 17]. What is known about the early years of OA, both gold and green, is mostly through a series of independent studies providing snapshots for individual years based on sampling various publication indexes. Because these studies measured OA prevalence within different publication indexes and adopted diverse sampling methods, comparing them or composing them into a longitudinal picture is inexact. Nevertheless, these are the best figures currently available. The earliest comprehensive study suggests the 2003 share for gold OA to have been 2.9% for articles included in the Thomson Reuters Web of Knowledge. The next study was performed for the 2006 publication volume based on data from UlrichsWeb and the DOAJ, where a gold OA share of 8.1% and a green OA share of 11.3% resulted in a combined OA share of 19.4%. For 2008 articles, the Thomson Reuters Web of Knowledge gold OA share was measured to be 6.6% and green OA 14%, resulting in a figure of total OA of 20.6%. Also for 2008, a large-scale study based on English-language journals listed in the DOAJ calculated that 120,000 articles were published OA either through full immediate OA journals or as individual hybrid OA articles. The first comprehensive longitudinal study on the volume of articles published by full immediate OA journals in the DOAJ resulted in an average annual year-on-year growth rate of 30% from 2000 to 2009, with some 191,000 articles published during 2009. Another longitudinal study, including both gold and green OA, produced a total OA share of 23.1% for Thomson Reuters Web of Knowledge indexed articles published during 2010. Outside of this 2010 study of Thomson Reuters Web of Knowledge, there are no comprehensive measurements for OA volume since 2009.
This study is designed as a longitudinal study implementing a well-documented and easily replicable methodology, producing results that are applicable to multiple publication indexes and easy to follow up and compare with future measurements.
Through a previous study using identical sampling and data collection methodology, data for 565 journals spanning publication volumes for 2000 to 2009 could be re-used, with only the need to gather publication volumes for two additional years. Since the existing data material lacked coverage for journals added to the DOAJ during 2010 and 2011, an additional randomly selected sample was drawn from the journals added within the two missing years, adhering to the same sampling probability as the pre-existing sample (0.1011). This added 222 new journals to the existing sample of 565 journals.
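The sampling step can be illustrated with a minimal sketch. It assumes a plain inverse-probability expansion, in which each sampled journal stands in for 1/0.1011 journals; the stratum names, journal identifiers and article counts are hypothetical, and the estimator actually used in the study may differ.

```python
import random

SAMPLING_PROBABILITY = 0.1011  # sampling probability reported in the text


def draw_sample(journals_by_stratum, p=SAMPLING_PROBABILITY, seed=0):
    """Draw a stratified random sample: the same fraction p from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for stratum, journals in journals_by_stratum.items():
        k = round(len(journals) * p)  # number of journals to draw in this stratum
        sample[stratum] = rng.sample(journals, k)
    return sample


def estimate_total(sampled_article_counts, p=SAMPLING_PROBABILITY):
    """Scale summed sample counts up to a population estimate (assumed estimator)."""
    return sum(sampled_article_counts) / p


# Hypothetical population: journal IDs grouped into two made-up strata.
population = {
    "small": [f"S{i}" for i in range(1000)],
    "large": [f"L{i}" for i in range(100)],
}
sample = draw_sample(population)
print({stratum: len(ids) for stratum, ids in sample.items()})

# The two sample waves reported in the text.
total_sampled_journals = 565 + 222
print(total_sampled_journals)
```

Drawing the new wave with the same probability as the old one keeps every sampled journal at the same weight, which is what allows the two waves to be pooled into one sample.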
Where journal publication volumes could be retrieved from either SCImago or Thomson Reuters Web of Knowledge, such data was used. For the majority of journals, the individual journal websites were visited and the annual entries collected manually. It is worthwhile to note that journals often include editorials, news, book reviews, obituaries and other non-research content. Such material was excluded from all measurements in this study. To provide an accurate representation of retrospective OA volume, articles published by a journal before it converted from subscription-only to OA were not counted. Determining when a journal initiated OA publishing often requires manual investigation, as the information is not always made explicit on the webpages and the corresponding journal metadata available in the DOAJ is often incorrect. To support the analysis of the sampled journals, data from Scopus and Thomson Reuters Web of Knowledge was used in addition to the data already available through the DOAJ.
Table 1. Estimated annual article and journal counts in full immediate open access journals. Categories: online-only OA journals (APC); online-only OA journals (no APC); subscription-based print journals with OA content online; all OA journals.
Overall there has been growth in the annual output among all three categories since the year 2000, going from a total volume of 20,700 articles in 2000 to 340,000 in 2011. Not depicted in Figure 2 but provided in Table 1 is the number of active OA journals for each respective year (journals with at least one article published during the respective year), which has increased from 744 journals in the year 2000 to 6,713 in 2011. The average number of articles per journal has also seen a constant increase, with an average of 26 articles per journal in 2000, 33 in 2005, and 51 for 2011. However, a reminder about the skewed nature of article distribution among journals is relevant here. There is a handful of journals publishing more than 1,000 articles per year and thousands of journals publishing only a few articles annually.
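The growth figures in this paragraph can be cross-checked with a short calculation. The sketch below uses only the article and journal counts quoted in the text; the function name is illustrative, and a constant compound growth rate is a simplifying assumption rather than a description of how the volume actually developed year by year.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Constant yearly growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years) - 1


articles_2000, articles_2011 = 20_700, 340_000  # total OA articles, from the text
journals_2011 = 6_713                           # active OA journals in 2011

growth_factor = articles_2011 / articles_2000
cagr = compound_annual_growth_rate(articles_2000, articles_2011, years=2011 - 2000)
articles_per_journal_2011 = articles_2011 / journals_2011

print(f"growth factor 2000-2011: {growth_factor:.1f}x")
print(f"implied average annual growth: {cagr:.1%}")
print(f"articles per active journal, 2011: {articles_per_journal_2011:.0f}")
```

Under that assumption the implied rate comes out a little under 30% per year, and the 2011 ratio of articles to active journals rounds to the 51 reported above.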
Inspecting the internal structure of the total article mass reveals some major shifts that have happened over the course of a decade. Journals that also publish a parallel print version, which are often old, established journals that decided to make the online version free when they started putting their content on the Web, provided the majority of the OA content up until 2008, when, for the first time, online-only journals took the lead in terms of output volume. Since 2008, the online-only journals have sustained much stronger growth, while the OA output of journals with a print version has plateaued at annual volumes between 100,000 and 110,000 articles. The latter group includes many society journals registered with dedicated portals like SciELO, Redalyc and J-Stage, which provide the technical platform for electronic publishing. Journals with article-processing charges have seen breakout growth during the last three years, going from 80,700 articles in 2009 to 166,700 articles in 2011.
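The weight of journals that charge authors within the total can be derived from the reported counts. The following sketch uses only numbers quoted in the text; the variable names are illustrative.

```python
apc_articles = {2009: 80_700, 2011: 166_700}  # articles in APC-charging OA journals
all_oa_articles_2011 = 340_000                # all full immediate OA articles in 2011

apc_share_2011 = apc_articles[2011] / all_oa_articles_2011
apc_growth_factor = apc_articles[2011] / apc_articles[2009]

print(f"APC share of OA articles, 2011: {apc_share_2011:.0%}")
print(f"APC article growth 2009 to 2011: {apc_growth_factor:.2f}x")
```

The share matches the 49% stated in the abstract, and the growth factor shows the APC-funded volume roughly doubling between 2009 and 2011.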
Table: Proportion of publisher-provided (gold) open access in major indexes. Reported separately for articles indexed in Scopus and in Web of Knowledge: articles in full immediate OA journals; share of articles published in full immediate OA journals; total share OA.
Of the 1.66 million articles indexed by Scopus in 2011, 11% were published in full immediate OA journals, 0.7% as hybrid OA and 5.2% in journals that have a maximum OA delay of 12 months. Together, these account for almost 17% of the total article volume in the whole index. The figures for articles indexed by Thomson Reuters Web of Knowledge are comparable to those of Scopus, with a total publisher-provided OA rate of 16.2% for 2011. Of the 1.29 million articles indexed by Thomson Reuters Web of Knowledge, 7.9% are available in full immediate OA journals, 0.7% as hybrid OA and 6.4% in journals that have a maximum OA delay of 12 months. Overall the results suggest that there has been an increase of about one percentage point annually in relative OA volume in both Scopus and Thomson Reuters Web of Knowledge during 2008 to 2011.
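As a worked example, the Scopus shares can be summed and converted into approximate article counts. The sketch below relies only on the Scopus percentages and the index size quoted in the paragraph; the resulting counts are rough because the published percentages are themselves rounded.

```python
scopus_articles_2011 = 1_660_000  # articles indexed by Scopus in 2011

# Publisher-provided OA shares in percent, as reported for Scopus
shares_pct = {
    "full immediate OA": 11.0,
    "hybrid OA": 0.7,
    "delayed OA (max 12 months)": 5.2,
}

total_share_pct = sum(shares_pct.values())
approx_counts = {
    kind: round(scopus_articles_2011 * pct / 100) for kind, pct in shares_pct.items()
}

print(f"total publisher-provided OA share: {total_share_pct:.1f}%")
for kind, count in approx_counts.items():
    print(f"{kind}: about {count:,} articles")
```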
Over the course of the last decade, OA journal publishing has grown universally across diverse types of journal publishers, geographical regions and scientific disciplines. This has resulted in a continuously growing proportion of journal articles being published OA for each year that has passed, with the most recent measurement from this study being 17% when delayed OA articles with a maximum embargo of 12 months are included. However, despite all the studied dimensions showing increases in annual article output over the decade, the results of the study show that growth has not been uniform across the board. OA publishing seems to be in a very dynamic growth phase, with major shifts in the internal composition happening in a relatively short span of time.
A major strength of the study is associated with the labor-intensive manual approach to data collection, where the annual article volumes for each journal included in the sample were registered for the years 2000 to 2011. This approach reduces the risk of using incorrect, skewed or incomplete source data. The methodological transparency should also enable others to produce comparable numbers to follow up and compare with the measurements provided here. What can be held as a weakness is the reliance on sampling rather than complete population coverage; however, complete coverage is not feasible with the indexing tools currently available, and manually collecting the data for over 7,000 journals would be a very labor-intensive task.
In comparison with existing studies, this is not only the first study to provide comprehensive gold OA measurement for 2010 and 2011, but the results for the earlier years studied are also more accurate and representative of the actual volumes published at the time. The previous directly comparable study suggested that 191,000 articles were published by full immediate OA journals during 2009, whereas this study suggests the volume for the same year to actually be 225,600. The discrepancy in retrospective annual volumes between these two studies, or any other earlier study using data from the DOAJ, is influenced by the time lag between the time journals actually start publishing OA and the time they are registered in the DOAJ. In part, this is because journals have to submit a request to the DOAJ to be added, meaning that journals are rarely registered from the first issue they publish, if at all. Another issue is the time the DOAJ takes to process new addition requests; as of September 2012 the backlog of journals in the queue for evaluation is described as being 'huge' on the DOAJ contact page. Exploring this issue more closely through the sampled journals, it appears that over half of the sampled journals added to the DOAJ during 2010 and 2011 had already been publishing OA prior to 2010, with a handful of cases publishing OA for over a decade prior to DOAJ registration. As was noted in the introduction, most other earlier studies have been limited by only looking at specific OA subsets for specific years, and are thus not directly comparable. However, despite this inability to compare our estimates directly with earlier studies because of methodological incompatibilities, all the results nevertheless speak for the notion of strong longitudinal growth for OA, particularly so for the biomedical research field.
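The size of the revision for 2009 can be quantified directly from the two estimates quoted above. This is a simple arithmetic sketch with illustrative variable names.

```python
earlier_estimate_2009 = 191_000  # earlier DOAJ-based study
revised_estimate_2009 = 225_600  # this study, same publication year

absolute_gap = revised_estimate_2009 - earlier_estimate_2009
relative_gap = absolute_gap / earlier_estimate_2009

print(f"additional articles found for 2009: {absolute_gap:,}")
print(f"relative increase over the earlier estimate: {relative_gap:.1%}")
```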
The results, in particular the finding that approximately 17% of scholarly journal articles are already made openly available on the Web within a year by the publishers, should be an important input for the policy discussions on OA in venues like the US Congress, the European Commission and the UK Finch Committee, which recently published its report with OA guidelines for British research funders. This study also sheds new light on the relative contributions of the two complementary routes for achieving OA, the publisher-provided gold route and the author-provided green route, indicating that the contribution of gold (both immediate and articles withheld for short embargo periods) is much larger than many earlier estimates. The results should also be considered together with two other recent studies [3, 9]. These studies suggest that the level of article-processing charges paid is on average around 900 USD, which is lower than generally believed, and that the scientific impact of OA journals founded in the last decade, and in particular in biomedicine, is on par with similar subscription journals, as measured by average number of citations.
It no longer seems to be a question whether OA is a viable alternative to the traditional subscription model for scholarly journal publishing; the question is rather when OA publishing will become the mainstream model. What remains to be seen is whether the growth will continue at a rate similar to that measured during the last few years, or if it will accelerate to an even steeper part of the S-shaped adoption pattern typical of many innovations. As in many other markets where the Internet has thoroughly rewritten the rules of the game, an interesting question is if new entrants, like Public Library of Science and BioMed Central, will take over the market or if the old established actors, commercial and society publishers with subscription-based revenue models, will be able to adapt their business models and regain the ground they have so far lost. Future studies on the internal structure of OA publishing are likely to witness the anatomy transforming yet again. Most of the major internal shifts in OA journal publishing have happened only during the last few years and, judging by the momentum at which things are moving, it is hard to imagine the internal dynamics settling down any time soon.
ML is a doctoral student in Information Systems Science at the Hanken School of Economics, Helsinki, Finland. B-CB is professor of Information Systems Science at the Hanken School of Economics, Helsinki, Finland.
1. Suber P: Open Access. 2012, Cambridge: MIT Press.
2. Willinsky J: The Access Principle - the Case for Open Access to Research and Scholarship. 2005, Cambridge: MIT Press.
3. Solomon DJ, Björk B-C: A study of open access journals using article processing charges. J Am Soc Inf Sci Technol. 2012, 63: 1485-1495. 10.1002/asi.22673.
4. Knight LV, Steinbach TA: Selecting an appropriate publication outlet: a comprehensive model of journal selection criteria for researchers in a broad range of academic disciplines. International Journal of Doctoral Studies. 2008, 3: 59-79.
5. RCUK Announces New Open Access Policy, press release. 2012, [http://www.rcuk.ac.uk/media/news/2012news/Pages/120716.asp]
6. Finch J: Accessibility, sustainability, excellence: how to expand access to research publications. Report of the Working Group on Expanding Access to Published Research Findings. Research Information Network. 2012, [http://www.researchinfonet.org/publish/finch/]
7. Sciencewatch - Top Ten Most-Cited Journals (All Fields), 1999-2009. [http://sciencewatch.com/dr/sci/09/aug2-09_2/]
8. Björk B-C: The hybrid model for open access publication of scholarly articles: a failed experiment? J Am Soc Inf Sci Technol. 2012, 63: 1496-1504. 10.1002/asi.22709.
9. Björk B-C, Solomon DJ: Open access versus subscription journals - a comparison of scientific impact. BMC Med. 2012, 10: 73. 10.1186/1741-7015-10-73.
10. Davis PM, Walters WH: The impact of free access to the scientific literature: a review of recent research. J Med Libr Assoc. 2011, 99: 208-217. 10.3163/1536-5050.99.3.008.
11. Wagner A: Open access citation advantage: an annotated bibliography. Issues in Science and Technology Librarianship. 2010, 60. 10.5062/F4Q81B0W. [http://www.istl.org/10-winter/article2.html]
12. Craig ID, Plume AM, McVeigh ME, Pringle J, Amin M: Do open access articles have greater citation impact? A critical review of the literature. Journal of Informetrics. 2007, 1: 239-248. 10.1016/j.joi.2007.04.001.
13. Laakso M, Welling P, Bukvova H, Nyman L, Björk B-C, Hedlund T: The development of open access journal publishing from 1993 to 2009. PLoS One. 2011, 6: e20961. 10.1371/journal.pone.0020961.
14. Björk B-C, Roos A, Lauri M: Scientific journal publishing: yearly volume and open access availability. Information Research. 2009, 14: e391.
15. Crawford W: Free electronic refereed journals: getting past the arc of enthusiasm. Learned Publishing. 2002, 15: 117-123. 10.1087/09531510252848881.
16. Gargouri Y, Larivière V, Gingras Y, Carr L, Harnad S: Green and gold open access percentages and growth, by discipline. [http://arxiv.org/abs/1206.3664]
17. Matsubayashi M, Kurata K, Sakai Y, Morioka T, Kato S, Mine S, Ueda S: Status of open access in the biomedical field in 2005. J Med Libr Assoc. 2009, 97: 4-11. 10.3163/1536-5050.97.1.002.
18. McVeigh M: Open Access Journals in the ISI Citation Databases: Analysis of Impact Factors and Citation Patterns. 2004, [http://science.thomsonreuters.com/m/pdfs/openaccesscitations2.pdf]
19. UlrichsWeb Serials Solutions. [http://ulrichsweb.serialssolutions.com/]
20. DOAJ - Directory of Open Access Journals. [http://www.doaj.org]
21. Björk B-C, Welling P, Laakso M, Majlender P, Hedlund T, Guðnason G: Open access to the scientific journal literature: situation 2009. PLoS One. 2010, 5: e11273. 10.1371/journal.pone.0011273.
22. Dallmeier-Tiessen S, Goerner B, Darby R, Hyppoelae J, Igo-Kemenes P, Kahn D, Lambert S, Lengenfelder A, Leonard C, Mele S, Polydoratou P, Ross D, Ruiz-Perez S, Schimmer R, Swaisland M, van der Stelt: Open access publishing - models and attributes. Max Planck Digital Library/Informationsversorgung. 2010, [http://edoc.mpg.de/478647]
23. SCImago - SCImago Journal & Country Rank. [http://www.scimagojr.com]
24. Thomson Reuters Web of Knowledge. [http://apps.webofknowledge.com]
25. Scopus. [http://www.scopus.com]
26. SciELO. [http://www.scielo.org/]
27. Redalyc. [http://redalyc.uaemex.mx/]
28. J-Stage. [http://www.jstage.jst.go.jp/]
29. Rogers E: Diffusion of Innovations. 1995, New York: Free Press.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1741-7015/10/124/prepub
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.