To mark Open Access Week we have a guest post by Dr Martin Poulter, Wikimedian in Residence at the University of Oxford, about his work with open access journal Vestiges to expand Wikipedia coverage of topics.
Vestiges: Traces of Record is an open access journal, started in 2015, with editors from two Cameroonian universities as well as the University of Oxford. It is among the outcomes of a small project funded by the Arts & Humanities Research Council. Wikipedia has become the dominant reference site on the web and one of the main sources of incoming links to peer-reviewed literature. However, Wikipedia’s coverage of Vestiges’ topic area, archives in sub-Saharan Africa, is notoriously sparse. For academics to fix this by writing introductory articles would involve an unfeasible amount of extra work. We have found a way to improve coverage with minimal extra work by repurposing journal text and figures in Wikipedia articles.
The idea is that the paper and Wikipedia article exist in parallel: the paper is a fixed part of the citable scholarly record, while the article evolves to take on new knowledge. The paper is a self-contained item while the article is part of a web of knowledge, interlinked with millions of other articles. An academic journal has a narrow audience while Wikipedia is among the top search engine hits for almost any topic.
The first journal to have a formal journal-to-wiki publishing process was PLOS Computational Biology, although there have been prior efforts to reuse text from journals in Wikipedia. More than a hundred articles about ants use text extracts from peer-reviewed papers. A similar process has been used by UNESCO to share text about biosphere nature reserves.
Not all peer-reviewed papers are suitable for inclusion: they need to be thematically compatible in terms of giving an overview of a specific topic, as well as legally compatible in terms of copyright. The journal needs to be open access in the fullest sense, with a licence that allows reuse and adaptation for any purpose. Wikipedia’s licence is Attribution-ShareAlike (CC-BY-SA), so the source text needs to have CC-BY-SA or the more liberal Attribution only (CC-BY). Any content with Non-Commercial or No Derivatives clauses is ineligible for Wikipedia or its sister projects. This applies to media content (photographs, diagrams, video clips) as well as text.
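The licence rule above can be sketched as a small check. This is an illustrative sketch only: the function name and the way licence codes are parsed are assumptions made for this example, and it deliberately recognises just the two licences named above (CC-BY and CC-BY-SA), returning False for anything else.

```python
import re

# The only licence elements allowed by the rule described above.
ALLOWED_ELEMENTS = {"BY", "SA"}


def is_reusable_on_wikipedia(licence: str) -> bool:
    """Return True for CC-BY or CC-BY-SA, in spellings like "CC BY-SA 4.0".

    Hypothetical helper for illustration. Anything it does not recognise,
    including licences outside the two named in the text, yields False.
    """
    parts = re.split(r"[\s\-]+", licence.strip().upper())
    if len(parts) < 2 or parts[0] != "CC":
        return False
    # Ignore version numbers such as "4.0"; keep only element codes.
    elements = {p for p in parts[1:] if p.isalpha()}
    # Attribution is required, and NC or ND (or anything else) disqualifies.
    return "BY" in elements and elements <= ALLOWED_ELEMENTS


for code in ["CC-BY", "CC BY-SA 4.0", "CC-BY-NC", "CC BY-ND 4.0"]:
    print(code, is_reusable_on_wikipedia(code))
```

The same check applies to each image or diagram in the paper, since media files carry their own licence.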
An original research paper and a general-audience encyclopedia article are not exactly the same thing, and some adaptation and editing are necessary. Still, this is much faster than writing the article from scratch. From a 2,700-word paper about the Cameroon Press Photo Archive we were able to rapidly make a 1,400-word article.
English Wikipedia has a voluminous and detailed Manual of Style which specifies where external links can go, where bold text can be used, and so on. The only important style point a new contributor needs is that the text should be encyclopaedic. A Wikipedia article can only describe: it cannot interpret, make recommendations or draw a conclusion. The Nsah paper contained factual background but also criticism of the institutions it mentions and recommendations for the future of the archive, so the process of converting it for Wikipedia meant cutting out evaluative material by section, by sentence or even by clause.
The Wikipedia reader needs to be told that the article they are reading is a derivative of a published paper, so two kinds of attribution are necessary. Inline citations (the familiar numbered references and footnotes) credit the factual content. These need to be done at least for every paragraph and possibly for every sentence. Wikipedia articles can be challenged sentence by sentence and it needs to be clear what each factual claim is based on. A second form of attribution is needed to credit the use of text. This is required by the journal’s copyright licence. There is a dedicated template to attribute text from journal papers: this goes in the References section of the article.
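As a rough aid for the per-paragraph citation rule, a draft can be scanned for paragraphs that contain no reference tag. This is a minimal sketch, not a Wikipedia tool: the function name is made up for this example, and it assumes the draft is plain wikitext in which paragraphs are separated by blank lines and inline citations use `<ref>` tags.

```python
import re


def paragraphs_missing_citations(wikitext: str) -> list[str]:
    """Return the prose paragraphs that contain no <ref> tag.

    Hypothetical helper. Headings, templates and category tags are skipped
    because they are not prose that needs a citation.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", wikitext) if p.strip()]
    skip_prefixes = ("=", "{{", "[[Category:")
    return [
        p for p in paragraphs
        if not p.startswith(skip_prefixes) and "<ref" not in p
    ]


draft = (
    "== History ==\n\n"
    "The archive is based in Buea.<ref>Source paper</ref>\n\n"
    "It holds a large number of negatives."
)
for paragraph in paragraphs_missing_citations(draft):
    print("Needs a citation:", paragraph)
```

A check like this only finds paragraphs with no citation at all; whether each sentence is actually supported by the cited source still needs a human reader.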
Wikipedia is a web, not just a collection of articles, so once the article is created, it needs incoming and outgoing links. As well as links from other relevant articles (for example from the article about Buea, the town where the CPPA is based) there are category links, redirects, wikiproject links and links from Wikipedia’s sister sites.
Categories are visible as links at the foot of the page in Wikipedia’s desktop view. Adding a category tag, such as “Cameroonian culture”, makes the article findable from similar articles. A way to identify relevant categories is to look to long-established articles on a similar topic. In a couple of cases, we created new categories. For example, “Archives by country” did not have a category for Cameroon, so we created one.
The left sidebar of any Wikipedia article has a “What links here” button. This shows the incoming links to the new article from other parts of Wikipedia. It can also be used on an analogous article to get suggestions of where to link from. Where something is known by multiple names, such as by a full name and an acronym, or an official name and an informal name, it should be findable under each. Along with the main article for Cameroon Press Photo Archive, we created a redirect from CPPA.
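The category tag and the redirect described above are each a single line of wikitext. The sketch below generates both; the two helper names are invented for this example, while the output follows standard wikitext syntax for category links and redirect pages.

```python
def category_tag(name: str) -> str:
    """Wikitext line that places an article in the named category."""
    return f"[[Category:{name}]]"


def redirect_page(target: str) -> str:
    """Full content of a redirect page pointing at the target article."""
    return f"#REDIRECT [[{target}]]"


# Goes at the foot of the article's wikitext.
print(category_tag("Cameroonian culture"))

# Saved as the entire content of the page titled "CPPA".
print(redirect_page("Cameroon Press Photo Archive"))
```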
Wikiproject templates are a crucial way to highlight the article to the Wikipedian community. These tags, added to the article’s Talk page rather than the article itself, add the article to automatically generated lists of articles by topic, quality and importance. These in turn are monitored by Wikiprojects: groups of users interested in improving articles across a specific topic such as Cameroon or the history of photography.
Wikidata is the structured-data counterpart to Wikipedia, and maintains the links between different language versions of Wikimedia sites. We created a basic Wikidata item about the CPPA, making a few basic facts available about the archive across many languages. This gives a start to Wikipedians who want to write about the CPPA in other languages. We also created a Wikidata item for the Vestiges journal, using its entry in the Directory of Open Access Journals (DOAJ) as a source, and a Wikidata item for the journal paper. This way, the paper will turn up in queries for, for example, academic papers about things in Cameroon published by the University of Oxford.
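A query of the kind mentioned above could look roughly like the following, built here as a SPARQL string for the Wikidata Query Service. This is a sketch under stated assumptions: the function name is invented, and the data model (a paper linked to its subject by "main subject", the subject to a country, the paper to a journal by "published in", and the journal to its publisher) is an assumption about how these items might be described, not a record of how they were actually entered. The identifiers in the example call are given as best recalled and should be checked on Wikidata before use.

```python
import re


def build_paper_query(country_qid: str, publisher_qid: str, limit: int = 50) -> str:
    """Build a SPARQL query for papers about things in a given country,
    published in journals from a given publisher.

    Hypothetical helper; the property choices are assumptions:
    P31 instance of, Q13442814 scholarly article, P921 main subject,
    P17 country, P1433 published in, P123 publisher.
    """
    for qid in (country_qid, publisher_qid):
        if not re.fullmatch(r"Q[1-9]\d*", qid):
            raise ValueError(f"not a Wikidata item identifier: {qid!r}")
    return f"""
SELECT ?paper ?paperLabel WHERE {{
  ?paper wdt:P31 wd:Q13442814 ;
         wdt:P921 ?subject ;
         wdt:P1433 ?journal .
  ?subject wdt:P17 wd:{country_qid} .
  ?journal wdt:P123 wd:{publisher_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {int(limit)}
""".strip()


# Assumed identifiers: Q1009 for Cameroon, Q34433 for the University of Oxford.
print(build_paper_query("Q1009", "Q34433"))
```

The printed query can be pasted into the Wikidata Query Service; if the items are modelled differently from the assumption above, the property paths would need adjusting.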
Three Wikipedia volunteers made minor fixes to the CPPA article once it was live, and we invited another researcher familiar with the archive to improve the article. We also reused a couple of images from the paper to illustrate the article. This involved uploading them to the media archive Wikimedia Commons, again with a citation of the source paper.
This has prompted the journal to introduce a new article format. Biographical notes are intended to be short summaries of the life of a photographer, archivist or other person related to the journal’s topic. They will be based on field research and will have a more rapid publication process. These will be ideal both for researchers getting a publication out of their fieldwork and for the world at large to learn about people and events that are not well documented. The journal has taken on Oxford’s Wikimedian in Residence, Dr Martin Poulter, as an editorial adviser to help these articles conform to Wikipedia style.