Migrate, Publish, Repeat: TEI Journals in the Open Journal Systems Platform

Journal of the Text Encoding Initiative, Issue 10, 20/02/2018. Selected Papers from the 2015 TEI Conference.

The Indiana University (IU) Libraries have a long history of using the TEI markup standard to encode and publish electronic texts, but choosing the best publishing platform has been challenging for certain projects. Before formally launching an open access journal publishing program in 2008, the Libraries collaborated with two scholarly journals to provide open access publishing using P3 SGML and P4 XML TEI encoding delivered through the DSpace and XTF platforms. Both journals used complex encoding, transformation, and delivery workflows that required copious amounts of custom software development to function properly. As these systems aged, the time and effort required to maintain them steadily increased. In 2013, the Libraries began planning to migrate these journals into the Open Journal Systems (OJS) platform while preserving the TEI markup. Both journals now publish on the OJS platform. The Indiana Magazine of History was successfully launched in OJS in August 2014, and The Medieval Review was launched in June 2015. Publishing in this manner leverages the IU Libraries’ strengths in electronic text projects and XML workflows within an easy-to-use, flexible platform that journal editors appreciate. The success of these migrations presents a new framework for future XML publishing of open access journals at Indiana University.

Concerning TEI projects at Indiana University, Dalmau, Hardesty, and Homenda presented on the challenges of performing large-scale TEI migrations while upgrading the underlying platform (Homenda, Dalmau, and Hardesty 2014).

Holmes and Romary examined the use of TEI to represent electronic journals. They identified the spectrum of file types used within electronic journal editorial and publication workflows, including DOC, PDF, HTML, and XML, and advocated for the development of a customization of the TEI schema specifically for encoding journal articles. They acknowledged the OJS editorial workflow but noted the platform's limitations, including the need to produce and edit documents for publication in software outside of the system (Holmes and Romary 2011). At that time, OJS was a relatively new platform that had minimal XML support and no means of handling TEI files directly. The Journal of the Text Encoding Initiative has produced a highly constrained customization of TEI P5 (80 out of 600 elements) in order to be able to publish its own journal articles (Van den Branden and Holmes 2014). From the submitted TEI XML, the journal generates different outputs such as ODT for the editorial workflow, OpenEdition XML for the journals.openedition.org publication platform, and a PDF, which requires conformity of submissions to the standard. There is a jTEI template for the Oxygen XML editor that makes encoding within the schema easier.

A study by Dias, Delno, and Silva investigated the feasibility of migrating electronic journals to OJS, and noted some of the advantages and shortcomings early in its life as an open source publishing platform (Dias, Delno, and Silva 2007).Dalmau and Schlosser recognized that Indiana University's IUScholarWorks Journals service began using OJS in a pilot project in 2008 after signicant development of an alternate platform occurred..They outlined the history of the "Indiana Magazine of History," the journal's collaborative relationship with the IU Libraries, and the resulting accessible online index and the rst iteration of its online open access platform (Dalmau and Schlosser 2010).Their work focused on developing a TEI model for the Indiana Magazine of History that featured issue-level encoding using the P4 version of the guidelines and independent header les for encoding article-level metadata.The encoding workow for this project was unique in that entire issues of the journal were encoded as single TEI documents with additional independent header les containing articlelevel metadata.Once issues and independent headers were initially encoded, they were subjected to quality control procedures involving markup inspection, additional validation, and visual inspection on a test server.After passing quality control, page image derivatives and TEI les were ingested into Fedora, and METS les were automatically generated for use by the DLP's page-turning web application, METS Navigator. 5The TEI-encoded texts were then put into a custom eXtensible Text Framework (XTF) 6 web application that provides indexing, searching, and browsing, and presents the encoded text rendered as HTML alongside page images (gure 1).

Case Study A: Indiana Magazine of History
Once the tasks of scanning and TEI encoding were completed, publishing an issue of IMH took an additional couple of weeks because of the need for software developers to interact with the various components that made up the web application.

System Limitations
Although the various systems that comprised the IMH online portal functioned stably and reliably, the amount of developer intervention required made it difficult to make new issues available quickly and even more difficult to make minor post-publication corrections. The underlying XTF platform natively supports XML collections and contains default stylesheets for TEI transformations; however, the level of customization needed to support the IMH impeded quick publication and indexing by search engines. A single incorrect page number in a page break element, bibliographic metadata field, or METS file would result in the entire article's text and page images becoming inaccessible. Since these files were stored in the Fedora repository with no web-based user interface, changes to any of the numerous file types associated with each article would require a software developer to manually replace files. In addition to the complexity associated with several interlocking systems, the online portal of IMH also suffered from poor Google indexing: no individual articles appeared in Google search results, a problem that would require completely re-architecting the site to fix.

The platform's TEI article export function had been implemented within the XTF publishing platform to provide readers the option to download single TEI files for research or analysis purposes.

Workflow Solutions
In August 2014, a new workow for publishing the IMH online was implemented using OJS.The IMH editors decided to continue conducting their peer review, editing, and typesetting processes outside of OJS.Once they release a new issue in print, they transfer PDF copies of the nal, formatted les for each article to the library via IU Box.

System Limitations
DSpace was developed to serve as an institutional repository, which Clifford Lynch (2003, 2) describes as "… a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members." Although some institutions have adapted DSpace to deliver digital collections, the software is not natively meant to support this: files stored in the repository are not viewable within the system and must be downloaded by users, and the visual styling of the repository is relatively inflexible across collections stored within.

The process for publishing TMR TEI was error-prone since incorrect encoding of metadata and text could result in skewed deposits in DSpace. Moreover, making updates to the XML files required developer intervention and an uncomfortably elastic approach to deleting and re-depositing that is normally not used in a secure institutional repository. Furthermore, the journal had little to no control over the appearance and functionality of the online portal or their articles, the latter of which were presented as downloadable HTML files with no styling and essentially appeared as plain text.

Journal Migration
In 2014, discussions with TMR editors about the viability of migrating to OJS were sparked by a slew of problems the editors experienced in trying to correct errors in previously published articles.
Although OJS provides an "Objects for Review" plugin (formerly the "Books for Review" plugin), it is not comprehensive enough to manage TMR's complex editorial workflow. Therefore, TMR editors were mainly interested in the features of OJS that provided a mechanism for publishing XML, enhanced indexability, and the ability to customize the appearance of the website. Before the journal accepted the terms of a migration to OJS, Libraries staff made prototype versions of TMR in OJS and imported articles and issues for journal staff to examine (as was done with IMH).
For TMR, which had nearly 4,000 articles at the time, DCS staff decided to simplify the journal's setup in OJS by migrating both P3 and P4 articles to the current P5 markup in order to achieve consistency across the corpus. Had the encoding migration not been performed, the XSLT within the OJS XML Galley Plugin would have had to account for two different markup versions, and the P3 portion of the corpus would have had to be converted from SGML to XML. In addition, because the quality of the encoding was quite low in certain places, an item-by-item inspection was already necessary in order to correct significant metadata and encoding errors. These corrections were ultimately incorporated into the migration process to P5.
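
As a rough illustration of what a markup migration of this kind involves, the sketch below applies three well-known P4-to-P5 changes to a toy document: the root element TEI.2 becomes TEI, elements move into the TEI namespace, and id attributes become xml:id. This is not the Libraries' actual conversion process. The function name `p4_to_p5` and the sample document are hypothetical, and a real migration covers many more changes (plus an SGML-to-XML step for P3 content).

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def p4_to_p5(p4_xml: str) -> str:
    """Apply a few illustrative P4-to-P5 changes to a TEI document string."""
    # Serialize the TEI namespace as the default namespace (no prefix).
    ET.register_namespace("", TEI_NS)
    root = ET.fromstring(p4_xml)
    for elem in root.iter():
        # P5 renames the P4 root element TEI.2 to TEI.
        local = "TEI" if elem.tag == "TEI.2" else elem.tag
        # P5 elements live in the TEI namespace.
        elem.tag = f"{{{TEI_NS}}}{local}"
        # P5 uses xml:id where P4 used a plain id attribute.
        if "id" in elem.attrib:
            elem.set(XML_ID, elem.attrib.pop("id"))
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily simplified P4 review article.
SAMPLE_P4 = (
    '<TEI.2 id="tmr-sample"><teiHeader/>'
    "<text><body><p>Review text.</p></body></text></TEI.2>"
)

if __name__ == "__main__":
    print(p4_to_p5(SAMPLE_P4))
```

Converting the whole corpus to one markup version in this way means the rendering stylesheet only has to handle a single vocabulary.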

Workflow Solutions
TMR began publishing in OJS using a new workflow in June 2015. Libraries staff worked with TMR's editorial staff to update their FileMaker Pro database export templates so that the exports conform to and validate against the TEI P5 Guidelines. Like the IMH editors, TMR editors decided to continue conducting their review, editing, and email distribution processes outside of OJS. Now, when a new review is ready for online publication, a TMR editorial assistant generates the TEI file from the FileMaker Pro database and enters the metadata for the article in the "Quick Submit" plugin in OJS (figure 4). This plugin allows journal staff to upload and quickly publish a single article within the system without needing to utilize the system's editorial workflow steps. TMR staff use this plugin since they review and prepare their review articles within their own local database and need OJS to immediately publish the final versions of these articles without further editorial steps. Because the "Quick Submit" plugin is not currently compatible with XML files, it is not possible to upload

Discussion
Common concerns about reliability, discoverability, and lack of control prompted the migration of these two journals to OJS. The custom platforms and workflows developed for IMH and TMR required developer time to perform routine tasks and were prone to breaking. Moreover, having a custom-coded solution makes future maintenance more difficult when staffing changes occur, a problem slightly mitigated but not solved by extensive project documentation. These platforms also employed less-than-adequate indexing architecture resulting in low discoverability by users outside of IU. Finally, while many scholarly journals lack the staff necessary to design and manage a complex online publishing system, they still hope for a certain level of control over their publishing platform, and they desire the ability to change the appearance of the site, correct errors in markup and metadata, and perform basic publishing actions with minimal overhead. The custom platforms for IMH and TMR did not allow this control.
Both journals have benefited from using OJS as a publication platform. For IMH, the time to publication has been significantly reduced; after the TEI files are reviewed for quality, it takes just five minutes to publish a new issue online, a process which used to take weeks. Moving IMH to OJS also improved the visibility of its content; the site is now indexed by Google and individual articles are indexed by Google Scholar. From the Libraries' perspective, moving IMH under the umbrella of IUScholarWorks Journals has helped to consolidate library journal publishing activities under one roof. Similarly, moving TMR from DSpace to OJS helped to consolidate all library-published open access journals within a single platform, making them easier to manage. This benefits journal staff, who now can receive a consistent level of publishing and technical support. Finally, OJS enables both IMH and TMR editors to have a greater amount of control over the appearance of their respective platforms. In the past, editors have had to ask library staff to make even the smallest of changes, which they can now accomplish themselves by logging in directly to OJS.

Recommendations and Future Directions
These two projects illustrate that while using XML to publish electronic journals is hardly a new idea, the technical complexity involved in designing highly customized publishing platforms makes them difficult and expensive to maintain. TEI's flexibility makes it a good option for encoding journal content, but efficiently delivering this content requires an easy-to-use publication platform with minimal need for advanced software development skills and developer time.
OJS may be a solution for active TEI-encoded serial publications in need of a new publication platform. As an existing tool with a large community of support, OJS eliminates the need to build and maintain a boutique publication platform. While it is specifically designed to accommodate journals, its use can easily be extended to other types of serial publications, such as monographic series and conference proceedings. Furthermore, it provides a user interface that enables non-technical editorial staff to easily update the content of their journal website. OJS, however, has its limitations for publishing TEI projects: it cannot interact with the TEI in any other way than simply transforming it to a different output format. For example, it is not possible to create browse or search facets based on the encoding without making significant alterations to the core OJS code. These kinds of changes are not desirable, as they would make the process of upgrading the software much more challenging.
While OJS has demonstrated itself to be a flexible, easy-to-use publication platform for TEI-based journals, there are several considerations to be made before adopting it locally. First, journal staff must decide on a workflow for generating the TEI and secure the staffing resources to implement it. Second, server space and the ability to install and maintain the software are needed. If batch uploading of article files is desired, additional, albeit minimal, server space is needed to serve as a temporary holding place to facilitate this process.
As demonstrated through the case studies of IMH and TMR, XML publishing in OJS is by no means seamless. Fortunately, the PKP team is actively developing improvements in this area. The organization is currently focused on converting Word documents to Journal Article Tag Suite (JATS) XML, which can subsequently be converted to a variety of other formats including HTML and PDF. The PKP team is also working to perfect the Open Typesetting Stack (previously the XML Parsing Service), a collection of different software libraries that perform various steps in the conversion process. An OJS plugin that submits documents to the Parsing Service is currently available on GitHub, but it has not yet been integrated into the main software code. PKP's efforts towards parsing minimally-structured text into XML could enable additional encoding standards and schemas to be used more easily within the platform, especially because the project is open source. Whether journals prefer to use JATS, TEI, or some other standard to encode their content, OJS as a platform is unlikely to influence the adoption of one XML standard over another, since its plugins and workflows have been built with a high degree of flexibility and the potential for customization to support individual journal needs.

Conclusion
In 2006, the Indiana University Digital Library Program (DLP) began a project funded by a Library Services and Technology Act grant to digitize the back issues of the Indiana Magazine of History (IMH), encode the text using TEI, and deliver the page images and encoded text through an open access web application. Digitizing the journal entailed contracting an overseas vendor to scan the journal archives and deliver TIFF master files. Next, the DLP made JPEG, PDF, and OCR text derivative files from these TIFFs according to the content model for paged media in IU's Fedora digital repository, where the derivative images were ingested. To prepare the journal for TEI encoding, optical character recognition (OCR) was used to produce plain-text files that were automatically placed into TEI files with bibliographic metadata in the header and page breaks inserted at the appropriate places. Back issues of IMH were encoded during the project years 2006 to 2008, and newly released issues continue to be encoded by IU Libraries interns and student employees seeking to gain experience with TEI. Issues are made open access with a two-year embargo.

Figure 1. The complex workflow steps for encoding an issue of Indiana Magazine of History in TEI and publishing it in XTF.
In 2013, the Indiana University Libraries' Digital Collections Services (DCS) department began investigating the feasibility of migrating IMH to OJS. DCS interns analyzed the existing XTF publishing platform, noted gaps and weaknesses, and investigated the functionality available using the OJS platform. DCS broached the idea of a migration with IMH staff, who indicated interest in transitioning to OJS to disseminate their open access issues, and possibly in exploring the use of OJS's editorial workflow tools in the future. DCS and Scholarly Communication staff worked together to execute the migration of 7,000 IMH articles and to transition the journal's operational support from DCS to the Scholarly Communication department. During this process, the P4 TEI encoding was left unchanged; however, an article-level version of the encoding is required by OJS. Fortunately, the previous IMH platform contained a TEI article export function, which was used to obtain all IMH articles for the migration. This function took issue-level TEI files and automatically converted them to the article-level TEI files required by OJS, using encoded text from the issue TEI and article metadata from the independent header files developed by Dalmau and Schlosser during the project's initial iteration.
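
The issue-to-article conversion can be pictured with a minimal sketch, shown here in Python rather than in the platform's own code, which is not reproduced in this article. It assumes, hypothetically, that each article in an issue file is a `<div type="article">` carrying an id, and that the independent header files are represented by a simple mapping from that id to an article title. The function name `split_issue` and the sample data are illustrative only.

```python
import copy
import xml.etree.ElementTree as ET


def split_issue(issue_xml: str, headers: dict) -> dict:
    """Return one small article-level TEI document per article div.

    `headers` stands in for the independent header files: it maps an
    article id to that article's title (a hypothetical simplification).
    """
    issue = ET.fromstring(issue_xml)
    articles = {}
    for div in issue.iter("div"):
        if div.get("type") != "article":
            continue
        art_id = div.get("id")
        # Build a fresh document whose header comes from the header data.
        doc = ET.Element("TEI.2")
        header = ET.SubElement(doc, "teiHeader")
        file_desc = ET.SubElement(header, "fileDesc")
        title_stmt = ET.SubElement(file_desc, "titleStmt")
        ET.SubElement(title_stmt, "title").text = headers[art_id]
        # Copy the encoded article text out of the issue-level file.
        body = ET.SubElement(ET.SubElement(doc, "text"), "body")
        body.append(copy.deepcopy(div))
        articles[art_id] = ET.tostring(doc, encoding="unicode")
    return articles


# Hypothetical issue with two very short articles.
SAMPLE_ISSUE = (
    "<TEI.2><text><body>"
    '<div type="article" id="a1"><p>First article.</p></div>'
    '<div type="article" id="a2"><p>Second article.</p></div>'
    "</body></text></TEI.2>"
)
SAMPLE_HEADERS = {"a1": "First Title", "a2": "Second Title"}

if __name__ == "__main__":
    for art_id, xml_text in split_issue(SAMPLE_ISSUE, SAMPLE_HEADERS).items():
        print(art_id, xml_text)
```

Each resulting document pairs the encoded text from the issue file with metadata from the corresponding header record, which is the same division of sources the export function relied on.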
A Libraries intern or student employee copies and pastes the text of each article from the PDF into an article-level TEI template. They encode the article according to the local implementation guidelines established by the project in 2009, which were recently updated to reflect changes in encoding practices due to the journal's migration. Most significantly, the guidelines no longer require the encoding of geographic places and personal names. Although this level of encoding was a key feature of the IMH project, the originally proposed software features intended to leverage this encoding depth were never developed. Moreover, encoding these two components took the majority of the time spent on each issue, which further slowed down the publication timeline.

Once the initial TEI encoding is complete, Libraries staff check each encoded file for quality and return it to the intern or student employee with any necessary changes. This process is repeated until the file contains no encoding errors. Libraries staff run a prepared XQuery file on the set of encoded files to read metadata from all of the XML files and to compose the single XML import file with the issue and article bibliographic metadata required by OJS. The TEI article files are stored on a local server in a location that is referenced by the import file. Next, Libraries staff upload the import file into a development instance of OJS using the "Articles & Issues" XML plugin. An XSLT file that is stored in the XML Galley plugin renders the TEI files automatically on the fly as HTML (figure 2). Once the file is successfully imported into the development instance of OJS, it is saved locally until two years after the print publication date (in accordance with the journal's two-year embargo period), at which point it is imported into the production instance of OJS and the content becomes open access.
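
The metadata-gathering step is done with XQuery in the workflow described above. The sketch below shows the same idea in Python under stated assumptions: it reads a title and author from each article's TEI header and composes one import document that references each stored file by URL. The output element names (`issue`, `article`, `file`) are illustrative placeholders and have not been checked against the OJS import format. The function name `build_import` and the example URL are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_import(volume: str, number: str, year: str, articles: dict) -> str:
    """Compose a single import document from article-level TEI strings.

    `articles` maps the URL where a TEI file is stored to that file's text.
    Output element names are placeholders, not the OJS import schema.
    """
    issue = ET.Element("issue", volume=volume, number=number, year=year)
    for url, tei in articles.items():
        root = ET.fromstring(tei)
        article = ET.SubElement(issue, "article")
        # Read bibliographic metadata from the TEI header.
        ET.SubElement(article, "title").text = root.findtext(".//titleStmt/title")
        ET.SubElement(article, "author").text = root.findtext(".//titleStmt/author")
        # Reference the stored TEI file rather than embedding its content.
        ET.SubElement(article, "file", href=url)
    return ET.tostring(issue, encoding="unicode")


# Hypothetical article-level TEI file and storage location.
SAMPLE_URL = "https://example.org/imh/a1.xml"
SAMPLE_TEI = (
    "<TEI.2><teiHeader><fileDesc><titleStmt>"
    "<title>Sample Article</title><author>A. Author</author>"
    "</titleStmt></fileDesc></teiHeader>"
    "<text><body><p>Text.</p></body></text></TEI.2>"
)

if __name__ == "__main__":
    print(build_import("110", "1", "2014", {SAMPLE_URL: SAMPLE_TEI}))
```

Keeping the article text in separate files and pointing to them from the import document mirrors the arrangement described above, where the TEI files sit on a local server and the import file only carries the metadata and file locations.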

Figure 2. The workflow steps for encoding an "Indiana Magazine of History" issue in TEI and publishing it directly in OJS.

Figure 3. The complex workflow steps for encoding a single Medieval Review article in TEI and publishing it via deposit in the IUScholarWorks institutional repository.
the article file at this stage; the editorial assistant must edit the article after it has been created in OJS and attach the corresponding TEI file. The TEI file is then automatically rendered as HTML by an XSLT file (based on the one created for IMH) that is stored in the "XML Galley" plugin.

Figure 4. The workflow steps for encoding a single Medieval Review article in TEI and publishing it directly in OJS.
While TEI remains a viable option among XML journal publishing schemas, the technical challenges associated with managing the publication process are daunting to many journal publishers. At Indiana University, the successful migration of two TEI-based journals to the OJS platform has provided a model for supporting journals that have developed or are interested in developing an XML publishing workflow. These projects have also helped the Libraries to forge closer working relationships among their DCS and Scholarly Communication departments, and to strengthen their collective expertise in the realm of digital publishing. The cases of IMH and TMR demonstrate OJS's ability to facilitate the publication of TEI-encoded journals, both of which previously relied upon complex platforms and workflows that required a high level of technical skill to manage and maintain. More importantly, these examples open the door for current or future journals to consider using this platform with an XML- or TEI-encoding workflow.

INDEX
Keywords: Open Journal Systems, open access publishing, electronic journals, digital project migration, XML publishing

AUTHORS
NICHOLAS HOMENDA
Nicholas Homenda is the digital initiatives librarian at Indiana University Bloomington Libraries, where he manages digital projects, services, and initiatives in the Digital Collections Services department. Nick holds a Master of Science in Information Studies from the University of Texas at Austin and previously worked as a music librarian and an orchestral clarinetist.

SHAYNA PEKALA
Shayna Pekala is the discovery services librarian in the Library Information Technology Department at Georgetown University Library. Previously, she led the open access publishing program at Indiana University Bloomington Libraries as the scholarly communication librarian. Shayna earned an MLS with a specialization in digital libraries from Indiana University Bloomington and a BA in English from Duke University.