Biography as Compilation: How to Encode Georg Nikolaus Nissen’s Biographie W. A. Mozart’s (1828) in TEI P5

Journal of the Text Encoding Initiative, Issue 11, 16/01/2020. Selected Papers from the 2016 TEI Conference.

The project of editing the early Biographie W. A. Mozart’s (1828) by Georg Nikolaus Nissen (Nissen Online) began as part of the Digital Mozart-Edition (DME) at the Mozarteum Foundation Salzburg. The aim of the edition is to reveal the structure of the text by identifying the diverse sources Nissen relied on when writing the biography. These include primary sources such as original letters and documents from the Mozart family, secondary sources such as contemporary literature about Wolfgang Amadeus Mozart, and original text written by the author and later editors. Considering the challenges that arise when creating an edition that tries to define the different strands of a text, this paper describes how XML/TEI markup was applied to
1. encode text passages which often do not correlate with common text structures (paragraphs, chapters);
2. document different types of sources and their authors or editors; and
3. integrate a detailed bibliography of the sources as well as critical annotations for each single text passage.

both decided to write their own biographies of Mozart. In 1842-43 Ulybyšev published a biography in French in three volumes, and between 1856 and 1858 Jahn published a biography in German in four volumes.

Over the next 150 years, this critical assessment of the biography was forgotten, and the editors of the Neue Mozart-Ausgabe used it for the documentary volume Mozart. Die Dokumente seines Lebens (Deutsch 1961) without critical annotation. The importance accorded to Nissen's biography rested on the fact that the author was married to Mozart's widow. The opinion that Nissen had firsthand information was expressed again by Rudolph Angermüller in his new edition of the biography (Nissen 2010). That edition contains many annotations on people and works but refrains entirely from analyzing and commenting on the secondary sources that Nissen used on a large scale, even though Dieter Demuth had already stated in 1997 that Nissen's biography of Mozart was, apart from the letters, a true plagiarism of early Mozart literature and that the detection of all of Nissen's sources remained a desideratum (Demuth 1997, 83).

Research Results
The detailed analysis of Nissen's biography revealed that the hypothesis of firsthand information is by no means valid. Nissen used secondary sources even for biographical details about Mozart. All aesthetic considerations of Mozart's music are unattributed quotations taken from other authors.
Nissen sometimes put together contradictory statements or contrasting aesthetic positions, such as classical and romantic ideas about Mozart's music, without critical reflection. Constanze Mozart's contribution to the biography consists only of details of little importance (Morgenstern 2014, 103-9).

The Digital Edition
Our project is part of the Digital Mozart-Edition at the Salzburg Mozarteum Foundation. Its goal is to fill a gap in Mozart research by analyzing the Nissen biography to distinguish and elucidate its various sources:
• primary sources (such as Mozart letters and original documents),
• secondary sources (early Mozart literature, newspapers, encyclopedias, etc.), and
• original texts (texts written originally by Nissen, Jähndl, or Feuerstein).

The choice of an appropriate encoding method was determined by the aim of capturing as many results of this analysis as possible. For each individual text passage, the results of the research (or rather the information to be conveyed by the online edition) can be outlined by the following questions: Is the text passage in question based on a primary or a secondary source, or was it originally written by one of the biography's editors? For primary and secondary sources: What is the exact source or reference? And, finally: Which one of the editors is responsible for inserting each passage into the biography?

The base TEI XML le, containing the text of the biography as well as basic structural markup-for instance, for headings, paragraphs, and graphically highlighted text-was created by converting a Microsoft Word document via the TEI's OxGarage tool (https://oxgarage.tei-c.org). 3 The encoding mode, hereafter described in detail, is based on the TEI P5 and on categories determined and described by ourselves in compliance with the TEI, either within the XML document itself or in the corresponding schema le. 4 In order to assign and add certain information, we reference resources provided externally, like digitized texts or authority les.

Our schema is a subset of the TEI. We started with an all-encompassing TEI schema and customized it mainly by deleting elements and attributes and by adding closed lists of attribute values to other elements. The values are described in prose within the schema file or in the document itself. This was the case with the subsections of the bibliography (see section 2.2) and the listing of the editors (see section 2.1.3). While the former merely introduces project-specific terms (e.g., "primary" and "secondary") to classify sources, the latter covers the circumstance of editors contributing to a text in a manner not explicitly taken into account by the TEI, namely by tacit text reuse. The referenced descriptions are also meant to be displayed, for example to provide additional information when a certain passage is clicked, and thus serve a purpose beyond documentation.

The first step in the process of source-related encoding was the segmentation of the text in accordance with the biography's composition process. Since the biography is made up of passages from many different sources, each passage from a distinct source should be marked up and be recognizable as such. However, these passages do not always correlate with paragraphs; in some cases, a passage of this kind might even stretch from inside a paragraph to the middle of the next one. The tree structure of the format-related markup is overlapped by a stream of text passages, as illustrated in figure 1.

Since the need to deal with overlapping structures in XML is obvious and indeed frequently encountered in text encoding, a great number of possible solutions have already been proposed.
The TEI Guidelines list and discuss in detail multiple approaches, some of them requiring extensions of the TEI, while also mentioning non-XML-based and nonconformant techniques (TEI Consortium 2016, 20: "Non-hierarchical Structures"). This does not mean that complex structures could not be handled with "pure" TEI, and we never considered an extension or a non-XML-based approach necessary because we mainly deal with a relatively simple case of just two overlapping hierarchies, one being merely a series of passages without any nesting or further overlapping.

[...] might result in a lot more markup.

As can be seen in example 1, every boundary point between the different passages is marked by an empty milestone-like element (<anchor>). Each <anchor> element is given an @xml:id comprising the incipit of the following passage. Often, but not always, the boundary point of a passage matches an already existing boundary, for example between two paragraphs. If that is the case, the <anchor> element is placed between the paragraphs.
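The placement of boundary markers described here can be sketched as follows. This is a minimal illustration and not an excerpt from the edition: the identifiers (`anchA_...`, `anchB_...`, `anchC_...`) and the placeholder text are hypothetical, chosen only to mimic the pattern of an `xml:id` built from the incipit of the following passage. The sketch shows one boundary that coincides with a paragraph break and one passage that runs from inside a paragraph into the middle of the next.

```xml
<!-- Hypothetical sketch: IDs and text are placeholders, not taken from the edition -->
<div xmlns="http://www.tei-c.org/ns/1.0">
  <!-- Boundary coincides with a paragraph break: the anchor sits outside the paragraphs -->
  <anchor xml:id="anchA_p000_First_incipit"/>
  <p>First incipit ... text of passage A ...
    <!-- Boundary inside a paragraph: passage B starts mid-paragraph -->
    <anchor xml:id="anchB_p000_Second_incipit"/>Second incipit ... passage B begins here ...</p>
  <p>... passage B continues across the paragraph break ...
    <!-- Passage B ends and passage C starts in the middle of this paragraph -->
    <anchor xml:id="anchC_p000_Third_incipit"/>Third incipit ... text of passage C ...</p>
</div>
```

Because the anchors are empty elements, the paragraph hierarchy stays well-formed no matter where a passage begins or ends.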

After the marking of the boundary points, another empty element (<span>) is inserted for every passage (see example 1). Each <span> references a text passage by pointing to the <anchor> before it (using @from for this purpose) and to the <anchor> after it (using @to):

Example 1. <span> elements defining text passages by referencing <anchor> elements (excerpt from page 652).
<anchor xml:id="anch1002_p651_Die_Musik"/> [...]

The <span> elements do not necessarily need to be positioned as shown above; another possibility is putting them into a <spanGrp> elsewhere within the document. However, it helps the encoder to have the <span> elements as close as possible to the text passages they refer to, since more information regarding a given passage will be added there later.

The TEI Guidelines do express some concern about processing data encoded with empty elements to mark boundary points between passages: "since the elements of the analysis … [in our case the text passages] are not uniformly represented by nodes in the document tree, they must be reconstituted by software in an ad hoc fashion, which is likely to be difficult and may be error prone" (TEI Consortium 2016, 20.2: "Boundary Marking with Empty Elements"). However, the <span> elements referencing the text passages can easily be addressed. Difficulties arise when the passages (that is, the text streams themselves) need to be addressed, but even then the <span> elements, which unambiguously reference the beginning and end point of a particular portion of text, are of great advantage. The need to address a text passage as a stream arises, for example, when searching the text of all passages that share a certain quality or attribute, or when every text passage on a certain page of the book is to be displayed in a different color in a web application. At this point, the development of the tool for online presentation is just beginning, so it remains to be seen how easy the markup will be to process.
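As an illustration of the alternative placement mentioned in this section, the following sketch collects the passage references in a `<spanGrp>` instead of leaving each `<span>` next to its passage. The anchor identifiers are those that appear in the examples of this paper; that the three anchors delimit two consecutive passages, and the surrounding `<div>`, are assumptions made for the sketch.

```xml
<!-- Sketch only: the paper prefers keeping each span close to its passage -->
<div xmlns="http://www.tei-c.org/ns/1.0">
  <p><!-- ... --><anchor xml:id="anch1002_p651_Die_Musik"/><!-- text of one passage -->
     <anchor xml:id="anch1003_p652_Es_hat"/><!-- text of the next passage --></p>
  <p><anchor xml:id="anch1004_p652_Da_man"/><!-- ... --></p>
  <!-- All passage references gathered in one place -->
  <spanGrp>
    <span from="#anch1002_p651_Die_Musik" to="#anch1003_p652_Es_hat"/>
    <span from="#anch1003_p652_Es_hat" to="#anch1004_p652_Da_man"/>
  </spanGrp>
</div>
```

Either placement yields the same set of `<span>` elements for processing; the difference lies only in how convenient the file is to edit by hand.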

Documentation of Specifications
Not only does the <span> element, as it is used in our encoding scheme, represent and reference passages or parts of text within a document; it also serves the purpose of associating "an interpretative annotation directly with a span of text" (TEI Consortium 2016, 17.3: "Spans and Interpretations"). Although <span> elements can have textual content, we decided to use <span> as an empty element with attributes, since the annotations we want to add either are limited to only a few options (e.g., indicating which editor is responsible for the insertion of a given passage) or simply consist of a reference to an object encoded elsewhere (the bibliographic records of the sources that have been used). Therefore, there is no need for a prose description in either case, since both specifications of the <span> element can easily be addressed using attributes and their values:

Example 2. <span> element with all attributes.
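Based on the attributes described in this paper and on the instance given in example 6, a fully specified `<span>` might look as follows. The attribute values are copied from example 6; the explanatory comments are additions for this sketch.

```xml
<!-- Values copied from example 6 of this paper; comments are explanatory additions -->
<span xmlns="http://www.tei-c.org/ns/1.0"
      from="#anch1003_p652_Es_hat"
      to="#anch1004_p652_Da_man"
      source="#Rochlitz_Tonkunst_1825_5"
      ana="#Nissen"
      corresp="#note_anch1003"/>
<!-- from / to : the anchors before and after the passage
     source    : the citedRange recording where the reused text comes from
     ana       : the category for the editor responsible for the insertion
     corresp   : optional pointer to a note with additional remarks -->
```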

Editors
The value of @ana targets a <category> element within a <taxonomy> element. The <taxonomy> element is enclosed by a <classDecl> element, which is part of the encoding description (<encodingDesc>). The categories were established during the preceding examination and the analysis of the editors' preserved working materials as well as the text of the biography itself.

Each <citedRange> element indicates a page (or range of pages) containing text material reused by the editors of the biography. One coherent piece of reused text corresponds to exactly one <citedRange> element, which means that in cases where a certain work is cited (or rather plagiarized) more than once, several <citedRange> elements are listed. Each <citedRange> element is associated with the corresponding text passage within the biography. The mechanism of associating elements and texts can be explained by looking at two examples given earlier: in example 2, the <span> element representing a reused text passage has a @source attribute, and this specific attribute points to a complete <citedRange> element in example 4.
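The two pointer targets described here, the editor taxonomy and the page-level source reference, could be laid out roughly as in the sketch below. Only the identifiers `Nissen` and `Rochlitz_Tonkunst_1825_5` occur in this paper's examples; the other identifiers, the use of `<bibl>` rather than another bibliographic element, the `unit` attribute, and the placement of the bibliography in `<back>` are assumptions. Descriptions and bibliographic details are left as placeholders rather than invented.

```xml
<!-- Sketch: well-formed, but not a complete or valid TEI document -->
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <!-- fileDesc omitted for brevity -->
    <encodingDesc>
      <classDecl>
        <taxonomy xml:id="editors"><!-- hypothetical id -->
          <category xml:id="Nissen"><!-- target of ana="#Nissen" -->
            <catDesc><!-- prose description of this editor's role --></catDesc>
          </category>
          <category xml:id="Jaehndl"><!-- hypothetical id -->
            <catDesc><!-- ... --></catDesc>
          </category>
          <category xml:id="Feuerstein"><!-- hypothetical id -->
            <catDesc><!-- ... --></catDesc>
          </category>
        </taxonomy>
      </classDecl>
    </encodingDesc>
  </teiHeader>
  <text>
    <body><!-- text of the biography with anchor and span elements --></body>
    <back>
      <listBibl>
        <bibl>
          <!-- author, title, and imprint of the source omitted -->
          <!-- one citedRange per coherent piece of reused text -->
          <citedRange xml:id="Rochlitz_Tonkunst_1825_5" unit="page"><!-- page or page range --></citedRange>
          <citedRange xml:id="Rochlitz_Tonkunst_1825_6" unit="page"><!-- hypothetical second reuse --></citedRange>
        </bibl>
      </listBibl>
    </back>
  </text>
</TEI>
```

With this layout, `ana="#Nissen"` and `source="#Rochlitz_Tonkunst_1825_5"` on a `<span>` both resolve to elements inside the same file.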

This approach to encoding the original source of a text passage follows the standard procedure of citing from a traditional prose text: citation by page number. As such, it sufficiently serves its purpose. It could rightfully be argued, though, that in comparison to the exact identification of the passage within the biography itself, the rather generic reference to the source is somewhat unsatisfactory. However, we do not have the resources to also encode in TEI all the texts that have been used in order to be able to point exactly to each sentence within its original context.

The bibliography itself (encoded as <listBibl>) is further divided into two sections, one with type="primary", the other with type="secondary". The former contains all the primary sources, the latter the secondary sources.

Since every bibliographical entry is part of either the first or the second <listBibl>, the source type of every text passage within the biography is documented through the relationship with its corresponding bibliographical entry. The bibliography is organized like a taxonomy and therefore serves the purpose not only of listing the sources that have been used but also of classifying and characterizing them further. The values of the @type attribute are documented in the corresponding schema.
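A two-part bibliography of this kind could be nested as in the following sketch. The `type` values are those named in the text; the individual entries are placeholders, and filing the Rochlitz reference under the secondary sources is an assumption based on the paper's description of early Mozart literature as secondary material.

```xml
<!-- Sketch: entries are placeholders, not records from the edition -->
<listBibl xmlns="http://www.tei-c.org/ns/1.0">
  <listBibl type="primary">
    <bibl><!-- a letter or original document of the Mozart family --></bibl>
  </listBibl>
  <listBibl type="secondary">
    <bibl>
      <!-- bibliographic details omitted -->
      <citedRange xml:id="Rochlitz_Tonkunst_1825_5" unit="page"><!-- page or page range --></citedRange>
    </bibl>
  </listBibl>
</listBibl>
```

The source type of a passage then follows from which inner `<listBibl>` contains the element its @source points to.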

If a passage was originally written by one of the editors and is thus not part of either a primary or a secondary source, but an original text, no @source is attached to the <span> element in question.
In this case, the editor referenced by @ana is identical to the author of the text.
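For comparison with a reused passage, a passage of original text would carry only the boundary pointers and the editor reference. The anchor identifiers below are hypothetical; what matters is the absence of @source.

```xml
<!-- Hypothetical anchors; no source attribute, so ana names the author of the text -->
<span xmlns="http://www.tei-c.org/ns/1.0"
      from="#anchX_p000_Incipit"
      to="#anchY_p000_Incipit"
      ana="#Nissen"/>
```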

Additional Remarks
The responsible editor and the original source of a passage are represented by mechanisms that point toward further descriptive elements. Given that we might want to comment on a passage in a way that is not covered by the functions associated with these elements, an optional @corresp attribute pointing to a <note> element containing additional remarks is supplied within the respective <span>:

Example 6. <note> element for additional remarks.
<p><!--... --><anchor xml:id="anch1003_p652_Es_hat"/>Es hat sich gefunden, dass, als Burney viel später dieses Miserere nach einer Copie des Originals öffentlich bekannt machte, auch nicht eine Note anders als bey Mozart darin war.</p>
<span from="#anch1003_p652_Es_hat" to="#anch1004_p652_Da_man" source="#Rochlitz_Tonkunst_1825_5" ana="#Nissen" corresp="#note_anch1003"/>
<note xml:id="note_anch1003" type="commentary">Dieser Abschnitt ist einer Fußnote zu Mozarts Gedächtnis im Artikel "Ein guter Rath Mozarts" entnommen.</note>

The <note> elements remain positioned as shown in example 6 only during the editing process. In a second step, the notes will be listed in a <list> element of type="notes" that is part of the document's back matter. If they remained positioned between <anchor> elements, they would be part of the text passage represented and addressed by a <span> element, which would be semantically misleading since they do not belong to the original text of the biography.
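The planned second step, in which the notes move to the back matter, might result in a structure like the one sketched here. The `<list type="notes">` container is named in the text; the enclosing `<div>` and the `<item>` wrapper around each note are assumptions about how the list would be filled.

```xml
<!-- Sketch of the notes after relocation; div and item wrappers are assumed -->
<back xmlns="http://www.tei-c.org/ns/1.0">
  <div>
    <list type="notes">
      <item>
        <note xml:id="note_anch1003" type="commentary">Dieser Abschnitt ist einer Fußnote zu Mozarts Gedächtnis im Artikel "Ein guter Rath Mozarts" entnommen.</note>
      </item>
    </list>
  </div>
</back>
```

Since the `<span>` refers to the note by `corresp="#note_anch1003"`, moving the note does not require any change to the `<span>` itself.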

The integration of electronic images of the primary and secondary sources used by the three authors serves two purposes: [...]

For the identification of places we reference the GeoNames authority files. Referencing authority files such as GND and GeoNames enables interchange between projects. The letter edition of the DME, which Nissen Online references, collaborates with correspSearch, a tool that allows one to search the metadata of various scholarly editions of letters.

Conclusion
Despite several critical evaluations in the nineteenth and twentieth centuries, many scholars today still cite the Nissen biography without being aware of the plagiarism inherent in it. This is the reason a critical edition is needed. Our online edition offers musicologists, and Mozart scholars in particular, a new, groundbreaking instrument for the critical study of the Nissen biography, with a focus on the documentation and presentation of the secondary sources used by its authors. These sources encompass most of the familiar writings on Mozart up to 1829, but also less well-known publications.

The outlined encoding method allows for numerous search functions to be applied to the text of the biography. It allows users to examine how much primary and secondary source material and how much original text the biography contains. It also makes it possible to retrieve the contributions of a specific editor, that is, the text passages he inserted or wrote himself. And since the sources are further defined through the taxonomic order of the bibliography, it is possible to search the biography for particular types of sources.
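As a rough indication of how such queries could be expressed, the XSLT 1.0 sketch below counts passages by source type and by editor. It is not the project's tooling: it assumes that each @source holds a single pointer, that every `<citedRange>` sits inside a `<listBibl>` typed "primary" or "secondary", and that the editor category is identified as `#Nissen`. It counts `<span>` elements rather than measuring the amount of text in each passage.

```xml
<!-- Hypothetical query sketch, not part of the edition's tooling -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">
  <xsl:output method="text"/>

  <!-- Look up any element by its xml:id -->
  <xsl:key name="by-id" match="*[@xml:id]" use="@xml:id"/>

  <xsl:template match="/">
    <!-- Passages whose source resolves into the primary bibliography -->
    <xsl:text>primary: </xsl:text>
    <xsl:value-of select="count(//tei:span[key('by-id', substring-after(@source, '#'))/ancestor::tei:listBibl[@type = 'primary']])"/>
    <!-- Passages whose source resolves into the secondary bibliography -->
    <xsl:text>&#10;secondary: </xsl:text>
    <xsl:value-of select="count(//tei:span[key('by-id', substring-after(@source, '#'))/ancestor::tei:listBibl[@type = 'secondary']])"/>
    <!-- Passages without a source pointer are original text -->
    <xsl:text>&#10;original: </xsl:text>
    <xsl:value-of select="count(//tei:span[@from and @to and not(@source)])"/>
    <!-- Passages attributed to one editor -->
    <xsl:text>&#10;attributed to Nissen: </xsl:text>
    <xsl:value-of select="count(//tei:span[@ana = '#Nissen'])"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Measuring how much text each category covers would additionally require reconstituting the passages between each pair of anchors, which is the processing step the Guidelines caution about.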

The making of Nissen's Biographie W. A. Mozart's is by no means an isolated case. Extensive utilization as well as copying of all available sources were common practices in the making of nineteenth-century biographies (Klein 2009, 247). Therefore, the outlined considerations on how to encode Nissen's compilation in TEI P5 might also be of interest for future work on similar texts.