Non-Biblical Textual Criticism

Contents: Introduction * The Methods of Classical Criticism: Recensio, Selectio, Examinatio, Emendatio * Books Preserved in One Manuscript * Books Preserved in Multiple Manuscripts * Books Preserved in Hundreds of Manuscripts * Books Preserved in Multiple Editions * Textual Criticism of Lost Books * Other differences between Classical and New Testament Criticism * Appendix I: Textual Criticism of Modern Authors * Appendix II: History of Other Literary Traditions * Appendix III: The Bédier Problem

Introduction

Textual criticism does not apply only to the New Testament. Indeed, most aspects of modern textual criticism originated in the study of non-Biblical texts. Yet non-Biblical textual criticism shows notable differences from the New Testament variety. Given the complexity of the field, we can only touch on a few aspects of non-Biblical TC. But I'll try to summarize both the chief similarities and the major differences.

In one sense, the materials of secular textual criticism resemble those for Biblical criticism. Both are involved with manuscripts other than the autograph -- or, in a few strange cases such as Malory's Morte D'Arthur and the works of Shakespeare, with the relationship between editions and autographs. (We have only two early sources for Malory, both near-contemporary: Caxton's printed edition and a manuscript presumably close to the autograph. They differ recensionally at some points: Caxton evidently rewrote.)

The works of Sir Walter Scott are an even more complex case: Scott's native language was Braid Scots; it differs in pronunciation and vocabulary, though hardly in grammar, from British English, which is the language in which his books were to be published. To a significant extent, he relied upon his publisher to correct his Scotticisms. He also produced a second edition of many of his works, making marginal emendations in the first edition. So what is the authoritative text of, say, Ivanhoe -- Scott's manuscript, Scott's first edition, Scott's interlinear folios which were the source for the second edition, or the second edition? And how do Scott's corrections to the galley proofs fit into this? Not all of his corrections were proper English, and the editors ignored some of these.

Thomas Percy's Reliques have some similar problems, because there quite literally was no original manuscript. Percy assembled fragments from various sheets he had collected, tore out portions of manuscripts (yes, the man was a vandal), scribbled over them, promised fillers but supplied them late, added material after portions of the book had been printed, and in general did everything he could to torture his poor printer. Little wonder that the book took two and a half years to publish. But what, then, is "the" text of the Reliques? The various materials Percy submitted? The corrected proofs? Something else? This is a book which was published in relatively modern times by a known author, but still there is no autograph.

The sole manuscript of Malory, British Library Add. 59678. The top portion of folio 35, showing the change from the first hand to the second (a change which seems to prove that it is not the autograph). The manuscript is imperfect; eight leaves are lost at the beginning, and probably as many at the end. This manuscript seems to have been known to Caxton; there are marks from his print shop in it. But the published edition differs, sometimes dramatically, from the manuscript.
It appears that Caxton rewrote most extensively in the earlier portions, where Malory was, in effect, writing independent short stories; the end, in which Malory seems to be trying to create a unified narrative, is almost the same in manuscript and printed book. The whole still poses an interesting challenge to textual critics, since the manuscript is not the autograph and there are hints that Caxton had some other source -- perhaps another manuscript.
Copies of Caxton's first printed edition are almost as rare as manuscripts: Only two survive, and one of them imperfect. Simply being printed did not assure the survival of documents!

The history of printed editions of classical works is often similar to that of the New Testament text following Erasmus: "[T]he early printers, by the act of putting a text into print, tended to give that form of the text an authority and a permanence which in fact it rarely deserved. The editio princeps of a classical author was usually little more than a transcript of whatever humanist manuscript the printer chose to use as his copy.... The repetition of this text... soon led to the establishment of a vulgate text... and conservatism made it difficult to discard in favour of a radically new text" (L. D. Reynolds & N. G. Wilson, Scribes & Scholars, second edition, 1974, p. 187).

There is, however, one fundamental difference between classical and Biblical textual criticism. Without exception, the number of manuscripts of classical works is smaller. Even the Golden Legend of Jacobus de Voragine, which in many countries was better-known than the Bible itself, exists in only about a thousand copies. The most popular classical work is the Iliad, represented by somewhat fewer than 700 manuscripts (though these manuscripts actually average rather older than New Testament manuscripts. Papyrus copies of Homer are numerous. As early as 1920, when the New Testament was known in only a few dozen papyrus copies, there were in excess of a hundred papyrus texts of the Iliad known, a fair number of which dated from the first century C. E. or earlier.) But the case of Homer is hardly normal. More typical are works such as Chaucer (somewhat over 80 manuscripts of the Canterbury Tales, of which about two-thirds once contained the complete Tales; a few dozen copies of most of his other works). From this we work down through Piers Plowman (about forty manuscripts) to Thucydides, preserved in only eight manuscripts (this even though he was so well-known and admired that one of Josephus's assistants is known as the "Thucydidean hack") to the literally thousands of works preserved in only one manuscript -- including such great classics as Beowulf, the Norse myths of the Regius Codex, and Tacitus (Tacitus's Annals are preserved in two copies, but as the copies are partial and do not overlap at all, for any given passage there is only one manuscript). Indeed, there are instances where all manuscripts are lost and we must reconstruct the work from excerpts (Manetho; the non-Homeric portions of the Epic Cycle; most of Polybius, etc.).

This produces a problem completely opposite that in New Testament TC. In New Testament TC, we can usually assume that the original reading is preserved somewhere; the problem is one of sorting through the immense richness of the tradition to find it. In classical criticism, the reverse is often the case: We know every manuscript and every reading in the tradition, but have no assurance that the tradition preserves the original reading. As an example, consider a reading from Gregory of Tours' History of the Franks: in I.9 the manuscripts of Gregory allude to the twelve patriarchs (specifically mentioning that there are twelve) -- and then list only nine: Reuben, Simeon, Levi, Judah, Issachar, Zebulun, Dan, Gad, Asher. Clearly, three names -- Naphtali, Benjamin, and either Joseph or his sons -- have been omitted. But where in the reading? And is it Joseph, or his sons? We simply cannot tell.

It will be observed that many of the documents cited above are in languages other than Greek. Textual criticism, of course, can be applied in all languages; the basic rules are the same (except for those pertaining to paleography and other aspects related to letter forms and the history of the written language). For perspective, many of our examples will be based on works written in languages other than Greek -- though, because I lack the background, none will be taken from ideographic languages.

The text which follows is littered with footnotes and parentheses. I am genuinely sorry about this, since it makes the article much more confusing. But this is a far more complex field than New Testament criticism -- there are many different sorts of documents requiring many different techniques. Most rules have long lists of exceptions. And I don't want to deceive by overgeneralizing. The only alternative is the long list of special cases.

The Method of Classical Textual Criticism

Classical textual criticism, as its name implies, goes back to the classical Greeks, who were concerned with preserving the text of such ancient works as Homer. One of the centers of ancient textual criticism was Alexandria; it has been theorized (though there is no evidence of this) that the reason for the relative purity of the Alexandrian text of the New Testament is that Egyptian scribes were influenced by the careful and conservative work of the Alexandrian school. Their textual work on Homer was not always sophisticated (indeed, their conclusions were often quite silly), but they developed a critical apparatus of high sophistication (see the discussion of Alexandrian Critical Symbols).

Modern textual criticism, however, dates back to Karl Lachmann, who would later edit the first text of the New Testament to be fully independent of the Textus Receptus. In his work on Lucretius, Lachmann defined the basic method that has been used ever since.

It is interesting to note that, while New Testament textual critics break themselves down into two groups, textual critics of vernacular works see three classes (compare Ralph Hanna III, "(The) Editing (of) The Ellesmere Text," in Martin Stevens & Daniel Woodward, editors, The Ellesmere Chaucer: Essays in Interpretation, Huntington Library & Yushodo Co., Ltd., 1997, pp. 225-226). The three are Best Text editors, who largely follow the lead of Bédier and print the text of a single source almost unaltered; Eclectics, who choose between texts based on their own interpretation of the best reading; and stemmatic workers. Westcott and Hort are sometimes regarded as Best Text editors (although they were in fact more eclectic than any Best Text editor I've ever studied); Lachmann invented the stemmatic method but was unable to use it on the New Testament; every other New Testament critic since then has been some type of eclectic.

I, on the other hand, would at least like to be able to work stemmatically. So here is an outline of how it is done.

Textual criticism, in this system, proceeds through four basic steps (some of which will be neglected in certain cases, and which occasionally go by other names):

recensio, the creation of a family tree for the manuscripts of the work
selectio, the comparison of the readings of the various family members, and the determination of the oldest reading (this is sometimes considered to be part of recensio)
examinatio, the study of the resultant text to look for primitive errors
emendatio (also called divinatio, and sometimes considered to be a part of examinatio or vice versa), the correction of the primitive errors.

Recensio

Recensio is the process of grouping the manuscripts into a stemma or family tree. Of all the steps involved in classical textual criticism, this is the one regarded as having the least direct relevance for New Testament TC. In this stage, the differences between the manuscripts are compared and a stemma compiled. (This assumes, of course, that several manuscripts exist. If there is only one manuscript, we will omit this stage, as described in the section on books preserved in one manuscript.)

The essential purpose of the stemma is to lighten our workload, and also to tell us what weight to give to which manuscripts. Let's take an example from Wulfstan's thirteenth homily (a pastoral letter in Anglo-Saxon). Five manuscripts exist, designated B C E K M, the latter being fragmentary. According to Dorothy Bethurum, these manuscripts form a stemma as follows (with lost manuscripts shown in [ ] -- a useful convention though not one widely adopted):

    [ARCHETYPE]
         |
    -----------
    |         |
   [X]       [Y]
    |         |
  -----       |
  |   |       |
  C   E       B
  |
 [Z]
  |
-----
|   |
K   M

That is, the archetype gave rise to two manuscripts, X and Y, now both lost. (Based on the stemma itself, it would appear that the archetype was actually the parent of X and Y, but this is by no means certain in reality.) B was copied from Y, and C and E were copied from X. Another lost manuscript, Z, was copied from C, and gave rise to K and M.

Observe what this tells us. First, K and M are direct descendants (according to Bethurum, anyway) of C. Therefore, they tell us nothing we don't already know, and can be ignored. Second, although C, E, and B are all primary witnesses, they don't have the same weight. Since C and E go back to a common archetype [X], their combined evidence is no greater than B alone, which goes back to a separate archetype. (We might find that [X] was a better witness than [Y], but the point is that C and E are dependent and B is independent. That is, the combination B-C against E is a good one, and B-E against C is good, but C-E against B is inherently weaker; it's ultimately a case of one witness against another.)

We also know that K and M have no value at all; their readings all go back to C, and since we have C, we have no need to consult K and M (unless C is incomplete, but that does not apply in this case). Such manuscripts are said to be eliminated from consideration (the process of so doing being called eliminatio).
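
The elimination just described is mechanical enough to sketch in a few lines of Python. This is only an illustration, not anything Bethurum used: the child-to-parent table is simply the stemma above retyped, and the function names are invented for the purpose. A manuscript is eliminated if it is extant and has an extant ancestor.

```python
# Bethurum's stemma for Wulfstan's Homily XIII, as a child -> parent table.
# Lost manuscripts are written in [brackets], as in the diagram above.
PARENT = {
    "[X]": "[ARCHETYPE]",
    "[Y]": "[ARCHETYPE]",
    "C": "[X]",
    "E": "[X]",
    "B": "[Y]",
    "[Z]": "C",
    "K": "[Z]",
    "M": "[Z]",
}


def is_extant(ms):
    """A siglum without brackets stands for a surviving manuscript."""
    return not ms.startswith("[")


def ancestors(ms, parent_map):
    """Yield every ancestor of ms, nearest first."""
    while ms in parent_map:
        ms = parent_map[ms]
        yield ms


def eliminated(parent_map):
    """Extant manuscripts that descend from another extant manuscript."""
    return sorted(
        ms for ms in parent_map
        if is_extant(ms) and any(is_extant(a) for a in ancestors(ms, parent_map))
    )


def primary_witnesses(parent_map):
    """Extant manuscripts that survive eliminatio."""
    gone = set(eliminated(parent_map))
    return sorted(ms for ms in parent_map if is_extant(ms) and ms not in gone)


print(eliminated(PARENT))         # ['K', 'M']
print(primary_witnesses(PARENT))  # ['B', 'C', 'E']
```

Note that the sketch takes the stemma as given. It says nothing about whether the stemma is right, and it ignores the caveat that an eliminated copy may still be needed where its ancestor is damaged.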

So how does one determine a stemma?

One begins, naturally, by collating the manuscripts (in full if possible, though family trees are sometimes based on samples). This generally requires that a single manuscript be selected as a collation base. (Unfortunately, since the manuscripts are not yet compared, the manuscript to collate against must be chosen unscientifically. One may choose to start with the oldest manuscript, or the most complete, or the one most superficially free of scribal errors; as Charles Moorman comments on page 35 of Editing the Middle English Manuscript, the determination can only be made "by guess or God." Fortunately, while choosing the right collation base makes everything easier, using the wrong base should not affect the result.)
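
For readers who think more easily in code, collation against a base can be sketched as a word-by-word comparison. The following uses Python's standard difflib module; the base text, the sigla N and O, and their readings are all invented for illustration and come from no real manuscript. Real collation must also cope with abbreviations, orthography, and damage, none of which is attempted here.

```python
from difflib import SequenceMatcher


def collate(base, witness):
    """List the places where witness differs from base, word by word.

    Each entry is (word position in base, base reading, witness reading);
    an empty string means the words are absent on that side.
    """
    a, b = base.split(), witness.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    variants = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            variants.append((i1, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return variants


# Hypothetical collation base and two hypothetical witnesses.
BASE = "the king rode out to the green wood"
WITNESSES = {
    "N": "the king rode to the green wood",
    "O": "the king rode out to the grene wode",
}

for siglum, text in WITNESSES.items():
    print(siglum, collate(BASE, text))
# N [(3, 'out', '')]
# O [(6, 'green wood', 'grene wode')]
```

Choosing a different base changes how each difference is described (an omission relative to one base is an addition relative to another), but the same points of variation turn up either way.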

Once the manuscripts are collated, one proceeds to determine the stemma. Methods for making this determination vary. Lachmann based his work on "agreement in error." This is a quick and efficient method, but it has two severe drawbacks: First, it assumes that we know the original reading (never a wise assumption, although critics as recent as Zuntz have sometimes used this technique), and second, it requires a fairly close-knit manuscript tradition. Both criteria were met by Lucretius, the author Lachmann studied.

According to the latest research I have seen (summarized on pp. liv-lvii of the Loeb Classical Library edition of Lucretius, the 1992 revised edition by Martin Ferguson Smith), the stemma of the Lucretius is as follows:

          AUTOGRAPH
              |
ARCHETYPE (with 26 lines per page)
              |
   ---------------------
   |                   |
   O                  (?)
(IX; corrected         |
 by Dungal)            ------------
   |                   |          |
(Poggio's MS)          Q        G+U+V
    |                 (IX)      (IX)
    -------------
    |           |
    L     Other MSS (about 50, from Italy; XV-XVI)

This stemma, being so compact, is readily revealed by agreement in error. Other books are not as cooperative. Paul Maas observed that the agreement-in-error method requires two presuppositions: "(1) that the copies made since the primary split in the tradition each represent one exemplar only, i.e. that no scribe has combined several exemplars (contaminatio), (2) that each scribe consciously or unconsciously deviates from his exemplar, i.e. makes peculiar errors" (Paul Maas, Textual Criticism, translated by B. Flower, p. 3). The first of these conditions will generally be true for obscure writings -- but it is no more true of the Iliad or the Aeneid than it is of the New Testament. As for the latter requirement, it makes scribes into badly-programmed computers -- they are not accurate, but are inaccurate in particular and repeatable ways. This can hardly be relied upon.

In addition, there is an unrecognized assumption in Maas's Point 1: That there is a "primary split" -- i.e. that the text falls into two and only two basic families. Bédier noted that the "agreement in error" method seems always to lead to trees with two and only two branches. (This is not as surprising as it sounds. First, it should be noted that most variants have two and only two readings. Thus a single point of variation can only identify two types. On this basis, if there are more than two types, the types which are more closely related, or merely more similar, will tend to be grouped as a single text-type. Thus when trying to seek new text-types, the first place to look is probably in the largest and most diverse of the established types. This is certainly true in the New Testament; the "Western" text has generally defied attempts to subdivide it, but the Alexandrian text often can be subdivided -- in Paul, for instance, the manuscripts called Alexandrian actually fall into three groups: 𝔓⁴⁶+B, Family 1739, and ℵ+A+C+33+81+1175+al. For fuller discussion, see the appendix on The Bédier Problem.)

The good news is, if we do somehow construct a stemma with more than two branches, things are easy from there: majority rules, and Maas says, "where the primary split is into at least three branches, [it is possible] to reconstruct with certainty the text of the archetype in all places (with a few exceptions to be accounted for separately)."
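
The "majority rules" principle can be put as a short Python sketch. It assumes the reading of each independent branch has already been reconstructed; the branch names and the Latin readings below are invented for illustration. It also shows why a two-branch stemma is less helpful: when the two branches disagree there is no majority, and the choice falls back on the critic's judgment.

```python
from collections import Counter


def archetype_reading(branch_readings):
    """Return the reading found in a strict majority of branches, else None.

    branch_readings maps each independent branch of the stemma to the
    reading reconstructed for that branch at one point of variation.
    """
    counts = Counter(branch_readings.values())
    reading, n = counts.most_common(1)[0]
    return reading if 2 * n > len(branch_readings) else None


# Three hypothetical branches: two agree, so the archetype is settled.
print(archetype_reading({"alpha": "regem", "beta": "regem", "gamma": "legem"}))
# regem

# Two hypothetical branches that disagree: the stemma alone cannot decide.
print(archetype_reading({"X": "regem", "Y": "legem"}))
# None
```

The votes are counted by branch, not by manuscript. However many copies a branch contains, it counts once, which is why C and E together weigh no more than B in the Wulfstan example.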

In any case, for most sorts of literature we cannot identify errors with the certainty that Lachmann could. As Moorman notes (p. 50), "For what passes in recension as science is in fact art and as such depends for its success upon the artistry of the editor rather than the accuracy of the method." E. Talbot Donaldson makes this point even more cogently in "The Psychology of Editors of Middle English Texts": "It is always carefully pointed out that MSS may be grouped together only on the basis of shared error, but it is seldom pointed out that if an editor has to be able to distinguish right readings from wrong in order to evolve a stemma which will in turn distinguish right readings from wrong for him, then he might as well go on using this God-given power to distinguish right from wrong throughout the whole editorial process, and eliminate the stemma. The only reason for not doing so is to eliminate the appearance -- not the fact -- of subjectivity: the fact remains that the whole classification depends on purely subjective choices made before the work of editing begins." The student who wishes to have a truly repeatable method must therefore be content to work from agreements in readings (which is slower but does not depend on any assumptions). This, if pursued consistently, is a more than adequate method (and it can be made to work even if our manuscripts are mixed, as Lachmann's were not). It can also, if a system of characteristic readings is used, identify multiple independent branches of the tree, even if two branches are more similar to each other than to a third branch.
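
Working from agreements in readings amounts to counting, which is what makes it repeatable. A minimal Python sketch follows. The manuscripts P, Q, R, S and their coded readings are invented for illustration, and the simple rate of agreement used here is only one of several possible measures; it does not attempt the "characteristic readings" refinement mentioned above.

```python
from itertools import combinations

# Hypothetical collation table: one column per point of variation, with the
# rival readings coded 1, 2, ...  None marks a place where a manuscript is
# defective.
READINGS = {
    "P": [1, 1, 2, 1, 2, 1],
    "Q": [1, 1, 2, 1, 1, 1],
    "R": [2, 2, 1, 1, 2, 2],
    "S": [2, 2, 1, None, 2, 2],
}


def agreement(a, b):
    """Fraction of variants, extant in both manuscripts, where they agree."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None  # no overlap, so no rate can be given
    return sum(x == y for x, y in shared) / len(shared)


rates = {
    (m, n): agreement(READINGS[m], READINGS[n])
    for m, n in combinations(sorted(READINGS), 2)
}

# Print the pairs from closest to most distant.
for pair, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(pair, f"{rate:.0%}")
```

In this invented table P and Q agree in five of six places and R and S in all five places where both are extant, while no pair drawn from different groups agrees in more than a third. No judgment about which reading is original was needed to see the two groups.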

Below: Perhaps the single most important manuscript of Wulfstan: Cotton Nero A I, bearing corrections perhaps by Wulfstan himself. This is the introduction to Homily XX, the Sermon to the English. Observe the Latin introduction -- and how distinct are the alphabets used for the Latin and the Old English!
The Latin preface reads (abbreviations expanded; note the interesting use of the chi-rho for "per"):
SERMO LUPI AD ANGLOS QUANDO DANI
MAXIME PERSECUTI SUNT EOS, QUOD FUIT
ANNO MILLESIMO .XIIII, AB INCARNATIONE DOMINI
NOSTRI IESU CRISTI
The five complete lines of the Old English text shown here are
Leofan men, gecnawað þæt soð is: ðeos worold
is on ofste, ₇ hit nealæcð þam ende, ₇ þy hit is,
on worolde aa swa leng swa wyrse; ₇ swa hit sceal
nyde for folces synnan, ær antecristes tocyme,
yfelian swyþe, ₇ huru hit wyrð þænne

(Note: There are cases where agreement in error is absolutely reliable. A classic instance is in Arrian. Here, one codex is missing a leaf, causing a lacuna. Every other known copy -- there are about forty -- proceeds from the last word on the page before the loss to the first word of the page after, with no indication of anything missing. Thus, one can be sure that all the manuscripts are descended from this one -- and that it lost the leaf before the others were copied. Observe that this is identical to the situation of F^p and G^p, which also have lacunae in common.)

(Additional note: It appears that this method has now been rendered truly reliable. Stephen C. Carlson's work on Cladistics seems at last to have rendered stemmatics mathematically coherent and repeatable.)

This is not entirely to dismiss agreements in error even in the New Testament tradition. I use agreements in error regularly in grouping Byzantine manuscripts. For closely-related texts such as those, it is a completely reliable method. The problem comes in when one moves away from the closely-related texts. Zuntz, for instance, classed 𝔓⁴⁶, B, and 1739 together based on what he considered shared errors. But looking at overall agreements makes this appear quite wrong: 𝔓⁴⁶/B and 1739 are separate types, and Zuntz's shared errors in fact give every evidence of being the original text!

It's worth stressing that there are instances where scholars have created inaccurate stemmata by the above means. The Middle English work Pierce the Ploughman's Creed (Piers Plowman's Creed) exists in three substantial copies. W. W. Skeat thought all three to be derived from the same original. A. I. Doyle offered strong evidence that this is not so. An even more absurd situation occurs in the homilies of Wulfstan. There are four extant manuscripts of Homily Xc (C, E, I, and B). N. R. Ker suggested that I contained marginalia in the hand of Wulfstan himself, and Dorothy Bethurum concedes that it offers "a more authoritative text of the homilies it contains than do any of the other manuscripts" -- yet she offers this stemma, which puts I and its marginalia at the end of the copying process:

     [Archetype]
          |
   -----------------
   |               |
  [X]             [Y]   <-- lost heads of manuscript families
   |               |
 ---------       ------
 |    |   \      |    |
 C    E    \    I*    B
            \   /
             \ /
             I**

A possible stemma, certainly, but what are the odds that Wulfstan would work on a third generation manuscript instead of a first or second generation copy?

Even if documents do descend from the same original, it cannot automatically be assumed that they are sisters as opposed to cousins at some remove. If manuscripts are sisters, then every deviation, be it as small as a change in orthography, must be explained. These requirements are much less strict for cousins, since there could have been work done on the intervening copies. It is much easier (and probably more accurate!) to produce a sketch-stemma than a detailed stemma -- and there is really no loss. If you know which manuscripts are descended from others, no matter at how many removes, the primary purpose of recensio has been served. (And it's worth noting that sketch stemmata are possible even for New Testament manuscript groupings such as Family 2138.)

Sometimes it will be found that recensio brings us back to a single surviving manuscript. For example, it is believed that all Greek manuscripts of Josephus's Against Apion are derived from the imperfect Codex Laurentianus (L) of the eleventh century. In this case we are, in effect, in the situation of having only one manuscript (or, in the case of Against Apion, one manuscript plus a Latin translation and extensive quotations from Eusebius, the latter two being the only authorities for a large lacuna in L and all its descendants). We proceed to the final stages (examinatio and emendatio) as described below.

(We should add a few footnotes to the above statement about it not mattering if there are intermediate generations between ancestor and descendant manuscripts; the statement is absolutely true only if the archetype manuscript is complete and entirely legible, and if all the descendants are immediate copies. If, for instance, the exemplar is damaged, even for just a few letters, we may need to turn to the copies to reconstruct it. This happens in the New Testament, e.g., with Codex Claromontanus and its copies. D/06 has lost its first few verses, and we use D^abs1 -- which has no other value -- to reconstruct them. Also, if manuscript B is not a daughter of manuscript A, but rather a granddaughter or later descendant, it may have picked up a handful of readings from mixture in the intervening steps. Although most places where B differs from A can be ignored as scribal errors, it is not proper to dismiss them entirely out of hand. Similarly, there may be marginal scholia in B which come from a different source, and may inform us of other readings.)

While some traditions will resolve down to a single surviving archetype, it is also common to find that all the manuscripts prove to derive from a lost archetype which is not the autograph. This is the case, for instance, with Æschylus. We have dozens of manuscripts all told (in fact, the number approaches one hundred) -- but they all contain the same seven plays or a subset. It appears that every extant manuscript derives its contents from a single manuscript of about the second century, which contained these seven and no others. (The later copies may include a few readings derived from other ancient manuscripts, but the plays they contain are based on that one manuscript.)

To critics accustomed to the riches of the New Testament, this may seem highly unlikely. But we should recall that most classical texts, including Æschylus and the other Greek dramatists, were the sole preserve of the educated -- used only in the schools to teach Attic grammar and the like (even a relatively small book cost the equivalent of a month's wage for a civil servant, and could be more; the tenth century Archbishop Arethas's copy of Plato cost 21 gold pieces when the annual salary was 72). In a number of cases of classical works, it is theorized that the ancestor of all copies was a lone uncial. In the ninth or tenth century, perhaps as a result of Photius's revival of learning, this uncial was transcribed into minuscule script. Since this transcription took real effort (the scribe had to determine accents, word divisions, etc.), all later copies would be derived from this one ninth century minuscule transcript. The only way multiple families would emerge is if two different schools transcribed their uncials. (Or, of course, if the text evolved after the ninth century, but given the limited number of copies made in that time, when the Byzantine Empire was much reduced and under severe stress, this seems relatively unlikely.) Even if other copies existed in Byzantine libraries, vast numbers were destroyed in the sacks of Constantinople in 1204 and 1453. (It is believed, in fact, that the Christian Crusaders who sacked Byzantium are more at fault than the Ottoman Turks who finally captured Constantinople in 1453. The Crusaders had no use for literature, while the Ottomans respected learning. In addition, real efforts were made to rescue surviving literature after 1204. So if an author's work was not made accessible in the years after 1204, it is probably because all copies had been destroyed by then.) Therefore, when confronted with a single lost manuscript, we reconstruct that archetype and then proceed to examinatio and emendatio.

But for documents which were widely copied (even if only a limited number of copies survive), we usually find more complex traditions, such as those shown here for Seneca's tragedies and Xenophon's Cyropædia. In these instances, there were a handful of early copies which spawned families of related manuscripts.

In these charts, extant manuscripts are shown in plain type and lost, hypothetical manuscripts are shown in [brackets]. Fragments are marked %.

        [Seneca's Autograph]
                |
       ------------------
       |                |
   [E-Group]        [A-Group]
       |                |
  -------------     -----------
  |     |     |     |    |    |
  E     R%    T%    α    ψ    A¹
  |
 [Σ]
  |
-----
|    |
M    N

            [Xenophon's Autograph]
                     |
  ----------------------------------------------
  |        |          |            |     |     |
 [x]      [y]        [z]           |     |     |
  |        |          |            |     |     |
-----    -----    ---------        |     |     |
|   |    |   |    |   |   |        |     |     |
C   E    D   F    A   G   H        r%    m%    π²%

This situation also occurs in New Testament manuscript families. (So there is actually some relevance to this.) For example, Von Soden's breakdown of Family 13 would produce a stemma like this (note that other scholars have given somewhat different, and perhaps more accurate, stemmata):

                          [Φ]
                           |
     ----------------------------------------------------
     |             |                   |                |
    [w]           [x]                 [y]              [z]
     |             |                   |                |
-----------     -------      ---------------------      |
|    |    |     |     |      |    |    |    |    |      |
13  788  69    1689  983    826  543  346  230  828    124

It should be noted that stemmata are not always this simple; families may have sub-families. Rzach, for instance, found two families in Hesiod's Theogony, which he labelled Ψ and Ω. But Ω, which consisted of seven manuscripts (to two for Ψ), had three subgroups, Ωa, Ωb, and Ωc.

This reminds us of Bédier's warning about finding only two branches, and also about making casual assumptions about the relationships of the groups. In the Theogony, can we be sure that the two manuscripts of Ψ actually form a group, or are they simply non-Ω manuscripts? (This problem is well known in other contexts: It's called "long branch attraction," where two specimens far from the main mass appear to converge simply because they're so different.) Do the three subgroups of Ω actually form a larger group, or are they simply closer to each other than to Ψ? There is no assured answer to any of these questions, but it reminds us that we must be careful in constructing our stemma. One should also be aware that new discoveries can affect the stemma. (This, in fact, can apply also in NT TC; the discoveries of 𝔓⁴⁶, 𝔓⁴⁷, and 𝔓⁷⁵ have all given us reason to re-examine the textual picture of the books they contain.)

Having determined the families, their nature must be assessed. This process has analogies in New Testament criticism (consider Hort's analysis of the "Western" and Alexandrian/"Neutral" types), except that in classical criticism it usually applies to precisely defined texts as opposed to Hort's less-well-defined text-types. (The difference being that the reading of a text, being derived from a single ancestor, can in theory be determined exactly; text-types properly speaking will not have a single ancestor, and so no pure original can be reconstructed. Text-types are a collection of similar manuscripts.)

Once the types have been assessed, it may prove that one or another group is so corrupt as to offer little more than a source of possible emendations. (This is almost the case with the families of Seneca shown above: The E text is regarded as clearly superior, so much so that A-group readings are rarely considered if the E group makes sense. This rule is also often applied, though unjustifiably, in Old Testament criticism, where the LXX usually is not even consulted unless the Masoretic Text appears defective.) But this situation where one particular family is universally superior is not usual; more often we find that each group has something to contribute -- though we may also find that different groups have different sorts of faults (e.g. one may be prone to omission, one to paraphrase, and another to errors of sight).

Once we have assessed the types, we proceed to the next step in the process....

Selectio

This phase of the critical process occurs only if recensio reveals two or more textual groupings more recent than the autograph. If we have only one manuscript, or if our manuscripts all go back to a single ancestor, selectio has no role to play. For selectio consists of choosing the most primitive of the surviving variants.

When we begin this process, we know our materials. Manuscripts have been grouped, their local archetypes more or less reconstructed, and their variants known. Now we must proceed to assess and choose between the variants.

Here one applies canons of criticism generally similar to those applied to the New Testament, though there are exceptions. So, for instance, we still accept the rule "that reading is best which best explains the others." And obviously the same basic scribal errors (homoioteleuton, etc.) still occur. But in secular works, one is unlikely to see the piling on of divine titles one often observes in the Bible (so, e.g., if a Greek author refers to "the Lord," it is hardly likely that a scribe will expand it to read "the Lord Jesus Christ"). Similarly, there is little likelihood of assimilation to remote parallels such as we find in the Gospels and Colossians (although assimilation to local parallels can and does occur). And, of course, there is no Byzantine text to influence the tradition (though there may, in some limited instances, be some equivalent sort of majority text that affects other manuscripts).

For all that we apply canons of criticism here, the usual approach is a sort of "modified majority" process (rather like the American electoral system, in which each congressperson is elected by a majority in that person's district, and laws are passed by a majority of those congressmen -- meaning that a law can actually be passed despite being opposed by the majority of the general electorate). Consider the following provisional stemma of nine manuscripts M, N, O, P, Q, R, S, T, U. The manuscripts A (the archetype), B, C, D, and E are all hypothetical (indicated by square brackets about the letters).

                   [A]
                    |
       --------------------------
       |                        |
      [B]                      [E]
       |                        |
  -----------                   |
  |         |                   |
 [C]       [D]                  |
  |         |                   |
-----     -----         -----------------
|   |     |   |         |   |   |   |   |
M   N     O   P         Q   R   S   T   U

Now suppose we have two readings, X and Y. Assume these two are equally probable on internal grounds. Assume that X is read by M, N, P, and R, while O, Q, S, T, and U have reading Y. Thus, Y is the majority reading. However, reconstruction indicates that X is actually the correct reading. How do we determine this? We follow these steps:

Observe that M and N agree (this is the only subgroup where all the manuscripts agree). Therefore C had reading X, since this is supported by both M and N.
Observe that C agrees with one of the manuscripts of the D group (in this case, P). This implies that the original reading of D was X, in agreement with C, and that the reading of B was therefore X.
Observe that B agrees with one of the manuscripts of the E group (in this case, R). This implies that the original reading of E was X, and that the reading of A was therefore X.

The above is not absolutely certain, of course. If reading X could have arisen as an easy error for Y, then Y might be original. Or there might be mixture -- the eternal bugaboo of critics -- involved. Intelligence and critical rules must be applied. But the above shows how a text can be reconstructed where critical rules are not clear. Whatever rule we use for a particular reading, we eventually reconstruct the set of readings we believe to have existed in the archetype.
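The three steps above can be sketched in code. What follows is only a minimal illustration, assuming a tradition with no mixture; the names STEMMA, READINGS, attested, resolve_up, fill_down, and reconstruct are hypothetical and belong to no existing tool. The rule it encodes is the one used in the steps: a group whose manuscripts disagree is left unsettled until a reading already settled in a sibling branch is found in at least one of its members.

```python
# Hypothetical encoding of the provisional stemma above: parent -> children.
STEMMA = {
    "A": ["B", "E"],
    "B": ["C", "D"],
    "C": ["M", "N"],
    "D": ["O", "P"],
    "E": ["Q", "R", "S", "T", "U"],
}

# Readings of the nine extant manuscripts in the worked example.
READINGS = {"M": "X", "N": "X", "P": "X", "R": "X",
            "O": "Y", "Q": "Y", "S": "Y", "T": "Y", "U": "Y"}


def attested(node, stemma, readings):
    """Return every reading found in an extant manuscript at or below node."""
    if node in readings:
        return {readings[node]}
    found = set()
    for child in stemma[node]:
        found |= attested(child, stemma, readings)
    return found


def resolve_up(node, stemma, readings, out):
    """Bottom-up pass: settle a node's reading when one branch's settled
    reading is also attested in every other branch; otherwise leave None."""
    if node in readings:
        out[node] = readings[node]
        return out[node]
    children = stemma[node]
    settled = [resolve_up(c, stemma, readings, out) for c in children]
    seen = [attested(c, stemma, readings) for c in children]
    candidates = set()
    for i, reading in enumerate(settled):
        if reading is None:
            continue
        others = [seen[j] for j in range(len(children)) if j != i]
        if all(reading in s for s in others):
            candidates.add(reading)
    out[node] = candidates.pop() if len(candidates) == 1 else None
    return out[node]


def fill_down(node, stemma, readings, out, inherited=None):
    """Top-down pass: an unsettled group takes its parent's reading,
    provided at least one of its own manuscripts attests it."""
    if out[node] is None and inherited in attested(node, stemma, readings):
        out[node] = inherited
    for child in stemma.get(node, []):
        fill_down(child, stemma, readings, out, out[node])


def reconstruct(stemma, readings, root):
    """Run both passes and return a dict of node -> reconstructed reading."""
    out = {}
    resolve_up(root, stemma, readings, out)
    fill_down(root, stemma, readings, out)
    return out


if __name__ == "__main__":
    result = reconstruct(STEMMA, READINGS, "A")
    for node in "ABCDE":
        print(node, result[node])
```

Run on the example, the sketch assigns X to every hypothetical ancestor even though Y is read by five of the nine manuscripts. It makes no attempt to weigh internal probability or to detect mixture, which is exactly where the editor's judgment has to take over.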

When this is done, we have achieved a provisional text -- the earliest text obtainable directly from the manuscripts. It is at this point that Biblical and classical textual criticism finally part ways. As far as Biblical TC is concerned, this is usually the last step -- though Michael Holmes has argued ("Reasoned Eclecticism in New Testament Textual Criticism," published in Bart D. Ehrman & Michael W. Holmes, editors, The Text of the New Testament in Contemporary Research, p. 347) that there is no fundamental reason why New Testament criticism must stop here. The general opinion of New Testament critics was expressed by Kirsopp Lake in this way (The Text of the New Testament, sixth edition revised by Silva New, pp. 8-9): "In classical textual criticism, the archetype of all the extant MSS. is often obtainable with comparatively little work, but often is very corrupt. There is therefore scope for much conjectural emendation. In Biblical textual criticism, on the other hand, it is still doubtful what is the archetype of the existing manuscripts. But at least we may be sure that it is an exceedingly early one, with very few corruptions, and therefore the work of conjectural emendation is very light, rarely necessary[,] and scarcely ever possible."

Thus it is only in classical criticism that we proceed to...

Examinatio

This process consists, simply put, of scanning the text for errors. This step, though it may be distasteful, and certainly difficult, is necessary. Classical manuscripts were no freer of errors than were Biblical manuscripts, and are often further removed from the archetype, meaning that there have been more generations for errors to arise. So the scholar, armed with knowledge of the language and (if possible) of the style of the writer, sets out to look for corruptions in the text. If they are found, the editor proceeds to...

Emendatio

If examinatio consists of looking for errors, emendatio (also known as divinatio) consists of fixing them. This, obviously, requires the use of conjectural emendation. This is no trivial task! Take the Anglo-Saxon epic Beowulf as an example. The Chickering text (Howell D. Chickering, Jr., Beowulf, Anchor, 1997) includes about 280 readings not in the manuscript (of which some 200 are conjectural emendations), and other editors have proposed many emendations not adopted by Chickering. The case of the Old English poem "The Seafarer" is even worse: in 124 lines of four to ten words each (usually toward the lower end of that range), the edition of I. L. Gordon adopts 22 emendations (I. L. Gordon, The Seafarer, Methuen's Old English Library, 1960). Thus the effort involved in correcting these texts can often be greater than that of simply comparing manuscripts.

One will sometimes see the final stage, of constituting or seeking for the original text, referred to as constitutio.

Of course, the way one proceeds through the four steps of classical criticism depends very much upon the actual materials preserved. We say, for instance, that emendatio is the final step in the process. But it should use the results of the other steps. The variants at a particular point, for instance, may give a clue as to what was the original reading. If, for example, we were to find two variants, "He went to bet" and "He went too bad," a very strong conjecture would be that the original was "He went to bed." Therefore we must perform each step based on the materials available. Nor is emendation a trivial task. To repair a damaged text requires deep understanding of the language and the author's use of it (a better understanding than is required simply to read the text; when reading, you can look up a word you don't know. How can you look up a word which may not even exist?). It also requires great creativity -- and knowledge of all the materials available. The following sections outline various scenarios and how critics proceed in each case.
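The "bet"/"bad" example can be made concrete with a rough measure. The sketch below is an illustration only: it assumes that plain character edit distance is a fair stand-in for scribal error (it is at best a crude proxy, blind to palaeography, sense, and style), and the candidate list and the function names edit_distance and rank_conjectures are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rank_conjectures(variants, conjectures):
    """Order conjectures by their total distance from the attested
    variants, smallest total first."""
    return sorted(conjectures,
                  key=lambda c: sum(edit_distance(c, v) for v in variants))


# The two attested variants from the example above.
VARIANTS = ["He went to bet", "He went too bad"]

# Hypothetical candidate emendations, invented for illustration.
CONJECTURES = ["He went too far", "He went to bid", "He went to bed"]

if __name__ == "__main__":
    for conjecture in rank_conjectures(VARIANTS, CONJECTURES):
        total = sum(edit_distance(conjecture, v) for v in VARIANTS)
        print(total, conjecture)
```

On these inputs "He went to bed" stands one edit from the first variant and two from the second, while the variants stand three edits from each other. A conjecture that sits between the attested readings in this way explains both of them more economically than either explains the other. A measure of this kind can only shortlist candidates; choosing among them remains a matter of language and judgment.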

It is perhaps worth noting that not all textual critics have been equally skilled in all of these areas. A good example is J. R. R. Tolkien in his role as a professor of philology, in which he edited several texts. His work on Sir Gawain and the Green Knight, which existed in only one manuscript, was considered excellent; there was literally no one in the world who understood the Gawain-poet's dialect and viewpoint better than Tolkien, so he was able to edit the text with an insight no one else could match. But when he set to work on Chaucer's "Reeve's Tale," he turned into the most radical of radical eclectics, basically ignoring manuscript relationships and adopting any reading he could find that conformed to his notions of fourteenth-century dialects, and even occasionally emending when he didn't see the reading he thought correct. Having many manuscripts didn't bring him closer to the original text; in effect, it gave him more chances to fiddle with it. On the other hand, Manley and Rickert, who are the only editors to have tried to truly classify the dozens of manuscripts of Chaucer's Canterbury Tales, spent so much effort classifying manuscripts that their text was perhaps not as well-considered as it should have been. Thus, although the tendency historically has been to have one person edit a text, a strong case could be made, when there are many manuscripts, for having two, or even three, editors of a text, one in primary charge of recensio and selectio and one primarily responsible for examinatio and emendatio. If there are three, one might even split recensio and selectio, with a good solid mathematician type in charge of recensio, a true linguist in charge of examinatio and emendatio, and someone with intermediate skills in charge of selectio.

Books Preserved in One Manuscript

In terms of steps required, this is the easiest of the various sorts of criticism. There is no need for recensio or selectio. One can proceed immediately to examinatio and emendatio.

But there are complications. For one thing, when there is only one manuscript, one is entirely dependent upon that manuscript. There is nothing to fall back on if the manuscript is illegible. And this can be a severe problem. In the case of Beowulf, the only surviving manuscript was burned in the Cotton Library fire, and is often illegible. So we are largely dependent on two transcripts made some centuries ago, both of which have problems of their own. Other manuscripts may present equivalent or even greater difficulties. The manuscript may be a palimpsest. Or it may use a non-standard orthography. In a handful of instances we may not even be able to read the script of the original (e.g. the Greek Linear A writings, but also some Persian inscriptions and even Old English writings in odd forms of the runic alphabet). Thus it is more likely, in the case of a single-copy text, that the scholar will have to pay particular attention to the seemingly simple task of just reading the manuscript.

The second problem of texts preserved in a single copy is that we have no recourse in the event of an error. If a Biblical manuscript has lost a line, we can determine its reading from another copy. But if the ancestral copy of the Antigone has lost a line (and we can tell that it is missing because the surrounding lines make nonsense), how can we correct it? I use this as an example because this is a case where we can show this happened; the text of Antigone 1165-1168 makes nonsense in all the manuscripts. We know the correct reading only because Eustathius's commentary preserves the missing line. In the case of multiple manuscripts, even if all of them have an error, the nature of the mistakes may tell us something about the original. Not so when there is only one copy.

The sole manuscript of Beowulf, Cotton Vitellius A.xv. The first page of the poem. The photograph, digitally adjusted to increase legibility, still shows the scorch marks at the bottom of the page; the outer margin has also been eaten away by the fire, with some loss of text (corrections in []). For a better view of the actual manuscript, see the British Library site. The first seven lines of the text read as follows (the *, equivalent to a raised point in the Old English, indicates the end of a metrical line; these are not always marked in the manuscript; where they are not, {*} is used; suspended letters are spelled out. The word division matches the manuscript as I read it, though modern editions consider this defective):

HWÆT WE GARDE
na in gear dagum * þeod cyninga
þrym ge frunon {*} huða æþelingas elle[n]
fre medon * oft scyld scefing sceaþe[na]
þreatum monegum mægþum meodo setla
ofteah {*} esgode eorl syððan ærest wear[ð] {*}
fea sceaft funden he þæs frofre geba[d]

Thus the task of editing a book preserved in only one manuscript is arguably the most complex and difficult in textual criticism, for the scholar must reconstruct completely wherever the scribe has failed. We have already seen that these manuscripts often need vast numbers of emendations. They also require particularly clever ones.

There is a minor variation on this theme of emendation in the case of works which exist in only one manuscript, but for which we also have epitomes or other works based on the original source. (An example would be the portions of Polybius which overlap the surviving portions of Livy. Livy used Polybius, often quoting him nearly verbatim but without identifying the quotations.) These secondary sources can supply readings where the text is troubled. However, since the later sources are often rewritten (this is true even of the epitomes), and may be interpolated as well, it is usually best to use them simply as a source for emendations rather than to use them as a source of variant readings.

This theme has a variation in the case of editions copied from other editions: This applies in the case of Malory above, and also some of Shakespeare's plays, where we have two semi-independent editions. Caxton surely consulted British Library Add. 59678, but he must have consulted something else, too, even if it was only his own head. In the case of Shakespeare, we can take A Midsummer Night's Dream as an example: There are two texts, the quarto (properly, the first quarto, but the second quarto was copied from the first quarto and is not an independent witness) and the folio, copied from the second quarto but with corrections seemingly from an authoritative second source. The interesting question here, then, is how authoritative is the text in the places where our two sources agree: Does this agreement have as much strength as an instance where two genuinely separate sources agree (meaning that we trust the joint reading as much as a reading supported by two different manuscripts), or is it a case where one corrector or another didn't notice a divergence? This question, unfortunately, has no simple answer -- but one needs to bring it up and be aware of the problem.

Another variation is the criticism of inscriptions. Although an inscription is, of course, the original inscription, it is not necessarily the original text. When Darius I of Persia ordered the making of the Behistun inscription, he certainly didn't climb the rock and do the carving himself -- rather, he composed a message and left it to the workers to put it on the rock. Thus the inscription will generally be a first-generation copy of the original. This is still much better than we expect for literary works -- but it is not the original.

Still another variation is the Gilgamesh Epic. This exists in multiple pieces, recensionally different, in multiple languages, from multiple eras, with some of the later versions incorporating material originally separate, and not one of the major recensions is complete. Here one has to step back from the problem of deciding how to reconstruct and first settle what to reconstruct.

Books Preserved in Multiple Manuscripts

This is the case for which Lachmann's technique is best suited. It is ideal for traditions with perhaps five to twenty manuscripts, and can be used on larger groups (though it is hardly practical if there are in excess of a hundred manuscripts).

We begin, of course, with recensio. This can have three possible outcomes:

All manuscripts are descendants of a single manuscript, which survives. In this case we simply turn to that manuscript, and proceed to subject it to examinatio and emendatio.
All manuscripts are descendants of a single manuscript now lost. In this case we reconstruct the archetype (this will usually consist simply of throwing out errors, since all the manuscripts have a recent common ancestor), and proceed as above, subjecting this reconstructed text to examinatio and emendatio.
The manuscripts fall into two or more families. In this case, we proceed through the full process of selectio, examinatio, and emendatio.

Books Preserved in Hundreds of Manuscripts

This is an unusual situation; very few ancient works are preserved in more than a few dozen manuscripts. But there are some -- Homer being the obvious example. (Another leading example, the Quran, is rarely considered as a subject for textual criticism. At least one major edition of the Quran, in fact, was not even taken from manuscript; it was compiled by comparing the recitations of 20 or so Quranic scholars. The primary tradition of the Quran is considered to be oral, not written.) The Iliad, which is preserved in somewhat more than 600 manuscripts, is believed to be the most popular non-religious work of the manuscript age. (Of course, it should be noted that the works of Homer were regarded as scripture by the Greeks -- but certainly not in the same way that the New Testament was regarded by Christians!)

In the handful of cases where manuscripts are so abundant, of course, the stemmatics used for most classical compositions become impossible. We have the same problem as we do with the New Testament: Too many manuscripts, and too many missing links. We are forced to adopt a different procedure, such as looking for the best or the most numerous manuscripts.

Since the methods used are fundamentally similar to those used for New Testament criticism, we will not detail them here. It is worth noting, however, that most critics consider the Byzantine manuscripts of Homer to be more reliable than the assorted surviving papyri. The papyri will occasionally contain very good readings -- but in general they seem to contain wild, uncontrolled texts, whereas the Byzantine manuscripts reflect a carefully controlled tradition, presumably going back to the Alexandrian editors who standardized Homer.

This fact should not be taken to imply anything about New Testament criticism; the situations are simply not parallel. But it serves as a reminder that a late manuscript need not be bad, and an early one need not be good. All must be judged on their merits.

It should be noted that computers have been used to work on a stemma for Chaucer manuscripts, which number in the dozens if not in the hundreds. So there is hope that we will see stemmata for texts such as Homer.

Books Preserved in Multiple Editions

A special complication arises when books are preserved in multiple editions. This is by no means rare; an author would often be the only scribe available to copy his own work, and should he not have the right to expand it? (We may even see a New Testament parallel to this in the book of Acts, where some have thought that the author produced two editions, one of which lies behind the Alexandrian text and the other behind the text of Codex Bezae.) Even authors who were not their own scribes would often expand their work. The Vision of Piers Plowman, for instance, exists in three stages (perhaps even four, though the fourth is probably a prototype and was not formally published). The first stage, known as "A," is 2500 lines long, and does not appear to have been finished. Some years later the "B" text, of 4000 lines, was issued (this is the text most often published and is the basis for most modernized versions). A final recension, the "C" text (only slightly longer, but considered to be of poorer quality) followed a few years later. All were probably by the same author (though this is not certain), but it is believed that, in revising the "B" text to produce the "C" version, the poet used a manuscript that was produced by a different scribe. What became of the original copy of the "B" text is unknown; perhaps it was presented to a patron.

Near-contemporary but not really a witness: Piers the Plowman, the upper portion of folio 9 in Cotton Vespasian B.xvi (cited in critical editions as "M"). It is thought to date from the late fourteenth century; since the "C" text dates from around 1385 and the author died within a year or two of that date, the manuscript was copied within a generation of the author's death. But, since it is a revision of the "C" text, itself a revision of the "B" text, it tells us nothing useful about the "A" text shown in the stemma below.

Note how different this hand is from the Anglo-Saxon hands used for Beowulf and Wulfstan. Note also the elaborate use of coloured inks: the red dots to indicate line breaks, the red in the first letter of almost every line, the coloured first letters of sections, and the marginal squiggles which also mark section breaks. Finally, observe how very different is the hand scribbling in the margin.

This also poses a problem for the scholar working on a stemma. The edition of the "A" text of Piers Plowman by Thomas A. Knott and David C. Fowler, for instance, gives the following stemma (somewhat simplified), with actual manuscripts denoted by upper case letters (sometimes with subscripts or two-letter abbreviations) and ancestors in lower case letters:

            |----x-----V H
            |
Archetype---|----y--|--y1----T H₂ Ch D
            |       |
            |       |--y2----U R T₂ A M H₃
            |       |
            |       |--y3----W N Di
            |       |
            |       |--------I
            |       |
            |       |--------L
            |
            |----------------B text

Thus, for Piers Plowman, a later recension must be used as one of the three witnesses to the earlier recension -- a practice which, if we were to do it in another context, we would not call "reconstruction" but "contamination" (or, if we want to make it sound nicer, "harmonization").

Even more curious is the case of the Old English poem The Dream of the Rood, which exists in a long form, in the Roman alphabet, in the tenth century Vercelli Book, and in a much shorter form, in a runic script, inscribed on the eighth(?) century Ruthwell Cross. (In this instance it is not really clear what the relationship between the texts is.)

We could cite many other instances of works existing in multiple editions (e.g. Julian of Norwich; for that matter, we know that even Josephus issued multiple editions of his works). Indeed, there is a modern equivalent, even if I hate to mention it: Consider the movie, which often has a "studio cut" and a "director's cut." But citing examples is not our purpose here; our interest is in what we learn from these examples.

In addition to editorial work, multiple editions can come about as the result of ongoing additions to a document. This typically occurs in chronicle manuscripts. The Anglo-Saxon Chronicle, for instance, begins with a core created by King Alfred of Wessex (reigned 871-899). But from then on, the various foundations maintaining it kept their own records, often comparing the documents. In addition, a new foundation might make a copy of an older Chronicle then add its own additions (so, for example, with Chronicle MSS. A and A²). And, since the Chronicle was updated sporadically, it is theoretically possible for a manuscript to be "its own grandpa" -- the first part of A² is copied from A, but later parts of A might (barely possibly) be derived at some removes from A² or another lost descendant. To add to the fun, the manuscript A is in a different dialect of Anglo-Saxon from all other Chronicle manuscripts. The different recensions cannot be considered translations -- the dialects were still one language -- but adjustments had to be made to conform the text in one dialect to the idiom of another.

When multiple editions of a work exist, of course, it is not proper to conflate the editions to produce some sort of ur-text. The editions are separate, and should be reconstructed separately. The question is, to what extent is it legitimate to use the different editions for criticism of each other?

Although the exact answer will depend on the circumstances, in general the different editions should not be used to edit each other. (They can, of course, be used as sources of emendations.) They may be used as witnesses for one or another variant reading -- but one should always be aware of the tendency to harmonize the different editions.

Textual Criticism of Lost Books

At first glance, textual criticism of a lost book may seem impossible. And in most cases it is; we cannot, for instance, reconstruct anything of Greek tragedy before Æschylus.

But "lost" is a relative term. The "Q" source used by Matthew and Luke is lost, but scholars are constantly reconstructing it. The situation is similar for many classical works. Consider, for example, the Egyptian historian Manetho. We have absolutely nothing direct from his pen. So much of his work, however, was excerpted by Eusebius and Africanus (and sometimes by Josephus) that Manetho's work still provides the outline of the Egyptian dynasty list.

This is by no means unusual; many classical works have perished but have been heavily excerpted. Polybius is another good example. Of his forty-volume history, only the first five books are entirely intact (we also have a large portion of book six, and a few scattered fragments of the other books). But most of the information from Polybius survives in the writers who consulted him -- Livy and Diodorus used him heavily, and Plutarch and Pliny occasionally.

The problem in Polybius's case -- as in Manetho's -- lies in trying to determine what actually came from the original author and what is the work of the redactor. (We can perhaps grasp the scope of the problem if we imagine trying to reconstruct the Gospel of Mark if we had only Matthew and Luke as sources.) This is made harder by the fact that the redactors often introduced problems of their own. (A comparison of Africanus's and Eusebius's use of Manetho, for instance, shows severe discrepancies. They do not always agree on the number of kings in a dynasty, and they often disagree on the length of the reigns. Even the names of the kings themselves sometimes vary.)

Thus it is often possible to recover the essential content of lost books. However, one should never rely on the verbal accuracy of the reconstructed text.

There are variations on this theme. When the second part of Don Quixote was long delayed, an enterprising plagiarist published a continuation in 1614. This was not an actual work of Cervantes (who published his correct continuation in 1615), but it is thought to have been based at least in part on a manuscript Cervantes allowed to circulate privately. The result is at least partly genuine Cervantes -- but not something the author wanted published, and not entirely in his own words, either.

Other differences between Classical and New Testament Criticism

We have already alluded to several of the differences between Classical and New Testament criticism: The difference in numbers of manuscripts, the use of stemmatics, etc. There are other differences which must sometimes be kept in mind:

The Age of the Manuscripts. Our earliest New Testament manuscripts are very close to the autograph. Based simply on its age, it is theoretically possible (though extremely unlikely) that 𝔓⁵² is the autograph of the gospel of John. Certainly it is only a few generations away from the original. Even the great uncials B and ℵ are only a few centuries more recent than the autographs. Manuscripts of the versions or their recensions may be even closer to the original -- as, e.g., theo of the Vulgate may have been prepared under the supervision of Theodulf himself.
Such near-contemporary manuscripts are extremely rare for classical works (with the obvious exception of documents written in the few centuries before the invention of printing). While we often have very early manuscripts of classical works, they are still many years removed from the originals (e.g. the earliest manuscript of the pseudo-Hesiodic Shield of Heracles is P. Oxyrhynchus 689 of the second century -- a very early copy, but likely 500 or more years after the composition of the original). The problem is less extreme for some post-Biblical works (e.g. we have seventh-century manuscripts of Gregory of Tours, who wrote in the sixth century), but even these works usually exist only in very late copies. Related to this is:
The Possibility of an Autograph. Geoffrey of Monmouth's Historia Regum Britanniae exists in some 200 copies (a sad testament to the tendencies of ancient scribes, since this is a piece of bad fiction disguised as history). The book was written probably shortly before 1140. Three copies are individually dedicated to Earl Robert of Gloucester (died 1147), who may have been Geoffrey's patron; to King Stephen (reigned 1135-1154), and to Stephen's close supporter Galeran of Meulan (died 1166?), respectively. Could one of these be the autograph? Or at least an autograph -- a copy in Geoffrey's own hand? The editions at my disposal don't say one way or the other -- but there is no obvious reason why it couldn't be so.
The Evolution of the Language. Languages change with time, and manuscripts can change with them. In Greek, the obvious example is the disappearance of the digamma (ϝ). We know that Homer used this obsolete phoneme, and Hesiod seems to have used it as well (though it was less important by his time). But our extant manuscripts do not preserve it. The scholar who reconstructs an early Greek text must therefore be careful to note the possible effects of its disappearance.
This effect can also be seen, to some extent, in the New Testament (e.g. in the form of Atticising tendencies). However, the mere fact that the New Testament was the New Testament kept this sort of modernization to a minimum. (See also the next item.)
There are variations on this theme -- notably changes in the alphabet. Gregory of Tours records that the Frankish King Chilperic attempted to add four new letters to the (Roman) alphabet, and ordered books written in the old alphabet to be erased and rewritten (HF V.44). This attempt at linguistic revision did not succeed -- but it may well have resulted in the destruction of important manuscripts and in less-accurate copies of others.
Something similar certainly happened with ancient Greek literature. In the early Classical period, there were numerous versions of the Greek alphabet. Some of the differences were just graphical -- e.g. the Ionic alphabet used a four-stroke sigma (ᛊ) while the Attic used a three-stroke sigma (Ⰽ). But some were more significant: The Ionic alphabet had used the letter Omega, but the Attic didn't, and Corinth used M for the s sound. It wasn't until 403/2 B.C.E. that Athens formally adopted the Ionic alphabet, and some older writers probably continued to use the Attic alphabet for some time. Thus the earliest copies of most of the Greek tragedies, and very likely Homer and Hesiod as well, were originally written in alphabets other than the Ionic, and had to be converted. This means, first, that there could be errors of visual confusion in the text based on both Ionic and Attic forms, and second, that there could have been errors in transliteration between the alphabets.
The Semitic languages show another version of this: The addition of vowels. Each language added vowel symbols at different stages in its development, often imperfectly at first (e.g. Jacob of Edessa's system of Syriac vowels included only four symbols).

The illustration above shows a very simplified diagram of the evolution of most current alphabets. Solid lines indicate direct descent, dashed lines indirect descent. Any change in alphabetic form (including more minor ones such as changes in handwriting style, not shown here) will likely affect the history of the text of a manuscript. Above illustration adapted from page 255 of the article "The Early Alphabet" by John F. Healy in Reading the Past: Ancient Writing from Cuneiform to the Alphabet.
Dialect and Spelling. It's quite certain that modern NT editions do not use the actual orthography of the original autographs. However, there is a recognized dialect and set of spelling rules for koine Greek. Thus, except in the case of homonyms, there is no question of how to reconstruct a particular word.
Not so in non-Biblical works! If the manuscripts are any indication, Chaucer did not use consistent spelling -- and even if he did, there were no conventions at the time, and his spelling would not match that of Gower or Langland or the Gawain-poet. Indeed, Chaucer and the Gawain-poet used dialects so different as to be almost mutually incomprehensible. And a particular copyist might personally speak a different dialect than the one used in the work he was copying, and so misunderstand or alter the text. We see this also in Herodotus, who evidently wrote in his own Ionic dialect with some ancient forms. In the manuscripts, however, we find forms "that it seems unlikely Herodotus could ever have written" (Concise Oxford Companion to Classical Literature, p. 265).
This imposes two burdens on the critic. First, there is the matter of properly reconstructing the original. Then there is the matter of orthography. Should one use the orthography in the manuscripts? Should one reconstruct the author's orthography (which may differ substantially from that found in the manuscripts)? Should one use an idealized orthography? An idealized dialect? What if the manuscript exists in two dialects (as, e.g., happens with most Old English works preserved in multiple copies)? There is no correct answer to this, but the student must be aware of the problem.
This can get really interesting when combined with the problem of different recensions. Piers Plowman, as noted above, exists in three recensions, all of which exist in multiple copies. But several of these manuscripts have been modified to conform to a particular dialect. It is possible, under certain circumstances, that the modifications in dialect could cause texts of different recensions to come closer together, which could confuse the manuscript stemma. (We see hints of this in the case of the Old Church Slavonic version as well, as this version has undergone steady assimilation toward the developing South Slavic dialects.)
In some traditions (particularly French literature) there has been a tendency to use dialects as a critical tool -- i.e., if a document exists in multiple dialects, then the manuscript(s) in the author's original dialect must be closest to the original. This may be true in some instances, but is far from assured. The manuscripts in the original dialect may have suffered severely in transmission, while one of the translated works may have been carefully preserved apart from that. Or the manuscripts in the original dialect may possibly have been subjected to double translation, in which case they are no guide to the original language. In neither case can we be sure of the value of manuscripts in the original dialect.
The state of the Early Printed Editions. For the New Testament, we have no real need to refer to either Erasmus's text or the Complutensian Polyglot, which are (for all intents and purposes) the only early editions. We have all of Erasmus's manuscripts. We don't know the manuscripts behind the Polyglot, but the text contains very little in the way of unusual readings. If these editions had not existed, we would be no worse off (indeed, given the regrettable influence exercised by the Textus Receptus, we probably would be better off if they had not existed). Not so with classical works! Early editions of Josephus seem to be based on manuscripts no longer known. Caxton's and Thynne's editions of Chaucer tend to represent witnesses which no longer survive -- usually not the best types, but not valueless, either. The case is similar for many other works. Scholars, therefore, should examine ancient editions with some care to see if they add to our knowledge.
Books which Occupied More than One Volume. The New Testament, of course, is commonly divided into four separate sections, Gospels, Acts and Catholic Epistles, Paul, Apocalypse. These sections have separate textual histories, and sometimes even the books within the sections have separate histories. Because the books are relatively short, however, and were usually copied in codex form anyway, there are few if any instances of Biblical books being subdivided and the individual sections having separate textual histories. Here again the rules are different for non-Biblical documents. Many of the manuscripts of Josephus's Antiquities, for instance, contain only half the work -- and even those which contain both halves may be copied from distinct manuscripts of the two halves. The halves may well have separate textual histories. Scholars must be alert for such shifts.
The Language of the Scribe. Most copies of the New Testament were made by scribes whose native language was Greek (usually Byzantine rather than koine Greek, but still Greek). There are exceptions -- L, Θ, and 28; also perhaps some of the polyglot manuscripts -- but these were exceptions rather than the rule. By contrast, most of our copies of Latin manuscripts were made by scribes whose native language was not Latin. They knew Latin -- but it was church rather than Classical Latin, and in any case it was a second tongue. So one should always be aware of the errors an Italian scribe, say, would make in copying a classical work (and be aware that a French or English or Spanish scribe might make different errors).
In addition, there were polyglot manuscripts. There is, for instance, the British Museum manuscript Harley 2253, containing items in French, Latin, and Middle English. The scribe clearly had familiarity with all three languages (by no means unusual for an educated English scribe around 1340), but there is no certainty that the scribe's copying methods or sources were the same for the three different languages.
The Conversion from Oral Tradition. The New Testament originated in written form, so it never had to make the painful transition from oral tradition to a written text. But other documents assuredly did -- and may have changed in the process. Homer is the most obvious example, but most languages have parallels, from Beowulf to the plays of Shakespeare (where some of the earliest printed editions seem to have been made from actors' memories) to Grimm's Fairy Tales. In a few cases, there was also the problem of inventing an alphabet to take down the tradition. Orally transmitted material is not transmitted in quite the same way as written (see the article on Oral Transmission). In addition, it leaves a textual problem: Does one attempt to reconstruct the version that was originally taken down, or the original oral composition? (This is another of those unanswerable questions.)
The Need to Reconstruct from Fragments. We have many, many continuous manuscripts of the New Testament. If a new manuscript turns up, we need but fit it into the fabric of the surviving tradition.
This need not be so with classical works. We may well have multiple fragmentary manuscripts, with no complete copy to put the fragments in place.
Perhaps even worse is the case where we have a fairly complete copy, but with no indication of order. (This can happen, e.g., when a scroll is recovered from the wrappings of a mummy. It can also happen with a palimpsest, particularly if, as sometimes happened, the page numbers of the original writing were written in a coloured ink and did not adhere well to the paper.)
The problem of Spurious Additions. There is significant debate about doctrinal modifications of the text of the New Testament. However, it is generally conceded that, with the possible exception of the text of Codex Bezae and the lost New Testament of Marcion, the New Testament documents did not undergo significant rewriting. They were sacred, not to be modified.
Certain scribes felt free to modify classical texts, however. And if, as often happened, this modified text was the basis for all surviving copies, we have no way to tell from the manuscripts that the passage is spurious. An obvious example is the famous reference to Jesus in Josephus (Antiquities XVIII.iii.3, or XVIII §63-64 in the Loeb enumeration). This passage cannot be original as it stands; it calls Jesus the Anointed -- and then spends three more sentences on him and ignores him thereafter. At the very least, the declaration of his Christ-hood must be spurious, and probably the whole passage. But it occurs in all manuscripts, and Eusebius was aware of it. So it is a very early insertion. Less certainly spurious, but even more difficult, is the ending of Æschylus's drama "The Seven Against Thebes." This drama comes to a logical tragic conclusion with the death of Eteocles -- whereupon we are presented with another 125 lines featuring Antigone, Ismene, and the Chorus. It is widely (though not quite universally) believed that this section -- over 10% of the play -- is spurious.
Normally we might say that the authenticity of such a passage is not a problem for the textual critic. But it can be. For instance, the Antigone/Ismene section of the Septem requires a third actor (the Herald/Messenger). This is the only portion of the Septem to use a third actor. Logic says that, had Æschylus been writing a three-actor play, he would have made better use of him than this! So if the final section is original, we need to examine the rest of the play to find a role for the third actor (keeping in mind that the speakers are not marked in the copies, only the change of speakers). This will affect our reconstruction of the play. (See the next point on Missing Elements.)
Missing Elements in specialized documents. A New Testament is complete in and of itself. It doesn't need anything except the text. But a drama, for instance, consists of more than just the text spoken by the actors. It also includes such things as stage directions and indications of who is the speaker. Our sources, however, often do not include such elements. This is true of the earliest Greek dramas (a change in speakers is marked with a special symbol, but the speaker generally is not indicated), but continues until quite recent times. Although the speakers are marked in the "Second Shepherd's Play" of the Wakefield Cycle of mystery plays, there are only four stage directions, in Latin; they are not sufficient to explain the action. This continues to be a problem, to a lesser extent, even in Shakespeare.
Once again, it is not the task of the textual critic to reconstruct the stage directions (which may never have been written down) or the speakers; that must be done after the text is established. But a knowledge of who is doing what can be essential in choosing between variants.
Stage indications are not the only thing which can be missing from a manuscript. Music is another obvious example. For poetry, there are also line and stanza divisions -- while printing poetry in this way is a modern invention, the line-and-stanza structure is ancient. And in non-metrical verse, it is not always obvious where line breaks fall. Correct reconstruction can be very important in cases such as Old English alliterative poems. If the line breaks are not correctly placed, one may not be able to tell which is the alliterating letter, meaning that errors can propagate for many lines and perhaps force bogus conjectural emendations. See also the item on Drawings and other non-textual contents.
Metrical or Other Poetic Corrections. Much of classical literature is poetic, following particular conventions of metre and perhaps rhyme. If a scribe encountered a reading which appeared unmetrical (perhaps due to changes in the language; see the section above), he/she might change it. Such a change, if done well, may be undetectable -- but a poor change may require emendation. This requires great sensitivity to the original author's style and dialect. (One should also note that scribes may have been more sensitive to errors in metre or rhyme than the authors they were copying.)
A special case of this is the so-called vitium Byzantinum. Byzantine poetry resembled classical tragedy in using a twelve-syllable line. But the metre was different: The Byzantine poets were expected to place a stress on the penultimate syllable of a line, while the tragedians faced no such expectation. Scribes seem often to have adjusted the tragic texts to meet the Byzantine standard (possibly unconsciously). Even prose was somewhat affected by such conventions; sentence breaks in the Byzantine era were expected to be marked by several unstressed syllables. Thus we find many earlier works adjusted to meet these later stylistic rules.
Other rules may apply to poetry. For example, early poetic works in the Germanic languages used the alliterative metre -- each line consisted of four feet, each foot having a stressed syllable and varying numbers of unstressed syllables, with a slight pause (caesura) between the first two and the final two feet. At least two, and usually three, of the stressed syllables had to alliterate. But there were variations on this basic design. Some poems required more exact numbers of syllables. Others had more precise alliteration schemes (e.g. one scheme might allow only two syllables to alliterate, one on each side of the caesura, while stricter schemes might not only require three stressed syllables but require a pattern such as aa/ax). A scribe used to one particular alliterative style might conform a work in a different style.
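The difference between a loose and a strict scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four stressed syllables of a line are taken as already identified and reduced to their initial sounds, each sound is a plain string, and the special rule that any vowel may alliterate with any other vowel is ignored. The function names alliteration_pattern and meets_scheme are hypothetical.

```python
def alliteration_pattern(stressed_initials):
    """Label four stressed initials, e.g. ['g', 'g', 'g', 'h'] -> 'aa/ax'."""
    if len(stressed_initials) != 4:
        raise ValueError("expected four stressed syllables")
    counts = {}
    for sound in stressed_initials:
        counts[sound] = counts.get(sound, 0) + 1
    labels = []
    assigned = {}
    for sound in stressed_initials:
        if counts[sound] < 2:
            labels.append("x")  # this stress alliterates with nothing
        else:
            if sound not in assigned:
                # first repeated sound becomes 'a', the next 'b'
                assigned[sound] = "ab"[len(assigned)]
            labels.append(assigned[sound])
    # The slash marks the caesura between the second and third feet.
    return "".join(labels[:2]) + "/" + "".join(labels[2:])

def meets_scheme(stressed_initials, strict=False):
    """Loose: one alliterating stress on each side of the caesura.
    Strict: exactly the pattern aa/ax."""
    pattern = alliteration_pattern(stressed_initials)
    if strict:
        return pattern == "aa/ax"
    first_half, second_half = pattern.split("/")
    return "a" in first_half and "a" in second_half

print(alliteration_pattern(["g", "m", "g", "h"]))
print(meets_scheme(["g", "m", "g", "h"]))
print(meets_scheme(["g", "m", "g", "h"], strict=True))
```

A line of the shape ax/ax passes the loose test but fails the strict one. That is just the sort of line a scribe trained in the stricter style might be tempted to "repair."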
Corrections of offensive passages. A Christian scribe might well regard the works of, say, Aristophanes or Ovid as obscene. There was doubtless a temptation to bowdlerize.
Evidence of this happening is surprisingly slight. We do not find cleaned-up copies of Aristophanes. This trend seems to be more modern. But there are copies of Herodotus which omit an account of sacred prostitution (I.199). So if there are two major traditions, and one contains an account of something sexually explicit or offensive, while the other omits it, chances are that the account which includes it is original.
Drawings and other non-textual contents. A geometrical treatise obviously could be expected to contain pictures. And such a drawing, unlike a picture, could contain text. (It might also contain line segments which would extend into the text, and affect its meaning -- e.g. by crossing an omicron and turning it to a theta, though this is not very likely.) These captions could sometimes wander from the drawing into the text. Much the same is true of a work on geography if it contained maps. There is also the problem of assuring an accurate rendering of the original drawing -- a task where the rules of textual criticism are less applicable. (The whole problem is not helped by the fact that many Greek mathematical works survive only in Arabic translations.) A skilled scribe might not be a skilled artist, or vice versa.
There actually is a textual criticism of diagrams. For example, there are two versions of a drawing found in Archimedes's Sphere and Cylinder. A "collation," so to speak, is shown at right. We have five early copies of the diagram. All are effectively the same in most particulars. All involve a circle with seven points around the perimeter (in order, Α Ζ Η Β Θ Λ Γ), and a single point Ε near (but probably not at) the middle, plus a series of lines and arcs connecting them. But some also include a line (shown here in red) between Α and Β. This line, as it turns out, has no part in the proof, and some of the manuscripts omit it.
It appears that all surviving manuscripts of Archimedes go back to just three collections of material, all from the Byzantine period, known not surprisingly as A, B, and C. A and B are lost, although we have some information about their history. C survives, in the form of the Archimedes Palimpsest, which unfortunately is very hard to read and which does not seem to have been copied. Only recently has the manuscript been subjected to the full array of scientific tests -- some indeed being invented to deal with it.
A, B, and C did not contain the same material. Much that is of great value was copied only into C, and much of it sadly has been lost -- those particular pages were not included when the parchment was rewritten and used for a prayer book. The diagram shown here is from Sphere and Cylinder, which was included in A and C. There apparently exist four copies of this work derived from Codex A. Two include the questionable line AB, two omit it. Reviel Netz therefore thinks Codex A included it, because it is more likely to be omitted (since it has no meaning in the proof) than added. This makes sense -- but, since the line is omitted in C, it seems more likely that both A and C lacked it, as did Archimedes's original. Presumably some scribe copied it in by accident (there are a lot of lines in the drawing, after all!), and failed to erase it, and the version with the line propagated.
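The counting argument in the preceding paragraph can be laid out as a small Python sketch. It is only one way of expressing that argument, under simplifying assumptions: each witness is reduced to a single reading ("line" or "no line"), the four copies descended from Codex A count equally, and C is treated as an independent witness. It does not model Netz's point about which change is more probable. The function names are hypothetical.

```python
from collections import Counter

def reconstruct_family(readings):
    """Majority reading among a family's copies, or None if they are tied."""
    ranked = Counter(readings).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # the copies are evenly split
    return ranked[0][0]

def reconstruct_archetype(family_copies, independent_witness):
    """Combine a family's copies with one witness outside the family."""
    family_reading = reconstruct_family(family_copies)
    if family_reading is None:
        # The family cannot decide, so the outside witness breaks the tie.
        return independent_witness
    if family_reading == independent_witness:
        return family_reading
    return None  # family and outside witness disagree: undecided

copies_from_a = ["line", "line", "no line", "no line"]
reading_of_c = "no line"
print(reconstruct_archetype(copies_from_a, reading_of_c))
```

With the copies of A split two against two, the reading of C decides the matter. Had three of the four copies included the line, this simple count would return no verdict at all, and the critic would be thrown back on internal arguments of the kind Netz offers.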
Thus we see how the textual criticism of the diagram and the textual criticism of the text itself interact. Knowing the stemma of the manuscripts, we can reconstruct the diagram. But there may be times when it is easier to figure out the history of the diagrams and reconstruct the history of the manuscripts based on that.
Spurious conflations of books. This isn't necessarily a problem just of non-Biblical works; many New Testament books have been accused of being assembled from various pieces. But it isn't the NT critic's job to reconstruct the pieces which made up 2 Corinthians, or to recreate the J, E, P sources of Genesis. For the textual critic, the task is simply to recreate the canonical work.
The case is more complex for non-Biblical works. Chrétien de Troyes, for instance, died before he could finish his Perceval, and it seems to many that another hand filled it out by using a Gawain epic of Chrétien's. This presumably required a certain amount of glue to work. Detecting and dealing with this is primarily the task of the literary critic -- but since the two parts may have circulated separately to some extent, they may also have influenced the textual tradition.
We also see simple continuations -- additions to a work that was, at the time, considered finished or at least up-to-date. These too may involve complications. Two separate authors wrote continuations for Chrétien, for instance; we must be alert to interactions. Obviously there are continuations in the Bible also (most would regard Mark 16:9-20 and John chapter 21 as examples; even conservatives admit a continuation at the end of Joshua -- though more liberal critics would dispute this example). But while these are continuations, they generally are pre-canonical continuations (with the possible exception of the ending of Mark), and hence of no concern to textual critics.
We see a very strange instance of this in the Old English poetic paraphrase of Genesis. This, it can be shown, consists of two parts, following different poetical rules. The so-called "Genesis B" fragment is a translation and adaption of a German poem. This is enclosed within "Genesis A," which tells the rest of the Genesis story. It is by no means clear how the two came to be conflated -- or what effect the conflation had on the two poems.
The problem of translations. We encounter this, to some extent, in the New Testament versions -- but there the problem is rather different. For all their peculiarities, the versions will try to translate their underlying text accurately.
Many translations of secular works are not as secure. Alfred the Great's Old English translation of Boethius's Consolation of Philosophy, for instance, was actually an expanded adaption. There are also poetic translations of romances, such as the Middle English Ywain and Gawain, derived from a French work by Chrétien de Troyes. The Middle English romance cannot mechanically follow the French; since it is a poetic translation, it must heavily adapt the original. Yet this confronts us with at least the possibility (though perhaps not the likelihood) of interaction between original and translation. This might affect spellings of names and other minor details -- but it could also lead to interpolations or, less probably, a shortening of the text.
This can be particularly a problem when one work inspires another, but the newer work is not a direct translation. For example, the Middle English romance Sir Launfal was inspired by Marie de France's Breton Lai Lanval, but it is a retelling, not a translation. Yet might not Marie's original have influenced the result? When the translated work exists in only one copy, this might happen. And when both source and translation exist in only one, things really get complicated!
Abbreviations. In the Bible, there are only a handful of abbreviations, generally quite standard: The Nomina Sacra, a handful of suspended letters, the occasional symbol for και. But every language will have its own set of abbreviations, and these may well cause some confusion. To take a trivial example, an English scribe confronted with the abbreviation "Geo." would expand it as "George," while a Scot might read it as "Geordie," and the Russian-born physicist George Gamow insisted that it was a nickname, "Joe." This must always be kept in mind in dealing with manuscripts. The text before you may not even contain any abbreviations -- but perhaps an ancestor did.
The problem of incompetent ancient editors. Not all editions of classical works were produced by modern editors; ancients did it too. New Testament scholars will have some familiarity with this from the problems of Vulgate textual criticism (as with, e.g., the edition of Alcuin), and may also be familiar with the Lucianic text of LXX -- but the problem can be much more severe in classical writings. Juvenal, for instance, is perhaps the most-copied Latin author of antiquity (some 500 copies survive in whole or in part) -- but the vast majority of these are believed to derive from a single incompetently-executed edition containing many errors. Only one important manuscript (P) is regarded as independent of this tradition. This puts Juvenal in a state arguably worse than an author for whom only two (good) witnesses survive, simply because the editor who stands behind the majority of manuscripts was so bad.

At this point it is perhaps worth quoting another passage from Reynolds & Wilson (page 212):

[Rules such as the above] will inevitably give the impression that textual criticism is a tidier and more cut-and-dried process than it proves to be in practice. While general principles are undoubtedly of great use, specific problems have an unfortunate habit of being sui generis, and similarly it is rare to find two manuscript traditions which respond to exactly the same treatment.

Appendix I: Textual Criticism of Modern Authors

Most of the preceding discussion has been directed toward writings which, in broad outline at least, have histories similar to the New Testament: Written in manuscript, and copied one at a time by scribes, with most of the copies being lost.

It should be noted, however, that there is a form of textual criticism practiced on works written since then, though it is a very different sort of subject. The difficulty is that a printed copy of a book, or even the author's autograph, may not really represent the author's actual intentions. (Compare the case of Malory described above, where Caxton much expanded from the manuscript.) A modern example of this is noted by Jerome J. McGann in A Critique of Modern Textual Criticism (Virginia, 1992), p. 59. He offers as an example Byron's poem The Giaour. This had an extraordinarily complex history, with most "states" of the text surviving:

First draft: 344 lines
Fair copy by the author: 375 lines
Printed trial proof: 453 lines
First edition: 684 lines
Second edition (not corrected by Byron): 816 lines
"Third" edition, first run: 950 lines
"Third" edition, second run: 1014 lines
Fourth edition (not corrected by Byron): 1048 lines
Fifth edition: 1215 lines
Sixth edition: (lineation not noted)
Seventh edition: 1334 lines

And so on, through fully fifteen editions in a very short span of time (supposedly 14 editions between 1813 and 1815).
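The scale of this growth is easier to see with a little arithmetic. The following Python sketch simply takes the line counts listed above (the sixth edition is left out because its lineation is not noted) and computes the increase at each step. The variable and function names are illustrative only.

```python
# Line counts as listed above; the sixth edition is omitted because
# its lineation is not recorded.
states = [
    ("First draft", 344),
    ("Fair copy", 375),
    ("Trial proof", 453),
    ("First edition", 684),
    ("Second edition", 816),
    ("Third edition, first run", 950),
    ("Third edition, second run", 1014),
    ("Fourth edition", 1048),
    ("Fifth edition", 1215),
    ("Seventh edition", 1334),
]

def growth_steps(listed_states):
    """Return (from, to, lines added) for each pair of neighbouring states."""
    steps = []
    for (name_a, lines_a), (name_b, lines_b) in zip(listed_states, listed_states[1:]):
        steps.append((name_a, name_b, lines_b - lines_a))
    return steps

steps = growth_steps(states)
largest = max(steps, key=lambda step: step[2])
overall_ratio = states[-1][1] / states[0][1]

print(largest)
print(round(overall_ratio, 2))
```

The largest single jump among the listed states is the 231 lines added between the trial proof and the first edition, and the seventh edition is nearly four times the length of the first draft.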

Now it should be obvious that the first and fourteenth editions aren't really "the same," and a textual critic shouldn't be reconstructing one with reference to the other. But there is another question: What did Byron intend each edition to look like?

This is an even more complicated question, because of orthographic considerations. Particularly in the early era of printing, there was no standardization of spelling or punctuation. We see faint vestiges of this even today -- e.g. Americans refer to workers collectively as "labor," the British refer to them as "labour." Again, newspapers tend to omit the serial comma ("I went to work, the store and home") while higher-end books tend to include it ("I went to work, the store, and home").

And authors often expected their publishers to help them in this regard. Sir Walter Scott wanted his writings to be "de-Scotticised" by the publisher. Byron's works were overseen by Mary Shelley, who introduced corrections both orthographic and substantial -- and Byron accepted a very high fraction of these changes, implying that he desired the help.

Or take A. E. Housman. He wrote many more poems than he published. In his will, he allowed his brother to publish any additional poems from his notebooks which Laurence Housman thought good enough to publish (while allowing him to make minor alterations). Laurence probably published more than his brother would have liked, but he did cut out and destroy much of his brother's manuscript book -- and then fiddled somewhat with the poems he published. So what, exactly, is the original of those poems? A. E. Housman's unfinished version? Laurence Housman's published version? Some ideal version toward which one or the other was striving?

Thus, even the author's final draft was not necessarily regarded as final in the author's mind. So what does one reconstruct?

And even if one has decided what to reconstruct, does it follow that one should actually retain that form? Should an American version of Byron or Housman, e.g., use the spelling "labour"?

This is a hot topic in textual criticism of modern works; it is the whole and entire subject of the McGann work cited above. (Though I must confess that I never figured out what McGann actually wanted to see happen. Thomas Tanselle's take is that McGann doesn't think authorial intent matters. There is apparently a school with which McGann is associated which thinks that texts only have meaning while their writer is alive to be able to fiddle with them, which obviously has peculiar implications for New Testament criticism, not that they're worried about that! Just for their reference -- yes, I regularly revise works like this as I gather more information. But, McGann and Co., I published this thing. I published it because I want it out there. Might the next version be better? Yes, if I live long enough and don't develop dementia. But I published because I'm satisfied and this work is real and deal with it, OK?)

Thomas Tanselle came up with what I would consider a definitive answer to this: "I would... claim that the initiator of a discourse can be identified as a historical figure (whether or not his name is known...), distinct from others because he is the initiator... and that the task of attempting to segregate his contributions from those of others is therefore one legitimate scholarly pursuit" (quoted in Vincent McCarren and Douglas Moffat, editors, A Guide to Editing Middle English, p. 51). In other words, applying this to the New Testament context and looking at all the people who are fascinated by states of the text, yes, the text of Bezae has interest; yes, the ancestor of K^r has interest, but the original has a special and unique place and should not be treated as the same as all of its descendants!

I suspect, however, that the issue is not of much interest to NT critics. (I know it isn't of much interest to me!) NT editions necessarily create their own punctuation (derived perhaps in part from a manuscript -- see the article on Copy Texts), and the tendency is also toward modern orthography.

A matter somewhat more serious (to my mind) is the case of Bishop Percy's Reliques of Ancient English Poetry (1765 and later editions). This was an annotated edition of poems Percy collected here and there, most particularly from a manuscript of the previous century which he had saved from burning.

The manuscript, however, was mutilated, and much of what was still intact had nonetheless been damaged in transmission, and several of the pieces were indelicate. So Percy included pieces not from the manuscript, omitted much that was in the manuscript -- and heavily rewrote it all.

The result, frankly, was a botch. Many versions of traditional songs are defective, and it is accepted that an editor who wishes to prepare a song for singing must sometimes conflate or rewrite. But this still leaves three obligations: The editor should admit to rewriting, the editor shouldn't produce garbage, and the editor should not hide the manuscript (as Percy did), so that later editors can produce diplomatic editions or propose their own emendations.

Take it as given that Percy failed in all three of his tasks. But what should be done instead? About a century after Percy performed his butchery (an ironically successful butchery, since the Reliques was the most popular collection of tradition-based ballads to that date), a later editor produced a revised edition. What should he have done? Replaced Percy's hacks with the original manuscript versions? Printed Percy's version with footnotes? Something else?

There is no answer, really -- but it reminds us of just how bad an editing job can be. Percy's edition was not useful to scholars because it was too heavily edited, and was no use to ordinary people because it was too badly edited, and at no point said what it did. Whatever else modern critics do, they really need to learn the Percy Lesson.

Appendix II: History of Other Literary Traditions

Note: This is not a history of literature, nor an account of literary criticism. It is simply a very brief account of the manuscript history of non-Biblical traditions. (Limited by what I myself know or can find out about these traditions. The primary sources for most of the shorter entries are David Crystal's An Encyclopedic Dictionary of Language and Languages and the Encyclopedia of Literature edited by Joseph T. Shipley, though I have consulted fuller literary histories for most of the longer entries. I have attempted to cover all current European languages, though examining the remaining languages of the world is beyond either my powers or the scope of this article. Yes, I know this is unfair; a language such as Persian, e.g., has inscriptions from Biblical times, and a large literature, and its speakers have influenced Biblical history. But I have to draw the line somewhere.) For that matter, even deciding what constitutes a language is difficult; the definitions are as often political as linguistic. Czechs and Slovaks, for instance, can understand each other, but their languages are called distinct. Different dialects of Italian, by contrast, are mutually incomprehensible but labelled as one language.

Knowledge of the history of literature in a language can be helpful in reconstructing the history of manuscripts. Our understanding of the history of the New Testament text, for instance, is strongly influenced by the manuscripts which have survived. We have a handful of early manuscripts from Egypt, then a very quiet period in the sixth through eighth centuries, from which little of significance survives, then a great flowering beginning with the ninth century.

Latin literature and manuscripts have a history somewhat like that of the New Testament, though the dates are later, and there is no early phase. There are effectively no Latin manuscripts from the papyrus era (apart from those buried by Mount Vesuvius, of course); the areas where Latin was spoken generally did not have a climate suitable for long-term survival of papyri. We have some inscriptions, but few are literary.

The transition from uncial to minuscule happened somewhat earlier in the Latin than in the Greek tradition; the west, which was poorer than the Greek East, probably felt the need for a smaller hand at an earlier date. In any case, we see attempts at literature in minuscules as early as the seventh century. By the late eighth century, the Carolingian Minuscule became dominant, and uncials all but died out.

The Carolingian period also saw the first real revival in Latin learning. Old texts were unearthed and recopied; most of our oldest manuscripts are from this period.

The impoverishment that followed the breakup of Charlemagne's empire saw literary productions decline, but there was another revival in the twelfth century. This was the heyday of Latin literature in Christendom, and the single richest period for Latin manuscripts.

The Romance Languages, naturally, have a much shorter literary heritage. Although tongues such as French and Italian were starting to take form by Charlemagne's time, a literature requires more than that: It requires both authors and copyists. Monks, at this time, were still concerned with Latin literature, and few if any vernacular writers seem to have existed.

While a language recognizably French appears to have existed by the ninth century, French literature has a complex history, as France remained a nation of semi-independent counties until the fifteenth century. (The French king was overlord of Normandy, Burgundy, Brittany, etc. -- but hadn't the strength or authority to control the dukes who ran those fiefs. At best, he was allowed to name a new Duke if the old line died out.) Language and culture were by no means united. So the earliest important French writing was the Song of Roland, regarded as the earliest (and certainly the best) of the chansons de geste. It is believed to date from around the beginning of the twelfth century, and other chansons date from somewhat later in that period. Also from the twelfth century (probably the latter half) is Marie de France (so named, it is thought, because of her birthplace; she seems to have worked in England), a writer of romantic fables (lais). At the same time, the flood of romances (many of them, ironically, connected with the legendary British King Arthur) began to appear. Few of these, however, survive in many copies. Even the Roland exists in only one significant manuscript, Oxford, Bodl. Lib. Digby 23, which seems to have been copied by an Anglo-Norman scribe. (There are many later manuscripts, but they are all so bad that the critical editions tend to work simply by emending the Digby text.) Similarly, there is only one complete manuscript of Marie's lais: British Museum Harley 978. A large subset, nine, are found in a Paris manuscript, Bibliothèque Nationale nouv. acq. fr. 2168, also from the thirteenth century. There are a handful of other fragments, all from the thirteenth and fourteenth centuries. It seems likely enough that the compositions survived primarily because they are so recent.

We tend to think of France as the country of French-speakers, but a significant minority still speaks Provençal (also known as Languedoc, and known to linguists as Occitan). Although Provençal is a minority language in France, many of the traditions we regard as French are actually Provençal; in its early form (known since the tenth century), it was the language of the troubadours who created the "courtly love" mythology. The tongue itself was much more important in the past; today, northern French is imposed on southern children in the schools, and Provençal is a sort of street language comparable to Braid Scots in Scotland. It flourished until the fourteenth century, but came under pressure thereafter (probably in part as a result of the Hundred Years War; many of the southern French had preferred English rule and the French government wanted to bind them more closely to France). The earliest written manuscript is a fragment of the Boeci, thought to have been written around the year 1000. Another fragment, the Life of Saint Fides, was copied at about that time. Then came William IX, Count of Poitiers (who lived around 1071-1127), the so-called first Troubadour. Although only about a dozen of his works survive, Provençal literature becomes common starting from him -- starting, of course, with the Courtly Love lyrics of poets such as Bernart de Ventadorn (mid-twelfth century).

It is not really proper to speak of Spanish literature of the manuscript era; for much of this period, the Iberian peninsula was in Moslem hands (Granada, in the south, was not conquered until 1492). And even once Christians reclaimed the area, they formed separate principalities (Aragon, Castile, Leon, Navarre). Thus, properly, we should refer to either Iberian literature or the literature of the individual nations -- though almost no one does so. It was not until 1469 that Ferdinand of Aragon married Isabella of Castile (with Isabella reigning from 1474 in Castile and Ferdinand from 1479 in Aragon), at last forming a united Spain. (And even this nation was not united administratively, and did not have a single monarch until 1516, when Charles I -- who was also the Holy Roman Emperor Charles V -- succeeded his grandfather Ferdinand, setting aside his mother Juana "the Mad.") There are, of course, manuscripts from Spain -- such as the excellent Vulgate manuscripts cav and tol, plus some Visigothic fragments -- but these properly fall under other headings.

Still, we have documents from this era. The earliest vernacular Spanish writings (as opposed to writings in late Latin) seem to be law codes from about the tenth century. We do not find actual literature in Spanish until about the twelfth century. From about this time come three epic romances: the Poema del Cid (Cantar de Mio Cid, about the Castilian Rodrigo Diaz de Vivar, died 1099), written about 1140 (although it survives entire in only one manuscript, it is considered the great early example of Spanish literature; we also find extremely large portions of it quoted in later chronicles); the Crónica Rimada; and the Roncesvalles (a translation and adaptation of the French Song of Roland), also surviving in a single manuscript. All of these are evolved works, hinting that there are older epics, but they are lost. From this time, we see increasing volumes of literature in all categories (epic, drama, poetry, etc.).

Portuguese is now spoken primarily in Brazil, which has a far larger population than Portugal itself, but of course the language did not reach that nation until after the invention of printing. Portugal itself has had a complex history, occasionally being united with Spain; the two languages have influenced each other. The famous Portuguese explorers also brought home many loan-words. The basic language, however, remains fairly close to the Latin from which it sprang. There is a strong literary tradition starting from the twelfth century (the earliest dated inscription comes from 1189); the songs of the troubadours, the most important part of the tradition, come from the next century. These have a complex history, written separately and combined, with many of the anthologies lost, but they may have cross-fertilized. Portuguese is especially closely related to Galician, spoken primarily in the northwest corner of Spain north of Portugal (the two did not split until after Portugal became an independent country and the western Iberians were largely cut off from each other). Distinctly Galician literature is, however, rare and largely confined to the period after the development of printing and the split with Portuguese; although there are cultural hints of a Celtic history in the region, this has not affected the language or literature.

Catalan was for much of its history the official speech of Aragon (a small country which came to be incorporated into the larger Catalan state but retained the name Aragon because Aragon had kings and Catalonia only counts), but it is now the forgotten Romance language -- it's almost the only Romance speech not to be official somewhere. It is spoken primarily in northeastern Spain and surrounding areas (e.g. into the eastern French Pyrenees; the primary city of Catalan Spain is Barcelona). Catalan speakers have been oppressed at various times in Spanish history (as recently as under Franco), which has resulted both in the destruction of texts and in a strong tendency to conform to Spanish. Still, there are literary remains going back to about the twelfth century, and chronicles starting not much after -- and the fact that Aragon and the County of Barcelona came to be dominated by Castile, and that Catalan texts and speakers have been abused, means that there is much need for textual reconstructive work.

Even more thoroughly ignored is Corsican, spoken by only a few hundred thousand people on the island of that name. Although Corsica has been governed by France for more than two centuries, it is a language with Italian roots (closest to Tuscan). It has, however, no real literature (Corsica long remained a land of subsistence farmers and shepherds), particularly from the manuscript era.

Sardinian has been written since the eleventh century, but has only a small literature; the language (which is close to Italian, and also said to be closer to vulgar Latin than any other Romance language) has several dialects, none dominant, and it has never been an official language even on its home island.

Ladinic is the usual name for a Romance language spoken primarily by Jews. As such, it has a fairly large literature, though much of it is fairly recent. The tradition is confused by the fact that both Hebrew and Roman alphabets have been used for it.

The name "Ladinic" is also sometimes used for the fourth official language of Switzerland, but the correct name is Romansch or Rhaetian or Rhaeto-Romansch. It has several dialects, influenced variously by Italian and French. The earliest writings date from the twelfth century, but the small number of speakers has kept the tradition small.

It was Dante who truly put vernacular Italian literature on the map (though he wrote in Latin as well as Italian, his great work, the Divine Comedy, was the first major work of Italian vernacular literature, and written not many centuries after the first hints of Italian writing in the tenth century -- that earliest writing being scribbles in the margins of Latin documents. We have some verse fragments from the twelfth century, but their dialect seems to indicate that they were dead ends). So great was Dante's influence that Boccaccio, the second great light of Italian literature, adopted almost all of Dante's techniques. Dante did not invent everything he did -- his slightly older colleague Guido Cavalcanti, for whom Dante wrote the Vita nuova, pioneered a great deal. Dante, however, was the great voice who spread the literature to the wide world. Like Boccaccio, Petrarch (the popularizer of the sonnet) wrote in the period immediately after Dante (Petrarch too was of Florentine ancestry, though born outside that city). Dante, Cavalcanti, Petrarch, and Boccaccio, however, wrote only a few centuries before the invention of printing. Thus the Italian manuscript tradition presents few interesting features. In addition, Italy, like Spain, was not united until long after the invention of printing -- in this case, the nineteenth century. The Divine Comedy is not really Italian literature (except in its use of the vernacular language); it is the language of one of the city-states (even today, some of the Italian dialects are mutually incomprehensible; Received Italian is based on the Tuscan dialect of Florence (i.e. the Dante dialect), but about half the population does not speak this form as a native language; there are also minority languages. Francis of Assisi, for instance, wrote extensively in his local Umbrian dialect). There was thus no national Italian literature in the manuscript era.

Widely separated from the other Romance languages is Rumanian. This has caused it to develop unusual features -- e.g. it adds articles as suffixes to nouns, and of course has many Slavic loan words. The language presumably evolved away from Latin very early, but the earliest writings seem to date from the sixteenth century, and these were confined to official documents and liturgical works. Even then, Slavic alphabets were used for several centuries.

Some texts will speak of Moldavian as a separate Romance language, but this is one of those political distinctions, since Moldova, prior to independence, was long part of Russia. Moldavian is really a dialect of Rumanian (with some Russian loan words) written in the Cyrillic alphabet, with no real literature from the manuscript era.

Dalmatian, which died out as recently as the end of the nineteenth century, was also a Romance language, but seems to have left little literature. (This is fairly typical of Balkan area languages.)

Romani (Romany, Gypsy), despite its name, is not a Romance language; its origin is something of a mystery although it has been attributed to the Indo-Aryan group. The language is very diverse, and tends to take on local attributes. When written, it tends to use the local alphabet. Romani literature, however, is oral; there is little if any need for textual criticism.

Greek Literature never went into as much of a decline as Latin, so we do not see as much of a revival. The strongest period of copying, however, is not that different; many of our earliest manuscripts date from the ninth to eleventh centuries. The Photian Revival of the ninth century is no doubt at least partly responsible. After the eleventh century, the decline begins. The Battle of Manzikert (1071) began the long slow Byzantine retreat which ended with the fall of Constantinople in 1453. The worst destruction, however, was wrought by Christians, not Turks. The Fourth Crusade sacked Constantinople in 1204, and many of its treasures were either destroyed at that time or carried off to Western libraries where they were forgotten. (One wonders if this might not be how Vaticanus found its current home.)

It is interesting to note that, for both Greek and Latin literatures, there is something of a break following the third century. Until this time, authors freely and regularly quoted works such as the Epic Cycle and the lost plays of the Athenian dramatists. Following the third century, this becomes much rarer. Extremely diligent authors such as Photius will occasionally produce something from a lost work, but the strong majority of quotations are from works which still exist today. This cutoff is so strong and so obvious that scholars have speculated that the surviving works are part of some sort of official curriculum, with works outside that curriculum being ignored. (The problem with this theory is that there is absolutely no other evidence for it. The likely explanation is just the general decline of the Roman Empire.)

Russian literature really gives us very little to work with. There was not even a Russian/Slavic alphabet until the creation of the Old Church Slavonic version. Even then, there was little to write down (a fact which is to a significant extent responsible for our ignorance of early Russian history); Russia, more than almost any nation in Europe, was a land of poor peasants and wealthier but equally ignorant aristocrats. It also suffered outside disruptions -- the sack of Kiev in 1170, the Mongol and Tatar invasions, the later sack of Novgorod and the other battles for Russian unification. The problem is made that much worse by the various dialects of the language. (We truly do not know the extent to which early Russian differed from Old Church Slavonic.) Histories do not begin to speak of Russian literature until the eighteenth century. Prior to that, there were church manuals and a few chronicles and the like (starting from the twelfth century), but little else save the letters of Tsar Ivan IV (Ivan the Terrible, 1530-1584). From the manuscript era, there is little original literature except for saints' lives and monastery annals. The latter hardly need textual criticism. The former may have suffered more modification -- but in this case, the modifications may be of as much interest as the original text.

The situation is similar for most of the eastern Slavic languages (in the areas where the Orthodox church held sway). The situation is perhaps even worse for the western Slavs; since these regions were Catholic, they used the Latin Bible, and had no vernacular translation to inspire a literary tradition. Slovenian, for instance, is said not to have had any literature at all until the nineteenth century.

Interestingly, textual criticism continues to be an active need in some of the Slavic languages to this day. Because of the Habsburg Empire's lack of respect for its subject peoples, writings in these tongues were often published very casually. A classic example is Jaroslav Hasek's The Good Soldier Schweik, written after the First World War though including elements from the period before the war. Hasek's manuscript (written in Czech, though with bits of German) is incomplete, the two early editions differ substantially, and Hasek (who died in 1923) had no real part in either. (He was dictating almost to the day of his death, and exercised little control over the volumes which actually appeared in print.) Thus there is a real need for a critical edition of this famous twentieth century writing. This is all the more ironic in that Czech as a language (as opposed to a dialect of East Slavonic) did not emerge until the sixteenth century; had there been free publication in the Habsburg Empire, all true Czech works could have been published by their authors, and there would be little need for textual work. But government opposition was strong -- in no small part because much Czech literature was anti-Catholic. The literary impulse was largely a belated reaction to the work of Hus, who tried to regularize Czech orthography and conform the language to that of the people. From about 1350 to 1500, the period when Czech was becoming a distinct language, effectively all Czech works were religious and Hussite. Hus's orthography eventually came to be widely accepted -- but, with the Habsburgs trying to suppress Czech aspirations, it took a long time for it to receive universal acceptance. A side effect of this is that many Czech writers, such as Comenius, had to work outside the Habsburg empire (Comenius, properly Jan Amos Komensky, worked in Poland, Sweden, and Holland; printers there naturally had some troubles with his works).

The situation for Slovak is even worse. Almost indistinguishable from Czech (the two are fairly mutually intelligible, and might be considered one were it not for political reasons -- the Czech regions of Bohemia and Moravia were under Austrian control in Habsburg times, while the Slovaks were ruled by the Magyars), Slovak is a language of small farmers and villagers. It has many dialects, there were no schools, and the Magyar overlords used Latin or, later, Hungarian. The idea of a separate "Slovak" language does not seem to have existed before the time of Bajza (1754-1836), and there was little literary impulse until the nineteenth century, when Ludovít Stúr produced a newspaper using a standardized Slovak language. Even that was opposed by many Slovaks, some of whom preferred Czech as a literary language (Czech influence had long affected the few works published in Bratislava). And the outside pressure continued: the influence of first the Magyars and then the Czechs suppressed the development of a literary language. With no Hus to look back to, and no early works to preserve, Slovak has little need for textual criticism.

The other languages of the Former Soviet Union have suffered similarly. Belorussian (Byelorussian, White Russian, Byelo-Ruthenian), written in the Cyrillic alphabet, has literary remains dating back to the eleventh century, but the people have never been independent until now, and both Russian and Habsburg dynasties tended to hold down both people and language. Ukrainian has a curious history, as the Ukrainian/Russian separation was initially more cultural than linguistic. The Ukrainians had a tendency toward the Uniate church, and affiliations with the Poles (ironic, given the modern hostility between the two), while the Russians are Orthodox. There are hints of a Ukrainian dialect as early as the thirteenth century, but the current language (marked, e.g., by Polish loan words) did not come into being until the late eighteenth century.

Polish as a language existed by the twelfth century, but literary works do not appear until the fifteenth century (we have catalogs of older works, but apart from a few surviving hymns and fragments, our earlier survivals are all in Latin; so too the writings of Copernicus, the first great Polish scholar), with a flowering in the sixteenth. There were few widely popular Polish works before the invention of printing. And after printing came along, Poland was the victim of cultural imperialism (the almost-universal fate of Eastern European peoples), with the country eventually being divided by Prussia, Russia, and the Habsburg Monarchy; it was not reunited until after the First World War. This means that, although there was a standard literary Polish (derived from the dialect of Poznan), the local dialects were little influenced by this form. This slowed and fragmented the development of Polish literature, which did not really revive until the nineteenth century. In any case, there is little here for textual criticism to do.

Sorbian (Wendish, Lusatian) is a Slavic language spoken primarily in Germany in the region of the Polish and Czech borders. There are only a few tens of thousands of speakers, but even so, the language has several dialects. The earliest texts date from the fifteenth century, but the remains are limited for obvious reasons. The New Testament was the first printed work, being published in 1548.

Bulgarian is unusual among Slavic languages in that it came to be written early (though the oldest Bulgarian inscriptions predate written Bulgarian, and are in ungrammatical Greek). Closely related to Old Church Slavonic (there are Slavonic biblical manuscripts which can be called proto-Bulgarian), the earliest Bulgarian literature dates from the tenth century, meaning that textual criticism has a genuine place in dealing with Bulgarian writings. (The earliest writings, for instance, will have been in the Glagolitic alphabet, later to be changed to Cyrillic.) The earliest works were mostly religious and mostly derivative; starting in the twelfth century, however, there was a flowering which lasted until the Ottoman conquest. Since the Ottomans suppressed education and technology, printing did not arrive until late; many works were destroyed and many others that would otherwise have been printed survived in only a handful of manuscripts.

Macedonian is a curious language, fragmented into very diverse dialects, many of which are as close to Bulgarian as to each other. (Indeed, Bulgaria has claimed the Macedonian language as dialects of its own.) Some features of Macedonian appear in writings as early as the tenth century, but as a literary language, it did not emerge until late in the eighteenth century, and only quite recently has it truly come into its own.

The ultimate example of interplay between politics and linguistics may be in the case of Serbian/Croatian/Serbo-Croatian. The languages of Serbia and Croatia are mutually comprehensible in speech, but both parties insist that the languages are different; the Serbs are Orthodox Christians and write their language in the Cyrillic alphabet, while the Croats are Catholic and write using the Roman alphabet. There are remains of the language from the twelfth century, but politics can play a role in their interpretation. Making the matter even more complex is the fact that the Serbs long clung to Church Slavonic as their literary language. What few works there are are mostly liturgical, and need examination by someone familiar with both Slavonic and Serbian. True Serbian literature did not come into being until the nineteenth century. Croatian saw a brief flowering in the sixteenth century, but the Croats, as Catholics, tended to use mostly Latin for their few writings until quite recently. The outcome of this was the very odd Knjizevni Dogovor agreement of 1850, which caused Croats and Serbs to formally adopt the same literary language! Obviously that is mostly a dead letter now.

Related to Serbo-Croatian, but more obviously distinct, is Slovene (Slovenian). Although there are signs of written Slovene from the eleventh century, a standard literary form did not develop until the nineteenth.

More distantly related to the Slavic languages are the Baltic tongues of Latvian, Lithuanian, and Old Prussian. Old Prussian is extinct; there are some written remains, but here the need is more for linguistic than textual reconstruction. Latvian (Lettish) was first written in the sixteenth century, in a Gothic alphabet, though the Latin alphabet has been in use since shortly after World War I. Lithuanian also gives us literary remains from the sixteenth century, though it uses a 32-letter alphabet based on the Latin.

Germanic literature (including English, Scandinavian, and German writings) had a more complex history than Greek or Latin or Romance literature, as there was never a united German nation in the manuscript era. Then, too, languages like English and Frisian and Dutch did not formally divide from Old German until well after the New Testament was written (indeed, the Germanic group continues to spawn new languages; Afrikaans sprang off from Dutch starting in the eighteenth century). In addition, many of these people acquired writing only after long periods of independent development, meaning that individual nations had completely independent literary histories.

English literature had a curious, rather roller-coaster-like history. The Romano-Celtic literature which preceded the Anglo-Saxon invasions (if there ever was one) was completely extinguished by the Germanic invaders. The invaders themselves seem to have had a rudimentary knowledge of writing (there are a few inscriptions, such as the Ruthwell Cross, in runic letters, and as the runes are of an ancient form, with no dependence on Latin letters, the forms presumably predate the conversion of the Anglo-Saxons). There is, however, no evidence of a literature written in these characters. Indeed, there is no evidence that they had any form of written literature at all; all the earliest Anglo-Saxon poems, from Caedmon's Hymn to Beowulf, seem to have been originally oral. To make matters even more complicated, the invaders were not actually all one people, and in any case they did not at once form a unified England. (Traditionally there were seven Anglo-Saxon kingdoms -- Northumbria, Mercia, East Anglia, Wessex, Sussex, Essex, and Kent -- but Northumbria, for instance, was formed by the union of Bernicia and Deira, and most of the other seven kingdoms were also assembled from smaller units.) The result was significant dialectal differences between the nations.

The Viking invasions of the ninth century did much to change this picture. First, they destroyed all of the ancient English kingdoms except Wessex (without establishing anything of permanence in their place), and second, they placed so much pressure on Wessex that it could not afford a child-king. As a result, when King Ethelred I died around 871, he was succeeded not by his son but by his younger brother Alfred.

This was significant on two counts. First, it made a united England possible; the old English nations were no more, and the new Viking states did not have the strength to resist Wessex. (Nor did they really object to English overlordship; at this stage, English and Norse were still fairly closely linked culturally and linguistically. English adopted a number of Norse loanwords, especially in the northern dialects.) Alfred did not himself unite England, but his son and grandsons were able to create a unitary Saxon state which would last until the Norman Conquest.

More significant for our purposes, however, is the revival of learning encouraged by Alfred. We cannot really tell, from the surviving records, how much was actually the work of Alfred himself -- but there is no doubt that the survival of Anglo-Saxon literature is due to Alfred's efforts. Anglo-Saxon manuscripts almost without exception date from this era (Alfred took the throne in about 871; he held it until about 899). Even in Alfred's time, little Anglo-Saxon literature was written (other than several translations encouraged by Alfred, plus his own creation, the Anglo-Saxon Chronicle, one of the most textually confusing documents ever written). But the old epics and poems were written down; the manuscript of Beowulf was written in the tenth century, and most other surviving texts were written in the same period (probably from about 880 to 1010, when the Danish invasions resumed).

Despite all of Alfred's work, almost all that survives of Old English poetry (the core of their literature) is found in four volumes, all from the post-Alfred period:

The Exeter Book, Exeter Cathedral MS. 3501, dated paleographically to the second half of the tenth century and believed to have been written by a single scribe. The surviving portion consists of folios 8-130, and contains some dozens of works. Very many of these are on Christian themes (from the Lord's Prayer to an account of the apocryphal Descent into Hell), but it also contains such well-known works as The Wanderer, The Seafarer, Widsith, Deor, and the famous Exeter Riddles. This is the chief anthology of Old English literature; with the exception of Beowulf, it contains almost all of the more famous poems of the pre-Conquest period. It is widely believed that this is the "big English book about everything" donated by Leofric, the first Bishop of Exeter, but this certainly cannot be proved.
Cotton Vitellius A.xv, now in the British Museum, dated paleographically to about 1000. Written by two contemporary hands (the shift comes at line 1939 of Beowulf). It contains both prose (such as a legend of Saint Christopher) and poetry; the most notable items of the latter are Beowulf and Judith. The manuscript was badly charred in the Cotton Library fire (1731); although most of it can still be read (with difficulty), there are passages where we must rely on earlier transcripts or conjectural emendation. The book was rearranged at some point in its history, and some items may have been lost entirely.
Oxford, Bodleian Library Junius 11 (5123). Written by four scribes all working around 1000, though the scribes were not necessarily exactly contemporary. Contains only four works (poetic treatments of Genesis, Exodus, and Daniel, written by the first scribe, and the story of Christ and Satan, which may have been a separate volume and was written by the other three scribes).
The Vercelli Book, Vercelli, Biblioteca Capitolare CXVII. Probably (though not quite certainly) written by a single scribe in the second half of the tenth century. Contains a series of homilies and such poems as The Fate of the Apostles. Also contains one of three copies (the fullest) of The Dream of the Rood. It is speculated that a pilgrim was carrying the book to Rome (whether for personal use or for presentation to the Pope is uncertain), but the book (and presumably the traveller) never completed the journey.

Also of note is:

Cotton Otho A.xii, dated perhaps to around 1000, containing of poetry only The Battle of Maldon, but also the only known copy of Asser's Life of Alfred. It was completely destroyed in the Cotton fire, and our sole knowledge of these works is from transcripts made before the fire. Those who saw it prior to the fire say two scribes were involved. Whether it was originally a unity may be doubted; Cotton sometimes bound leaves from multiple sources together, and this volume is reported to have included some modern leaves. If originally a unity, the volume cannot have achieved its final form before the Battle of Maldon in 991, but it is possible that the Alfred was copied earlier.

Time has not been kind to the handful of other manuscripts containing small amounts of Old English material. The Cotton fire of 1731, already mentioned repeatedly (as a side note, we might mention that Richard Bentley was one of those who worked to save books from the fire), destroyed Otho A.xii and badly damaged Vitellius A.xv. What we have of Waldere came from the binding of a book in Copenhagen. The Finnsburh Fragment, Lambeth 487, is one of the several lost Lambeth manuscripts. Even much of what survives is on Christian topics; these are of relatively little value since the same material is available in other languages. In any case, almost all the Old English works survive in single copies, leaving the textual critic with little to do except work at conjectural emendation. Among the few exceptions to this rule are Caedmon's Hymn (existing in many manuscripts, including the Moore MS at Cambridge, Kk. 5.16, dating all the way to 737, and the Saint Petersburg manuscript Public Library Lat. Q. v. I. 18, believed to predate 746; also in Bede), The Battle of Brunanburh (multiple copies, with significant differences, in the Anglo-Saxon Chronicle) and The Dream of the Rood (three copies, with differences clearly recensional).

In addition to Old English works, the pre-Conquest period produced a number of Latin documents, most notably Bede's history (as well as the Life of Alfred, but this was of interest primarily to the English). But since these could be circulated beyond England, they are properly the province of a history of Latin or Catholic literature.

Following the Norman Conquest, English literature as such effectively disappears for three centuries. With the exception of the Anglo-Saxon Chronicle (which was gradually abandoned over the years, with fewer entries being made and fewer comparisons between texts), the surviving writings are all in Norman French or Latin. By the time English writings re-emerged in the fourteenth century (with Langland and Chaucer and Gower and the Gawain-poet), Old English had given way to Middle English -- and the dialects had separated to the point of being mutually incomprehensible. Gower (who also wrote in Latin and French) and Chaucer used the London dialect, close enough to modern English that little but practice is needed to understand it. The Gawain-poet, by contrast, used a northwestern dialect equally incomprehensible to us and to Chaucer. We may demonstrate this using the first four lines of Sir Gawain and the Green Knight:
Sithen the sege and the assaut was sesed at Troye,
The borgh brittened and brent to brondes and askes,
The tulk that the trammes or tresoun ther wroght,
Was tried for his tricherie, the trewest on erthe....
(And this is with spelling regularized!) Most writings in non-London dialects were equally obscure. The case of Piers Plowman is more complex, as Langland appears to have tried to use more universal forms, but it appears that Langland's own dialect was that of the west Midlands.

It may not be coincidence that the works of the Gawain-poet, who used a highly obscure dialect, survive in only one manuscript, while Piers Plowman survives in 52, and the Canterbury Tales exist in eighty-plus manuscripts (though we only have sixteen of Troilus and Criseyde, and fewer still of most of Chaucer's other works).

These manuscripts show some significant textual variation, but it is worth noting that all were written in the two centuries before the invention of printing, and that textual variation was rather limited. Much more important and troubling was the matter of dialect translation.

As noted, England was a nation of dialects in the post-conquest period. But even worse was the fact that there was no standard dialect -- no "King's English." (The only situation more or less parallel to this was Germany in the period before the unification, and even there, the Prussian and Austrian courts exerted some influence.) Prior to the reign of Edward III (1327-1377), all official business was done in French. It was not until the reign of Henry VI (1422-1461) that French gave way entirely to English. Until this happened, there was absolutely no standard. So texts had to be "translated" -- converted from one dialect to another. Sometimes this was just a matter of correcting endings or the like; this is no worse than Attic tendencies in the New Testament. But sometimes it required significant alterations. This makes textual criticism much more difficult. The only work believed to have been spared this process is the Wycliffite Bible -- and it was probably spared because of an unusual combination of circumstances: It is translation English in any case, it is in a fairly standard dialect, and it was not made until the period when English was again emerging as an official language.

Manuals of textual criticism devoted to Middle English are few. The only one available for most of the twentieth century was Charles Moorman's 1975 book Editing the Middle English Manuscript, which is less a book on textual criticism than on the actual task of editing -- and has been rendered largely obsolete by computer technology. In 1994, Tim William Machan published Textual Criticism and Middle English Texts, but this is frankly less a study of textual criticism than a screed against the whole discipline. It is a field ripe for a good manual.

Icelandic literature suffered no such problem. The Icelandic language has evolved so little that it is thought that a modern could converse directly with an inhabitant who lived there 800 or more years ago. Icelandic is almost identical to the Old Norse which is the ancestor of modern Scandinavian languages.

This means that Icelandic literature such as Snorri Sturluson's "Prose Edda" has undergone little linguistic tampering. More problematic is the matter of limited numbers of copies. Iceland is a small country; for most of its history, it has had a population little larger than a small town of today. Given its size, it has an immense literature, though much of it is preserved outside Iceland. (The reason is not far to seek: For many years, Iceland was the poetic capital of the Scandinavian world, exporting Court Bards to the other Norse kingdoms.) Few of these works are preserved in more than one copy, however. The single most important Icelandic work, the so-called Elder Edda (which is not really a single work but an anthology), is typical: Although a handful of the tales exist in other documents, the large majority are found only in the Codex Regius (c. 1275), which is itself damaged. Snorri Sturluson's Prose Edda is an exception; we have three good copies and some lesser manuscripts. The Uppsala Codex, perhaps the best, dates from about 1320, or roughly a century after Snorri's original composition. But this is exceptional; the Prose Edda is actually a sort of a fictional saga (Iceland was well and truly Christianized by his time), typical of the prose sagas of the period (which obviously never existed in oral tradition). Most of the other sagas are more sparsely attested. Thus Icelandic literature is like Anglo-Saxon literature in that we can only correct the text by emendation, but unlike it in that we do not have to concern ourselves with dialect-to-dialect translations.

The histories of Norwegian and Danish literature are essentially tied up with Icelandic literature (and, in the latter case, there is some link to English literature as well, as the Danes ruled all or parts of England for many years -- notably in the reigns of Cnut and his sons, 1016-1042). Danish did not become clearly distinct from Old Norse until the twelfth century, and Norwegian separated from the common language at about the same time. There are hints of literary remains (inscriptions) from as early as the third century, though these were written in the runic alphabet (it seems to have been Christianity -- which came late to the North -- which inspired the switch to the Roman alphabet; we have, e.g., a number of law codes from the period before 1200 C. E. Most early Danish works in the Roman alphabet were written in Latin, not the Norse dialects). So the literatures of these languages have in some cases gone through two transitions: From runic to Roman alphabet (a transition not complete until the thirteenth or fourteenth century), and from generic Old Norse to more modern local languages. There are also cross-influences: Since Denmark at various times ruled Norway, some Danish influence crept into Norwegian even after the languages split. They remain largely mutually intelligible.

Recent changes in Norwegian have further complicated matters, as there are two basic dialects, neither of which is entirely natural. Bokmål, the "book language," was influenced by Danish (the two nations were united from 1380 to 1814), while Nynorsk was invented in the nineteenth century based on several dialects and was an attempt to return the language closer to its roots. All of this, of course, happened after the manuscript era, but it affects the editors' approach.

Also derived from Old Norse, and quite close to Icelandic, is Faeroese (Faroese). As, however, this language was not written until 1846, it is of no concern to textual critics.

The situation is quite different for Swedish literature; although Scandinavian, Sweden was not really part of the Norse culture in the sense that Norway and Iceland and Denmark were. (This despite the fact that Swedish, Norwegian, and Danish are quite close to each other, and to a significant extent mutually intelligible, while Icelandic and Faroese are much more distinct.)

The earliest Swedish "literature" is found in the thousands of runestones scattered about the country. These are, for the most part, written in the sixteen-symbol Swedish runic alphabet (which later gave way to a Danish/Norse runic alphabet) -- but textual criticism is hardly a concern with runestones; they rarely contain material of literary interest, and in any case were usually written under the direct supervision of the composer of the inscription.

There are exceptions. The Rök stone, which came to be part of a church wall, includes a great deal of text, including some poetic material. It is a mysterious inscription, with several different alphabets involved. (Including both the ancient 24-character runic alphabet and the later, pruned-down 16-rune form.) It seems nearly certain that at least part of the content of the stone is old, and in need of textual criticism (part of it, in fact, appears to refer to Theodoric the Goth, king of Italy 476-525, which would almost certainly date it before the time it was inscribed). But as best we can tell, there are no other copies of the material. (Given the strange alphabets, this cannot be considered entirely certain.) That older Swedish literature existed seems to be implied by carvings such as that on the Ramsudberg stone, which appears to allude to the Sigurd epic. But this is only a picture with a short text; it is not literature in itself.

Part of the problem may be that Sweden was the last Scandinavian nation to achieve political unity. Somewhat cut off from the cultures of its neighbours, it was not large enough to achieve a strong literary tradition of its own. We have no clear remnants of Swedish poems from the Skaldic age (the era of the bards). Our oldest writings, in fact, appear to be land laws (in copies dating from the thirteenth century, but probably based on older writings). In addition, Sweden did not found its first University (at Uppsala) until 1477, and it did not become permanent until 1593. The Sigtuna monastery (founded in the first half of the thirteenth century) had a large library, but it and other Swedish religious institutions seem to have been entirely hostile to secular, particularly pagan, literature. Thus most books found in Sweden are in Latin, and the few in Swedish are generally religious, and often translations of Latin works -- e.g. the Fornsvenska legendariet, a translation of a set of saints' legends by Jacobus de Voragine known in English as The Golden Legend; the translation is considered the oldest surviving Swedish prose work except for the land laws. This may have been the work of Petrus de Dacia (died 1289), who in any case is the first named author in Swedish history; he also wrote the Vita Christinae Stumbelensis (but in Latin, not Swedish). From the next century comes Birgitta (died 1373), a mystic whose visions began after her husband's death in 1344, but which were not collected until they were published in 1492 (translated from Swedish into Latin as Revelationes Celeste; she had already been canonized in 1391. There are a few Swedish fragments, perhaps from Birgitta's own hand, but these do not form part of an actual literary composition.)

This paucity of works in the vernacular continued throughout the middle ages. Sweden had few of the tales of chivalry so common in the rest of Europe (partly influenced, no doubt, by the fact that knighthood did not flourish in Sweden). There is a Swedish redaction of the story of Florice and Blancheflour (part of the Eufemiavisor, perhaps the earliest of these legends -- but compiled at the instigation of a Norwegian queen!). But this is very nearly all there is in the manuscript era. This left the field to the rhyming chronicles, a form largely peculiar to Sweden but common there in the early middle ages. These can perhaps be called the chief form of early Swedish literature, though they eventually gave way to prose chronicles (which were less interesting without being notably more accurate). After their time, Swedish literature went into a decline; we have relatively few manuscripts of these works, and few works of any sort from the final centuries of the middle ages. The last significant works were the writings of Bishop Thomas Simonsson of Strängnäs (died 1443). His "Song of Liberty" was the last important Swedish work of the manuscript age -- but late enough that it need not detain us.

In addition, Sweden (like most countries) has an oral literature. There are Swedish ballads, just as there are German and English and Norse. (The Swedish ballads, indeed, are almost certainly survivals from Old Norse roots.) But as with most oral literatures, the originals are almost certainly beyond reconstruction.

Dutch (Flemish) is a Germanic language, and had the Netherlands and Flanders become part of Germany rather than independent, Dutch might well have had a history resembling that of English: Just as Scots split off from English, then was (somewhat forcibly) re-merged so that it became little more than a dialect, so Dutch might have been re-conformed. Indeed, this happened with East Dutch (Oosters), the language of writers such as Menno Simons; it has been pulled toward other languages. But the Netherlands and Germany became separate (with the Netherlands spinning off Belgium in 1830, only a few decades before Germany became a nation), and Dutch evolved into a genuine language with literary works coming into existence around 1100. From this time until the end of the manuscript era, however, the Netherlands (in this case, including Flanders) were generally under foreign rule -- French or Burgundian or Spanish. At times this rule was oppressive and sought to control the local literature (which often stressed independence). This has probably affected the manuscript tradition. In addition, some would consider works such as Reynard the Fox or Beatrijs (both written in Belgian Flanders in about the thirteenth century) to be "Belgian," others Flemish or Dutch. There was also Burgundian influence.

Frisian is considered to be closer to English than any other language, but it has a very small population base. Only about half a million people speak it, mostly in the Netherlands in the islands off the Dutch coast (and the other groups, also in the coastal areas of the North Sea and Baltic, speak rather different dialects with little literary history). There are a few written remnants starting from the thirteenth century, but the small population base and the fact that (until recently) it received no support from the various local governments kept the literature sparse. The earliest items in the language seem in fact to have been preserved in Old English works. The few "native" works are primarily law codes, starting from the eleventh century. We also have a handful of rhymed chronicles from the days when Frisia was an independent region.

Tracing the history of actual German literature is beyond the ability of this writer, as the language has many dialects, some barely mutually comprehensible, and some of them (e.g. Luxembourgish/Luxemburgish/Lëtzebuergesch) sometimes listed as separate languages. It should be remembered that Germany was not a political unity at any time from the era of Charlemagne until 1870. The classical distinction is into High and Low German (Hochdeutsch and Plattdeutsch), but there are also languages and dialects such as Yiddish and Swiss German. Insofar as there is any unity, it is based on the language Luther used in the German Bible -- after the manuscript era. The "standard" dialect, taught in the schools, is derived from High German, but this is Modern High German, while the manuscripts will be of works written in Old German and Middle German. The greatest number of texts are those, such as the Nibelungenlied, in Middle High German. Much of the literature, though, such as the work of the Minnesänger, was long transmitted orally. But there is a significant quantity of manuscript literature, and those manuscripts have suffered the usual troubles. For example, the oldest significant German work is Das Hildebrandslied, and all we have is a fragment.

Yiddish is primarily a Germanic language, though it has many Semitic loan words, and some dialects also have Slavic influence. As the language of a large number of European Jews, it naturally has a relatively rich literary tradition (dating from the twelfth century). Yiddish literature has been subjected to several pressures. Jewish tradition would tend to result in carefully preserved documents -- but Yiddish, unlike most other languages, has never really had a "homeland"; its speakers have been scattered throughout Europe. This has resulted in the adoption of large numbers of local loanwords, so that (e.g.) a Jew in Russian territory might not understand all the vocabulary of German Yiddish. And since there was never any national center, there was no centralizing force. Today, East European Yiddish is rather the standard, but a scholar working on Yiddish texts must be very aware of the time and place of the original.

Literature in the Celtic languages is relatively sparse. This is not due to a lack of literary activity, but because the languages themselves belong to relatively small populations. It is traditional to speak of six Celtic languages: Irish Gaelic, Scots Gaelic, Manx, Welsh, Cornish, and Breton. Irish and Scots are so close as to almost be dialects of one another (and Manx also closely related), while Welsh, Cornish, and Breton form another, less tight-knit group. This picture is rather unreal, however. The Cornish language actually died out centuries ago, leaving only a few literary remains (mostly from the fifteenth century and shortly after, though they may be based on older materials; the earliest one cannot have been copied earlier than 1340, as it is written on the back of a charter of that date). By 1611, the date of Gwreans an Bys (the Creation of the World), the language was in decline, and the decline accelerated thereafter; no Bible or Prayer Book was published in Cornish, which doubtless hastened the abandonment of the language. The literary fragments, combined with analogies from Welsh, have been used as the basis of a Cornish restoration -- but no one knows if the reconstructed language actually matches the original! (This makes for an interesting task in textual criticism; at what point does reconstructing the text move into reconstructing the language?) Manx is still spoken, but has never had more than a few thousand speakers, and is now down to a few hundred, none of whom can call it a first language. 
Scots Gaelic (derived from the common Gaelic stock which also produced Irish and Manx; Gaels invaded Scotland from Ireland, bringing their language with them, and although it appears the two were distinguishable as early as the tenth century, the three are still largely mutually comprehensible) is now confined to a few fringes in the Highlands and the Hebrides, and with the coming of television, will likely be extinct within generations if no attempt is made to save it. Irish would hardly be in a better state were it not that the Irish Republic is making the effort to save it -- with limited success; English remains the dominant language of Ireland. Breton and Welsh are still spoken, and even undergoing a sort of literary revival, but both have become minority languages even in their homelands (and Breton has fully four dialects, one of which is barely mutually comprehensible with the other three. Breton orthography was not fixed until 1807). The result is that manuscript-era literary remains in Manx, Breton, and Cornish are effectively non-existent (even though we have a handful of minor writings in Breton, e.g., from the eighth century. Manx, by contrast, has no literary remains prior to the seventeenth century). Many Breton writers chose to write in French; others saw their works preserved only orally. The earliest Breton works are mostly religious, starting with the Life of Saint Nonn, from about 1475; these works were generally translations or adaptations; by the time more original works appeared, the printing press was firmly established (though not always used for Breton works). There are Breton folk songs and ballads, some of which look very old, but the written versions are relatively recent.
There is somewhat more material in Scots Gaelic, but Scots Gaelic, it should be recalled, is not the language of Scotland but of the Scottish Highlands; although the kings of Scotland prior to Malcolm Canmore were Highland kings, from Malcolm's time (reigned 1057-1093) they adopted lowland customs, including Braid Scots (which, in its most extreme state as spoken in the fifteenth century or so, scarcely resembled English, but was assuredly a Germanic and not a Celtic language!). Since the Highlands were not fully reincorporated into Scotland until after the Battle of Culloden (1746) and the Highland Clearances (which functionally destroyed the old clan system), and since the highlanders prior to that were a largely non-literary society, even Scots Gaelic probably never produced much real literature; the first true literary work was a Bible translation from 1801. Welsh and Irish are by far the strongest literary languages in the Celtic tradition. But even in these tongues, the literary tradition is actually an oral tradition, usually transcribed late in its history (though we have documents from as early as the sixth century) and with significant defects. Nor is the tradition rich. Of the Welsh tales now known (incorrectly) as "The Mabinogion," for instance, there is only one complete copy, The Red Book of Hergest (c. 1400); the earlier White Book of Rhydderch (c. 1325) is now fragmentary for several tales. There are earlier citations (none before about 1225), but their existence mostly demonstrates the impoverishment of the tradition, since they predate the Red Book by 300 years or more but contain little additional material. Irish relics are probably more common (one need only observe the many "Irish Miscellanies" now in print), but almost all are from oral tradition, found in late manuscripts, and usually only in one copy.
The case of Irish differs a bit from the other Celtic languages, as the language had more time to develop and Ireland was never penetrated by the Romans (Ireland did suffer from Viking raids, but was never taken over by Germanic speakers as England was). There are inscriptions from as early as the fifth century in the Ogham alphabet; the earliest literary works seem to date from the eighth century (some have claimed dates as early as the sixth, making Irish the oldest vernacular literature in Europe). The oldest manuscript, the Würzburg codex, may be as old as the early eighth century. And, of course, many Irish monks travelled elsewhere (e.g. there was a strong Irish presence at Saint Gall).

Several dead Celtic languages are known to scholars (excluding Cornish, which has been revived, and Manx, which hasn't). Celtiberian, the Celtic language of Spain, is extinct but known from a few inscriptions. Galatian, used by the Gauls in Asia Minor, did not die out until some time around the fifth century C.E., but left few literary remains; we know of it from the histories of the period. There also seems to have been a Cumbrian/Cumbric language, spoken in the region of what later became the English-Scottish border, but this is all very hypothetical. Except for some translations into Welsh and other Celtic languages, the only remains of this tongue are some place names.

Albanian is an ancient language; although Indo-European, it is the only member of its linguistic group. But as a literary language, it is quite recent. There are no written remains from before the fifteenth century (a fragment by the Orthodox Bishop of Durrës is dated 1462, and some minor religious works date from about the same time; little else exists, as the Turks suppressed writing and publishing in Albanian). Even the few writings that exist are rather confused by the mixture of the Gheg (northern) and Tosk (southern) dialects, which show significant variants and have many local subdialects. (Albania is an extremely rough country, with settlers in the various valleys having little contact with each other.) It was not until 1909 that the Roman alphabet was formally adopted, and a Received Albanian (based on Tosk) was first promulgated in 1950. The result is a language with little use for textual criticism.

It is generally stated that Gothic is a dead language, with the only remnants being Bible fragments (see the article on the Gothic version), but Crimean Gothic is reported to have been used as late as the sixteenth century. I know of no actual literature in Crimean Gothic, however.

Armenian literature begins with the Bible (see the article on the Armenian version), but there was an active literary tradition in the early centuries of the Armenian church (observe how many foreign writings, such as Irenaeus and Ephraem, are preserved in Armenian; it's interesting to note that the earliest Armenian work seems to have been Agathangelos's biography of King Tiridates, written in Greek but translated.) We also have, from the fifth century, Moses of Khorene's history of Armenia, with many excerpts from folk song, poetry, and epic. Later works were abundant though mostly religious and of little interest to non-Armenians. Armenia, however, has had a troubled history as a nation, rarely independent (and when, in periods like the Crusades, it achieved partial independence, it was split between many independent and uncooperative princes). The language has many dialects, and only a few million speakers; few writings other than the Armenian Bible are available in multiple copies.

Hungarian (Magyar), it should be noted, is not the language of the Huns, but the language of the later Magyar invaders. It is a non-Indo-European tongue, the most widely spoken representative of the Ugric branch of the Finno-Ugric family. The Magyars are an ancient people, and turned to Christianity soon after coming to Europe, but such writings as they produced in these early days were all in Latin. The first native literature dates from the thirteenth century, but it was slight (a few chronicles and legends); a standard orthography was not developed until the sixteenth century; this, and the need to develop a modified Roman alphabet to handle Magyar vowels, will have some effect on early texts in the language.

Basque is the westernmost non-Indo-European language of Europe, and has never been spoken by a large community. It did not develop a literature until the sixteenth century (poems by Bernard Dechepare, written 1545), and so has little in the way of a manuscript tradition, though there are inscriptions dating back to Roman times, and a few quotations (possibly not accurate representations of the original) in works in other languages.

Finnish long suffered as a result of Swedish political control of Finland (plus lots of Russian pressure); it did not become an official language until 1883. As John B. Oll writes, "Due to historical conditions... Swedish as a vehicle of culture has played and still plays an important role in Finnish life... Finland has a bilingual literature. Its historical development has been analogous to that of language and literature in Ireland and in medieval England, where the language of a minority gained such prestige that it for a long time overshadowed the language of the majority...." Russia annexed Finland in 1809, but that had little effect; the schools were and remained Swedish for a long time; the first Finnish school opened in 1859. There was little literature prior to that time; the first written work seems to have been a sixteenth century Bible translation. Even the great Finnish national epic, the Kalevala, was not written down until the nineteenth century, and is the edited work of a Finnish scholar (that is, as an entity, rather than a collection of fragments, it is the work of a single controlling hand, Elias Lönnrot.)

Estonian, which is also non-Indo-European (it belongs to the Finno-Ugric family) does not seem to have produced any literature prior to the sixteenth century, and written Estonian did not become widespread until the nineteenth century. (Even the Bible did not make it into Estonian until 1730, though there are some older liturgical works -- but they were printed as soon as they were written.) There is little scope for textual criticism.

Same is the official name for the language most would call Lapp or Lappish. It is not an official language anywhere, and there is little literary material.

The case is even worse for other European members of the Finno-Ugric group. Komi (Komian, Zyrian), for instance, is spoken in a small region of the Kola Peninsula (in northern Russia near the Finnish border), and although it is now a written language (it uses the Cyrillic alphabet), it has no literary remains. Much the same can be said of the other languages of this family.

Maltese is a complex blend of European and Semitic elements, thought to be derived primarily from Arabic but with a very large admixture of Indo-European vocabulary and written in the Roman alphabet. The population is small, and the educated population, until recently, was foreign. There is little Maltese material in manuscript form; the oldest recorded material seems to date from the seventeenth century.

Iberian is an apparently non-Indo-European language spoken in Spain in ancient times, now extinct. It is known only from inscriptions, and to date has not been deciphered.

Georgian as a written language is believed to predate the translation of the New Testament (hence the use of an alphabet not derived from the Greek), but of this literature, which is thought to date back to the third century B.C.E., nothing has survived. The post-Biblical literature was about what one would expect: Lives of saints believed to date from the sixth century, and an eighth century translation of St. Cyril. The first secular literature seems to date from about the twelfth century. From that time on, Georgia was almost constantly under outside domination (Mongols, Persians, Russians), meaning that relatively few manuscripts were preserved and printing came relatively late.

Turkish did not become a literary language until relatively late, but it also did not become a printed language until relatively late, and much material remained in oral tradition until quite recently. There is a significant place for textual criticism. An added complication is that the language has evolved quite rapidly (Old Turkish was spoken until the fifteenth century, and Modern Turkish did not come into use until the nineteenth century). In addition, the language was originally written in Arabic script, but in the twentieth century, Ataturk converted it to the Roman alphabet.

Arabic literature does not begin with the Quran; there are inscriptions which seem to date to the third century B.C.E. and earlier. These were not written in what we now know as the Arabic alphabet (see discussion below), and if by some chance written materials of this era have been preserved in more recent manuscripts, they must have undergone alphabetic conversion with all its hazards, as well as conversion from the archaic dialects. But it is unlikely that any such works survive; an anthology was undertaken in 772, but editor Hammad al-Rawiyah collected mostly oral works. The Quran is the earliest known work of Arabic prose, and the inspiration for most later Arabic literature (though there is a large corpus of Arabic translations of Greek philosophers; much of our knowledge of Greek mathematics, for instance, is known only from Arabic translations. Much of Greek astronomy is also known largely through Arabic; this is in part why the constellations have Latin names while the named stars usually have Arabic names). To make matters worse, most pre-Quran works have been edited to make them seem less pagan. (We see the same thing in the Hebrew Bible, e.g. with "Eshbaal" being written as "Ish-Bosheth.") These works follow some extremely strict structural formulae, giving them relatively little variety. In addition, Classical Arabic was largely fixed by the Quran, and is fairly distinct from the language most Arabic speakers use in their everyday lives (though most also know Classical Arabic, which is used as a means of communication between those who use distinctly different Arabic dialects). The existence of a fixed language distinct from scribes' own has doubtless affected the transmission of early Arabic literature. Thus there is scope for textual criticism here, but little real material from which to work.

The Quran resembles the Bible in that it was not composed as a single work. Although all parts were taken down by Mohammed, the 114 sections were written separately and only later combined. (This led to some dispute over which writings would be authoritative, and which texts of those writings.) There are various other mysteries associated with the Quran (such as the mysterious letters at the top of certain sections) -- but as the Quran survives in many, many copies and is maintained by a culture significantly different from the Western, we will not delve into its text here. This is particularly true since manuscripts of the Quran are considered secondary -- the goal in Islam is for a scholar to memorize the Quran, and at least one printed edition was compiled not from manuscripts but by comparing the recitals of a variety of experts. Thus, to some extent, Quran criticism must be viewed in the context of Oral Transmission.

It is interesting to note that the earliest surviving "manuscripts" of the Quran can be precisely dated -- for they are actually inscriptions in the Dome of the Rock mosque.

It is perhaps worth mentioning that the authorities for the Quran are in much better order than the Biblical. There is a complete and dated Quran from A.H. 168 (784/5 C.E.), and several other dated manuscripts from within a century of that date, as well as quite a few of that era or earlier which lack dates and are somewhat fragmentary. Thus the task of reconstructing the history of the text is much easier for the Arabic work than for the Bible.

On the other hand, the work of constructing the text itself has hardly even begun. I have seen only one book on textual scholarship of the Quran, Keith E. Small's Textual Criticism and Qur’ān Manuscripts (2011), which begins (p. 3) "It is widely acknowledged that there has never been a critical text produced for the Qur’ān based on extant manuscripts, as has been done with other sacred books and bodies of ancient literature."

An interesting problem with Arabic is that it was written in several different alphabets -- all ultimately derived from the Aramaic alphabet, but with much separate evolution along the way. In the process of that evolution, several new letters were added to the Arabic alphabet (Arabic has 28 consonants; Aramaic was written with 22). This meant, first, that different letters might be confused in different scripts (e.g. some Arabic alphabets suffer from the problem of confusing d and r, well known to scholars of Hebrew; others do not confuse these letters), and second, that there might be occasional conversion problems.

Another thoroughly problematic language is Hindi/Urdu (Hindustani). To begin with, although grammatically a single language, it has two different cultural forms. Hindi, spoken in large portions of Hindu India, is written in the Devanagari alphabet (which is actually semi-syllabic), while Urdu, the language of Moslem Pakistan, is written in an alphabet similar to Persian Arabic scripts. Although both languages are derived largely from Sanskrit (a language with literary remains dating back to Old Testament times; the earliest Hindu literature is nearly as old -- and needs as much textual criticism -- as the Hebrew Bible), Hindi has been more influenced by the old language, which remains the language of its sacred writings. Texts in Hindi (as opposed to Sanskrit) begin to appear around the seventh century; Urdu did not begin to produce a literature until the fourteenth century. The oldest Hindi literature, the religious hymns of the Rig Veda, have a complicated history, first of oral tradition, then of compilation, then as the sole scripture of the proto-Hindu religion, then as one of several units, with a gradually standardized orthography, most forms of which are known only in printed versions. This history is at least as complicated as that of the New Testament, and requires equal specialization.

The modern nation of India is a federation of many ethnic groups, not all Indo-European speaking, and many of these languages (e.g. Assamese and some of the Dravidian tongues) have ancient literary works. The history of these must, sadly, be excluded as outside the scope of this author's library.

One of the most fertile fields for textual criticism is Akkadian, a language which presents challenges very different from those above. Akkadian is one of the greatest sources of ancient literature, featuring such works as the Epic of Gilgamesh (alluded to above) and the famous Enuma Elish -- both of which have parallels in the material in Genesis. But access to these works is extraordinarily complicated. The language is dead, and survives only in cuneiform works. It has relatives but no real linguistic descendants. The tablets on which the works are copied are sometimes damaged, and individual tablets of multi-tablet works are often missing. And while the tablets are generally very old (the largest share come from Ashurbanipal's library, from the seventh century B.C.E., with most of the others being older still), they are copies of works from still earlier eras -- and which have probably undergone much oral evolution in the interim. The scribes who copied it were trained primarily in record-keeping, not preservation of literature, since Akkadian was used largely for court documents and diplomatic correspondence, and often served as a lingua franca for people who did not speak Akkadian as a native tongue. This would strongly influence how scribes understood what they copied.

We also have "secondary" sources which may, in some cases, be primary. Parallels to portions of the Akkadian books exist in other languages -- in some cases (especially when the parallels are Sumerian), the parallel may have been the source or inspiration of the Akkadian work.

It will be evident that the scholar working on Akkadian (or other similar sources, such as Sumerian or Ugaritic/Canaanite) will need a much larger toolbox than the common textual critic; one must be a paleolinguist as well as a critic, and the ability to understand archaeology is also important. A good grounding in folklore wouldn't hurt, either! -- folklorists will have an insight into how certain tale-types tend to be transmitted, and the motifs they are likely to contain.

Egyptian and Coptic offer opportunities rarely found for other languages -- e.g. we have many older texts. There are many complications, though. One is the way the language was written: In syllabic hieroglyphics, in the demotic, and later in the Coptic, which came into use before the extra letters that were added to the Greek alphabet for the special phonemes of Coptic were fully standardized. This assuredly produced occasional complications -- a scribe might take down a royal edict in demotic, which was faster, and then transcribe it in hieroglyphic, for instance. Also, much that has survived has survived as wrappings of mummies. Apart from making it a difficult task to recover the materials, we also have to reassemble the documents so scattered and, perhaps, torn up. And Egyptian syllabaries ignore vowel sounds, depriving us of some information (e.g. verb tenses) useful in reconstructing texts.

The oldest Thai/Siamese works are inscriptions from the late thirteenth century; they use an indigenous alphabet based on other local scripts.

We have, of course, written materials from a wide variety of languages in addition to the above. But we can hardly perform textual criticism when we cannot read the language! Examples of lost languages for which we have texts include Mayan, Etruscan, and the language underlying Cretan Linear A. This list could surely be multiplied. (We can, to some extent, read Etruscan, and have some ideas about Mayan, but the shortness of the contents of the former means that it cannot be fully deciphered, while Mayan is too complex for understanding without additional materials.)

A different sort of problem comes from non-alphabetic languages such as Chinese and Japanese. There are old texts in these languages, of course (we have Chinese texts from c. 1500 B.C.E.; Japanese texts do not appear until later -- the written language is thought to have been taken from Chinese models in the fifth century C.E. -- but there are documents believed to date from the eighth century C.E. Japanese also possesses two kana syllabaries, which just make things that much more complex). However, the rules of criticism are different for non-alphabetic languages. Haplographic errors, for instance, are less likely (since a repetition must involve whole words rather than just a few letters). There are no spelling errors, just errors of substitution and addition/omission. These languages do have other complexities, though -- for instance, Chinese writing was invented, according to legend, some time around 2650 B.C.E., but that version used only a limited vocabulary; many new symbols were added over the years, and this must be kept in mind in examining ancient texts. If a previously-unattested symbol occurs in an ancient work, it is a clear error -- but of what sort? Also, Chinese combines symbols in complex and varying ways, sometimes based on the sounds in a particular dialect -- which may be meaningless in another dialect. For these reasons, I will not consider ideographic languages, leaving them to critics with expertise in this rather different form of criticism.

There is also the matter of unknown languages. How do we engage in textual criticism of a text in a script such as Cretan Linear A, which we cannot read? The key to deciphering such a writing is getting good samples; if there are scribal errors, it can slow or halt the whole process. There is no general solution to this problem.

But the list of languages with literary remains is actually relatively slight. Of the thousands of currently-spoken languages, and the thousands more spoken up until the last century or two, the majority are not written languages, or were not written at the time of the invention of printing (many of the latter now have a literature consisting of a single book: A translation of the Bible, made in the last century or so by one of the translation societies).

While the above list of languages is far from complete, the task of textual criticism is finite, even if the number of errors perpetrated by scribes sometimes seems infinite.

As a final topic, we should discuss another area where textual criticism has scope: Music. This poses some interesting questions: Musical notation has evolved heavily over the years (see the article on neumes for background). Is the scholar really expected to reconstruct the original notation, or just what it represents? One inclines to answer the latter; after all, nearly every modern New Testament printing includes accents, breathings, word divisions, punctuation, and upper and lower case letters, as well as a standardized spelling, even though the original autographs probably used these reader helps only sporadically if at all. Correcting the ancient notation to the modern, as long as the result is identical in meaning, is no different.

But, of course, the underlying musical form, and even the most fundamental details, are sometimes in doubt. Many types of music notation circulated in early times, and most were not as complete as modern notation (which in itself is not truly complete, as it has no way to record the actual dynamics of a performance). The notion of keys, for instance, is quite modern. This isn't really important (a tune is the same in the key of C as in the key of G, it's just sung in a different voice range and with different instrumental accompaniment). But the inability of old formats to convey accidentals, or timing -- or quarter tones, as are found in some eastern music -- makes the reconstruction harder.

There are even occasional odd analogies to Biblical criticism. Certain manuscripts, for instance, have an odd similarity to the K^etib and Q^ere variation on YHWH/Adonai. This is the so-called musica ficta or "feigned music." Under the notation systems of the time, performers were only "supposed" to play certain notes -- but sometimes those notes sounded bad. (For example, in the key of F, hitting a B note instead of a B flat produces a tritone -- a very harsh sound. But the notation didn't allow B flat to be written.) So musicians were expected to read these notes and play something else -- just as Jewish lectors were expected to read YHWH and say Adonai. We, unfortunately, generally can't tell what note was meant -- and so we can't reconstruct the pieces with perfect precision even if we have a correct copy of the original notation.
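The interval arithmetic behind that example can be checked with a small sketch. It assumes modern note names and the modern twelve-semitone octave, neither of which the old notation used, and the names SEMITONES and interval are invented here purely for illustration.

```python
# Semitone offsets of modern note names above C.
# (Assumption: twelve-semitone octave; only the notes needed here are listed.)
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "Bb": 10, "B": 11}

def interval(low, high):
    """Return the number of semitones from `low` up to `high`, within one octave."""
    return (SEMITONES[high] - SEMITONES[low]) % 12

print(interval("F", "B"))   # 6 semitones: the tritone
print(interval("F", "Bb"))  # 5 semitones: a perfect fourth
```

The written B sits six semitones above F, while the B flat the performer was expected to supply sits five semitones above it.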

There are also problems of scholarly presuppositions. A noteworthy example of this is Chappell's book Popular Music of the Olden Time (with variant titles such as Old English Popular Music). Chappell's first edition of this made certain assumptions about the scales used in old pieces. Later, the book was revised by Wooldridge, who made fewer assumptions and wound up with noticeably different melodies for certain of the songs. This, too, has analogies to criticisms of texts, where scholars may reject a reading as grammatically impossible.

Incidentally, the problem of reconstruction goes far beyond the manuscript era, and even the invention of modern notation, for two reasons. One has to do with folk songs. Many of these were transcribed in the field by students with limited musical skills -- meaning that aspects of the tune, especially the timing, were often taken down incorrectly. (Folk musicians often have problems with timing. Pitches they can test against an instrument; timing requires testing with a metronome, a much more difficult process.) The other has to do with alternate notations, such as tonic sol-fa. Tonic sol-fa was invented as a means of making music easier to read, but continued to be used for about a century because it was a notational form capable of being reproduced exactly (and easily) on a typewriter, or by hand on ordinary paper (as opposed to staff paper). But it generates a completely different sort of error from standard notation or from neumes. When copying the graphical notations, the typical error will be one of moving a note up or down a bar line (I know; I've done this) or missing a note or (more likely) a measure. Errors in timing are rare in copying notation, and the transposed note will usually harmonize with the original. Not in tonic sol-fa! The "notes" in sol-fa are d (do), r (re), m (mi), f (fa), s (sol), l (la), t (ti). The typewriter being laid out as it is, this means that common errors would include re/ti (r/t) and the rather more harmonious sol/do (s/d) and do/fa (d/f). Similar types of errors could occur in the timing, though I won't spend more effort to explain.
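As a rough illustration of where those three confusions come from, here is a minimal sketch that lists the sol-fa letters lying side by side on a keyboard. It assumes a standard QWERTY layout and counts only keys that are neighbours within the same row; the names QWERTY_ROWS, SOLFA, and adjacent_solfa_pairs are invented for the example.

```python
# Letter rows of a standard QWERTY keyboard (assumed layout).
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

# Tonic sol-fa letters and the syllables they stand for.
SOLFA = {"d": "do", "r": "re", "m": "mi", "f": "fa", "s": "sol", "l": "la", "t": "ti"}

def adjacent_solfa_pairs(rows=QWERTY_ROWS, notes=SOLFA):
    """List pairs of sol-fa syllables whose keys sit side by side in one row."""
    pairs = []
    for row in rows:
        # Walk each row and look at every key together with its right-hand neighbour.
        for left, right in zip(row, row[1:]):
            if left in notes and right in notes:
                pairs.append((notes[left], notes[right]))
    return pairs

print(adjacent_solfa_pairs())
```

On that layout the only same-row neighbours are re/ti, sol/do, and do/fa. The sketch ignores keys that touch diagonally across rows (r sits just above d and f, for instance), so a fuller model of typing slips would find more candidate confusions.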

Appendix III: The Bédier Problem

We alluded to this a couple of times above: The "Bédier Problem" is the strong tendency of Lachmann-type stemmata to fall into two and only two main branches. The problem was described in Joseph Bédier's edition of the Lai de l'Ombre. He claimed that 105 out of 110 editions he checked had two-branch stemmata.

It's worth noting that this is a non-issue in most fields where stemmatics is used. For example, one would expect a two-branch stemma in linguistics. Proto-Germanic didn't split into Old English, Old Norse, and Old German all at the same time; two of these three must be more closely related than the third. Similarly with different species; they only split into two, not three or four. Splits might be very close together, but they wouldn't happen at exactly the same moment! But it is perfectly reasonable to assume that there were three or four distinct copies of the archetype of, say, Virgil's Aeneid, so logically the stemma should have three or four branches going back to those copies.

Bédier charges scholars with deliberately not seeing these branches. According to the translation on p. 13 of John M. Manly and Edith Rickert's The Text of the Canterbury Tales, volume II, "Our two-branched trees have not all grown that way. They are for the most part pruned trees. In other words, he who is preparing an edition of a text normally arrives at a system which distributes the MSS into several families, three or more; when he comes to the final operation, however, mechanical though it appears to be, which consists in establishing the text, he discovers in the course of this work, and only then, as disclosures of the last moment, reasons to modify the system, to remodel it, to simplify the tree. Everything takes place as if, with a desire for his personal comfort, he were making an effort to free himself from too rigid a law." If there were three branches to the tree, Bédier charges, then the critic would have to, in general, adopt the reading of two branches over the reading found in one. But by attaching everything to one of two branches, "he recovers a part of the authority that he had imprudently given away." On p. 14, Bédier concludes that instead of the "brazen rule" of editing when there are numerous manuscript families, they end up with the "'leaden rule' of a classification of two branches. And since there was no attempt at falsification, since not one of them foresaw this result, fatal as it was, it is this fatality which without their knowledge descended upon them."

Bédier suggested that critics, being afraid of not going far enough in grouping their manuscripts, instead insisted on going too far: Since, as long as there were more than two branches, it was possible that two of them could be combined, the critic would continue to look for ways to join them. Only when the tree was reduced to two and only two branches was it possible to say, "There are no more groups to join." In this area, at least, there is every reason to think New Testament critics are every bit as guilty as the scholars Bédier attacked so harshly. Is there any real evidence that D and 614-1505-1611-2138-2412-hark form a single family in Acts? That 𝔓⁴⁶-B and ℵ-A-C-33-81-1175 form a single family in Paul? The most detailed analyses (e.g. Zuntz) seem to indicate the exact contrary.

There is something rather Housmanian about the final phase of Bédier's argument (Manly & Rickert, p. 15): the critic has three final groups, x, y, and z, and "it will rarely happen that he does not find some variants joining x and y against z (or x and z against y, or y and z against x) which suggest that these agreements may represent innovations, that is to say, faults. Merely a possibility, no doubt, but a sort of moral necessity obliges him to dwell upon this idea, and it becomes a scruple which obsesses him. He cannot free himself from it. He can find peace of mind only when he becomes convinced that this possibility is more than a possibility, and that such and such of the readings, doubly attested (by x and y against z, for example) are in effect variants, that is, faults. It is not with impunity that he has become accustomed to oppose the good reading to the bad, light to darkness, Ormuzd to Ahriman: the dichotomic force once released acts to the end. The Lachmannian system has launched him upon the chase of common faults but without giving him any means of knowing at which moment he ought to stop."

There have been many sorts of refutations of Bédier -- see, for instance, the article "The Introduction to the Lai de l'Ombre: Half a Century Later" by Frederick Whitehead and Cedric E. Pickford, available in Christopher Kleinhenz, editor, Medieval Manuscripts and Textual Criticism. There are some interesting arguments there, but I'm going to address this in my own way.

I haven't seen Joseph Bédier's work (not only is it in French, so are many of the responses to it, and I don't read French), but I can instinctively understand what he is saying. So I'm going to demonstrate artificially how this can arise in the context of Lachmann's work. I stress that the example which follows is artificial, and many of the variants are trivial and orthographic and not really of textual significance, but the whole is based on real materials -- in this case, one of the "Sloane Lyrics," found in British Library Sloane MS 2593.

This is often regarded as a "dirty" piece (though this is by no means certain), but we certainly don't have to get into that. I picked it because I had so many divergent editions and have a good photo of the original manuscript.

We take our base text from James J. Wilhelm's Medieval Song, p. 358; we'll take only the first stanza (and add line breaks, which are marked in the manuscript although the text is written continuously):

1) I have a gentil cock,
2) Croweth me the day;
3) He doth me risen erly,
4) My matins for to say.

Against this we collate various texts. We'll denote Wilhelm as "W." We also cite Maxwell S. Luria and Richard S. Hoffman, Middle English Lyrics, item #77 (cited as L); R. T. Davies, Medieval English Lyrics, #64 (cited as D); Chris Fletcher, 1000 Years of English Literature, transcription on p. 34 (cited as F); Brian Stone, Medieval English Verse (Penguin Classics), p. 103 (cited as G -- for "goofed up").

Finally, we include this modernized-but-not-as-heavily-paraphrased-as-in-G version under the symbol M:

I have a gentle cock,
Croweth for me day;
He bids me rise up early,
My matins for to say.

The Collation

1) gentil FLW ] gentle DM, noble G || cock DFMW ] cok L, cockerel G
2) croweth DLMW ] crowyt F, whose crowing G || me DFLW ] starts my G, for me M || the W ] omit DFGLM
3) doth DFLW ] makes G, bids M || risen DLW ] rysyn F, get up G, rise up M || erly DFLW ] early GM
4) matins DLMW ] matynis F, morning prayer G
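To make the collation easier to reason about, here is a minimal sketch that encodes the nine variation units and counts how often each pair of witnesses shares a reading. It measures raw agreement, which is only a rough proxy for the agreement in error that Lachmann's method actually weighs; the names COLLATION, agreements, and closest are invented for the example.

```python
from itertools import combinations

WITNESSES = "DFGLMW"

# Each variation unit maps a reading to the sigla attesting it,
# transcribed from the collation above.
COLLATION = [
    {"gentil": "FLW", "gentle": "DM", "noble": "G"},
    {"cock": "DFMW", "cok": "L", "cockerel": "G"},
    {"croweth": "DLMW", "crowyt": "F", "whose crowing": "G"},
    {"me": "DFLW", "starts my": "G", "for me": "M"},
    {"the": "W", "omit": "DFGLM"},
    {"doth": "DFLW", "makes": "G", "bids": "M"},
    {"risen": "DLW", "rysyn": "F", "get up": "G", "rise up": "M"},
    {"erly": "DFLW", "early": "GM"},
    {"matins": "DLMW", "matynis": "F", "morning prayer": "G"},
]

def agreements(a, b):
    """Count the variation units in which witnesses a and b share a reading."""
    return sum(
        any(a in sigla and b in sigla for sigla in unit.values())
        for unit in COLLATION
    )

def closest(w):
    """Return the witness agreeing most often with w (first one wins a tie)."""
    others = [x for x in WITNESSES if x != w]
    return max(others, key=lambda x: agreements(w, x))

# Print the agreement count for every pair of witnesses.
for a, b in combinations(WITNESSES, 2):
    print(a, b, agreements(a, b))
```

On these nine units D, L, and W agree with one another seven times each, while G agrees with no witness more than twice, and its best match is M. Even a crude count, in other words, will tend to pair the two modernizations simply because neither resembles anything else very much.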

Now we do as Lachmann did, and decide what is an error. And suppose we decide that modern English spellings are correct, and anything else is an error. By that method, we would probably come up with this stemma, where [A] is of course the archetype and [B], [C], and [E] lost copies:

                  [A]
                   |
        ---------------------
        |                   |
       [B]                 [C]
        |                   |
  ------------           ------
 [E]    |    |           |    |
  |     |    |           |    |
 ---    |    |           |    | 
 | |    |    |           |    |
 D L    F    W           M    G

Your standard two-branch stemma. And [C] would probably be considered more accurate than [B].

This is in fact just plain wrong. D, F, L, and W are a genuine family -- they're all touched-up editions of the actual text. But G and M are not a family; they are independent modernizations of the (hypothetical) original text. Their false pairing parallels a well-known phenomenon in biology, known as long-branch attraction. So the true stemma is:

            [A]
             |
     ------------------------
     |             |        |
[B]=Sloane MS      |        |
     |             |        |
----------         |        |
|  |  |  |         |        |
D  F  L  W         G        M

Three branches, and [B], which in the first stemma was less accurate, is in fact more accurate!

Just to put this in perspective, the following shows how we would reconstruct the original based on each of the two stemmata -- plus the original, which shows how far astray they both can lead us.

Reconstruction based on Stemma #1

I have a gentle cock
Croweth me day
He doth me risen early
My matins for to say

Reconstruction based on Stemma #2

I have a gentle cock
Croweth me day
He doth me rise up early
My matins for to say

The actual manuscript reads

I haue a gentil cook
crowyt me day
he doþ me rysyn erly
my matyins for to say

Obviously the artificial nature of the witnesses here hurts us. (But I had to use something artificial -- or else make this appendix much longer.) The point stands: Agreement in error will tend to produce two-branch stemmata, because (no matter how many actual textual groupings there are) each group will almost certainly be closest to one other group. In essence, the Bédier problem is that the easiest way to determine a stemma is to find the two most extreme textual groupings, then attach all intermediate groupings to one or the other. This is more or less what happened to Zuntz: He observed the Alexandrian and Western types in Paul, as well as the 𝔓⁴⁶-B type, and noted that the latter approached the Alexandrian more than the Western, and so classified it as "proto-Alexandrian," then noted that 1739 was closer to the Alexandrian than the Western, and closer still to 𝔓⁴⁶-B, and so classified 1739 with the latter -- in effect reducing four text-types to two.

It is unfortunate that the first major response to Bédier was Quentin, who replied with the "Rule of Iron." This had two problems. First, it was too complicated for algorithm-hating textual critics, and second, methodologically, it assumed (in effect) that all trees had three branches, which is even worse than assuming two (because, if there are only two, Quentin's method will inherently favor one branch of the tree). So Quentin added further mud to an already complex problem.

Bédier's own response was to propose the Best Text edition, which in essence amounts to finding (by whatever means you prefer -- sometimes it was little better than guessing) which manuscript is best and following that relatively blindly, correcting (at most) only the clearest errors. Thus, for the New Testament, that would probably amount to printing the text of B where it is extant, ℵ or A where it is not. Or, if you're a Byzantine type, then some K^x or K^r manuscript that you find particularly convenient.

In particular, Bédier (according to one of his students) suggested picking a manuscript which "is of the poet's own dialect, which is relatively old, which does not have many mechanical defects and one should reproduce this text without attempting correction unless there is a proved slip of the pen" (see George Kane's article on emendation in Christopher Kleinhenz, editor, Medieval Manuscripts and Textual Criticism, University of North Carolina Press, 1976. The quote is from p. 214. Obviously the comments refer specifically to medieval poets, but the argument generalizes).

The problems with this approach are surely obvious. A. E. Housman, who had much to say about the problems with best text editions (some of it quoted in his biography in this encyclopedia), gave a good visual illustration of one difficulty with choosing a best text and printing it; he suggested a case where one manuscript is a faithful copy of an original which has been redacted, while another manuscript is an extremely careless copy of a pure original. This situation definitely can arise -- consider the "Western" text of Paul, and particularly the manuscripts D and F. D has more major differences from the Alexandrian and Byzantine texts than does F; it is clearly more thoroughly reworked than F. But F is a careless copy of the common ancestor of the pair F+G, and the F+G archetype itself was a careless copy of the archetype of the "Western" text -- F and G, although they have fewer major deviations from the original text of Paul than D, have more minor deviations such as changes of a preposition. So which is the better example of the "Western" text, D or F? Housman offers an analogy: Who is heavier, a tall thin man or a short fat man? You can't know. Similarly, you can't tell whether D or F is the best manuscript, at least based on the short description here. What is certain is that there are readings where D is more correct and readings where F is more correct, and a best-text edition based on either one -- even if it is intended to be a best-text edition of the Western text! -- would be frequently wrong. And while the problem is more obvious when we are talking about D and F than when we are talking about, say, B and ℵ, a best-text edition of the New Testament based on B or ℵ would still be inferior to an edition where both are allowed their voice.

It is true that some people accuse Hort of a best-text edition, but Hort was willing to go against B; he just followed B when he couldn't decide on other grounds. That is entirely a different thing! -- a copy-text edition, not a best-text edition. Ultimately, the solution almost certainly is not to abandon stemmata; it's to come up with a better way of creating stemmata than agreement in error. The solution to the Bédier problem is probably not the best text edition, it's cladistics.

(I do find it ironic that Bédier's ideas were a direct response to Lachmann, and Hort has been accused both of making Bédier's error and of making Lachmann's. Sounds to me as if Hort probably did exactly the right thing based on the materials he had available!)

More recently some scholars have gone beyond Bédier and in essence claimed that they don't want to find the original; this idea, for instance, appears in Tim William Machan's Textual Criticism and Middle English Texts. This is, in one sense, a defensible decision; he deals only with extant copies. But it's not textual criticism. There are times when we want to study the extant copies, and there are times when we want to study the original, and let's do both and not pretend they're the same thing.... Machan argues, correctly, that almost all our work in textual criticism is lexical, that is, we operate on the words of the text (as opposed to orthographic details such as punctuation) -- but, in a text that was copied (and perhaps written) without punctuation or word divisions, what else are we supposed to do? All that has been preserved is the lexical data.