Editorial policies
Screen presentation in the edition and in text files
Tooltips explain features such as unclear and supplied text and (in diplomatic transcriptions only) deleted and added text; they appear on hover or touch, depending on the device. Other features with tooltips are signalled by blue text, especially biographical information on persons. Where the diplomatic and normalised screen versions differ in any other respect, apart from ſ ~ s, the corresponding text is shown in red in both. Tooltips for annotations in the diplomatic version give information on the annotator.
Conventions in screen and file versions are summarised in the table below. Editorial footnotes appear only in the diplomatic transcription and are numbered afresh on each page.
handwritten original | TEI/XML file (filename.xml) | diplomatic text on-screen | normalised text on-screen | diplomatic plain text file (filename.txt) | normalised plain text file (filename-n.txt) |
---|---|---|---|---|---|
long s | ſ (UTF-8 long s) | ʃ (UTF-8 esh) | s (normal s) | ſ (long s) | s (normal s) |
any symbol for and | & | & | & | & | & |
other characters | normal print equivalent | normal print equivalent | normal print equivalent | normal print equivalent | normal print equivalent |
&c. = 'et cetera', No. = 'Number', Dr = 'Doctor' (as title), Mr, Mrs, Messrs, PS, St = 'Saint' | as written,¹ untagged | as written¹ | as written¹ (inline) | as written¹ (inline) | as written¹ (inline) |
other abbrev'ns, incl. Dr = 'Dear', 'Doctor' (as common noun), 'Dowager', St = 'Street' | tagged¹ <abbr> ~ <expan> | as written¹ | expanded (if known) | as written¹ (inline) | expanded (if known) |
initial for name | tagged <abbr> ~ <expan> | as written | expanded (if known) | as written | expanded (if known) |
dash as punctuation² | -- | -- | -- | -- | -- |
obsolete spelling known at the period (acc. to OED) | tagged <orig> ~ <reg> | as written | normalised | as written | normalised |
idiosyncratic spelling or error | tagged <sic> ~ <corr> | as written[sic] | normalised | as written | normalised |
initial capital³ | as written, untagged | as written | as written | as written | as written |
(non)use of possessive apostrophe, incl. e.g. gen. sg. any bodies | as written | as written | as written | as written | as written |
verb with d, 'd for -ed | tagged <orig> ~ <reg> | as written | normalised | as written | normalised |
foreign word or phrase⁴ | tagged | unmarked | unmarked | unmarked | unmarked |
obsolete morphology (e.g. had wrote, She eat some chicken) | tagged <orig> ~ <reg> | as written | normalised | as written | normalised |
text supplied by editors | tagged | [supplied text] + tooltip | [supplied text] + tooltip | supplied text, unmarked | supplied text, unmarked |
text added by writer | tagged | \|added text\| + tooltip | added text, unmarked | added text, unmarked | added text, unmarked |
substitution by writer | tagged | deleted text + tooltip + \|substitute\| + tooltip | substitute only, unmarked | substitute only, unmarked | substitute only, unmarked |
deleted text | tagged | deleted text + tooltip | text absent | text absent | text absent |
deleted text, unreadable or uncertain | tagged <del> + <gap> | + tooltip | text absent | text absent | text absent |
unreadable or uncertain text | tagged <gap> | [------] + tooltip | [------] + tooltip | <GAP: nn units> (characters, words, lines) | <GAP: nn units> (characters, words, lines) |
unclear or damaged but reasonably certain text | tagged | unclear text (wavy underline) + tooltip | unclear text (wavy underline) + tooltip | unmarked | unmarked |
superscript, subscript, position above/below line | tagged <hi> or <add> | formatting displayed | formatting absent | formatting absent | formatting absent |
underline (various styles) | tagged <emph> | formatting displayed (single underline) | formatting absent | formatting absent | formatting absent |
boundary stroke or line (≠ word underline) | (some) tagged <milestone> [in progress] | thin horizontal line | thin horizontal line | ignored | ignored |
new line | tagged | as written | as written | as written | as written |
word split across lines⁵ | tagged <orig> ~ <reg> | as written⁵ | reassembled on first line without internal punctuation | as written⁵ | reassembled on first line without internal punctuation |
new paragraph at linebreak ± indent | tagged | as written | as written | no indent | no indent |
centred text⁶ | tagged | centred, on new line | centred, on new line | left-aligned, on new line | left-aligned, on new line |
right-aligned text⁶ | tagged | right-aligned, on new line | right-aligned, on new line | left-aligned, on new line | left-aligned, on new line |
new column or page | tagged | ruled line | ruled line | blank line | blank line |
catchword | tagged <fw> | catchword + tooltip | text absent | <CATCHWORD: word> | text absent |
surplus word | tagged | surplus word + tooltip | word absent | <SURPLUS: word> | word absent |
editorial footnote | tagged <note/@resp> | lemma + numeral + tooltip | note absent | note absent | note absent |
quoted speech | tagged <q> | unmarked | unmarked | unmarked | unmarked |
literary or biblical quotation | tagged <cit/quote + bibl> | quoted text | quoted text | quoted text | quoted text |
line of verse | tagged <l> | unmarked | unmarked | unmarked | unmarked |
change of hand in letter as sent | tagged <handShift> | unmarked unless footnote needed | unmarked | <HANDSHIFT> | <HANDSHIFT> |
annotation not present in letter as sent | tagged <note/@hand> | annotation + tooltip | annotation absent | <ANNOTATION: annotation> | annotation absent |
moved section⁷ | original and destination locations tagged <anchor>, <ref> | ▼ at original location + tooltip, footnote at destination | ▼ at original location | no indication at original location, <MOVED> at destination | no indication at original location, <MOVED> at destination |
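
For readers who work with the plain text files, the angle-bracket placeholders listed in the last two columns of the table can be located or removed with a regular expression. The following is a minimal sketch only: it assumes the placeholders appear in the files exactly as spelled in the table (name in capitals, optional `: content`), and the sample sentence and function names are invented for illustration.

```python
import re

# Placeholder names taken from the table; the exact file syntax is an assumption.
MARKER = re.compile(
    r"<(GAP|CATCHWORD|SURPLUS|ANNOTATION|HANDSHIFT|MOVED)(?:: ([^<>]*))?>"
)


def find_markers(text):
    """Return (name, content) pairs; content is None for bare markers."""
    return [(m.group(1), m.group(2)) for m in MARKER.finditer(text)]


def strip_markers(text):
    """Remove every placeholder, leaving the surrounding text untouched."""
    return MARKER.sub("", text)


# Invented sample line, not taken from any transcription.
SAMPLE = "I hope <GAP: 2 words> will come <CATCHWORD: soon>\r\nsoon <HANDSHIFT>Adieu"

for name, content in find_markers(SAMPLE):
    print(name, content)
```

The pattern does not handle a placeholder nested inside another one (for example a gap inside an annotation), and stripping may leave doubled spaces, so treat it as a starting point rather than a complete parser.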
Notes to table
1 Any punctuation under superscripted letter(s) in abbreviations is placed last, regardless of relative left-right orientation in the original. Thus, Mr. Mr: Mr– Mr may occur (inline versions Mr. Mr: Mr- Mr), but M.r M:r M-r will not. A letter+macron abbreviation (ac̄ept, com̄and, etc.; Bāloon) is generally expanded as doubling of that letter (accept, command) or an adjacent one (Balloon), but note Com̄ps, thrō, wc̄h (Compliments, through, which).
2 The dash as punctuation, represented by two hyphens, always has a space on either side. By contrast, a single unspaced hyphen character is used for a normal hyphen (well-known) and for the horizontal stroke under a superscript abbreviation (Mrs–). An unspaced double em-dash is used for a dash that suppresses all or part of a name or place (Miſs —— = ‘Miss Goldsworthy’, their —— = ‘their Majesties’, to —— = ‘to Windsor’, Mr. H—— = ‘Mr. Hodges’, Ly– S.—— = ‘Lady Stormont’, the K——g = ‘the King’), shortens a word (by T——w = ‘by Tomorrow’) or euphemistically blanks all or part of a profanity (D——d = ‘Damned’).
3 In some hands it can be difficult to distinguish upper and lower case in word-initial position. Decisions are based on close comparison with other letter-forms in the same hand, but some arbitrariness is inevitable.
4 French and other foreign languages are not normalised – neither corrected nor regularised to present-day grammar and orthography. Place-names and personal names are not generally normalised either.
5 Words split across two lines may have a hyphen on the first, the second or both fragments (reco-|ver, imperfect|-ly, satisfacti-|-on); or a double hyphen (pur=|port, dan|=ger, qua=|=litys); or none (respect|ing).
6 Centred text and right alignment are simulated on-screen by extra indentation.
7 Insertions that interrupt the text are moved to their logical point or to the start or end of a letter; address panels are placed at the end.
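
The reassembly of split words described in note 5 can be illustrated with a short sketch. It uses the note's own notation, with `|` standing for the line break, and simply drops any hyphens or double hyphens (`=`) adjacent to the break. The function name is hypothetical and this is not the project's actual procedure.

```python
def reassemble(split_word: str) -> str:
    """Join a word written as 'first|second', dropping break hyphens.

    '-' and '=' are stripped only where they touch the line break,
    so hyphens elsewhere in the word are preserved.
    """
    first, _, second = split_word.partition("|")
    return first.rstrip("-=") + second.lstrip("-=")


# Examples taken from note 5.
EXAMPLES = [
    "reco-|ver", "imperfect|-ly", "satisfacti-|-on",
    "pur=|port", "dan|=ger", "qua=|=litys", "respect|ing",
]

for word in EXAMPLES:
    print(word, "->", reassemble(word))
```

A compound that happens to break at its own hyphen would lose that hyphen under this rule, so such cases would need a separate check.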
Project files
The master-copy of each document in the project is an XML file conforming to TEI P5. End-of-line is LF only (Unix style).
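
As an illustration of how diplomatic and normalised readings might be drawn from such a file, the sketch below flattens a TEI fragment while keeping one side of each tagged pair. It assumes the paired elements (`<orig>` ~ `<reg>`, `<sic>` ~ `<corr>`, `<abbr>` ~ `<expan>`) are wrapped in `<choice>`, as TEI P5 allows; the project's actual markup and processing may differ, and the sample fragment is invented from examples in the table.

```python
import xml.etree.ElementTree as ET

# Which child of <choice> each version keeps (assumed mapping).
DIPLOMATIC = {"orig", "sic", "abbr"}
NORMALISED = {"reg", "corr", "expan"}


def local(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def render(elem, keep) -> str:
    """Flatten elem to text, keeping only the `keep` children of <choice>."""
    if local(elem.tag) == "choice":
        return "".join(render(c, keep) for c in elem if local(c.tag) in keep)
    parts = [elem.text or ""]
    for child in elem:
        parts.append(render(child, keep))
        parts.append(child.tail or "")
    return "".join(parts)


# Invented fragment for illustration only.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">My '
    "<choice><abbr>Dr</abbr><expan>Dear</expan></choice> Friend, she "
    "<choice><orig>eat</orig><reg>ate</reg></choice> some chicken.</p>"
)

root = ET.fromstring(SAMPLE)
print(render(root, DIPLOMATIC))  # My Dr Friend, she eat some chicken.
print(render(root, NORMALISED))  # My Dear Friend, she ate some chicken.
```

Deletions, additions, gaps and the other features in the table are ignored here; each would need its own rule per output version.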
Two different TXT files are derived from each (transcribed) XML file: a diplomatic version and a (partially) normalised one. The main purpose of normalisation is to facilitate research and improve part-of-speech tagging; coverage is subject to change. End-of-line is CR + LF (Windows style).
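
Users on Unix-like systems may want to convert line endings after downloading. The sketch below shows one way to do that, together with the file-naming pattern given in the table header (filename.xml, filename.txt, filename-n.txt). The function names and the example file name are hypothetical.

```python
from pathlib import Path


def txt_names(xml_name: str) -> tuple[str, str]:
    """Return the (diplomatic, normalised) TXT names for an XML file name."""
    stem = Path(xml_name).stem
    return f"{stem}.txt", f"{stem}-n.txt"


def to_lf(text: str) -> str:
    """Convert Windows (CR + LF) line endings to Unix (LF)."""
    return text.replace("\r\n", "\n")


def to_crlf(text: str) -> str:
    """Convert to CR + LF; existing CR + LF pairs are not doubled."""
    return to_lf(text).replace("\n", "\r\n")


# Hypothetical file name, for illustration only.
print(txt_names("letter-001.xml"))
```

When reading and writing the files in Python, passing `newline=""` to `open()` prevents the interpreter from translating line endings on its own.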
The transcriptions released to date, with each TXT format in a separate zip file, are freely available for non-profit use to anyone who registers by filling in our simple online form.