Date Descriptor
There is a need for a concise but precise way of expressing approximate dates and date ranges, especially one suitable for working with medieval documents. My main application area is entering place-name spellings in a database. I am not aware of any standard defined by historians, except the s.xii2 notation, and this is not powerful enough for the demands of place-name research. With the system described below, if I read that the date of a charter is “early thirteenth century to circa 1240”, I just write e13C-c1240 and do not have to think or interpret further. No information is lost, and my outputs will be consistently formatted whatever the editor wrote. See many more examples below.
Design criteria
- Quick to type, with no redundant keystrokes required
- No loss of information, and no extra assumptions imposed
- Unambiguous, but still with some looseness allowed in the inputs (so one does not have to remember too many rules; see 1W1 examples below)
- Compact, to save space in e.g. long lists of field-names
- Human-readable (or at least human-understandable, with minimal need to consult reference tables)
- Suitable simultaneously for e.g. making quick notes in the record office, for permanent records in databases, and for final use in printed and web publications.
- Machine-readable, in a strict sense, so that invalid specifications will be detected and rejected, and also so that code can be written which “understands” the date specifications and can process them meaningfully.
- Able to express all date information as commonly used in editions of documents (thus not requiring any re-interpretation of what an editor has already decided)
- Automatically expandable to a more readable form (see verbose output below)
- Regnal years handled automatically
- Covers exact dates as well as approximate ones, and ranges with possibly both ends uncertain
- Automatically sortable into date order
- Uncertain dates are sorted by the latest likely date, so that nothing is listed too early
- Case-insensitive (except for monarchs' names in regnal years), though lower case is preferred for the prefixes e for ‘early’ and m for ‘mid’, and upper case for C for ‘century’; the normalized output follows these preferences
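The rule that uncertain dates sort by their latest likely date can be illustrated for the simplest case, a century with an optional quarter or third prefix. The following is only a minimal sketch under stated assumptions, not the code in the files offered below: the function name latest_year and the integer rounding of thirds are hypothetical choices, and the return value -1 for invalid input follows the convention described in the Grammar section.

```python
import re

# Hypothetical helper: handles only "12C", "1q15C", "2t13C" and similar.
CENTURY_FRACTION = re.compile(r'^([1-4]q|[1-3]t)?(\d\d?)c$', re.IGNORECASE)

def latest_year(dd):
    """Return the latest year covered by a century descriptor, or -1."""
    m = CENTURY_FRACTION.match(dd)
    if m is None:
        return -1                      # syntax error, by the document's convention
    prefix, century = m.group(1), int(m.group(2))
    start = (century - 1) * 100        # e.g. the 15th century starts at 1400
    if prefix is None:
        return start + 100             # whole century: sort at its end
    n = int(prefix[0])                 # which quarter or third
    parts = 4 if prefix[1].lower() == 'q' else 3
    return start + n * 100 // parts    # assumed rounding: thirds end at 33, 66, 100

print(latest_year('1q15C'), latest_year('1t15C'), latest_year('12C'))
```

Under these assumptions a descriptor such as 1q15C is listed at the end of the first quarter of the century rather than at its start, so nothing appears earlier than it could have been written.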
Typeset example
The screenshot shows the system in action, specifying some Dodnash charters for Cattawade in Suffolk; the output of the Python code below has been fed through the LaTeX typesetting system.
Software
The following two files are a complete implementation in Python, a free language available for all operating systems: date_descriptor_09.py and regnal_year_03.py. The code works in the current Python 3.5 versions and also in the legacy Python 2.7 series, and is offered without any guarantees. I would appreciate bug reports or other feedback. The following extensions are in progress:
- A syntax t.Ed3 for “tempore” will be added soon, identical to the standard regnal year syntax, but allowing t or t. in place of a numeral. The verbose output will expand to the limits of the reign in question, and the sort value will be at the end of the reign.
- At present “n.d.” for ‘no date’ can only be achieved by a prenote; e.g. [n.d. ]?1435. A special syntax is being considered for this.
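To make the planned “tempore” behaviour concrete, here is a rough sketch of how such a descriptor might be expanded. Everything in it is an assumption made for illustration: the function expand_tempore, the three-entry reign table, the abbreviation table, the case-folding of names and the exact wording of the verbose output are hypothetical and are not taken from the files above. It is written for Python 3.

```python
import re

# Hypothetical reign table: (first year, last year); only three reigns shown.
REIGNS = {
    ('William', 1): (1066, 1087),
    ('Henry', 2): (1154, 1189),
    ('Edward', 3): (1327, 1377),
}
# Hypothetical abbreviation table, folded to lower case for simplicity.
NAMES = {'w': 'William', 'william': 'William',
         'h': 'Henry', 'hen': 'Henry', 'henry': 'Henry',
         'e': 'Edward', 'ed': 'Edward', 'edw': 'Edward', 'edward': 'Edward'}
ROMAN = {1: 'I', 2: 'II', 3: 'III'}

# "t" or "t." in place of the regnal year numeral, then name and number.
TEMPORE = re.compile(r'^t\.?\s*([A-Za-z]+)\s*(\d)$')

def expand_tempore(dd):
    """Return (sort value, verbose text), or (-1, '') if not understood."""
    m = TEMPORE.match(dd)
    if m is None:
        return (-1, '')
    key = (NAMES.get(m.group(1).lower()), int(m.group(2)))
    if key not in REIGNS:
        return (-1, '')
    start, end = REIGNS[key]
    verbose = 'tempore %s %s (%d\u2013%d)' % (key[0], ROMAN[key[1]], start, end)
    return (end, verbose)              # sort at the end of the reign

print(expand_tempore('t.Ed3'))
```

The sort value is the last year of the reign, in line with the rule that uncertain dates are sorted by the latest likely date.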
Grammar
An instance of DD (Date Descriptor) is parsed by first checking whether it is a regnal year, using a collection of special rules. If it is not, it is parsed by the following grammar. A year value for sorting is computed (a value of -1 indicates a syntax error in the input), and both normalized and verbose outputs are also created. The syntax is best understood by looking at the examples below. Note that the traditional a. for ‘ante’ is completely avoided (partly because of the danger of confusion with ‘after’) in favour of <, and the symmetrical > is provided for ‘after’. The regular expression syntax is standard; for example \d means a single digit, ? means ‘optional’, [] means a single one of the enclosed characters, and $ means the end of the expression. The current version does not check for nonsensical constructions like 1234-1150, but checks could easily be added.
prenote     = r'(\[(.*?)\])?'            # arbitrary text in [...]
postnote    = r'(\[(.*?)\])?'            # arbitrary text in [...]
circa       = r'((c\.?)|(circa))'        # "c" or "c." or "circa" for 'circa'
uncertain   = r'(\?)'
ba          = r'([<>])'                  # before or after
third       = r'[123]t'
quarter     = r'[1234]q'
eml         = r'(em)|(ml)|[eml]'         # e=early, em=early to middle, l=late etc.
prefix      = '((%s)|(%s)|(%s))' % (third, quarter, eml)   # one argument per %s
century     = r'(\d\d?)[Cc]'             # e.g. 12C
decade      = r'(\d{2,3}0)s'             # e.g. 1260s
year        = r'(\d{1,4})'
simplerange = r'((1\d\d\d[-]\d)|(1\d\d\d[-]\d\d)$)'         # e.g. "1243-7"
oldstyle    = r'((\d{3}[012345678]/\d)|(\d{3}9/\d{2}))'     # e.g. "1355/6"
parts       = dict(uncertain=uncertain, ba=ba, circa=circa, prefix=prefix, simplerange=simplerange, oldstyle=oldstyle, decade=decade, year=year, century=century)
first       = '{uncertain}?{ba}?{circa}?{prefix}?({simplerange}|{oldstyle}|{decade}|{year}|{century})'.format(**parts)
second      = '{uncertain}?{ba}?{circa}?{prefix}?({oldstyle}|{decade}|{year}|{century})'.format(**parts)
dd_grammar  = prenote+first+'((([-x–])|([-]{2}))'+second+')?'+postnote+'$' # the full grammar! (hyphen listed first so it is literal, not a range)
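The check for nonsensical constructions such as 1234-1150, mentioned above as a possible addition, could be sketched roughly as follows. This is an illustration only and not part of the current implementation: it uses its own small pattern for plain year ranges rather than the full grammar, and the function name range_is_reversed is hypothetical. It assumes, as in the simplerange rule, that a shorter second number abbreviates the first (1234-5 meaning 1234 to 1235), and it is written for Python 3.

```python
import re

# Two plain years joined by "-", "--", "x" or an en-dash (illustrative only).
PLAIN_RANGE = re.compile(r'^(\d{1,4})(?:--|[-x\u2013])(\d{1,4})$')

def range_is_reversed(dd):
    """True if dd is a plain year range whose end precedes its start."""
    m = PLAIN_RANGE.match(dd)
    if m is None:
        return False                   # not a plain year range: nothing to check
    first, second = m.group(1), m.group(2)
    if len(second) < len(first):
        # abbreviated end year, e.g. "1234-50" means 1234 to 1250
        second = first[:len(first) - len(second)] + second
    return int(second) < int(first)

print(range_is_reversed('1234-1150'), range_is_reversed('1234-50'))
```

A parser could call such a check after a successful match and report a reversed range in the same way as a syntax error.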
The syntax illustrated by examples
Here the first column is the input, and the next three columns are output from the Python code. The normalized output is intended to appear in publications derived from the input; this ensures consistency of layout and formatting. Note that an ordinary hyphen (-) for a range is converted to an en-dash (–) in the normalized output.
input               sort  normalized output  verbose output
--------------------------------------------------------------------------------
1234-5              1235  1234–5             1234–5
1234-50             1250  1234–50            1234–50
1101/2              1102  1101/2             1101/2
1109/10             1110  1109/10            1109/10
<1255               1255  <1255              before 1255
>1255               1255  >1255              after 1255
<c1255              1255  <c.1255            before circa 1255
1250-c1255          1255  1250–c.1255        1250 to circa 1255
1250-<1255          1255  1250–<1255         1250 to before 1255
1250-<c.1255        1255  1250–<c.1255       1250 to before circa 1255
1260s               1260  1260s              1260s
?em13C              1375  ?em13C             perhaps early to middle 13th century
em13C               1375  em13C              early to middle 13th century
ml15C               1600  ml15C              middle to late 15th century
1234                1234  1234               1234
cM13C               1366  c.m13C             circa middle of the 13th century
cm13C               1366  c.m13C             circa middle of the 13th century
e13C                1333  e13C               early 13th century
M12C                1266  m12C               middle of the 12th century
?M12C               1266  ?m12C              perhaps middle of the 12th century
1q15C               1425  1q15C              first quarter of the 15th century
1t15C               1433  1t15C              first third of the 15th century
1q15C-l16C          1600  1q15C–l16C         first quarter of the 15th century to the late 16th century
l15C-1q16C          1525  l15C–1q16C         late 15th century to the first quarter of the 16th century
9C--e10C            933   9C–e10C            9th century to the early 10th century
1234-1267           1234  1234–1267          1234 to 1267
1234x1267           1234  1234x1267          1234 to 1267
c1230               1230  c.1230             circa 1230
12C                 1200  12C                12th century
c950                950   c.950              circa 950
cm9C                966   c.m9C              circa middle of the 9th century
Ce16C               1633  c.e16C             circa early 16th century
cl12C               1300  c.l12C             circa late 12th century
c.l12C              1300  c.l12C             circa late 12th century
CL12C               1300  c.l12C             circa late 12th century
1 Henry 2           1155  1 Henry II         1 Henry II (1154/5)
3 Henry IV          1402  3 Henry IV         3 Henry IV (1401/2)
3H4                 1402  3 Henry IV         3 Henry IV (1401/2)
3Hen4               1402  3 Henry IV         3 Henry IV (1401/2)
3 William II        1090  3 William II       3 William II (1089/90)
3W2                 1090  3 William II       3 William II (1089/90)
1W I                1067  1 William I        1 William I (1067)
1 W1                1067  1 William I        1 William I (1067)
1W 1                1067  1 William I        1 William I (1067)
1W1                 1067  1 William I        1 William I (1067)
1W2                 1088  1 William II       1 William II (1087/8)
26E3                1352  26 Edward III      26 Edward III (1351/2)
26Ed3               1352  26 Edward III      26 Edward III (1351/2)
26Edw3              1352  26 Edward III      26 Edward III (1351/2)
26 Edward 3         1352  26 Edward III      26 Edward III (1351/2)
26 Edward III       1352  26 Edward III      26 Edward III (1351/2)
1R1                 1190  1 Richard I        1 Richard I (1189/90)
3R2                 1380  3 Richard II       3 Richard II (1379/80)
2R3                 1485  2 Richard III      2 Richard III (1484/5)
[perhaps]1205-l13C  1305  perhaps 1205–l13C  perhaps 1205 to the late 13th century
1205[, or later]    1205  1205, or later     1205, or later
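The normalized column for the non-regnal rows can be approximated with a few string substitutions. The sketch below is an assumption-laden illustration and not the method used in date_descriptor_09.py: the function normalize_simple is hypothetical, it deliberately ignores regnal years and the [...] notes, and it does not validate its input. It is written for Python 3.

```python
import re

def normalize_simple(dd):
    """Approximate the normalized form of a non-regnal descriptor without notes."""
    s = dd.lower()                            # prefixes e, m, l in lower case
    s = s.replace('circa', 'c.')              # long form of circa
    s = re.sub(r'(?<!\d)c\.?', 'c.', s)       # a "c" not after a digit is circa
    s = re.sub(r'(?<=\d)c', 'C', s)           # a "c" after a digit marks a century
    s = re.sub(r'--|-', '\u2013', s)          # hyphen or double hyphen to en-dash
    return s

for example in ['<c1255', 'cM13C', '9C--e10C', '1234x1267']:
    print(example, '->', normalize_simple(example))
```

The distinction between the two uses of the letter c rests only on whether a digit precedes it, which is enough for the examples in the table but would need the full grammar to be reliable in general.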