Keith Briggs

This page was last modified 2024-01-21  

Document Date Descriptor (DDD)

There is a need for a concise but precise way of expressing approximate dates and date ranges, especially one suitable for working with medieval documents. My main application area is entering place-name spellings in a database. I am not aware of any standard defined by historians, except the s.xii2 notation, and this is not expressive enough for the demands of place-name research. No information is lost with these DDD encodings, and the outputs will be consistently formatted (and sorted by date, if so desired) whatever the editor wrote. A crucial requirement is that the notation can be parsed by software.

Quick start: some real-world examples

Here the left-hand column contains real examples taken from editions of medieval documents, and the right-hand column contains the same information in DDD notation. Many more examples are at the bottom of this webpage.

early thirteenth century to circa 1240  	e13C-c1240
n.d. [?1st ½ 13c.]                      	nd?1h13C
possibly 1240 to circa 1255             	p1240-c1255
mid 13th century                        	m13C
second decade of the 15th century       	1410s
late 13th century to about 1320         	l13C-?1320
first quarter of the 16th century       	1q16C
1 Henry IV                              	1H4

The basic elements and their meaning

  • ?, p, c, nd: perhaps, probably, circa, no date. No precise meaning is given to the first two, but “probably” is intended to denote greater certainty than “perhaps”.
  • <, >: before, after
  • e, m, l: early, middle, late (applies to decades and centuries)
  • em, ml: early to middle, middle to late
  • s following digits: decade
  • C following digits: century
  • h, t, q: half, third, quarter (applies to centuries; must be preceded by a digit)
  • -: to
  • 1 Ed 2: typical regnal year specification
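As an illustration of how these elements combine, here is a minimal sketch that expands a century descriptor (with an optional e/m/l/em/ml or half/third/quarter prefix) into words. It is not the code offered on this page: the names `CENTURY`, `ordinal` and `expand_century` are hypothetical, and the wording is simply chosen to agree with the verbose column of the test cases at the bottom of the page.

```python
import re

# Wording chosen to agree with the verbose column of the test cases on this page.
PARTS = {"e": "early", "m": "middle of the", "l": "late",
         "em": "early to middle", "ml": "middle to late"}
ORDINAL_WORDS = {"1": "first", "2": "second", "3": "third", "4": "fourth"}
FRACTIONS = {"h": "half", "t": "third", "q": "quarter"}

# Hypothetical pattern: covers only an optional prefix followed by a century.
CENTURY = re.compile(
    r"(?P<part>em|ml|[eml]|[12]h|[123]t|[1234]q)?(?P<century>\d\d?)C$",
    re.IGNORECASE,
)

def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 13 -> '13th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)

def expand_century(ddd):
    """Expand e.g. '2q15C' into words; return None for anything else."""
    m = CENTURY.match(ddd)
    if m is None:
        return None
    part = (m.group("part") or "").lower()
    century = ordinal(int(m.group("century"))) + " century"
    if not part:
        return century
    if part in PARTS:
        return "{} {}".format(PARTS[part], century)
    # Remaining prefixes are a digit followed by h, t or q.
    return "{} {} of the {}".format(
        ORDINAL_WORDS[part[0]], FRACTIONS[part[1]], century)

for example in ["m13C", "e13C", "2q15C", "1h8C"]:
    print(example, "->", expand_century(example))
```

Years, decades, ranges, uncertainty markers and regnal years are deliberately left out of this sketch.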

General design criteria for the notation

  1. Quick to type, with no redundant keystrokes required.
  2. No loss of information, and no extra assumptions imposed.
  3. Unambiguous, but still with some looseness allowed in the inputs (so one does not have to remember too many rules; see 1W1 examples below).
  4. Compact, to save space in e.g. long lists of field-names.
  5. Human-readable (or at least human-understandable with minimal need to refer to reference tables).
  6. Extractable from free-format text by simple pattern-matching rules.
  7. Suitable simultaneously for e.g. making quick notes in the record office, for permanent records in databases, and for final use in printed and web publications.
  8. Machine-readable, in a strict sense, so that invalid specifications will be detected and rejected, and also so that code can be written which “understands” the date specifications and can process them meaningfully.
  9. Able to express all date information as commonly used in editions of documents (thus not requiring any re-interpretation of what an editor has already decided).
  10. Automatically expandable to a more readable form (see verbose output below).
  11. Regnal years handled automatically.
  12. Covers exact dates as well as approximate ones, and ranges with possibly both ends uncertain.
  13. Automatically sortable into date order.
  14. Uncertain dates are sorted by the latest likely date, so that nothing is listed too early.
  15. Case-insensitive (except for monarchs' names in regnal years), though lower case is preferred for the prefixes (e for ‘early’, m for ‘mid’) and upper case for C for ‘century’; the normalized output provides this.
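Criteria 13 and 14 can be illustrated with a small sketch that assigns each descriptor the last year of the period it names and sorts on that value. This is an assumption-laden illustration, not the implementation offered on this page: `PATTERN`, `OFFSETS` and `sort_year` are hypothetical names, only plain years, plain decades and century forms are handled, and the offsets are chosen to agree with the sort column of the test cases for those forms (the values for the second and third thirds of a century are my own extrapolation).

```python
import re

# Hypothetical simplified pattern: a decade, a century with optional prefix, or a year.
PATTERN = re.compile(
    r"(?:(?P<decade>\d{2,3}0)s"
    r"|(?P<part>[eml]|[12]h|[123]t|[1234]q)?(?P<century>\d\d?)C"
    r"|(?P<year>\d{1,4}))$",
    re.IGNORECASE,
)
# Years into the century at which "early", "middle" and "late" are taken to end.
OFFSETS = {"e": 33, "m": 66, "l": 100}
DIVISORS = {"h": 2, "t": 3, "q": 4}

def sort_year(ddd):
    """Return the latest likely year for ddd, or -1 if it is not recognised."""
    m = PATTERN.match(ddd)
    if m is None:
        return -1  # same convention as the page: -1 marks a syntax error
    if m.group("year"):
        return int(m.group("year"))
    if m.group("decade"):
        return int(m.group("decade")) + 10
    start = (int(m.group("century")) - 1) * 100
    part = (m.group("part") or "").lower()
    if not part:
        return start + 100
    if part in OFFSETS:
        return start + OFFSETS[part]
    # e.g. "2q" is the second quarter: it ends 2 * 100 // 4 years into the century
    return start + int(part[0]) * 100 // DIVISORS[part[1]]

dates = ["2q15C", "1260s", "e13C", "1234", "l12C"]
print(sorted(dates, key=sort_year))
```

Because each descriptor sorts on the end of its period, "e13C" is placed just before 1234 rather than at 1200, so nothing is listed too early.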

Typeset example

The screenshot shows the system in action specifying some Dodnash charters for Cattawade in Suffolk, the output of the python code below having been fed through the LaTeX typesetting system.

images/cattawade.png

Software

The following two files are a complete implementation in python, a free language available for all operating systems; the code works in the current python 3.5 versions, and also in the legacy python 2.7 series. The code is offered without any guarantee: date_descriptor_11.py and regnal_year_03.py. I would appreciate bug reports or other feedback. An extension to regnal years allowing the syntax t.Ed3 for “tempore” will be added soon. It is identical to the standard regnal year syntax, but has t or t. in place of a numeral. The verbose output will expand to the limits of the reign in question, and the sort value will be at the end of the reign.
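To make the regnal-year handling concrete, here is a rough sketch; it is not regnal_year_03.py. The names `REIGNS`, `NAMES`, `REGNAL`, `full_name` and `regnal` are hypothetical, the reign table is a small hand-entered sample, and the sketch assumes that regnal year n begins in calendar year (accession year + n - 1) and ends in the following one, ignoring the exact day of accession. The “tempore” form follows the description above (limits of the reign, sort value at its end), but the exact wording of its verbose output is a guess.

```python
import re

# (name, number) -> (accession year, final year of reign); small illustrative sample
REIGNS = {
    ("William", 2): (1087, 1100),
    ("Richard", 1): (1189, 1199),
    ("Henry", 3): (1216, 1272),
    ("Edward", 3): (1327, 1377),
    ("Richard", 2): (1377, 1399),
    ("Henry", 4): (1399, 1413),
}
NAMES = ("William", "Richard", "Henry", "Edward")
ROMAN = {1: "I", 2: "II", 3: "III", 4: "IV"}
# A numeral (or t / t. for tempore), a name or abbreviation, and an arabic digit.
REGNAL = re.compile(r"(?P<n>\d+|t\.?)\s*(?P<name>[A-Za-z]+?)\s*(?P<num>\d)$")

def full_name(abbrev):
    """Resolve an abbreviation such as 'H' or 'Edw' if it is unambiguous."""
    matches = [n for n in NAMES if n.lower().startswith(abbrev.lower())]
    return matches[0] if len(matches) == 1 else None

def regnal(spec):
    """Return (sort year, verbose text), or None if spec is not recognised."""
    m = REGNAL.match(spec)
    if m is None:
        return None
    key = (full_name(m.group("name")), int(m.group("num")))
    if key not in REIGNS:
        return None
    first, last = REIGNS[key]
    title = "{} {}".format(key[0], ROMAN[key[1]])
    if m.group("n").startswith("t"):
        # tempore: the whole reign, sorted at its end
        return last, "t. {} ({}-{})".format(title, first, last)
    start = first + int(m.group("n")) - 1
    end = start + 1
    # write 1401/2 but 1189/90, as in the old-style year notation
    tail = str(end)[-2:] if start % 10 == 9 else str(end)[-1:]
    return end, "{} {} ({}/{})".format(m.group("n"), title, start, tail)

print(regnal("3H4"))
print(regnal("tHen3"))
```

Roman numerals in the input, other monarchs, and checks that the year lies within the reign are omitted here.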

The DDD grammar: informal description

An instance of DDD (Document Date Descriptor) is parsed by first checking whether it is a regnal year by a collection of special rules. If it is not, it is parsed by the following grammar. A year value for sorting is computed (a value of -1 indicates a syntax error in the input), and both normalized and verbose outputs are also created. The syntax is best understood by looking at the examples below. Note that the traditional a. for ‘ante’ is completely avoided (partly because of the danger of confusion with ‘after’) in favour of <, and the symmetrical > is provided for ‘after’. The regular expression syntax is standard; for example \d means a single digit, ? means ‘optional’, [] means a single one of the enclosed characters, and $ means the end of the expression. The current version does not check for nonsensical constructions like 1234-1150, but checks could easily be added.

nodate     ='(nd)|(n\.\s?d\.)|(no\s?date)'
prenote    ='({nodate}|\[(.*?)\])?' # arbitrary text in [...]
postnote   ='(\[(.*?)\])?'          # arbitrary text in [...]
circa      ='((c\.?)|(circa))'      # "c" or "c." or "circa" for 'circa'
uncertain  ='\?|p'                  # perhaps, probably              
ba         ='([<>])'                # before or after
half       ='[12]h'
third      ='[123]t'
quarter    ='[1234]q'
eml        ='(em)|(ml)|[eml]'       # e=early, em=early to middle, l=late etc.
prefix     ='{third}|{half}|{quarter}|{eml}'
century    ='(\d\d?)[Cc]'           # e.g. 12C
decade     ='(\d{2,3}0)s'           # e.g. 1260s
year       ='(\d{1,4})'
simplerange='((1\d\d\d[-]\d)|(1\d\d\d[-]\d\d)$)'     # e.g. "1243-7"
oldstyle   ='((\d{3}[012345678]/\d)|(\d{3}9/\d{2}))' # e.g. "1355/6"
first      ='{uncertain}?{ba}?{circa}?{prefix}?({simplerange}|{oldstyle}|{decade}|{year}|{century})'
second     ='{uncertain}?{ba}?{circa}?{prefix}?({oldstyle}|{decade}|{year}|{century})'
dd_grammar =prenote+first+'(([x-]|[-]{2}|–)'+second+')?'+postnote+'$' # the full grammar!
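The fragments above can be assembled into a working pattern along the following lines. This is a sketch under stated assumptions rather than the code in date_descriptor_11.py: each fragment is wrapped in a non-capturing group so that a following ? applies to the whole fragment, the no-date fragment and the range separators are taken from the full expression given later on this page, and the dictionaries `frag` and `g` are names introduced only for this illustration.

```python
import re

frag = {
    "nodate": r"(nd)|(n\.\s?d\.)|(no\s?date)",
    "uncertain": r"\?|p",
    "ba": r"[<>]",
    "circa": r"(c\.?)|(circa)",
    "half": r"[12]h",
    "third": r"[123]t",
    "quarter": r"[1234]q",
    "eml": r"(em)|(ml)|[eml]",
    "century": r"(\d\d?)[Cc]",
    "decade": r"(\d{2,3}0)s",
    "year": r"(\d{1,4})",
    "simplerange": r"(1\d\d\d[-]\d)|(1\d\d\d[-]\d\d)$",
    "oldstyle": r"(\d{3}[012345678]/\d)|(\d{3}9/\d{2})",
}
# Assumption: wrap every fragment so that "?" and "|" act on it as a unit.
g = {name: "(?:" + pattern + ")" for name, pattern in frag.items()}
g["prefix"] = "(?:{third}|{half}|{quarter}|{eml})".format(**g)

prenote = r"(?:{nodate}|\[(.*?)\])?".format(**g)
postnote = r"(?:\[(.*?)\])?"
first = ("{uncertain}?{ba}?{circa}?{prefix}?"
         "(?:{simplerange}|{oldstyle}|{decade}|{year}|{century})").format(**g)
second = ("{uncertain}?{ba}?{circa}?{prefix}?"
          "(?:{oldstyle}|{decade}|{year}|{century})").format(**g)
rangesep = r"(?:[x-]|[-]{2}|–)"  # x, hyphen, double hyphen or en-dash

dd_grammar = re.compile(
    prenote + first + "(?:" + rangesep + second + ")?" + postnote + "$")

for ddd in ["e13C-c1240", "nd?1h13C", "1410s", "13X"]:
    print(ddd, "valid" if dd_grammar.match(ddd) else "rejected")
```

The pattern is case-sensitive apart from the explicit [Cc], so this sketch does not by itself give the case-insensitivity described in the design criteria.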

The full DDD grammar as a python regular expression

(?P<prenote>((nd)|(n\.\s?d\.)|(no\s?date))|(\[(.*?)\]))?(?P<uncertain0>\?|p)?(?P<ba0>[<>])?(?P<circa0>(c\.?)|(circa))?(?P<prefix0>([12]h)|([123]t)|([1234]q)|((em)|(ml)|[eml]))?((?P<simplerange>(1\d\d\d[-]\d)|(1\d\d\d[-]\d\d)$)|(?P<oldstyle0>(\d{3}[012345678]/\d)|(\d{3}9/\d{2}))|(?P<decade0>\d{2,3}0)s|(?P<year0>\d{1,4})|(?P<century0>\d\d?)[Cc])((?P<rangesep>[x-]|[-]{2}|–)(?P<uncertain1>\?|p)?(?P<ba1>[<>])?(?P<circa1>(c\.?)|(circa))?(?P<prefix1>([12]h)|([123]t)|([1234]q)|((em)|(ml)|[eml]))?((?P<oldstyle1>(\d{3}[012345678]/\d)|(\d{3}9/\d{2}))|(?P<decade1>\d{2,3}0)s|(?P<year1>\d{1,4})|(?P<century1>\d\d?)[Cc]))?(\[(?P<postnote>.*?)\])?$
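The named groups make it straightforward to pull the components out of a descriptor. The sketch below splits the expression above over several string literals (it is intended to be the same pattern, transcribed by hand) and reports the groups that took part in a match; the names `DDD` and `fields` are introduced here for illustration only.

```python
import re

# The expression above, split into adjacent string literals for readability.
DDD = re.compile(
    r"(?P<prenote>((nd)|(n\.\s?d\.)|(no\s?date))|(\[(.*?)\]))?"
    r"(?P<uncertain0>\?|p)?"
    r"(?P<ba0>[<>])?"
    r"(?P<circa0>(c\.?)|(circa))?"
    r"(?P<prefix0>([12]h)|([123]t)|([1234]q)|((em)|(ml)|[eml]))?"
    r"((?P<simplerange>(1\d\d\d[-]\d)|(1\d\d\d[-]\d\d)$)"
    r"|(?P<oldstyle0>(\d{3}[012345678]/\d)|(\d{3}9/\d{2}))"
    r"|(?P<decade0>\d{2,3}0)s"
    r"|(?P<year0>\d{1,4})"
    r"|(?P<century0>\d\d?)[Cc])"
    r"((?P<rangesep>[x-]|[-]{2}|–)"
    r"(?P<uncertain1>\?|p)?"
    r"(?P<ba1>[<>])?"
    r"(?P<circa1>(c\.?)|(circa))?"
    r"(?P<prefix1>([12]h)|([123]t)|([1234]q)|((em)|(ml)|[eml]))?"
    r"((?P<oldstyle1>(\d{3}[012345678]/\d)|(\d{3}9/\d{2}))"
    r"|(?P<decade1>\d{2,3}0)s"
    r"|(?P<year1>\d{1,4})"
    r"|(?P<century1>\d\d?)[Cc]))?"
    r"(\[(?P<postnote>.*?)\])?$"
)

def fields(ddd):
    """Return the named groups that matched, or None on a syntax error."""
    m = DDD.match(ddd)
    if m is None:
        return None
    return {name: value for name, value in m.groupdict().items()
            if value is not None}

print(fields("e13C-c1240"))
print(fields("nd?1h13C"))
print(fields("13X"))
```

Groups ending in 0 describe the first (or only) date and groups ending in 1 the second date of a range, so code that interprets the result only needs to look at which names are present.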

The test cases

Below is the output from running the python code on my standard set of test cases DDD_test_cases.txt. Here the first column is input, and the next three columns are output from the python code. The normalized output is intended to appear in publications derived from the input; this then ensures consistency of layout and formatting. Note that an ordinary hyphen (-) for a range is converted to an en-dash (–) in the normalized output.

input                     	sort	normalized output     	verbose output

# simple cases
1234                      	1234	1234                  	1234
1234-5                    	1235	1234–5                	1234-5
1101/2                    	1102	1101/2                	1101/2
1109/10                   	1110	1109/10               	1109/10

# uncertain
c1230                     	1230	c.1230                	circa 1230
c.1230                    	1230	c.1230                	circa 1230
c950                      	 950	c.950                 	circa 950
?950                      	 950	?950                  	perhaps 950
?1289                     	1290	?1289                 	perhaps 1289
p1230s                    	1240	p1230s                	probably 1230s
ndcm13C                   	1266	n.d.c.m13C            	no date, circa middle of the 13th century
pl12C                     	1200	pl12C                 	probably late 12th century

# before or after
<1255                     	1255	<1255                 	before 1255
>1255                     	1255	>1255                 	after 1255
<c1255                    	1255	<c.1255               	before circa 1255
p>1300                    	1300	p>1300                	probably after 1300
l12C                      	1200	l12C                  	late 12th century

# decades
1260s                     	1270	1260s                 	1260s
e1220s                    	1213	e1220s                	early 1220s
m1220s                    	1217	m1220s                	middle of the 1220s
l1220s                    	1220	l1220s                	late 1220s

# centuries and parts of centuries
12C                       	1200	12C                   	12th century
13C                       	1300	13C                   	13th century
m12C                      	1166	m12C                  	middle of the 12th century
l12C                      	1200	l12C                  	late 12th century
e13C                      	1233	e13C                  	early 13th century
em13C                     	1275	em13C                 	early to middle 13th century
?em13C                    	1276	?em13C                	perhaps early to middle 13th century
em13C                     	1275	em13C                 	early to middle 13th century
ml15C                     	1500	ml15C                 	middle to late 15th century
cM13C                     	1266	c.m13C                	circa middle of the 13th century
cm13C                     	1266	c.m13C                	circa middle of the 13th century
?M12C                     	1166	?m12C                 	perhaps middle of the 12th century
2h14C                     	1400	2h14C                 	second half of the 14th century
1h8C                      	 750	1h8C                  	first half of the 8th century
1t15C                     	1433	1t15C                 	first third of the 15th century
2q15C                     	1450	2q15C                 	second quarter of the 15th century
3q15C                     	1475	3q15C                 	third quarter of the 15th century
4q15C                     	1500	4q15C                 	fourth quarter of the 15th century
cm9C                      	 866	c.m9C                 	circa middle of the 9th century
ce16C                     	1533	c.e16C                	circa early 16th century

# ranges
975-1016                  	1016	975–1016              	975 to 1016
975x1016                  	1016	975x1016              	975 to 1016
1297-1298                 	1298	1297–1298             	1297 to 1298
1297-98                   	1298	1297–98               	1297-98
1297-8                    	1298	1297–8                	1297-8
1234-5                    	1235	1234–5                	1234-5
1234-50                   	1250	1234–50               	1234-50
1250-c1255                	1255	1250–c.1255           	1250 to circa 1255
1250-<1255                	1255	1250–<1255            	1250 to before 1255
1250-<c.1255              	1255	1250–<c.1255          	1250 to before circa 1255
c1200-c1300               	1300	c.1200–c.1300         	circa 1200 to circa 1300
c1200xc1300               	1300	c.1200xc.1300         	circa 1200 to circa 1300
c1200-<c1300              	1300	c.1200–<c.1300        	circa 1200 to before circa 1300
l12c-e13c                 	1233	l12C–e13C             	late 12th century to the early 13th century
1q15C-l16C                	1525	1q15C–l16C            	first quarter of the 15th century to the late 16th century
l15C-1q16C                	1525	l15C–1q16C            	late 15th century to the first quarter of the 16th century
9C-e10C                   	 933	9C–e10C               	9th century to the early 10th century
1234-1267                 	1267	1234–1267             	1234 to 1267
1234x1267                 	1267	1234x1267             	1234 to 1267
>1243-<1255               	1255	>1243–<1255           	after 1243 to before 1255
<1255-c1260               	1260	<1255–c.1260          	before 1255 to circa 1260
p>1255-c1260              	1260	p>1255–c.1260         	probably after 1255 to circa 1260

# prenotes and postnotes
[perhaps]1205-l13C        	1300	[perhaps]1205–l13C    	perhaps 1205 to the late 13th century
[n.d., perhaps]c1345      	1345	[n.d., perhaps]c.1345 	n.d., perhaps circa 1345
[no date, possibly]1345-1355	1355	[no date, possibly]1345–1355	no date, possibly 1345 to 1355
1205[, or later]          	1205	1205[, or later]      	1205, or later

# insensitivity to case
cl12C                     	1200	c.l12C                	circa late 12th century
c.l12C                    	1200	c.l12C                	circa late 12th century
CL12C                     	1200	c.l12C                	circa late 12th century
E12c                      	1133	e12C                  	early 12th century
E1260S                    	1253	e1260s                	early 1260s

# no date
nd?1250                   	1250	n.d.?1250             	no date, perhaps 1250
n.d.?1250                 	1250	n.d.?1250             	no date, perhaps 1250
no date?1250              	1250	n.d.?1250             	no date, perhaps 1250

# regnal years
3E6                       	1549	3 Edward VI           	3 Edward VI (1548/9)
3Edw6                     	1549	3 Edward VI           	3 Edward VI (1548/9)
1 Will 1                  	1067	1 William I           	1 William I (1067)
1 Will I                  	1067	1 William I           	1 William I (1067)
1 William I               	1067	1 William I           	1 William I (1067)
1W I                      	1067	1 William I           	1 William I (1067)
1 W1                      	1067	1 William I           	1 William I (1067)
1W 1                      	1067	1 William I           	1 William I (1067)
1W1                       	1067	1 William I           	1 William I (1067)
1W2                       	1088	1 William II          	1 William II (1087/8)
1 Henry 2                 	1155	1 Henry II            	1 Henry II (1154/5)
3 Henry IV                	1402	3 Henry IV            	3 Henry IV (1401/2)
3H4                       	1402	3 Henry IV            	3 Henry IV (1401/2)
3Hen4                     	1402	3 Henry IV            	3 Henry IV (1401/2)
3 William II              	1090	3 William II          	3 William II (1089/90)
3W2                       	1090	3 William II          	3 William II (1089/90)
26E3                      	1353	26 Edward III         	26 Edward III (1352/3)
26Ed3                     	1353	26 Edward III         	26 Edward III (1352/3)
26Edw3                    	1353	26 Edward III         	26 Edward III (1352/3)
26 Edward 3               	1353	26 Edward III         	26 Edward III (1352/3)
26 Edward III             	1353	26 Edward III         	26 Edward III (1352/3)
1R1                       	1190	1 Richard I           	1 Richard I (1189/90)
3R2                       	1380	3 Richard II          	3 Richard II (1379/80)
2R3                       	1485	2 Richard III         	2 Richard III (1484/5)
tHen3                     	1272	t. Henry III          	t. Henry III
1Ma                       	1554	1 Mary I              	1 Mary I (1553/4)
2My                       	1555	2 Mary I              	2 Mary I (1554/5)
3Mary                     	1556	3 Mary I              	3 Mary I (1555/6)
4Mary                     	1557	4 Mary I              	4 Mary I (1556/7)
1234-50                   	1250	1234–50               	1234-50
1234x50                   	1250	1234x50               	1234 to 50

This page was last modified 2024-01-21 10:57 by Keith Briggs.