Oxford University Press Text Capture Instructions

 

Extracts of legal documents and structured quotes

Where required by the product, capture extracts of legal documents (legislation, or fragments of other judgments) or structured quotes (e.g. quotes containing heading levels, div structures) in the extract element. Otherwise, capture them as quotes in displayText.

Typecodes used in manuscripts for Extracts

Typecode      Description              Element      Notes
EXT /EXT      extract                  extract      Uses start/end environment tags (other typecodes may be nested within)
EXT-S         extract source           extract p
LEXT /LEXT    extract - legislation    extract      Uses start/end environment tags (other typecodes may be nested within)
CEXT /CEXT    extract - case           extract      Uses start/end environment tags (other typecodes may be nested within)

When used by LPF products or HE.

//extract

When required by a product, mark legal extracts and structured quotes in the extract element.

Wherever legal extracts occur in either the headnote or award section of case reports, capture them in extract elements.

For all legal extract tagging, mark the extract element with a set of attributes that record the title, date, section number and other details. This set of attributes is exactly the same as that used for the bibItem element, and should be populated in the same way.

Each extract element must be captured inside its own p element.
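
As a minimal sketch of this rule, the fragment below places an extract in a p of its own rather than inline in the surrounding running text. The text content is placeholder wording, not taken from any real document, and the attributes that record the extract's details are omitted.

```xml
<!-- Illustrative sketch only: all text content is placeholder wording -->
<p>Text introducing the extract.</p>
<p>
  <!-- The extract is the only content of this p -->
  <extract>
    <textMatter>
      <p>Text of the extracted passage.</p>
    </textMatter>
  </extract>
</p>
<p>Text following the extract.</p>
```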

Within the extract element, where possible, tag the extracted content in exactly the same way as in the original document. When an extract contains:

  • one or more enumerated paragraphs or headed sections, it should contain textMatter with div1
  • text without any enumerator or headed sections, it should contain textMatter with just p children
  • a title followed by text, it should contain an initial titleGroup element
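
The three cases above might be sketched as follows. These are skeletons under stated assumptions: the child content of div1 and titleGroup is not specified in this topic, so it is left as comments, and placing titleGroup as a sibling before textMatter is an assumed reading of "initial". All text content is placeholder wording.

```xml
<!-- Case 1: enumerated paragraphs or headed sections: textMatter with div1 -->
<extract>
  <textMatter>
    <div1>
      <!-- first enumerated paragraph or headed section,
           tagged as in the original document -->
    </div1>
    <div1>
      <!-- second enumerated paragraph or headed section -->
    </div1>
  </textMatter>
</extract>

<!-- Case 2: text without enumerators or headed sections: textMatter with just p children -->
<extract>
  <textMatter>
    <p>First paragraph of the extracted text.</p>
    <p>Second paragraph of the extracted text.</p>
  </textMatter>
</extract>

<!-- Case 3: a title followed by text: an initial titleGroup
     (position before textMatter is an assumption) -->
<extract>
  <titleGroup>
    <!-- title markup, tagged as in the original document -->
  </titleGroup>
  <textMatter>
    <p>Text following the title.</p>
  </textMatter>
</extract>
```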

Any other lower-level structural content that appears in the original document (see Lower-level structural content) may also be captured as nested elements within the extract.

When explicitly instructed by OUP for a particular extract, add the attribute long="true" to the extract element.

//extract/@long
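
A minimal sketch of an extract flagged as long is shown here. The text content is placeholder wording, and the attribute is added only when OUP has explicitly instructed it for that extract.

```xml
<p>
  <!-- long="true" is set only on explicit instruction from OUP -->
  <extract long="true">
    <textMatter>
      <p>Text of the extract.</p>
    </textMatter>
  </extract>
</p>
```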

When the extract is in a different language from the surrounding context, add the xml:lang attribute to the extract element, set to the three-letter ISO 639-2 code for that language.

//extract/@xml:lang
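
For illustration, the sketch below assumes an Italian extract appearing in an English-language document; ita is the ISO 639-2 code for Italian. The text content is placeholder wording.

```xml
<p>
  <!-- Surrounding document is assumed to be in English -->
  <extract xml:lang="ita">
    <textMatter>
      <p>Testo dell'estratto in italiano.</p>
    </textMatter>
  </extract>
</p>
```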

Release ID: 20261202
ID: OUP_Structured_Text_TCI_topic_3_6
Author: dunnm
Last changed: Wed, 04 Jun 2025
Modified by: buckmasm
Revision#: 4400