MS Word styles in Primary Source Documents
A manuscript for a Primary Source Document uses named MS Word styles. Each style either maps to an XML structure or marks content that is not captured.
| Name of MS Word style | XPath mapping |
|---|---|
| PSD Title | /document/titleGroup/title |
| TOC | Do not capture this content in XML. |
| H1 | //div1/titleGroup/title |
| H2 | //div2/titleGroup/title |
| H3 | //div3/titleGroup/title |
| Paragraph | //p |
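
The mapping in the table above can be held in a simple lookup, shown here as a minimal Python sketch. The names `STYLE_TO_XPATH` and `xpath_for_style` are hypothetical and not part of any existing tool; `None` is an assumed convention for content that is not captured in XML.

```python
# Hypothetical lookup built from the style table above.
# None marks content that should not be captured in XML (an assumed convention).
STYLE_TO_XPATH = {
    "PSD Title": "/document/titleGroup/title",
    "TOC": None,
    "H1": "//div1/titleGroup/title",
    "H2": "//div2/titleGroup/title",
    "H3": "//div3/titleGroup/title",
    "Paragraph": "//p",
}


def xpath_for_style(style_name):
    """Return the XPath for an MS Word style, or None if the content is skipped.

    Raises KeyError for a style that the table does not list, so that
    unexpected styles are noticed rather than silently dropped.
    """
    if style_name not in STYLE_TO_XPATH:
        raise KeyError(f"Unmapped MS Word style: {style_name!r}")
    return STYLE_TO_XPATH[style_name]
```

Raising an error for unlisted styles is one possible design choice; a conversion workflow might instead log the style and continue.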
Special characters may appear in the manuscript as names enclosed in angle brackets. Capture these as appropriate Unicode characters.
Text in MS
The Qur<ayn>ān was written many years ago.
Text in XML
The Qurʿān was written many years ago.
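
The substitution shown in this example could be automated along the following lines. This is a sketch under stated assumptions: the table `SPECIAL_CHARS` and the function `resolve_special_chars` are hypothetical names, only the `ayn` entry comes from the example above, and the pattern assumes that character names consist of ASCII letters only.

```python
import re

# Hypothetical name-to-character table. Only "ayn" is taken from the
# example above; any further entries would have to come from the
# project's own list of special character names.
SPECIAL_CHARS = {
    "ayn": "ʿ",
}

# Assumption: a special character name is a run of ASCII letters in angle brackets.
_NAME_PATTERN = re.compile(r"<([A-Za-z]+)>")


def resolve_special_chars(text):
    """Replace each known <name> with its Unicode character.

    Unknown names are left unchanged so they can be reviewed by hand.
    """
    def replace(match):
        return SPECIAL_CHARS.get(match.group(1), match.group(0))

    return _NAME_PATTERN.sub(replace, text)
```

For instance, `resolve_special_chars("Qur<ayn>an")` returns the same string with `<ayn>` replaced by the character stored under `"ayn"`, while an unlisted name such as `<foo>` passes through untouched.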