[OFFICE-2102] Member Proposal: Input field normalization - OASIS Technical Committees Issue Tracker

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: ODF 1.1, ODF 1.2
Fix Version/s: ODF 1.3
Component/s: Fields, Part 3 (Schema) [1.2: 1]
Labels: None

Proposal:

I.

3.18 White Space Processing and EOL Handling

in the Note, in "their element children. 6.1.2", replace "element children" with "descendant elements".

II.

6.1.2 White Space Characters

replace:

"* in their descendant elements, if the OpenDocument schema permits the inclusion of character data for the element itself and all its ancestor elements up to the paragraph element."

with:

"* in their descendant elements, if the OpenDocument schema permits <text:s> [6.1.3], <text:tab> [6.1.4] and <text:line-break> [6.1.5] as element content."

replace the entire algorithm with:

<quote>
Collapsing white space characters inside a paragraph element is defined by the following algorithm:

1) Descendant <text:ruby> elements are replaced with their <text:ruby-base> child elements.

2) Descendant elements of the paragraph element which are not <text:s>, <text:tab> or <text:line-break> elements and for which the OpenDocument schema does not permit <text:s>, <text:tab> and <text:line-break> as child elements are removed from the paragraph element.

3) Descendant elements of the paragraph element for which the OpenDocument schema permits <text:s>, <text:tab> and <text:line-break> as child elements are replaced by their character data and <text:s>, <text:tab> and <text:line-break> element children.

4) Original ODF 1.2 step 1): each U+0009, U+000D and U+000A character is replaced with U+0020.

5) Original ODF 1.2 step 3): leading U+0020 characters are removed.

6) Original ODF 1.2 step 4): each sequence of U+0020 characters is replaced with a single U+0020.

7) The remaining <text:s>, <text:tab> and <text:line-break> elements are interpreted as the [UNICODE] white space characters they represent.

OpenDocument producers shall produce paragraph elements that, when consumed according to this algorithm, result in the expected amount of white space.

OpenDocument consumers shall either process white space such that the result is equivalent to the result of the given algorithm, or implement a variation that increases interoperability with popular OpenDocument 1.2 producers. The variation replaces step 2 of the algorithm with steps 2a and 2b:

2a) Descendant elements of the paragraph element that are mark elements (
<text:change> 5.5.7.4
<text:change-end> 5.5.7.3
<text:change-start> 5.5.7.2
<text:bookmark> 6.2.1.2
<text:bookmark-end> 6.2.1.4
<text:bookmark-start> 6.2.1.3
<text:reference-mark> 6.2.2.2
<text:reference-mark-end> 6.2.2.4
<text:reference-mark-start> 6.2.2.3
<text:toc-mark> 8.1.4
<text:toc-mark-end> 8.1.3
<text:toc-mark-start> 8.1.2
<text:user-index-mark> 8.1.7
<text:user-index-mark-end> 8.1.6
<text:user-index-mark-start> 8.1.5
<text:alphabetical-index-mark> 8.1.10
<text:alphabetical-index-mark-end> 8.1.9
<text:alphabetical-index-mark-start> 8.1.8
) are removed from the paragraph element.

2b) Descendant elements of the paragraph element which are not <text:s>, <text:tab> or <text:line-break> elements and for which the OpenDocument schema does not permit <text:s>, <text:tab> and <text:line-break> as child elements are replaced with a hypothetical <text:s text:c="0"/> element.

</quote>
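
The quoted algorithm and its 2a/2b variation can be sketched in Python as follows. This is an illustration under stated assumptions, not the committee's reference implementation. `CONTAINERS` is a hypothetical stand-in for a real schema lookup of "permits <text:s>, <text:tab> and <text:line-break> as child elements", mapping <text:line-break> to U+000A is an assumption, and so is the reading that white space elements (including the hypothetical <text:s text:c="0"/>) act as boundaries that a run of U+0020 cannot be collapsed across. Step 5 follows the shorthand above and removes leading spaces only.

```python
import re
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def q(name):
    """Qualified ElementTree tag name in the ODF text namespace."""
    return f"{{{TEXT_NS}}}{name}"


WS_ELEMENTS = {q("s"), q("tab"), q("line-break")}
# Assumption: tiny stand-in for a schema lookup of elements that permit
# <text:s>, <text:tab> and <text:line-break> as children.
CONTAINERS = {q("span"), q("a"), q("ruby-base")}
# The 18 mark elements listed in step 2a.
MARKS = {
    q(base + suffix)
    for base in ("change", "bookmark", "reference-mark", "toc-mark",
                 "user-index-mark", "alphabetical-index-mark")
    for suffix in ("", "-start", "-end")
}


def expand(elem):
    """Step 7: the characters a white space element represents."""
    if elem.tag == q("s"):
        return " " * int(elem.get(q("c"), "1"))
    # Assumption: a line break is modelled as U+000A.
    return "\t" if elem.tag == q("tab") else "\n"


def flatten(elem, variation, out):
    """Steps 1 to 3: reduce elem to ("text", ...) and ("ws", ...) tokens."""
    if elem.text:
        out.append(("text", elem.text))
    for child in elem:
        if child.tag == q("ruby"):                      # step 1
            base = child.find(q("ruby-base"))
            if base is not None:
                flatten(base, variation, out)
        elif child.tag in WS_ELEMENTS:
            out.append(("ws", expand(child)))
        elif child.tag in CONTAINERS:                   # step 3
            flatten(child, variation, out)
        elif variation and child.tag not in MARKS:      # step 2b
            out.append(("ws", ""))  # hypothetical <text:s text:c="0"/>
        # otherwise the element is removed (step 2, or step 2a for marks)
        if child.tail:
            out.append(("text", child.tail))


def collapse(paragraph, variation=False):
    tokens = []
    flatten(paragraph, variation, tokens)
    pieces = []  # adjacent character data is merged, "ws" tokens separate it
    for kind, value in tokens:
        if kind == "text" and pieces and pieces[-1][0] == "text":
            pieces[-1][1] += value
        else:
            pieces.append([kind, value])
    result = []
    for index, (kind, value) in enumerate(pieces):
        if kind == "text":
            value = re.sub(r"[\t\r\n]", " ", value)     # step 4
            if index == 0:
                value = value.lstrip(" ")               # step 5
            value = re.sub(r" {2,}", " ", value)        # step 6
        result.append(value)
    return "".join(result)


if __name__ == "__main__":
    p = ET.fromstring(
        f'<text:p xmlns:text="{TEXT_NS}">a <text:note>n</text:note> b</text:p>'
    )
    print(repr(collapse(p)))                  # 'a b'
    print(repr(collapse(p, variation=True)))  # 'a  b'
```

In this sketch the two modes differ only for elements such as <text:note>: the base algorithm removes the element, so the spaces on either side merge into one, while the variation keeps them apart.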

III.

6.1.2 White Space Characters

following the algorithm, add a note that generic pretty-printing is not reliable:

"Note: XML formatting software that does not implement the ODF whitespace rules might introduce or remove spaces."
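
As an illustration of the note, the following sketch runs a generic pretty-printer over a paragraph and compares the text before and after. It assumes Python 3.9 or later for `ET.indent`. The `visible_text` helper is a hypothetical, simplified stand-in for the collapsing algorithm (it joins all character data and collapses white space runs) and does not implement the full ODF rules.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
SOURCE = (
    f'<text:p xmlns:text="{TEXT_NS}">'
    "<text:span>ab</text:span><text:span>cd</text:span>"
    "</text:p>"
)


def visible_text(paragraph):
    # Simplified stand-in for ODF white space collapsing: concatenate
    # all character data, then collapse runs of white space.
    return " ".join("".join(paragraph.itertext()).split())


before = ET.fromstring(SOURCE)
after = ET.fromstring(SOURCE)
ET.indent(after)  # generic indentation, unaware of the ODF white space rules

print(repr(visible_text(before)))  # 'abcd'
print(repr(visible_text(after)))   # 'ab cd'
```

The indentation inserted between the two <text:span> elements is ordinary character data of the paragraph, so it survives collapsing as a single space.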

Resolution:

[see proposal]


Description

http://wiki.oasis-open.org/office/InputFields