  1. OASIS Open Document Format for Office Applications (OpenDocument) TC
  2. OFFICE-2681

Should OpenFormula evaluators be *required* to support BMP or all Unicode/10646 characters?

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: OpenFormula
    • Labels:
      None
    • Proposal:
      For users, the most robust change would be to replace the text:
      "Evaluators should accept [UNICODE] strings, but shall accept strings of ASCII (Unicode U+0020 through U+007F, inclusive) characters."
      with:
      "Evaluators shall be able to accept, process, and produce strings containing any characters defined in [UNICODE]."

      There are many alternatives. For example, add this text as a conformance clause for the medium group,
      or make BMP support required for the medium group and all Unicode characters required for the large group.

    • Resolution:
      Duplicate of OFFICE-2672; closing this one.


      Description

      Part 2 (OpenFormula) section 3.2 "Text:" says:
      "A text value (also called a string value) is a sequence of zero or more characters.
      Evaluators should accept [UNICODE] strings, but shall accept strings of ASCII (Unicode U+0020 through U+007F, inclusive) characters."

      Some commenters on the open comment list believe there should be a stronger requirement:
      http://lists.oasis-open.org/archives/office-comment/201005/msg00002.html

      Certainly, from a user's point of view a stronger requirement would be nice.

      Two basic questions:
      1. Should the required character set supported by the evaluator at run-time be increased,
      and if so, to what (BMP or all Unicode/10646)?
      2. Under what conditions should it be increased?
      (All implementations? Only those of medium group or up?
      Maybe require BMP in medium group, and all characters in large group?)

      This is related to:
      http://tools.oasis-open.org/issues/browse/OFFICE-2663

      I'd like implementors to briefly respond with comments to THIS JIRA comment,
      noting what they can support and if there are major "gotchas".
      For example, can everyone support evaluating BMP or all Unicode characters
      at formula runtime regardless of the user's locale setting
      (I'm concerned this may be an issue for Excel)?
      Can everyone handle arbitrary characters, or is anyone limited to BMP
      (our 16-bit-char friends can end up with this problem)?
      If anyone is limited, is this a stumbling block?

      I have a particular concern for the implementations that use 16-bit-chars internally.
      If you're given a character that is not in the BMP, what do FIND, LEFT, etc. do?
      Do they simply presume (incorrectly) that all chars are in the BMP, and thus you can
      cut a character in half? Or do they count "correctly" to the right character?
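
To make the concern concrete, here is a minimal sketch of the two behaviours for a LEFT-style function. It is not taken from any implementation; the helper names (utf16_units, left_naive, left_by_code_point, ends_with_lone_high_surrogate) are hypothetical, and the "naive" version only models an engine that stores text as 16-bit code units and counts those units.

```python
def utf16_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text, as a 16-bit-char engine would store it."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def left_naive(text: str, count: int) -> list[int]:
    """Hypothetical LEFT that counts 16-bit code units; it may stop inside a surrogate pair."""
    return utf16_units(text)[:count]


def left_by_code_point(text: str, count: int) -> str:
    """Hypothetical LEFT that counts whole characters (Python str indexes by code point)."""
    return text[:count]


def ends_with_lone_high_surrogate(units: list[int]) -> bool:
    """True if the last code unit is a high surrogate with no low surrogate after it."""
    return bool(units) and 0xD800 <= units[-1] <= 0xDBFF


# U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP, so UTF-16 needs two units for it.
sample = "\U0001D11Eab"
naive = left_naive(sample, 1)            # one code unit: only half of the character
correct = left_by_code_point(sample, 1)  # one whole character
```

In this sketch the naive result is a single high surrogate, which is not a valid character on its own, while the code-point version returns the complete character. Whether a given evaluator behaves like either function is exactly what implementors are asked to report.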

      Systems that use UTF-8 internally presumably do this correctly, since they
      have to "count" to get to the right characters anyway, but I'd like to know if that's
      NOT true for anyone.
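
As an illustration of the "counting" mentioned above, a UTF-8 based engine can find character boundaries by skipping continuation bytes (those of the form 10xxxxxx). The function below is a hypothetical sketch, assuming well-formed UTF-8 input; it is not the code of any particular evaluator.

```python
def utf8_left(data: bytes, count: int) -> bytes:
    """Return the first `count` characters of well-formed UTF-8 `data`."""
    seen = 0
    for index, byte in enumerate(data):
        # Bytes of the form 10xxxxxx continue a character; every other byte starts one.
        if (byte & 0xC0) != 0x80:
            if seen == count:
                # `index` is the start of character number `count`, so cut just before it.
                return data[:index]
            seen += 1
    # Fewer than `count + 1` characters: the whole input is the answer.
    return data
```

Because the cut always lands on a byte that starts a character, a non-BMP character (four bytes in UTF-8) is never split, at the cost of a linear scan.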

            People

            • Assignee: David Wheeler (david.wheeler, Inactive)
            • Reporter: David Wheeler (david.wheeler, Inactive)