• Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: V4.0_OS
    • Fix Version/s: V4.01_OS
    • Component/s: CSDL JSON, CSDL XML
    • Labels:
    • Environment:

      Closed as applied 2020-1-16


      7.2.2 MaxLength

      "A positive integer value specifying the maximum length of a binary, stream or string value. For binary or stream values this is the octet length of the binary data, for string values it is the character length."

      What does character mean here? (Unicode specs don't define character in any normative text).

      3.3 Primitive Types

      "Edm.String Sequence of UTF-8 characters"

      If we combine 7.2.2 and 3.3, we might reasonably infer that MaxLength is the maximum valid length of a String value measured in UTF-8 code units (octets).

      Is this what the spec intended, in which case 7.2.2 should be clarified, or was it intended that 7.2.2 refer to UTF-16 code units or Unicode code points?

      See also: https://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries

      Why does any of this matter? Consider a client that wants to create an offline cache of data from a server (in a database where columns need a specified maximum length). Or consider some other intermediary that wants to allocate space for a buffer (e.g. malloc MaxLength+1 for a buffer to hold a Property value in a C program). Such apps need to be able to determine how much space to set aside to avoid accidental truncation of values.
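
To make the buffer-sizing concern concrete, here is a minimal sketch of the worst-case number of octets needed to hold a UTF-8 encoded value plus a terminating NUL, under each possible reading of MaxLength. The function name and the unit labels are hypothetical and not taken from the spec. The multipliers rely on standard encoding facts: one code point takes at most 4 octets in UTF-8, and one UTF-16 code unit corresponds to at most 3 UTF-8 octets (a surrogate pair is 2 units for 4 octets).

```python
# Worst-case UTF-8 octets per counted unit, for each reading of MaxLength.
# The unit labels are illustrative names, not terms defined by the spec.
WORST_CASE_OCTETS_PER_UNIT = {
    "utf8_octets": 1,       # MaxLength already counts octets
    "utf16_code_units": 3,  # a BMP character can need 3 octets in UTF-8
    "code_points": 4,       # a supplementary character needs 4 octets
}


def worst_case_buffer_octets(max_length: int, unit: str) -> int:
    """Return the octets to allocate for a NUL-terminated UTF-8 buffer."""
    if max_length <= 0:
        raise ValueError("MaxLength must be a positive integer")
    return max_length * WORST_CASE_OCTETS_PER_UNIT[unit] + 1


if __name__ == "__main__":
    for unit in WORST_CASE_OCTETS_PER_UNIT:
        print(unit, worst_case_buffer_octets(10, unit))
```

For MaxLength="10", the three readings give 11, 31 and 41 octets respectively, so a buffer of MaxLength+1 octets is only safe under the UTF-8 octet reading.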

      Additionally, for any client or other agent wishing to validate a Property value against MaxLength, it makes a huge difference whether the length is counted in UTF-8 code units, UTF-16 code units or Unicode code points.
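
As an illustration of how far the three counts can diverge, the following sketch measures one string in each unit. The helper name `length_measures` is hypothetical, and the sketch takes no position on which count the spec intends.

```python
def length_measures(value: str) -> dict:
    """Return the length of value under three candidate definitions."""
    return {
        # Python's len() on str counts Unicode code points.
        "code_points": len(value),
        # Each byte of the UTF-8 encoding is one code unit (octet).
        "utf8_octets": len(value.encode("utf-8")),
        # UTF-16 code units are 2 bytes each; "-le" avoids a BOM.
        "utf16_code_units": len(value.encode("utf-16-le")) // 2,
    }


if __name__ == "__main__":
    # "naive" with U+00EF, a space, then U+1F600 (outside the BMP).
    sample = "na\u00efve \U0001F600"
    print(length_measures(sample))
```

The sample string has 7 code points, 8 UTF-16 code units and 11 UTF-8 octets, so with MaxLength="8" it would pass validation under two of the readings and fail under the third.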




            • Assignee:
              evan.ireland.2 Evan Ireland
            • Watchers:
              2


              • Created: