-
Type: Bug
-
Status: Closed
-
Priority: Minor
-
Resolution: Unresolved
-
Affects Version/s: SMP 1.0
-
Fix Version/s: None
-
Component/s: Documentation
-
Labels: None
-
Proposal:
-
Resolution:
Section 3.3 of SMP states:
XML documents returned by HTTP GET MUST be well-formed according to [XML 1.0] and MUST be UTF-8 encoded ([Unicode]). They MUST contain an XML declaration starting with “<?xml” which includes the «encoding» attribute set to “UTF-8”.
This can be interpreted as implying that the lowercase string "utf-8" would be an incorrect value for the encoding. There are a number of problems with this reading:
1) All examples in the spec use "utf-8". While it is true that the examples are marked as non-normative, one would expect them to be consistent with the spec.
2) XML 1.0 states that XML processors SHOULD match character encoding names in a case-insensitive way.
3) The IANA character set repository states that "character set names may be up to 40 characters taken from the printable characters of US-ASCII. However, no distinction is made between use of upper and lower case letters."
https://www.iana.org/assignments/character-sets/character-sets.xhtml
4) If no encoding is specified, XML 1.0 assumes UTF-8 encoding. The attribute is only relevant if some other encoding (like UTF-16) is used.
5) XML has been around for two decades. I doubt that any current version of a commonly used XML library would break if a lowercase or mixed-case variant is used.
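Points 4 and 5 are easy to spot-check against one widely used parser. The snippet below is only a minimal sketch: it uses Python's standard xml.etree.ElementTree (backed by expat), and the `<doc>`/`<name>` document is a made-up placeholder, not an SMP payload. It parses the same UTF-8 bytes with the declaration written in upper, lower and mixed case, and with no declaration at all. It says nothing about other libraries.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload, UTF-8 encoded; "\xc3\xa9" is the byte sequence for "é".
BODY = b"<doc><name>caf\xc3\xa9</name></doc>"

# The same document with differently cased (or missing) encoding declarations.
DECLARATIONS = {
    "upper": b'<?xml version="1.0" encoding="UTF-8"?>',
    "lower": b'<?xml version="1.0" encoding="utf-8"?>',
    "mixed": b'<?xml version="1.0" encoding="Utf-8"?>',
    "absent": b"",
}


def parse_name(declaration: bytes) -> str:
    """Parse declaration + BODY and return the text of the <name> element."""
    root = ET.fromstring(declaration + BODY)
    return root.findtext("name")


results = {label: parse_name(decl) for label, decl in DECLARATIONS.items()}

for label, text in results.items():
    print(f"{label}: {text}")
```

If a parser did reject the lowercase form, `ET.fromstring` would raise `ParseError` for that variant instead of returning the element text.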
Conventional wisdom on the Internet is that the uppercase variant is preferred, because XML 1.0 only says that processors SHOULD (rather than MUST) match encoding names case-insensitively, but that both variants are allowed.
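For implementers who validate the declaration required by section 3.3, a lenient reading could look like the sketch below. This is an assumption about how such a check might be written, not something the SMP specification prescribes: the function name `declares_utf8` is hypothetical, and the regular expression is a simplification that expects the document to begin directly with the XML declaration (no byte order mark) and the encoding pseudo-attribute to follow version immediately.

```python
import re

# Simplified pattern for '<?xml version="..." encoding="..."' at the very start
# of the document. Captures the encoding name (EncName in XML 1.0 terms).
_DECLARATION = re.compile(
    rb"""^<\?xml\s+version\s*=\s*["'][^"']*["']\s+encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def declares_utf8(document: bytes) -> bool:
    """Return True if the XML declaration names UTF-8 in any letter case."""
    match = _DECLARATION.match(document)
    if match is None:
        # No declaration, or no encoding pseudo-attribute in it.
        return False
    # Compare case-insensitively, in line with XML 1.0 and the IANA registry.
    return match.group(1).decode("ascii").upper() == "UTF-8"


print(declares_utf8(b'<?xml version="1.0" encoding="UTF-8"?><doc/>'))
print(declares_utf8(b'<?xml version="1.0" encoding="utf-8"?><doc/>'))
print(declares_utf8(b'<?xml version="1.0" encoding="UTF-16"?><doc/>'))
```

A strict reading of section 3.3 would replace the case-insensitive comparison with an exact match on "UTF-8", which is precisely the behaviour this issue argues against.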