Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.1
    • Fix Version/s: 3.1.1
    • Component/s: core
    • Labels:
      None
    • Proposal:

      463 (Description of Connect Message) and 614 (Payload). Change "The payload contains one or more UTF-8 encoded strings" to "The payload contains one or more encoded fields"
      615 Change "These strings, if present.." to "These fields, if present.."
      Each of 620 / 637 / 649 / 661 (Client ID, Will Topic, User Name, Password) change "UTF-encoded string" to "field", and add a further sentence saying "It is a UTF-8 encoded string".

      641..643 Change the paragraph
      "If the Will Flag=1, this is the next UTF-8 encoded string. The Will Message defines the content of the message that is published to the Will Topic if the client is unexpectedly disconnected. The Will Message can contain zero characters."
      to say

      "If the Will Flag=1, this is the next encoded field. The Will Message defines the payload content of the message that is published to the Will Topic if the client is unexpectedly disconnected. This field, if present, must consist of a 2-byte length (MSB followed by LSB) followed by the payload for the Will Message expressed as a sequence of zero or more bytes. The length gives the number of bytes in the payload that follows and does not include the 2 bytes taken up by the length itself."

      645..647 Change the paragraph
      "Although the Will Message is UTF-8 encoded in the CONNECT message, when it is published to the Will Topic only the bytes of the message are sent, not the first two length bytes. The message must therefore only consist of 7-bit ASCII characters."
      to say

      "When the Will Message is published to the Will Topic, its payload consists only of the payload portion of this field, not the first two length bytes."
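
The proposed field layout (a 2-byte length, MSB first, followed by zero or more payload bytes) can be sketched as below. This is only an illustration of the wording above, not code from any broker or client; the function names `encode_will_message` and `decode_will_message` are hypothetical.

```python
import struct


def encode_will_message(payload: bytes) -> bytes:
    """Build the CONNECT field: 2-byte length (MSB, then LSB) + payload.

    The length counts only the payload bytes, not the 2 length bytes.
    """
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit a 2-byte length prefix")
    # "!H" packs an unsigned 16-bit integer in big-endian order (MSB first).
    return struct.pack("!H", len(payload)) + payload


def decode_will_message(field: bytes) -> bytes:
    """Return the payload portion that would be published to the Will Topic."""
    if len(field) < 2:
        raise ValueError("field is shorter than the 2-byte length prefix")
    (length,) = struct.unpack("!H", field[:2])
    payload = field[2:2 + length]
    if len(payload) != length:
        raise ValueError("field is truncated")
    return payload


# Arbitrary bytes are allowed; the payload need not be valid UTF-8 or ASCII.
field = encode_will_message(b"\x00\xffbye")
published = decode_will_message(field)
```

An empty Will Message is still a valid field under this wording: it is encoded as the two length bytes `00 00` and published with a zero-length payload.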


      Description

      The current 3.1 specification states that the will message is encoded in UTF-8 in the CONNECT message but will be published in ASCII encoding by an MQTT broker. This is a major inconsistency in the specification, since this is the only case where ASCII encoding is used.

      Here's the relevant citation from the specification:
      "Although the Will Message is UTF-8 encoded in the CONNECT message, when it is published to the Will Topic only the bytes of the message are sent, not the first two length bytes. The message must therefore only consist of 7-bit ASCII characters."

      A payload for a PUBLISH can of course be any raw bytes. In the case of the will message, we should consider removing the inconsistency from the spec. I see two possibilities:

      1. The will message in the CONNECT message is not UTF-8 encoded but ASCII encoded.
      2. The will message in the will PUBLISH is UTF-8. This would collide with the current spec because empty payloads are possible according to the 3.1 spec (with UTF-8, IIRC, two length bytes have to be sent even with an empty message).

      I would vote for option two because it would remove this inconsistency in the spec, and the will message is encoded in UTF-8 in the CONNECT message anyway. I don't think the overhead of the two length bytes in the case of an empty message is a serious problem. We could discuss whether, in the case of an empty payload (= empty UTF-8 string), broker implementations should remove the length bytes automatically to reduce the overhead in PUBLISH messages.
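
To make the trade-off concrete, here is a minimal sketch of what a length-prefixed UTF-8 string looks like on the wire. It assumes the same 2-byte big-endian length prefix discussed above; the helper name `utf8_string_field` is hypothetical and the sample texts are made up for illustration.

```python
def utf8_string_field(text: str) -> bytes:
    """Encode text as UTF-8 and prefix it with a 2-byte big-endian length."""
    data = text.encode("utf-8")
    if len(data) > 0xFFFF:
        raise ValueError("string does not fit a 2-byte length prefix")
    return len(data).to_bytes(2, "big") + data


# For 7-bit ASCII text, the ASCII and UTF-8 encodings are byte-identical,
# which is why the 3.1 wording could restrict the message to ASCII.
ascii_text = "client gone"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# Non-ASCII characters take more than one byte in UTF-8, so the length
# prefix counts bytes, not characters.
non_ascii_field = utf8_string_field("für")

# An empty string still costs the two length bytes (the overhead in question).
empty_field = utf8_string_field("")
```

Under option two, the two bytes of `empty_field` are what a broker would send as the payload of a will PUBLISH for an empty message, unless implementations strip them as suggested above.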

            People

            • Assignee:
              ragupta2 Rahul Gupta
              Reporter:
              dobermai Dominik Obermaier (Inactive)