Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 5
Fix Version/s: None
Component/s: None
Labels: None
Proposal:
Resolution:
I propose that the MQTT 5.0 specification prohibit retransmission of PUBLISH and PUBREL packets during the lifespan of a single transport connection. This sort of retransmission was permitted in MQTT 3.1. I will argue that it creates the problems outlined below and that, in MQTT 5.0, it can break flow control and possibly topic caching.
The necessary normative change would be to amend the following statement, which currently appears in MQTT 3.1.1 Section 4.4.
From:
"This is the only time where a Client or Server is REQUIRED to redeliver messages"
To:
"This is the only time when a Client or Server is permitted to redeliver messages"
For your convenience, relevant text from the MQTT 3.1 standard has been included at the end of this text.
This JIRA is intended to stimulate discussion within the TC, following a brief exchange on the MQTT-TC conference call on August 11, 2016.
1. Review of Existing Standards
Both MQTT 3.1 and MQTT 3.1.1 require retransmission of unacknowledged PUBLISH messages (QoS > 0) and PUBREL messages after the transport connection is re-established for a session created with CleanSession = 0. Both versions state this is the only time retransmission is required.
But MQTT 3.1 also says that QoS 1 and 2 messages may be retransmitted if acknowledgement is not received within a specified period of time. Furthermore, this retransmission is permitted even if the session was created with CleanSession=1.
The following bullet points summarize the MQTT 3.1 behavior regarding this sort of retransmission:
a.) Implementations may retransmit QoS 1 or QoS 2 messages at any time, including during the lifespan of a single transport connection, and may do so whether the connection was created with CleanSession=0 or CleanSession=1.
b.) QoS 1 messages are not required to be retransmitted in the order they were originally published. More precisely, the standard says QoS 2 messages are retransmitted in the original order, but does not say this is the case for QoS 1.
c.) QoS 2 messages retransmitted on the same transport connection retry the protocol flow starting with the last unacknowledged protocol message, either PUBLISH or PUBREL.
d.) The specification adds "Brokers, however, should retry any unacknowledged message." This seems to contradict the statement that retransmission is only required after re-connection.
e.) The specification suggests that SUBSCRIBE and UNSUBSCRIBE messages may be retransmitted if an acknowledgement is not received after a specified period of time.
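To make the contrast concrete, here is a minimal Python sketch of the decision a sender makes about its unacknowledged packets under the MQTT 3.1-style behavior summarized above and under the proposed rule. The names (`Policy`, `InFlight`, `packets_to_resend`) and the 10-second `retry_after` default are illustrative assumptions, not taken from any specification or implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    TIMEOUT_AND_RECONNECT = "mqtt-3.1-style"  # may also resend after a timeout
    RECONNECT_ONLY = "proposed"               # resend only after reconnecting


@dataclass
class InFlight:
    packet_id: int
    kind: str       # "PUBLISH" (QoS > 0) or "PUBREL"
    sent_at: float  # seconds on the sender's clock


def packets_to_resend(inflight, now, policy, reconnected,
                      clean_session=False, retry_after=10.0):
    """Return the packet ids the sender would retransmit right now."""
    if reconnected:
        # Session state is discarded for CleanSession=1, so nothing is resent.
        if clean_session:
            return []
        # Both policies resend every unacknowledged packet after a reconnect.
        return [p.packet_id for p in inflight]
    if policy is Policy.TIMEOUT_AND_RECONNECT:
        # Hypothetical timer: resend anything unacknowledged for too long,
        # even though the transport connection is still up.
        return [p.packet_id for p in inflight if now - p.sent_at >= retry_after]
    # Proposed rule: never resend while the same connection is alive.
    return []


inflight = [InFlight(1, "PUBLISH", sent_at=0.0), InFlight(2, "PUBREL", sent_at=8.0)]
print(packets_to_resend(inflight, 12.0, Policy.TIMEOUT_AND_RECONNECT, reconnected=False))
print(packets_to_resend(inflight, 12.0, Policy.RECONNECT_ONLY, reconnected=False))
print(packets_to_resend(inflight, 12.0, Policy.RECONNECT_ONLY, reconnected=True))
```

The only difference between the two policies in this sketch is the timeout branch, which is the behavior this proposal would remove.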
2. Problems with this retransmission
a.) QoS 1 messages may be duplicated or re-ordered even if there is no transport failure. Without retransmission, this will not occur if the transport connection remains operational.
b.) Busy receivers may cause the sender to retransmit, thereby increasing the traffic load on a busy receiver.
c.) Flow control of QoS 1 messages can be broken if this retransmission is used because the flow control tallies will be altered by the retransmitted packets.
d.) Re-ordering of QoS 1 messages may interfere with topic caching.
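Points (a) and (c) above can be illustrated with a small deterministic model. This is a sketch under stated assumptions, not a description of any real broker: the receiver is slow and has acknowledged nothing yet, the sender counts distinct unacknowledged packet identifiers against a quota, and the receiver counts every PUBLISH that arrives. The function name `simulate` and its parameters are hypothetical.

```python
def simulate(receive_quota, packet_ids, resend_ids):
    """Model one live connection to a receiver that has not sent any PUBACK.

    Returns (arrivals, sender_tally, receiver_tally), where arrivals is a
    list of (packet_id, dup_flag) in the order the receiver sees them.
    """
    arrivals = []
    unacked_at_sender = set()  # sender tallies distinct in-flight packet ids
    unacked_at_receiver = 0    # assumption: receiver tallies arriving PUBLISHes

    # Initial transmissions: the sender stops once the quota is used up.
    for pid in packet_ids:
        if len(unacked_at_sender) < receive_quota:
            unacked_at_sender.add(pid)
            arrivals.append((pid, False))
            unacked_at_receiver += 1

    # Timeout-driven retransmissions on the same connection (DUP set).
    for pid in resend_ids:
        if pid in unacked_at_sender:
            arrivals.append((pid, True))
            unacked_at_receiver += 1

    return arrivals, len(unacked_at_sender), unacked_at_receiver


arrivals, sender_tally, receiver_tally = simulate(3, [1, 2, 3], resend_ids=[1])
print(arrivals)        # message 1 arrives twice, the second time after 2 and 3
print(sender_tally)    # sender believes 3 messages are outstanding
print(receiver_tally)  # receiver has seen 4 unacknowledged PUBLISH packets
```

With `resend_ids=[]` the two tallies agree and each message arrives once, in order, which matches the claim that these problems do not arise without retransmission while the transport connection remains operational.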
The relevant MQTT 3.1 passages are reproduced here for your convenience:
"Section 4.1 Quality of Service levels and flows"
(For QoS 1)
"The receipt of a message by the server is acknowledged by a PUBACK message. If there is an identified failure of either the communications link or the sending device, or the acknowledgement message is not received after a specified period of time, the sender resends the message with the DUP bit set in the message header. The message arrives at the server at least once. Both SUBSCRIBE and UNSUBSCRIBE messages use QoS level 1.
If the client does not receive a PUBACK message (either within a time period defined in the application, or if a failure is detected and the communications session is restarted), the client may resend the PUBLISH message with the DUP flag set.
When it receives a duplicate message from the client, the server republishes the message to the subscribers, and sends another PUBACK message."
(For QoS 2)
"If a failure is detected, or after a defined time period, the protocol flow is
retried from the last unacknowledged protocol message; either the PUBLISH or PUBREL. See Message delivery retry for more details. The additional protocol flows ensure that the message is delivered to subscribers once only."
General (Section 4.2)
"Although TCP normally guarantees delivery of packets, there are certain
scenarios where an MQTT message may not be received. In the case of MQTT messages that expect a response (QoS >0 PUBLISH, PUBREL, SUBSCRIBE, UNSUBSCRIBE), if the response is not received within a certain time period, the sender may retry delivery. The sender should set the DUP flag on the message.
The retry timeout should be a configurable option. However care must be taken to ensure message delivery does not timeout while it is still being sent. For example, sending a large message over a slow network will naturally take longer than a small message over a fast network. Repeatedly retrying a timed-out message could often make matters worse so a strategy of increasing the timeout value across multiple retries should be used.
When a client reconnects, if it is not marked clean session, both the client
and server should redeliver any previous in-flight messages.
Other than this "on reconnect" retry behaviour, clients are not required to
retry message delivery. Brokers, however, should retry any unacknowledged message."