The Topic name MAY be any string which can be encoded as UTF-8 up to 65535 bytes in length.
Description
Snippet from Section 2.2 of v3.1 specification.
---------------------------------------------------------------
The topic name is a UTF-encoded string. See the section on MQTT and UTF-8 for more information. Topic name has an upper length limit of 32,767 characters.
Snippet in Appendix A -
--------------------------------
The following principles apply to the construction and content of a topic tree:
> The length is limited to 64k but within that there are no limits to the number of
levels in a topic tree.
Should the topic length on the PUBLISH command be restricted to 32,767 characters?
Raphael Cohen (Inactive)
added a comment - What are the impacts of making it 2^16? Are there any clients or existing brokers which explicitly create buffers as 2^15?
Rahul Gupta (Inactive)
added a comment - There is an interesting thread on this subject from Dan Anderson, Nick and Roger:
https://groups.google.com/forum/?fromgroups#!topic/mqtt/7GpQm1ZxI2M
Peter Niblett (Inactive)
added a comment - There's clearly an inconsistency between the words in 2.2 and Appendix A which we need to resolve.
There's a hard upper limit of 65535 bytes because of the two-byte length field that precedes the UTF-8 string. The reference to characters rather than bytes in 2.2 is interesting. It's either
a) A hangover from an earlier spec in which the field was single-byte ASCII characters (in which case the intent was to limit you to 32767 bytes)
b) An attempt to specify 65535 bytes based on an assumption that the UTF-8 characters would be at most 2 bytes long. As they can be three or four bytes long, you could theoretically exceed 65535 bytes and still be under 32767 characters.
So taken literally, an implementation today would have to be able to cope with 65535 bytes anyway.
I think we should remove the limit, leaving just the implicit limit of 65535 bytes imposed by the length prefix.
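To make the bytes-versus-characters point concrete, here is a minimal sketch in Python. It assumes the two-byte length prefix is an unsigned big-endian integer; the helper name encode_mqtt_string and the choice of the euro sign as a three-byte character are illustrative only and do not come from the specification text quoted above.

```python
import struct

# Largest value an unsigned two-byte length field can carry.
MAX_UTF8_BYTES = 0xFFFF  # 65535


def encode_mqtt_string(text: str) -> bytes:
    """Hypothetical helper: UTF-8 encode text and prepend a two-byte length.

    Assumes the length prefix is big-endian ("!H" in struct notation).
    """
    payload = text.encode("utf-8")
    if len(payload) > MAX_UTF8_BYTES:
        raise ValueError(
            f"{len(payload)} bytes does not fit in a two-byte length field"
        )
    return struct.pack("!H", len(payload)) + payload


# Two topics with the same character count but different byte counts.
ascii_topic = "a" * 32767        # 1 byte per character in UTF-8
wide_topic = "\u20ac" * 32767    # euro sign, 3 bytes per character in UTF-8

ascii_bytes = len(ascii_topic.encode("utf-8"))
wide_bytes = len(wide_topic.encode("utf-8"))

print(ascii_bytes, wide_bytes)
```

Both strings are 32767 characters long, yet only the first fits behind a two-byte prefix; passing wide_topic to encode_mqtt_string raises ValueError. A limit stated in characters therefore does not bound the encoded size, which is why the byte limit is the one that matters.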
Richard Coppen (Inactive)
added a comment - Discussed at the TC meeting (23.05.2013): assigned to Rahul to remove the limit of 32,767 characters from 2.2 (Topic Name)
Rahul Gupta (Inactive)
added a comment - updated section 3.3.2.1 in WD-04
----------------------------------------------
The Topic name MUST be a string which can be encoded as UTF-8 up to 65535 bytes in length. The Topic name MUST NOT contain wildcard characters. When received by a client that subscribed using wildcard characters, this string is the absolute topic specified by the originating publisher and not the subscription string used by the client.
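As an illustration of how an implementation might enforce the revised wording, the following Python sketch checks a PUBLISH topic name against both rules. It assumes the wildcard characters are '#' and '+', which the text above does not spell out; the function name check_publish_topic is hypothetical.

```python
MAX_TOPIC_BYTES = 65535   # implied by the two-byte length prefix
WILDCARDS = ("#", "+")    # assumed MQTT wildcard characters


def check_publish_topic(topic: str) -> bytes:
    """Hypothetical validator: return the UTF-8 bytes of a valid topic name.

    Raises ValueError if the name cannot be encoded as UTF-8, is longer
    than 65535 bytes once encoded, or contains a wildcard character.
    """
    # str.encode raises UnicodeEncodeError (a ValueError subclass) for
    # strings that cannot be encoded, such as lone surrogates.
    encoded = topic.encode("utf-8")
    if len(encoded) > MAX_TOPIC_BYTES:
        raise ValueError(f"topic name is {len(encoded)} bytes, limit is 65535")
    for wildcard in WILDCARDS:
        if wildcard in topic:
            raise ValueError(f"topic name contains wildcard {wildcard!r}")
    return encoded
```

The length is measured on the encoded bytes rather than on the character count, so a name made of multi-byte characters is rejected as soon as its encoding passes 65535 bytes, however few characters it contains.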