[ODATA-232] Enhance description of normalization procedures (public comment c201301e00001) - OASIS Technical Committees Issue Tracker

Details

Type: Improvement
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: V4.0_WD01
Fix Version/s: V4.0_WD01
Component/s: ABNF, URL Conventions
Labels:
None
Environment:

[Proposed]

Proposal:

After applying the three steps defined by RFC3986, the following three steps are performed:

4. Split query at "&" into query options, and each query option at the first "=" into query option name and query option value before percent-decoding
5. Percent-decode path segments, query option names, and query option values
6. Interpret path segments, query option names, and query option values according to OData rules

One of these rules is that single quotes within string literals are represented as two consecutive single quotes.

Valid URLs:
~/People('O''Neil')
~/People(%27O%27%27Neil%27)
~/People%28%27O%27%27Neil%27%29
~/OperatingSystems('OS%2F2')

Invalid URLs:
~/People('O%27Neil')
~/OperatingSystems('OS/2')
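
The examples above can be checked with a minimal Python sketch of steps 5 and 6 applied to one path segment. The helper name `decode_key_segment` and its error handling are illustrative assumptions, not part of the proposal, and it only covers segments shaped like Name('literal').

```python
from urllib.parse import unquote


def decode_key_segment(raw_segment: str) -> tuple:
    """Percent-decode one path segment, then read its quoted key literal."""
    # Step 5: percent-decode the segment (it was already split at "/").
    segment = unquote(raw_segment)
    name, _, rest = segment.partition("(")
    if len(rest) < 3 or not rest.startswith("'") or not rest.endswith("')"):
        raise ValueError("not of the form Name('literal'): " + raw_segment)
    body = rest[1:-2]
    # Step 6: inside the literal every single quote must be doubled.
    if "'" in body.replace("''", ""):
        raise ValueError("unescaped single quote in literal: " + raw_segment)
    return name, body.replace("''", "'")


print(decode_key_segment("People%28%27O%27%27Neil%27%29"))
print(decode_key_segment("OperatingSystems('OS%2F2')"))
```

In this sketch `People('O%27Neil')` is rejected because decoding leaves a lone single quote inside the literal. `OperatingSystems('OS/2')` never reaches the helper as one segment, because the path is split at "/" before decoding.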

Accepted: https://www.oasis-open.org/committees/download.php/48481/odata-meeting-28_on-20130307-minutes.html#odata-232

Resolution:

https://www.oasis-open.org/committees/download.php/48585/odata-core-v4.0-wd01-part2-url-conventions-2013-03-19.doc
https://tools.oasis-open.org/version-control/browse/wsvn/odata/trunk/spec/ABNF/odata-abnf-construction-rules-v4.0-wd01.txt?rev=215
https://tools.oasis-open.org/version-control/browse/wsvn/odata/trunk/spec/ABNF/odata-abnf-testcases.xml?rev=215

Accepted: https://www.oasis-open.org/committees/download.php/49026/odata-meeting-34_on-20130425_26-F2F-minutes.html#odata-232


Description

The public comment [c201301e00001](https://lists.oasis-open.org/archives/odata-comment/201301/msg00001.html), titled "Query String parsing in URIs", indicates that the description of normalization procedures in the ABNF Construction Rules can be enhanced.

RFC3986 defines three sets of characters:

unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

Only characters in these three sets MAY occur in URLs; all other characters MUST be percent-encoded.
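
As a rough illustration, the three sets can be written down as Python sets and used to test a single character. The names `UNRESERVED`, `GEN_DELIMS`, `SUB_DELIMS`, and `must_be_percent_encoded` are made up for this sketch.

```python
import string

# The three RFC3986 character sets quoted above.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")

ALLOWED = UNRESERVED | GEN_DELIMS | SUB_DELIMS


def must_be_percent_encoded(ch: str) -> bool:
    """True if a literal occurrence of ch may not appear unencoded in a URL."""
    return ch not in ALLOWED


print(must_be_percent_encoded(" "))
print(must_be_percent_encoded("'"))
```

Note that a literal "%" also falls outside the three sets, so the sketch reports it as needing encoding (as %25); an unencoded "%" in a URL only ever introduces a percent-encoded octet.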

RFC3986 defines three steps for URL processing that MUST be performed before percent-decoding:
1. Split undecoded URL into components scheme, hier-part, query, and fragment at first ":", then first "?", and then first "#"
2. Split undecoded hier-part into authority and path: if hier-part starts with "//", then authority is everything after "//" and before the next "/" or the end of the string, and path is everything that remains (nothing or the next "/" and everything after it)
3. Split undecoded path into path segments at "/"
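
The three steps above can be sketched in Python as follows. The function name `split_url` and the returned dictionary layout are assumptions made for illustration. The sketch cuts off the fragment before looking for "?", so that a "?" inside a fragment is not mistaken for the query delimiter; for URLs where the query precedes the fragment the result is the same as the order listed above.

```python
def split_url(url: str) -> dict:
    """Split an undecoded URL into components; nothing is percent-decoded here."""
    # Step 1: scheme, hier-part, query, fragment.
    scheme, _, rest = url.partition(":")
    rest, _, fragment = rest.partition("#")
    hier_part, _, query = rest.partition("?")
    # Step 2: authority and path.
    if hier_part.startswith("//"):
        authority, slash, tail = hier_part[2:].partition("/")
        path = slash + tail
    else:
        authority, path = "", hier_part
    # Step 3: path segments, dropping the empty one before a leading "/".
    segments = path.split("/")
    if segments and segments[0] == "":
        segments = segments[1:]
    return {
        "scheme": scheme,
        "authority": authority,
        "segments": segments,
        "query": query,
        "fragment": fragment,
    }


print(split_url("http://host/service/OperatingSystems('OS%2F2')?$top=1"))
```

Because the split at "/" happens on the undecoded path, `%2F` stays inside its segment, whereas a literal "/" inside a key value would produce an extra segment.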

RFC3986 allows percent-encoded characters in the unreserved set to be decoded at any time.

RFC3986 does not specify how to split the query part into subcomponents, nor does it define how to split path segments into subcomponents, so OData needs to define how these are split into OData-specific subcomponents, especially whether this happens before or after percent-decoding characters in the gen-delims and sub-delims sets.

As pointed out in the public comment we have two areas that require special care:

1. Splitting queries into name-value dictionaries by first splitting at "&" and then splitting at the first "=" in each part
2. Treatment of the single quote character "'" within string literals

The first is a widely used convention supported by URL parsing tools, and it would be nice to reuse them. These tools also typically percent-decode the parts remaining after the "&"/"=" splits before handing them back.
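
Python's standard library is one example of such a tool: `urllib.parse.parse_qsl` splits at "&", then at the first "=", and percent-decodes the resulting names and values. The wrapper name `query_options` below is an assumption for this sketch.

```python
from urllib.parse import parse_qsl


def query_options(raw_query: str) -> dict:
    """Split an undecoded query into a name-value dictionary, then decode."""
    # keep_blank_values retains options such as "name=" with an empty value.
    return dict(parse_qsl(raw_query, keep_blank_values=True))


print(query_options("$filter=Name%20eq%20'A%26B'&$top=2"))
```

One caveat when reusing such tools: `parse_qsl` follows the HTML form convention and also turns "+" into a space, which goes beyond plain percent-decoding and may or may not be wanted for OData query option values.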

The second is made especially interesting by the fact that Firefox always percent-encodes the single quote as %27.
