[EMERGENCY-8] Add hierarchical data to CAP through XML extensions in schema - OASIS Technical Committees Issue Tracker

Details

Type: Improvement
Status: New
Priority: Major
Resolution: Unresolved
Component/s: EDXL-CAP
Labels:
- CAP

Description

Steve Hakusa: To add hierarchical data to CAP, allow XML extensions in the CAP schema through the following change to the .xsd: <any minOccurs="0" maxOccurs="unbounded" namespace="##other" processContents="lax" />
a. Allow such XML extensions at the <info> level, and optionally at the <alert> level. Having extensions at the <info> level would be required in case the hierarchical data is language-specific. Extensions at the <alert> level could cut down on redundancy in the case the data is related to all <info> blocks.
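To make the proposal concrete, here is a minimal sketch of what a consumer might see if the change were adopted. The `ex:` namespace, the `<ex:arrivals>` block, and its attributes are invented for illustration, and the alert fragment is deliberately incomplete, so it is not a schema-valid CAP message. The sketch only shows that extension elements in another namespace can be separated from core CAP elements by namespace, so a consumer that does not understand the extension can still read the core fields.

```python
import xml.etree.ElementTree as ET

CAP_NS = "urn:oasis:names:tc:emergency:cap:1.2"

# Hypothetical fragment: the ex: namespace and <ex:arrivals> are assumptions,
# not part of CAP. Required CAP elements are omitted for brevity.
ALERT = """<alert xmlns="urn:oasis:names:tc:emergency:cap:1.2"
       xmlns:ex="http://example.org/tsunami">
  <identifier>EXAMPLE-1</identifier>
  <info>
    <language>en-US</language>
    <event>Tsunami</event>
    <ex:arrivals>
      <ex:city name="Hilo" waveHeightM="1.5"/>
      <ex:city name="Kahului" waveHeightM="2.0"/>
    </ex:arrivals>
  </info>
</alert>"""


def split_info(info):
    """Separate core CAP children from extension children by namespace."""
    core, ext = [], []
    prefix = "{" + CAP_NS + "}"
    for child in info:
        (core if child.tag.startswith(prefix) else ext).append(child)
    return core, ext


root = ET.fromstring(ALERT)
info = root.find("{%s}info" % CAP_NS)
core, ext = split_info(info)

# Local names of the core CAP elements found in <info>
core_names = [c.tag.split("}")[1] for c in core]
# City names pulled from the hypothetical extension block
cities = [city.get("name") for block in ext for city in block]
```

A consumer that ignores `ext` entirely sees the same `<info>` content it would see today, which is the behaviour `processContents="lax"` is meant to permit.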

Jacob Westfall: Add extensibility with <xs:any namespace="##other" minOccurs="0"/>

Attachments

Activity

Art Botterell (Inactive) added a comment - 03/Jun/14 10:23 PM

Um... why? Not clear what this achieves that isn't already possible using the <resource> element. An example use case would be helpful.

We ought to be careful about allowing folks to add stuff that isn't open under the CAP spec. That way lie proprietary extensions and probably other IP shenanigans.

Functionally, I think, any implementer should be able to make use of a CAP message armed only with the CAP spec. She or he may choose to enforce a profile, but shouldn't be presented with CAP messages that aren't fully usable to the extent of their individual content without applying some third-party schema.

Steve Hakusa (Inactive) added a comment - 05/Jun/14 1:28 PM

I argued for this in a presentation at the 2013 CAP conference in Geneva:
http://www.wmo.int/pages/prog/amp/pwsp/documents/CAP-IW-2013-p03-10-US-Google.pdf

The summary of the argument is that, based on our interactions with a dozen or so alert publishers globally in the last couple of years, the growth in usage of CAP is limited by the difficulty of representing data in a more structured way in the existing CAP standard. Given that XML is built to be extensible, CAP should allow for that extensibility.

Art Botterell (Inactive) added a comment - 05/Jun/14 5:17 PM

Steve, your slides seem to have overlooked the Resource block, which provides precisely this sort of extension point.

Steve Hakusa (Inactive) added a comment - 06/Jun/14 12:48 PM

My apologies, that version has no speaker notes, so it's more difficult to see resources addressed on slide 28.
The one uploaded to OASIS has speaker notes:
https://www.oasis-open.org/apps/org/workgroup/emergency/download.php/48938/Hierarchical%20structured%20data%20in%20CAP%20-%20Google.org%20presentation%20to%20the%202013%20CAP%20Workshop%20-%20with%20speaker%20notes.pdf
and here's the same file as a public Google doc: https://drive.google.com/file/d/0B6KjI0JA9k3hM2cwd2EtcEd6N1k/edit?usp=sharing

Here are the notes from slide 28:

At the [2013] WMO CAP Workshop, Eliot Christian mentioned an even shorter-term alternative for supporting XML extensions; base-64 encode them and add them to a <reference> via the <derefUri>.
Base-64 encoding is the obvious undesirable property of this solution. It does not seem like something that we would want to recommend to other countries as the "way to implement CAP with your data".
It is reasonable, if it was more palatable, to change CAP to allow non-encoded XML in a <resource>, instead of allowing XML extensions at the <info> level. We would prefer just allowing extensions at the <info> level, because it makes the CAP somewhat easier to read and work with.
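As a rough illustration of the base-64 workaround described in these notes, the sketch below wraps an extension payload in a `<derefUri>` and unwraps it again. It assumes the CAP 1.2 layout, in which `<derefUri>` is a child of `<resource>`; the payload, its namespace, and the description text are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

CAP_NS = "urn:oasis:names:tc:emergency:cap:1.2"


def cap(name):
    """Return a CAP 1.2 namespace-qualified tag name."""
    return "{%s}%s" % (CAP_NS, name)


def make_resource(extension_xml):
    """Build a <resource> carrying the extension XML base-64 encoded in <derefUri>."""
    resource = ET.Element(cap("resource"))
    ET.SubElement(resource, cap("resourceDesc")).text = "Structured arrival data"
    ET.SubElement(resource, cap("mimeType")).text = "application/xml"
    encoded = base64.b64encode(extension_xml.encode("utf-8")).decode("ascii")
    ET.SubElement(resource, cap("derefUri")).text = encoded
    return resource


def read_extension(resource):
    """Decode the <derefUri> payload and parse it back into an element."""
    encoded = resource.findtext(cap("derefUri"))
    return ET.fromstring(base64.b64decode(encoded))


# Hypothetical extension payload; the namespace is an assumption.
payload = '<arrivals xmlns="http://example.org/tsunami"><city name="Hilo"/></arrivals>'
resource = make_resource(payload)
decoded = read_extension(resource)
```

The round trip works, but the payload is opaque to anyone reading the alert directly, which is the drawback the notes point out.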

The speaker notes do not answer an obvious follow-up, "Why not just put the extension data in a separate file and refer to it by URI using <resource><uri>?"

One answer is practical: it's slower, more error-prone, and just more complicated to fetch additional content by URL. I'd argue that most would instead choose to stuff data into one or more parameters (e.g., slide 13), which I don't think anyone agrees is good practice.

I'd also argue that when the content is so relevant to encouraging users to take action on an alert (e.g., "for a set of affected cities, how high is the tsunami going to be, what time will it arrive", or "the set of personal details to help someone recognize the missing child"), it feels uncomfortable to stick the information in a "resource". You may counter by saying, "if the content is so important, it should be in the description", and you're right: a plain-text summary should indeed be in the description field. But plain text just isn't the most useful or actionable way to present all information, and it seems wrong for the standard to hamstring systems (and I'm not just talking about Google here) that can present the information in a way that's easier to understand.

Art Botterell (Inactive) added a comment - 06/Jun/14 2:40 PM - edited

The TC discussed how best to represent this sort of supplemental information at great length... and with considerable vigor... back in the 1.0 deliberations. There was strong support for allowing inclusion of "rich media" of various sorts, but also strong reluctance to allow unbounded increases in alert message size that might negatively impact (or at least hamper adoption in the case of) transmission technologies of limited bandwidth.

The compromise we reached was that <resource> values should be "dereferenced"... that is, encoded and transmitted as part of the alert itself... only over networks like high-capacity data broadcasts that a) had the bandwidth to handle them, and b) transmitted only one-way and thus couldn't offer the client the option of retrieving the extended material separately if and when desired. (Indeed, as I recall the term "dereferenced URI" was suggested by Eliot.)

In all other cases the <derefUri> was (and is) deprecated and inclusion-by-reference preferred: a URL and some content-type info is offered so individual clients can retrieve it if necessary. E.g., there'd be no requirement for an audio interface to retrieve an image, so why impose that load on every receiving device? This arrangement also permits <resource> to point to ongoing streams as well as discrete files.
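The inclusion-by-reference arrangement described above can be sketched as a client that inspects each `<resource>`'s `<mimeType>` and collects only the URIs it can use. This is an illustrative sketch, not a reference implementation: the sample `<info>` fragment, the example.org URLs, and the `uris_to_fetch` helper are all hypothetical, and no network request is made.

```python
import xml.etree.ElementTree as ET

NS = {"cap": "urn:oasis:names:tc:emergency:cap:1.2"}

# Hypothetical <info> fragment with two by-reference resources.
INFO = """<info xmlns="urn:oasis:names:tc:emergency:cap:1.2">
  <resource>
    <resourceDesc>Map of affected area</resourceDesc>
    <mimeType>image/png</mimeType>
    <uri>http://example.org/alerts/1/map.png</uri>
  </resource>
  <resource>
    <resourceDesc>Spoken warning</resourceDesc>
    <mimeType>audio/mpeg</mimeType>
    <uri>http://example.org/alerts/1/warning.mp3</uri>
  </resource>
</info>"""


def uris_to_fetch(info, wanted_prefixes):
    """Return URIs of resources whose MIME type starts with a wanted prefix."""
    uris = []
    for res in info.findall("cap:resource", NS):
        mime = res.findtext("cap:mimeType", default="", namespaces=NS)
        uri = res.findtext("cap:uri", namespaces=NS)
        if uri and mime.startswith(tuple(wanted_prefixes)):
            uris.append(uri)
    return uris


info = ET.fromstring(INFO)
# An audio-only device skips the image and keeps only the spoken warning.
audio_only = uris_to_fetch(info, ["audio/"])
```

The choice of what to retrieve stays with the receiving device, so the alert itself stays small regardless of how much supplemental material is attached.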

That implies a "gateway" conversion at the boundaries between one-way and two-way networks, which is addressed in Note 4 of the derefUri definition in the spec. I'm not sure that's ever actually been implemented, but it was thought through at the time.

The <resource> scheme has worked out well when it's been used, as in IPAWS, and it seems like the use-cases you offer in your deck could all be handled within the existing framework. The real problem seems to be that many folks don't appreciate how flexible <resource> actually is.

Also... two quick observations on the argument that "XML is extensible." First, that's true, and CAP doesn't need to try to duplicate it; like any document type, CAP has a particular purpose, and creeping too far from that risks bloat and user confusion. And second, CAP isn't based on XML; it's based on social science about the content of effective warning messages. CAP is a data structure that can be serialized using XML, ASN.1 and potentially other encodings. XML is neither a cause nor a necessary effect; it's just the serialization that happened to be most in vogue at the time CAP was devised.

People

Assignee:

Unassigned

Reporter:

Tony Mancuso (Inactive)

Watchers:

3

Dates

Created:

05/May/14 5:03 PM

Updated:

06/Jun/14 2:45 PM