XML Schema do type references have to be qualified?

Suman,

In an XML Schema/DFDL schema, do I have to qualify the names of types?

We have a bunch of test schemas written roughly like the example below. They all have XML Schema's namespace as the default (unprefixed) namespace. They also all have a target namespace.

But some or all of the references to named types use unqualified names. In my mind, that means they would be assumed to be in the XML Schema namespace, not the targetNamespace.

On the other hand, the XML Schema validator doesn't complain. But that just means the schema is valid, not necessarily meaningful.

Example here:

<schema xmlns="http://www.ogf.org/dfdl/dfdl-1.0/XMLSchemaSubset"
        targetNamespace="http://example.com">
  <element name="foo" type="bar"/><!-- IS THIS LEGAL, no prefix on name of the type. -->
  <complexType name="bar">
    <sequence/>
  </complexType>
</schema>

Mike Beckerle | OGF DFDL WG Co-Chair
Tel: 781-330-0412

So, I did some research and answered my own question.

Section 3.3.4.2 of "Definitive XML Schema" by Walmsley says that the schema in my example is illegal, as the name "bar" will be interpreted as a reference to xsd:bar, not to a type in the targetNamespace. However, it seems many XML Schema processors may be tolerant of this error.

(In reply to my message of Fri, Mar 16, 2012 at 4:48 PM.)
-- Mike Beckerle | OGF DFDL WG Co-Chair Tel: 781-330-0412
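
A minimal sketch of the resolution rule described above, for readers who want to see it mechanically. It assumes the default namespace is the standard XML Schema namespace (the example in the thread uses a subset URI instead), and the helper name resolve_type_refs is hypothetical. It only tracks in-scope namespace declarations with Python's standard library; it is not a schema processor.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: the standard XML Schema namespace is the default namespace.
SCHEMA = """\
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://example.com">
  <element name="foo" type="bar"/>
  <complexType name="bar">
    <sequence/>
  </complexType>
</schema>
"""


def resolve_type_refs(schema_text):
    """Return (name, (namespace, local name)) for every type= attribute."""
    scopes = [{}]   # stack of in-scope prefix -> namespace maps
    pending = {}    # declarations reported just before the next start tag
    resolved = []
    source = io.BytesIO(schema_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start", "end")):
        if event == "start-ns":
            prefix, uri = payload
            pending[prefix or ""] = uri
        elif event == "start":
            scope = {**scopes[-1], **pending}
            pending = {}
            scopes.append(scope)
            qname = payload.get("type")
            if qname is not None:
                prefix, _, local = qname.rpartition(":")
                # An unprefixed QName picks up the default namespace (prefix "").
                resolved.append((payload.get("name"), (scope.get(prefix), local)))
        else:
            scopes.pop()
    return resolved


print(resolve_type_refs(SCHEMA))
# -> [('foo', ('http://www.w3.org/2001/XMLSchema', 'bar'))]
```

The unprefixed "bar" lands in the XML Schema namespace, not in http://example.com, which is the behaviour Walmsley describes.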

Mike - In the absence of explicit qualification, you cannot unambiguously say whether type bar is in your namespace or in no target namespace. It should be flagged as an error in my opinion...

(In reply to Mike Beckerle's message of 03/16/2012 05:04 PM, Subject: Re: [DFDL-WG] XML Schema do type references have to be qualified?)
-- dfdl-wg mailing list dfdl-wg@ogf.org https://www.ogf.org/mailman/listinfo/dfdl-wg

Mike

You said "They all have the default unprefixed namespace as XML Schema's namespace." Technically your schema doesn't; it is using a different namespace:

xmlns="http://www.ogf.org/dfdl/dfdl-1.0/XMLSchemaSubset"

I assume this is the standard 2001 XMLSchema namespace but cut down so as to include just the constructs DFDL uses in its subset?

Your namespace is not formally defined in the DFDL spec, and no such xsd is freely available at that URL, so your schema is not portable and fails to validate. It also means that you can't strip out all the DFDL stuff and leave a pure XML Schema that any schema processor can handle.

Should we make your schema generally available at that URL, so it is resolved by schema processors?

The IBM implementation does not define such a subset; it just uses the standard 2001 XMLSchema namespace "http://www.w3.org/2001/XMLSchema", and then does extra checking to flag constructs and types that are not in the DFDL subset. More work, but with all DFDL stuff removed the result is a pure XML Schema.

If I change your example schema to use the standard 2001 XMLSchema namespace then the IBM schema validator gives the following error...

CTDX1100E : XSD: Type reference 'http://www.w3.org/2001/XMLSchema#bar' is unresolved

...because it is looking in the 2001 XMLSchema namespace xsd for "bar".

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

(In reply to Suman Kalia's message of 19/03/2012 03:25.)
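
The unresolved-reference error quoted above can be reproduced in miniature. The sketch below is an illustration only, not the IBM validator's logic: the function name unresolved_type_refs is hypothetical, built-in types such as string are not modelled, and imports/includes are ignored. It also shows one way to repair the example, by binding a prefix (tns is an arbitrary choice) to the target namespace and using it in the type reference.

```python
import io
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
TYPE_TAGS = {f"{{{XSD_NS}}}complexType", f"{{{XSD_NS}}}simpleType"}

BROKEN = """<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://example.com">
  <element name="foo" type="bar"/>
  <complexType name="bar"><sequence/></complexType>
</schema>"""

# Same schema with the type reference qualified by a prefix for the
# target namespace.
FIXED = """<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:tns="http://example.com"
        targetNamespace="http://example.com">
  <element name="foo" type="tns:bar"/>
  <complexType name="bar"><sequence/></complexType>
</schema>"""


def unresolved_type_refs(schema_text):
    """List type references that match no named type in this document."""
    scopes, pending = [{}], {}
    defined, refs = set(), []
    target_ns = None
    source = io.BytesIO(schema_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start", "end")):
        if event == "start-ns":
            pending[payload[0] or ""] = payload[1]
        elif event == "start":
            scopes.append({**scopes[-1], **pending})
            pending = {}
            if payload.tag == f"{{{XSD_NS}}}schema":
                target_ns = payload.get("targetNamespace")
            if payload.tag in TYPE_TAGS and payload.get("name"):
                # Named types belong to the schema's target namespace.
                defined.add((target_ns, payload.get("name")))
            if payload.get("type") is not None:
                prefix, _, local = payload.get("type").rpartition(":")
                refs.append((scopes[-1].get(prefix), local))
        else:
            scopes.pop()
    return [f"{ns}#{local}" for ns, local in refs if (ns, local) not in defined]


print(unresolved_type_refs(BROKEN))  # ['http://www.w3.org/2001/XMLSchema#bar']
print(unresolved_type_refs(FIXED))   # []
```

An alternative repair, not shown, is to make the target namespace the default namespace and put a prefix on the XML Schema elements instead.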

> You said "They all have the default unprefixed namespace as XML Schema's namespace." Technically your schema doesn't, it is using a different namespace.

Ah, you can ignore that. It should be the standard XML Schema URI. This subset URI is a trick I've been using to get ordinary XSD tooling in the public version of Eclipse to validate the DFDL annotation content, and to enforce the subset. Another trick is to shut off standard XSD schema validation and set up Eclipse so that it validates the XSD as if it were a regular XML file being validated by a schema. Then it does the "right thing" and highlights all your errors. We have, for now, and probably for a while anyway, made Daffodil accept this URI (I guess we should add a warning) so that we can use standard Eclipse tooling and get interactive validation.

> I assume this is the standard 2001 XMLSchema namespace but cut-down so as to include just the constructs DFDL uses in its subset?

Exactly. I cut down the Schema for XML Schema. I changed it to strictly validate the DFDL annotations, and made a few other changes that make the interactive validation work better.

> Your namespace is not formally defined in the DFDL spec, and no such xsd is freely available at that URL, so your schema is not portable and fails to validate. It also means that you can't strip out all the DFDL stuff and leave a pure XML Schema that any schema processor can handle.
> Should we make your schema generally available at that URL, so it is resolved by schema processor?

No, actually, I'm thinking of changing all the internally-used "URIs" to things that cannot be mistaken for internet URLs, i.e., more like "--/ogf.org/dfdl/dfdl-1.0/XMLSchemaSubset". This would prevent the ongoing disaster that most XML processors can't work when disconnected from the Internet. Rather, they work, but are slowed down terribly by timeouts when trying to connect to these resources. Most people consider this "not working". We don't want that for DFDL. We can provide any schemas in the form of documents, or in files that require a login to retrieve or something.

The W3C has a massive set of servers on the internet just to serve up schemas to software that actually probes the URIs for schemas, when the URIs were supposed to be just unique identifiers, not URLs for retrieval. They (W3C) are careful now: for any new schemas, they don't put them on the web at those URLs at all. This whole usage of URIs that are supposed to be just unique IDs, but get interpreted as URLs to actual files, has proven to be a huge disaster for W3C. There have been a few famous outages of these W3C servers, and people complain that it feels like the whole Internet grinds to a halt any time those servers go down or slow down, because so many pieces of software suddenly wait for a timeout on the schema retrieval over the net.

We should consider changing the official URI for DFDL schema to something that cannot be mistaken for a URL. Let's face it, the OGF's poor server is not going to hold up to any volume of retrieval traffic on the DFDL schema.

> The IBM implementation does not define such a subset, it just uses the standard 2001 XMLSchema namespace "http://www.w3.org/2001/XMLSchema", and then does extra checking to flag constructs and types that are not in the DFDL subset. More work, but with all DFDL stuff removed the result is a pure XML Schema.
> If I change your schema below to use the standard 2001 XMLSchema namespace then the IBM schema validator gives the following error...
> CTDX1100E : XSD: Type reference 'http://www.w3.org/2001/XMLSchema#bar' is unresolved
> ...because it is looking in the 2001 XMLSchema namespace xsd for "bar".

Excellent. Except for the fact that I have a bunch of Daffodil project test schemas to fix... but at least we agree on what the behavior should be.

Apache Xerces isn't giving this same error, by the way, at least the way I have it configured. I think that's a bug. I suspect there's an option, which we're not using, to turn on expensive checks like this referential-integrity one.

I will be installing Message Broker myself soon for this kind of cross checking, as I now have a computer where I can run a decent virtual machine image to install it onto.

...mikeb
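
As a small illustration of the point that a namespace name is only an identifier: a namespace-aware parser compares it as a string and does not need to retrieve anything from it. The sketch below uses Python's standard-library ElementTree with the non-URL name suggested above; the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace name taken from the suggestion above; it is not a retrievable URL.
SUBSET_NS = "--/ogf.org/dfdl/dfdl-1.0/XMLSchemaSubset"

doc = f'<schema xmlns="{SUBSET_NS}"><element name="foo"/></schema>'

# ElementTree never dereferences the namespace name; it just records it
# as part of each element's expanded name.
root = ET.fromstring(doc)
child = root[0]

print(root.tag)
print(child.tag)
```

One caveat: a name with no scheme, like the one above, is a relative URI reference, and the Namespaces in XML Recommendation deprecates those as namespace names, so a urn:-style name may be a safer way to get the same effect.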

> We should consider changing the official URI for DFDL schema to something that cannot be mistaken for a URL. Let's face it, the ogf's poor server is not going to hold up to any volume of retrieval traffic on the DFDL schema.

It is an issue if you do not specify a schema location. Note that there are, or will be, product-specific solutions, e.g. catalog support, where the search would first look for the URL in the local catalog and, if it is not found there, then go to the internet.

(In reply to Mike Beckerle's message to Steve Hanson of 03/19/2012 09:09 AM.)
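
The catalog-first lookup mentioned above might look roughly like the following. This is a sketch under stated assumptions, not any product's catalog implementation: locate_schema, CATALOG, and the local path are all hypothetical, and the remote fetch is a caller-supplied function so that the fallback can be disabled entirely for offline use.

```python
from typing import Callable, Dict, Optional


def locate_schema(namespace: str,
                  catalog: Dict[str, str],
                  fetch_remote: Optional[Callable[[str], str]] = None) -> str:
    """Return a schema location for `namespace`, preferring the local catalog."""
    local = catalog.get(namespace)
    if local is not None:
        return local                   # resolved locally, no network call
    if fetch_remote is None:
        raise LookupError(
            f"no catalog entry for {namespace!r} and remote lookup is disabled")
    return fetch_remote(namespace)     # fall back to the network


# Hypothetical catalog: the local path is made up for illustration.
CATALOG = {
    "http://www.w3.org/2001/XMLSchema": "schemas/XMLSchema.xsd",
}

print(locate_schema("http://www.w3.org/2001/XMLSchema", CATALOG))
# -> schemas/XMLSchema.xsd
```

Leaving fetch_remote unset makes a missing catalog entry fail immediately instead of waiting on a network timeout, which addresses the "slowed down terribly" complaint earlier in the thread.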
participants (3)
- Mike Beckerle
- Steve Hanson
- Suman Kalia