-----Original Message-----
From: owner-nm-wg@ggf.org [mailto:owner-nm-wg@ggf.org] On Behalf Of Martin Swany
Sent: Monday, September 26, 2005 3:54 PM
To: nm-wg@ggf.org
Subject: Re: [nm-wg] Specifying units
Hi Loukik,
It has come to our notice that the messages in v2 responses do not
specify units while giving back measurement data (ex: Bandwidth
Utilization and Capacity). Specifying such units is necessary and
messages should be enhanced to support this.
I think that what you're actually saying is that the
PerfSONAR prototype doesn't return units. The NM-WG v2 schema dated
20050802 actually includes the units in almost every
measurement and this has been in the schema for a long time.
We've talked a lot about the way to do it. The current
examples feature units in the datum element, but as you note,
that wastes space. Some of the examples also depict it as
part of the metadata (which is what I've been mostly in favor
of as the least offensive current option.) There are
definitely issues with that -- mainly that the way that the
data is stored is really different than the way in which it
is collected.
So, it can go in parameters, but it might be nice if it were
able to be presented in the data section, but not in every datum.
We've discussed this one as well, and thus far there have
been no good solutions (or none that we found generally workable.)
We could specify it just once, maybe like this:
<perfsonar:data dataUnits="bps">
This really requires the data element to be in a specific
namespace or to become an omnibus for all the things that
might be common in the enclosed datum elements. There could
be other numeric values in the datum that require unit
information and we'd have to add support for each of them to data.
For example, from the current schema:
<iperf:datum interval="2.0-3.0 sec" numBytes="231"
numBytesUnits="MBytes" value="1.94" valueUnits="MBytes/sec"/>
There are multiple numeric values that need unit qualification.
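To make the point concrete, here is a minimal Python sketch that reads the iperf datum quoted above and pairs each numeric attribute with its matching "...Units" attribute. The namespace URI is a placeholder added only so the fragment parses (it is not the real schema URI), and the helper name values_with_units is hypothetical.

```python
import xml.etree.ElementTree as ET

# The xmlns URI below is a placeholder so the fragment is well-formed;
# it is an assumption, not the actual NM-WG/iperf namespace.
DATUM_XML = (
    '<iperf:datum xmlns:iperf="http://example.org/iperf-placeholder" '
    'interval="2.0-3.0 sec" numBytes="231" numBytesUnits="MBytes" '
    'value="1.94" valueUnits="MBytes/sec"/>'
)


def values_with_units(datum):
    """Return {name: (number, unit)} for every attribute X that has an XUnits partner."""
    pairs = {}
    for name, raw in datum.attrib.items():
        if name.endswith("Units"):
            continue  # unit attributes are looked up via their value attribute
        unit = datum.attrib.get(name + "Units")
        if unit is not None:
            pairs[name] = (float(raw), unit)
    return pairs


pairs = values_with_units(ET.fromstring(DATUM_XML))
print(pairs)
```

Two separate values in one datum carry two different units here, which is why a single dataUnits attribute on the enclosing data element would not be enough for this case.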
<perfsonar:data>
<perfsonar:units dataUnits="bps"/>
This one is even more thorny. We referred to this as the
"older sibling" model as it makes siblings dependent on
order, and we decided to avoid that so we could use things
like hashes.
or something else...
Something else is really where we are now. What we
have discussed is a general way to "factor out" common
attributes or elements from a set of datum elements.
CommonTime is an example of this, but a general mechanism
would be nice. I proposed something before where an
enclosed element in a datum could enclose a set of datum
elements indicating that this value was common. It was
greeted with a mixture of animosity and indifference, often
coexisting in the same person's reaction.
Actually, I think that the newly-discussed "bag of parameters"
might be a partial solution to this problem but it still
doesn't help when the things common to a set of datums are
complex and not simple attributes (like a time range.)
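As a rough illustration of "factoring out" for the simple-attribute case only, the sketch below splits a set of datum attribute dictionaries into the attributes shared by all of them and the per-datum remainder. The function name factor_common and the attribute names timeValue, value and valueUnits are assumptions made for the example, not names taken from the schema. It does not address the harder case of complex common content such as a time range.

```python
def factor_common(datums):
    """Split attribute dicts into (common, remainders).

    common holds the key/value pairs present with the same value in every
    datum; remainders holds what is left of each datum, in input order.
    """
    if not datums:
        return {}, []
    common = dict(datums[0])
    for d in datums[1:]:
        common = {k: v for k, v in common.items() if k in d and d[k] == v}
    remainders = [{k: v for k, v in d.items() if k not in common} for d in datums]
    return common, remainders


# Hypothetical utilization samples; only the units are shared.
samples = [
    {"timeValue": "1127764440", "value": "1200", "valueUnits": "bps"},
    {"timeValue": "1127764470", "value": "1350", "valueUnits": "bps"},
    {"timeValue": "1127764500", "value": "1100", "valueUnits": "bps"},
]
common, remainders = factor_common(samples)
print(common)
print(remainders[0])
```

Because the result is keyed by attribute name rather than by position, it does not reintroduce the ordering dependence of the "older sibling" model.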
The second comment is: Choice of units.
After a chat with Jeff on this, I can list out two options.
Jeff suggested that the service use the units that the data is already
in (e.g., in RRDTool, data is in octets per second, hence the service
continues to provide data in octets per second). However, the units in use
should be clearly specified using any of the suitable methods above.
Data in RRDTool is not necessarily in octets per second, BTW.
Interface utilization data is generally fetched from an
octet counter (just to be pedantic.)
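For readers unfamiliar with the distinction, a rate in octets per second is derived from two readings of a counter rather than read directly. The sketch below shows that derivation under stated assumptions: the function name octet_rate is hypothetical, the counter is assumed to be 32 bits wide by default, and at most one wrap is assumed between the two readings.

```python
def octet_rate(prev_count, curr_count, seconds, counter_bits=32):
    """Average octets per second between two readings of a wrapping counter."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Modular subtraction gives the right delta even if the counter wrapped once.
    delta = (curr_count - prev_count) % (2 ** counter_bits)
    return delta / seconds


print(octet_rate(1000, 4000, 30))           # no wrap between readings
print(octet_rate(2 ** 32 - 100, 200, 10))   # counter wrapped between readings
```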
An option that I would like to propose is the use of units that are in
common practice. For example, bandwidth, as far as I know, is more
commonly expressed in bps (and its multiples). A service should hence
*reasonably* strive to use the units that are in common practice.
Either way, specification of units using any of the suitable methods
mentioned previously is absolutely necessary.
I agree that reasonably trying to use units in common
practice is a good goal. Many discussions over many years
have led me to believe that in many cases, common practices
aren't so common.
For instance, bandwidth is in bits per second when talking
about link capacity or sending bogus test data, but is often
in bytes per second when an application is using the data. I think
that trying to mandate a "best practice" is a slippery slope. I vote for
unambiguous specification and easy translation.
Nevertheless, if a service returns capacity and utilization in the
same message, it would be nice to have them both in the same units
(unlike the current case with the PerfSONAR prototype, where capacity is
in bps and utilization is in octets per second).
Question here is: Which option is ideal? Should we provide capacity
and utilization in the same units in our prototype?
We can let everyone else weigh in, but if the units are specified (as
they can easily be) then why not just divide and convert? That seems easier
to me than forcing one or the other into a less-than-natural format.
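A minimal sketch of "divide and convert", assuming each value arrives with an explicit unit label. The labels in the table ("bps", "octets/sec", "bytes/sec") and the function names are hypothetical choices for the example, not identifiers from the schema; the only fact relied on is that one octet is 8 bits.

```python
# Multipliers to bits per second; the unit labels are illustrative assumptions.
TO_BPS = {
    "bps": 1.0,
    "octets/sec": 8.0,
    "bytes/sec": 8.0,
}


def to_bps(value, units):
    """Convert a rate to bits per second, failing loudly on unknown units."""
    try:
        return value * TO_BPS[units]
    except KeyError:
        raise ValueError("unknown rate units: %r" % units)


def utilization_fraction(util, util_units, capacity, capacity_units):
    """Utilization as a fraction of capacity, whatever units each was reported in."""
    return to_bps(util, util_units) / to_bps(capacity, capacity_units)


# Utilization reported in octets per second, capacity in bps.
print(to_bps(125000.0, "octets/sec"))
print(utilization_fraction(12500000.0, "octets/sec", 1000000000.0, "bps"))
```

The conversion only works because both inputs carry their units, which is the case for unambiguous specification over a mandated common unit.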
martin