Network Measurement Working Group Meeting, 16th March 2005
==========================================================

  Session 1, 13:30-15:00:
  1. Introduction and Walk Through Schemas, Mark Leese
  2. Demos, Mark Leese
  3. Version 2 of Schemas, Richard Hughes-Jones

  Session 2, 15:30-17:00:
  4. Network Testbed GtrcNET-1 (and its network measurement functions), Yuetsu Kodama
  5. Network monitoring for Research network of Korea, Minki Noh
  6. NM-WG Futures, Richard Hughes-Jones

1. Introduction and Walk Through Schemas
========================================

Mark Leese (Daresbury Laboratory)
m.j.leese@dl.ac.uk

Status
------

 - Hierarchy doc published in June
	- Standard definitions of network metrics of use to grid
 - Focus now on schema work
	- Schemas for request and response documents for querying network 
	  performance details.
	- Version 1 is stabilising and being used by early adopters.
	- Version 2 is more powerful and flexible but not backwards
	  compatible.
 - New website hosted by Internet2

History
-------

 - Started in June 2003 with publication of schema for publishing network 
   performance data.
 - In October produced a unified schema for requesting/querying network 
   data.
 - NM-WG work began after GGF9.
 - Internal draft specification in January 2004.

Motivation
----------

 - Give software access to measurement information.
 - Use web services and XML with common schemas to enable sharing of data.
 - Grid use case:
    - File replication
      - Spread multiple copies of the same file across the grid.
      - A file has a logical file name that maps to 1 or more physical 
        file names.
      - Service decides which PFN to use based on network performance 
        data
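
As a rough illustration of the file replication use case, the selection step
could look like the sketch below. The file names, hosts and throughput figures
are invented for the example, and picking the replica with the highest measured
throughput is an assumption; these notes do not say which metric the service
uses.

```python
from urllib.parse import urlparse

# Hypothetical replica catalogue: one logical file name (LFN) maps to
# one or more physical file names (PFNs). All names are made up.
REPLICAS = {
    "lfn:/grid/data/run42.dat": [
        "gsiftp://se1.example.org/data/run42.dat",
        "gsiftp://se2.example.org/data/run42.dat",
    ],
}

# Hypothetical measurements: achievable throughput (Mbit/s) from the
# client's site to each storage host, as a monitoring service might report.
THROUGHPUT_MBPS = {
    "se1.example.org": 94.0,
    "se2.example.org": 410.0,
}


def best_pfn(lfn):
    """Return the PFN whose host has the highest measured throughput."""
    candidates = REPLICAS[lfn]
    # Hosts with no measurement are treated as 0 Mbit/s (an assumption).
    return max(
        candidates,
        key=lambda pfn: THROUGHPUT_MBPS.get(urlparse(pfn).hostname, 0.0),
    )


print(best_pfn("lfn:/grid/data/run42.dat"))
```

In a real deployment the throughput table would be filled from a response
document rather than hard-coded.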

Requirements
------------

 - What
    - DAMED style names (e.g. path.delay.oneWay)
    - Wildcards not supported
    - Can request statistical data with specified sample interval
      - Ex. Data averages for one-way delay over a month
    - Request multiple characteristics in a single request
    - Limits can be specified for results
 - Where
 - When
    - targetTime specifies period for test
    - relative +/- time tolerances
    - "now" keyword evaluated as late as possible
    - absolute time formats
    - can also give start and end time
    - testing interval to control how often tests are run
 - How
    - Can supply values to act as params for tests, or filters for querying
      past data
    - Users are not tied to the publication schema and may specify the
      return method
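
The what/where/when/how split above can be pictured as a small data structure.
This is only a sketch: the class name, field names and the default tolerance
are assumptions made for illustration and are not taken from the NM-WG
schemas.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class MeasurementRequest:
    # "What": one or more DAMED-style characteristic names.
    characteristics: list
    # "Where": the two end points of the path.
    source: str
    destination: str
    # "When": an absolute datetime, or the keyword "now".
    target_time: object = "now"
    tolerance: timedelta = timedelta(minutes=5)
    # "How": parameters for a new test, or filters over past data.
    parameters: dict = field(default_factory=dict)

    def __post_init__(self):
        # Wildcards are not supported in characteristic names.
        for name in self.characteristics:
            if "*" in name:
                raise ValueError("wildcards not supported: " + name)

    def window(self, clock=None):
        """Resolve target_time into (earliest, latest) acceptable times.

        "now" is only evaluated here, i.e. as late as possible.
        """
        clock = clock or (lambda: datetime.now(timezone.utc))
        target = clock() if self.target_time == "now" else self.target_time
        return target - self.tolerance, target + self.tolerance


request = MeasurementRequest(
    characteristics=["path.delay.oneWay"],
    source="a.example.org",
    destination="b.example.org",
)
earliest, latest = request.window()
```

Deferring the evaluation of "now" to window() means that a request built early
and sent later still refers to the time at which it is actually processed.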

What The Schemas Look Like
--------------------------

 - Started with XML-Schema but changed to RelaxNG for greater clarity
 - RelaxNG schemas can be converted to XML-Schema using tools
 - Version 1c of requirements saw the introduction of business logic
    - ex. if you receive x then do y
 - Schema validation slow but useful
 - Looking at using document/literal method instead of RPC/Encoded 
   because it allows schema validation.

An example request document for daily averages between two points was shown, 
followed by an example response document containing an individual value for 
the same metric.

Wrap up
-------

 - There are early adopters:
    - EGEE JRA4
    - Dante
    - Internet2
    - NCSA Advisor and UK GridMon
 - Schemas are not perfect but have matured over several iterations.
 - Achieved main aim to show how powerful shared schemas can be.
 - Lots of new ideas for schemas were raised at GGF12
 - Both schema versions will be developed in parallel
    - Version 1 must remain stable for early adopters
    - However, version 2 will offer clear advantages including
      a separation of metadata from data.

Questions
---------

Thilo Kielmann: What about the signal to noise ratio of the example 
   response documents? Most of the document consists of XML tags rather 
   than the actual network performance values. Is this the best way to 
   encapsulate results?

ML: This example shows a response document containing a single value. 
    The signal to noise ratio improves when the response contains multiple 
    values. We may also consider ways to compress the results. 
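
The effect described in the answer can be checked with a small experiment. The
sketch below builds a response with a fixed envelope and n values, then reports
what fraction of the serialized document is actual measurement data. The
element names are hypothetical and far simpler than the real schemas.

```python
import xml.etree.ElementTree as ET


def payload_ratio(n_values):
    """Fraction of the serialized response taken up by measurement values."""
    root = ET.Element("response")  # hypothetical element names throughout
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "characteristic").text = "path.delay.oneWay"
    data = ET.SubElement(root, "data")
    # Made-up delay values, each formatted to the same width.
    values = ["%.3f" % (10.0 + i * 0.001) for i in range(n_values)]
    for v in values:
        ET.SubElement(data, "value").text = v
    document = ET.tostring(root, encoding="unicode")
    return sum(len(v) for v in values) / len(document)


for n in (1, 10, 1000):
    print(n, round(payload_ratio(n), 3))
```

The ratio rises with n because the envelope is paid for once, but it levels off
because each value still carries its own tags.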
    




2. Demos
========

EGEE Network Performance Monitoring Prototype
---------------------------------------------

 - Retrieved end-to-end and backbone data
    - End-to-end data retrieved from CNRS's WP7 tools
    - Backbone data from Dante perfmonit tool
 - Architecture:
    - Client interacts with mediator via web services interface
    - Mediator locates network monitoring points and uses aggregator 
      to coordinate request
			
The prototype client was demonstrated with various example queries for 
retrieving end-to-end and backbone data.

 - NPM Prototype issues
    - Discovery:
      - At the moment network end points and routes are static.
      - Could have a registry which mediator accesses for more
        dynamic system.
      - It may be possible (but very difficult) to map ANY hostname/IP addresses
        to an appropriate monitoring point
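
One possible way to map an IP address to a monitoring point is a longest-prefix
match against a table of address ranges. The sketch below uses a static table,
as in the current prototype; the prefixes and monitoring point names are
invented, and the approach itself is an assumption rather than something
proposed at the meeting.

```python
import ipaddress

# Hypothetical static table: address prefix -> monitoring point.
MONITORING_POINTS = {
    "192.0.2.0/24": "mp-site-a.example.org",
    "198.51.100.0/24": "mp-site-b.example.org",
    "198.51.100.128/25": "mp-site-b-lab.example.org",
}


def monitoring_point_for(ip):
    """Return the monitoring point with the most specific matching prefix.

    Returns None when no prefix in the table covers the address.
    """
    addr = ipaddress.ip_address(ip)
    matches = []
    for prefix, point in MONITORING_POINTS.items():
        network = ipaddress.ip_network(prefix)
        if addr in network:
            matches.append((network.prefixlen, point))
    # The longest prefix length is the most specific match.
    return max(matches)[1] if matches else None


print(monitoring_point_for("198.51.100.200"))
```

A registry-backed version would replace the static dictionary with a lookup the
mediator performs at request time.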
       	
Pipes Demo
----------

 - PIPES = Performance Initiative Performance Environment System
 - Interface for historical data and on-demand tests
 - Allows end users to determine end-to-end performance using partial path 
   analysis.
 - Written using Perl and SOAP::Lite or XMLRPC::Lite
 - XML Processing is hard-coded
 - More details: http://abilene.internet2.edu/ami/webservices.html





3. Version 2 of Schemas
=======================

Richard Hughes-Jones, University of Manchester

Introduction
------------

 - Introduction to hierarchy document and terminology
 - Focus is on standardizing schemas irrespective of who paid for recording 
   measurements.
 - Started with requesting historical data then expanded for requesting 
   on-demand
 - Version 1
    - monolithic
    - straight mapping of characteristics
 - Version 2
    - base schema contains common components
    - separate sub-schema for each characteristic and/or tool
    - separation of meta-data and network information
    - framework fully extensible

Requests
--------

 - A request contains metadata describing characteristic, subject and 
   parameters.
 - namespaces used for uniquely identifying characteristics

Responses
---------

 - Response consists of two sections
    - metadata
      - conditions and parameters used by tool
      - params from request may be modified
    - data
      - content depends on characteristic
      - everything in xml (base64 could be used for binary, but probably 
        not necessary)
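
To make the metadata/data split concrete, the sketch below parses a made-up
response. The namespace URIs, element names and attributes are all
hypothetical and stand in for a base schema plus one tool-specific sub-schema;
they are not the actual version 2 schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; the real ones are not given in these notes.
BASE = "http://example.org/nmwg/base"
PING = "http://example.org/nmwg/tools/ping"

RESPONSE = f"""
<base:message xmlns:base="{BASE}" xmlns:ping="{PING}">
  <base:metadata id="m1">
    <ping:subject src="a.example.org" dst="b.example.org"/>
    <ping:parameters count="2"/>
  </base:metadata>
  <base:data metadataIdRef="m1">
    <ping:datum seq="1" rtt_ms="12.1"/>
    <ping:datum seq="2" rtt_ms="11.8"/>
  </base:data>
</base:message>
""".strip()


def read_response(xml_text):
    """Return (parameters actually used by the tool, list of RTT values)."""
    ns = {"base": BASE, "ping": PING}
    root = ET.fromstring(xml_text)
    # Metadata section: conditions and parameters as reported by the tool,
    # which may differ from those in the original request.
    meta = root.find("base:metadata", ns)
    params = dict(meta.find("ping:parameters", ns).attrib)
    # Data section: content depends on the characteristic or tool.
    data = root.find("base:data", ns)
    rtts = [float(d.get("rtt_ms")) for d in data.findall("ping:datum", ns)]
    return params, rtts


params, rtts = read_response(RESPONSE)
print(params, rtts)
```

Because the tool-specific elements live in their own namespace, a client that
only understands the base schema could still locate the metadata and data
sections.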

Examples
--------

 - 8 examples were shown illustrating various requests, responses and 
   characteristic schemas.

Current Status and Future
-------------------------

 - Currently working on a developer's guide
 - Also working on a toolkit that will give reference implementations
    - Perl
    - Python
    - Java
    - .NET (need assistance here)
 - WS-RF integration needs discussion and work
    - Advertising capability (feature negotiation)
 - Version 2 base schema has been around for a while
 - Sub-schema for common characteristics and tools defined:
    - iperf, ping, traceroute
 - Extensibility allows support for almost anything to be added!
 - Need implementer feedback
 - In order to produce a draft recommendation, two inter-operable 
   implementations are required.




4. Network Testbed GtrcNET-1 (and its network measurement functions)
====================================================================

Yuetsu Kodama (AIST)

ML: Your product looks very powerful and useful. I can imagine people wanting
to use these boxes to make a distributed monitoring infrastructure in a
network. Do you think this is possible? For example, how much does each box
cost, and how many do you have deployed at the moment?
YK: Each box costs about $20k. There are currently three boxes deployed, in
the US, Amsterdam and Japan, so yes, they could be used as distributed 
monitoring boxes.


Tomohiro Kudoh: Our box generates lots of data. Could the schemas be used for
all that data?
RHJ: Yes, but probably not by directly returning the results as XML. The
response message could be the address of a file to download by conventional
means.
ML: You could also use your own more efficient method for sharing data within
your network domain. The only CRUCIAL part is to share it with others using an
agreed format at the network edges. If you did that with large data sets, you
could reduce the overhead that the schemas introduce, which is the price you
pay for using Web Services.