
Hi

Here is an actual use case for how aggregation can be done.

In SGAS we aggregate usage records into the following format:

execution_date | date
insert_date    | date
machine_name   | string
user_identity  | string
vo_issuer      | string
vo_name        | string
vo_group       | string
vo_role        | string
n_jobs         | integer
cputime        | numeric
walltime       | numeric
generate_time  | timestamp

The combination of the following fields is considered unique:
execution_date, insert_date, machine_name, user_identity, vo_issuer,
vo_name, vo_group, vo_role. Some of these can be null/non-existing
(vo_group and vo_role).

The reason for separating insert date and execution date is that some
records arrive late when the registration process fails for some reason.
Most admins care more about registrations, whereas usage statistics are
usually based on execution time.

The next three fields are the aggregates: number of jobs, summed cputime,
and summed walltime. The final field is when the aggregated record was
generated.

This format aggregates quite well: 3.8M records aggregated into 18K records
in the NDGF accounting database.

I am in no way implying that all aggregations should look like that, or
that all queries can be answered by such an aggregation. But for us, it can
answer most of the common queries, and a time resolution of one day is
usually fine enough.

Currently we don't use AUR in SGAS (we do use UR for all job records), but
we might in the future (currently the methods for querying data are
somewhat limited).

Best regards, Henrik

Software Developer, Henrik Thostrup Jensen <htj at ndgf.org>
Nordic Data Grid Facility. WWW: www.ndgf.org
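
For illustration only, the aggregation described in this message could be sketched roughly as follows. This is not the SGAS implementation. It assumes each job record is a plain Python dict carrying the field names listed above; the function name `aggregate` and the sample values (host, identity, VO name) are hypothetical.

```python
from collections import defaultdict
from datetime import date, datetime, timezone

# Fields whose combination identifies one aggregated row (taken from the message).
KEY_FIELDS = (
    "execution_date", "insert_date", "machine_name", "user_identity",
    "vo_issuer", "vo_name", "vo_group", "vo_role",
)


def aggregate(records):
    """Group per-job records by KEY_FIELDS; count jobs and sum cputime/walltime."""
    totals = defaultdict(lambda: {"n_jobs": 0, "cputime": 0, "walltime": 0})
    for rec in records:
        # rec.get() yields None for an absent vo_group / vo_role, so missing
        # values take part in the key like any other value.
        key = tuple(rec.get(field) for field in KEY_FIELDS)
        row = totals[key]
        row["n_jobs"] += 1
        row["cputime"] += rec["cputime"]
        row["walltime"] += rec["walltime"]
    generated = datetime.now(timezone.utc)  # when the aggregate rows were built
    return [
        {**dict(zip(KEY_FIELDS, key)), **row, "generate_time": generated}
        for key, row in totals.items()
    ]


# Hypothetical sample data: three jobs run on the same day, one registered late.
base = {
    "execution_date": date(2010, 1, 4),
    "insert_date": date(2010, 1, 4),
    "machine_name": "ce.example.org",
    "user_identity": "/O=Example/CN=Alice",
    "vo_issuer": "/O=Example/CN=voms.example.org",
    "vo_name": "example-vo",
}
jobs = [
    {**base, "cputime": 100, "walltime": 120},
    {**base, "cputime": 50, "walltime": 60},
    {**base, "insert_date": date(2010, 1, 6), "cputime": 10, "walltime": 15},
]
rows = aggregate(jobs)
```

The late-registered job ends up in its own row because insert_date is part of the key, which is what keeps registration-based and execution-based statistics separable.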