
On Fri, Apr 17, 2009 at 11:23 AM, Chris Webb <chris.webb@elastichosts.com> wrote:
Sam Johnston <samj@samj.net> writes:
I'm leaning towards the time/space tradeoff of including the ID in each row somehow (in which case parsing into a hash of hashes is trivial again).
That works.
<snip>
I've found the tinydns-data <http://cr.yp.to/djbdns/tinydns-data.html> format a pleasure to work with as well
Yes, DJB knows how to design a decent data format, and doesn't succumb to the over-engineering fetish predominant as one moves up the software stack. I wouldn't be upset by KEY:VALUE in place of KEY VALUE. That's also easily parseable by read or strsep().
Ok, so taking this a little further to tie off the formats discussion, combining the two ideas (tinydns-data with the ID on every line) gives us:

decca5a5-8952-4004-9793-cdbbf05c3c63:category:server
decca5a5-8952-4004-9793-cdbbf05c3c63:title:Debian GNU/Linux 5.0 Virtual Appliance

Having worked with this format for what... a decade now... I can tell you that it is an absolute dream, and even things that weren't conceived of at the time (e.g. SRV records) are easily supported... all the while avoiding annoying/dangerous parsing problems due to greedy regexps (which are surprisingly common) and the like.

It also allows us to cater for simple structures like arrays later if need be:

decca5a5-8952-4004-9793-cdbbf05c3c63:interfaces:eth0:eth1:eth2

Perhaps more importantly, though, it trivialises both generation and parsing of content by allowing you to do it in any order, since every line carries its own ID. This is particularly important for scalability (allowing for multiple threads querying multiple servers and feeding back into a shared writer).

I think then that the formats discussion is pretty much done, at least for the time being. On with the verbs and nouns...

Sam
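
P.S. For illustration only, here is a rough Python sketch of how the ID:KEY:VALUE lines above might be parsed into a hash of hashes. The function name parse_records is made up for this example, and the sketch assumes values never contain a literal colon (tinydns-data handles that case with octal escapes, which are not implemented here). Treat it as one possible reading of the format, not a settled specification.

```python
from collections import defaultdict


def parse_records(lines):
    """Parse ID:KEY:VALUE[:VALUE...] lines into {id: {key: value}}.

    Hypothetical helper: a single value is stored as a string,
    several values (e.g. interfaces) are stored as a list.
    Assumes no field contains an unescaped colon.
    """
    records = defaultdict(dict)
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split(":")
        if len(fields) < 3:
            raise ValueError(f"malformed line: {line!r}")
        record_id, key, values = fields[0], fields[1], fields[2:]
        records[record_id][key] = values[0] if len(values) == 1 else values
    return dict(records)


UUID = "decca5a5-8952-4004-9793-cdbbf05c3c63"
sample = [
    f"{UUID}:title:Debian GNU/Linux 5.0 Virtual Appliance",
    f"{UUID}:interfaces:eth0:eth1:eth2",
    f"{UUID}:category:server",
]
parsed = parse_records(sample)
```

Because each line names its own record, feeding the same lines in a different order yields the same structure, which is the property that makes a shared writer fed by several threads workable.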