Fwd (andre@merzky.net): Re: Fwd (andre@merzky.net): Re: Fwd (andre@merzky.net): Re: [saga-rg] context problem
[damned, majordomo seems really broken - forward to the list again]
----- Forwarded message from Andre Merzky -----

Date: Sun, 16 Jul 2006 19:38:54 +0200
From: Andre Merzky
To: Thilo Kielmann
Cc: Andre Merzky
Subject: Re: Fwd (andre@merzky.net): Re: Fwd (andre@merzky.net): Re: [saga-rg] context problem

Quoting [Thilo Kielmann] (Jul 16 2006):
Merging 2 mails from Andre:
very good points, and indeed (1) seems cleanest. However, it has its own semantic pitfalls:
  saga::file f (url);
  saga::task t = f.write <saga::task::Task> ("hello world", ...);
f.seek (100, saga::file::SeekSet);
t.run (); t.wait ();
If on task creation the file object gets copied over, the subsequent seek (sync) and write (async) work on different object copies. In particular, these copies will have different state - seek on one copy will have no effect on where the write will occur.
I cannot see a problem here: With object copying, you will simply have the same file open twice. And given the operations you do, this might even be the right thing... This example is very academic: can you show an example where the sharing of state between tasks is useful, actually?
The problem here is that I as a user would expect the write to happen at byte 100, but it will happen at byte 0: the seek happens on a different object than the write.
What might be a more obvious example, which goes wrong along the same lines:
  f.write ("line 1\n");
  f.write ("line 2\n");
  f.write ("line 3\n");
That will result in a file
  line 1
  line 2
  line 3
whereas the code
  saga::task t1 = f.write ("line 1\n");
  t1.run (); t1.wait ();

  saga::task t2 = f.write ("line 2\n");
  t2.run (); t2.wait ();

  saga::task t3 = f.write ("line 3\n");
  t3.run (); t3.wait ();
will result in a file
  line 3
the last write will start at byte 0, as the previous write operated on a different file pointer. In general, you cannot execute any two tasks on a single object - at least not if any state is of concern, such as file pointer, pwd, replica name, stream server port, job id, ...
That is a no-go in my opinion, as it is counter-intuitive, breaks a large number of use cases, and is inconsistent with the synchronous method calls.
Yes, you can wreak havoc with state as well:
  saga::task t1 = f.write ("line 1\n");
  saga::task t2 = f.write ("line 2\n");

  t1.run ();
  t2.run ();

  t1.wait ();
  t2.wait ();
will likely result in
  linline 2
  e 1
or such - the user does need to think when doing multiple async ops at once. I don't see a way around that (and don't see a need for it either: we want to make the Grid stuff easy, but not revolutionize programming styles).
I should have added that I'd prefer 3:
3. when creating a task, all parameter objects are passed "by reference"
   + no enforced copying overhead
   - all objects are shared, lots of potential error conditions
The error conditions I could think of are:
- change the state of an object while a task is running, hence having the task do something different than intended
Change of state,
That is intentional - see above.
like destruction of objects
Well, that is what we discuss :-) 3 would delay destruction until it is safe (the state is not needed anymore).
or change of objects.
What do you mean here?
Not to speak of synchronization conditions: suppose you have non-atomic write operations (which is everything that writes more than a single word to memory): do you then also enforce object locking? If not, one task can see inconsistent object state just because another task is halfway through writing the object... (all the classical problems of shared-memory communication apply)
See above. You are right, but I don't see a way around that, without causing more harm than good (child and bathtub come to my mind for some reason...).
BTW: the bulk optimization we have now assumes that tasks which run at the same time are, by their very definition, independent of each other, do not depend on any specific order of execution, and do not depend on each other with respect to object state. Those are the very points we are talking about here - I think it is a very sensible assumption. I see the same behaviour in a unix shell, BTW:
  touch file
  date >> file &
  date >> file &
I would not be able to make assumptions about the file contents... (well, here I could make a safe bet, but you know what I mean).
- limited control over resource deallocation
this is the same thing as above
The problem really is that there is no "object lifecycle" defined. There is no way to define which task or thread might be responsible for, or even allowed to, destroy or change objects. Is there?
Yes, that is what I mean by limited control.
We had a discussion on this list and in Tokyo about the semantics of cancel(), which touches the same problem: should task.cancel() block until resources are freed? As we might talk about remote resources, and Grids are unreliable, we might block forever. That does not make sense, at least not always.
The resolution we came up with is that cancel() is advisory, so non-blocking, but can also use a timeout parameter (with -1 meaning forever) to block until resources are freed.
Timeouts do not make sense on destructors I believe, but 'advisory destruction' does, IMHO.
The advantages I see:
- no copy overhead (but, as you say, that is of no concern really)
ok, but minor point.
right. Let's forget that from now on.
- simple, clearly defined semantics
no, it is the most dangerous of the three versions
Well, see above - I think it's the most sensible semantics :-)
- tasks keep objects they operate on alive
- objects keep sessions they live in alive
- sessions keep contexts they use alive
what is the meaning of "alive" here??? Now that you have ruled out memory management...
see above: resources get freed if not needed anymore.
- sync and async operations operate on the same object instance.
Let's forget about "sync" here: it is the task that is running in the current thread, so multiple tasks share object instances.
Well, it would be nice to have same semantics for sync and async, don't you think? :-)
Either way (1, 2 or 3), we have to have the user of the API thinking while using it - none of them is idiot-proof.
Well, we should strive to limit the mental load on the programmer as much as possible...
I think (2) is most problematic, if I understand your 'hand-over' correctly: that would mean you can't use the object again until the task has finished?
No, it means you will never ever again be allowed to use these objects. (hand over includes the hand over of the responsibility to clean up...)
Right. So you can never do an async read, then a sync seek, and then an async read again. At least not with sensible results.
Also, I need to create 100 file instances to do 100 reads? Remember that opening a file is a remote op in itself, potentially. Then we don't need the task model anymore.
That is broken IMHO.
Cheers, Andre.
Thilo
-- "So much time, so little to do..." -- Garfield
----- End forwarded message -----

-- 
"So much time, so little to do..." -- Garfield
Hi all,

OK, long discussion about the context. I just went quickly through it; here are my comments.

1) When using pure OO languages, the lifetime of the context depends on how many references to the object exist, regardless of where it was created. So you never lose the object as long as a pointer to it exists somewhere.

2) In general, no copies are made in OOP; just the pointer is passed around, so negligible time is spent on it - unless the method clearly stipulates that the object will be cloned (full copy).

3) In OO languages the garbage collector handles the effective destruction, so no object is flushed when it goes out of scope (unless no more pointers to it exist).

4) When you have a task running and the objects it shares are changed (state, attributes, etc.), that is no problem. The task should handle it properly: either it fails (error, exception, etc.) or it continues with the new values (if possible). This is a typical concurrent programming issue. If you use a language like Java, you can also put monitors/mutexes around critical sections, so the context cannot be changed concurrently.

Ultimately this is a language binding problem. The main spec can reference the problem for some particular cases, or state that in concurrent mode the default behavior is "...". Also, it is up to the programmer to know what he does when doing concurrent programming. It is not up to SAGA to solve all these issues.

I did a lot of concurrent programming in the past, and the basic rules are:

- All states, variables, and data that must be used get a local copy. I do not copy objects, only primitives; an object just gets a reference copy.
- Elements that are critical are put within a monitor, but it must be a minimal monitor, or have a signal mechanism, to avoid long locks and possible deadlocks.
- If an internal object state changes and the current thread cannot handle it anymore, an exception is raised and the thread ends.
In the case explained below with the written lines, this can differ widely between implementations. If you use monitors in the code, the example

  saga::task t1 = f.write ("line 1\n");
  t1.run (); t1.wait ();

  saga::task t2 = f.write ("line 2\n");
  t2.run (); t2.wait ();

  saga::task t3 = f.write ("line 3\n");
  t3.run (); t3.wait ();

will result in a file like, for example,

  line 3
  line 1
  line 2

or any combination of these 3 lines. However, each line will be written fully before the monitor is released. This will cost execution time, but at least the critical section will be coherent. If not, then it is a free-for-all fight. SAGA just must state whether run() must ensure critical execution, or whether it goes for a free-for-all.
-- 
Best regards,
Pascal Kleijer

----------------------------------------------------------------
HPC Marketing Promotion Division, NEC Corporation
1-10, Nisshin-cho, Fuchu, Tokyo, 183-8501, Japan.
Tel: +81-(0)42/333.6389  Fax: +81-(0)42/333.6382
Hi Pascal,

Quoting [Pascal Kleijer] (Jul 18 2006):
Hi all,
OK long discussion about the context. Well I just went quickly trough it and here are my comments.
1) When using pure OO languages, the lifetime of the context depends on how many references to the object exist, regardless of where it was created. So you never lose the object as long as a pointer to it exists somewhere.
2) In general, no copies are made in OOP; just the pointer is passed around, so negligible time is spent on it - unless the method clearly stipulates that the object will be cloned (full copy).
3) In OO languages the garbage collector handles the effective destruction, so no object is flushed when it goes out of scope (unless no more pointers to it exist).
Well, that is certainly true for Java, but not, for example, for C++. Also, although the spec is OO, we need to define that lifetime explicitly, to allow for semantically identical mappings in non-OO languages. However, I would agree that the behaviour you describe would be the one to wish for.
4) When you have a task running and the objects it shares are changed (state, attributes, etc.), that is no problem. The task should handle it properly: either it fails (error, exception, etc.) or it continues with the new values (if possible). This is a typical concurrent programming issue. If you use a language like Java, you can also put monitors/mutexes around critical sections, so the context cannot be changed concurrently.
Ultimately this is a language binding problem. The main spec can reference the problem for some particular cases, or state that in concurrent mode the default behavior is "...". Also, it is up to the programmer to know what he does when doing concurrent programming. It is not up to SAGA to solve all these issues.
I strongly agree with that: SAGA should not strive to solve the concurrent programming problems, but should allow existing practices to be adopted.
I did a lot of concurrent programming in the past, and the basic rules are:
- All states, variables, and data that must be used get a local copy. I do not copy objects, only primitives; an object just gets a reference copy.
- Elements that are critical are put within a monitor, but it must be a minimal monitor, or have a signal mechanism, to avoid long locks and possible deadlocks.
- If an internal object state changes and the current thread cannot handle it anymore, an exception is raised and the thread ends.
In the case explained below with the written lines, this can differ widely between implementations. If you use monitors in the code, the example

  saga::task t1 = f.write ("line 1\n");
  t1.run (); t1.wait ();

  saga::task t2 = f.write ("line 2\n");
  t2.run (); t2.wait ();

  saga::task t3 = f.write ("line 3\n");
  t3.run (); t3.wait ();

will result in a file like, for example,

  line 3
  line 1
  line 2

or any combination of these 3 lines. However, each line will be written fully before the monitor is released. This will cost execution time, but at least the critical section will be coherent. If not, then it is a free-for-all fight. SAGA just must state whether run() must ensure critical execution, or whether it goes for a free-for-all.
The above example should actually, IMHO, result in an ordered file, as the wait() calls synchronize the tasks on the application level. Only if the waits are omitted could the tasks be executed in any order, resulting in a mixed file.

Am I missing something?

Cheers, Andre.
Hi Andre, in-lined comments...
OK, long discussion about the context. I just went quickly through it; here are my comments.
1) When using pure OO languages, the lifetime of the context depends on how many references to the object exist, regardless of where it was created. So you never lose the object as long as a pointer to it exists somewhere.
2) In general, no copies are made in OOP; just the pointer is passed around, so negligible time is spent on it - unless the method clearly stipulates that the object will be cloned (full copy).
3) In OO languages the garbage collector handles the effective destruction, so no object is flushed when it goes out of scope (unless no more pointers to it exist).
Well, that is certainly true for Java, but not, for example, for C++. Also, although the spec is OO, we need to define that lifetime explicitly, to allow for semantically identical mappings in non-OO languages.
However, I would agree that the behaviour you describe would be the one to wish for.
True, non-OO languages are trickier to handle. But can the spec not state that this problem should be handled according to the binding used? If you use a high-level language, you will have an easy task; when going down to more primitive languages like Fortran or C, the binding will carry a big burden. In the case of non-OO languages it might be preferable to pass by copy rather than by reference; this will have an impact on the runtime, but will save you a lot of trouble.
4) When you have a task running and the objects it shares are changed (state, attributes, etc.), that is no problem. The task should handle it properly: either it fails (error, exception, etc.) or it continues with the new values (if possible). This is a typical concurrent programming issue. If you use a language like Java, you can also put monitors/mutexes around critical sections, so the context cannot be changed concurrently.
Ultimately this is a language binding problem. The main spec can reference the problem for some particular cases, or state that in concurrent mode the default behavior is "...". Also, it is up to the programmer to know what he does when doing concurrent programming. It is not up to SAGA to solve all these issues.
I strongly agree with that: SAGA should not strive to solve the concurrent programming problems, but should allow existing practices to be adopted.
I did a lot of concurrent programming in the past, and the basic rules are:
- All states, variables, and data that must be used get a local copy. I do not copy objects, only primitives; an object just gets a reference copy.
- Elements that are critical are put within a monitor, but it must be a minimal monitor, or have a signal mechanism, to avoid long locks and possible deadlocks.
- If an internal object state changes and the current thread cannot handle it anymore, an exception is raised and the thread ends.
In the case explained below with the written lines, this can differ widely between implementations. If you use monitors in the code, the example

  saga::task t1 = f.write ("line 1\n");
  t1.run (); t1.wait ();

  saga::task t2 = f.write ("line 2\n");
  t2.run (); t2.wait ();

  saga::task t3 = f.write ("line 3\n");
  t3.run (); t3.wait ();

will result in a file like, for example,

  line 3
  line 1
  line 2

or any combination of these 3 lines. However, each line will be written fully before the monitor is released. This will cost execution time, but at least the critical section will be coherent. If not, then it is a free-for-all fight. SAGA just must state whether run() must ensure critical execution, or whether it goes for a free-for-all.
The above example should actually, IMHO, result in an ordered file, as the wait() calls synchronize the tasks on the application level. Only if the waits are omitted could the tasks be executed in any order, resulting in a mixed file.
Am I missing something?
No, you didn't. I did a blunt cut & paste and forgot to remove the "wait". :(
Date: Sun, 16 Jul 2006 19:38:54 +0200 From: Andre Merzky
To: Thilo Kielmann Cc: Andre Merzky Subject: Re: Fwd (andre@merzky.net): Re: Fwd (andre@merzky.net): Re: [saga-rg] context problem Quoting [Thilo Kielmann] (Jul 16 2006):
Merging 2 mails from Andre:
very good points, and indeed (1) seems cleanest. However, it has its own semantic pitfalls:
saga::file f (url); saga::task t = f.write saga::task::Task ("hello world", ...);
f.seek (100, saga::file::SeekSet);
t.run (); t.wait ();
If on task creation the file object gets copied over, the subsequent seek (sync) and write (async) work on different object copies. In particular, these copies will have different state - seek on one copy will have no effect on where the write will occur. I cannot see a problem here: With object copying, you will simply have the same file open twice. And given the operations you do, this might even be the right thing... This example is very academic: can you show an example where the sharing of state between tasks is useful, actually? The problem here is, that I at a user would expect the write to happen at byte 100, but it will happen at byte 0: the seek happens on a different object than the write.
What might be a more obvious example, which goes wrong along the same lines:
f.write ("line 1\n");
f.write ("line 2\n");
f.write ("line 3\n");
That will result in a file
line 1
line 2
line 3
whereas the code
saga::task t1 = f.write ("line 1\n"); t1.run (); t1.wait ();
saga::task t2 = f.write ("line 2\n"); t2.run (); t2.wait ();
saga::task t3 = f.write ("line 3\n"); t3.run (); t3.wait ();
will result in a file
line 3
The last write will start at offset 0, as the previous write operated on a different file pointer. In general, you cannot execute any two tasks on a single object - at least not if any state is of concern, such as file pointer, pwd, replica name, stream server port, job id, ...
That is a no-go in my opinion, as it is counter-intuitive, breaks a large number of use cases, and is inconsistent with the synchronous method calls.
Yes, you can wreak havoc with state as well:
saga::task t1 = f.write ("line 1\n");
saga::task t2 = f.write ("line 2\n");
t1.run (); t2.run ();
t1.wait (); t2.wait ();
will likely result in
linline 2 e 1
or such - the user does need to think when doing multiple async ops at once. I don't see a way around that (and don't see a need for it either: we want to make the Grid stuff easy, but not revolutionize programming styles).
I should have added that I'd prefer 3:
3. when creating a task, all parameter objects are passed "by reference"
   + no enforced copying overhead
   - all objects are shared, lots of potential error conditions

The error conditions I could think of are:
- change state of an object while a task is running, hence having the task do something different than intended

Change of state? That is intentional - see above.
like destruction of objects

Well, that is what we discuss :-) 3 would delay destruction until it is safe (state is not needed anymore).
or change of objects.

What do you mean here?
Not to speak of synchronization conditions: suppose you have non-atomic write operations (which is everything that writes more than a single word to memory): do you thus also enforce object locking? If not, one task can see inconsistent object state just because another task is halfway through writing the object... (all the classical problems of shared-memory communication apply)

See above. You are right, but I don't see a way around that without causing more harm than good (child and bathtub come to my mind for some reason...).
BTW: the bulk optimization we have now assumes that tasks which run at the same time are, by their very definition, independent from each other, do not depend on any specific order of execution, and do not depend on each other in respect to object state.

Those are the very points we are talking about here - I think it is a very sensible assumption. I get the same behaviour on a unix shell, BTW:
touch file
date >> file &
date >> file &
I would not be able to make assumptions about the file contents... (well, here I could make a safe bet, but you know what I mean).
- limited control over resource deallocation

This is the same thing as above.
The problem really is that there is no "object lifecycle" defined. There is no way to define which task or thread might be responsible for, or even allowed to, destroy or change objects. Is it???

Yes, that is what I mean by limited control.
We had a discussion on this list and in Tokyo about the semantics of cancel(), which touches the same problem: should task.cancel() block until resources are freed? As we might talk about remote resources, and Grids are unreliable, we might block forever. That does not make sense, at least not always.
The resolution we came up with is that cancel() is advisory, so non-blocking, but can also use a timeout parameter (with -1 meaning forever) to block until resources are freed.
Timeouts do not make sense on destructors I believe, but 'advisory destruction' does, IMHO.
The advantages I see:
- no copy overhead (but, as you say, that is of no concern really)

ok, but a minor point.

Right. Let's forget about that from now on.
- simple, clearly defined semantics

No, it is the most dangerous of the three versions.

Well, see above - I think it has the most sensible semantics :-)
- tasks keep objects they operate on alive
- objects keep sessions they live in alive
- sessions keep contexts they use alive

What is the meaning of "alive" here??? Now that you have ruled out memory management...

See above: resources get freed if they are not needed anymore.
- sync and async operations operate on the same object instance.

Let's forget about "sync" here: it is the task that is running in the current thread, so multiple tasks share object instances.

Well, it would be nice to have the same semantics for sync and async, don't you think? :-)
Either way (1, 2 or 3), we have the user of the API thinking while using it - none of them is idiot-proof.

Well, we should strive to limit the mental load on the programmer as much as possible...
I think (2) is most problematic, if I understand your 'hand-over' correctly: that would mean you can't use the object again until the task is finished?

No, it means you will never again be allowed to use these objects. (Hand-over includes handing over the responsibility to clean up...)

Right. So you can never do an async read, then a sync seek, and then an async read again. At least not with sensible results.
Also, do I need to create 100 file instances to do 100 reads? Remember that opening a file is potentially a remote op in itself. Then we don't need the task model anymore.
That is broken IMHO.
Cheers, Andre.
Thilo -- "So much time, so little to do..." -- Garfield ----- End forwarded message -----
-- Best regards, Pascal Kleijer ---------------------------------------------------------------- HPC Marketing Promotion Division, NEC Corporation 1-10, Nisshin-cho, Fuchu, Tokyo, 183-8501, Japan. Tel: +81-(0)42/333.6389 Fax: +81-(0)42/333.6382
Quoting [Pascal Kleijer] (Jul 19 2006):
Hi Andre,
in-lined comments...
OK, long discussion about the context. Well, I just went quickly through it; here are my comments.
1) In pure OO languages the lifetime of the context depends on how many references to the object exist, regardless of where it was created. So you never lose the object as long as a pointer to it exists somewhere.
2) In general, no copies are made in OOP; just the pointer is maintained, so negligible time is spent on it - unless the method clearly stipulates that the object will be cloned (a full copy).
3) In OO languages the garbage collector handles the effective destruction, so no object is flushed when it goes out of scope (unless there are no more pointers to it).
Well, that is certainly true for Java, but not, for example, for C++. Also, although the spec is OO, we need to define that lifetime explicitly, to allow semantically identical mappings in non-OO languages.
However, I would agree that the behaviour you describe is the one to wish for.
True, non-OO languages are trickier to handle. But can the spec not state that this problem should be handled by the binding used? If you use a high-level language you will have an easy task; when going down to more primitive languages like Fortran or C, well, the binding will carry a big burden.
Yes, agreed. Well, we always said that ease-of-use is more important than ease-of-implementation, so I think moving the burden to the language binding, and finally to the implementors, is justified.
In the case of non-OO languages it might be preferable to pass by copy rather than by reference; this will have an impact on the runtime, but will avoid a lot of trouble.
I would assume that a non-OO implementation uses pointers to structures as object representations, or handles or such. Hence a copy would in fact represent a reference-copy, and the same semantics can easily be achieved.
No you didn't. I did a blunt Cut & Paste and omitted to remove the "wait". :(
:-) Right, w/o the waits your conclusions are correct I think, and monitors or locks should be able to provide some kind of atomicity.

However, we can't enforce monitors IMHO, as this would require these monitors to be available on the server side - and that cannot be assumed by default. That goes back to the discussion about POSIX consistency (which assumes that write() is atomic, if I remember a quote from Felix correctly), but we agreed that POSIX consistency is too much to ask for. Also, consistency is badly defined for operations other than read/write.

The spec currently says that an implementation CAN provide that type of consistency, and that in general it MUST carefully document the consistency model it supports. Hence, the problem is moved into implementation space.

Does that make sense to you?

Cheers, Andre.
-- "So much time, so little to do..." -- Garfield
Hi Andre, More in-lines...
In the case of non-OO languages it might be preferable to use by copy then by reference, this will however have an impact on the runtime but will avoid you a lot of troubles.
I would assume that a non-OO implementations uses pointers to structures as object representations, or handles or such. Hence a copy would in fact represent a reference-copy, and the same semantics can easily be achieved.
Well, no. The problem with non-OO languages and C++ is that you must explicitly garbage-collect your objects (call a destroy). So if you just copy the pointer you cannot test whether it has been destroyed, unless you handle a pointer to the pointer and can test for NULL (argh, in Fortran). So the idea is to make a full copy of the object state and use a local copy within the thread. This might cause a big overhead and be rather tricky to implement, especially if the object contains other objects. The recursive copy might be impossible, lengthy, or very complex.
The spec currently says that an implementation CAN provide that type of consistency, and that it in general MUST carefully document the consistency model it support. Hence, the problem is moved into implementation space.
Does that make sense to you?
Yeap, and this really points to the fact that the spec must be crystal clear about what must be done. Never assume that the reader/implementor knows; always state it, even to the point of redundancy.
Hi Pascal, Quoting [Pascal Kleijer] (Jul 20 2006):
Well no. The problem with non-OO languages and C++ is that you must explicitly garbage collect your objects (call a destroy). So if you just copy the pointer you cannot test if it has been destroyed unless you handle the pointer to the pointer and can test for a NULL (agrl in Fortran). So the idea is to make a full copy of the object state and use a locale copy within the thread. This might cause a big overhead and be rather tricky to implement, especially if object contains other objects. The recursion in copy might be either impossible, longish or very complex.
I disagree, and for two reasons even (that must be worth something :-)

For one, as I said earlier in the thread, I think that deep copy raises huge semantic problems:

task t_1 = file.seek <saga::task::task> (100, saga::file::SeekSet);
task t_2 = file.read <saga::task::task> (1, buf, &out);

t_1.run (); t_1.wait ();
t_2.run (); t_2.wait ();

I, as a user, would assume that the read reads byte number 100. But if we do a deep copy on task creation, the object state (e.g. the file pointer) gets _copied_. So the seek operates on a different file pointer than the read, and the read in fact returns the first byte in the file. That breaks a large number of use cases.

Secondly, you are right that C++ and C must handle destruction explicitly - but that does not mean that you cannot copy references. It just means that your implementation needs to keep track of them! Now, that might be difficult, but (a) there are many libs and tools helping with that, and there also exist known practices on how to do it, and (b) SAGA is about ease-of-use, not ease-of-implementation.

Hehe, now you need three counterarguments :-) Let's see what you come up with :-D

Cheers, Andre.

-- "So much time, so little to do..." -- Garfield
I'll just state the obvious here....
You could always make a C++ Ptr class that wraps access to a pointer
and does reference counting. If the objects in question do not
contain cyclic graphs, that would make things work just like java.
If they do contain cyclic graphs, you could at least arrange things so
that when you delete one copy of your Ptr<T>, all of them get set to 0
-- thus allowing you to test.
Cheers,
Steve
Quoting Andre Merzky
Hi Pascal,
Quoting [Pascal Kleijer] (Jul 20 2006):
In the case of non-OO languages it might be preferable to use by copy then by reference, this will however have an impact on the runtime but will avoid you a lot of troubles.
I would assume that a non-OO implementations uses pointers to structures as object representations, or handles or such. Hence a copy would in fact represent a reference-copy, and the same semantics can easily be achieved.
Well no. The problem with non-OO languages and C++ is that you must explicitly garbage collect your objects (call a destroy). So if you just copy the pointer you cannot test if it has been destroyed unless you handle the pointer to the pointer and can test for a NULL (agrl in Fortran). So the idea is to make a full copy of the object state and use a locale copy within the thread. This might cause a big overhead and be rather tricky to implement, especially if object contains other objects. The recursion in copy might be either impossible, longish or very complex.
I disagree, and for two reasons even (that must be worth something :-)
For one, as I said in the thread earlier, I think that deep copy raises huge semantic problems:
task t_1 = file.seek saga::task::task (100, saga::file::SeekSet); task t_2 = file.read saga::task::task (1, buf, &out);
t_1.run (); t_1.wait ();
t_2.run (); t_2.wait ();
I, as a user, would assume that the read reads byte number 100. But if we do a deep copy on task creation, the object state (e.g. the file pointer) gets _copied_. So the seek operates on a different file pointer than the read, and the read in fact returns the first byte in the file.
That breaks a large number of use cases.
Secondly, you are right, C++ and C must handle destruction explicitly - but that does not mean that you cannot copy references. It just means that your implementation needs to keep track of them!
Now, that might be difficult, but (a) there are many libraries and tools helping with that, and known practices for how to do it exist, and (b) SAGA is about ease-of-use, not ease-of-implementation.
Hehe, now you need three counterarguments :-) Let's see what you come up with :-D
Cheers, Andre.
-- "So much time, so little to do..." -- Garfield
Hi Steve, Quoting [sbrandt@cct.lsu.edu] (Jul 20 2006):
I'll just state the obvious here....
You could always make a C++ Ptr class that wraps access to a pointer and does reference counting. If the objects in question do not contain cyclic graphs, that would make things work just like java.
Actually, that is more or less what we do :-). We use boost::shared_ptr for that. Hence, the wrapper class actually gets copied, but the shared_ptr only increments the ref count in that case... Cheers, Andre.
If they do contain cyclic graphs, you could at least arrange things so that when you delete one copy of your Ptr<T>, all of them get set to 0 -- thus allowing you to test.
Cheers, Steve
Hello Andre, I will not come up with 3 counterarguments here. I will just show you that your first argument is flawed (or I think it is).

We assume the following example code:

task t_1 = file.seek <saga::task::task> (100, saga::file::SeekSet); task t_2 = file.read <saga::task::task> (1, buf, &out); t_1.run (); t_1.wait (); t_2.run (); t_2.wait ();

In a traditional single-machine implementation, if we do a pure deep copy of the file object in the task and then do the run/wait calls, yes, it might be true that the read reads byte 1 and not byte 100. This depends on how the file handle is implemented internally: if we stick to a pure deep copy, it means the physical file is opened by two handles, so the "seek" of t1 does not affect t2.

In our case we are dealing with a _remote_ file, due to the very nature of SAGA. If we assume a WSRF implementation, then the file object in our sample code is internally nothing more than an End Point Reference (EPR), or similar, depending on your implementation. Nothing speaks against having two full duplicates of the EPR (due to the deep copy) which still point to the same physical file, with the same handle, on the server side. When you do the run/wait we can assume an atomic call, which means the byte that is read is effectively #100.

In a remote environment, the real state of the object is not stored locally (on the client). So when you do a "seek", it happens on the remote object, not locally. It would be a design flaw if you stored the file's internal pointer, as well as other properties, locally. My assumption is that you are mixing local and remote handling - unless you want SAGA to handle both aspects within one object, which again is a design flaw in my opinion.

I hope I was clear enough; otherwise I will need an extra run to the coffee machine. :p
-- Best regards, Pascal Kleijer ---------------------------------------------------------------- HPC Marketing Promotion Division, NEC Corporation 1-10, Nisshin-cho, Fuchu, Tokyo, 183-8501, Japan. Tel: +81-(0)42/333.6389 Fax: +81-(0)42/333.6382
Hi Pascal, no need for coffee (hehe, apart from the basic need we probably all have), the example is good. Indeed, it depends on where you maintain state.

The problem is, however, that we cannot expect remote state management as a default. For example, GridFTP based access to a remote file, which is not far fetched as you will probably agree, would need to maintain local state, as FTP does not maintain server-side state (for our purposes, that is).

It is possible to keep API objects and state management separate on the client side (so within the SAGA implementation). Indeed, that would solve the problem as well, as you then, in some respect, avoid a really deep copy, and only copy references to the real state management. As a boundary case, that in fact corresponds to the use of reference counters and shallow copies again: the state management would need to use the same refcounting and alloc/free policies that one would use for object references in the first place.

Cheers, Andre. Quoting [Pascal Kleijer] (Jul 21 2006):
Hello Andre,
I will not come up with 3 counter arguments here. I will just show you that your first argument is flawed (or think it is).
We assume that the following is the example code:
task t_1 = file.seek <saga::task::task> (100, saga::file::SeekSet); task t_2 = file.read <saga::task::task> (1, buf, &out); t_1.run (); t_1.wait (); t_2.run (); t_2.wait ();
In a traditional single-machine implementation, if we do a pure deep copy of the file object in the task and then do the run/wait calls, yes, it might be true that the read reads byte 1 and not byte 100. This depends on how the file handle is implemented internally: if we stick to a pure deep copy, it means the physical file is opened by two handles, so the "seek" of t1 does not affect t2.
In our case we are dealing with a _remote_ file, due to the very nature of SAGA. If we assume a WSRF implementation, then the file object in our sample code is internally nothing more than an End Point Reference (EPR), or similar, depending on your implementation. Nothing speaks against having two full duplicates of the EPR (due to the deep copy) which still point to the same physical file, with the same handle, on the server side. When you do the run/wait we can assume an atomic call, which means the byte that is read is effectively #100.
In a remote environment, the real state of the object is not stored locally (on the client). So when you do a "seek", it happens on the remote object, not locally. It would be a design flaw if you stored the file's internal pointer, as well as other properties, locally. My assumption is that you are mixing local and remote handling - unless you want SAGA to handle both aspects within one object, which again is a design flaw in my opinion.
I hope I was clear enough, otherwise I will need an extra run to the coffee machine. :p
Dear Andre and Pascal, dear all on the list,

we seem to have different opinions on the proper handling of shared objects and of concurrency control. I had been silent as I did not see a way to resolve this. First of all, we should all separate our own opinions from our own work, favourite language, and implementations...

I think there are a few facts that we need to accept:

1. SAGA is about simplicity of use. Application programmers should need only a minimum of additional mental burden to use SAGA.

2. The SAGA spec MUST be independent of all possible implementation languages. The spec should allow language bindings that make SAGA-Java "feel like" normal Java programming, SAGA-C++ feel like normal C++ programming, and SAGA-Fortran feel like normal Fortran programming... Along the same line, a SAGA binding to C++ must not require using a library like Boost, or equivalent.

3. With the task model, SAGA has introduced a concurrent programming model.

The recent discussion on the list has been revolving around: a) object lifecycle (and memory) management, b) concurrency control, c) being prescriptive or leaving freedom to the application programmer.

In previous mails, I have argued for some amount of being prescriptive. By now, I think, we should give up being prescriptive altogether, paving the road for both simplicity and flexibility for application programmers. (This will have a price to pay, though; see below.)

And here comes my suggestion for what to do with a) and b):

a) SAGA does NOT handle object life cycle and memory management at all. It is ALWAYS the responsibility of the application programmer to create and destroy objects, even if objects are created via SAGA constructors. SAGA thus NEVER destroys objects.
(with the exception of data structures that are internal to its implementation and are not exported/visible at the API). The application programmer has to do object creation and destruction with the respective means of the programming language in use.

b) SAGA does NOT impose any concurrency control mechanisms. It is the responsibility of the application programmer to ensure the correctness of his/her program. In the case of SAGA's tasks, the application programmer has to make sure that all tasks and the main program execute correctly, in any possible interleaving or execution order. Whenever tasks share objects with each other or with the main program, the application programmer must ensure that no race conditions occur. The application programmer is free to use separate concurrency control mechanisms from his/her own programming environment, like mutexes, semaphores, monitors, locks, etc.

The related problem of thread-safety for SAGA operations is not applicable at the level of the SAGA specification. It has to be addressed at the level of language bindings.

Thilo -- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Hiho, Quoting [Thilo Kielmann] (Aug 01 2006):
a) SAGA does NOT handle object life cycle and memory management at all. It is ALWAYS the responsibility of the application programmer to create and destroy objects, even if objects are created via SAGA constructors. SAGA thus NEVER destroys objects. (with the exception of data structures that are internal to its implementation, that are not exported/visible at the API) The application programmer has to do object creation and destruction, with the respective means of the programming language in use.
I am not sure I really understand proposal (a). Please don't take my questions as criticism; it's really about clarifying what you mean. And sorry for the examples being C++ :-) If you say that the application programmer MUST destroy objects explicitly, would that mean that the code: { saga::file (url); file.copy (tgt); } would leak memory? file goes out of scope, but as the instance is not explicitly destroyed by the programmer (no file.destroy() called or whatever), the object stays active? Or would you include 'going out of scope' in the 'means of the programming language'?
b) SAGA does NOT impose any concurrency control mechanisms. It is the responsibility of the application programmer to ensure the correctness of his/her program. In case of SAGA's tasks, the application programmer has to make sure that all tasks and the main program execute correctly, in any possible interleavings or execution orders. Whenever tasks share objects with each other or the main program, the application programmer must ensure that no race conditions will occur.
The application programmer is free to use separate concurrency control mechanisms from his/her own programming environment, like mutexes, semaphors, monitors, locks, etc.
I agree to (b) so far, totally. These separate control mechanisms are available and not Grid specific - so there is no need for SAGA to re-invent them.
The related problem of thread-safety for SAGA operations is not applicable on the level of the SAGA specification. It has to be addressed on the level of language bindings.
I agree to that to some extent - the spec should, IMHO, limit itself to the statement 'SAGA implementations MUST be thread safe'. Cheers, Andre. -- "So much time, so little to do..." -- Garfield
Hoho, I think you understand...
Quoting [Thilo Kielmann] (Aug 01 2006):
a) SAGA does NOT handle object life cycle and memory management at all. It is ALWAYS the responsibility of the application programmer to create and destroy objects, even if objects are created via SAGA constructors. SAGA thus NEVER destroys objects. (with the exception of data structures that are internal to its implementation, that are not exported/visible at the API) The application programmer has to do object creation and destruction, with the respective means of the programming language in use.
I am not sure I understand the proposal (a) really. Please don't take my questions not as criticism, it's really about clarifying what you mean. And sorry for the examples beeing C++ :-)
If you say that the application programmer MUST destroy objects explicitely, that would mean that the code:
{ saga::file (url); file.copy (tgt); }
would leak memory? File goes out of scope, but as the instance is not explicitely destroyed by the programmer (no file.destroy() called or whatever), the object stays active? Or would you include 'going out of scope' into the 'means of the programming language'?
please excuse my ignorance, you mean something like the following? { saga::file f(url); f.copy (tgt); } Then f goes out of scope and will be destroyed by the C++ runtime system. That's fine, a "means of the programming language". Does your agreement also imply that you agree this obsoletes sections 1.3.7 (Life Time Management) and 1.3.8 (Freeing of Resources and Garbage Collection) of the current "strawman" spec?
b) SAGA does NOT impose any concurrency control mechanisms. It is the responsibility of the application programmer to ensure the correctness of his/her program. In case of SAGA's tasks, the application programmer has to make sure that all tasks and the main program execute correctly, in any possible interleavings or execution orders. Whenever tasks share objects with each other or the main program, the application programmer must ensure that no race conditions will occur.
The application programmer is free to use separate concurrency control mechanisms from his/her own programming environment, like mutexes, semaphors, monitors, locks, etc.
I agree to (b) so far, totally. These separate control mechanisms are aviable and not Grid specific - so there is no need for SAGA to re-invent them.
It never was about re-inventing. The alternative would be prescribing the use of such mechanisms.
The related problem of thread-safety for SAGA operations is not applicable on the level of the SAGA specification. It has to be addressed on the level of language bindings.
I agree to that to some extend - the spec should, IMHO, limit itself to the statement 'SAGA implementations MUST be thread safe'.
Disagree. This would again prescribe the use of thread-safe implementations. So you would enforce the overhead of thread-safety (e.g. locking all over the place) also on Fortran??? Really, "we don't prescribe anything" also excludes prescribing thread safety. Keep it for the language binding. That's where it belongs. Thilo -- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
hi, Quoting [Thilo Kielmann] (Aug 01 2006):
The related problem of thread-safety for SAGA operations is not applicable on the level of the SAGA specification. It has to be addressed on the level of language bindings.
I agree to that to some extend - the spec should, IMHO, limit itself to the statement 'SAGA implementations MUST be thread safe'.
Disagree. This would again prescribe the use of thread-safe implementations. So you would enforce the overhead of thread-safety (e.g. locking all over the place) also to Fortran ??? Really, "we don't prescibe anything" also excludes prescribing thread safety.
Keep it for the language binding. That's where it belongs.
Fortran is a good point... I agree. -- "So much time, so little to do..." -- Garfield
and hi :-) Quoting [Thilo Kielmann] (Aug 01 2006):
Quoting [Thilo Kielmann] (Aug 01 2006):
a) SAGA does NOT handle object life cycle and memory management at all. It is ALWAYS the responsibility of the application programmer to create and destroy objects, even if objects are created via SAGA constructors. SAGA thus NEVER destroys objects. (with the exception of data structures that are internal to its implementation, that are not exported/visible at the API) The application programmer has to do object creation and destruction, with the respective means of the programming language in use.
I am not sure I understand the proposal (a) really. Please don't take my questions not as criticism, it's really about clarifying what you mean. And sorry for the examples beeing C++ :-)
If you say that the application programmer MUST destroy objects explicitely, that would mean that the code:
{ saga::file (url); file.copy (tgt); }
would leak memory? File goes out of scope, but as the instance is not explicitely destroyed by the programmer (no file.destroy() called or whatever), the object stays active? Or would you include 'going out of scope' into the 'means of the programming language'?
please excuse my ignorance, you mean someting like the following?
{ saga::file f(url); f.copy (tgt); }
yep, sorry...
Then f is getting out of scope and will be destroyed by the C++ runtime system. That's fine, a "means of the programming language".
ok, understand what you mean now.
Does your agreement also imply you agree that this is obsoleting sections 1.3.7 Life Time Management and 1.3.8 Freeing of Resources and Garbage Collection of the current "strawman" spec?
Aehm, no ;-) Now, here is the counterargument: I think what you propose works nicely, is simple, and is well defined (simple rules). However, I think it is awkward to use. Our favourite bulk file copy code snippet would, for example, cease to work:

-----------------------------------------------
saga::task_container tc;

for (i = 1; i < 1000; i++)
{
  saga::file f (url[i]);
  saga::task t = f.copy <saga::Task> ("/tmp/");
  tc.add (t);
}

tc.run ();
tc.wait ();
-----------------------------------------------

It is not only the file object which goes out of scope here; it is also the task! Also, it basically means that a programmer needs to keep track of sessions!

-----------------------------------------------
// pseudo code
saga::session s;
saga::directory d (s, url);
saga::file f = d.open (name);
d.destroy ();
-----------------------------------------------

Can/must the session be destroyed here? No, as it is still needed by the file. The situation gets much worse with many objects, and tasks, which are passed around in subroutines. Effectively, the application programmer has to do reference counting, and needs to know exactly where a session is inherited and where not. All that is simple to do in the SAGA implementation, which _knows_ where it is inherited, and which will do ref counting anyway...

Again, I think your proposal is simple and it works, so it is a valid option -- but IMHO it makes programming SAGA applications an ugly business. Nix für ungut (no offense) :-)

Cheers, Andre.
b) SAGA does NOT impose any concurrency control mechanisms. It is the responsibility of the application programmer to ensure the correctness of his/her program. In case of SAGA's tasks, the application programmer has to make sure that all tasks and the main program execute correctly, in any possible interleavings or execution orders. Whenever tasks share objects with each other or the main program, the application programmer must ensure that no race conditions will occur.
The application programmer is free to use separate concurrency control mechanisms from his/her own programming environment, like mutexes, semaphors, monitors, locks, etc.
I agree to (b) so far, totally. These separate control mechanisms are aviable and not Grid specific - so there is no need for SAGA to re-invent them.
It never was about re-inventing. The alternative would be prescribing to use such mechanisms.
The related problem of thread-safety for SAGA operations is not applicable on the level of the SAGA specification. It has to be addressed on the level of language bindings.
I agree to that to some extend - the spec should, IMHO, limit itself to the statement 'SAGA implementations MUST be thread safe'.
Disagree. This would again prescribe the use of thread-safe implementations. So you would enforce the overhead of thread-safety (e.g. locking all over the place) also to Fortran ??? Really, "we don't prescibe anything" also excludes prescribing thread safety.
Keep it for the language binding. That's where it belongs.
Thilo
-- "So much time, so little to do..." -- Garfield
Our favourite bulk file copy code snippet would, for example, cease to work:
----------------------------------------------- saga::task_container tc;
for (i = 1; i < 1000; i++) { saga::file f (url[i]); saga::task t = f.copy <saga::Task> ("/tmp/"); tc.add (t); }
tc.run (); tc.wait (); -----------------------------------------------
It's not only the file object which goes out of scope here, it's also the task!
Indeed, this code is broken. It might artificially be made to work by all the fancies you guys have put into the implementation. In proper code, objects would be created on the heap (with new). You had better do that anyway, as you don't know how big your stack is, or how big the objects are that some 3rd-party library creates. (Especially if you have multiple threads, you cannot assume stacks grow infinitely.) Yes, with C++ you have to do life cycle management of object references. I know: by keeping SAGA out of the memory management business, the price for this clarity is manual destruction of objects. But: we should stop arguing over artificial examples that are only designed to prove a certain argument. Sessions are by definition rather static; in a real application I cannot see a session being destroyed before the end of the program anyway.
All that is simple to do in the SAGA implementation, which _knows_ where it is inherited, and which will do ref counting anyway...
No way. SAGA needs to stay away from object lifecycle management.
Again, I think your proposal is simple and it works, so its an valid option -- but IMHO it makes programming SAGA applications an ugly business.
Yes, in less advanced languages there is some price to pay. But there is no single reason to spoil the language-independent spec, just because C++ doesn't have proper memory management. Thilo -- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Quoting [Thilo Kielmann] (Aug 01 2006):
Our favourite bulk file copy code snippet would, for example, cease to work:
----------------------------------------------- saga::task_container tc;
for (i = 1; i < 1000; i++) { saga::file f (url[i]); saga::task t = f.copy <saga::Task> ("/tmp/"); tc.add (t); }
tc.run (); tc.wait (); -----------------------------------------------
Its not only the file object which goes out of scope here, its also the task!
Indeed, this code is broken. It might artificially be made to work by all the fancies that you guys have put into the implementation.
In proper code, objects would be created on the heap (with new). You better do that anyway as you don't know how big your stack is and how big the objects are that some 3rd-party library creates. (Esp. if you have multiple threads, you cannot assume stacks to grow infinitely.)
Ok, assume we define the async methods to return pointers to tasks, not task instances (you cannot use 'new' to create a task):

----------------------------------------------
saga::task_container tc;

for (i = 1; i < 1000; i++)
{
  saga::file * f = new saga::file (url[i]);
  saga::task * t = f->copy <saga::Task> ("/tmp/");
  tc.add (t);
}

tc.run ();
tc.wait ();
----------------------------------------------

Then the code is valid; but, again, the user needs to keep track of the instances, as they are used _internally_ in SAGA.
Yes, with C++ you have to do life cycle management of object references.
Of course - but that does not mean you cannot do it IN the library. In Qt, for example, which is regarded as a poster child of an easy-to-use C++ library, object lifetime is managed in libqt, not in the application (though the application CAN manage lifetimes explicitly). Qt uses the following mechanisms: guarded/shared pointers (QPointer); special signals and slots (deleteLater(), destroyed()) to signal internal object destruction; and object dependencies (if a parent gets destroyed, its children get destroyed). There is no reason to put that burden on the application level. I don't see this as a language issue, BTW - I know it can be done in C (hence also in Fortran bindings), Python, and Perl, and I am rather positive for Java, too.
I know, by keeping SAGA out of the memory management business, the price for this clarity is manual destroying of objects.
I think we differ in what clarity means. You propose a clear definition and simple rules. I would argue that clear and simple application code is more important.
But: we should stop arguing over artificial examples that are only designed for proving a certain argument.
I'd be happy to see code proving the opposite :-) But really, all that matters in the end is what the application programmer will see and use, and that is code. It does not help us to define simple rules etc. if the resulting SAGA code is cumbersome to use...
Sessions are by definition rather static; in a real application I can not see a session to be destroyed before the end of the program anyway.
Maybe. But that is definitely not the case for tasks and other object dependencies.
All that is simple to do in the SAGA implementation, which _knows_ where it is inherited, and which will do ref counting anyway...
No way. SAGA needs to stay away from object lifecycle management.
Can you explain why?
Again, I think your proposal is simple and it works, so its an valid option -- but IMHO it makes programming SAGA applications an ugly business.
Yes, in less advanced languages there is some price to pay. But there is no single reason to spoil the language-independent spec, just because C++ doesn't have proper memory management.
No, that's not a language issue - see above. Andre. -- "So much time, so little to do..." -- Garfield
Hi Thilo,
we seem to have different opinions on the proper handling of shared objects and of concurrency control. I had been silent as I did not see a way to resolve this. First of all, we all should separate our own opinions from our own work, favourite language, and implementations...
I think there are a few facts that we need to accept:
1. SAGA is about simplicity of use. Application programmers should require only a minimum of additional mental burden to use SAGA.
Agreed 100%
2. The SAGA spec MUST be independent of all possible implementation languages. The spec should allow language bindings that make SAGA-Java to "feel like" normal Java programming, SAGA-C++ to feel like normal C++ programming, and SAGA-Fortran to feel like normal Fortran programming...
Agreed as well.
Along the same line, a SAGA binding to C++ must not require using a library like boost, or equivalent.
Strongly disagree. 1. Boost is nothing other than the C++ standard library of tomorrow. In fact, about 10 of Boost's libraries have been accepted into C++ TR1 or TR2 already; more are to follow. 2. The Boost libraries are peer reviewed and nowadays used by half of the C++ community. Their code quality and portability are much better than anything we could easily craft ourselves. If we were not using Boost we'd have to write all this code ourselves - which doesn't make any sense to me. 3. You are talking about keeping this discussion free from any language specifics, but OTOH you're trying to push through a requirement which is highly language specific.
3. With the task model, SAGA has introduced a concurrent programming model.
Agreed.
The recent discussion on the list has been revolving around: a) object lifecycle (and memory) management b) concurrency control c) being prescriptive or leaving freedom to the application programmer
Since you've started the C++ business youself here, I'll try to answer your questions wrt the C++ implementation we have so far.
In previous mails, I have argued for some amount of being prescriptive. By now, I think, we should give up being prescriptive altogether, paving the road for both simplicity and flexibility for application programmers. (There will be a price to pay, though, see below.)
And here comes my suggestion for what to do with a) and b):
a) SAGA does NOT handle object life cycle and memory management at all. It is ALWAYS the responsibility of the application programmer to create and destroy objects, even if objects are created via SAGA constructors. SAGA thus NEVER destroys objects (with the exception of data structures that are internal to its implementation and are not exported/visible at the API). The application programmer has to do object creation and destruction with the respective means of the programming language in use.
That's exactly the way it is in the C++ ref implementation we have so far.
b) SAGA does NOT impose any concurrency control mechanisms. It is the responsibility of the application programmer to ensure the correctness of his/her program. In case of SAGA's tasks, the application programmer has to make sure that all tasks and the main program execute correctly, in any possible interleavings or execution orders. Whenever tasks share objects with each other or the main program, the application programmer must ensure that no race conditions will occur.
That's exactly the way it is in the C++ ref implementation we have so far, with the exception that SAGA protects its internal data structures to avoid internal race conditions.
The application programmer is free to use separate concurrency control mechanisms from his/her own programming environment, like mutexes, semaphores, monitors, locks, etc.
IIUC, this mainly means that a SAGA implementation should not prevent the application programmer from using whatever concurrency model/library he wants to use. I don't see any problems here, since all the concurrency handling introduced by an implementation is internal to that implementation anyway (at least as long as the implementation is crafted in a sensible way).
The related problem of thread-safety for SAGA operations is not applicable on the level of the SAGA specification. It has to be addressed on the level of language bindings.
I agree with your observations. Regards Hartmut
Thilo -- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Dear Hartmut, I am happy to hear about the acceptance of BOOST. However, we have to be careful as we are working on standardizing the language-independent SAGA API here. The C++-related features should go to the C++ language binding. In there, we must use today's C++ standard, unless there is strong evidence that "tomorrow" (meaning: BOOST becoming standard) is going to happen really soon now. But this discussion should be deferred to the writing of the C++ language binding. (We just should keep the language-independent SAGA spec clean of this issue.) After another night of thinking: I believe that the language-independent SAGA spec must not prescribe any memory management and object lifetime issues at all. This is purely to be addressed in the language bindings, same as with thread safety. So, my revised suggestion is: a) The SAGA specification does NOT address issues of object life cycle and memory management at all. It is up to the language bindings of SAGA to define this in a way that suits the respective programming languages. Further, I think we should have a disclaimer in the SAGA spec like: The programming examples used in this document are for illustrative purposes only. They do NOT prescribe any bindings to particular programming languages. These will be defined in companion documents. Thilo -- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Thilo,
I am happy to hear about the acceptance of BOOST. However, we have to be careful as we are working on standardizing the language-independent SAGA API here. The C++-related features should go to the C++ language binding. In there, we must use today's C++ standard, unless there is strong evidence that "tomorrow" (meaning: BOOST becoming standard) is going to happen really soon now. But this discussion should be deferred to the writing of the C++ language binding. (We just should keep the language-independent SAGA spec clean of this issue.)
We should keep Boost out of the spec. I agree here. And we should keep it out of the language binding as well (at least as far as a certain Boost feature isn't expected to be part of the next C++ Standard). I agree here as well. But we don't have to keep Boost out of our C++ implementation. That's the point where I disagree.
After another night of thinking:
I believe that the language-independent SAGA spec must not prescribe any memory management and object lifetime issues at all. This is purely to be addressed in the language bindings, same as with thread safety.
So, my revised suggestion is:
a) The SAGA specification does NOT address issues of object life cycle and memory management at all. It is subject to the language bindings of SAGA to define this in a way that suits the respective programming languages.
Agreed, as long as we require some kind of generalized 'deep copy semantics' for all SAGA objects. Please don't get me wrong: that doesn't mean an implementation really has to use deep copies. It only means that, to the application programmer, objects have to appear to have deep copy semantics (how this is achieved is highly implementation defined).
Further, I think we should have a disclaimer in the SAGA spec like:
The programming examples used in this document are for illustrative purposes only. They do NOT prescribe any bindings to particular programming languages. These will be defined in companion documents.
Disclaimers are always a good thing! You should add something like: please use SAGA only if you know what you're doing :-P Regards Hartmut
Thilo -- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
I don't think we disagree here: On Tue, Aug 01, 2006 at 07:07:30PM -0500, Hartmut Kaiser wrote:
We should keep Boost out of the spec. I agree here. And we should keep it out of the language binding as well (at least as far as a certain Boost feature isn't expected to be part of the next C++ Standard). I agree here as well. But we don't have to keep Boost out of our C++ implementation.
This is perfectly fine. Thilo -- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Quoting [Thilo Kielmann] (Aug 02 2006):
Further, I think we should have a disclaimer in the SAGA spec like:
The programming examples used in this document are for illustrative purposes only. They do NOT prescribe any bindings to particular programming languages. These will be defined in companion documents.
That statement is in the spec already :-) A. -- "So much time, so little to do..." -- Garfield
participants (7)
- 'Andre Merzky'
- 'Thilo Kielmann'
- Andre Merzky
- Hartmut Kaiser
- Pascal Kleijer
- sbrandt@cct.lsu.edu
- Thilo Kielmann