|
|||
|
On Wed, Oct 18, 2006 at 09:37:51PM +0000, Hans Aberg wrote:
> In article <20061018212528.GC18480@morgenes.shire.sytes.net >, Robbert > Haarman <comp.lang.misc@inglorion.net> wrote: > > > Implementing a proper garbage collector isn't that hard, really. > > Do you mean implementing it or getting it to work with C++ classes? - If > you have some inputs on the latter I am interested in knowing. The problem > with C++ is that it does not supply runtime root set information, which is > needed in a tracing GC. Well, I was assuming that you only have to implement the collector for the few classes that your program actually uses, and that your program contains one or a few data structures that refer to all objects that must be traced. E.g., if you have one class (or struct) for your particles, and one container class (or other data structure) that points to them, you can easily implement tracing and copying for those two classes. You don't have to support all of C++ and figure out a generic way to get at the root set to make it work for your program. > > And you > > can always use the Boehm collector, which can be used as a drop-in > > replacement for malloc and free (meaning you don't need to rewrite any > > code). > > Good input, though it is not a C++ collector proper, but goes into the > layer between the compiler and the platform. Yes. The great advantage of the Boehm collector is that you can simply recompile your (C, I don't know if it works for C++, too) program so that malloc uses Boehm's allocator, and free() becomes a no-op. You could use this with your existinc C program to get a handle on how well it does. I don't know how well it works with C++; I would imagine there are certain features of C++ that would not play well with it (for example, does the Boehm collector call destructors?). > And I am not sure how fast it is, if the idea is to use it is to speed > up the program. There have been a few studies that compare the performance of programs using malloc/free, and the same programs recompiled to use the Boehm collector. I think there's also a study that uses C++ programs, but I couldn't find a link just now. > And it is also conservative, meaning that it guesses what to collect, > and thus does not collect everything. That's correct. I think, all in all, your best bet is to run a few experiments and see how it performs for you. > One way to implement quick allocation is to not collect at all: Make a > large array, with a pointer moving, and just move it when allocation space > is needed. You mean like a circular buffer? Does your allocation/deallocation behavior fit such a system? In that case it might indeed be the most efficient option. The only possible performance problem I can think of from the top of my head is that locality of reference may not be as good as what you would get from a garbage collector. Regards, Bob --- Time flies like an arrow. Fruit flies like a banana. |
|
|
||||
|
||||
|
|
|
|||
|
On Wed, Oct 18, 2006 at 09:37:51PM +0000, Hans Aberg wrote:
> In article <20061018212528.GC18480@morgenes.shire.sytes.net >, Robbert > Haarman <comp.lang.misc@inglorion.net> wrote: > > > Implementing a proper garbage collector isn't that hard, really. > > Do you mean implementing it or getting it to work with C++ classes? - If > you have some inputs on the latter I am interested in knowing. The problem > with C++ is that it does not supply runtime root set information, which is > needed in a tracing GC. Well, I was assuming that you only have to implement the collector for the few classes that your program actually uses, and that your program contains one or a few data structures that refer to all objects that must be traced. E.g., if you have one class (or struct) for your particles, and one container class (or other data structure) that points to them, you can easily implement tracing and copying for those two classes. You don't have to support all of C++ and figure out a generic way to get at the root set to make it work for your program. > > And you > > can always use the Boehm collector, which can be used as a drop-in > > replacement for malloc and free (meaning you don't need to rewrite any > > code). > > Good input, though it is not a C++ collector proper, but goes into the > layer between the compiler and the platform. Yes. The great advantage of the Boehm collector is that you can simply recompile your (C, I don't know if it works for C++, too) program so that malloc uses Boehm's allocator, and free() becomes a no-op. You could use this with your existinc C program to get a handle on how well it does. I don't know how well it works with C++; I would imagine there are certain features of C++ that would not play well with it (for example, does the Boehm collector call destructors?). > And I am not sure how fast it is, if the idea is to use it is to speed > up the program. There have been a few studies that compare the performance of programs using malloc/free, and the same programs recompiled to use the Boehm collector. I think there's also a study that uses C++ programs, but I couldn't find a link just now. > And it is also conservative, meaning that it guesses what to collect, > and thus does not collect everything. That's correct. I think, all in all, your best bet is to run a few experiments and see how it performs for you. > One way to implement quick allocation is to not collect at all: Make a > large array, with a pointer moving, and just move it when allocation space > is needed. You mean like a circular buffer? Does your allocation/deallocation behavior fit such a system? In that case it might indeed be the most efficient option. The only possible performance problem I can think of from the top of my head is that locality of reference may not be as good as what you would get from a garbage collector. Regards, Bob --- Time flies like an arrow. Fruit flies like a banana. |
|
|||
|
In article <20061019064737.GD18480@morgenes.shire.sytes.net >, Robbert
Haarman <comp.lang.misc@inglorion.net> wrote: > > > Implementing a proper garbage collector isn't that hard, really. > > > > Do you mean implementing it or getting it to work with C++ classes? - If > > you have some inputs on the latter I am interested in knowing. The problem > > with C++ is that it does not supply runtime root set information, which is > > needed in a tracing GC. > > Well, I was assuming that you only have to implement the collector for > the few classes that your program actually uses, and that your program > contains one or a few data structures that refer to all objects that > must be traced. E.g., if you have one class (or struct) for your > particles, and one container class (or other data structure) that points > to them, you can easily implement tracing and copying for those two > classes. You don't have to support all of C++ and figure out a generic > way to get at the root set to make it work for your program. Steve Strassman who wrote a Dylan*compiler mentioned to me that it is experimental difficult to implement a GC in C++ proper. Then*I tried to do it for a polymorphic (virtual) class hierarchy, and it was sort of*impossible to get hold of the root set. So I only use a reference count. The next C++ revision might have some GC writing support, though. > I don't know how well it works with C++; I would imagine there are > certain features of C++ that would not play well with it (for example, > does the Boehm collector call destructors?). I do not know much about it, in particular not this. It seems likely it is just the memory allocation/deallocation operators that are changed, as*constructors/destructors can be used to control other semantics. > > One way to implement quick allocation is to not collect at all: Make a > > large array, with a pointer moving, and just move it when allocation space > > is needed. > > You mean like a circular buffer? Does your allocation/deallocation > behavior fit such a system? In that case it might indeed be the most > efficient option. The only possible performance problem I can think of > from the top of my head is that locality of reference may not be as good > as what you would get from a garbage collector. I just took out the*garbage*collection of a two-space GC. It the latter, one first fills up one space, and when it is full, the stuff is copied over to the new other space, and so on. It can be made arbitrarily*faster by enlarging the copying spaces. So if it is*sufficiently large, no GC will take place. -- Hans Aberg |
|
|||
|
In article <20061019064737.GD18480@morgenes.shire.sytes.net >, Robbert
Haarman <comp.lang.misc@inglorion.net> wrote: > > > Implementing a proper garbage collector isn't that hard, really. > > > > Do you mean implementing it or getting it to work with C++ classes? - If > > you have some inputs on the latter I am interested in knowing. The problem > > with C++ is that it does not supply runtime root set information, which is > > needed in a tracing GC. > > Well, I was assuming that you only have to implement the collector for > the few classes that your program actually uses, and that your program > contains one or a few data structures that refer to all objects that > must be traced. E.g., if you have one class (or struct) for your > particles, and one container class (or other data structure) that points > to them, you can easily implement tracing and copying for those two > classes. You don't have to support all of C++ and figure out a generic > way to get at the root set to make it work for your program. Steve Strassman who wrote a Dylan*compiler mentioned to me that it is experimental difficult to implement a GC in C++ proper. Then*I tried to do it for a polymorphic (virtual) class hierarchy, and it was sort of*impossible to get hold of the root set. So I only use a reference count. The next C++ revision might have some GC writing support, though. > I don't know how well it works with C++; I would imagine there are > certain features of C++ that would not play well with it (for example, > does the Boehm collector call destructors?). I do not know much about it, in particular not this. It seems likely it is just the memory allocation/deallocation operators that are changed, as*constructors/destructors can be used to control other semantics. > > One way to implement quick allocation is to not collect at all: Make a > > large array, with a pointer moving, and just move it when allocation space > > is needed. > > You mean like a circular buffer? Does your allocation/deallocation > behavior fit such a system? In that case it might indeed be the most > efficient option. The only possible performance problem I can think of > from the top of my head is that locality of reference may not be as good > as what you would get from a garbage collector. I just took out the*garbage*collection of a two-space GC. It the latter, one first fills up one space, and when it is full, the stuff is copied over to the new other space, and so on. It can be made arbitrarily*faster by enlarging the copying spaces. So if it is*sufficiently large, no GC will take place. -- Hans Aberg |
|
|||
|
On Thu, Oct 19, 2006 at 07:03:13AM +0000, Hans Aberg wrote:
> In article <20061019064737.GD18480@morgenes.shire.sytes.net >, Robbert > Haarman <comp.lang.misc@inglorion.net> wrote: > > > > One way to implement quick allocation is to not collect at all: Make a > > > large array, with a pointer moving, and just move it when allocation space > > > is needed. > > > > You mean like a circular buffer? Does your allocation/deallocation > > behavior fit such a system? In that case it might indeed be the most > > efficient option. The only possible performance problem I can think of > > from the top of my head is that locality of reference may not be as good > > as what you would get from a garbage collector. > > I just took out the*garbage*collection of a two-space GC. It the latter, > one first fills up one space, and when it is full, the stuff is copied > over to the new other space, and so on. It can be made arbitrarily*faster > by enlarging the copying spaces. So if it is*sufficiently large, no GC > will take place. Ah, so no circular buffer, just allocating without ever collecting anything? How feasible will that be with your large data set? Wouldn't you need an unaffordable amount of memory? Regards, Bob --- "A good engineer will go to any amount of effort to avoid extra effort." |
|
|||
|
On Thu, Oct 19, 2006 at 07:03:13AM +0000, Hans Aberg wrote:
> In article <20061019064737.GD18480@morgenes.shire.sytes.net >, Robbert > Haarman <comp.lang.misc@inglorion.net> wrote: > > > > One way to implement quick allocation is to not collect at all: Make a > > > large array, with a pointer moving, and just move it when allocation space > > > is needed. > > > > You mean like a circular buffer? Does your allocation/deallocation > > behavior fit such a system? In that case it might indeed be the most > > efficient option. The only possible performance problem I can think of > > from the top of my head is that locality of reference may not be as good > > as what you would get from a garbage collector. > > I just took out the*garbage*collection of a two-space GC. It the latter, > one first fills up one space, and when it is full, the stuff is copied > over to the new other space, and so on. It can be made arbitrarily*faster > by enlarging the copying spaces. So if it is*sufficiently large, no GC > will take place. Ah, so no circular buffer, just allocating without ever collecting anything? How feasible will that be with your large data set? Wouldn't you need an unaffordable amount of memory? Regards, Bob --- "A good engineer will go to any amount of effort to avoid extra effort." |
|
|||
|
In article <20061019071313.GE18480@morgenes.shire.sytes.net >, Robbert
Haarman <comp.lang.misc@inglorion.net> wrote: > > > > One way to implement quick allocation is to not collect at all: Make a > > > > large array, with a pointer moving, and just move it when allocation space > > > > is needed. > > > > > > You mean like a circular buffer? Does your allocation/deallocation > > > behavior fit such a system? In that case it might indeed be the most > > > efficient option. The only possible performance problem I can think of > > > from the top of my head is that locality of reference may not be as good > > > as what you would get from a garbage collector. > > > > I just took out the*garbage*collection of a two-space GC. It the latter, > > one first fills up one space, and when it is full, the stuff is copied > > over to the new other space, and so on. It can be made arbitrarily*faster > > by enlarging the copying spaces. So if it is*sufficiently large, no GC > > will take place. > > Ah, so no circular buffer, just allocating without ever collecting > anything? How feasible will that be with your large data set? Wouldn't > you need an unaffordable amount of memory? It is audacious, and will clearly work only in special circumstances, but is fastest. Just mention it as a possibility. And if one has virtual memory and the active parts are not spread over many pages, the unused stuff will be swapped out on a hard disk. The cleanup takes place when the program is taken down. Just for programs running short periods of time, with no excessive needs of deallocation. :-) -- Hans Aberg |
|
|||
|
In article <20061019071313.GE18480@morgenes.shire.sytes.net >, Robbert
Haarman <comp.lang.misc@inglorion.net> wrote: > > > > One way to implement quick allocation is to not collect at all: Make a > > > > large array, with a pointer moving, and just move it when allocation space > > > > is needed. > > > > > > You mean like a circular buffer? Does your allocation/deallocation > > > behavior fit such a system? In that case it might indeed be the most > > > efficient option. The only possible performance problem I can think of > > > from the top of my head is that locality of reference may not be as good > > > as what you would get from a garbage collector. > > > > I just took out the*garbage*collection of a two-space GC. It the latter, > > one first fills up one space, and when it is full, the stuff is copied > > over to the new other space, and so on. It can be made arbitrarily*faster > > by enlarging the copying spaces. So if it is*sufficiently large, no GC > > will take place. > > Ah, so no circular buffer, just allocating without ever collecting > anything? How feasible will that be with your large data set? Wouldn't > you need an unaffordable amount of memory? It is audacious, and will clearly work only in special circumstances, but is fastest. Just mention it as a possibility. And if one has virtual memory and the active parts are not spread over many pages, the unused stuff will be swapped out on a hard disk. The cleanup takes place when the program is taken down. Just for programs running short periods of time, with no excessive needs of deallocation. :-) -- Hans Aberg |
|
|||
|
Daniel Hornung wrote:
> Personally I'd prefer C++, operator overloading is a nice thing ![]() Yes. You can start by trying to recompile the C code as C++ and replacing function calls with overloaded operators. You could try using various STLs but, in my experience, they often aren't very space or time efficient. Regarding GC, don't bother trying to retrofit one onto C++ or writing your own. If you want to try GC then pick a fast GC'd language like Stalin Scheme or OCaml. If your program is memory limited then you should probably do manual memory management and use custom data structures to squeeze memory usage. -- Dr Jon D Harrop, Flying Frog Consultancy Objective CAML for Scientists http://www.ffconsultancy.com/product...for_scientists |
|
|||
|
Daniel Hornung wrote:
> Personally I'd prefer C++, operator overloading is a nice thing ![]() Yes. You can start by trying to recompile the C code as C++ and replacing function calls with overloaded operators. You could try using various STLs but, in my experience, they often aren't very space or time efficient. Regarding GC, don't bother trying to retrofit one onto C++ or writing your own. If you want to try GC then pick a fast GC'd language like Stalin Scheme or OCaml. If your program is memory limited then you should probably do manual memory management and use custom data structures to squeeze memory usage. -- Dr Jon D Harrop, Flying Frog Consultancy Objective CAML for Scientists http://www.ffconsultancy.com/product...for_scientists |
|
|||
|
On Thu, 19 Oct 2006, Robbert Haarman wrote: > On Wed, Oct 18, 2006 at 09:37:51PM +0000, Hans Aberg wrote: >> Robert Haarman <comp.lang.misc@inglorion.net> wrote: >>> And you >>> can always use the Boehm collector, which can be used as a drop-in >>> replacement for malloc and free (meaning you don't need to rewrite any >>> code). >> >> Good input, though it is not a C++ collector proper, but goes into the >> layer between the compiler and the platform. > > Yes. The great advantage of the Boehm collector is that you can simply > recompile your (C, I don't know if it works for C++, too) program so > that malloc uses Boehm's allocator, and free() becomes a no-op. You > could use this with your existinc C program to get a handle on how well > it does. > > I don't know how well it works with C++; I would imagine there are > certain features of C++ that would not play well with it (for example, > does the Boehm collector call destructors?). It had better not! Destructors in C++, unlike (IIRC) finalizers in Java, are meant to be invoked deterministically whenever an object goes out of scope or is deleted. I never thought about it before, but this is sort of conflating two --- or maybe four --- ideas that are separated or nonexistent in GC'ed languages (and IMVHO should be separated in all OO languages with C's model for automatic ("stack") variables, if possible). The first idea is that when a local scope ends, all the objects declared local to that scope get deleted. This is totally deterministic and statically-analyzable. The second idea is that when an object is no longer reachable, it should be deleted. This is totally impossible to analyze at compile time. C++ leaves it up to the programmer to explicitly invoke 'delete' at the point where the object should be deleted; Java goes ahead and invokes the finalizer on whatever it garbage-collects, IIRC. In C++, both of these tasks are performed by the class destructor, which makes sense as long as you don't try to introduce a garbage collector. As soon as you do that, you have put the first task under the programmer's control and the second task under the GC's control. (There's nothing intrinsically wrong with that, but it means that you have two different "people" using your destructors for two different tasks... which suggests to me that there ought to be two different functions.) Or maybe the ideas should be categorized this way: On the one hand, C++ destructors are functions that get called when an object is ready to be destroyed, and they can do whatever bookkeeping needs to be done --- removing table entries, printing log messages, whatever. This task is all about the semantics of the program. On the other hand, C++ destructors are to-do lists describing how to go about freeing the resources in use by a particular object. This task is all about memory management and keeping the heap in good working condition. Keeping the first task of the second categorization (i.e., that destructors can do anything a function can do) in mind, the answer to Robbert's question seems clear: A garbage collector for C++ should never call destructors. It doesn't know what side effects the destructor might have... and after all, if the programmer had meant for those side effects to take place, he could have invoked the destructor explicitly! (Or implicitly, when the stack-local variable went out of scope. The compiler inserts those destructor calls, not the GC.) And as for the other purpose of destructors, namely to free unused resources on the heap... well, that's why we have a GC in the first place. The GC shouldn't need the destructor's help in that respect, if it's a mark-and-sweep as I believe Boehm is. (If it's a reference-counting GC, you're in trouble to start with.) -Arthur |
|
|||
|
On Thu, 19 Oct 2006, Robbert Haarman wrote: > On Wed, Oct 18, 2006 at 09:37:51PM +0000, Hans Aberg wrote: >> Robert Haarman <comp.lang.misc@inglorion.net> wrote: >>> And you >>> can always use the Boehm collector, which can be used as a drop-in >>> replacement for malloc and free (meaning you don't need to rewrite any >>> code). >> >> Good input, though it is not a C++ collector proper, but goes into the >> layer between the compiler and the platform. > > Yes. The great advantage of the Boehm collector is that you can simply > recompile your (C, I don't know if it works for C++, too) program so > that malloc uses Boehm's allocator, and free() becomes a no-op. You > could use this with your existinc C program to get a handle on how well > it does. > > I don't know how well it works with C++; I would imagine there are > certain features of C++ that would not play well with it (for example, > does the Boehm collector call destructors?). It had better not! Destructors in C++, unlike (IIRC) finalizers in Java, are meant to be invoked deterministically whenever an object goes out of scope or is deleted. I never thought about it before, but this is sort of conflating two --- or maybe four --- ideas that are separated or nonexistent in GC'ed languages (and IMVHO should be separated in all OO languages with C's model for automatic ("stack") variables, if possible). The first idea is that when a local scope ends, all the objects declared local to that scope get deleted. This is totally deterministic and statically-analyzable. The second idea is that when an object is no longer reachable, it should be deleted. This is totally impossible to analyze at compile time. C++ leaves it up to the programmer to explicitly invoke 'delete' at the point where the object should be deleted; Java goes ahead and invokes the finalizer on whatever it garbage-collects, IIRC. In C++, both of these tasks are performed by the class destructor, which makes sense as long as you don't try to introduce a garbage collector. As soon as you do that, you have put the first task under the programmer's control and the second task under the GC's control. (There's nothing intrinsically wrong with that, but it means that you have two different "people" using your destructors for two different tasks... which suggests to me that there ought to be two different functions.) Or maybe the ideas should be categorized this way: On the one hand, C++ destructors are functions that get called when an object is ready to be destroyed, and they can do whatever bookkeeping needs to be done --- removing table entries, printing log messages, whatever. This task is all about the semantics of the program. On the other hand, C++ destructors are to-do lists describing how to go about freeing the resources in use by a particular object. This task is all about memory management and keeping the heap in good working condition. Keeping the first task of the second categorization (i.e., that destructors can do anything a function can do) in mind, the answer to Robbert's question seems clear: A garbage collector for C++ should never call destructors. It doesn't know what side effects the destructor might have... and after all, if the programmer had meant for those side effects to take place, he could have invoked the destructor explicitly! (Or implicitly, when the stack-local variable went out of scope. The compiler inserts those destructor calls, not the GC.) And as for the other purpose of destructors, namely to free unused resources on the heap... well, that's why we have a GC in the first place. The GC shouldn't need the destructor's help in that respect, if it's a mark-and-sweep as I believe Boehm is. (If it's a reference-counting GC, you're in trouble to start with.) -Arthur |
|
|||
|
In article <Pine.LNX.4.61-042.0610191247100.7581@unix36.andrew.cmu.edu>,
"Arthur J. O'Dwyer" <ajonospam@andrew.cmu.edu> wrote: > ... A garbage collector for C++ should > never call destructors. It doesn't know what side effects the destructor > might have... and after all, if the programmer had meant for those > side effects to take place, he could have invoked the destructor > explicitly! (Or implicitly, when the stack-local variable went out of > scope. The compiler inserts those destructor calls, not the GC.) > And as for the other purpose of destructors, namely to free unused > resources on the heap... well, that's why we have a GC in the first > place. The GC shouldn't need the destructor's help in that respect, > if it's a mark-and-sweep as I believe Boehm is. (If it's a > reference-counting GC, you're in trouble to start with.) One example that comes to my mind, is a window whose closing is implemented as a destructor. When the class object that creates the window goes out of scope, the window should be immediately closed, otherwise the GUI would be rather confusing. But the memory resources used by the window can be collected*at need later by the GC. -- Hans Aberg |
|
|||
|
In article <Pine.LNX.4.61-042.0610191247100.7581@unix36.andrew.cmu.edu>,
"Arthur J. O'Dwyer" <ajonospam@andrew.cmu.edu> wrote: > ... A garbage collector for C++ should > never call destructors. It doesn't know what side effects the destructor > might have... and after all, if the programmer had meant for those > side effects to take place, he could have invoked the destructor > explicitly! (Or implicitly, when the stack-local variable went out of > scope. The compiler inserts those destructor calls, not the GC.) > And as for the other purpose of destructors, namely to free unused > resources on the heap... well, that's why we have a GC in the first > place. The GC shouldn't need the destructor's help in that respect, > if it's a mark-and-sweep as I believe Boehm is. (If it's a > reference-counting GC, you're in trouble to start with.) One example that comes to my mind, is a window whose closing is implemented as a destructor. When the class object that creates the window goes out of scope, the window should be immediately closed, otherwise the GUI would be rather confusing. But the memory resources used by the window can be collected*at need later by the GC. -- Hans Aberg |
|
|||
|
On Thu, Oct 19, 2006 at 10:15:20PM +0000, Hans Aberg wrote:
> One example that comes to my mind, is a window whose closing is > implemented as a destructor. When the class object that creates the window > goes out of scope, the window should be immediately closed, otherwise the > GUI would be rather confusing. I would argue that relying on destructors/finalizers to be called at a specific time is a bad thing to do. If you wanted the window to close at a specific time, you should have explicitly closed it, or perhaps used a with-window (analogous to Common Lisp's with-open-file) construct. Regards, Bob --- An astronaut in space in 1970 was asked by a reporter, "How do you feel?" "How would you feel," the astronout replied, "if you were stuck here, on top of 20,000 parts each one supplied by the lowest engineering bidder?" |
|
|
![]() |
| Thread Tools | |
| Display Modes | |
|
|
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Re: Data simulation functions and seeds | nospam@HOWLES.COM (Howard Schreier | Newsgroup comp.soft-sys.sas | 0 | 05-23-2006 01:51 AM |
| Re: Data simulation functions and seeds | Nordlund, Dan | Newsgroup comp.soft-sys.sas | 0 | 05-22-2006 10:37 PM |
| Data simulation functions and seeds | Hans Reitsma | Newsgroup comp.soft-sys.sas | 0 | 05-22-2006 09:37 PM |
| Re: A simulation question | Junwu Shen | Newsgroup comp.soft-sys.sas | 0 | 05-05-2005 03:18 PM |
| Re: A simulation question | Bob Abelson | Newsgroup comp.soft-sys.sas | 0 | 05-05-2005 02:58 PM |