First of all regarding the language. deterministic garbage collection was always a goal of Qore's because of it's focus on the RAII idiom (wikipedia) for exception-safe programming and resource management (I also found the following very interesting link on this subject: http://www.hackcraft.net/raii/).
Some examples of RAII in Qore are (a subset of possible examples):
- the AutoLock class releases the Mutex in the destructor (this class is designed to be used with scope-bound exception-safe resource management)
- the Datasource class closes any open connection in the destructor, and, if a transaction is still in progress, the transaction is rolled back automatically and an exception is thrown before the connection is closed
- the File class closes the file in the destructor if it's open
- the RWLock class throws an exception if destroyed while any thread is still holding the lock; note that in this case the underlying object is only destroyed when all threads holding locks have released their locks; this is handled with Qore's thread resource handling and strong references to the underlying RWLock object while thread resources are held; thread resource handling is another potential topic for a blog post
- the Socket class first shuts down any TLS/SSL connection and then closes the connection in the destructor if it's open
- the ThreadPool class detaches all currently in-progress worker threads, cancels pending tasks not yet executed (by calling their cancellation closure, if any), terminates the worker thread and destroys the thread pool
Basically one of Qore's primary design points is to free Qore programmers from worrying about the details of memory and resource management; the use of the RAII idiom is a large part of this; also above you can see an example of negative feedback provided to programmers when mistakes are made - deleting a RWLock object while a lock is held (note that there is also scope-related resource management support in Qore in the form of the on_exit, on_success, and on_error statements).
Therefore since support for the RAII idiom is a critical feature of Qore's design, the language should always guarantee that objects are destroyed and therefore their resources are managed and associated memory freed when objects go out of scope. This is tricky when there are circular references involved in the objects, particularly since Qore uses a reference-counted solution with atomic references due to its multi-threaded nature. Consider the following post on this subject regarding .NET: http://blogs.msdn.com/b/brada/archive/2005/02/11/371015.aspx. This gives a lot of background to the same problem that Qore (and Java) have for providing for deterministic garbage collection.
Basically to summarize, RAII is not supported in Java and .NET because it's hard to do. Consider also VB6, basically if you have recursive object references in VB6, those objects will never be destroyed or collected by the system; the programmer has to delete objects with recursive references manually in order to destroy them. This is basically the same situation as Qore before the introduction of deterministic GC.
The current status of the deterministic GC support in Qore svn is stable; it appears to finally be working in large complex object-oriented multi-threaded programs with good performance and (at the moment - knock on wood) free of deadlocks. In another post I described the high-level approach with emphasis on the deadlock-avoidance approach since this was one of the trickiest parts to solve. I'd like to provide more detail here on the recursive graph detection and recursive reference algorithm.
To find recursive graphs and calculate the number of recursive references for each member of the graph, Qore does a scan of some starting object. Note that the recursive graph detection algorithm has to produce the same results for any given recursive graph independently of the starting node in the graph. So from a high level, Qore scans through all reachable objects from a start object, and, when a recursive reference is found, increments the recursive count for that object (in fact this is done in a transactional way where all changes are committed atomically to the entire recursive graph at the end of the transaction or rolled back in case of certain kinds of lock contention in order to avoid deadlocks, but this is described in my previous blog post). If a "path" is found from any given object to itself, the recursive count is set to one for all elements of the path, and any recursive reference then has its recursive count incremented. Due to the recursive nature of the algorithm, during path detection new elements can be added to the recursive set while scanning an object, so processing the path has to also take this into account by checking if parent objects are reachable from existing recursive sets and must process their recursive counts appropriately (for example, if, while processing a path, one of the objects in the path already has a non-zero pending recursive count, then if the preceding object in the path was not already in the set, then its pending recursive count is incremented, otherwise it is not). Additionally, if any predecessor in the current path of the current recursive node having a different recursive set is reachable from the current recursive set, then the recursive sets are merged into one.
STL sets are used to provide for O(ln(n)) searches of recursive sets found in the scan, however this may not be the ideal algorithm for small sets due to the overhead in managing red-black balanced binary trees normally used to implement sets in STL. Also these sets are iterated several times, so fast iterator performance is important. I believe that some analysis of the usage of data structures in the recursive scan could provide some optimizations.
Recursive scans are performed when changes are made to objects that could potentially create a new recursive set or change an existing recursive set. The approach also features several checks designed to limit the cases where the recursive graph detection algorithm needs to be applied; I believe more cases can be found to further improve performance.
I have not made a systematic performance test of Qore with deterministic garbage collection enabled (it's currently enabled by default in Qore svn trunk), but from subjective testing on existing code, it seems fast.
You can see if deterministic garbage collection is enabled in Qore by typing qore -V; you should see a line like this if it's enabled:
version 0.8.12-6768 (builtin features: sql, threads, DGC)
If you see "DGC" in the output, then it's supported, otherwise not (currently requires svn trunk - 0.8.12 has not yet been released).
In Qore code, there is a parse define, HAVE_DETERMINISTIC_GC, that can be used to optionally parse code depending on the presence of deterministic garbage collection or not; for example, there are now regression tests in Qore that are executed when deterministic garbage collection is present.
I found the following link: http://blogs.msdn.com/b/x/archive/2012/01/13/x-deterministic-garbage-collection.aspx
ReplyDeletewhich describes Microsoft's X++ deterministic garbage collector, which has an algorithm that seems similar to Qore's.
They describe problems with the algorithm when very large collections of objects are built; Qore's algorithm may be subject to the same problems, although the X++ description refers to object graphs distributed over a network, so it's not totally clear. Anyway, I need to do more testing with very large object graphs in Qore.