Translate

Monday, April 30, 2012

Preparing for the 0.8.4 Release

I just finished doing a major rewrite of the internal lvalue handling in Qore.  Basically now most types lvalues are stored in a union which consists of one of a 64-bit int, a double, a bool, or a pointer to a generic Qore value object.

The thing with Qore is that at the beginning, all values were dynamically allocated objects derived from a common virtual base.  This was to allow for atomic reference counts and a copy-on-write approach to managing data, which is very efficient for large data structures.  In this way you can pass a large data structure (such as a hash, list, or object) to a function by value, and the value is only copied if it's changed.  Even then, only the top level of the data structure is copied, because each of the values is also a reference-counted object, so, unless they change as well, they are only copied by reference (meaning a pointer is copied and its reference count is incremented).

However this approach is not efficient for small, discrete values such as integers and floating-point values.  It's even worse for boolean values, which can be stored in as little as 1 bit.

Qore has an optimization for special values like True and False and some others, whereas there is only 1 single value in the Qore library that is not subject to reference counting.  However this is not possible with ints and floats.

The solution that I implemented for lvalues is to use the union as described above; the type of the union is set based on the lvalue's type restriction -- so if you declare a local variable as "int" or "softint", then it will be internally stored and operated on only as an integer (the same with "float" or "softfloat").

This allows Qore to store and operate directly on the base data type, instead of always working with another level of indirection (a pointer to a generic value object) and also eliminates the associated dynamic memory management.   So this approach has both memory and speed benefits.

This work showed me a clear way forward for doing some very cool optimizations in Qore regarding value handling - basically long-term I plan on making all Qore values some sort of union like this, which will allows Qore always in every instance to operate directly on the base data type when possible.  This will be necessary before starting llvm integration as well.

This will be a lot of work and will start some time after the upcoming 0.8.4 release.

I also implemented user thread initialization - you can now set a closure or call reference to be executed any time a new thread is started in the current Program object (or any time another Program object accesses the current Program object in a new thread) - this can be used to initialize thread-local data in the Program.

Also I implemented an optional maximum size for the Queue class - if a maximum size is set then writes to the Queue will block if the Queue already has the maximum allowed number of elements in it.  In this way, Qore Queue objects can be used like a buffered Go channel.

At the moment, Qore 0.8.4 is feature complete and stable in svn, however there's still some more packaging work etc to be done before the actual release, which hopefully will be pretty soon (I'm aiming for sometime in the next 2 weeks).

Saturday, April 21, 2012

User Modules

I've recently committed support for user modules in Qore; this will allow the language to be extended in an organized and predictable way with Qore-language code.

Before this was only possible with modules written in C++.

The current documentation for user modules is online here: http://qore.org/manual/current/lang/html/qore_modules.html#user_modules (note: edited to reflect a perma-link for the user module documentation in the latest qore docs)

User modules have the following features:
  • code embedding safety: modules work with Qore's functional domain permission/protection framework so that embedded code can only use modules that use authorized functionality
  • encapsulation: only symbols marked as public are exported; everything else is private to the module
  • uniqueness: multiple pieces (source files) can "require" a module safely - also when embedding Qore code, when multiple Programs use a module, there is only one copy of the module and of its private data (single global state)
Note that also there has been a nearly complete rewrite of the namespace code and handling to facilitate user modules - particularly to enable public and private symbols in modules. For example, now global variables are also contained in namespaces (hence it's possible to have more than 1 copy of a global variable with the same name in different namespaces).

The next step will be to integrate a separate program called "qdx" which converts Qore code to a c++-like format for doxygen parsing so that doxygen documentation can be generated from Qore modules and those can be integrated into Qore's reference documentation (at least for the modules that will be shipped with Qore - this will be the start of Qore's Qore-language runtime library).

I have already added a couple of user modules to the Qore source in svn (HttpServer.qm and Smtp.qm) and updated the build and packaging code accordingly.

The new directory location for the runtime library is the Qore version string as a subdirectory of "qore-modules" (where binary/c++ modules are installed). For example on UNIX this might be:

/usr/lib64/qore-modules/0.8.4

This directory is automatically added to the QORE_MODULE_DIR search path.

I hope this will enable more collaboration to be made on Qore and of course for the language to be more transparent and useful for more people.