New BRL-CAD Support Library, LIBbs
Mike Muuss
Background
For some time there has been the desire to package up the numerous
utility routines presently found in LIBRT into a new
support/utility library, separate from all the
geometry, database, and ray-tracing routines.
It has been proposed that this new library, LIBbs, have
a single consolidated header file, much like raytrace.h
covers most of LIBRT.
All routines in this new library would have a new prefix for their
function and global variable names, in place of their current rt_
prefix.
Because the library name and matching prefix have not been decided upon,
in this document I continue to use the rt_ prefix for clarity.
A compatibility header file would be provided to map the old
names into the new names. In those cases where the ordering (but not
the semantics) of the parameters was changed, the compatibility macros
would take care of that change as well.
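As an illustration only, a compatibility macro that both renames a routine and reorders its parameters might look like the sketch below. The xx_ prefix and the routine name copy_bytes are placeholders invented for this example, since no prefix has been chosen, and the "old" argument order shown is likewise hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical new-style routine: output argument first, inputs after.
 * The "xx_" prefix is a placeholder; no real prefix has been chosen. */
void xx_copy_bytes(char *out, const char *in, size_t len)
{
    memcpy(out, in, len);
}

/* Compatibility macro: the (hypothetical) old routine took the input
 * first and the output second, so the macro swaps the arguments
 * while mapping the old name onto the new one. */
#define rt_copy_bytes(in, out, len)  xx_copy_bytes((out), (in), (len))
```

Existing callers that write rt_copy_bytes(src, dst, n) would then compile unchanged against the new library.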
The BRL-CAD convention for parameter ordering in function arguments
is to place output arguments first, followed by input arguments.
Input arguments should all be marked as CONST, both as an aid to optimizing
compilers and as an aid to quick understanding of the routine.
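A minimal sketch of this convention follows. The routine rt_example_vadd is hypothetical, and CONST is assumed here to expand to the ANSI const keyword on compilers that support it.

```c
/* Assumption for this sketch: CONST expands to the ANSI "const" keyword. */
#ifndef CONST
#define CONST const
#endif

/* Hypothetical routine: sum = a + b for 3-element vectors.
 * The output argument comes first, followed by the read-only inputs. */
void rt_example_vadd(double *sum, CONST double *a, CONST double *b)
{
    int i;

    for (i = 0; i < 3; i++)
        sum[i] = a[i] + b[i];
}
```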
While for the most part it is easy to determine which routines should
be moved into this new library, there are some complications.
Most of the general support/utility routines depend on
rt_malloc()
which in turn depends on the
RES_ACQUIRE()/RES_RELEASE()
macros,
the struct resource in raytrace.h,
the res_syscall entry in struct rt_g,
and the parallel processing support in
librt/machine.c.
It also depends on RES_INIT() having been
called by the user's application code before any parallel
operations have been performed.
The question thus becomes one of deciding how to package up
the parallel processing routines in a more generic form.
A partial proposal
-
Rather than using pointers to res_syscall type
entries in struct rt_g
for the arguments to RES_ACQUIRE() and RES_RELEASE(),
I propose that small integers be used instead.
-
A convention for those few critical-section semaphore interlocks
required by LIBbs itself should be documented.
I propose that semaphore zero take the place of res_syscall
for interlocking all operating system syscalls.
-
I also propose that the number of available interlocks be increased
from the present five to 16. All platforms which support parallel
operations have hardware support for at least 16 interlocks.
-
I propose that rt_parallel() invoke the indicated subroutine in
each parallel thread with a pointer to a new library-provided structure
(struct rt_thread) which incorporates the "cpu" number
and other per-processor information.
-
I propose that a new subroutine set rt_pmalloc, rt_pcalloc,
and rt_pfree be provided.
These routines would manage a pool of memory that is private to
a given thread ("cpu") and which will not be available to
other processors.
This pool of memory could be managed without semaphore interlocks
between processors on memory allocation requests, resulting
in an increase in performance and a much higher ceiling on the number
of processors which could be efficiently used.
Also, many parallel architectures provide a pool of memory local to
each processor that is not part of the global shared memory pool;
typically the local memory is substantially faster to access.
The spline routines in particular would benefit from this.
Each routine would take a pointer to the relevant struct rt_thread,
where internal pointers to the necessary freelists would be kept.
-
I propose that a new routine rt_parallel_init() be provided
to take the place of the existing macro RES_INIT().
Rather than having to call RES_INIT() for each semaphore to
be initialized, the new routine would initialize them all at once.
It would also check for double-invocation, to catch user errors.
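
Several of the proposals above can be sketched together, purely as an illustration. The sketch assumes POSIX mutexes as the underlying interlock. The names RT_SEM_SYSCALL, RT_SEM_MAX, and RT_POOL_BYTES, the layout of struct rt_thread, and the -1 return on double invocation are all assumptions made for this example, not settled design. The private pool is a simple bump allocator; rt_pcalloc, rt_pfree, and the freelists are omitted.

```c
#include <pthread.h>
#include <stddef.h>

#define RT_SEM_SYSCALL  0    /* semaphore zero: interlocks system calls */
#define RT_SEM_MAX      16   /* proposed number of interlocks */

static pthread_mutex_t rt_sems[RT_SEM_MAX];
static int rt_parallel_ready = 0;

/* Initialize every semaphore at once.  Returns 0 on success, or
 * -1 if called a second time (assumed way of reporting the user error). */
int rt_parallel_init(void)
{
    int i;

    if (rt_parallel_ready)
        return -1;
    for (i = 0; i < RT_SEM_MAX; i++)
        pthread_mutex_init(&rt_sems[i], NULL);
    rt_parallel_ready = 1;
    return 0;
}

/* Semaphores are named by small integers rather than by pointers. */
#define RES_ACQUIRE(n)  pthread_mutex_lock(&rt_sems[(n)])
#define RES_RELEASE(n)  pthread_mutex_unlock(&rt_sems[(n)])

#define RT_POOL_BYTES 4096   /* arbitrary pool size for this sketch */

/* Per-thread state that rt_parallel() would hand to each worker. */
struct rt_thread {
    int    cpu;              /* processor number */
    size_t used;             /* bytes of the pool handed out so far */
    union {                  /* union forces suitable alignment */
        double        d;
        void         *p;
        unsigned char bytes[RT_POOL_BYTES];
    } pool;
};

/* Allocate from the thread's private pool.  No semaphore is needed,
 * because only the owning thread ever touches this pool. */
void *rt_pmalloc(struct rt_thread *tp, size_t nbytes)
{
    size_t align = sizeof(double);
    void *p;

    if (nbytes > RT_POOL_BYTES)          /* also avoids overflow below */
        return NULL;
    nbytes = (nbytes + align - 1) / align * align;   /* round up */
    if (nbytes > RT_POOL_BYTES - tp->used)
        return NULL;
    p = tp->pool.bytes + tp->used;
    tp->used += nbytes;
    return p;
}
```

An application would call rt_parallel_init() once before any parallel operation, bracket system calls with RES_ACQUIRE(RT_SEM_SYSCALL) and RES_RELEASE(RT_SEM_SYSCALL), and pass each worker's own struct rt_thread to rt_pmalloc().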