New BRL-CAD Database Format, Version 5

(Draft, 19 July)


Lee A. Butler
Michael John Muuss
Paul J. Tanenbaum
John R. Anderson
Robert G. Parker
Ronald A. Bowers
Christopher T. Johnson
Eric W. Edwards

Ballistics & NBC Division
Survivability/Lethality Analysis Directorate
U.S. Army Research Laboratory
Aberdeen Proving Ground, MD 21005-5068 USA

Index

Update this list

Goals

BRL-CAD presently uses version 4 (v4) of its binary ".g" database format, which has been stable for more than 10 years.

Revising BRL-CAD to use a new database format (v5) is intended to provide an evolutionary, upward-compatible capability with a minimal implementation and deployment burden for the users.  Specific goals include:

Implementing the new database format requires small code changes in many places throughout the BRL-CAD package. Version 5 of the database format is not likely to appear in BRL-CAD Release 5.0 but instead will probably form the foundation of BRL-CAD Release 6.0. (Historical note: BRL-CAD Release 4.x. is the only version of the software in which the database format number and the software release number have been the same; they aren't intended to be linked.)


Background and Terminology

BRL-CAD is a constructive solid geometry (CSG) modeling system. Primitive shapes are combined using boolean operations to form regions of homogeneous material.

The database is organized as a directed acyclic graph (DAG), which comprises

In a slight abuse of terminology, the DAG is often spoken of as a tree or collection of trees. In this context, the primitive shapes are also called leaves.


Format of Data Elements/Database External Format

The external format has several important properties, especially with regard to the Object_Body:
 
*NOTE:  The term "big-endian" is derived from Gulliver's Travels, Jonathan Swift's classic allegory in which two neighboring groups (the Big-Endians and the Little-Endians) are in a constant state of struggle because of, among other things, their opposing philosophies of how one should break an egg-by the big end or by the little end.



 

Definition of a Generic Database Object


The database access library stores objects as a collection of data with a globally unique name and places no interpretation on the content of those data. The object is the smallest granularity of an item in the database, and it must be read from and written to the database in a single atomic operation.

In the case of librt, each database object will contain exactly one combination node or leaf (primitive shape) node.

Overall Object Structure

The on-disk version of each object consists of three distinct parts: Object Header, Object Interior, and Object Footer. This is called the external format of the object.

The routines db_get_external() and db_put_external() are used to move objects in external format between memory and the database disk file. The routines db_wrap_external() and db_unwrap_external() are used to wrap and unwrap the

(Object_Body or Object_Interior)

(already in external form) in a standardized database object's wrapper.

Table 1 is provided to graphically illustrate the structural elements available to each object.
 


Part Element Comments
Object Header:
(not compressible)
Magic1 Required
Flags:
HFlags
AFlags
BFlags
Object Type:
Major_Type
Minor_Type
Object_Length Required
Object Name:
Name_Length
Name_Data
Conditional on
flag bit N,
Required for
Application Data
Object Interior:
(individually
compressible)
Object Attributes:
Attribute_Length
Attribute_Data
Conditional on
flag bit A
(ZZZ compression)
Object Body:
Body_Length
Body_Data
Conditional on
flag bit B
(ZZZ compression)
Object Footer:
(Not Compressible)
Padding As Required to
maintain 8-byte
object boundaries
Magic2 Required

All objects share certain common properties, which are stored in a standardized object wrapper consisting of an Object Header and an Object Footer.  In addition, objects contain properties that are specific to themselves.  These are contained in the object payload (or Object Interior), which may or may not consist of Object Attributes and an Object Body.

Object Wrapper

The Object Header consists of: The Object Footer consists of:
Magic Numbers

In order to detect corruption in the database and, if necessary, perform data recovery, every object stored in the BRL-CAD database begins and ends with a magic number.  These numbers are used as object tags to signal the boundaries between objects, which is especially important when the bytes of one object are inadvertantly overwritten by those of another, thereby corrupting the rest of the database that follows.  By searching for the locations in the database where one object begins (Magic1) and the one before it ends (Magic2), corrupted data can be identified and skipped, and then valid data can be identified and recovered.

The use of magic numbers continues to be a subject for debate in computer programming.  In the end, the issue boils down to a tradeoff between available storage space and the ability to easily detect and correct data corruption. 

Object Payload

Objects may store application-specific information in an Object Interior.
NOTE:  An object can now have (1) EITHER an attribute OR a body, (2) BOTH an attribute and a body, or (3) NEITHER an attribute nor a body (although the utility of option 3 is somewhat questionable).

 

Flags

The Flags element consists of three 8-bit fields: HFlags, AFlags, and BFlagsThe HFlags field is 1 byte containing flag bits that pertain to the noncompressible basic header and the database object as a whole. The AFlags (which pertain to the attribute portion of the object interior) and  BFlags (which pertain to the body portion of the object interior) fields are each 1 byte containing flag bits that pertain to the (potentially compressed) attributes and body, respectively.

Table 2.  The Flags Structures

HFlags
 
AFlags
 
BFlags
7
6
5
4
3
2
1
0
 
7
6
5
4
3
2
1
0
 
7
6
5
4
3
2
1
0
Wid N Wid r DLI   Wid P r r ZZZ   Wid P r r ZZZ

DLI flag (should dli be all caps or lowercase??)

The DLI flag is a 2-bit flag that indicates whether the object is an Application Data Object or a Database Layer Internal Object. The bits are interpreted as shown in Table 3.

Table 3.  Interpretation of the DLI Flags

dli Bits
Meaning
00
Application Data Object 
The object contains application-specific data. N must be 1. A and B are determined by what the application presents for storage in the object; both may be 0 (empty Object_Interior). 
01
Database Layer Internal, Header Object 
A Header Object must be the first object encountered in the database. In order to support direct concatenation of two existing databases into one new database, additional header objects may appear elsewhere in the database. The header object has no object name, object attributes, or object body (e.g., N=0, A=0, B=0). Major_Type=RESERVED, Minor_Type=0. 
10
Database Layer Internal, Free Storage. 
Unused space in the database is kept using a special Free DB Storage object that has no object name, no object attributes, and no object body.  Like all other objects, the total length of the object will be a multiple of 8 bytes. N=0, A=0, B=0. Major_Type=RESERVED, Minor_Type=0. 
11
Database Layer Internal, Reserved 
This value is reserved for future use.

The DLI flag is not available to the higher database access layers.
 
 
IMPLEMENTATION NOTE: Before writing a new object into the database in a free area, the library should read the object header from the database and confirm that the space is indeed free. Similarly, additions to the end should be checked by ensuring that the file hasn't been extended. In case the check fails, the database write should fail, the user should be notified, and the internal library mode (not the operating system file access permissions) should be changed over to read-only access so that no further attempts to write will be issued. These checks will provide protection against two or more users trying to modify the same database simultaneously and accidentally coming in conflict with each other. In the NFS world, file locking isn't a strong enough assurance.

Wid flags

The length of an object or subelement in the database is recorded using an unsigned integer. These are variable-width fields based on the magnitude of the maximum number needed. The Wid bits specify the size of the unsigned integer employed in each instance. There are four 2-bit width (Wid) flags: Object_Wid and Name_Wid (stored in HFlags), Attribute_Wid (stored in AFlags), and Body_Wid (stored in BFlags). The Wid fields are interpreted as follows in Table 4.
 
 

Table 4.  Interpretation of the Wid Flags

Wid Bits
Width (in bits) of
Associated Length Fields
00
8
01
16
10
32
11
64

Doublecheck on the underscore issue

The Object_Wid flag, at the high end of HFlags, encodes the width of the Object_Length field. The Name_Wid flag, in bits 3 and 4 of HFlags, encodes the width of the Name_Length field (when the name element is present; see the N bit, shown later.). Attribute_Wid (or Body_Wid, as the case may be) encodes the width of the Attribute_Length field (when the Object_Attributes [or Object_Body]) element is present. See #Pbit below .)

The rationale for allowing the width of the Object_Length field to be specified independently of the other widths is to save space on objects in which the values in many of the length fields nearly overflow the specified field width, so that their sum requires a wider field. For example, for four 255-byte interior fields, the corresponding length fields need be no more than 8 bits wide, so the choice Interior_Wid=00 suffices, but their combined length of 1020 bytes would require Object_Wid=01. Because all of the length fields besides Object_Length must have the same width, the largest of the values stored in these length fields determines the value of Interior_Wid required. Both Object_Wid and Interior_Wid may vary from object to object. It is expected that the routines that write an object to the disk will use the narrowest width possible for each object.

Each of the N and the two P flag bits indicate the presence (1) or absence (0) of the corresponding element. This eliminates the need to specify a length of zero for optional fields that are not present in a given object.


Table 5.  ZZZ Flag Compression

ZZZ Bits
Compression Algorithm
000
None
001
GNU GZIP
010
Burroughs-Wheeler
011
Reserved
100
Reserved
101
Reserved
110
Reserved
111
Reserved

The remaining flag bits "r" in HFlags, AFlags, and BFlags are reserved for the design committee to waste on additional optional fields in the object.
 

Object_Type

As shown in Table 6, the Object_Type element is always 16 bits wide, organized into two 8-bit-wide fields: the Major_Type and the Minor_Type.
 
 

Table 6.  Object Type Elements (?How important is it that we included the underscore throughout?)

15
0
Major_Type
Minor_Type

Each different Major_Type value is assigned to a different class of database object (as shown in Table 7).
 
 

Table 7.  Major Type Class Assignments

Major_Type
Value
Object Class
0
Reserved
1
BRL-CAD Constructive Objects (e.g., Primitive Shapes and Combinations)
2
Attribute-Only Objects
8
Experimental Binary Objects (Unrecorded Structure) (Minor Type Unspecified)
9
Uniform Array Binary Objects, (Type Described in Minor Type)
10
MIME_Typed Binary Objects (Attribute "mime_type" Describes Format)
16-31
Registered-Type Binary Objects
128
First Non-ARL Type Begins Here

 

The remainder are available for extending the types of objects that may be stored in the database, allowing BRL-CAD users to extend the database for their own particular purposes far beyond what the "attribute" method permits.

Major_Type = 0: Reserved

Major Type 0 is illegal. The rationale is to provide the library an opportunity to detect incompletely filled in data structures.

Major_Type = 1: BRL-CAD Constructive Objects

This class of objects is private to librt, concerning all nongeometric and geometric objects needed by the library.  Typically, there will be one g_xxx.c module in librt for each minor type in this category.  The Minor_Type values are defined in Table 8 and Table 9 for both nongeometric and geometric objects, respectively.
 
 

Table 8.  Major_Type = 1 BRL-CAD Nongeometry Objects

Minor_Type
value
Object Type
0
Reserved for sanity check
3
Combination
2
Grip Nongeometric(??)
2
Joint Nongeometric(??)

 

Table 9.  Major_Type = 1 BRL-CAD Geometry Objects

Minor_Type
Value
Object Type
0
Reserved for sanity check
1
Torus (TOR)
2
Truncated General Cone (TGC)
3
Ellipsoid (ELL)
4
ARB
5
ARS
6
Half-Space (HALF)
7
REC (TGC Special Case)
8
Reserved*
9
B-Spline 
10
Sphere (ELL Special Case)
11
n-Manifold Geometry (NMG)
12
Extruded Bitmap (EBM)
13
Volume (VOL)
14
ARBN
15
PIPE
16
Particle
17
RPC
18
RHC
19
EPA
20
EHY
21
ETO
22
Grip Nongeometric
23
Joint Nongeometric
24
Reserved**
25
Displacement Map (DSP)
26
2d Sketch
27
Extrude
28
Submodel
29
BOT
30
Cline
>=128
User-Provided Extensions
                                                                                *Previously the value for the Polysolid.
                                                                                **Previously the value for the Height Field.
 

The details of these Minor_Types are provided in Section IV.

All other values are reserved for future expansion.

Major_Type = 3: Attribute-Only Objects

This type of object stores only attributes in the object interior section; it has no object body elements.

For example, if several objects need to have the same shader parameters, it would be possible to create one attribute-only object to hold these common attributes and serve as a simple form of "macro". Objects that needed to share these attributes could all reference the same attribute object. If the attribute object is altered, then all of the objects that reference it would be updated together. Without this capability, the user would have to update each element individually to alter the attributes.

Conventions will have to be established regarding which attributes of an attribute-only object will be used when a macro reference is performed. For example, rt shaders will only be interested in the value of the "oshader=" attribute, while librt's tree-walker might also be interested in the "rgb=", "giftmater=", "nsn=", "material=", and "los=" attributes (assuming that a convention was developed so that a combination could macro-reference an attribute-only object too).

An attribute-only object may not have an object body; thus, flag bit B must always be zero for this type of object.

As used by the rt family of applications codes, these attribute-only objects will contain "macros" for shaders. The shader name and its parameters shall be encoded as a single ASCII string, which is the value of the "oshader=" attribute. An rt shader named "macro" (or equivalent) would take a single parameter "obj=", which would specify the name of the attribute-only object in the database from which the actual shader and shader parameter information would be extracted.

There will be one attribute-only object with a reserved object name of "_GLOBAL" that will be used to contain various kinds of states that are global to the entire ".g" database and that had previously been found in the database header itself. There will be the following BRL-CAD-specific attributes whose meaning is predefined for the _GLOBAL object:

title=
The database "title" string previously found in the database header.
units=
The most recent editing units, specified as an ASCII string with a floating point conversion factor. For example, the conversion factor for inches to millimeters would be 25.4.
regionid_colortable=
A string that contains a collection of all the information previously found in "struct material_rec ID_MATERIAL" records. Exact encoding yet to be determined; it's a collection of integer 5-tuples of the form: {low, high, r, g, b}.
In addition, the "comment=" attribute of the "_GLOBAL" object may be used to store human-readable remarks about the database that are not more properly associated with a specific database item. These might include remarks about data sources, model evolution, security classification, and release restrictions. In the absense of some outboard revision-control system, this might also be a place to record modification history, although such use is discouraged.

Bulk Binary Objects (Major_Types 8-31)

This class of objects contains various "bulk" binary data that might otherwise have been placed in auxiliary files.

MGED and standalone commands must be built to store/extract these opaque?? binary objects between a ".g" file and stdin/stdout/auxiliary files. A user might want to use those same MGED commands to store/extract the binary object body of any object for external processing. An easy example to imagine is the importing and exporting of texture maps for external processing, but the same commands could be used for importing and exporting primitive shape parameters in their external binary form.

These objects may be referenced in combination nodes, for organizational purposes, but they cannot be drawn in MGED or raytraced, and doing so would result in a warning message being printed by the tree walker as that arc is traversed. This class may be used by all applications and layers.

The data's purpose may be placed in the "purpose=" attribute. (????????Need a table/registry of presently known values for this attribute.)

Routines that retrieve bulk binary objects should check the minor type and the "purpose=" attribute and send a warning message in the event of a mismatch, but best-effort processing of the object should continue. This will permit some degree of error checking, which should benefit novice users without standing in the way of "creatively" reusing one set of data, (e.g., using one array of values as both a height field and a bwtexture). This allows common data perversion practices, such as interpreting an array of floats as an array of bytes, to continue.

Each application will need to have its own syntax for the user to specify whether the data source is an outboard file or a raw-binary object. For example, the current RT sh_texture module uses the keyword file="name" to indicate an outboard file; that might be supplemented with an additional obj="name" possibility for retrieving from an inboard raw-binary object.

Major_Type = 8: Experimental Binary Objects

This class of objects contains bulk binary data and is intended for experimental use by applications developers. Each time a database containing objects of this type is opened, BRL-CAD will issue a user-visible warning. Production software and databases should not use these objects. Developers should obtain registered 16-bit object types from the website in order to avoid collisions with other applications.

Major_Type = 9: Uniform Array Binary Objects

This class of objects contain various "bulk" binary data that might otherwise have been placed in an auxiliary file.  This sentence is a repeat of what was said a few paragraphs earlier.  Do we want to mention that it is an 8-bit object, etc., as we've done for the other types?

Point of Discussion?????Has ramifications... we have to implement type advising, so that applications that use these data can compare the type provided in the minor type code with the type that they're expecting and advise the user (with a warning message) that there is a potential type mismatch.

Need to include a mention of Table10.
 
 

Table 10.  Major_Type = 9 Uniform Array Binary Objects

Minor_Type
7
6
5
4
3
2
1
0
r r Wid S Atom

The 3-bit "Atom" flag indicates the fundamental data type of the atomic elements in the array as shown in Table 11.
 
 

Table 11.  The Atom Flag Elements

Atom Bits
Data Type
000
Reserved for sanity check
001
Reserved
010
float (IEEE, network order)
011
double (IEEE, network order)
100
8-bit int
101
16-bit int
110
32-bit int
111
64-bit int

The "S" bit indicates whether an integer type is signed (1) or unsigned (0). Floats and doubles (i.e., atomic types with the highest atom bit equal to 0) are explicitly signed, so they will have the "S" bit equal to 1. (The bit patterns corresponding to unsigned floats and doubles are reserved for possible other use.)

The 2-bit "Wid" flag specifies the length (in atomic elements) of the array elements (as shown in Table 12).
 
 

Table 12.  Wid Flag Array Length (???)

Wid Bits
Atoms per
Array Element
00
1
01
2
10
3
11
4

The remaining Minor_Type bits "r" are reserved for the design committee to waste on gratuitous other stuff, possibly including extensions of the "Atom" and/or "Wid" flags.

As examples, data in PIX(5) format, which might be used for a texture map, would have Minor_Type "0010 0100", indicating a triple of unsigned char, and CMYK data might be stored with Minor_Type "0011 1011", indicating a quadruple of doubles.

The data's purpose (e.g., height field, texture, bump, displacement, etc.) may be placed in the "purpose=" attribute. ?????Point of Discussion???(Need a table/registry of presently known values for this attribute.)

Major_Type = 10: MIME-Typed Binary Objects

This class of objects contains data, the format of which is specified in the attribute "mime_type". The Minor_Type of these objects should always be zero.

Major_Type = 16-31: Registered-Type Binary Objects

This class of objects contains application-specific bulk binary data and is intended for use in production software and databases. Developers can obtain registered 16-bit object types from the website to identify these objects. The data's purpose, (e.g. height field, texture, bump, displacement, etc.) may be placed in the "purpose=" attribute. (Need a table/registry of presently known values for this attribute).

Major_Type = 255: Database Layer Internal Objects

A Minor_Type of 1 indicates that this is a contiguous block of free storage.

A Minor_Type of 2 indicates that this is a database header.
 

Object Length

The Object Length specifies the number of 8-byte chunks used to store an object. This includes all bytes from Magic1 through Magic2, inclusive.
 

Object Name

The Object_Name element is a string that holds a name unique to that object and drawn from a name space that is global to the database. The Object_Name element is mandatory for all allocated storage in the database. Database free-space managment objects are the only objects for which the Object_Name element is optional.

The name is specified in 8-bit ASCII. There is no support for UNICODE. The name is null-terminated, and the null byte is included in Name_Length.

See the section on DLI flags. In the case of Free objects, the name is not retained. Undeleted objects have a different DLI flag code.
 

Object Attributes

An object may optionally have an Object_Attributes element, which stores an association list

aname1=value1, aname2=value2, ..., anameN=valueN

binding attributes to values.

These are ASCII strings of unlimited length. These attributes are intended for direct use by programs. There will be a WWW registry of attribute names presently in use to prevent two application developers from using the same attribute_name for different purposes.

For attribute names and attribute values, The decision was taken to support 8-bit ASCII only. The on-disk encoding of this will simply be:

aname1 NULL value1 NULL ... anameN NULL valueN NULL NULL

where NULL represents a byte with all bits zero. The NULL in place of anameN+1 signals the end of the attribute data, simplifying the job of the reader. 

Every object in the database may have zero or more attributes attached to it; the meaning of these attributes will vary depending on which application or library processes them.

There are several aname conventions that all BRL-CAD applications are expected to respect. There will be a WWW extendable registry of "in-use" anames, so that independent applications developers may select aname strings for their own use without fear of name conflicts later. The initial registry would include:

comment=
Every object may optionally have a comment that contains a string of an arbitrary number of newline-terminated lines of text. These are strings for use by humans only. None of the BRL-CAD software may parse or interpret these strings other than to print them and edit them when requested by the user. They are provided for the modeler to place notes in.
nsn=
The American National Stock Number (NSN) for this part, when known.
material=
The format of this string is not currently defined as there are conflicting naming/coding conventions employed by the various standards organizations (e.g., ISO, ASME, etc.).
region=
For combinations, indicates this combination is a region. Boolean.
inherit=
For combinations, indicates whether attributes from lower combinations in tree will replace higher ones. Boolean, default=0.
oshader=
For combinations, read by the "rt" program, optical shader name and parameter string (separated from each other by white space). Meaningful only at or above a region node, and only on a combination, or in an attribute-only "macro".
rgb=
For combinations, when present indicates optical rgb color is specified.
region_id=
For regions, GIFT compatability. Integer.
giftmater=
For regions, GIFT compatability. Integer. (Point of Discussion?????Should we use negative values for air codes, positive for nonair, so we can eliminate air codes?)
aircode=
For regions, air code. Integer. 0 is the same as attribute not specified. (Point of Discussion?????Possibly eliminated in favor of negative giftmater values).
los=
For regions, GIFT compatability. Integer.
component=
For regions, the name of the MUVES component containing this object.
rlist=
The proposed BRL-CAD "replacement list" field would be stored on a binary-block attribute ("rlist="). [deferred implementation]
macro=
If present, specifies name of an attribute-only object to be consulted for additional attribute values.
All other attributes, from whatever source, would be stored similarly, including application-specific and end-user-created attributes.
 

Object Body

The contents of the Object Body are opaque to the database layer. The contents of this element are interpreted based upon the Object_Type. The Object_Body is not constrained to start on a chunk boundary. 
 

Padding and Length Rounding

The minimal object is a Free object (with no name) 8 bytes long:
Magic1 (1byte)
HFlags = 000xxxxx (1 byte)
AFlags = 0000xx00 (1 byte)
BFlags = 0000xx00 (1 byte)
ObjType = Free (2 bytes)
ObjLen = 7 (1 byte)
Magic2 (1 byte).
This is why we have chosen the 8-bit size for our chunks. Pad bytes are inserted as necessary in the Object Footer immediately before the second magic number so that the final byte of the object is the Magic2 byte. The pad bytes are not counted as part of the Body_Length, but are counted as part of the Object_Length.

The minimal valid object is thus the following Free object:

Magic1 (1byte)
HFlags=00000010 (1 byte), Wid=00, N=0, DLI=02
AFlags=00000000 (1 byte), Wid=00, P=0, ZZZ=000
BFlags=00000000 (1 byte), Wid=00, P=0, ZZZ=000
Object_Type=RESERVED (2 bytes)
Object_Length=1 (1 byte)
Magic2 (1 byte).
The header of the database will always look like this:
Magic1 (1byte)
HFlags=000xxx01 (1 byte), Wid=00, N=0, DLI=01
AFlags=0000x000 (1 byte), Wid=00, P=0, ZZZ=000
BFlags=0000x000 (1 byte), Wid=00, P=0, ZZZ=000
Object_Type=RESERVED (2 bytes)
Object_Length=0 (1 byte)
Magic2 (1 byte)
The hex and ASCII dump of this object would look something this:
76 01 00 00 00 00 00 35 |v......5|
The minimal valid allocated database storage object (with an Object_Name, no Object_Attributes or Object_Body) would thus be:
Magic1 (1byte)
HFlags=001xxxxx (1 byte), Wid=00, N=1, DLI=00
AFlags=0000xxxx (1 byte), Wid=00, P=0, ZZZ=000
BFlags=0000xxxx (1 byte), Wid=00, P=0, ZZZ=000
Object_Type=OPAQUE?????_BINARY (2 bytes)
Object_Length=2 (1 byte)
Name_Length=2 (1 byte)
Object Name (1 character + null byte) (2 bytes)
Pad (5 bytes)
Magic2 (1 byte).
Without the padding, this (rather useless) object would be 10 bytes long. Given the rounding requirements, it is clear that all allocated storage objects in the database must be at least 16 bytes long. A database object with a minimal Object_Body would need 13 bytes, which would need to be padded out to 16 bytes as well:
Magic1 (1 byte)
HFlags=001xxxxx (1 byte)
AFlags=00x1xxxx (1 byte)
Object Type (2 bytes)
Object Length=16 (1 byte)
Name Length=2 (1 byte)
Object Name (1 character + null byte) (2 bytes)
Body Length=1 (1 byte)
Body Data (1 byte)
Pad (3 bytes)
Magic2 (1 byte).


Sentence to introduce Table 13.

Table 13. ??????

31
16
15
0
Magic1
HFlags
IFlags????
Object...
... Type
Object Length
Name Length
Name Char
Name Null
Body Length
Body Data
Pad
Pad
Magic2

Consider the external form of a sample object with the HFlags byte set to 01100000 (so Object_Wid=01, N=1) and the Interior_flags byte set to 01110000 (so Interior_Wid=01, P=1, and ZZZ=00). All length fields are thus 16 bits wide, each of the Object_Name, Object_Body, and Object_Attributes elements is present, and the object interior is not compressed. Then the object might have the byte layout shown in Table 14.
 
 

Table 14.  Byte Layout of Object with HFlags Set to 01100000

31
16
15
0
Magic1
HFlags
IFlags????
Object...
...Type
Object Length
Name...
...Length
Object Name...
...Object Name
Attribute Length
Attribute Data...
...Attribute Data
Body Length
Body Data
Pad
Magic2

In practice, this particular object could have been encoded with 8-bit-long fields.

The reason 1 byte of magic number at each end of an object suffices is because the byte before the Magic1 byte of an object should be the Magic2 byte of the previous object. That fact plus the 8-byte alignment constraint will make it possible to reliably locate the start of each undamaged object in the database even if portions of the database have been corrupted.

Extension mechanism: If additional database fields need to be added in subsequent revisions, they will be placed in the Object_Footer just above any pad bytes. The new fields will seem to older versions of the database reader to be pad bytes, and so will be silently skipped.

Compressed Objects

The nice layout of each object makes it possible to apply data compression techniques to the Object_Interior. Each object's internals will be compressed independently, as specified by the ZZZ flag; one database may have a mix of compressed and uncompressed objects. Only the Magic1, Magic2, Flags, Object_Type, Object_Length, and Object_Name elements would need to remain in uncompressed format, so that the object could be located and/or skipped without needing to decompress it.

When the ZZZ bits are 000, the object has not been compressed and needs no special processing. When the ZZZ flag is nonzero, the on-disk object has been compressed. When the interior of a compressed object is needed, an extra processing step is added: after reading the object into memory in a db_external structure, the decompression algorithm is run on the interior of the object, resulting in a new db_external structure. The table of values for the ZZZ flag has been given earlier.  Setup sentence

Table 15.  ??????

Before
Decompression
After
Decompression
Magic1 Magic1
Flags Flags
Object_Type Object_Type
Object_LengthB Object_LengthA
Object_Name Object_Name
Object_Interior
(compressed)
Object_Attributes
pad Object_Body
Magic2
  pad
Magic2

After decompression, the object would look as described earlier, but the value of ObjectLength in the decompressed object would be different than the value of ObjectLength in the compressed object.

Decompression of compressed objects will be automatic and will be performed only when needed. For example, objects will not be decompressed when scanning the database (e.g., to build the in-memory table of contents). Compression will only be performed by user command; it is envisioned that MGED will be given compress and decompress commands to allow the user to have complete control over the speed/space tradeoffs.

Compression support is mandatory for any implementation of a database reader library; it is optional for an implementation of a database writer library.

Point of Discussion????Proposal: The Object_Attributes and Object_Body should be separately compressed. This way, if only one or the other is needed, only that portion needs to be decompressed. Also, the ASCII strings of the attributes and the binary data of the body are likely to need different treatment for compression. Some compression algorithms can be given "hints" that they are compressing 16- or 32-bit-wide binary data, rather than streams of 8-bit characters.

????Statement about the compression algorithm being specified by ZZZ bits in AFlags for attributes and BFlags for body.???

In the compressed form, an extra field that has the uncompressed length is added.
 
 

Table 16. ?????

Before
Decompression
After
Decompression
Magic1
Magic1
Flags
Flags
Object_Type
Object_Type
Object_LengthB
Object_LengthA
Object_Name
Object_Name
Object_Attribute Length
(Compressed size)
Object_Attribute Length
Uncompressed Size
Object_Attributes
Object_Attributes
(Compressed)
Object_Body Length
(Compressed size)
Object_Body Length
Uncompressed Size
Object_Body
Object_Body
(Compressed)
pad
pad
Magic2
Magic2


Overall Database Organization

As a whole, the database is organized in such as way that the length of each object is potentially quite different (see Table 17).
 
 

Table 17.  Database Organization

Header Object
Object 1
Object 2
.
.
Object k
Optional Header Object
Object k+1
.
.
Object n

Each (nonfree) object must have a name that is unique in a global name space. The object's name is its retrieval key. In its most basic form, the database can be thought of as a relational database (table) with a single key.

Traditionally, BRL-CAD databases with no commonly named objects can simply be concatenated. When this is done, it results in a second header in the middle of the file, which makes such concatenation easy to detect. (Note: If two or more objects have the same name, when the concatenated database is opened, there will be only one object with that name--the last one encountered.)



 

Details of the Object Body Element for BRL-CAD Constructive Objects

Nongeometric Objects

The object type value will be used to indicate the type of BRL-CAD constructive object (i.e., primitive shape or combination) that is contained in the object body.

Combinations (Groups and Regions)

Combinations will use attributes to store unlimited length strings for:

Storing Binary Trees in Combinations

Table 18 illustrates the structure of the combination record.

Table 18.  Combination Record Structure

Wid
Matrix Count
Leaf Count
Expression Length
Expression Depth
Matrices
Leaves
Expression

The 2-bit "Wid" flag specifies the size of the "Matrix Count", "Leaf Count", "Expression Length", and "Expression Depth" fields with the same semantics used in the "Wid" flags of HFlags, AFlags, and BFlags above. The "Matrix Count" field indicates the number of matrices in the "Matrices" section, the "Leaf Count" field indicates the number of leaves in the "Leaves" section, and the "Expression Length" indicates the sum of the number of operators and the number of operands in the "Expression" section. The "Expression Depth" field indicates the maximum size of a stack (in number of operands or subexpressions) required to parse the expression.

The "Matrices" section is a list of matrices, each recorded as 16 IEEE doubles in network order.

The "Leaves" section is a list of 2 * LeafCount members (as shown in Table 19).
 
 

Table 19.  Need Title

0
name_0
1
mi_0
2
name_1
3
mi_1
.
.
.
2*LeafCount - 2
name_LeafCount
2*LeafCount - 1
mi_LeafCount

where "name_i" is the name of the ith leaf and "mi_i" is the index into the "Matrices" section for the corresponding matrix. To avoid having to store identity matrices, the "mi_i" value of 2w - 1, where w is the value encoded by the "Wid" flag (See the "Wid" flags of HFlags, AFlags, and BFlags above.) (in other words, all bits of "mi_i" are set), is defined to indicate the identity matrix.

The "Expression" section records the tree in postfix (a.k.a., reverse Polish) notation using the encoding shown in Table 20.

Table 20. ????

Token
Meaning
1
operand
2
union
3
intersection
4
difference
5
symmetric difference (XOR)
6
complement (NOT)

Each time the "operand" token is popped from the stack, it will be interpreted as "the next element in the 'Leaves section'".

An empty "Expression" section (which implies that "Expression Length" and "Expression Depth" both equal 0) has the special interpretation that every operator is a union.

As an example, the combination with following expression and matrices:
 

((
alpha U bravo ) - ( charlie * delta )) - ( delta echo )
  I   Mb   Mc   I   Md   Me
 

could be written out as shown in Table 21.

Table 21. Need title.

Section
Content
Meaning
Wid
00
8 bits
Matrix Count
0000 0100
4
Leaf Count
0000 0110
6
Expression Length
0000 1011
11
Expression Depth
0000 0011
3
Matrices
b1,1 ... b4,4
Mb
c1,1 ... c4,4
Mc
d1,1 ... d4,4
Md
e1,1 ... e4,4
Me
Leaves
"alpha"
 
1111 1111
Identity matrix
"bravo"
 
0000 0000
0 (i.e., Mb)
"charlie"
 
0000 0001
1 (i.e., Mc)
"delta"
 
1111 1111
Identity matrix
"delta"
 
0000 0010
2 (i.e., Md)
"echo"
 
0000 0011
3 (i.e., Me)
Expression
0000 0001
operand (leaf0 = (alpha, I))
0000 0001
operand (leaf1 = (bravo, Mb))
0000 0010
union
0000 0001
operand (leaf2 = (charlie, Mc))
0000 0001
operand (leaf3 = (delta, I))
0000 0011
intersection
0000 0100
difference
0000 0001
operand (leaf4 = (delta, Md))
0000 0001
operand (leaf5 = (echo, Me))
0000 0100
difference
0000 0100
difference

 
 

Geometric Objects

When Major_Type = 2, the Minor_Type value will be used to indicate the type of BRL-CAD geometric object that is contained in the object body.

The Ellipsoid

The object body for the ellipsoid primitive shape contains a center point V and three mutually perpendicular radius vectors: A, B, and C. Each number is stored in 8 bytes as double-precision IEEE floating point, written in big-endian order, so the object body for this primitive shape will require ((1 * 3) + (3 * 3)) * 8 = 96 bytes. The layout of Body_Data for the ellipsoid is given in Table 22.

Table 22.  Ellipsoid Body Data

V
point
A vector
B vector
C vector

The A, B, and C vectors define a local coordinate system for the ellipsoid. The lengths of the vectors establish the three radii of the ellipsoid, each in the direction of its respective vector.

The Sphere

In traditional BRL-CAD ".g" databases, the sphere has not been separately encoded. Instead, it has been stored as an ellipsoid in which the A, B, and C vectors all have the same length (within a tolerance value). The raytracing prep routines have determined which ellipsoid primitive shapes were actually spheres and revectored those sphere shapes to a special higher-performance ray-shape intersection routine.

It is more storage-efficient, and less prone to tolerance errors, to store spheres explicitly represented as a center point V and a radius r. Each number is stored in 8 bytes as double-precision IEEE floating point, written in big-endian order, so the object body for this primitive shape will require ((1 * 3) + 1) * 8 = 32 bytes. The layout of Body_Data for the sphere is given in Table 23.

Table 23.  Sphere Body Data

V
point
r scalar

The possible drawback to this change is that the vectors A, B, and C traditionally stored for the sphere are an explicit local coordinate system. This fact is used to aim existing spherical light sources, as the center of the beam of light is focused down the primitive shape's -Z axis, which is transformed into the desired aim direction either via rotating the A, B, and C vectors or by transforming the local coordinate system of the primitive shape with the homogeneous transform matrices in the arcs of the model DAG. It is not clear how this functionality could be preserved, except perhaps by making sure that light sources were explicitly modeled as ellipsoids rather than as spheres (perhaps with equal-length vectors). This would be something that an automatic database converter could guess at by looking for combinations with material type light and a single member that is an ellipsoid, but manual repairs would probably still be necessary.

The N-Manifold-Geometry Primitive Shape

For all intents, the on-disk format of NMGs has not changed between version 4 DB and version 5 DB. The disk version is documented below. When a pointer is serialized, it is actually turned into an index. Each NMG structure is assigned a location (order) on the disk based on kind and order of a structured walk of the model. When a pointer is serialized, it turns into a LONG int that indexes into the "disk" array of structures of that kind.
Version
A version number of the on-disk format of the NMG.
Kind Count Array
An array of 26 counts. Each count represents the number of structures of a particular kind.
Model
Long containing DISK_MODEL_MAGIC, a Long Version (Always zero), serialized BU_LIST of regions.
NMG Region
Long DISK_REGION_MAGIC, serialized BU_LIST of regions, serialized pointer to model, serialized pointer to region attributes, serialized BU_LIST of shells.
NMG Region Attribute
Long DISK_REGION_A_MAGIC, serialized point of minimums of bounding box, serialized point of maximums of bounding box.
Shell
Long DISK_SHELL_MAGIC, serialized BU_LIST of shells that are in the same NMG region, serialized pointer to owning region, serialized pointer to shell attributes, serialized BU_LIST of face uses in this shell, serialized BU_LIST of loop uses in this shell (edge groups), serialized BU_LIST of edge uses (if shell has wires), serialized pointer to a single vertexuse (if shell only has one vertex).
Shell Attribute
Long DISK_SHELL_A_MAGIC, serialized point of minimums of bounding box, serialized point of maximums of bounding box.
Face Use
Long DISK_FACEUSE_MAGIC, serialized BU_LIST of faceuses (of which this is one) that are part of this shell, serialized pointer to the owning shell, serialized pointer to the face use mate of this faceuse, long orientation, serialized pointer to the face, serialized BU_LIST of loops in this faceuse.
Face
Long DISK_FACE_MAGIC, serialized BU_LIST of faces (of which this is one) that reference the same face geometry, serialized pointer to the owning faceuse, serialized pointer to the face geometry used, long flag describing if face normal is flipped.
Face Geometry(Plane)
Long DISK_FACE_G_PLANE_MAGIC, serialized BU_LIST of faces sharing this surface, serialized plane equation of this face (Normal and distance from origin to plane).
Face Geometry(SNURB)
Long DISK_FACE_G_SNURB_MAGIC, serialized BU_LIST of faces sharing this surface, long U dimension order, long V dimension order, long U dimension knot size, long V dimension knot size, U knots, V knots, long US size, long VS size, long number of control points, list of control points.
Point of discussion?????????Of which I don't understand *half* of the above sentence and I don't believe it is totally correct.
Point of discussion????Use of "use" in the following chapters. consistency in one word vs. two.
Loop Use
Long DISK_LOOPUSE_MAGIC, serialized BU_LIST of the loop uses that belong to the owning face, serialized poiter to faceuse or shell that owns the loopuse, serialize loop use mate pointer, long orientation flag, serialized pointer to loop, serialized BU_LIST of edge uses or *A* vertex pointer.
Loop
Long DISK_LOOP_MAGIC, serialize pointer to owning loop use, serialized pointer to loop geometry.
Loop Geometry
Long DISK_LOOP_G_MAGIC, serialized minimums of bounding box, serialized maximums of bounding box.
Edge Use
Long DISK_EDGEUSE_MAGIC, serialized BU_LIST of edge uses that make up this loop or wire edge, serialized pointer to loopuse/shell owner of this edge use, serialized pointer to edge use mate, serialized pointer to the radially adjacent face use(or zero for wire edges), serialized pointer to edge, long flag for orientation, serialized pointer to the first vertex use of this edge use in this orientation, serialized pointer to linear segment or curved segment.
Edge
Long DISK_EDGE_MAGIC, long flag for is this a real (modeled) edge or an artifact from tesselatio??????, serialized pointer to one use of this edge.
Edge Geometry(Linear)
Long DISK_EDGE_G_LSEG_MAGIC, serialized BU_LIST of edge uses on this line. Point (3tuple of doubles), direction (3tuple of doubles).
Edge Geometry(CNURB)
Long DISK_EDGE_G_CNURB_MAGIC, serialized list of edge uses on this curve, long curve order, long size of knot, ??? of knot(?s), long curve size, long point type, long list of control points.
Vertex Use
Long DISK_VERTEXUSE_MAGIC, serialized BU_LIST of vertex uses of which this is one (CTJ ERROR), serialized pointer to shell/loopuse/edgeuse owner of this verte us???, serialized pointer to the vertex, serialized pointer to the plane/cnurb.
Vertex Use Attribute(Planer)
Long DISK_VERTEXUSE_A_PLANE_MAGIC, Normal (3tuple of doubles) which should be a unit vector.
Vertex Use Attribute(CNURB)
Long DISK_VERTEXUSE_A_CNURB_MAGIC, parameter space 3 tuple of U, V, and W.
Vertex
Long DISK_VERTEX_MAGIC, serialized BU_LIST of vertex uses of this vertex, serialized pointer to the vertex geometry.
Vertex Geometry
Long DISK_VERTEX_G_MAGIC, point (3tuple of doubles)
Unused*4
There are four entries in the kind table that are not used??????.
Doubles Array
Unknown? Nothing?

The ARB8

?????Are ARB4/5/6/7/8 stored as separate object types, as one object type with a subtype code, or with degenerate points, as at present?

In the v4 format, the ARB8 is stored as a base point V and seven vectors, making spatial translation computationally inexpensive. In the v5 format, the ARB8 will be stored as eight points in space to prevent values from mutating due to repeated vector addition and subtraction of V. The layout of Body_Data for the ARB8 is given in Table 24.

Table 24.  ARB8 Body Data

P1
point
P2 point
:
P8 point

Each number is stored in 8 bytes as double-precision IEEE floating point, written in big-endian order, so the object body for this primitive shape will require 8 * 3 * 8 = 192 bytes.

The TOR

In the v4 format, the two radii for the torus are encoded in the length of the corresponding vectors. The new format stores a center point, a normal to the plane, and two radii.

The Halfspace Primitive Shape

A plane is defined by a unit-length outward pointing normal vector N, and the (perpendicular) distance d of the origin from the plane. In BRL-CAD, this is often manipulated as a 4-tuple S, of which the first three elements are N and the last is d. The plane/surface consists of all points P = (x,y,z) such that
VDOT(P,N) - d == 0
or
VDOT(P,S) - S[3] == 0
That is,
N[X]*x + N[Y]*y + N[Z]*z - d == 0
or
S[X]*x + S[Y]*y + S[Z]*z - S[3] == 0
The layout of Body_Data for the halfplane is given in Table 25.

Table 25.  Halfplane Body Data

N
vector
d scalar

Each number is stored in 8 bytes as double-precision IEEE floating point, written in big-endian order, so the object body for this primitive shape will require ((1 * 3) + 1) * 8 = 32 bytes.

The other geometric types

Detailed specifications of each of these are necessary.
ARS,
B-Spline,
ARBN,
PIPE/Wire,
Particle,
TGC (includes REC),
Torus (TOR),
Grip Pseudo-Solid (Leaf) and Joint Pseudo-Solid,
Right Parabolic Cylinder (RPC),
Right Hyperbolic Cylinder (RHC),
Elliptical Paraboloid (EPA),
Elliptical Hyperboloid (EHY),
Elliptical Torus (ETO),
Extruded Bitmap (EBM,
Volumetric (VOL),
BOT

A Proposal for Parametric Databases (Do we want this written as to be done or already done?)

The basic goal is to enable a modest number of objects (both combinations and primitive shapes) to have some of their defining parameters (both numeric and string) taken from a TCL variable rather than a numeric constant, without loss of performance when processing nonparameterized objects.

These TCL variables can be set by prior code execution (e.g., from commands given in MGED or from commands placed on the RT command line or in an RT animation script), as well as by TCL code included in the Header object(s)'s object-body field. In order to prevent long dependency chains between objects from being formed, only substitution of existing variables can be performed when an object is read -- formulas and procedure calls will not be permitted at substitution time. As a result, all variables will have been given values before any objects are retrieved from the database.

If variables depend on the values of other variables, then the user must take into account the order of assignment, just as with other types of programs.

Those binary numeric fields in the object that are to be substituted will be filled with an IEEE Signaling NaN pattern, to distinguish it from a valid binary number. Each object will need to have an additional variable-length field that will be used to hold the replacement variable list. Two possible strategies:

        field_number=variable_name,
        field_number=variable_name,...
After the object was imported into internal form, the indicated structure fields would be updated by de-referencing the corresponding variables. The Signaling NaNs would be used only for protection against software error.
Both of these approaches are unwieldy for objects with large numbers of parameters, like NMGs and B-spline surfaces.

In this proposal, two different interfaces will be required for reading objects from the database:

db_get_internal() will continue to get the purely numeric form in rtgeom.h format.

Some other function will be required to get and put the objects in an internal symbolic form. It might be wise to make this have semantics as close as possible to the form of Glenn Durfee's TCL MGED "db get" command. That is, the editable symbolic form should be an ASCII text string in a format suitable for further TCL processing. As get/put operations on the symbolic form will be quite rare relative to the operation to get the numeric form, the extra overhead of using an ASCII representation is negligible.

There would be a great consistency advantage to be gained if the symbolic substitutions could be handled by the same routines as the "db get" support. The "db get" routines are presently local to TCL MGED, but the "get" and "put" operations need to become standard function table interfaces to the g_xxx.c modules in LIBRT. This raises the related question of how bulky objects like NMGs, B-splines, and height fields should be represented in their ASCII forms.

String parameters, such as the optical shader parameter string, can be passed through the TCL interpreter if a left-square-bracket ("[") character is seen in the string. This would permit the normal TCL syntax to invoking the TCL interpreter to be used. In this example, the following two lines have the same effect:

        plastic shine=[$shine] diffuse=[dfunc($statevar)]
        [plastic shine=$shine diffuse=dfunc($statevar)]

Highly Interactive Coupling Between Variables and the MGED Display

In order to allow highly interactive modifications of geometry that is defined parametrically, a set of extra support routines are desirable. A typical goal is to support the creation of a slider widget in MGED assigned to a database variable so that the user can adjust the slider and watch the geometry on the screen update "automatically".

The tree walker should keep an auxiliary table that is indexed by database variable name. For each database variable that has been seen while walking the tree, there will be a list of db_full_path structures for every primitive shape dependent on that variable, and a flag variable indicating whether the presently loaded version of each primitive shape is out of date with respect to the variables on which it depends. (What data structure to use for the many-to-many mapping of variables to full paths?) The tree walker will automatically associate a TCL hooked function with each TCL variable on this list. Whenever a new value is assigned to a TCL variable, the dependent primitive shapes will be marked as out of date, but no other action will be taken.

Application code (TCL code or C code) will be able to call a library routine to "flush" the modifications, that is, to re-process all out-of-date primitive shapes to bring their internal representations (whether wireframes on an mged screen or prepped primitive shapes for raytracing) into sync with the variables on which they depend. It is envisioned that there will a library routine (db_variable_flush) that the application can call, providing a pointer to an application-provided function that will be called once with each out-of-date full path. In the case of mged, after processing each batch of input commands and events and before updating the screen, db_variable_flush would be called with a handler that essentially reruns the mged "e" command for each stale full path, so that the vlist would be regenerated before the screen update. As long as the number and complexity of the primitive shapes to be regenerated per screen update are modest, this will be highly interactive.

The simplest version of a TCL command to implement a Tk scale widget that would have this auto-update property would look something like this:

        scale .sliders.f.kX -label "Change Variable ANTENNA_HEIGHT" \
                -variable antenna_height \
                -command "mged_variable_flush"


Requirements for the New Selector Node

Goal: To allow different geometric representations to be selected based upon the value of a TCL variable. This might be used to provide different levels-of-detail (LoD), or it might be used to enable or disable complex subsystems from being included in analyses where they are not necessary. For example, a substantial runtime savings can be realized when computing radar signatures by ignoring all insulators (such a rubber tires), leaving only conductors and dielectrics in the active geometry.

This might be accomplished in several ways:


Future Plans


Random Unresolved Questions

Perhaps encapsulate the dirbuild/get/put object routines into an OO layer of their own, to facilitate co-existence of v4 and v5. This would make it straightforward to have the v5 software at least continue reading v4 databases on a read-only basis.

Make the rtgeom.h-object-to-TCL conversion routines another access method for geometric objects. Question: How to convert nurbs, NMGs, HFs, etc.?

Perhaps export the C symbol for the rt_structparse table that corresponds to the rtgeom.h structure.

It would be nice if there were some way of "mounting" or "including" a subtree from a second ".g" file into the context of the current ".g" file, even if only on a read-only basis. Issues: name conflicts, ambiguity of names in full-path specs, multiple dbip in play for a "single" ".g" file, back/up references from the included subtree.


Community Feedback on the Proposal

David P. Kitzinger, ARA

From: "David P. Kitzinger" 
Subject: Re: Input Sought on V5 Database Format

A couple of thoughts on the database format. First, ARA would like some
resemblance of dynamically created geometry (geometry that gets 
introduced or turned on/off during an interrogation cycle).

While we've talked about being able to update the database and perform
incremental preps (which probably doesn't require database changes, just
new routines to accomplish the aforesaid), it may be attractive to
provide a mechanism whereby the client code can tell BRLCAD to ignore
or turn off a piece of geometry. This would require an extra bit
in the database for entities that BRLCAD code maintains.
Mike responds: The generalization of what you ask for are (a) some form of the selector node, and (b) the ability to direct LIBWDB to write geometry into the in-memory database, instead of into a disk file. The latter won't be any problem to provide, although it's only peripherally related to the new database format I'm hopeful that we'll implement that capability when we re-vamp LIBWDB to be aware of the new database. The former capability is something that I still don't have a good feel for. How complicated should the selection expression be? How many choices should be allowed? (two, or many?) Where do the variables come from that are used to make the selection?
Another perhaps minor feature that might be useful is the notion of 
database versions. Because a .g can be copied, the UNIX timestamp
is not adequate to know which version of a .g file is most up to date.
Also, if folks perform extra preprocessing on a .g file and save the
data offboard in another file, the possibility for out-of-sync problems
exist. It might be nice if the database could maintain a version number
field for the entire database, perhaps along with a field for the login
of the user making the last change. mged and/or libwdb routines would
then increment the version number whenever alterations to the .g file
occur, no matter how insignificant the change might be. mged could display
this info when it first brings up the file, and an access routine could
be provided to permit client code to check versions. This basically is
revision control, but unburdens client code from trying to manage the
same info off-board (since a modeler can always get into the .g and
bypass any extra layers an application might wish to maintain, there is
always risk when the application attempts to control this). Of course,
if a modeler goes outside of BRLCAD, like cat'ing .g files together instead
of using "concat", there may be problems but this should be rare.
Perhaps having the database maintain a checksum for the .g would also catch
even these problems.
Mike responds: A version number could certainly be added, but I'm not sure how useful it would be. If user B copies a database from user A at revision level 42, and user A makes one more change (A has rev 43), and user B makes 3 changes (B has rev 45), how would the version number help, other than to indicate that things are mis-matched?

What is really needed is a better method for interfacing to RCS, which can deal with branches off the main path of development, and can even perform merges between two branches. I don't know how well RCS would work on binary databases. Perhaps if we could supply RCS a BRL-CAD-specific item-by-item comparison function....?

I'm certainly more familiar with the BRLCAD interrogation routines than
actually constructing BRL-CAD models. Whenever certain operations
occur, old data in the .g file is marked ID_FREE and left in place.
Depending on the construction approach, this can leave rather large
pockets that make the file large and artificially jack-up memory usage
when interrogating. I solve this by converting to asc and then back to
.g file to eliminate these pockets. Is there an existing mged command
now that does basically the same thing?? Can the new database be designed
to minimize/utilize these holes and/or can you provide routines/mged 
commands to "compact" the database?? Since you are going to an IEEE
format g2asc will no longer be needed so my old approach will no longer
work.
Mike responds: The ID_FREE "pockets" are subject to re-use when the next database object is created. Also, they don't increase the amount of memory needed to hold the prepped database, although they do of course have to be read in so they can be skipped over. The worst-case (and very common) scenario in which these pockets can become quite plentiful is this:
        +----+----+----+
        | S1 |  R... R | End
        +----+----+----+
        +----+----+----+----+----+----+----+
        | S1 |  Hole   | S2 |  R........R  | End
        +----+----+----+----+----+----+----+
        +----+----+----+----+----+----+----+----+----+----+----+
        | S1 | S3 |Hole| S2 | Hole....Hole |  R..............R | End
        +----+----+----+----+----+----+----+----+----+----+----+
It would be possible to change the algorithm slightly, such that the old version of R was deleted before the new version of R was allocated and written. That would make for fewer "holes", but it wouldn't be a stable store algorithm, and thus both versions of R could be lost if the user typed ^C or the system crashed at the wrong moment.

As for the second part of your suggestion, it would be very easy to add a command to MGED to compact out all the free granules. It could be implemented in two ways: in place, which would risk database damage if the system (or program) crashed while it was happening, and compaction by copying, which would risk running the disk out of space while operating, but would leave the original uncompacted file untouched in case of failure. I prefer #2. I propose the command name "compact".

Another question regards construction rules. mged allows me to construct
a region by combining primitive shapes, other regions, and even groups with
boolean operations. As far as I
can tell this all seems to work fine, I've rendered some examples of such
regions and they seemed correct, although warning
messages are occasionally issued. Is this supposed to work? Is this a valid
construction method? When I shoot a ray through a top level region,
are lower pieces (groups and regions) treated as an integral part of the
overall region (i.e. does the partition list reflect only the top level
region?). Can this be clarified in the database format document or
somewhere else? It obviously is a powerful method, so if you don't 
officially support some permutation of the above, would you consider it
for the next release?
Mike responds: being able to combine regions with other regions is supported -- the higher level region's material definition persists all the way down. This is particularly useful for being able to correct overlaps by subtracting off the offending region, rather than having to identify the precise primitive shape that's overlapping.

Being able to boolean things from "upper level" combinations is also supported. In this case, a copy of the boolean operation is propagated downward to all affected regions, accumulating transformations as you would expect. This is especially nice for doing "cutaway views". Both of these things are features of the "new" tree walker in Release 4.

Keep up the good work,
                                           /|     /--\   /|
                                          / |    /   |  / |
David P. Kitzinger                       /__|   /___/  /__|
Staff Scientist                         /   |  /   |  /   |
                                       /    | /    | /    |
Applied Research Associates, Inc.        (505) 883-3636
4300 San Mateo Blvd. NE, Suite A-220     (505) 883-3673 FAX
Albuquerque, NM 87110                    kitzinger@ara.com

John Keyser, UNC

From: John Keyser 
Subject: Comments on database v5 proposal - handling transformations.
Date: Thu, 18 Jul 1996 11:20:15 -0400 (EDT)

        You spend some time in the version 5 proposal dealing with potential 
ways of handling the transformations for binary trees in the new database

format.  I thought of two other possibilities that might be possible, although
they certainly have their drawbacks.  For one, these methods might cause
too much havoc with the "cleanness" of the data structures.  Second, they
might pose problems for backward compatibility.  Also, they could cause the
tree to be larger than desired, but probably not by too much.  I am not
familiar with animation scripts so I cannot say how they would be affected.
        The first solution would be to create transformation _nodes_ that
are unary operators.  This would allow you to represent the tree similar
to the conceptual model of what is happening - a transformation node would
separate an object from the combination.  The in-memory model would be
exactly as desired.  This would mean having three types of nodes - primitive 
primitive shapes, combinations, and transformations.
        The second solution (and the one that seems better to me) would
be to have a transformation _primitive_ defined.  Thus, using the  
(A u B) - (C + D) example, a tree would look like:

                              -
                             / \
                            /   \
                           /     \
                          /       \
                         /         \
                        /           \
                       /             \
                      ?               ?
                     / \             / \
                    /   \           /   \
                   /     \         /     \
                  u      M1       +      M2
                 / \             / \
                /   \           /   \
               /     \         /     \
              ?       ?       ?       ?
             / \     / \     / \     / \
            /   \   /   \   /   \   /   \
           A    Ma B    Mb C    Mc D    Md
                         

Where the ? nodes could be any operation.  One advantage of this would
be that you could reference the same transformation matrix from several places
(thus making it more certain that you were applying a consistent transform
at several places in the tree).
        Both methods seem to handle "chains" of transformations (such as might 
occur when an arc is deleted) fine (transformations do not need to be 
combined).  Also the idea of a "selector node" could be used to allow a user 
to change translations easily.  Another big advantage would be that for
identity matrices, nothing would need to be stored - no need for tolerancing.

                                        - John Keyser
                                          keyser@cs.unc.edu

Database Library Application Programming Interface (API)

This section documents how to use the database library for programmers developing internal code for LIBRT or MGED. The LIBRT API is not affected by the database change; (e.g., rt_dirbuild(), rt_gettrees(), and rt_shootray() are not affected).

The primary API for the database revolves around the routines rt_db_put_internal() and rt_db_get_internal(), plus the "internal" data structures vectored out from struct rt_db_internal: union tree for combinations, from "h/raytrace.h", and the family of struct rt_xxx_internal structures from "h/rtgeom.h".

The main issue for the database API becomes one of how a retrieved and "unwrapped" in-memory database object looks to an application. The most natural way would be to have a variable-length array of structures something like this:

        struct attributes  {
                char    *aname;
                char    type;
                long    len;
                genptr_t value;
        } attrib[];
where type would be either ASCII string, binary block, or binary integer (perhaps in several widths).

Then, to retrieve a BRL-CAD object (in external form) from the database, one would merely call:

        db_get_external(ext, dp);
        db_unwrap_v5_external(atab, ext);
        id = db_get_integer_attribute( atab, "t" );
        buf = db_get_block_attribute( atab, "b" );
and then proceed with the BRL-CAD-level external-to-internal conversion from there.

The main questions are:


http://ftp.arl.mil/~mike/papers/brlcad5.0/appendix.htmlAppendix: Earlier Proposals for Database Organization


$Header: /vld/mike/public_html/papers/brlcad5.0/RCS/newdb.html,v 1.70 2000/06/30 15:07:48 jra Exp $
Collected by Mike Muuss
E-mail comments to mailto:acst@arl.mil