A Tour Through the Multi-Device Queueing System

Douglas P. Kingston III

Ballistic Research Laboratory
Attn: SLCBR-VL-V (Kingston)
Aberdeen Proving Ground, MD 21005-5066

Revised for MDQS 2.12: 23 February 1989

ABSTRACT

The Multi-Device Queueing System is a full-featured queueing system that has been developed for the UNIX(R)** operating system. This document is an in-depth description of the implementation of the MDQS system and is meant to serve as a guide for those who will be maintaining MDQS. It will also be of use to those interested in queueing system implementations. A theoretical overview is given, followed by a description of the configuration of the system both at compile time and at run time. Lastly, a module-by-module description is given of the system with comments relating to design choices, portability and efficiency considerations, areas scheduled to be modified, and possible problem areas.

** UNIX is a registered trademark of AT&T.

Introduction

The Multi-Device Queueing System (MDQS) has changed and evolved somewhat since the first published information.*** This paper is designed to describe in detail the workings of the MDQS system and to introduce the maintainer to the various modules and functions. As such, it will be most useful for the reader to have available a printout of the entire MDQS source for reference. I will start by describing the tailoring information, as this is likely to be the area of greatest interest, then move on from there, pretty much in the order of the printout (header files, then sources in alphabetical order).

*** Kingston, Douglas P. and Muuss, Michael J., ``The Multi-Device Queueing System'', 1982 Summer USENIX Conference.

The BIG Picture

MDQS was designed with almost everything in mind. What we currently have is a system to handle a variety of line printer devices (e.g. Dataproducts(R), Printronix(R), Versatec(R), etc.), phototypesetting emulation via an interface to VCAT, laser printers (e.g., Imagen(R)), and batch requests in a fairly civilized manner. There is also a ``network'' pseudo device that is used to send requests to remote machines for processing. There are currently network interfaces for Berkeley 2.8BSD TCP/IP, Berkeley 4.2BSD TCP/IP, and UUCP (not fully tested).

The theoretical model that MDQS implements is a very simple but flexible queueing system. The main entities of the system are requests, queues, devices, and mappings. A request contains all the information necessary for the system to determine where to put the request and how to service it. A request is built and sent to a queue. There will normally be several queues to handle different needs. A device is something that is assigned a request based on a mapping from queue to device. The device is a resource to be shared and normally maps to some real device, although this is not necessary. The key to MDQS's flexibility is the mapping function from queue to device. The mappings consist of (queue,device,server) triples and are logically thought of as a linked list attached to the specified device. If during operation MDQS discovers an empty device, it goes to the mapping table for that device and examines each queue in the list. If the queue is not empty, the first eligible request is assigned to the device. If that queue is empty, the search continues until the end of the table is reached, in which case the device is left idle. The table can be searched round-robin or in table order (which is the default). This system admits of two powerful constructions in addition to simple queue-to-device mappings. First, we can have multiple queues feeding a single device. The first queue scanned that has an eligible request is serviced. An example of this is using a single Versatec to service both a print and a plot queue. Second, there can be more than one device mapped to a given queue. Thus, if a system has several line printers, it can maintain a single print queue which is serviced by two or more printers. In fact, you can have two or more queues being serviced by two or more devices if you so desire.

The heart of the system is the daemon. The daemon is started by init through /etc/rc. The daemon wakes up every few seconds and stats a file called ``the prod file'' because it is used to ``prod'' the daemon to check for changes. The daemon will check for further changes whenever the modification time on the prod file changes. When the daemon is prodded, it stats a number of files looking for something that might have changed. Among these files are the configuration file, the ``new requests'' directory, the ``modified requests'' directory, and the device status files. If any of these have changed, appropriate action is taken after determining the nature of the change. After that, the daemon looks for delayed requests which have reached their activation times. These requests are queued to the appropriate queues. Finally, the daemon scans the devices and looks for devices which do not have a running request or have no request at all. These are given a request if possible and are then started up by forking a child to do the real work of completing a request.
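The prod-file check described above comes down to comparing a file's modification time with the one seen on the previous pass. The following is a minimal sketch in present-day POSIX C, not code taken from the MDQS source; the names struct watch and watch_changed are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: remembers the last modification time seen for
 * one watched file or directory (the prod file, qconf, new/, mod/...). */
struct watch {
    const char *path;    /* file or directory to stat */
    time_t last_mtime;   /* mtime seen on the previous check; 0 = never */
};

/* Returns 1 if the modification time differs from the recorded one
 * (and records the new time), 0 if it is unchanged, -1 if stat() fails. */
int watch_changed(struct watch *w)
{
    struct stat sb;

    assert(w != NULL && w->path != NULL);
    if (stat(w->path, &sb) < 0)
        return -1;
    if (sb.st_mtime == w->last_mtime)
        return 0;
    w->last_mtime = sb.st_mtime;
    return 1;
}
```

A daemon loop built this way would call watch_changed() on the prod file once per pass and go on to stat the other files only when it returns 1.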

There are some other activities that the daemon is involved with as well. The daemon keeps several informational files which are currently used by the status program (qstat) and the device control program (qdev). Since only the daemon knows the actual ordering of the queue, it writes out a compacted list of the queue requests in the order that the daemon has them in core. This information is used by qstat to give an accurate indication of the queueing order. The ``qdev'' program and the daemon jointly maintain the device control information.

Directory Structure

The MDQS directory hierarchy is designed to provide a maximum of security and a minimum need for linear searches on the part of the daemon. The philosophy behind this structure is that a non-privileged process places protected files into a temporary directory, then execs a setuid process that can chdir down into the main spooling directories. The setuid process immediately resets its uid and gid to that of the invoker and then moves the files from the temporary directory down into its current protected directory. In essence, we allow only trusted processes to put requests into the queueing directories.

The spooling directory hierarchy is as follows:

        /usr/spool/q/                   mode 755
                qtmp/                   mode 777
                lock/                   mode 700
                        home/           mode 755
                                cntrl/  mode 777
                                data/   mode 777
                                new/    mode 777
                                mod/    mode 777
                                adm/    mode 777
                                hold/   mode 777

The home of the daemon and other MDQS processes while they are running is usually the ``cntrl'' directory, as this is where all the request files are kept. Since many of the processes that are acting on the user's behalf are not running privileged most of the time, they refer to the other queueing directories (data, new, mod, adm, hold) by relative pathnames since they cannot traverse the ``lock'' directory. Examples of these would be the qstat, qmod, and qdev programs, as well as the second stage queueing process, queue2.

The ``cntrl'' directory contains request control files (one per request). The brother directory ``data'' contains the data that will be needed to work on the request (the files to be printed, the shell commands for a batch process, etc.). The ``new'' directory is normally empty and is used to signal the daemon that there is a new request to be processed. The daemon will stat this directory every few seconds to see if the modification time has changed. If it has, it will read the directory and add each request it finds to its internal queue, then unlink the file from ``new''. The file is actually a link to a file of the same name in the ``cntrl'' directory, but by placing the file in the new directory, we save the daemon the task of scanning the cntrl directory to discover which file is the new one. The mod directory works in much the same manner as the ``new'' directory, except it is used to indicate that the user has made a change to a request using the qmod program. For the ``mod'' directory, the file is not a link, but instead contains the old request information so the daemon can easily find the original request in its internal queues. The ``adm'' directory contains the daemon's queue status files, lock files, and device status files for the server processes. These device status files are used by the qdev command to communicate enabling/disabling and form changes to the daemon. In turn, the daemon communicates device failure and current active request data back to qdev and qstat through the same file. The ``hold'' directory is used by the daemon to save copies of the request files for those requests that cause uncorrectable or severe errors to the daemon or its servant processes. These are saved for the amusement and enlightenment of people like yourselves.
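The ``new'' directory handshake can be sketched as a pair of routines: one that links a control file into ``new'', and one that collects and unlinks whatever it finds there. This is an illustration in present-day POSIX C under the assumption that the current directory is ``cntrl''; the function names and the REQNAME_MAX limit are hypothetical and are not taken from the MDQS source.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define REQNAME_MAX 64   /* hypothetical limit on a request file name */

/* Submitter side: make new/<name> a link to the control file <name>
 * in the current directory (assumed to be cntrl). */
int announce_request(const char *name)
{
    char path[256];

    assert(name != NULL);
    if (strlen(name) >= REQNAME_MAX)
        return -1;
    snprintf(path, sizeof path, "../new/%s", name);
    return link(name, path);
}

/* Daemon side: copy each name found in ../new into names[] and unlink
 * the link.  The control file itself stays in cntrl.
 * Returns the number of names collected, or -1 if ../new is unreadable. */
int collect_new(char names[][REQNAME_MAX], int max)
{
    DIR *dp;
    struct dirent *de;
    char path[256];
    int n = 0;

    assert(names != NULL && max > 0);
    if ((dp = opendir("../new")) == NULL)
        return -1;
    while (n < max && (de = readdir(dp)) != NULL) {
        if (de->d_name[0] == '.')
            continue;                 /* skip "." and ".." */
        if (strlen(de->d_name) >= REQNAME_MAX)
            continue;                 /* too long to be one of ours */
        snprintf(path, sizeof path, "../new/%s", de->d_name);
        strcpy(names[n], de->d_name);
        if (unlink(path) == 0)
            n++;
    }
    closedir(dp);
    return n;
}
```

Because the entry in ``new'' is only a second link, removing it costs one unlink and leaves the request file in ``cntrl'' untouched.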

Configuration

Before I talk about source, let me explain the layout of the source directories. The directories are laid out to allow more than one installation's binaries to be maintained at once on the same system. To accomplish this, the source is in three directories (src, lib, libbprint), the common headers are in another directory (h), and each system is given a directory that is a brother of these four. The distribution comes with a prototype system generation directory called ``proto''. You should make a directory for your system and copy the files in the ``proto'' directory into the directory for your system. In addition, there are two other brother directories called ``misc'' and ``doc'' which contain some ancillary files and NROFF/TROFF source for the documentation on MDQS. The ``misc'' directory contains a sample /etc/rc entry, a replacement for the lp or lpr command, and some other sample files. Those preceded by lib.q belong in the MDQS library directory as specified in the qconf.c file.

There are three major configuration files. The first and most painful (in terms of recompilation) is qconf.h in the generation directory. Its contents include the following:

System-type

You must define your system type. This definition will control the inclusion of proper conditional code for the system in question. There is currently support for Version 6 (V6, unknown quality), Version 7 (V7), BRL's High Performance PDP/11 Unix (BRLUNIX), Berkeley 4.1 (BSD41), Berkeley 4.1c (BSD41c), Berkeley 4.2 (BSD42), and AT&T's System III and System V (SYS3). If your system does not fit in this group, examine the places where conditional code is used and create an appropriate define.

COREDUMP

If this is defined, the daemon will attempt to dump core in the Qcntrldir directory when it encounters fatal errors. This is used for debugging.

SCANWAIT

This sets the default time to be spent sleeping at the completion of each pass through the main loop. This will have a direct effect on how fast the daemon will respond to changes in the modification directory and the new request directory. It also determines the daemon's ``idle load'' on the system.

PROCWAIT

Since a wait(2) causes a process to block if there are no dead children, an alarm call is put around the wait(2) call so that if no children die after PROCWAIT seconds, the daemon will be interrupted and stop waiting. I know there are better ways in some versions of UNIX, but those are not portable. The value should be small (probably 2 seconds). On 4.2BSD, this value is ignored and a non-blocking wait is done instead (via the wait3() system call).

OPENWAIT

If the daemon cannot open a device, it will wait this many seconds before trying again.

The preceding are default times only; they can be changed at run time (see below).

SYSGRP

This group is treated as privileged by the queueing system. Members of this group can modify, flush, restart, and view all MDQS processes. On many systems this is the ``bin'', ``admin'', ``operator'', or ``system'' GID.

The next most important file is qconf.c in the system generation directory. This file contains most of the installation-dependent information. The first set of variables are pathnames. Only those that are absolute pathnames should need to be changed.

LOCKS

These should probably not be altered unless you must as the result of modifying the locking code.

LIMITS

Do NOT alter the priority limits. As for the copy limits, they are there to prevent someone from submitting a runaway job. Tailor these as you desire.

The Makefile has several variables that need to be tuned for a new site. At the top are the standard definitions for cc and ld. Following that are the flags variables, one each for compiling (CFLAGS), loading (LDFLAGS), and linting (LINTFLAGS). If you need to load the binaries with some special libraries, add the -l? specification to the end of LDFLAGS. AT&T Unix systems (those once supported by the Bell System Unix Support Group, aka USG) don't use the ranlib(1) program. If your system doesn't have ranlib, then use the second definition of RANLIB. USG Unix systems don't have dup2() or ftime(). If your system is a USG Unix system you will probably want to uncomment the line defining ``USGFAKES''. Comment this out if you have these routines. The Makefile has several variables that are used when installing the binaries using make(1). In particular...

QLIB

This is the directory into which ancillary MDQS files and programs will be put. These are not directly accessed by users, but should be accessible to a non-privileged process.

QBIN

The user interface programs will be placed in this directory. This should probably be /usr/local or /usr/bin.

QSPOOL

The root of the spooling directory tree. This is used by make when you ask it to make the directory hierarchy.

ROOT

The name of the superuser account on your system. On most systems this will be ``root''.

DAE

The name of the account to be used for MDQS privileged operations. On most systems this will be ``daemon''.

MDQS

The name of the MDQS placeholder account. This is basically so that mail has a place to originate from. This account will normally be used to run requests that originate from the network and therefore should have no special privileges and should be in its own group.

SYSGRP

The sys group is given special privileges by some programs. The value of SYSGRP here and in qconf.h should refer to the same group. On many systems this is ``bin'', ``admin'', ``operator'', or ``system''.

The Makefile has special make targets that will select the proper version of some modules based on your system type. Examine the various ``set-*'' entries in the Makefile and select the appropriate version for your system. You may have to edit copies of system headers; for example, on some Sun workstations, the time.h header would recursively include itself instead of /usr/include/time.h, due to the interaction of angle brackets in the header and cc -I. in the Makefile. You will probably also want to tailor the targets ``all'', ``lint'', and ``install'' to only include the programs you will be using on your system. Remember, there are some definitions that appear both in the qconf.c file and in the Makefile. Be sure to update both occurrences if you change something.

Finally, we have the run-time configuration file, qconf. This file is monitored by the daemon for any modifications. If the file is modified, the daemon re-reads the file and reconfigures its internal tables and variables accordingly. Injudicious editing can have severe consequences, so some caution should be exercised in changing this file.

There are four sections to this file. Each section is separated from the next by a line beginning with a hyphen (``-''). For visibility, I normally use a long row of them.

The configuration file can also have comments and blank lines. The comment character is the number sign (``#''). The number sign and all subsequent characters on that line are ignored. Blank lines are ignored. When MDQS processes the file, it takes each line, removes the comments if any, and then parses the line into arguments like the shell. Tokens on the line can be separated by any combination of spaces and tabs. Leading and trailing white space is ignored. Quoted strings (using double-quotes) are treated as single tokens. The qconf file must end in the three characters EOF, or the four characters EOF(newline). This is a safety measure so that the daemon knows that it has the entire configuration file. I am not 100% happy with the way this is done right now, so it may change, but it's better than nothing.
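The line-processing rules above (strip the comment, split on spaces and tabs, treat a double-quoted string as one token) and the EOF trailer check can be sketched as below. This is a minimal illustration, not the MDQS parser; split_line and has_eof_marker are hypothetical names. It follows the order given in the text literally, so a number sign inside a quoted string also starts a comment; whether the real parser behaves that way is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split one qconf-style line into tokens, modifying the line in place.
 * Returns the number of tokens stored in argv (at most max). */
int split_line(char *line, char *argv[], int max)
{
    char *p;
    int argc = 0;

    assert(line != NULL && argv != NULL);

    /* Step 1: the number sign and everything after it is ignored. */
    if ((p = strchr(line, '#')) != NULL)
        *p = '\0';

    /* Step 2: tokens are separated by spaces and tabs. */
    p = line;
    while (argc < max) {
        while (*p == ' ' || *p == '\t' || *p == '\n')
            p++;
        if (*p == '\0')
            break;
        if (*p == '"') {              /* quoted string: one token */
            argv[argc++] = ++p;
            while (*p != '\0' && *p != '"')
                p++;
        } else {
            argv[argc++] = p;
            while (*p != '\0' && *p != ' ' && *p != '\t' && *p != '\n')
                p++;
        }
        if (*p == '\0')
            break;
        *p++ = '\0';                  /* terminate the token */
    }
    return argc;
}

/* True if the buffer ends in "EOF" or "EOF\n", as the text requires. */
int has_eof_marker(const char *buf, size_t len)
{
    assert(buf != NULL);
    if (len >= 4 && memcmp(buf + len - 4, "EOF\n", 4) == 0)
        return 1;
    return len >= 3 && memcmp(buf + len - 3, "EOF", 3) == 0;
}
```

A blank line or a line holding only a comment yields zero tokens, which a caller can simply skip.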

The first section contains changeable MDQS parameters. These entries have the form ``parametername value''. The following parameters are currently recognized. The programs affected are listed on the right-hand side.

debug <number> daemon

If this value is non-zero, MDQS will print debugging information on stderr. The value of the number will adjust how much actually gets printed. Currently only 2 levels of debugging are supported. See console below.

openwait <number> daemon

procwait <number> daemon

scanwait <number> daemon

These values will override the #defined values in qconf.h (see above).

console filename daemon

The file named is opened as stderr. This is useful if you wish to temporarily redirect the MDQS diagnostics for debugging purposes. If the file cannot be opened, the original stderr is kept. If the open succeeds, a message is printed on the old console indicating the change, and the new console gets a message indicating it is the new console. A seek is done to end-of-file on the new stderr.

maxfailures <number> daemon

If this variable is non-zero, the daemon will flag a device as ``failed'' if the server on that device fails the number of times specified by ``maxfailures''. If this happens, the device can be restarted by disabling and re-enabling the device with qdev.

sysmgr <address> daemon, batch, qpr, verset

The daemon will mail orphan notices to the ``sysmgr'' address. This address defaults to ``mdqs''. On our system this must be a valid Internet address. On your system this restriction may not apply.

sysgrp <groupid> daemon

Users having this group ID are considered to be ``privileged''. This value will override the #define'd value in qconf.h (see above).

mdqsid <address/loginid> daemon, netdae

This login ID is used when processing requests that originate from the network. It is also currently given to the mail system to be used in the ``From:'' field. This dual usage should change.

accessfile <filename> netrecv

The access list file is used to restrict network access to your queueing system. This is optional. If no file is specified, then there are no restrictions. If the file is present, only those hosts listed are allowed to submit requests. If a list of queues follows a hostname, then that host can only submit requests to the queues listed.

hostname <hostid> batch, qpr, netrecv, verset

This is used to specify the hostname of your system. The hostname defaults to ``localhost''.

batch-forms <form> batch

batch-queue <queue> batch

batch-prior <priority> batch

batch-nice <niceness> shserver

These control the default behavior of the batch program. The form should be a valid MDQS form. The queue must be a valid queue specified later in the qconf file. The priority is the default priority that jobs enqueued by the batch program will be given. The niceness is the reduction in UNIX scheduler priority to be applied to batch jobs. This information is generally site dependent. Typical values would be ``Shell'', ``batch'', ``64'', and ``19'' respectively.

print-forms <form> qpr

print-queue <queue> qpr

print-prior <priority> qpr

print-hdr <headerfile> qpr

print-hdrdir <directory> qpr

These control the default behavior of the qpr program. The form should be a valid MDQS form. The queue must be a valid queue specified later in the qconf file. The priority is the default priority that jobs enqueued by the print program will be given. The print-hdr variable specifies the file to be used as the default line-printer header. The print-hdrdir variable specifies the name of a directory that holds a collection of header files for shared use. They can be referenced without giving their full pathnames by using the -H option of qpr. Typical values would be ``white'', ``lp'', ``64'', ``/usr/brl/lib/mdqs/header'', and ``/usr/brl/lib/mdqs/headers'', respectively.

vers-forms <form> verset

vers-queue <queue> verset

vers-prior <priority> verset

These control the default behavior of the verset program. The form should be a valid MDQS form. The queue must be a valid queue specified later in the qconf file. The priority is the default priority that jobs enqueued by the verset program will be given. Typical values would be ``roll'', ``verset'', and ``64'', respectively. The verset system is like VCAT but is local to BRL.

The next section contains the definitions for the logical devices. The form of each line is

``dname device status''

Dname is the logical name for a given device. There should be a one-to-one mapping from logical device name to real device name. The only time this should not hold is for pseudo-devices like ``net'' and ``batch'' which will normally use /dev/null for the ``real'' device. The logical device name is used as a convenience to refer to a particular device. The logical device name is specified by users when issuing qdev commands. The next entry, device, is the full pathname of the device (or file) in question. Lastly, status is a set of symbolic flags used to control the behavior of the device. There are several status flags currently implemented. ``anyform'' indicates that this device can accept requests regardless of what forms were specified for the request. This is used mostly for the network device. ``roundrobin'' causes the daemon to use a round-robin algorithm in choosing requests to run on the device. ``skipmsg'' disables the sending of completion messages on successful completion of a request on the device. This is useful on the network device where the user really wants to see the notification message from the remote system. Failure messages are not affected. As an example, here are some sample entries for devices:

     lp0     /dev/lp0
     lp1     /dev/lp1
     vp      /dev/versatec   roundrobin
     batch   /dev/null       anyform
     net     /dev/null       anyform,skipmsg

The list above should be fairly clear, with the exception of the last two entries, which are examples of logical devices which do not need a real device.

The next section describes the logical queues to which users can submit requests. These do not necessarily map to a specific device. The form of these entries is

``qname status''

The string qname is the name of the queue. The status field is fully implemented, but currently there is no need for flags on queues. This field should be left empty for now. Typical queue names include ``batch'' for the batch queue, ``lp'' for the primary line-printer queue, ``i300'' for a raw data queue for an Imagen laser printer, and ``i300-cat'' for a queue of C/A/T typesetter codes to be converted then requeued to ``i300''.

The last section is the mapping table. This table is used by the daemon when trying to find a request for an empty device. A mapping triple consists of an entry of the form

``qname dname server''

where qname and dname must have been previously defined in the qconf file. A mapping triple allows requests from qname to be serviced by the device dname using the specified server. The server is the full pathname of a program that will process the request after the daemon has opened the device. This process will be run non-privileged and aliased to the submitter of the request. The server process must follow certain rules to interact successfully with the daemon (see ``Server Process Specification'', below). It is this table that allows the feeding of multiple queues into a single device, and the emptying of a queue by multiple devices.

The significance and use of this table is clearer if the internal queueing procedure is understood, so here is an explanation. The internal data structures consist of a linked list of queues and a linked list of devices. Attached to each device is a linked list of mapping entries which map to that device (``the mapping table''). This list may be empty. When a device is found to be idle, the daemon goes to the mapping table for the device and starts scanning the queue entries. If an entry is found, the daemon looks in that queue to see if there is a request to be processed that has matching forms and is not being held. If so, that request is dequeued and assigned to the device. If no valid request is found, the daemon continues to scan the mapping table looking for another queue with eligible requests. The process continues until either a request is found or the end of the table is reached. This system creates the possibility of having one or more queues feeding one or more devices. Normally the linked list of mappings is searched sequentially until a request is found. This has the side effect of giving the first entry in the table preference by virtue of its position. The priority/position is determined from the order in which the mappings are defined in the qconf file; the first mapping defined has highest priority. There is an option to force round-robin examination of the queues which is enabled by specifying the ``roundrobin'' device status flag.
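The table-order scan just described can be sketched in a few lines of C. The structures below are simplified stand-ins with hypothetical field names, not the definitions from h/queue.h, and the round-robin option is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct request {
    const char *forms;       /* forms the request was submitted for */
    int held;                /* non-zero while the request is held */
    struct request *next;
};

struct queue {
    const char *name;
    struct request *head;    /* first request waiting in this queue */
};

struct qmap {                /* one (queue, server) entry on a device */
    struct queue *queue;
    const char *server;      /* pathname of the server program */
    struct qmap *next;
};

struct device {
    const char *forms;       /* forms currently loaded */
    int anyform;             /* non-zero: accept any forms */
    struct qmap *maps;       /* mapping table for this device */
};

static int eligible(const struct device *d, const struct request *r)
{
    return !r->held && (d->anyform || strcmp(d->forms, r->forms) == 0);
}

/* Scan the device's mapping table in table order.  The first eligible
 * request is dequeued and returned, and *used is set to the mapping
 * that supplied it.  Returns NULL (device stays idle) if none is found. */
struct request *next_request(struct device *d, struct qmap **used)
{
    struct qmap *m;
    struct request **rp;

    assert(d != NULL && used != NULL);
    for (m = d->maps; m != NULL; m = m->next) {
        for (rp = &m->queue->head; *rp != NULL; rp = &(*rp)->next) {
            if (eligible(d, *rp)) {
                struct request *r = *rp;
                *rp = r->next;       /* unlink it from the queue */
                r->next = NULL;
                *used = m;
                return r;
            }
        }
    }
    *used = NULL;
    return NULL;
}
```

Returning the mapping along with the request tells the caller which server program to start for the request.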

For a mapping example, let's say we own a Versatec printer/plotter and we wish to be able to use it for either printing or plotting. We might make the following two table entries:

     lp        vp        /usr/brl/lib/mdqs/vlpserver
     plot      vp        /usr/brl/lib/mdqs/pltserver

Here lp and plot are the names of the print and plot queues respectively, and vp is the logical name for the Versatec device. In this configuration, requests in the lp queue would be processed before requests in the plot queue, assuming that both had requests which specified the currently loaded forms and we were using the default (priority) system for assigning requests.

The other non-trivial mapping possibility is for a single queue to feed more than one device. This might arise if you had more than one line printer on your machine and you wanted all line printer requests to be serviced by the next available printer. In this case, the mapping table entries would look like:

     print     lp0       /usr/brl/lib/mdqs/lpserver
     print     lp1       /usr/brl/lib/mdqs/lpserver

This system gives a great deal of flexibility with very little work. To disable a device, all one would need to do is to comment out or remove all the mapping lines that would feed that device. In practice, enabling and disabling is normally handled with the qdev program.

Additional Configuration Possibilities

There are a couple of other places where tailoring may be necessary. Prime candidates for site-dependent changes are the files src/msgopen.c, src/netdae.c, src/netopen.c, and src/qlocks.c. Msgopen.c needs to know how to send mail on your system. Our system allows the inclusion of a subject line and the specification of the sender. Tailor this as necessary for your mail system. The files netopen.c and netdae.c need to know how to initiate and accept network connections for TCP/IP or similar networking protocols if your machine supports real networking to other machines. The qlocks.c file is where MDQS locking primitives are implemented. Unfortunately this is quite site-dependent since AT&T did not provide a good atomic locking mechanism until recently. We once installed the RAND exclusive open code in our PDP-11 kernels and found it to be a very nice mechanism for a system of cooperating processes such as our queueing system and mail system. A variation on the exclusive open code is provided for 4.1c/4.2BSD systems which also have an exclusive open. A second, inferior, locking mechanism is also provided for sites that do not have the RAND exclusive open. This second implementation relies on the atomic nature of the link system call. It creates a tmp file and then tries to link to the lock-file name. Unfortunately this system is not ideal and is susceptible to getting deadlocked if the system crashes or a program dies without removing a lock. This is the same problem that the news systems and UUCP have run into. I have not made an effort to make this a robust mechanism since it is inherently flawed, but I would welcome the opportunity to incorporate other implementations of the locking code into future releases. For those of you who must use this code, you will have to add code in /etc/rc to remove all the locks prior to starting the queueing system. 
It would probably also make sense to modify the locking code to ``expire'' locks whose mod times are older than X, and/or put the PID into the file. There are numerous possibilities for improvement. These extensions are left as an exercise for the maintainer. (Sorry.) A version of the locking code for System V-style fcntl() locking is also provided, but as of this writing it hasn't been tested.
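The link-based locking scheme, together with the suggestion of putting the PID into the lock file, might look like the sketch below. It is an illustration in present-day POSIX C, not the contents of src/qlocks.c; take_lock, drop_lock, and lock_owner are hypothetical names, and expiry of stale locks is not shown.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to take the lock.  Returns 0 on success; -1 if the lock is
 * already held (errno == EEXIST) or on any other error. */
int take_lock(const char *lockfile)
{
    char tmp[256], pid[32];
    int fd, n, rc, saved;

    assert(lockfile != NULL);
    n = snprintf(tmp, sizeof tmp, "%s.%ld", lockfile, (long)getpid());
    if (n < 0 || n >= (int)sizeof tmp)
        return -1;

    /* Create a private temporary file holding our PID. */
    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    n = snprintf(pid, sizeof pid, "%ld\n", (long)getpid());
    if (write(fd, pid, (size_t)n) != n) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    close(fd);

    /* link() is atomic: it fails with EEXIST if the lock name exists. */
    rc = link(tmp, lockfile);
    saved = errno;
    unlink(tmp);
    errno = saved;
    return rc;
}

/* Release the lock. */
int drop_lock(const char *lockfile)
{
    return unlink(lockfile);
}

/* Return the PID recorded in the lock file, or -1 if it cannot be read. */
long lock_owner(const char *lockfile)
{
    FILE *fp = fopen(lockfile, "r");
    long pid = -1;

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld", &pid) != 1)
        pid = -1;
    fclose(fp);
    return pid;
}
```

With the owner's PID on record, a cleanup pass could test whether that process still exists before deciding that a lock is stale.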

A Tour Through the MDQS Source

The Include Files

h/devstat.h

The devstat.h file contains definitions for the structures used in the device control files (home/adm/D_). Note that the main structure (struct devstat) is divided into two sections, daedevstat and usrdevstat. The daedevstat section is modified exclusively by the daemon. The usrdevstat section is modified exclusively by the qdev program. They are separated to eliminate any possibility of simultaneous update of the same information.

h/qreplys.h

This is a list of all the valid reply codes. The reply codes are used by the servers and interpreted by the daemon.

h/queue.h

This file contains the definitions for most of the data structures used by MDQS. The first is the request structure. The request structure is the internal representation used for request queueing information. The data is converted to ASCII for storage in the control files. All the information necessary for the daemon to queue and process a request is contained in this structure except for the submitter's address. The msgopen routine will fetch the address data from the text lines if needed.

The queue structure is the in-core representation of a queue. It has a name, a status, the head of a linked list of requests in this queue, and a link to the next queue. There is an internal linked list of queues, if you haven't guessed that already.

The device structure contains the logical name of the device, the pathname to the device, the current forms in the device (from reading the devstat file for that device), a status word, a count of the number of times this device has ``failed'', the PID of the subprocess currently serving this device, a pointer to the request currently being serviced, a pointer to the queue from which this request came, a pointer to the next device structure, a pointer to the head of a list of mappings which map queues to this device, and the time the devstat file was last modified.

The qmap structure simply has pointers to a queue and a server and a pointer to the next mapping. This is the internal version of (queue,device,server). Remember that this mapping hangs off a device structure, so we know what device is involved. All we need to keep around is the queue and server information.

The qdata structure defines the format of the entries in the queue list files (home/adm/Q_) which are written out by the daemon for informational purposes.

The file name format offsets are used to ease accessing the information contained in the queue request names which look like Q00044.123, where the first set of numbers is the UID and the second set is the per-user sequence number. The filename must start with a capital Q, followed by a five-digit decimal version of the UID, a dot, and a decimal representation of the sequence number.
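The request name format can be checked and taken apart with a few lines of C. The sketch below is illustrative only; parse_reqname and format_reqname are hypothetical names, and the MDQS source uses offset definitions rather than these helpers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a name of the form Q<5-digit uid>.<sequence>.
 * Returns 0 and fills *uid and *seq on success, -1 on a malformed name
 * (the outputs are then unspecified). */
int parse_reqname(const char *name, long *uid, long *seq)
{
    char *end;
    int i;

    assert(name != NULL && uid != NULL && seq != NULL);
    if (name[0] != 'Q')
        return -1;
    for (i = 1; i <= 5; i++)
        if (!isdigit((unsigned char)name[i]))
            return -1;
    if (name[6] != '.' || !isdigit((unsigned char)name[7]))
        return -1;
    *uid = strtol(name + 1, &end, 10);   /* stops at the dot */
    *seq = strtol(name + 7, &end, 10);
    return *end == '\0' ? 0 : -1;
}

/* Build a request name; returns 0 on success, -1 if it does not fit
 * or the UID cannot be written in five digits. */
int format_reqname(char *buf, size_t len, long uid, long seq)
{
    int n;

    assert(buf != NULL);
    if (uid < 0 || uid > 99999 || seq < 0)
        return -1;
    n = snprintf(buf, len, "Q%05ld.%ld", uid, seq);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

For the example name in the text, parse_reqname("Q00044.123", &uid, &seq) yields a UID of 44 and a sequence number of 123.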

The status structure is used to allow easy access to the return information from a wait() system call. This may in rare cases be system dependent. Be careful about putting in padding elements since the compiler may not pack them in the manner you expect. Using chars is probably safest.

The last structure of interest is the paramtab structure. This structure is used to hold information to be used in auto-configuring the MDQS programs. Essentially, you list a variable string to look for in qconf (e.g., /usr/brl/lib/mdqs/qconf), the variable to be modified, the type of variable it is, and an optional function to call to do the work.

At the bottom of this file are extern declarations for the variables ``errno'' and ``sys_errlist''. These are normally used by perror(), but are used here directly because we need more flexibility in printouts. The variable ``errno'' contains the system error number after a system error. ``sys_errlist'' is simply an array of string pointers where each string is an explanation of the errno number that was used as an index. These are in the standard C library.

h/rawreq.h

All request control files have this structure at the beginning of the file so that they can easily be read in without having to parse ASCII text lines. Since it is fixed-length, this imposes fixed maximum lengths for the request name, the queue name, and the forms name. This structure contains the information that is common to all the requests. The information in this structure is entirely ASCII strings so that it will be machine independent. There are two library routines which will convert between this structure and the internal request structure defined in h/queue.h (lib/gethdr.c, lib/puthdr.c). Request-specific information will follow in the control file as a series of text lines. The text lines are interpreted by the server process.

h/vp.h

This is used by the servers that play with the Versatec. It is probably valid only for Berkeley UNIX systems.

The Source Directory

src/batch.c

The batch(1) program queues requests for ``batch'' jobs that are to be run by the batch system. The first thing this program does is to set the effective UID/GID to the real UID/GID. This is done because later when the request is activated it will be started up with real and effective ID's set to the real UID/GID of the invoker. With this in mind, we want to test for file access and other permissions with the same UID/GID we will have later. Next it uses the rdconf() function to auto-configure from the information in qconf. You will see rdconf() called in most of the MDQS programs so I will explain it once here. You pass rdconf() a list of variables you expect to contain interesting information. An array of paramtab structures is used to hold this information. There is no requirement that the specified variables be defined in qconf, so programs should be compiled with the affected variables initialized to reasonable values. Another interesting part of this program is past the argument processing where it is setting up the request structure. There the forms are set up with the value of ``batch-forms'' from the qconf file. This will be some sort of dummy form since forms don't make a lot of sense in this forum.

The batch program will try to save the user's environment for re-creation (insofar as possible) when the request is activated. It first sets the shell to be used to the value of the environment variable ``SHELL'' (if found), unless it was specified by the -s option. If it is not found, /bin/sh will be used. Next, the function snapshot() is called. Snapshot() first calls dumpdir() to write out the current working directory. If this cannot be found, /tmp will be used. Following this, all the exported shell variables are recorded. These environment variables will be restored by the shserver program. The rest of the program is pretty vanilla ``queuer'' stuff.

src/copycnt.c

The copycnt() function will copy at most ``cnt'' characters from one file to another using STDIO. The ``source'' string is used to identify the source file when there is an error.
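
The following is a minimal sketch of a bounded STDIO copy of this kind. The name copy_count(), its argument order, and its return convention are assumptions made for illustration; the actual copycnt() interface should be taken from the source.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Copy at most cnt characters from in to out.
 * Returns the number of characters copied, or -1 on a read or write
 * error, in which case a message naming the source goes to stderr.
 */
long copy_count(FILE *in, FILE *out, long cnt, const char *source)
{
    long copied = 0;
    int c;

    assert(in != NULL && out != NULL && source != NULL);
    while (copied < cnt && (c = getc(in)) != EOF) {
        if (putc(c, out) == EOF) {
            fprintf(stderr, "copy_count: write error copying %s\n", source);
            return -1;
        }
        copied++;
    }
    if (ferror(in)) {
        fprintf(stderr, "copy_count: read error on %s\n", source);
        return -1;
    }
    return copied;
}
```

A short count is not an error in this sketch: it simply means the input ended before ``cnt'' characters were seen.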

src/dcontrol.c

The manipulation of requests is centered here. The scandevs function is called whenever we want to look for empty devices to restart. The algorithm goes like this:

    for each device {
        if the device is disabled
            continue;
        if the device is active
            continue;
        else if there is a job assigned to the device
                (but it's idle, since no active process)
            devstart the device;
        else the device needs a new request {
            for each mapping to this device {
                look at the queue and if there is an
                eligible request, assign it to this
                device and devstart the device.
            }
        }
    }

If, for a given mapping, the device is empty and there is no eligible request, then the device will be left empty in the hope that some other queue-to-device mapping to the same device will have an eligible request. If no mapping yields an eligible request, the device will be left idle. At the bottom of the loop is the code to implement the round-robin queueing. If there was a request found and the round-robin queueing option is selected, this code will rearrange the order of the queue so the next search will look at the oldest entry next.
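
The scan can be modelled in a few lines of C. This is only a sketch under stated assumptions: the structures are drastically reduced stand-ins for those in h/queue.h, eligibility is collapsed into a single flag, the round-robin reordering is omitted, and start() stands in for devstart(). None of the names below are taken from the MDQS source.

```c
#include <assert.h>
#include <stddef.h>

struct request { struct request *next; int eligible; };
struct queue   { struct request *head; };
struct qmap    { struct queue *queue; struct qmap *next; };
struct device {
    int             disabled;   /* set by the operator */
    int             active;     /* a subprocess is serving the device */
    struct request *current;    /* request assigned to this device */
    struct qmap    *maps;       /* queues that feed this device */
    struct device  *next;
};

/* Return the first eligible request on q, or NULL if there is none. */
static struct request *first_eligible(struct queue *q)
{
    struct request *r;

    for (r = q->head; r != NULL; r = r->next)
        if (r->eligible)
            return r;
    return NULL;
}

/* Example stand-in for devstart(): just mark the device busy. */
void mark_active(struct device *d)
{
    d->active = 1;
}

/*
 * Start every idle device that has, or can be given, a request.
 * Returns the number of devices started.
 */
int scan_devices(struct device *devs, void (*start)(struct device *))
{
    struct device *d;
    struct qmap *m;
    int started = 0;

    assert(start != NULL);
    for (d = devs; d != NULL; d = d->next) {
        if (d->disabled || d->active)
            continue;
        /* Empty device: try each mapping in turn until one yields work. */
        for (m = d->maps; m != NULL && d->current == NULL; m = m->next) {
            d->current = first_eligible(m->queue);
            if (d->current != NULL)
                d->current->eligible = 0;   /* claimed by this device */
        }
        if (d->current != NULL) {
            start(d);
            started++;
        }
    }
    return started;
}
```

A device whose mappings all come up empty simply keeps a null current request and is looked at again on the next scan.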

The enroll() function is called to put a new request into the internal MDQS queues. Enroll() is simply given the name of a control file that popped up in the ``new requests'' directory. It calls makereq() to actually open the control file and read in and verify the request structure. If makereq() cannot read in the request structure due to lack of space, then we have real problems, but we try to deal with the situation intelligently. Here we set the variable Waveoff. This will prevent further attempts to enqueue more requests until there is space available.

There are two other possible error returns. A -1 indicates that the request is unserviceable in some way (open error, read error, etc.). A return of -2 indicates a security violation. The -2 code is returned if the filename in the request structure and the actual filename do not match or if the UID in the file does not match the actual owner. In the case of either error return, the request is put in the holding directory.

The queue() function takes a request that has been read in and places it on the proper queue. There are basically three possible results of this operation. First, the request has not yet reached its start time, so it is inserted into the delayed request queue which is sorted by start time. Second, the request is ``ready'' to be run and is therefore sorted into one of the queues. In this case, sorting is done first by priority and then by submission time. Lastly, the request may not match any queue and is thrown out.
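
The two-key ordering for ready requests can be sketched as an ordered insert into a singly linked list. The structure below is a hypothetical reduction of the request structure, and the direction of the priority comparison (smaller value served first) is an assumption, not something this paper specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

struct req {                 /* hypothetical, much-reduced request record */
    int         priority;    /* assumed: smaller value is served first */
    time_t      submitted;   /* submission time */
    struct req *next;
};

/* Nonzero if a should be served before b. */
static int before(const struct req *a, const struct req *b)
{
    if (a->priority != b->priority)
        return a->priority < b->priority;
    return a->submitted < b->submitted;
}

/* Insert r into the list at *head, keeping the list in service order. */
void enqueue_sorted(struct req **head, struct req *r)
{
    struct req **pp = head;

    assert(head != NULL && r != NULL);
    /* Skip every entry that r does not precede, so ties keep arrival order. */
    while (*pp != NULL && !before(r, *pp))
        pp = &(*pp)->next;
    r->next = *pp;
    *pp = r;
}
```

The same routine would serve for the delayed request queue if before() compared start times instead.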

The function checkdelayed() is called by the main loop every few seconds to move requests from the delayed request queue into queues for appropriate devices.

The hold() function is used whenever we want to keep a copy of a request which causes the queueing system grief. The belief is that the system manager may want to look at these some time in the future.

The renew() function is used when the queueing system has to ``lay off'' a request because of a space limitation. If this happens, the request is linked back into the ``new-requests'' directory so that it will be re-entered into the queueing system at the next opportunity.

The update() function does the internal work for the ``qmod'' program. When a control file is modified using ``qmod'', only the modified request structure is written out, and then only into a new file with the same name as the original control file but in the modified requests directory. This means that when update is called with the name of a request file, it has two files to work with. The old one is used to ease the daemon's task of finding the request in the queues. Once found, the request is modified or deleted as necessary.

src/ddevstat.c

This file contains all the code for reading and updating the device status files. The devcheck() function is called from the main loop to find and read any device status changes caused by the qdev command. The devactive() function is used to set the current state of the device and is called by devstart() and pwait(). This function will set the current PID and daemon status bits in the file and read in any bits or forms the user may have changed. At the end of this function is the code that will re-enable a device marked as ``failed''. This is done by toggling the user-controllable ``disabled'' bit. The dsopen() function is used to open the device status file for a given device. It does the work of generating the filename and updates the ``last-modified'' time which is remembered in the device structure. Because we remember the time when we open the file, and then we write into it, we will often unnecessarily reread the file on the next pass through devcheck(), but this is a more robust algorithm which should prevent us from losing updates if the user updates at about the same instant we do.

src/dmain.c

First look at init(). Several interesting things happen here. The daemon chdirs to the control directory first. Next the daemon doubly detaches itself from its invoker to make sure we don't hang anyone who was manually invoking us. Several common signals are ignored, and SIGALRM and SIGTERM are trapped for special action. In particular SIGTERM will cause the daemon to die gracefully and to signal its children to die as well. All unnecessary file descriptors are closed and the daemon lock is set. This is to prevent more than one daemon from running simultaneously. An initial read of the configuration file is forced and the existence of the sequence file is ensured. If the sequence file is not present and writable to the world, the queuers (qpr, batch, etc.) will die unhappily. The prod file is then created. The function recover() is called to re-enroll all requests left over from the previous invocation of the queueing system, usually left around because the system was shut down or crashed. Finally, we set the ``delayupdate'' flag which will force the qlists() function to write out a qlist file for the delayed request queue. If this were not done here, the file would not be written out until the first file was enqueued onto the delayed request queue.

The main() function is quite simple: it just sits in a big infinite loop. The statfiles() function is called whenever the prod file is modified. Statfiles() will call other functions to act on any changes.

The prreq() function is only active when debugging is turned on and the value of the ``debug'' variable is more than 4. It will produce a detailed list of the status of the various queues on the stderr output every SCANWAIT seconds or so. This is quite verbose.

src/dmem.c

The first function, xmalloc(), is designed to act like malloc() except it always succeeds in getting the necessary space or it bomb()s the daemon. If necessary, xmalloc() will throw delayed requests back into the new directory in order to free up their request structures for use. For simplicity, we just burn the head of the delayed request list. Unfortunately for the queueing strategy, this means the person who is nearest to his start time is thrown out, but I couldn't see doing N-squared work to throw away the guy on the end each time I access the list. If this happens, we have real problems anyway.

The strdup() function makes a private copy of a string.

The zapdev() and zapque() functions are used to free all memory and files associated with a deleted device or queue.

src/dprocs.c

If there is any real workhorse module in the daemon, dprocs.c is certainly a contender. The first function, devstart(), is responsible for dispatching a filecontrol process to actually complete the user's request. After some paranoid checking of the device structure, the function does a fork. The parent (the daemon) immediately returns. The child is referred to as the ``filecontrol'' process and is commonly called filecontrol in this paper. Filecontrol has the responsibility for opening the device and control files and then forking and execing the server process as a child. First, the filecontrol process writes its PID and request name into the device status file via devactive() for the benefit of q(1) and qdev(1). Next it opens the device. If the device could not be opened, the filecontrol process will sleep for a period and then return to the daemon which will restart the request. This mechanism may be changed in the future for aesthetic reasons, but it keeps the daemon from sitting in a loop trying to open the device. Having filecontrol sleep also has the side benefit of allowing the request to be flushed.

Assuming the device was successfully opened, the control file is opened next. If access is denied for any reason, the request is flushed and held. The request is now committed.

The filecontrol process then sets up a pipe that will be used to communicate return status from the server to the filecontrol process without filecontrol having to do a wait on the process. This has been done so that there is no way for the server process times to be inherited by the parent process. The main reason for this is that on V6 and BRL PDP-11 systems, the ``process accounting'' totals are inherited by the parents of a process. In our system, the ultimate parent (INIT) is responsible for logging the accounting information which it gets by doing a fancy wait call that returns the better part of the user structure for the deceased process. I expect that the same type of accounting system may be developed for 4.2BSD as well. The basic hooks are already there through their wait3 system call. So to ensure that the work of the servers is accounted properly, they are orphaned from the daemon by having filecontrol exit to the daemon after reading the return info on the pipe. Since the server's parent is now dead, INIT inherits the orphaned child. One important thing to notice is that the pipe is forced to be on file descriptor 3.

Finally we fork the server process and exec it after a setgid, setuid, and setpgrp. When the read from the pipe completes (successfully or otherwise), the filecontrol process exits with the status code it got from the pipe or generated itself if there was a pipe read error.

The pwait() function is called every few seconds by the main loop. It tries to collect as many children as it can but sets an alarm so that it does not hang waiting for a process that will be a long time in dying. When a child is collected, each device is examined to see if that PID matches the filecontrol PID for that device. If the PID is not found in any device structure, then the child was a ``miscellaneous child'' (probably a mail(1) reply) and no action is taken. If the PID is found in a device structure, then the status code is examined to find out what to do next. If filecontrol died a horrible death, then we ``hold'' the request. Holding a request consists of copying the request control file into the ``hold'' directory so that a systems person can look at it and then delete the request from the queues. Filecontrol should ``never'' die a horrible death, hence the hold.

If the return was good, the exit code is checked. Based on its value a variety of actions can take place. Except for the replies of RP_NOPEN (device open failed), RP_NEXEC (exec of server failed), and RP_RESTART, the request is dequeued. Whenever a request is dequeued, the Waveoff flag is cleared to indicate that there is a free request to be used for some deserving process if we have run out of address space.

Currently the daemon is not privy to the form of the control files past the header structure, with two exceptions. The msgopen() function knows how to find the address line that immediately follows the header and contains the INTERNET return address for notifications. The src/dzapreq.c module also knows how to read the control lines and knows what lines specify data files in the data directory that can be safely removed. This function is used to clean up after flushed processes.

The termall() function is called from the module dmain.c when the daemon gets a SIGTERM signal. It uses the information in the device structures to send SIGTERM to all the filecontrol processes. Termchild() is called when filecontrol gets SIGTERM and in turn sends SIGTERM to the server if one is running.

src/dqlists.c

The daemon maintains a number of status files in the directory .../lock/home/adm which are used to find out the status of requests inside the daemon. First a file is written out for each logical queue, then one for the delayed requests which are lumped together in a single queue. Note that the new files are made in a temp file first and then a ``unlink; link; unlink;'' operation is done to make sure there is only a small window when the status files are non-existent or incomplete.
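
The ``unlink; link; unlink;'' sequence can be sketched as follows. The function name publish_status() and its arguments are hypothetical; the point is only that the complete contents are written under a temporary name before the real name is touched.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write text to the file tmp, then install it under the name target.
 * Returns 0 on success, -1 on error.
 */
int publish_status(const char *target, const char *tmp, const char *text)
{
    FILE *fp;

    assert(target != NULL && tmp != NULL && text != NULL);

    /* Build the complete new contents under the temporary name. */
    if ((fp = fopen(tmp, "w")) == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        unlink(tmp);
        return -1;
    }
    if (fclose(fp) == EOF) {
        unlink(tmp);
        return -1;
    }

    /* The target is missing only between the first two calls below. */
    if (unlink(target) < 0 && errno != ENOENT)
        return -1;
    if (link(tmp, target) < 0)
        return -1;
    return unlink(tmp);
}
```

On systems that provide rename(), a single rename(tmp, target) replaces the target atomically and closes the window entirely; that is a substitution for, not a description of, what the daemon does.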

src/drdconf.c

This file has the necessary routines for the queueing system to do run-time reconfiguration. From the first comment regarding interlocked access, one might suspect that an exclusive open should be used. This was rejected because the qconf file has to be generally readable, and a user could open the file for exclusive open and hold it, essentially preventing the daemon from reconfiguring.

First, all the current queue->device mappings are freed. Next, we mark each device and queue so that we can tell if one has been deleted from the configuration. Then we proceed to parse the configuration file dealing with each ``section'' in a separate manner. Filestate indicates what section of the file we are in. Extensive checking is done here to validate all lines. Lines with errors are complained about and discarded. Lines that are valid but unknown in the first section (parameter specifications) are ignored by the daemon and other processes since this section contains information that is used by all the MDQS processes, but not all information is needed by all processes.

Once the entire file has been processed, we go through the devices and queues expunging those that no longer exist. EXPUNGING A QUEUE THROWS AWAY ALL REQUESTS ASSOCIATED WITH THAT QUEUE. Removing devices only halts the current job in that device and requeues the request to be run again. Finally, the mapping tables (linked lists) on each device are reversed to represent the actual order of input. The list is built by adding new mappings to the head of the list, so the last entry would be scanned first in a search if the list were not reversed.
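
The build-at-head-then-reverse idiom looks roughly like this. The qmap structure here is a hypothetical reduction (a queue name stands in for the queue and server pointers), and the function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

struct qmap {                /* hypothetical, reduced to what the sketch needs */
    const char  *queue;      /* stands in for the queue and server pointers */
    struct qmap *next;
};

/* Add m at the head of the list, as is done while parsing. */
void push_map(struct qmap **head, struct qmap *m)
{
    assert(head != NULL && m != NULL);
    m->next = *head;
    *head = m;
}

/* Reverse the list in place and return the new head. */
struct qmap *reverse_maps(struct qmap *head)
{
    struct qmap *prev = NULL;

    while (head != NULL) {
        struct qmap *next = head->next;

        head->next = prev;
        prev = head;
        head = next;
    }
    return prev;
}
```

Pushing at the head costs constant time per line of input, and the single reversal at the end restores file order without keeping a tail pointer for every device.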

The setparam() function is called to set configuration variables. It currently knows how to handle integers and strings. The function is table driven and can do one of two things when a variable is encountered. The table describes the string to be matched (the variable name), the type of variable it is supposed to be (int, string), and optionally the variable address and a function to be called with the new value string array when you want to set the variable.

Note that the function is called with the same ``argv'' type of double pointer as the setparam() function is called with, so if the setting function wants the value instead of the variable name it has to bump the double pointer or access fp[1]. For integer variables, the values are just assigned, in the current implementation. The string variable ``console'' is set by calling the function setconsole() since a fair amount of checking needs to be done before the console is changed.
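
A table-driven setter of this general shape might look like the sketch below. The layout of struct paramtab, the type tags, the example variables, and the return codes are all assumptions made for illustration; only the idea (name, type, optional address, optional function called with the same argv-style pointer) comes from the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum ptype { P_INT, P_STR };          /* hypothetical type tags */

struct paramtab {                     /* hypothetical layout */
    const char *name;                 /* variable name to match */
    enum ptype  type;                 /* kind of value expected */
    void       *addr;                 /* variable to set, or NULL */
    int       (*func)(char **fp);     /* optional setter, or NULL */
};

/* Example variables; the names and defaults are illustrative only. */
int scanwait = 30;
const char *console = "/dev/console";

/* Example setter: fp[0] is the variable name, fp[1] the new value. */
static int set_console(char **fp)
{
    if (fp[1][0] != '/')              /* stand-in for the real checking */
        return -1;
    console = fp[1];
    return 0;
}

const struct paramtab example_tab[] = {
    { "scanwait", P_INT, &scanwait, NULL },
    { "console",  P_STR, NULL,      set_console },
};

/*
 * Apply one parsed line.  Returns 0 if the variable was set, 1 if the
 * name is not in the table, and -1 if the value could not be applied.
 */
int set_param(const struct paramtab *tab, size_t ntab, char **fp)
{
    size_t i;

    assert(tab != NULL && fp != NULL && fp[0] != NULL && fp[1] != NULL);
    for (i = 0; i < ntab; i++) {
        if (strcmp(tab[i].name, fp[0]) != 0)
            continue;
        if (tab[i].func != NULL)      /* the setter sees the same fp */
            return tab[i].func(fp);
        if (tab[i].addr == NULL)
            return -1;
        if (tab[i].type == P_INT)
            *(int *)tab[i].addr = atoi(fp[1]);
        else
            *(const char **)tab[i].addr = fp[1];  /* caller keeps fp[1] alive */
        return 0;
    }
    return 1;
}
```

Returning a distinct code for an unknown name lets each program quietly skip parameters that are meant for some other MDQS process.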

The functions dstatus and qstatus are used to parse the settings for the device and queue status variables. Currently there are no status flags for the queues. The devices have several possible flags.

src/drddir.c

Directory scanning (what there is of it) lives here. The rdnew() and rdmod() functions are almost identical save for what they do when they find a file. This module uses the POSIX directory access routines which makes the code independent of the format of the directory (see libndir/* if you do not have the code already and then USE IT!). Rmreq is pretty straightforward.

src/drecover.c

The recover module is called once when the daemon is invoked and is responsible for cleaning up after the previous daemon. All the request files in the control directory are scanned and queued into the daemon's internal queue. Any requests that do not fit in core are linked to the new request directory, which assures that they will be read in and processed later.

src/dumpdir.c

Dumpdir is used to record the current working directory in the specified request control file using the standard directory control line (D). If the current directory cannot be determined, /tmp will be used.

src/dzapreq.c

The daemon uses this function to clean up after flushed requests. Any control line characters that specify files in the data directory should be listed here so that the daemon knows to delete them.

src/hpljserver.c

This server is designed to drive Hewlett-Packard LaserJet printers over RS-232 lines. The hpljserver is a modified qumeserver that does the proper ioctl calls to set the terminal to 9600 baud and LITOUT mode. It passes control characters unchanged. The terminal is left in cooked mode so that the terminal can flow control the computer to keep its buffers from being overrun.

Special versions of dorestart and doflush that reset the printer forcing out the current page have been incorporated into the server code.

Hpljserver assumes that certain BRL-specific support software (sources in hplj.support) is being used; for a more generic interface use the ttyserver.

src/imeserver.c

This server is designed to drive an Imagen laser printer over the Ethernet interface. It calls on the improcess module so as to yield some accounting for Imagen usage.

src/improcess.c

This module knows how to ``read'' all versions of input to an Imagen running version 1.8 of the Imagen software. It should also be valid for later versions and perhaps earlier ones. It will return an accurate page count for the input sent. It understands Impress language, Daisy language, Printer language, and Tektronix language.

src/imtserver.c

This is like imeserver.c except this drives an Imagen over their RS-232 interface. It expects to be able to call the Imagen-supplied program ``IPS'' to actually send the data. Imtserver feeds IPS via a pipe. IPS talks to the Imagen using a proprietary packet protocol. IPS must be modified to act as a filter. The modifications are trivial (you basically comment out code), and a later version may have this mode of operation as an option.

src/log.c

This function is used to make a log of MDQS activity. The conditional code is used to take advantage of the ``open for append only'' mode if it is available.

src/lpserver.c

Lpserver is the first of two servers that I wrote for MDQS. Since then a number of other drivers have been written, primarily by my good friends at the Boeing Aerospace Corporation. This server is designed to handle most simple line printers, but it does not handle any special features. Some conditional code is used so that the same file can be used to produce several drivers by using the appropriate #ifdefs (see the Makefile for details). The only filtering this server does is to add formfeeds between the header and each subsequent file and to swallow unprintable characters, printing them using the ``^C'' notation for control characters, and ``\000'' type notation for all other unprintables. All of the fancy print line processing is in the function copyn() which is found in the file lib/copyn.c. There is provision for a print limit that can be set by the user and is normally infinite (r.r_size). For the lpserver, the value is interpreted as the number of lines to be allowed per request. The size value is a long integer.
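
The two notations for unprintable characters can be illustrated with a small helper. This is not the code in lib/copyn.c; the function name visible() is hypothetical, and the choice to pass newline, tab, and formfeed through untouched is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/*
 * Write a printable rendering of character c into buf, which must hold
 * at least 5 bytes.  Returns the number of characters written, not
 * counting the terminating NUL.
 */
int visible(unsigned char c, char *buf)
{
    assert(buf != NULL);
    if (isprint(c) || c == '\n' || c == '\t' || c == '\f') {
        buf[0] = (char)c;                 /* passed through unchanged */
        buf[1] = '\0';
        return 1;
    }
    if (c < 0x20)                         /* control character: ^@ to ^_ */
        return sprintf(buf, "^%c", c + '@');
    return sprintf(buf, "\\%03o", (unsigned)c);   /* DEL and 8-bit codes */
}
```

With this sketch, the byte 003 is rendered as ^C and the byte 0177 as \177.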

The server expects to find the control file open on the standard input and the device on the standard output. The first thing it does is to call the arginit() function which processes the command-line arguments passed to the server. Currently the only valid parameters are ``columns=N'', ``nheaders=N'', and ``ntrailers=N'', where N is some reasonable number, and ``noff'' to suppress the automatic form-feed between files. The signal trapping is set up for use when a restart or flush is performed. The setoptions() function is currently pretty dumb, and may need to be modified for your particular printers. One notably absent feature is an stty to turn on ``No form-feed at page break mode''. Our original PDP-11 lp driver automatically skipped three lines at the top and bottom of every page. Your particular driver's quirks should be handled here.

There are probably several ways this piece of code could be modified to make it more general and more flexible. I am considering taking the Berkeley filter idea and incorporating that, or coming up with another equally flexible method.

src/msgopen.c src/msgopen.mmdf.c src/msgopen.mail.c src/msgopen.file.c

This is a rather system-dependent module. It needs to know how to send mail on your system. Just tailor the sprintf accordingly. As distributed, it has the mail line for our system which allows the specification of a ``From:'' line and the ``Subject:'' line. One important detail to notice is that we do not call popen() since pclose does a wait system call, and for MDQS to work properly it must do all the waits in the pwait function. I have included a stripped down version of popen called procopen. There are three versions of this file currently: one for Bell-style mail, one for the MMDF system, and one that puts the message into the error log file. One of them should be copied to msgopen.c.

src/netaccess.c

This module implements network access restriction; currently it only verifies based on the originating queue and host.

src/netdae.c src/netdae.4.2.c src/netdae.11.c src/netdae.bbn.c (incomplete) src/netdae.inetd.c

The code for the network request server resides in this module. I currently have implementations for the Berkeley 4.2BSD, 4.3BSD, and 2.8BSD TCP/IP networking implementations. This daemon waits for incoming TCP connections on a specified socket and submits the incoming requests to the queueing system under the user name specified as the value of the ``mdqsid'' configuration variable. The network daemon forks a copy of the program ``netrecv'' to do the actual work but not before ensuring that the child is running under the ``mdqsid'' login. This module is definitely dependent on the type of Unix your system is running. Many systems may not even be able to support true network connections. As of this writing, a System V version is not available. The netdae program has a debugging option and an option to control the number of simultaneous connections being serviced. Another option controls which port netdae listens for requests on. It is normal for the netdae program to leave one or two children hanging in Zombie state until the next connection comes in. This is because the daemon spends most of its time waiting for a connection and only does waits after forking a child to process a connection. You are required to specify the program to handle the requests; normally this would be netrecv. In addition it is possible to specify arguments to be passed to netrecv. There are several versions of this file, one of which should be copied to netdae.c.

src/netopen.c src/netopen.4.2.c src/netopen.11.c src/netopen.bbn.c

Like the netdae module, the netopen code is also very dependent on the version of Unix your site is running. Netopen is passed a hostname and returns (if possible) a FILE pointer to the socket opened to that host. Versions of this file exist for Berkeley 2.8BSD, BBN's 4.1BSD, and 4.2BSD.

src/netrecv.c

This is a machine-independent routine for entering requests from the network into the queueing system. It basically expects a steady input stream of data after which it prints a message indicating the success or failure of the transfer. A message indicating success serves as a transfer of responsibility from the originating site to the receiving site. The originating site is then free to, and in fact obligated to, remove the request. The only fancy stuff in this module is in the ``doqueue'' function which has to change several of the fields in the header to reflect the new submitter. The netrecv program does little data verification; it relies on having an error-free network connection such as a TCP/IP virtual circuit.

src/netsend.c

Netsend is a standard MDQS server (dequeuer) used to send requests to remote machines via error-free network connections. It is the machine-independent portion of the program. It calls on netopen to open an actual connection to the remote host. Most of its work is involved with converting references to absolute paths into references to files in the data directory on the remote host. It has a couple of options worth noting. The -i option causes the program to use an interactive protocol similar to FTP. The -p option allows the use of an alternative TCP port number for the connection.

src/nextseqno.c

This function gets the next sequence number for the given UID. The sequence numbers are stored as ints in the Seqfile. Seqfile is a sparse binary file. The sequence number is found by seeking into the file by an amount proportional to the UID, then reading the int. The sequence file is then updated to reflect the change. This was done as a user convenience; a given user's sequence numbers will be monotonically increasing from 1. This should make it easier for him to remember what sequence number is associated with what job. Also, the sequence number will stay relatively small, especially if the sequence file is zeroed occasionally.
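
The seek-read-update cycle can be sketched as below. The function names are hypothetical, the slot is assumed to hold the last number issued, and any locking the real nextseqno performs is omitted from this sketch.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Open the sequence file for update, creating it if necessary. */
int open_seqfile(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDWR | O_CREAT, 0666);
}

/*
 * Return the next sequence number for uid from the sequence file open
 * read/write on fd, or -1 on error.
 */
int next_seqno(int fd, int uid)
{
    off_t   where;
    ssize_t n;
    int     seq = 0;

    assert(uid >= 0);
    where = (off_t)uid * (off_t)sizeof seq;   /* one int slot per UID */

    if (lseek(fd, where, SEEK_SET) == (off_t)-1)
        return -1;
    n = read(fd, &seq, sizeof seq);
    if (n < 0)
        return -1;
    if (n != (ssize_t)sizeof seq)   /* slot lies past end of file: unused */
        seq = 0;

    seq++;
    if (lseek(fd, where, SEEK_SET) == (off_t)-1)
        return -1;
    if (write(fd, &seq, sizeof seq) != (ssize_t)sizeof seq)
        return -1;
    return seq;
}
```

Slots that have never been written read back as zero, either because they lie beyond the end of the file or because they fall in a hole, which is why a fresh or zeroed sequence file starts every user at 1.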

src/pcopyn.c

This is a version of the lib/copyn.c module set up to support the Printronix line printer. Since it was based on an earlier version of copyn, it has little in common with the current version. It is used by the plpserver program, which is a derivative of the lpserver.

src/procopen.c

Procopen is a stripped-down popen() routine. My only question here is whether we should open /dev/null for stderr or just leave stderr pointing at the current console. I have opted to leave it pointing at the console for now.

src/procserver.c

This is used to process requests through a process instead of a device. The process is assumed to be ``sensible'' in that you can believe its exit code since success or failure of the request will be based on this result.

src/qdev.c

The qdev program is responsible for all device management activities for users. Qdev is used to enable/disable devices, change forms, restart requests, and flush requests. It uses the device status file to find the PID of the filecontrol process and to read and write forms and status information. The daemon is responsible for creating the device status file. The program starts out set-UID to the superuser so that it will be able to send signals to any server if their invoker is so privileged. Currently this is determined by looking at the GID. If a user is a member of the systems group, he is allowed to cancel or restart any request. If the user is not the superuser or a member of the systems group, the program reverts to being non-privileged. Since a non-privileged process cannot send signals to a process with a different UID, requests are protected from being flushed by other than the person making the request or the superuser. If the program is going to update any information in the device status file such as the forms, the program locks the file to prevent simultaneous updates. This will not prevent the daemon from updating the file. When the device status files are initially made, the forms are initialized to ``*Empty*'', so you always have to set the forms when you create a new device or if the device status files get removed.

src/qlocks.c src/qlocks.link.c src/qlocks.brl11.c src/qlocks.4.1c.c src/qlocks.4.2.c src/qlocks.rand.c src/qlocks.fcntl.c

This is another file that will probably need tailoring for most systems until we build up a good library of locking code for various systems. The code in qlocks.rand.c uses the RAND style of exclusive open to accomplish locking. On other systems this may not be available but there may be some equally good mechanism which is guaranteed to be atomic. As another example of a locking mechanism, I have written a second set of locking code (qlocks.link.c) that uses links to known files to set locks. This type of locking has some inherent problems though which have been discussed at length on both ARPANET and USENET so I won't go into it here; but suffice it to say, this module may want to be changed appropriately. If you do come up with changes to this for your type of UNIX, I would be very interested in getting the changes for incorporation in later releases. Doug Gwyn has provided an implementation for System V-style locking via the fcntl() system call, but as of this writing it hasn't been tested. The clearlock() and freelocks() functions have been included in most implementations for completeness, but they are not currently used. There are several versions of this file, one of which should be copied to src/qlocks.c.
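The link-based approach can be sketched roughly as follows. This is an illustration of the general technique only, not the code in qlocks.link.c; the function names link_lock() and link_unlock() and the temporary-file naming are assumptions. It relies on link() failing with EEXIST when the lock name already exists, and it shares the well-known weaknesses of this style of locking (for example, stale locks left behind by a crashed process).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical sketch of link-based locking.
 * Returns 0 if the lock was taken, 1 if it is already held, -1 on error. */
int link_lock(const char *lockpath)
{
    char tmp[1024];
    int fd, n, rc;

    /* A private file, named with our PID, to link from. */
    n = snprintf(tmp, sizeof tmp, "%s.%ld", lockpath, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;
    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    /* link() is atomic and fails with EEXIST if lockpath already exists. */
    if (link(tmp, lockpath) == 0)
        rc = 0;
    else
        rc = (errno == EEXIST) ? 1 : -1;

    unlink(tmp);                /* the private name is no longer needed */
    return rc;
}

/* Release the lock by removing the lock name. */
int link_unlock(const char *lockpath)
{
    return unlink(lockpath);
}
```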

src/qmod.c

This module implements the program ``qmod'' which is used to alter the general queueing parameters of a request after the job has been spooled. The program can only modify the information in the header record of a request. The daemon is informed of the changed request by writing the new request record into a file in the mod directory. The original request file is left unaltered in the control directory. The daemon will update the copy in the control directory after it has verified the request. Both files have the same name as the old request. The old request is kept around so the daemon can use the information in that request record to locate the request efficiently. The process() function does all the interesting work and is called once for each request to be modified. The ``qmod'' program resembles the ``batch'' and ``qpr'' programs in its parsing section.

src/qpr.c

The qpr program is very similar to the batch program described above. One of the major changes is that multiple input files can be specified, hence the loop at the bottom of main. If multiple files are specified and none can be opened, nothing is queued and an appropriate message is printed. The mkdata() function is responsible for creating filenames for the data. These need to be unique on the machine or machines in the queueing system. The current algorithm (which has worked well here) is to create a name using the PID and the time printed in hex.
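The naming scheme can be sketched as below. This is only an illustration: the helper name make_data_name(), the leading ``d'', and the dot separator are assumptions, not the format that mkdata() actually produces. The PID and time are passed in so the function is easy to exercise.

```c
#include <stdio.h>

/* Hypothetical sketch: build a data file name from a PID and a time,
 * both printed in hex. Returns 0 on success, -1 if buf is too small. */
int make_data_name(char *buf, size_t len, unsigned long pid, unsigned long now)
{
    int n = snprintf(buf, len, "d%lx.%lx", pid, now);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would typically pass (unsigned long)getpid() and (unsigned long)time(NULL). Note that a name built this way is only unique across machines if the PID and time pair is, which is why the uniqueness requirement above matters for networked queues.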

This program makes use of a number of configuration variables including the ``print-hdrdir'' variable that allows the specification of a set of ``system'' headers by just giving the simple name of the header file. Qpr will also allow you to specify the ``Source File:'' line on the header by giving a -t (for ``title'') option to the program. The program is careful to clean up after itself if it is interrupted before calling the ``queue2'' program.

src/qstat.c

The ``qstat'' program is the MDQS status program. It runs privileged so that it can read each of the control files, but it may need to be restricted at your site. There are two modes this can be used in. In the normal mode, it uses the queue status files created by the daemon to find the relevant information. The other mode, Slow mode, is invoked by using the -s option. In Slow mode, the ``qstat'' program treks through each control file in the control directory and prints the relevant information from each one. The program will also use the device status file to give information about the devices. The fast mode is handled in the pqueue() function, while the Slow mode is handled in the loop at the bottom of main. The maximum size for certain fields is hard-coded into the print functions towards the end of the module. If you modify the request structure, these will need to be modified.

src/queue2.c

``Queue2'' is the second half of the queueing process. It is set-UID so that it can get into the MDQS spooling directories which are normally protected. It immediately changes UID and GID back to the real UID and GID so that permissions are properly honored. The program is called with the name of a control file to be found in the generally accessible qtmp directory. Queue2 has enough knowledge about the format of control files to find lines that reference copied data files so that it can move them down into the spooling directories. First, each data file is moved into the data spooling directories. Next, we lock the sequence file and get the next sequence number for this user. Using that number we try to creat a unique control file name. If we succeed, we link the control file to this new name in the control directory and we are done.

src/qumeserver.c

The qumeserver is a stripped-down lpserver that knows how to stty a terminal properly for Qume/Diablo style output. It passes control characters unchanged. It will accept an argument which is the speed the device is configured for; this argument is simply the numeric baud rate (e.g. 1200, 9600, etc.). The default is compiled-in as 1200 baud on our systems. Handling of XON/XOFF is specifically left enabled so that the terminal can flow-control the computer to keep its buffers from being overrun.

The qumeserver is probably obsolete now that the ttyserver exists.

src/shserver.c

This is the server which runs batch jobs; as such it is somewhat special. The batch processor tries to recreate the environment when the batch request was made insofar as possible. This means trying to chdir to the appropriate directory and setting up the exported environment variables as they were when the request was made. Currently there is a limit to the number of environment variables that can be specified (100), but if someone would like to generalize the code, they are welcome. The shserver has its own trap routines for restart and flush. This was done so that the shserver can terminate the processes running under it.

src/ttyserver.c

The ttyserver, supplied by Doug Gwyn, is intended to be used for practically any kind of device attached via a terminal port (usually an RS-232 connection). We have used it for Hewlett-Packard pen plotters, LaserJet laser printers, and non-protocol serial-line connections to Imagen laser printers. The source code is designed so that it can be compiled on any version of UNIX (7th Edition or later); it will automatically configure itself to cope with known variations among terminal handlers. The ttyserver always uses DC3/DC1 (XOFF/XON, ^S/^Q) flow control, which is a near-universal de facto standard; if the device for some reason does not use flow control, this will have no adverse effect.

The only output data transformations performed are some that are selectable in the terminal handler: newline-to-CR/LF mapping (enabled by default; disabled by the ``crmod'' command-line or user option), horizontal-tab expansion (disabled by default; enabled by the ``xtabs'' command-line or user option), and a couple of convenience features: banner-page printing (disabled by default; enabled by the ``banner'' command-line option), and form-feed after each file (enabled by default; disabled by the ``noff'' command-line or user option).

There are also options pertaining to hardware port parameters: ``nohang'' to avoid disconnecting when DTR is lost (as occurs with Hewlett-Packard devices whenever they need to stall output), ``speed=N'' where N is the serial bit rate (9600 default), and ``parity=X'' where X is any of ``none'' (8-bit wide path, default), ``even'', ``odd'', ``mark'', ``space'', or ``any''.
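The option set and defaults described above can be summarized in a small parser. This is a hedged sketch rather than the ttyserver source: the structure tty_options and the functions tty_defaults() and tty_option() are hypothetical names, and the sketch does not distinguish command-line options from user options (``banner'' is command-line only).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the ttyserver options described in the text. */
struct tty_options {
    int map_newline;     /* newline-to-CR/LF mapping; "crmod" turns it off */
    int expand_tabs;     /* horizontal-tab expansion; "xtabs" turns it on */
    int banner;          /* banner page; "banner" turns it on */
    int form_feed;       /* form feed after each file; "noff" turns it off */
    int nohang;          /* do not disconnect when DTR is lost */
    long speed;          /* serial bit rate */
    const char *parity;  /* one of the six parity names */
};

/* Fill in the documented defaults. */
void tty_defaults(struct tty_options *o)
{
    o->map_newline = 1;
    o->expand_tabs = 0;
    o->banner = 0;
    o->form_feed = 1;
    o->nohang = 0;
    o->speed = 9600;
    o->parity = "none";
}

/* Apply one option string. Returns 0 if recognized, -1 otherwise. */
int tty_option(struct tty_options *o, const char *arg)
{
    static const char *const parities[] = {
        "none", "even", "odd", "mark", "space", "any"
    };
    size_t i;

    if (strcmp(arg, "crmod") == 0) {
        o->map_newline = 0;
    } else if (strcmp(arg, "xtabs") == 0) {
        o->expand_tabs = 1;
    } else if (strcmp(arg, "banner") == 0) {
        o->banner = 1;
    } else if (strcmp(arg, "noff") == 0) {
        o->form_feed = 0;
    } else if (strcmp(arg, "nohang") == 0) {
        o->nohang = 1;
    } else if (strncmp(arg, "speed=", 6) == 0) {
        char *end;
        long v = strtol(arg + 6, &end, 10);

        if (end == arg + 6 || *end != '\0' || v <= 0)
            return -1;
        o->speed = v;
    } else if (strncmp(arg, "parity=", 7) == 0) {
        for (i = 0; i < sizeof parities / sizeof parities[0]; i++) {
            if (strcmp(arg + 7, parities[i]) == 0) {
                o->parity = parities[i];
                return 0;
            }
        }
        return -1;
    } else {
        return -1;
    }
    return 0;
}
```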

src/uucpsend.c

The uucpsend program is used to feed MDQS requests to another machine via the UUCP network system. The basic strategy is to invoke netrecv on the remote system and feed a big file in on the standard input. The format of the input is basically the same as that used on live virtual circuits, but no interactive checking is done. MDQS considers the request sent when uux accepts the job. This does not ensure that the request made it to the remote site, unlike the netsend program.

src/vcatn.c

This is a version of VCAT modified to run as part of an MDQS server. This is the typesetting software; the MDQS interface is in vcatserver.c. (From BAC, untested.)

src/vcatserver.c

The MDQS interface for a VCAT that is set up to run as an MDQS server. (From BAC, untested.)

src/vcopyn.c

This is an old copyn modified for use with the Versatec printer/plotter running in print mode as a line printer. It is used by vlpserver. (From BAC, untested.)

src/versaplot.c

An MDQS interface for VERSAPLOT. (From BAC, untested.)

src/verset.c

Verset is an MDQS interface for the VERSET typesetting emulation package. Like VCAT it writes to the Versatec, but it had some special requirements to satisfy. It is a good example of a queuer (user interface) designed to handle a special application.

src/vpl.c

Based on vtc.c (below), this subroutine outputs UNIX ``plot'' format plot files on the Versatec. It is used by the vplserver.

src/vplot.c

This is part of a Unix vplot front end to the ``qpr'' command. I don't know much about this one but I think it accepts the standard UNIX device-independent plot commands and drives a Varian or Versatec electrostatic plotter. (From BAC, untested.)

src/vplotn.c

Vplotn() dumps raster images to the Versatec. It is used by vplotserver.c.

src/vplotserver.c

This is the control portion of the vplotserver program.

src/vplserver.c

This server outputs UNIX ``plot'' format graphic files on a Versatec electrostatic printer/plotter.

src/vserver.c

This is the server for the verset command.

src/vtc.c

Used by vtcserver.c, this contains a very good rasterization algorithm (thanks to Mike Muuss and Doug Gwyn) for outputting graphic line drawings on a Versatec printer/plotter.

src/vtcserver.c

This server is used to plot the BRL developed TIGpack (Terminal Independent Graphics package) plots on the Versatec. The real work is done by vtc(). TIGpack has been superseded by an enhanced version of standard UNIX plot.

The MDQS Library Routines

lib/banner.c

The banner subroutine is now just a call to the bprint library. The bprint routine is very nicely done and is all in core. My thanks to Boeing for integrating this into MDQS.

lib/bomb.c

Bomb is called by a number of programs when they want to print an error message on stderr and then die.

lib/checknum.c

Validation of numbers is packaged here for general use. The string must contain numerals, and the number must fall within the specified limits. The result is stored through the supplied pointer; a return of 0 indicates a valid number was given.
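A minimal sketch of this kind of validation follows. The name check_number(), the argument order, the use of long, and the decision to reject signs and surrounding white space are assumptions; only the contract (digits required, limits enforced, result stored through a pointer, 0 meaning valid) comes from the description above.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical sketch: accept s only if it is a non-empty string of
 * decimal digits whose value lies in [lo, hi]. On success store the
 * value through result and return 0; otherwise return -1. */
int check_number(const char *s, long lo, long hi, long *result)
{
    const char *p;
    long v;

    if (*s == '\0')
        return -1;
    for (p = s; *p != '\0'; p++)
        if (!isdigit((unsigned char)*p))
            return -1;

    errno = 0;
    v = strtol(s, (char **)0, 10);
    if (errno == ERANGE)        /* too many digits to fit in a long */
        return -1;
    if (v < lo || v > hi)
        return -1;
    *result = v;
    return 0;
}
```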

lib/chrcnv.c

This is a translation table used by lib/lex*equ.c to convert lower-case to upper-case.

lib/ckforms.c

This is pretty simple. The only item of note is that if the Formsfile is not readable, all forms will be considered valid.

lib/copy.c

File-to-file copy routine. It is passed the name of the source file only for printing error messages.

lib/copyn.c

A routine to copy a specified file to the standard output which is assumed to be a simple-minded line printer with overstrike capability. This routine does a lot of fancy line processing to handle overstriking, underlining, and printing of non-printing characters. See the excellent comments at the top of the module for more details. Many thanks to Doug Gwyn for the work on this routine.

lib/copyn.sim.c

A dumb version of the above function. Kept in case we need to build a copyn function for a really stupid printer, or a really smart one (they are hard to tell apart sometimes).

lib/dup2usg.c

A dup2 subroutine which makes the appropriate fcntl() system call for a Unix System III or System V.
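The usual way to emulate dup2() with fcntl() is shown below as a sketch; it is not copied from lib/dup2usg.c, and the name dup2_via_fcntl() is hypothetical. It depends on F_DUPFD returning the lowest free descriptor greater than or equal to its argument, so the target is closed first. Unlike a real dup2(), the close and the duplicate are two separate steps and therefore not atomic.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical sketch: make newfd a copy of oldfd using F_DUPFD.
 * Returns newfd on success, -1 on failure. */
int dup2_via_fcntl(int oldfd, int newfd)
{
    int fd;

    if (fcntl(oldfd, F_GETFD) < 0)   /* oldfd must be open */
        return -1;
    if (oldfd == newfd)
        return newfd;

    close(newfd);                    /* free the target slot; may fail harmlessly */
    fd = fcntl(oldfd, F_DUPFD, newfd);
    if (fd != newfd) {               /* someone else grabbed the slot, or error */
        if (fd >= 0)
            close(fd);
        return -1;
    }
    return fd;
}
```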

lib/err.c

This module is trivial.

lib/fdverify.c

This subroutine is used by all the programs that are considered ``trusted'' by the MDQS system. This includes all processes that run privileged and/or have access to the queue directories. It verifies that file descriptors 0, 1, and 2 are all open and opens these file descriptors if necessary. It also forces all other file descriptors to be closed. This is all in the name of paranoid/defensive programming.
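The idea can be sketched as follows. This is an illustration under stated assumptions, not the contents of lib/fdverify.c: the name verify_std_fds(), the choice of /dev/null as the file to open, and the caller-supplied upper bound on descriptors are all hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical sketch: make sure descriptors 0, 1 and 2 are open
 * (opening /dev/null where needed) and close descriptors 3..maxfd-1.
 * Returns 0 on success, -1 if a standard descriptor could not be opened. */
int verify_std_fds(int maxfd)
{
    int fd;

    for (fd = 0; fd <= 2; fd++) {
        if (fcntl(fd, F_GETFD) < 0) {
            /* open() returns the lowest free descriptor; since all lower
             * ones have been verified, that should be exactly fd. */
            int nfd = open("/dev/null", O_RDWR);

            if (nfd != fd) {
                if (nfd >= 0)
                    close(nfd);
                return -1;
            }
        }
    }
    for (fd = 3; fd < maxfd; fd++)
        close(fd);              /* errors on unopened descriptors are ignored */
    return 0;
}
```

The point of the exercise is that a privileged program must not inherit a state in which, say, descriptor 2 is closed and the first file it opens for writing silently becomes its stderr.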

lib/findque.c

Findque is a simple function for use by the queuers to determine whether the named queue is currently configured. It is not permitted to queue a request to a non-existent queue.

lib/ftimeusg.c

This is another fake for use with USG Unix systems.

lib/getdate.y

This contains a date/time parser that is used to parse the specified start times. See the getdate.3 manual page.

lib/gethdr.c

The gethdr() subroutine should be the only means of reading in the header from a control file. It handles all the conversion from machine-independent ASCII to internal request structures.

lib/getwd.c

This is a subroutine to get the current working directory by calling the pwd program and collecting the output. On 4.2BSD systems, the getwd() subroutine in the C library is used.

lib/lexequ.c lib/lexnequ.c

These are like strncmp but the sense of the return value is reversed, and they make a case-independent comparison.
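A sketch of such comparison functions is given below. I am reading ``reversed'' to mean that a non-zero value is returned when the strings match and zero when they differ; that reading, the names lex_equ() and lex_nequ(), and the use of toupper() in place of the lib/chrcnv.c table are assumptions.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical sketch: return 1 if the first n characters of a and b
 * are equal ignoring case (or both strings end sooner), else 0. */
int lex_nequ(const char *a, const char *b, size_t n)
{
    while (n-- > 0) {
        unsigned char ca = (unsigned char)*a++;
        unsigned char cb = (unsigned char)*b++;

        if (toupper(ca) != toupper(cb))
            return 0;
        if (ca == '\0')         /* both strings ended together */
            return 1;
    }
    return 1;
}

/* Whole-string version: compare until the terminating NUL. */
int lex_equ(const char *a, const char *b)
{
    return lex_nequ(a, b, (size_t)-1);
}
```

With this convention a test reads naturally as ``if (lex_equ(arg, "laser"))''.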

lib/mdqsmanager.c

This function returns non-zero if and only if the user is considered to have MDQS operator privileges, as determined by group ID or presence in the manager file.

lib/mkcontrol.c

This is used by all the enqueuers to create control files.

lib/mkdata.c

This is used by all the enqueuers to create data files.

lib/parse.c

Parse() is a simple little function that takes a string, breaks it up into n fields, and returns the number of fields parsed. The pointer to the individual string pointers (like argv) is returned in the supplied pointer location. Leading and trailing white space is removed. The parse() routine supports quoted strings of arbitrary characters.
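The field-splitting behavior can be sketched as below. This is not the MDQS parse(): the name split_fields() is hypothetical, the caller supplies the pointer array instead of receiving one through a pointer, the input is modified in place, and only double quotes are recognized.

```c
#include <ctype.h>

/* Hypothetical sketch: split line in place into at most maxfields
 * white-space-separated fields. Text inside double quotes is kept
 * verbatim (including white space) and the quotes are dropped.
 * Returns the number of fields stored in fields[]. */
int split_fields(char *line, char **fields, int maxfields)
{
    int n = 0;
    char *p = line;

    while (n < maxfields) {
        char *out;

        while (*p != '\0' && isspace((unsigned char)*p))
            p++;                        /* skip leading white space */
        if (*p == '\0')
            break;

        fields[n++] = out = p;
        while (*p != '\0' && !isspace((unsigned char)*p)) {
            if (*p == '"') {
                p++;                    /* drop the opening quote */
                while (*p != '\0' && *p != '"')
                    *out++ = *p++;
                if (*p == '"')
                    p++;                /* drop the closing quote */
            } else {
                *out++ = *p++;
            }
        }
        if (*p != '\0')
            p++;                        /* step past the separator first */
        *out = '\0';                    /* then terminate the field */
    }
    return n;
}
```

Because characters are only ever copied toward the start of the buffer, the write pointer never overtakes the read pointer, so no extra storage is needed.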

lib/prheader.c

The prheader() function is used to produce line-printer header pages in a standard format. It will always produce ``numhdrs'' headers and expects a number of other global variables to be available as well.

lib/proddaemon.c

This routine should do whatever is necessary to get the daemon to look for new requests or changes in device status. Once that meant updating the access time on the prod file, but this was changed to update of the modification time when we discovered that Sun's NFS remote filesystem did not properly track remote file access times.

lib/prtrailer.c

Prtrailer() is like prheader() except it prints trailers; there are no user-specifiable files involved.

lib/puthdr.c

This function is the opposite of gethdr(). It should always be used when writing out control file headers.

lib/queueit.c

All the enqueuers call this function to actually enqueue a request. This function in turn calls the queue2 program. Queue2 is the trusted program that actually enters a request in the queue.

lib/rdconf.c

This is the function used by most of the MDQS programs other than the daemon to auto-configure based on the information in the first section of the qconf file. It is passed an array of information about the variables to be configured; if there is an entry in the qconf file for that variable, rdconf() will set up the program's internal variable with that value.

lib/restart.c

The two functions in this module are called when the appropriate signal is received by a server process (lpserver or shserver for example). They return the proper reply by writing the reply code on the pipe to the filecontrol process. Servers may provide their own version if they want to take other actions upon being signaled.

lib/retmsg.c

This provides a convenient means to return a status code and a message to the filecontrol process (child of daemon).

lib/rmdata.c

This is a cleanup function for use by the enqueuers so that they don't leave trash queue files around when they exit due to interrupt or error.

lib/statmsg.c

This writes a device status message into the device control file for printing by the qdev program. The string can be any informative string up to some reasonable number of characters (see qdev.c).

lib/strdup.c

This is just a handy little function to provide a copy of the supplied string in dynamically-allocated memory.

lib/xmalloc.c

A subroutine that checks the return from malloc and exits if it fails.
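For completeness, here is a sketch of a checking allocator together with a string-copy helper built on it, in the spirit of lib/xmalloc.c and lib/strdup.c. The names checked_malloc() and dup_string(), the error message, and the exit status are assumptions.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: malloc() that never returns NULL to the caller. */
void *checked_malloc(size_t n)
{
    void *p = malloc(n ? n : 1);    /* avoid the ambiguous malloc(0) case */

    if (p == NULL) {
        fprintf(stderr, "out of memory (%lu bytes)\n", (unsigned long)n);
        exit(1);
    }
    return p;
}

/* Hypothetical sketch: copy s into freshly allocated memory. */
char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;       /* include the terminating NUL */
    char *copy = checked_malloc(n);

    memcpy(copy, s, n);
    return copy;
}
```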

libbprint/*

This library is only partially portable to non-Berkeley Unix systems. Specifically, the large character set will not be able to be compiled without the long identifiers capability in the C preprocessor (recently also supported by USG systems). Since MDQS only uses the small characters, this does not pose a problem for us, but if you try to use the library for other purposes, there may be problems (see chset.h and chset1.h).

Writing Your Own Queuer

Although most queueing can be done simply via the qpr utility, there will probably be a need for you to write your own queuer to handle some specialized queueing task. The rules for doing this are fairly straightforward. First, the queuer process is expected to do as much as possible to verify arguments, file access, etc., so that errors are reported to the user at submission time. The batch and qpr programs are good examples of this type of program. The job of a queuer process is to create a control file and to spool the input data as necessary. The queuer process runs non-privileged and creates all of its files in the public access directory Tmpqdir[]. When it has finished creating these files, it calls the program ``queue2'', which is set-UID to the owner of the queueing system lock directory. Queue2 is called with the name of the control file made by the queuer and optionally a ``-v'' if the verbose flag is set. All this is handled in the queueit() subroutine (see lib/queueit.c). Queue2 moves the data files into the protected Qdatadir[], and then queues the control file which is moved into Qcntrldir, and then linked into the ``new requests'' directory. You should not have to write a queue2 program.

The control file is the only critical part of the queueing process. The control file consists of a raw request structure followed by one or more text lines. The first line should be of the form ``Uaddress'', where address is a valid mail address. The address is used by the daemon when it wants to send mail indicating success or failure to the person who made the request. The daemon will only look at the first line. There is one other type of text line that is special to the queueing process. If a file containing data was queued into Tmpqdir[], then it should be classed as an input file, and prefixed with the letter I (i.e. ``Ifilename''). The same is true of banner data (for header pages) that is stored in the data directory by using ``Bfilename''. The full path should not be used, just the name of the file. The filename should be unique, and there is a function in the MDQS library to create each one (mkcontrol() and mkdata()). The second stage queueing process looks for ``Ifilename'' and ``Bfilename'' lines in the control file and moves those files to Qdatadir. If the user asks that a file not be copied, or if you don't want to queue a file into the Qdatadir, then use some other letter to specify the file; lpserver uses the letter ``F'' for this purpose and this should probably be used by other servers as well.

All the other lines are designed to have service-specific information such as input files, option lines, header file lines, etc. The only requirement is that a queuer and server agree on what the information will be. It is possible that the queuer process may be acting on behalf of someone who does not have an account on the system. In this case, the queuer should setgid and setuid to some generally harmless account before creating the control file or the datafiles. Both effective and real IDs should be changed. The owner of the control file and the UID and GID in the request structure MUST match.

Writing Your Own Server

As with queuers, there will undoubtedly be a need for many people to write specialized queue server processes for MDQS. In fact, the need for new servers arises much more frequently than for new queuers. (``qpr'' is generally sufficient for queueing.) The servers initially supplied with MDQS are good models to use, particularly lpserver. The basic environment of a server process is that its standard input is the control file, its standard output is the device of interest, and stderr is some appropriate place. The server will be execed in the directory Qcntrldir[] and will be running with the UID and GID of the person who queued the request.

The control file (standard input) contains first the request structure. All information after the request struct will be in plain ASCII text with a key letter followed by some text, one per line. These text lines are where ``device specific'' information should be placed. Following the request structure, there should be a line which gives the address of the requester, used to send notifications by mail. The line is not guaranteed to be there. If it is, it will be of the form ``Uaddress''. The rest of the control file consists of lines that are meant to be interpreted by the server process. This generally includes an options line, ``Onumber'' or ``Ooption,option,...'', and the names of files to be processed. The first character of each line has been chosen to indicate what the rest of the line contains. This is an arbitrary choice, and in fact the control file can contain almost anything after the request structure. For the sake of consistency, we have standardized the use of some of these lines. Changes or additions should be coordinated with our master source or they may not make it to further distributions. I suggest use of the extended argument facility (see ``X'' below) whenever possible to avoid other changes to these string codes.

As you can see above, ``U'' is being used for the username/return address information. ``O'' is used to specify options to the server. ``I'' has been used to indicate an input file which is located in the Qdatadir[]. A file specified using ``I'' should be deleted upon completion. ``F'' has been used to indicate an input file that has been specified using a full pathname. This file should NOT be deleted upon completion unless the user so indicates to the server through some option. ``T'' is used to specify the ``title'' or name of a file for printing in the header page or on listings from the ``q'' program. ``E'' is used to pass along environment variables when appropriate. ``X'' is used to pass along arbitrary strings to the server. This is used by the ``extended argument'' facility of qpr and should be used by any queuer/server pair that must communicate fancy arguments between themselves (e.g. fonts for a laser-printer server). The following letters are also in use: ``A'', ``B'', ``H'', ``P'', and ``S''.

As an example, the lpserver I wrote uses eight key letters. ``U'' indicates the submitter as above. ``Onumber'' gives the options for this print request. The number is converted back to an integer using atoi(3) and the resulting integer is used to determine the options. There is no reason why the options could not have been specified using ASCII strings; I just used the numbers to save some time. ``Hheaderfile'' contains the full pathname to a user-supplied header file (this line is optional). ``Bdatafile'' also contains a header, but it is stored in the data directory instead of being referenced with a full path. ``Ifile'' gives the name of a file to print from Qdatadir[]; it should be removed after that file has been printed and before the next is started. This prevents the file from being printed twice if the system crashes in the middle of a big job. ``Ffile'' gives the name of a file to be printed using the full pathname to the user's file. This file is not deleted except on explicit request. The ``Ttitle'' line is used to include the actual source of the following data file. Finally, the ``Ddirectory'' line is used to cause the daemon to change its working directory. All other lines are ignored.
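A server's handling of these text lines might be organized along the following lines. This is a sketch only and is not taken from lpserver: the enumeration, the structure ctl_line, and the function classify_ctl_line() are hypothetical names, and the caller is assumed to have already read past the request structure and stripped the trailing newline from each line.

```c
#include <stddef.h>

/* Hypothetical classification of the eight lpserver key letters. */
enum ctl_kind {
    CTL_USER,       /* U: submitter's mail address */
    CTL_OPTIONS,    /* O: options (a number for lpserver) */
    CTL_HEADER,     /* H: full pathname of a user-supplied header file */
    CTL_BANNER,     /* B: header data stored in the data directory */
    CTL_SPOOLED,    /* I: input file in Qdatadir[] */
    CTL_FULLPATH,   /* F: input file given by full pathname */
    CTL_TITLE,      /* T: title for the following data file */
    CTL_DIR,        /* D: directory to change to */
    CTL_IGNORED     /* anything else */
};

struct ctl_line {
    enum ctl_kind kind;
    const char *arg;     /* text following the key letter */
    int delete_after;    /* non-zero if the file is removed once processed */
};

/* Classify one control-file text line by its key letter. */
struct ctl_line classify_ctl_line(const char *line)
{
    struct ctl_line r;

    r.kind = CTL_IGNORED;
    r.arg = (line[0] != '\0') ? line + 1 : line;
    r.delete_after = 0;

    switch (line[0]) {
    case 'U': r.kind = CTL_USER; break;
    case 'O': r.kind = CTL_OPTIONS; break;
    case 'H': r.kind = CTL_HEADER; break;
    case 'B': r.kind = CTL_BANNER; break;
    case 'I': r.kind = CTL_SPOOLED; r.delete_after = 1; break;
    case 'F': r.kind = CTL_FULLPATH; break;
    case 'T': r.kind = CTL_TITLE; break;
    case 'D': r.kind = CTL_DIR; break;
    default:  break;             /* unknown letters are ignored */
    }
    return r;
}
```

For an ``O'' line the server would then apply atoi() to the arg field, as described above; only ``I'' files are marked for removal here, since the text states that ``F'' files are kept unless the user asks otherwise.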

The only really special key letters are ``U'' as mentioned earlier, which should be the first line following the request structure, and ``I'' or ``B'' which specify a file while implying its location in Qdatadir[].

Bringing up MDQS

If you haven't already done so, configure the necessary files to represent your specific system; see the section ``Configuration''. Once everything has been compiled you will be ready to install the MDQS files. The Makefile in the generation directory knows how to make the necessary directories that MDQS will need. On our system the ancillary files live in /usr/brl/lib/mdqs, but it could be almost anywhere that is write protected and generally accessible for reading and execution. I will call this directory QLIB.

First edit the Makefile to indicate where the root of the spooling directories will be; then become superuser and run ``make directories''. Examine the files misc/lib.q.forms and misc/lib.q.sitename, to make sure they contain appropriate information for your system. Mdqsforms(5) tells the particulars of the format of the forms file. The ``sitename'' file should contain a one-line name for your site like ``Sirius Cybernetics Corporation, Gargantubrain #4'', or some such. It should be one line with a trailing newline. Next, copy (possibly edited versions of) those files to the place you specified in the file qconf.c (normally QLIB). If the forms file is absent, all forms are considered valid. Put login names of users who will be MDQS managers in the Managers file specified in qconf.c.

Next compile and copy queue2, lpserver, shserver, mdqsdaemon, and any other servers you need to their appropriate places, usually QLIB. These are created by the Makefile with names beginning with ``x'', although the installed executable binaries should not have such names. The remainder of the binaries (qstat, batch, qpr, qmod, and qdev) should be put in some directory where they are accessible to users, probably /usr/bin or /usr/local/bin. If the Makefile is properly configured, ``make install'' should install fresh versions of all MDQS binaries.

Next you should place a copy of misc/lib.q.qconf in the location you specified in qconf.c (usually /etc/qconf or QLIB/qconf). This needs to be tailored for your own queueing needs. For starters, turn on debugging until the system seems to be working by setting the parameter ``debug'' to 1 in qconf. If you want a continuous printout of the state of the queue, set the debug variable to 5. You should now be ready to start the daemon. Since there will be a fair amount of debugging output, you will probably want to start the daemon on a spare terminal. Once MDQS is working, you should arrange for the daemon to be run automatically whenever your system is booted into multi-user mode; misc/etc.rc.local can be used as a guide.

Finally, you should examine the replacement ``lp'', ``lpr'', and ``ipr'' commands in the ``misc'' directory; these are just shell scripts that act as interfaces to ``qpr''. You may be able to use them as-is or adapt them to your needs.

Addenda

At this point you are pretty much on your own. The original author, Doug Kingston, no longer works at BRL (these instructions were recently revised by other BRL staff); if you cannot resolve a problem try sending network mail to ``BCST@ARL.MIL'' on the Internet. We cannot guarantee response to any outside requests, but since we maintain MDQS for a large number of machines here at BRL, we would appreciate knowing about any changes, bugs, or additions you make to the system. We hope to incorporate these bug fixes and additions into any public distributions that might be made in the future. At this time the system appears to be quite stable, but when significant improvements are made, another public distribution may be considered.

At ARL, MDQS runs on:

DEC Vax-11/780 platforms under 4.2 BSD and higher, and under Ultrix
Sun-3 and Sun-4 platforms under SunOS 4.1.*
Solaris 2.3 and higher
SGI 4D platforms running Irix 4.0.5, Irix 5.3, and Irix 6.3, including Power-Challenge Array systems
IBM-PC compatible platforms running BSD/OS 2.0.1 and higher
Cray YMP and Cray-2 platforms running UNICOS 5 and higher

In this day of downsizing, note that BRL is now called ARL (BRL--).