All database operations are based on the Unix notion of pipes and filters. Thus, programs tend to be small, doing one task. The output from any one program can be piped into another.
A data file is a simple ASCII file, with columns separated by a single tab. Thus, two tabs in a row denote an empty field (e.g. a missing value). Data fields can contain anything, characters or numbers, although some commands will strip the 8th bit, since the Unix tools they call upon do so. Naturally, there is no built-in notion of maximum field, record or file length.
The primary problem with flat Unix files is the loss of the data dictionary concept. This is partially solved by using the first two lines in any file as a header, where the first line contains the column names (separated by tabs) and the second line contains a series of dashes. This allows commands to operate on columns by their name, regardless of their position in the input file.
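To illustrate how a two-line header lets a command address a column by name, here is a minimal sketch in plain sh and awk. The function column_by_name and the sample table are hypothetical and are not part of reldb; the real commands may well be implemented differently.

```sh
# Hypothetical helper, not part of reldb: print one column chosen by name.
# Assumes the two-line header described above (names, then dashes).
column_by_name() {
    awk -F '\t' -v name="$1" '
        NR == 1 {                       # header line: find the column index
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            if (!col) { print "no such column: " name > "/dev/stderr"; exit 1 }
            next
        }
        NR == 2 { next }                # skip the line of dashes
        { print $col }                  # data lines: print the chosen field
    '
}

# Made-up sample table; printf turns \t into real tabs.
sample=$(printf 'price\tquantity\n-----\t--------\n10\t0\n25\t3\n')
printf '%s\n' "$sample" | column_by_name quantity
```

Because the lookup happens on the header line, the same call keeps working if the columns are later reordered.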
The only restriction on the format of the files is that a file must contain the same fixed number of tabs in every line.
This is best explained by way of an example:
xcol	y	z	v	u	w	i
----	-	-	-	-	-	-
0	1	2	3	4	*	6
1	2	3	4	5	**	7
2	3	4	5	6	***	8
3	4	5	6	7	v	9
4	5	6	7	8	vi	10
When the fields are long, the columns will not look "straight" as in the above example, but that is irrelevant. The only important element is that a tab always separates the columns, and there are an equal number of tabs in every line.
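The one format rule, an equal number of tabs on every line, is easy to check mechanically. The following is a sketch only; check_table is a hypothetical name and not a reldb command.

```sh
# Hypothetical checker, not a reldb command: succeed only if every line
# has the same number of tab-separated fields as the first line.
check_table() {
    awk -F '\t' '
        NR == 1 { want = NF }           # the header fixes the field count
        NF != want {
            printf "line %d: %d fields, expected %d\n", NR, NF, want
            bad = 1
        }
        END { exit bad + 0 }            # non-zero exit status if any line differed
    '
}

# An empty last field is fine: the tab is still there.
printf 'xcol\ty\n----\t-\n0\t1\n1\t\n' | check_table && echo "table ok"

# A line with a missing tab is reported.
printf 'xcol\ty\n----\t-\n0\n' | check_table || echo "table broken"
```

Counting fields with a tab separator is the same as counting tabs plus one, so an empty field still counts as long as its tab is present.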
Reldb commands can read this file from the standard input. Thus, one can select lines from the file (cf. grep(1)), deal columns (cf. cut(1)), use join to join it (sideways, like join(1)) with another file or use union to concatenate two such files (like cat(1)). Other operations include rename to rename columns, compute to compute new values into columns, addcol to add columns into a table, sort to sort a table, etc.
Data from a reldb file can be trivially piped to "ordinary" Unix programs by stripping the two header lines with the tail +3 < data command (spelled tail -n +3 < data on current POSIX systems). In particular, this allows a very easy interface with the excellent collection of statistical routines by Gary Perlman, |Stat (previously Unix|Stat), also available as freeware.
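As a small sketch of handing reldb data to ordinary tools, the following strips the header and sums one column with cut(1) and awk(1). It uses tail -n +3, the POSIX spelling of tail +3; the temporary file and its values are made up for illustration.

```sh
# Build a tiny table; the file contents are invented for this example.
data=$(mktemp)
printf 'price\tquantity\n-----\t--------\n10\t0\n25\t3\n40\t1\n' > "$data"

# Drop the two header lines, keep the second column, and sum it with awk.
total=$(tail -n +3 < "$data" | cut -f 2 | awk '{ s += $1 } END { print s }')
echo "total quantity: $total"

rm -f "$data"
```

Note that cut(1) already uses a tab as its default delimiter, so headerless reldb data needs no further conversion.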
The best feature of the reldb approach is the possibility of combining programs using Unix pipes. Thus one can first select lines where 'quantity' is at least one, and deal columns 'price' and 'quantity' from the output:
    select 'quantity >= 1' < data | project price quantity
Similarly, one can compute new values into columns, select lines, deal columns and pipe the result through scat to plot the results.
The column names should be restricted to ASCII letters, digits and underscores. On some systems international characters (i.e. ISO 8859/1, "upper half") will work, but a number of Unix implementations have awks which strip the 8th bit. Furthermore, symbols such as a hyphen or an asterisk will be misinterpreted in compute and select, which call upon awk(1) to do their work, using the column names as variable names. Thus the column names are bound by the same rules as variable names in awk.
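Since column names end up as awk variable names, a header can be screened before use. This is a minimal sketch; bad_column_names is a hypothetical helper, not a reldb command, and the sample header is invented.

```sh
# Hypothetical check, not part of reldb: list header names that would not be
# valid awk variable names (letters, digits, underscore; no leading digit).
bad_column_names() {
    head -n 1 | tr '\t' '\n' | grep -v -E '^[A-Za-z_][A-Za-z0-9_]*$'
}

# 'unit-cost' contains a hyphen and '2nd' starts with a digit.
printf 'price\tunit-cost\t2nd\tqty_sold\n-\t-\t-\t-\n' | bad_column_names
```

The check is deliberately simple: it does not catch names that collide with awk keywords or built-in functions, such as next or length, which would also cause trouble.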
Most of the commands have their own man page, but there will always be some that are missing, since programs will always be written before their documentation, contrary to what tends to be called good practice.
These programs would never have been written had VenturCom Inc. not refused to port their Prelude package to our (then) central computer. This taught us the lesson that one is never independent of hardware vendors unless one is independent of software vendors. The lack of Prelude on the new computer made us think about just how hard it would be to rewrite the bare essentials of a rdbs (here called project, select, union and jointable). We found this a trivial task and in time a large number of other programs followed. All of them were written as they were needed and most of them are based on corresponding Unix tools, making the "writing" incredibly easy in each case. Having gone through this, we now very much question the wisdom of everyone writing in proprietary 4GL languages and using expensive dbms for simple applications.
Prelude is a much more complete system than reldb, but the principal elements look much the same as reldb was modeled to resemble a very early version of Prelude. A user wanting a complete and bug-free rdbs might want to contact VenturCom, but of course reldb is free and the source code is available, to be readily ported to any system (no, we do not have any ties with VenturCom, other than being a fairly happy customer).
This approach, although very cheap in manpower, has its problems in terms of integration, performance and elegance of individual programs. Overall, the system has been found to be extremely portable, as it is based mainly on shell scripts, which call the corresponding Unix tools after dealing with the header lines. Efficiency, although nowhere near the best possible, has rarely been a bottleneck.
Bugs (with suggested corrections, please) should be reported to gunnar@hafro.is (Internet) or uunet!mcvax!hafro!gunnar (Uucp).
Edited for cLIeNUX PIRDE. Some of the above functionality may no longer be present, such as the stat stuff.