The Jigsaw resource factory is a piece of software that runs behind
the scenes and creates HTTPResource instances out of existing data.
The factory currently knows about the files and directories of the
underlying file system, but you can extend it to handle other kinds of
objects.
This document describes when the factory is called and how it maps files or directories to resources, and provides a brief overview of the form-based configuration tool.
Each running server has a resource factory attached to it (which it might share with other servers, but this is not relevant here). Any resource can call its server's factory in order to create a resource out of an existing object. Currently, the only resource that does so is the DirectoryResource, which is the one that exports existing directories.
When queried for a URL component at lookup time, the directory resource first checks its children resource store for a matching resource. If such a resource is found, then it is returned as the target of the lookup. Otherwise, if the directory is flagged as extensible, the directory resource derives a file name from the resource's identifier and asks the resource factory for a wrapping resource instance. If the factory builds such a resource successfully, the directory resource installs it as one of its children resources and manages its persistence.
Let's walk through this algorithm with an example. Suppose there is a
directory resource User which wraps an underlying file-system directory
named User. This directory resource will usually be created empty (with
no children resources). At some point, a client will ask for, say,
User/Overview.html. The lookup process starts, and after some iterations
comes to the point where it looks for Overview.html in the directory
resource User. The directory resource looks into its children resources
to find it. As none is found, it goes to the resource factory and asks it
to construct a resource for the file Overview.html. If a resource is
returned (which depends on the factory configuration), the directory
resource plugs the newly created resource into its resource store and
returns it as the target of the lookup.
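
The lookup steps described above can be sketched in Java as follows. This is only an illustration, not Jigsaw's actual code: the names FactorySketch, DirectorySketch, LookupDemo, and lookup are hypothetical, the children store is assumed to be a plain map, and a "resource" is reduced to a descriptive string.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the resource factory; it may decline to build a resource.
interface FactorySketch {
    Optional<String> createResource(File directory, String name);
}

// Hypothetical stand-in for a directory resource (not the real DirectoryResource).
final class DirectorySketch {
    private final File directory;
    private final boolean extensible;
    private final FactorySketch factory;
    // Children store; a "resource" is just a descriptive string in this sketch.
    private final Map<String, String> children = new HashMap<>();

    DirectorySketch(File directory, boolean extensible, FactorySketch factory) {
        this.directory = directory;
        this.extensible = extensible;
        this.factory = factory;
    }

    Optional<String> lookup(String name) {
        // 1. An already indexed child is returned directly.
        String child = children.get(name);
        if (child != null) {
            return Optional.of(child);
        }
        // 2. A non-extensible directory never calls the factory.
        if (!extensible) {
            return Optional.empty();
        }
        // 3. Ask the factory, and install the result so later lookups skip it.
        Optional<String> created = factory.createResource(directory, name);
        created.ifPresent(resource -> children.put(name, resource));
        return created;
    }

    int childCount() {
        return children.size();
    }
}

final class LookupDemo {
    public static void main(String[] args) {
        int[] factoryCalls = {0};
        // Toy factory configuration: only .html files get a resource.
        FactorySketch factory = (dir, name) -> {
            factoryCalls[0]++;
            if (name.endsWith(".html")) {
                return Optional.of("FileResource(" + name + ")");
            }
            return Optional.empty();
        };
        DirectorySketch user = new DirectorySketch(new File("User"), true, factory);
        System.out.println(user.lookup("Overview.html")); // built by the factory
        System.out.println(user.lookup("Overview.html")); // served from the children store
        System.out.println("factory calls: " + factoryCalls[0]); // prints "factory calls: 1"
    }
}
```

The second lookup never reaches the factory, which is the behavior the following remark about persistence relies on.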
One important remark here: as resources are persistent objects (they persist
across Jigsaw invocations), resources that wrap existing objects are
created only once in the whole lifetime of the server. This means
that changing the factory configuration after a resource has been
indexed has no effect on the resources that have already been created. This
is one of the features that make the server fast: indexing an existing object
into a resource might be a costly process (it can involve querying multiple
databases, such as the extensions and directory templates databases).
Caching the result of this operation allows the server to concentrate on
its real work, which is to serve data back to clients. You may still,
however, want to change the resource factory configuration and re-index part
of your information space with the new options. The DirectoryResourceEditor
lets you re-index files when needed. If you want the whole site to be
re-indexed, then the last resort is to stop the server, delete all the
.jigidx files, and restart it. This will make the server re-index
the whole site as it runs.
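
As a hedged illustration of that last resort, the following Java sketch lists the index files under a root directory and deletes them only when explicitly asked to. The class name JigidxCleaner, the --delete flag, and the assumption that index files can be recognized by a name ending in .jigidx are choices made for this example, not part of Jigsaw. Stop the server before running anything like this.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not shipped with Jigsaw.
final class JigidxCleaner {
    // Collects every regular file under root whose name ends with ".jigidx".
    static List<Path> findIndexFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(".jigidx"))
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: JigidxCleaner <root-directory> [--delete]");
            return;
        }
        Path root = Paths.get(args[0]);
        // Without --delete the tool only reports what it found.
        boolean delete = args.length > 1 && args[1].equals("--delete");
        for (Path indexFile : findIndexFiles(root)) {
            System.out.println((delete ? "deleting " : "found ") + indexFile);
            if (delete) {
                Files.delete(indexFile);
            }
        }
    }
}
```

Running it first without --delete gives a chance to review the list before anything is removed.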
To index files and directories, the resource factory manages two databases that are editable through a form-based interface (see the factory configuration section). The first database, known as the extension database, records how files with a given extension should be mapped to resources. The second database, known as the directory template database, records how directories are to be mapped to resources.
When the factory is called to index a normal file, the first thing it does
is to split the file name into its raw name plus its set of extensions.
So, for example, if the file to be indexed is foo.en.html.gz, the raw name
will be foo, and the set of extensions will be {en, html, gz}.
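
The splitting step can be sketched in Java as follows. The class name SplitName and the decision to treat a leading dot as part of the raw name are assumptions of this example. The source does not say how the factory handles such names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical value class holding the result of the split.
final class SplitName {
    final String rawName;
    final List<String> extensions;

    private SplitName(String rawName, List<String> extensions) {
        this.rawName = rawName;
        this.extensions = Collections.unmodifiableList(extensions);
    }

    static SplitName split(String fileName) {
        // Start searching at index 1 so that a leading dot stays in the raw
        // name (an assumption of this sketch).
        int firstDot = fileName.indexOf('.', 1);
        if (firstDot < 0) {
            return new SplitName(fileName, new ArrayList<>());
        }
        List<String> extensions = new ArrayList<>();
        for (String part : fileName.substring(firstDot + 1).split("\\.")) {
            if (!part.isEmpty()) {
                extensions.add(part); // empty parts come from doubled dots
            }
        }
        return new SplitName(fileName.substring(0, firstDot), extensions);
    }

    public static void main(String[] args) {
        SplitName s = split("foo.en.html.gz");
        System.out.println(s.rawName + " " + s.extensions); // prints "foo [en, html, gz]"
    }
}
```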
The factory then takes each extension description record and checks whether
it defines a resource class. In a typical setting, only the html extension
will have an associated resource class, which is likely to be the
FileResource class. This gives the factory the class of the resource to
build for the given file, so the factory carries on by creating an empty
instance of this class. It then creates a set of default attribute values,
first by defining the following pre-defined set of attributes:
identifier: defaults to the file name,
directory: defaults to the file's directory,
last-modified: defaults to the last-modified time of the file,
content-length: defaults to the length of the file.
Then, for each of the file's extensions, it looks into the associated
database record and fills in the remaining attributes. The html extension
record, for example, might define the default value for content-type to be
text/html. The en extension record will probably set the content-language
default value to en, and finally the gz extension record will probably
state that the resource's content-encoding default value should be x-gzip.
Once the set of default attribute values is constructed, the resource is
initialized and returned.
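
The following Java sketch illustrates how the default attribute values described above might be assembled. It is not the factory's real implementation: ExtensionRecord, IndexerSketch, and their methods are hypothetical names, the extension database is modeled as an in-memory map, and two behaviors the text leaves open are chosen arbitrarily (the first extension that defines a class wins, and extension records never override the pre-defined attributes).

```java
import java.io.File;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical extension record: an optional class name plus default attribute values.
final class ExtensionRecord {
    final String resourceClass; // null when the extension defines no class
    final Map<String, Object> defaults;

    ExtensionRecord(String resourceClass, Map<String, Object> defaults) {
        this.resourceClass = resourceClass;
        this.defaults = defaults;
    }
}

final class IndexerSketch {
    private final Map<String, ExtensionRecord> extensionDb = new HashMap<>();

    // Registers an extension with a single default attribute, for brevity.
    void define(String extension, String resourceClass, String attribute, Object value) {
        Map<String, Object> defaults = new HashMap<>();
        defaults.put(attribute, value);
        extensionDb.put(extension, new ExtensionRecord(resourceClass, defaults));
    }

    // Returns the class of the first extension that defines one, or null.
    String resourceClassFor(List<String> extensions) {
        for (String extension : extensions) {
            ExtensionRecord record = extensionDb.get(extension);
            if (record != null && record.resourceClass != null) {
                return record.resourceClass;
            }
        }
        return null;
    }

    Map<String, Object> defaultAttributes(File file, List<String> extensions) {
        Map<String, Object> attributes = new LinkedHashMap<>();
        // Pre-defined attributes derived from the file itself.
        attributes.put("identifier", file.getName());
        attributes.put("directory", file.getParentFile());
        attributes.put("last-modified", file.lastModified());
        attributes.put("content-length", file.length());
        // Each extension record fills in attributes that are still missing.
        for (String extension : extensions) {
            ExtensionRecord record = extensionDb.get(extension);
            if (record == null) {
                continue;
            }
            for (Map.Entry<String, Object> entry : record.defaults.entrySet()) {
                if (!attributes.containsKey(entry.getKey())) {
                    attributes.put(entry.getKey(), entry.getValue());
                }
            }
        }
        return attributes;
    }

    public static void main(String[] args) {
        IndexerSketch indexer = new IndexerSketch();
        indexer.define("html", "FileResource", "content-type", "text/html");
        indexer.define("en", null, "content-language", "en");
        indexer.define("gz", null, "content-encoding", "x-gzip");
        List<String> extensions = Arrays.asList("en", "html", "gz");
        System.out.println(indexer.resourceClassFor(extensions));
        System.out.println(indexer.defaultAttributes(new File("User", "foo.en.html.gz"), extensions));
    }
}
```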
When the factory is called to index a directory, it examines its directory templates database. This database allows the web admin to map directory names to specific sub-classes of resources. Directory templates can be generic, in which case they apply to all directories below the named one.
For each directory template, the web admin first specifies an appropriate
resource class. A typical setting might specify, for example, that all
directories named Putable should be exported by an instance of
PutableDirectory. Moreover, if the template is flagged as generic, then
all directories below the Putable directory will also be exported as
PutableDirectory instances.
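
A minimal Java sketch of this matching rule follows. The names DirectoryTemplate and TemplateDbSketch are hypothetical, and so are two details the text does not specify: an exact name match is assumed to take precedence over a generic ancestor template, and DirectoryResource is assumed to be the fallback when no template applies.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical directory template: a resource class name and the generic flag.
final class DirectoryTemplate {
    final String resourceClass;
    final boolean generic;

    DirectoryTemplate(String resourceClass, boolean generic) {
        this.resourceClass = resourceClass;
        this.generic = generic;
    }
}

final class TemplateDbSketch {
    // Assumed fallback when no template matches.
    static final String DEFAULT_CLASS = "DirectoryResource";

    private final Map<String, DirectoryTemplate> templates = new HashMap<>();

    void define(String directoryName, String resourceClass, boolean generic) {
        templates.put(directoryName, new DirectoryTemplate(resourceClass, generic));
    }

    // path lists directory names from the root down to the directory being indexed.
    String resourceClassFor(List<String> path) {
        if (path.isEmpty()) {
            return DEFAULT_CLASS;
        }
        // A template named after the directory itself applies first.
        DirectoryTemplate exact = templates.get(path.get(path.size() - 1));
        if (exact != null) {
            return exact.resourceClass;
        }
        // Otherwise the nearest enclosing directory with a generic template applies.
        for (int i = path.size() - 2; i >= 0; i--) {
            DirectoryTemplate ancestor = templates.get(path.get(i));
            if (ancestor != null && ancestor.generic) {
                return ancestor.resourceClass;
            }
        }
        return DEFAULT_CLASS;
    }

    public static void main(String[] args) {
        TemplateDbSketch db = new TemplateDbSketch();
        db.define("Putable", "PutableDirectory", true);
        // A directory nested below Putable inherits the generic template.
        System.out.println(db.resourceClassFor(Arrays.asList("WWW", "Putable", "drafts")));
        // A directory with no matching template falls back to the default.
        System.out.println(db.resourceClassFor(Arrays.asList("WWW", "docs")));
    }
}
```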
The class attached to a directory template need not be a sub-class of
DirectoryResource. You can specify, for example, that directories named
CVS should be exported through a CvsDirectoryResource, which will provide
you with a form-based interface to CVS.
Configuring the Jigsaw factory consists of editing the extensions and directory templates databases. This can be done entirely through forms. This section describes how this works; you might also want to check the configuration tutorial.
The Jigsaw release comes with a sample root directory that includes an
Admin directory. This directory, in turn, provides two resources
that allow you to edit the factory configuration databases through forms.
The first one, usually named extensions, lets you edit the extensions
database.
Point your browser to your /Admin/extensions URL. This will display the
sorted list of currently defined extensions. To remove an extension
record, mark it by clicking on its check box, and press the OK button:
the extension record is deleted from the database. To edit a particular
extension record, click on it. This will bring up a form containing all the
default attribute values for the extension. This form changes depending on
the class that you have attached to the extension (extensions with no class
apply to all resources; hence, they allow you to edit the HTTPResource
attribute values). You can change any of these values, which will be
provided as default attribute values for resources wrapping a file that
matches this particular extension.
To define new extensions, click on the AddExtension link of the
/Admin/extensions page. This will pop up a form asking you for the
extension name and the (optional) attached class. Let's say you want to
define the extension ps for exporting application/postscript files.
Type in the name of the extension (here ps), attach the
w3c.jigsaw.resources.FileResource class to it, then click on the OK
button. This will pop up the attribute editor. Set the default value for
content-type to application/postscript, and press the OK button. You are
done: all files having the ps extension will be exported through a
FileResource whose default value for the content-type attribute will be
application/postscript.
Now, let's create some directory templates. Point your browser to
/Admin/DirectoryTemplates. This will display the sorted list
of currently defined templates. To remove a directory template, just mark
it (by clicking its check box), and press the OK button. To edit the
attributes of a directory template, click on its name. This will display
the set of attributes for the directory template itself. If you want the
template to be generic, then set its generic flag to true (it will
then apply to directories having the given name, and also to all
directories below them).
You will also see a link named ShadowAttributes. By following this
link, you will be able to edit the default attribute values for the resource
to be created when this template is used. For example, if your template is
attached to the DirectoryResource class, this will allow you to edit the
default attributes of this resource class.
Anselm Baird-Smith
$Id: indexer.html,v 1.1 1996/04/23 19:10:32 abaird Exp $