sbox: Put CGI Scripts in a Box

Abstract

sbox is a CGI wrapper script that allows Web site hosting services to safely grant CGI authoring privileges to untrusted clients. In addition to changing the process privileges of client scripts to match their owners, it goes beyond other wrappers by placing configurable ceilings on script resource usage, avoiding unintentional (as well as intentional) denial of service attacks. It also optionally allows the Webmaster to place clients' CGI scripts in a chroot'ed shell restricted to each author's home directory.

sbox is compatible with all Web servers running under BSD-derived flavors of Unix. You can use and redistribute it freely.

The current release is 1.10. Download it from the Web at http://stein.cshl.org/WWW/software/sbox/.

Older versions are also available.

Introduction

Poorly-written CGI scripts are the single major source of server security holes on the World Wide Web. Every CGI script should be scrutinized and extensively tested before it is installed on a server, and subjected to periodic review thereafter.

For Web hosting services, however, this advice is impractical. Hosting services must sponsor multiple Web authors of different levels of competence and reliability. Web authors do not trust each other, and the Web hosting service does not trust the authors. In such a situation, CGI scripts are even more problematic than usual. Because all CGI scripts run under the Web server's user ID, one author's scripts can interfere with another's. For example a malicious author could create a script that deletes files created by another author's script, or even cause another author's script to crash by sending it a kill signal. A poorly written script that contains a security hole can compromise the entire site's security by, for example, transmitting the contents of the system password file to a malicious remote user. The same problems are faced by large academic sites which provide Web pages for students.

For most Web hosting services it would be impossible to subject each and every author's CGI scripts to code review. Nor is it practical to cut off CGI scripting privileges entirely. In the competitive world of ISPs, customers will just move elsewhere.

The most popular solution to this problem is the use of "wrapper" scripts. In this system, untrusted authors' CGI scripts are never invoked directly. Instead a small wrapper script is called on to execute the author's script, the target. The wrapper is SUID to root. When the wrapper runs, it subjects the target to certain safety checks (for example, checking that the script is not world-writable). The wrapper then changes its user ID to match the owner of the target and executes it. The result is that the author's script is executed with his own identity and privileges, preventing it from interfering with other authors' scripts. The system also leads to increased accountability. Any files that a misbehaving script creates or modifies will bear the fingerprints of its creator. Without a wrapper, it can be impossible to determine which author's script is causing problems.

The limitations of wrapper scripts are three-fold:

  1. Wrappers provide little protection against attacks that involve reading confidential information on the site, for example sensitive system files or protected documents.
  2. Wrappers expose the author to increased risk from buggy scripts. By running the author's script with his own permissions, the wrapper grants it the ability to read, write or delete any file in the author's home directory.
  3. There is no protection against denial-of-service attacks. A buggy script can go into an endless loop, write a huge file into /usr/tmp, or allocate an array as large as virtual memory, adversely affecting system responsiveness.

A better solution is to box authors' CGI scripts. In this solution, the CGI script is executed in a restricted environment in which its access to the file system and to other system resources is limited. This is what sbox (Secure Box) accomplishes. When run, it does several things:

  1. It checks the environment for sanity. For example, the script must be run by the Web user and group, and not by anyone else.
  2. It checks the target script for vulnerabilities, such as being world writable or being located in a world writable directory.
  3. It performs a chroot to a directory that contains both the script and the author's HTML files, sealing the script off from the rest of the system.
  4. It changes its user ID and/or group ID to that of the target script.
  5. It sets ceilings on the target script's CPU, disk, memory and process usage.
  6. It lowers the priority of its process.
  7. It cleanses the environment so that only variables which are part of the CGI protocol are available to the script.
  8. It invokes the target script in this restricted context.
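
The check in step 2 can be sketched as follows. This is only an illustration under stated assumptions, not sbox's actual code: the helper names mode_is_world_writable() and target_is_safe() are hypothetical, and the real checks may cover more than the two conditions named above.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: 1 if the permission bits let "other" write, else 0. */
int mode_is_world_writable(mode_t mode)
{
    return (mode & S_IWOTH) != 0;
}

/*
 * Hypothetical helper: returns 1 if neither the script nor its containing
 * directory is world writable, 0 if either one is, and -1 if stat() fails.
 */
int target_is_safe(const char *script_path, const char *dir_path)
{
    struct stat script_st, dir_st;

    if (stat(script_path, &script_st) != 0 || stat(dir_path, &dir_st) != 0)
        return -1;
    if (mode_is_world_writable(script_st.st_mode))
        return 0;
    if (mode_is_world_writable(dir_st.st_mode))
        return 0;
    return 1;
}
```

A wrapper would refuse to continue unless target_is_safe() returned 1.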

sbox is highly configurable. It can be configured to chroot without changing its user ID, to change its user ID without performing the chroot, to change its group ID without changing its user ID, to establish resource ceilings without doing anything else, or any other combination that suits you.

System Requirements

sbox is designed to run with any Unix-based Web server. The package should compile correctly on any standard Unix system; however, the resource limits use the BSD-specific setrlimit() and setpriority() calls. If you do not know whether your system supports these calls, check for the existence of the file /usr/include/sys/resource.h. If this file does not exist, then chances are slim that you can use the resource limits. You can run sbox without the limits by setting the preprocessor define SET_LIMITS to FALSE (see below).


Installation

After unpacking the package, you should have the following files:

Makefile
README.html (this file)
README.txt  (this file as text)
sbox.h
sbox.c
env.c

You will first examine and edit the Makefile, then change sbox.h to suit your site configuration and preferences. It is suggested that you keep copies of the unaltered files for future reference.

Adjusting the Makefile

Using your favorite text editor, examine and change the value of the INSTALL_DIRECTORY variable. This is the location in which sbox will be installed, and should correspond to your site-wide CGI directory.

You may also need to fiddle with the options for the install program. The default is to make sbox owned by user "root" and group "bin", and installed with permissions -rws--x--x. This configuration is SUID to root, which is necessary for the chroot and user ID changing functions to work.

If you wish to adjust the C compiler and its flags, change the CC and CFLAGS variables as needed.


Adjusting sbox.h

This is the fun part. sbox.h contains several dozen flags that affect the script's features. These flags are implemented as compile-time defines rather than as run-time configuration variables for security reasons. There is less chance that the behavior of sbox can be maliciously altered if it has no dependencies on external configuration files.

You should review sbox.h with a text editor and change the settings as needed. A typical entry looks like this:

/*
 * ECHO_FAILURES  -- If set to TRUE, will echo fatal error messages
 *              to the browser.  Set to FALSE to inhibit error messages.
 */
#ifndef ECHO_FAILURES
#define ECHO_FAILURES TRUE
#endif

This section sets a feature called ECHO_FAILURES to TRUE. To change the value to FALSE, simply edit the line that begins with "#define" to read like this:

#define ECHO_FAILURES FALSE

General Settings

These variables correspond to general sbox settings such as logging and environment consistency checking.

WEB_USER (default "nobody")
This defines the name of the user that the Web server runs under, "nobody" by default. If your Web server uses a different user ID, you must change this define to match.

WEB_GROUP (default "nobody")
This defines the name of the group that the Web server runs under, "nobody" by default. If your Web server uses a different group ID, you must change this define to match.

UID_MIN, GID_MIN (defaults 100,100)
These define the lowest UID and GID that the script will run a target CGI script as. On most systems, low-numbered user and group IDs correspond to users with special privileges. Change these values to be the lowest valid unprivileged user and group ID. Under no circumstances will sbox run a target script as root (UID 0.)
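
A minimal sketch of how such a floor might be enforced is shown below. The function name ids_are_allowed() is hypothetical and the macro values simply repeat the defaults quoted above; sbox's own check may be structured differently.

```c
#include <sys/types.h>

/* Defaults quoted from the text above; adjust to match your site. */
#ifndef UID_MIN
#define UID_MIN 100
#endif
#ifndef GID_MIN
#define GID_MIN 100
#endif

/* Hypothetical helper: 1 if a target may run as this uid/gid, else 0. */
int ids_are_allowed(uid_t uid, gid_t gid)
{
    if (uid == 0)                 /* never root, even if UID_MIN is set to 0 */
        return 0;
    if (uid < (uid_t)UID_MIN)     /* below the lowest unprivileged user ID */
        return 0;
    if (gid < (gid_t)GID_MIN)     /* below the lowest unprivileged group ID */
        return 0;
    return 1;
}
```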

SAFE_PATH (default "/bin:/usr/bin:/usr/local/bin")
This defines the search path that will be passed to the author's CGI scripts, overriding whatever was there before.
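
The following sketch shows one way a wrapper could build the environment it hands to the target: PATH is forced to SAFE_PATH and only CGI variables are passed through. It is an illustration, not sbox's code. The function names are hypothetical, and the variable list is taken from the CGI/1.1 specification on the assumption that it roughly matches what sbox allows; the list sbox actually uses may differ.

```c
#include <stdlib.h>
#include <string.h>

#define SAFE_PATH "/bin:/usr/bin:/usr/local/bin"   /* default quoted above */

/* Meta-variables named by CGI/1.1 (assumed list, see lead-in). */
static const char *const cgi_vars[] = {
    "AUTH_TYPE", "CONTENT_LENGTH", "CONTENT_TYPE", "GATEWAY_INTERFACE",
    "PATH_INFO", "PATH_TRANSLATED", "QUERY_STRING", "REMOTE_ADDR",
    "REMOTE_HOST", "REMOTE_IDENT", "REMOTE_USER", "REQUEST_METHOD",
    "SCRIPT_NAME", "SERVER_NAME", "SERVER_PORT", "SERVER_PROTOCOL",
    "SERVER_SOFTWARE", NULL
};

/* 1 if a "NAME=value" entry names a CGI variable or an HTTP_* header. */
int is_cgi_variable(const char *entry)
{
    const char *eq = strchr(entry, '=');
    size_t len, i;

    if (eq == NULL)
        return 0;
    len = (size_t)(eq - entry);
    if (len > 5 && strncmp(entry, "HTTP_", 5) == 0)
        return 1;
    for (i = 0; cgi_vars[i] != NULL; i++)
        if (strlen(cgi_vars[i]) == len && strncmp(entry, cgi_vars[i], len) == 0)
            return 1;
    return 0;
}

/*
 * Build a NULL-terminated environment: PATH=SAFE_PATH first, then every CGI
 * entry of envp. Entries are shared with envp, not copied, so the caller
 * frees only the returned array. Returns NULL if allocation fails.
 */
char **clean_environment(char *const envp[])
{
    size_t n = 0, out = 0, i;
    char **clean;

    while (envp[n] != NULL)
        n++;
    clean = malloc((n + 2) * sizeof *clean);   /* PATH + n entries + NULL */
    if (clean == NULL)
        return NULL;
    clean[out++] = "PATH=" SAFE_PATH;          /* overrides any inherited PATH */
    for (i = 0; i < n; i++)
        if (is_cgi_variable(envp[i]))
            clean[out++] = envp[i];
    clean[out] = NULL;
    return clean;
}
```

The resulting array is in the form expected by execve() as its third argument.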

USE_ABSOLUTE_ROOT (no default)
If set to an absolute path, sbox will chroot to a hard-coded directory and use that as its root. Use this if you want to have sbox work on a particular directory not related to a user's directory or the web root.
NOTE: the sbox binary you compile will work for that directory ONLY. If you want to use it for another directory, recompile and use a different binary

Logging Settings

sbox can be set to log all its actions, including both failures and successful launches of author's scripts. Log entries are time stamped and labeled with the numeric IDs of the user and group that the target script was launched under.

LOG_FILE (default none)
This specifies a file to which sbox will log its successes and failures. Set this to the full path name of the file to log to. An empty string ("") will make sbox log to standard error, which will cause its log messages to be directed to the ordinary server error log. Leaving LOG_FILE undefined will cause sbox not to log any messages.

ECHO_FAILURES (default TRUE)
If this define is set to a true value, any fatal errors encountered during sbox's execution will be turned into a properly-formatted HTML message that is displayed for the remote user's benefit. Otherwise, the standard "An Internal Error occurred" message is displayed.

Chroot Settings

These variables control sbox's chroot functionality. The path names are relative to the document root. In the case of virtual hosts, this will be whatever is specified by the DocumentRoot directive in the server's configuration file. In the case of user-supported directories, it will be the user's public_html directory.

DO_CHROOT (default TRUE)
If set to a true value, sbox will perform a chroot to a restricted directory prior to executing the CGI script. Otherwise no chroot will be performed.

ROOT (default "..")
This tells sbox where to chroot to relative to the document root. This directory should ordinarily be a level or two above the document tree so that the script can get access to the author's HTML documents for processing.

CGI_BIN (default "../cgi-bin")
This define tells sbox where to look for the author's scripts directory, relative to his site's document tree. This directory should be contained within the directory specified by ROOT. For best security, you should specify a directory that is outside the document tree. The default is a directory named "cgi-bin" located at the same level as the document root.

SUID/SGID Settings

DO_SUID, DO_SGID (defaults TRUE, TRUE)
These defines control whether the script will perform an SUID and/or an SGID to the user and group of the target CGI script. From the author's point of view it's safer to perform an SGID than an SUID, and usually is more than adequate. If no SUID or SGID is performed, the author's script will be run with the Web server's privileges.

SID_MODE (default DIRECTORY)
This define controls whether sbox should use the ownership of the target script or the directory containing the target script to determine whose user ID and/or group ID to run under. Use directory mode if several users have authoring privileges for a single virtual host.

Resource Limitation Settings

SET_LIMITS (default TRUE)
If set to a true value, sbox will set resource usage ceilings before running the target CGI script. You may need to set this to FALSE if you are using a system that does not implement the setpriority() and/or setrlimit() calls.

PRIORITY (default 10)
This controls the priority with which target scripts are run. Values can range from -20 to 20. Higher numbers have less priority.

LIMIT_CPU_HARD, LIMIT_CPU_SOFT, LIMIT_FSIZE_HARD, LIMIT_FSIZE_SOFT...
These and similar defines control the resource ceilings. The definitions set caps on CPU usage, the number of processes the script can spawn, the amount of memory it can use, the size of the largest file it can create, and other attributes. For each resource there are two caps, one hard, the other soft. Any program can raise its own soft limit, up to the hard limit, by making the appropriate call to setrlimit(). Hard limits are ceilings that, once established, cannot be raised by an unprivileged process. The hard limits should be rather liberal, the soft limits more strict. See the setrlimit() man page for details on each of these resources.

If you use LIMIT_FSIZE_HARD or _SOFT and are logging to stderr, be careful! If your web server error log is larger than the limit, no logging will occur.
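
The soft and hard caps map directly onto the two fields of the struct rlimit passed to setrlimit(). The sketch below shows the call in isolation; set_ceiling() is a hypothetical helper name, not a function from sbox.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: apply a soft and a hard ceiling to one resource.
 * Returns 0 on success, -1 on failure (setrlimit() sets errno).
 */
int set_ceiling(int resource, rlim_t soft, rlim_t hard)
{
    struct rlimit lim;

    lim.rlim_cur = soft;   /* soft limit: the script may raise it up to hard */
    lim.rlim_max = hard;   /* hard limit: an unprivileged script cannot raise it */
    return setrlimit(resource, &lim);
}
```

For example, set_ceiling(RLIMIT_CPU, 10, 30) would request a soft cap of 10 CPU seconds and a hard cap of 30; these numbers are illustrative and are not sbox's defaults. The call fails if the soft value exceeds the hard value.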


Making and Installing the Binary

Compile the sbox binary by typing make. If it compiles successfully, become root and type make install to install it in your site's cgi-bin directory (at the location specified in the Makefile.)

You can also install sbox manually by copying it into your cgi-bin directory and setting its permissions to ---s--x--x. This can be done with the following commands while logged in as the root user:

# chown root sbox
# chgrp bin  sbox
# chmod 4111 sbox

Configuring the Server and User Directories

In order for sbox to be effective, CGI scripts should be turned off in all user-supported directories and document directories. All CGI scripts should be placed in the main cgi-bin directory. No one but authorized site administrators should have write or listing privileges for this directory. If you are using the Apache server, a typical entry for a virtual host will look like this:

<VirtualHost *>
ServerName www.fred.com
ServerAdmin  fred@fred.com

DocumentRoot /home/fred/sbox_root/html
TransferLog  /home/fred/sbox_root/logs/access_log
ErrorLog     /home/fred/sbox_root/logs/error_log

<Directory /home/fred/sbox_root>
    Options        MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec
    AllowOverride Options AuthConfig Limit
    order allow,deny
    allow from all
</Directory>

</VirtualHost>

(Please be sure to use Options and AllowOverride directives that match the security policy of your site.)

For a site that uses UserDir-style home pages (http://www.your.site/~username), a typical configuration is:

UserDir sbox_root/html

<Directory /home/*/sbox_root>
    Options       MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec
    AllowOverride Options AuthConfig Limit
    order allow,deny
    allow from all
</Directory>

Note that in both cases, the user's document root (where his HTML files go) is "~fred/sbox_root/html", that is, two directory levels below his home directory. When sbox runs, it uses the position of the user's document root to find its root and the cgi-bin directory. The suggested defaults defined in sbox.h make the ROOT equal to "..", and CGI_BIN equal to "../cgi-bin", both relative to the document root. Hence in the examples given above, sbox's root will be ~fred/sbox_root, and sbox will look for his CGI scripts in the directory ~fred/sbox_root/cgi-bin. When sbox runs in chroot mode, ~fred/sbox_root becomes the new top level ("/") directory, insulating the user's CGI script from the rest of his home directory, as well as the rest of the file system. This prevents the CGI script from inadvertently (or deliberately) doing something antisocial, but gives the script access to the user's HTML files, for filtering and templating.
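
The way ROOT and CGI_BIN combine with the document root can be illustrated with a small path-joining routine. This is a sketch under assumptions: resolve_relative() is a hypothetical name, and it collapses "." and ".." purely lexically, whereas sbox may resolve paths through the file system.

```c
#include <string.h>

/*
 * Hypothetical helper: join base and rel, collapsing "." and ".." components
 * lexically (symbolic links are not followed). base is assumed absolute.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int resolve_relative(const char *base, const char *rel, char *out, size_t outsize)
{
    const char *parts[2];
    size_t len = 0;
    int i;

    if (outsize < 2)
        return -1;
    parts[0] = base;
    parts[1] = rel;
    for (i = 0; i < 2; i++) {
        const char *p = parts[i];
        while (*p != '\0') {
            size_t n;
            while (*p == '/')              /* skip separators */
                p++;
            n = strcspn(p, "/");           /* length of the next component */
            if (n == 0)
                break;
            if (n == 1 && p[0] == '.') {
                /* "." leaves the path unchanged */
            } else if (n == 2 && p[0] == '.' && p[1] == '.') {
                /* ".." removes the last component, if there is one */
                while (len > 0 && out[--len] != '/')
                    ;
            } else {
                if (len + 1 + n >= outsize)
                    return -1;
                out[len++] = '/';
                memcpy(out + len, p, n);
                len += n;
            }
            p += n;
        }
    }
    if (len == 0)
        out[len++] = '/';
    out[len] = '\0';
    return 0;
}
```

With the document root /home/fred/sbox_root/html, a ROOT of ".." yields /home/fred/sbox_root and a CGI_BIN of "../cgi-bin" yields /home/fred/sbox_root/cgi-bin, matching the directories described above.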

Because the user's CGI script is cut off from the rest of the filesystem after the chroot call, dynamically linked programs (including interpreters and the like) will not be happy unless they can find the shared libraries they rely on. Therefore, the sbox root directory should be set up like a miniature root directory, and contain whatever binaries, configuration files and shared libraries are necessary for programs to run. This list is different from system to system. See Using the Miniroot and Tips for advice on setting this directory up.

Below is the structure of Fred's directory, assuming that the virtual host uses ~fred/sbox_root/html as its document root.

% ls -l ~fred/sbox_root
total 10
drwxr-xr-x   2 fred   users  1024 Oct 23 06:27 bin/     system binaries
drwxr-xr-x   3 fred   users  1024 Oct 19 20:44 cgi-bin/ CGI scripts
drwxr-xr-x   2 fred   users  1024 Oct 12 16:59 dev/     device special files
drwxr-xr-x   2 fred   users  1024 Oct 19 17:57 etc/     configuration files
drwxr-xr-x   2 fred   users  1024 Oct 22 19:14 html/    HTML document root
drwxr-xr-x   3 fred   users  1024 Oct 19 20:35 lib/     shared libraries
drwxr-xr-x   3 fred   users  1024 Oct 19 20:35 logs/    log files
drwxr-xr-x   2 fred   users  1024 Oct 23 05:48 tmp/     temporary files
drwxr-xr-x   2 fred   users  1024 Oct 23 05:48 usr/     files that belong in usr
drwxr-xr-x   2 fred   users  1024 Oct 23 05:48 var/     files that belong in var

If you do not take advantage of sbox's chroot feature, but just use it for its ability to change to the user's UID and GID, then you do not have to do any special directory setup.

See Supporting Apache .htaccess files and Rewrite-Rule Tricks for additional common configuration setups that make sbox more transparent to use.


Calling sbox

To use sbox create URLs like this one:

http://www.virtual.host.com/cgi-bin/sbox/script_name
       ^^^^^^^^^^^^^^^^^^^^              ^^^^^^^^^^^
        virtual host name               user's script

The first part of the URL is the path to the sbox script. The second part is the path to the user's script, relative to the cgi-bin directory in his home directory. If the user's script needs access to additional path information, you can append it in the natural way:

http://www.virtual.host.com/cgi-bin/sbox/script_name/additional/path/info

For user-supported directories, use this format:

http://www.virtual.host.com/cgi-bin/sbox/~fred/script_name

Users are free to organize their script directories into a hierarchy. They need only modify script URLs to reflect the correct path:

http://www.virtual.host.com/cgi-bin/sbox/foo/bar/script_name
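
Under the CGI protocol, everything in the URL after /cgi-bin/sbox reaches the wrapper as additional path information. The sketch below shows how the optional ~user prefix could be separated from the rest of that path. It is an assumption about the general approach, not a copy of sbox's parser, and split_user() is a hypothetical name. Deciding where the script name ends and the script's own extra path information begins would additionally require checking the file system, which is not shown.

```c
#include <string.h>

/*
 * Hypothetical helper: if path_info starts with "/~user", copy the user name
 * into user and return a pointer to the remainder (starting at the next '/').
 * If there is no "~user" prefix, user is set to "" and path_info is returned.
 * Returns NULL if the user name is empty or does not fit in user.
 */
const char *split_user(const char *path_info, char *user, size_t usersize)
{
    const char *start, *slash;
    size_t n;

    if (usersize == 0)
        return NULL;
    user[0] = '\0';
    if (strncmp(path_info, "/~", 2) != 0)
        return path_info;                 /* virtual-host style URL */
    start = path_info + 2;
    slash = strchr(start, '/');
    n = (slash != NULL) ? (size_t)(slash - start) : strlen(start);
    if (n == 0 || n >= usersize)
        return NULL;
    memcpy(user, start, n);
    user[n] = '\0';
    return (slash != NULL) ? slash : start + n;   /* "" if nothing follows */
}
```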

Supporting Apache .htaccess files

If you are using the Apache web server and wish the user to be able to password-protect or otherwise modify access to his cgi-bin directory using a .htaccess file, then you will need to activate and use Apache's mod_rewrite module. Otherwise any .htaccess file located in the user's cgi-bin directory will be ignored. This method will also make it so that if the requested executable is not found in the cgi-bin directory, the error condition will fall through to Apache's error handling system (using ErrorDocument) rather than raising an sbox error.

First make sure that Apache was compiled with the mod_rewrite module and that the module is loaded at startup time. The relevant directive is:

LoadModule rewrite_module lib/apache/mod_rewrite.so

Now assuming that user cgi-bin directories are installed in ~user/sbox_root/cgi-bin, that the sbox executable is installed in /cgi-bin/sbox, and that user directories are located at /home/username, enter the following into your httpd.conf file:

For Virtual Hosts

RewriteEngine on
RewriteLog "/var/log/apache/rewrite_log"
RewriteLogLevel 0

RewriteCond %{REQUEST_FILENAME} ^/cgi-bin/sbox/(.+)
RewriteCond %{DOCUMENT_ROOT}/../cgi-bin/%1 !-F
RewriteRule ^/cgi-bin/sbox/(.+) %{DOCUMENT_ROOT}/../cgi-bin/$1  [L]
RewriteRule ^(/cgi-bin/sbox/.+) $1 [PT,NS]

(This goes into each VirtualHost section)

For User Directories

RewriteEngine on
RewriteLog "/var/log/apache/rewrite_log"
RewriteLogLevel 0

RewriteCond %{REQUEST_FILENAME} ^/cgi-bin/sbox/~([^/]+)/(.+)
RewriteCond /home/%1/sbox_root/cgi-bin/%2 !-F
RewriteRule ^/cgi-bin/sbox/~([^/]+)/(.+) /home/$1/sbox_root/cgi-bin/$2  [L]
RewriteRule ^(/cgi-bin/sbox/~.+/.+) $1 [PT,NS]

(This goes into the main section of httpd.conf)

These rather complicated-looking pieces of code say that for URLs of the form /cgi-bin/sbox/~username/filename, Apache should first check whether the file /home/username/sbox_root/cgi-bin/filename exists and is available via Apache's access rules. If it isn't available, then the URL is rewritten as /home/username/sbox_root/cgi-bin/filename and processed as if it were an ordinary file (which will result in a 403 or 404 error). Otherwise, the URL is not rewritten and is passed through to the CGI handler. You will need to tweak these rules a bit if users' home directories are somewhere other than /home/username or if you have changed the names or positions of the sbox root and cgi-bin directories from their defaults.

To support users' ability to change access rights using .htaccess, make sure to enable AuthConfig and Limit in sbox_root if you haven't done so already:

<Directory /home/*/sbox_root>
    AllowOverride Options AuthConfig Limit
</Directory>

Using the Miniroot (Linux only)

For the convenience of Linux system administrators wishing to use the chroot features of sbox, I have placed a miniature root directory at stein.cshl.org/software/sbox/miniroot.img.gz. This is a gzipped ext2 filesystem image that contains essential system device files, shared libraries, and executables, including Perl version 5.8.7 and the most commonly used Perl libraries. The filesystem image is based on the one distributed with RIP.

You can use this image in several ways:

Install a copy of the image into each user's directory:

This way gives each user a skeleton root directory that he is free to modify, providing him with considerable flexibility. The downside is that you may not wish users to have so much flexibility; it also takes up about 45 megabytes of space per user directory:

  1. Download the miniroot from stein.cshl.org/software/sbox/miniroot.img.gz.
  2. Unzip it:
          gunzip miniroot.img.gz
          
  3. Mount the resulting disk image in loopback mode (you must be root to do this):
          mkdir /mnt/miniroot
          mount ./miniroot.img /mnt/miniroot -o ro,loop
          
  4. Copy the contents of the miniroot into each user's sbox root directory (assuming in this example that it is ~fred/sbox_root):
          cd /mnt/miniroot
          find . | cpio -p ~fred/sbox_root
          
  5. Create the user's html, cgi-bin and log directories:
          mkdir ~fred/sbox_root/{html,cgi-bin,log}
          
  6. Fix permissions of these directories:
          chown fred.users ~fred/sbox_root/{html,cgi-bin,log}
          

Mounting a copy of the miniroot in each user's directory

The alternative method avoids the waste of putting a complete copy of the root into each user's directory. One copy of the miniroot is mounted read-only into each user's sbox root, giving them read-only access to the mount. The main disadvantage of this strategy is that it generates a mount for each user, which in the case of very many user accounts might bump up against kernel limitations.

  1. Download the miniroot from stein.cshl.org/software/sbox/miniroot.img.gz.
  2. Unzip it:
          gunzip miniroot.img.gz
          
  3. Create the user's html and cgi-bin directories, as well as a directory called "mnt":
          mkdir ~fred/sbox_root/html
          mkdir ~fred/sbox_root/cgi-bin
          mkdir ~fred/sbox_root/mnt
          
  4. Mount the miniroot read-only on the user's mnt/ directory:
           mount ./miniroot.img ~fred/sbox_root/mnt -o ro,loop
           
  5. Create symlinks to the directories of the mounted filesystem:
           cd ~fred/sbox_root
           ln -s mnt/* .
           
  6. Fix permissions of html, cgi-bin and log.

At the end of this process, you should have a directory structure that looks like this:

lrwxrwxrwx  1 root root     7 Dec  4 18:18 bin -> mnt/bin
drwxrwxr-x  2 fred users   96 Dec  4 18:15 cgi-bin/
lrwxrwxrwx  1 root root     7 Dec  4 18:18 dev -> mnt/dev
lrwxrwxrwx  1 root root     7 Dec  4 18:18 etc -> mnt/etc
drwxr-xr-x  5 fred users 1136 Dec  4 18:15 html/
lrwxrwxrwx  1 root root     7 Dec  4 18:18 lib -> mnt/lib
lrwxrwxrwx  1 fred users    7 Dec  4 18:15 log/
drwxrwxr-x  2 root root    48 Dec  4 18:16 mnt/
lrwxrwxrwx  1 root root     7 Dec  4 18:18 tmp -> mnt/tmp
lrwxrwxrwx  1 root root     7 Dec  4 18:18 usr -> mnt/usr
lrwxrwxrwx  1 root root     7 Dec  4 18:18 var -> mnt/var

If you ever wish to modify the miniroot image, simply mount it read/write and make the changes you need. If you run out of space on the miniroot, you can create a new one with the following series of commands:

mkdir -p /mnt/miniroot /mnt/new_miniroot   # make sure both mount points exist
mount ./miniroot.img /mnt/miniroot -o ro,loop
dd if=/dev/zero of=./new_miniroot.img bs=1M count=100  # or whatever you want
mke2fs -F ./new_miniroot.img
mount ./new_miniroot.img /mnt/new_miniroot -o rw,loop
cd /mnt/miniroot
find . | cpio -p /mnt/new_miniroot

You are also free to burn the miniroot into a CDROM image, create a cramfs image, etc.


Tips

Here are a few pieces of advice and tips on making best use of sbox.

Setting up the Chroot directory

Many CGI scripts will not run correctly in a chroot environment unless they can find the resources they need. Compiled C programs often need access to shared libraries and/or device special files. Interpreted scripts need access to their interpreters, for example Perl. Feature-rich programs like sendmail depend on their configuration files being present in /etc.

As described above, you will need to turn the chroot directory into a miniature root file system, complete with /etc, /lib, /bin, /tmp and /dev directories. If the web server is running on a Linux system, then one option is to use the miniroot image provided with sbox as the basis for the root file system. If you prefer to do it yourself, I recommend that you create and test a chroot directory for one virtual host, then use it as a master copy for creating new virtual hosts every time you add a new user account. Both the cpio and the tar commands can be used to copy shared libraries and device special files safely.

Programs that check file ownerships may need access to password and/or group files in order for them to translate from numeric uid's and gid's to text names. In order to support CGI scripts that perform this type of action, you should place dummy copies of /etc/passwd and /etc/group in the author's /etc directory. These files should not contain real passwords, and should only contain standard system user accounts (e.g. "bin" and "mail"), plus any needed by the script. You probably don't want to make the complete list of user account names available to authors' CGI scripts!

If CGI scripts require access to the DNS system in order to resolve host names and IP addresses, you should place a copy of /etc/resolv.conf into the chroot directory. You may need to copy other configuration files to use certain feature-rich programs. For example, if scripts send e-mail using the sendmail program, you will need to install its configuration file, sendmail.cf.

Many programs redirect their output to the device special file /dev/null. Other programs need access to /dev/zero or other special files. You can copy these files from the real /dev directory using either cpio or tar. Alternately you can create the files from scratch using mknod, but only if you know what you're doing. You'll need to have superuser privileges to accomplish either of these tasks.

The Unix time system expects to find information about the local timezone in a compiled file named /etc/localtime. You may need to copy this into your chroot directory in order for the timezone to be correctly displayed. You can confirm that the correct timezone is being found by examining the output of the "env" executable.

There are two ways to finesse the problem of shared libraries. For compiled C programs, one option is to link the program statically (by providing the -static flag to the linker). A less laborious solution is to place copies of the required shared libraries in the new root's /lib directory (or /slib, for systems that use that directory for shared libraries). Many systems have a utility that lists the shared libraries required by a binary. Use this program to determine which shared libraries are required, and copy them over into each author's /lib directory. In addition to the shared libraries, you may need to copy the dynamic linker itself into the /lib directory. On my Linux system, this file is "ld-linux.so".

If an executable cannot find its shared libraries at run time, it will usually fail with a specific error message that will lead you to the problem; look in the server error log. If you get silent failures, it's probably the dynamic linker itself that can't be found.

Linux, and possibly some other systems, uses a cache file named /etc/ld.so.cache to resolve the location of library files. If this file isn't found at run time, the system will generate a warning but find the correct shared libraries nevertheless. The quick and dirty way to get rid of this warning is to copy the current cache file from the real /etc directory to the chroot one. However, this may have bad side effects (I haven't actually encountered any, but I worry about it.) It's better to make this cache file from scratch in the chroot environment itself. To do this, run the ldconfig program with the command-line version of chroot. You'll need to be root to do this:

# cd /sbin
# chroot ~fred/sbox_root ./ldconfig

Perl scripts, in addition to requiring the Perl interpreter, will often need access to the Perl lib directory in order to get at useful modules (such as CGI.pm). It's easiest to copy the whole Perl library tree to the correct location in the chroot directory, being careful to get the path right. For example, if the real Perl library is located in /usr/local/lib/perl5, you'll need to create a parallel /usr hierarchy in the chroot directory. On my system, I recompiled Perl to use /lib/perl5 and dumped the modules into that directory. If things get bollixed up, you can always tell Perl where to look for its libraries by adding something like this to the top of CGI scripts:

#!/bin/perl
BEGIN { push(@INC,'/lib/perl5','/lib/perl5/i586-linux/5.004'); }

The Document Root and the chroot() directory

Some CGI scripts act as filters on static HTML documents. Examples include PHP and various guestbook scripts. Such scripts often include the path to the static document appended to the end of the script's URL as "additional path information." For example:

http://your.site/~fred/guestbook.cgi/~fred/guestbook/data.txt

The script will be passed two environment variables, PATH_INFO, containing the additional path information, and PATH_TRANSLATED, containing the path information translated into an absolute filename. In the example above, the values of these variables might be:

PATH_INFO        /~fred/guestbook/data.txt
PATH_TRANSLATED  /home/fred/public_html/guestbook/data.txt

When sbox is running it interprets the additional path information as relative to the user's document root. This means that a document located in Fred's public_html directory can be referred to this way:

http://your.site/cgi-bin/sbox/~fred/guestbook.cgi/guestbook/data.txt

After performing the chroot(), sbox attempts to adjust PATH_TRANSLATED so that it continues to point to a valid file. If the user's document root is located within the chroot directory, then PATH_TRANSLATED is trimmed so that it is relative to the new root directory:

PATH_INFO        /guestbook/data.txt
PATH_TRANSLATED  /public_html/guestbook/data.txt

However, if the document root is entirely outside the new root directory, then sbox will simply use the same value for PATH_INFO and PATH_TRANSLATED:

PATH_INFO        /guestbook/data.txt
PATH_TRANSLATED  /guestbook/data.txt

Users and Webmasters should be aware of this behavior, as it can cause some confusion.
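
The two cases can be sketched in a few lines of Python. This is only an illustration of the behavior described in this section, not sbox's actual C code: the function name adjust_path_translated and its parameters are hypothetical, and the example assumes that Fred's chroot directory is /home/fred.

```python
import posixpath

def adjust_path_translated(path_info, doc_root, chroot_dir):
    """Return the PATH_TRANSLATED value a script might see after chroot().

    path_info  -- additional path information, relative to the document root
    doc_root   -- the user's document root (absolute path before chroot)
    chroot_dir -- the directory that becomes "/" after chroot()
    All three names are hypothetical; they are not sbox identifiers.
    """
    doc_root = posixpath.normpath(doc_root)
    chroot_dir = posixpath.normpath(chroot_dir)
    prefix = chroot_dir.rstrip("/") + "/"

    if doc_root.startswith(prefix):
        # Document root lies inside the new root: trim the chroot prefix
        # so the path is expressed relative to the new "/".
        doc_root_inside = "/" + doc_root[len(prefix):]
        return doc_root_inside + path_info

    # Document root is the chroot directory itself, or lies entirely
    # outside it: PATH_TRANSLATED simply repeats PATH_INFO.
    return path_info
```

With a document root of /home/fred/public_html and a chroot directory of /home/fred, the path information /guestbook/data.txt comes out as /public_html/guestbook/data.txt. With a document root such as /var/www/fred, which lies outside the chroot directory, it comes out unchanged as /guestbook/data.txt.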

The Resource Limitations

The default resource limits are reasonable. Most authors won't have problems with them unless they need to do number crunching or manipulate many files simultaneously. If need be, authors can raise the soft resource limits up to the levels imposed by the hard limit ceilings, which are very liberal. C programmers can do this directly by making calls to setrlimit(). Perl scripters should download and install Jarkko Hietaniemi's BSD::Resource module from CPAN.
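
For scripts written in a language that exposes setrlimit() in its standard library, the same idea looks roughly like this. The sketch below uses Python's resource module; the helper names clamp_to_hard and raise_soft_limit are hypothetical, and RLIMIT_NOFILE (the open file limit) is chosen only as an example, since which limits sbox actually sets depends on how it was configured.

```python
import resource

def clamp_to_hard(wanted, hard):
    """Return the largest soft limit not exceeding the hard ceiling."""
    if hard == resource.RLIM_INFINITY:
        return wanted              # no ceiling, any request is allowed
    if wanted == resource.RLIM_INFINITY or wanted > hard:
        return hard                # request exceeds the ceiling
    return wanted

def raise_soft_limit(which, wanted):
    """Set the soft limit for one resource, capped at the hard limit.

    Returns the soft limit that is in effect afterwards.
    """
    soft, hard = resource.getrlimit(which)
    new_soft = clamp_to_hard(wanted, hard)
    # The hard limit is passed back unchanged; an unprivileged process
    # can lower a hard limit but cannot raise it again.
    resource.setrlimit(which, (new_soft, hard))
    return resource.getrlimit(which)[0]
```

A script that needs many open files could call raise_soft_limit(resource.RLIMIT_NOFILE, 256) early on. If the hard ceiling is lower than the requested value, the soft limit is set to the ceiling instead of failing.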

Server-Side Includes

Because of design conflicts, "#exec" style server-side includes do not work correctly with sbox. However, the "#include virtual" command, which does almost exactly the same thing, does work correctly. To include the output of sbox-wrapped CGI scripts in server-side-include files, just write something like this:

<!--#include virtual="/cgi-bin/sbox/~fred/guestbook"-->

Rewrite-Rule Tricks

If you are running Apache 1.2 or higher, you can take advantage of the rewrite rule module (mod_rewrite) to make sbox transparent. For virtual hosts, you can add something like the following to the main server configuration or to the <VirtualHost> section:

RewriteEngine on
RewriteRule ^/cgi/(.*) /cgi-bin/sbox/$1 [PT,NS]

This rewrites every URL that starts with "/cgi/" so that it starts with "/cgi-bin/sbox/" instead. Authors can then refer to their scripts with:

http://www.virtual.host.com/cgi/script_name

and to main Web server scripts with:

http://www.virtual.host.com/cgi-bin/guestbook

For user-supported directories, this rewrite rule will allow users to refer to their scripts using http://www.host.com/~username/cgi/script_name:

RewriteEngine on
RewriteRule ^/~([^/]+)/cgi/(.+) /cgi-bin/sbox/~$1/$2 [PT,NS]
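
To check what these two patterns do to a given URL path before touching the server configuration, the mapping can be imitated with Python's re module. This is a rough sketch that reproduces only the pattern substitution, not the [PT,NS] flags or anything else Apache does. The RULES list and the rewrite function are hypothetical names, and putting both rules in one list is purely for illustration, since the text above shows them in separate configurations.

```python
import re

# (pattern, substitution) pairs mirroring the two RewriteRule lines.
# Apache writes back-references as $1, $2; Python writes \1, \2.
RULES = [
    (re.compile(r"^/~([^/]+)/cgi/(.+)"), r"/cgi-bin/sbox/~\1/\2"),
    (re.compile(r"^/cgi/(.*)"), r"/cgi-bin/sbox/\1"),
]

def rewrite(url_path):
    """Return the path after the first matching rule, or the path unchanged."""
    for pattern, template in RULES:
        match = pattern.search(url_path)
        if match:
            # As in Apache, the substitution replaces the whole path.
            return match.expand(template)
    return url_path

if __name__ == "__main__":
    for path in ("/cgi/script_name",
                 "/~fred/cgi/guestbook.cgi",
                 "/cgi-bin/guestbook"):
        print(path, "->", rewrite(path))
```

The third path is left alone because "/cgi-bin/" does not begin with "/cgi/", which is why main Web server scripts remain reachable under their usual URLs.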

If you are already using rewrite rules to allow users to control access with a .htaccess file, place the appropriate RewriteRule before the first RewriteCond and omit the [PT,NS] flags. The following two examples show RewriteRule blocks that will correctly respect .htaccess files:

For Virtual Hosts

RewriteRule ^/cgi/(.+) /cgi-bin/sbox/$1
RewriteCond %{REQUEST_FILENAME} ^/cgi-bin/sbox/(.+)
RewriteCond %{DOCUMENT_ROOT}/../cgi-bin/%1 !-F
RewriteRule ^/cgi-bin/sbox/(.+) %{DOCUMENT_ROOT}/../cgi-bin/$1  [L]
RewriteRule ^(/cgi-bin/sbox/.+) $1 [PT,NS]

For User Directories

RewriteRule ^/~([^/]+)/cgi/(.+) /cgi-bin/sbox/~$1/$2
RewriteCond %{REQUEST_FILENAME} ^/cgi-bin/sbox/~([^/]+)/(.+)
RewriteCond /home/%1/sbox_root/cgi-bin/%2 !-F
RewriteRule ^/cgi-bin/sbox/~([^/]+)/(.+) /home/$1/sbox_root/cgi-bin/$2  [L]
RewriteRule ^(/cgi-bin/sbox/~.+/.+) $1 [PT,NS]

The env Script

This distribution comes with a small statically linked binary called "env" that you can call as a CGI script. It prints out some information about the current environment, including the user and group IDs, the current working directory, and the environment variables, to help you determine whether sbox is configured correctly and working as expected.
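
If you want the same kind of report from a script of your own, something along the following lines may do. It is a hypothetical stand-in rather than the source of the bundled env program, whose exact output format is not described here. Unlike the statically linked binary, it also needs a working Python interpreter inside the chroot directory.

```python
import os
import sys

def environment_report(environ=None):
    """Build a plain-text report of process IDs, directory and environment."""
    environ = os.environ if environ is None else environ
    lines = [
        "uid=%d euid=%d" % (os.getuid(), os.geteuid()),
        "gid=%d egid=%d" % (os.getgid(), os.getegid()),
        "cwd=%s" % os.getcwd(),
    ]
    # One NAME=value line per environment variable, sorted by name.
    lines.extend("%s=%s" % (name, environ[name]) for name in sorted(environ))
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Minimal CGI response: header, blank line, then the report.
    sys.stdout.write("Content-type: text/plain\r\n\r\n")
    sys.stdout.write(environment_report())
```

When the script is run through sbox, the reported user and group IDs should be those of the script's owner rather than the Web server's, and the working directory should reflect the chroot if one is in effect.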


Author Information

This utility is ©1997-2005 Lincoln D. Stein. It can be used freely and redistributed in source code and binary form. I request that this documentation, including the copyright statement, remain attached to the utility if you redistribute it. You are free to make modifications, but please attach a note stating the changes you made.


Change History

Version 1.10
Revamped documentation to show how to get .htaccess and 404 Not Found errors to work correctly.
Added an example root directory for use in chroot mode.

Versions 1.08-1.09
Never released.

Version 1.07
Patch from Jukka Forsgren to cause script to chdir() into target directory in the same manner as Apache does.

Version 1.06
Fixed cross-scripting security vulnerability identified by Ivan Schmid (ivan.schmid@astalavista.ch)

Version 1.05
Lost version.

Version 1.04
Changes to make sbox compile with egcs version 1.1.2.
Fixed problem of CGI scripts not being able to access command line variables (courtesy Sean Gabriel Heacock).
If logfile can't be opened, logs to standard error instead.

Version 1.03
Added USE_ABSOLUTE_ROOT functionality, contributed by Grant Kaufmann.

Version 1.02
Fixed a crash that occurred when configured userid or groupid is not in passwd/group file (patch provided by Terry Lorrah <delikon@itw.net>).

Version 1.01
Fixed minor bug in webmaster's error message.
Fixed minor bug in reporting gid to log file.

Version 1.00
Replaced all occurrences of strcpy() and strcat() with strncpy() and strncat().
Changes to string constants to make more ANSI-compatible.
Code cleanup.

Versions 0.98-0.99
Documentation fixes.

Version 0.97
Fixed bugs relating to automounter confusion.

Version 0.95
Fixes to compile and run on Solaris systems. Still not extensively tested, but no bug reports yet.

Version 0.90
Beta release. Use with caution.

Lincoln D. Stein, lstein@cshl.org
Cold Spring Harbor Laboratory
Last modified: Mon Dec 5 15:58:19 EST 2005