How To Setup A Portable Unix Configuration

Author: Martin Blais
Contact: blais@furius.ca
Status: First draft: 1999-03
LastUpdate:$Date$

Abstract

Description of a proven organization system for a user's configuration files under UNIX, to support multiple platforms, multiple sites and hosts.

Introduction

Unix software configuration files come in many varieties. They are written in many different (often funny) languages, and for the seasoned hacker they are often the result of many precious hours of tweaking and learning. In my personal case it has become pathological: I dub it the Custom-Configuration Syndrome (CCS), hours and hours spent tailoring my environment.

Obviously, one doesn't want to redo this work every time one changes location, company, school, platform, etc. Moreover, one may share the same configuration at, say, home and work. For example, my emacs configuration files have been growing with me since I first started using it. For almost ten years now, I have been using more or less the same setup, which I devised to store my configuration files efficiently and minimally.

This document explains how my configuration files are organized, so that they are easily portable across various platforms and sites.

We assume here that this configuration is to happen for a specific user and so all the paths mentioned in this document are relative to the user's home directory.

Scopes

We first need to define scopes of applicability, to factor out the different variables that our configuration files depend on.

Definitions

version
A specific version of a computer program e.g. the current version of emacs is 20.3.
host
A specific machine. We assume here that a host can be uniquely identified by its DNS name e.g. vandam.iro.umontreal.ca, and as such the definition of host comprises the complete name (i.e. host + domain).
platform
A specific version of an operating system running on a specific computer architecture. Some examples: Solaris 2.5 running on an UltraSPARC Sun machine, IRIX 6.5 running on an SGI Octane with a MIPS processor, Linux 2.2 with glibc5 running on an Intel 486 processor PC.
site
An organization, an institute, a university lab, a software company, your home. In other words, a place where you're logging in.

Basic scopes

  • common scope: settings that are to be shared across all platforms;
  • site-specific scope: settings that are only valid at one site;
  • platform-specific scope: settings that are only valid on one platform;
  • host-specific scope: settings that are only valid on one host;
  • version-specific scope: settings that are only valid for certain versions of certain software.

Combined scopes

Of the five basic scopes, the last three are of specific interest. They can be combined. Let's examine the possible combinations:

  • site- AND platform- specific scope: settings that are only valid for a certain platform at a certain site. This combination is common in a university setting where there are many labs with different platforms, but your account is shared from the same server via NFS;
  • host- AND platform- specific scope: settings that are only valid on a certain host under a certain platform. This is only useful when a particular machine is configured to dual-boot different operating systems, for example, Linux and Windows.

Note that a site- and host- specific scope does not really make sense, because we assume that a host lies within a specific site. The same applies to a site- and platform- and host- specific scope. Host means site-host.

We must note that in practice, the common, site-specific and platform-specific scopes cover 99% of the cases.

Configuration files vs. Locally Installed files

The essence of the configuration files directory is user-specific configuration for your software. A question arises: should software installed by and for the user only be stored in this directory [1]? Technically, it could, since the different scopes support it, and it is easy to get tempted into doing this. It can sometimes be handy to have scripts and other pieces of software related to configuration available on all the platforms and sites.

Installing software in the configuration tree has several drawbacks:

  • The configuration directory tree becomes HUGE. We do not want that, since we want to be able to replicate it at many different sites, and manage it under version control (e.g. CVS). Thus this directory should be limited to relatively small text files, and very small binaries, only if necessary (for example, my PGP keyring is in there).
  • Some software that is installed locally at one site may not work at other sites when simply replicated, because it may depend on other files at runtime.
  • Binary files are not managed efficiently under version control.

I propose that most of the user's own software installations happen under a different root directory, which we will call the "local configuration", and which will be identified by the environment variable CONF_LOCAL. Under this directory, a hierarchy similar to the one present under the configuration directory should be present (see description below).

As a degenerate case, CONF_LOCAL can be $HOME, in which case the user's home directory would contain ~/common and ~/plat subdirectory hierarchies.

[1] Note that we do not refer here to machine-wide software installations, which are still done in the usual places, i.e. /usr, /usr/local, /opt or wherever. Sometimes a user wants to install software for their own use only, or perhaps does not officially have root access (for example, in a university lab setting).

How it works

Basically, each file is split into smaller scope-specific files. A main file "includes" the scope-specific files with whatever mechanism is provided by the parsing engine that executes that file. If it's a shell script, the scope-specific files are "sourced"; if it's a .emacs lisp file, the scope-specific .emacs files are "loaded", etc.

The dot files that are expected to be found in the user's home directory are symbolic links into the configuration tree, created by a home-made script written to do just that. For example, for my .emacs:

~/.emacs -> conf/common/etc/emacs
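
A linking script of this kind might look like the minimal sketch below. It is not the author's script: the function name link_dotfiles is hypothetical, and it assumes that the files under an etc directory carry no leading dot, as in the example above. Unlike that example, it creates links to whatever path it is given, which will usually be absolute rather than relative.

```shell
# Hypothetical helper: link every regular file of an etc directory into a
# home directory as a dot file, e.g. <etc>/emacs -> <home>/.emacs.
# Real files already present in the home directory are left untouched;
# stale symbolic links are replaced.
link_dotfiles() {
    etc_dir=$1
    home_dir=$2
    for src in "$etc_dir"/*; do
        if [ ! -f "$src" ]; then
            continue                      # skip directories and empty globs
        fi
        dst=$home_dir/.`basename "$src"`
        if [ -L "$dst" ]; then
            rm -f "$dst"                  # replace an older link
        elif [ -e "$dst" ]; then
            echo "skipping $dst: a real file is in the way" >&2
            continue
        fi
        ln -s "$src" "$dst"
    done
}

# Typical use (assumes CONF is set):
#   link_dotfiles "$CONF/common/etc" "$HOME"
```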

Note: often, one must pay attention to the order in which the root scripts include the other scripts.
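
For a shell file, the inclusion mechanism could be sketched as follows. The function name source_scopes is hypothetical, and the ordering (most general scope first, so that more specific scopes can override earlier settings) is an assumption, not something this document prescribes. The directory names follow the suggested hierarchy described later.

```shell
# Hypothetical helper: source the scope-specific variants of one file,
# from the most general scope to the most specific, skipping scopes
# where the file does not exist.
# Assumes CONF, SITE, PLAT and HOST are already set.
source_scopes() {
    _scope_name=$1    # file name under etc/, e.g. "env" or "bashrc"
    for _scope_dir in \
        "$CONF/common" \
        "$CONF/plat/$PLAT" \
        "$CONF-$SITE" \
        "$CONF-$SITE/plat/$PLAT" \
        "$CONF-$SITE/host/$HOST" \
        "$CONF-$SITE/host/$HOST/plat/$PLAT"
    do
        if [ -r "$_scope_dir/etc/$_scope_name" ]; then
            . "$_scope_dir/etc/$_scope_name"
        fi
    done
}

# Typical use in a root env file:
#   source_scopes env
```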

Environment Initialization

Initializing a shell's environment properly is rather important, since so many applications depend on it.

Critical Configuration Variables

Some environment variables are critical to implement this configuration scheme. In our setup, we assume that the following environment variables are always defined. My configuration files depend on the existence of only these variables in their running environment.

  • HOME: the user's home directory. This is most often already set by the system when logging in;
  • PLAT: contains a string that uniquely represents the platform. One can think of many ways to have this set automatically at login. I have successfully been using GNU's config.guess script, which does a pretty good job at generating a unique "canonical name" for the platform you're running on. It also has the added bonus that someone else is maintaining this painful script. It generates strings of the general form <architecture>-<maker>-<os name><os version>. Some examples: mips-sgi-irix6.5, i686-pc-linux, i686-pc-winnt4.0, rs6000-ibm-aix3.2, sparc-sun-solaris2.4;
  • SITE: the current site, identified by a unique string. Examples are: unilab, mycompany. It can be set in the .xsession (that's what I do) or could be determined by a tailored script that checks for a list of hosts;
  • HOST: a string that identifies the host we're on. Hosts are unique within a specific site, and the host-specific files should be located under that site's directory;
  • CONF: the directory where the configuration files are located. It can lie anywhere, but this directory is probably one of the user's many projects. (For reference, I used to use $HOME/conf for a long time, now I put the directory alongside my other projects).

Since there are potentially multiple entry points for the initialization files, I set those critical variables in $HOME/.conf; here are typical contents:

CONF=$HOME/p/conf
CONF_LOCAL=$HOME/p/conf-local
PLAT=`$CONF/common/bin/config.guess`
SITE=home
HOST=`hostname`  # Linux.

This can be sourced by the various initialization files.
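
Instead of hard-coding SITE, it could be derived from the host name, as suggested in the list of variables. The sketch below is only an illustration: the function name guess_site is hypothetical, and the domain patterns and the mapping to site names are placeholders to adapt to your own sites.

```shell
# Hypothetical helper: map a fully qualified host name to a site name.
# The patterns and the resulting site names are placeholders.
guess_site() {
    case $1 in
        *.iro.umontreal.ca)   echo unilab ;;
        *.mycompany.example)  echo mycompany ;;
        *)                    echo home ;;      # fallback for unknown hosts
    esac
}

# In $HOME/.conf one could then write, after HOST has been set:
#   SITE=`guess_site "$HOST"`
```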

Initialization Levels

The initialization files that get run for various shells do not match the reality of the various scenarios where we desire specific sets of initializations. To simplify and rationalize the selection of the initializations that are performed for each scenario, we define three levels of initialization of the environment:

  1. conf: Initialize the critical configuration variables (CONF, SITE, ...)
  2. env: Bourne-shell compatible environment settings (*/env)
  3. bash: Bash initialization (*/bashrc)

The trick to getting the configuration right is to create generic initialization files that detect the various scenarios described below and then dispatch to the initialization levels we want for each scenario (see figure).

[Figure: initdispatch.png]

We create generic dot files in $CONF/init, to which we link, and make these dot files invoke the various initialization steps as we need them.

Scenarios

Here are the scenarios under consideration.

Scenario: Logging into the console.

What gets run: /etc/profile + .bash_profile, .profile

The levels we want: conf, env, bash

Scenario: Logging into an X11 session.

What gets run: /etc/profile + .bash_profile, .profile

The levels we want: conf

Note: we could tolerate environment initializations here, but I find it much cleaner not to have them: a lot less code runs during X session setup, so the chances of something going wrong and preventing you from logging in are slim. xprofile is run for those minimal initializations. We do need the basic configuration variables, however, for the window manager commands themselves.

Scenario: Remote ssh login.

What gets run: /etc/profile + .bash_profile, .profile

The levels we want: conf, env, bash

Scenario: Starting a shell from the virtual terminal: .bashrc

What gets run: .bashrc

The levels we want: conf, env, bash

An alternative would be to let the X session fully initialize its environment and to do nothing here. This would make the shells start very fast. A disadvantage of this method is that you cannot simply change the init files and then restart a virtual terminal: you have to fully log out to change your shells' environments.

Scenario: Remote ssh command: .bashrc

What gets run: .bashrc

The levels we want: conf, env

Note that we do not want to run bash-specific initializations for non-interactive shells: we want to save time, so that the commands run as fast as possible. However, we do want a setup where the environment variables and path have been set correctly.
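
The choice of levels for the two .bashrc scenarios above can be reduced to a small dispatch function. This is a sketch of how a generic .bashrc might decide, not the author's actual init files; the function name init_levels is hypothetical, and it relies only on the shell's option flags (the value of $-) to tell interactive shells from non-interactive ones.

```shell
# Hypothetical dispatcher: print the initialization levels wanted for a
# shell, given its option flags (the value of $-).
init_levels() {
    case $1 in
        *i*) echo "conf env bash" ;;   # interactive shell: full setup
        *)   echo "conf env" ;;        # e.g. remote ssh command: no bash level
    esac
}

# A generic .bashrc could then loop over the result and run each level:
#   for level in `init_levels "$-"`; do
#       ...run the initialization for $level...
#   done
```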

Constraints

  • We need the variables CONF, CONF_LOCAL, PLAT, SITE, HOST as early as possible, and they are potentially different on each machine.
  • We cannot run /etc/profile more than once, because a warning is issued from bash-completion if you do (due to a read-only variable), and some initialization may concatenate path components to environment variables multiple times.
  • When logging in from a remote shell, we want to have the environment setup, but not the bash aliases, so that remote commands run reasonably fast.
  • We want a consistent clean PATH variable in all scenarios. We need to save the original path that we get after running the profile somehow.
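
One possible way to satisfy the last constraint is to remember the PATH the first time the environment level runs and to rebuild from that saved value on every later run. The sketch below is an assumption about how this could be done; the names reset_path and CONF_ORIG_PATH are hypothetical.

```shell
# Hypothetical guard: remember the PATH produced by the system profile the
# first time through, and start again from that saved value on later runs,
# so that path components are never appended twice.
reset_path() {
    if [ -z "${CONF_ORIG_PATH:-}" ]; then
        CONF_ORIG_PATH=$PATH
        export CONF_ORIG_PATH          # inherited by child shells
    fi
    PATH=$CONF_ORIG_PATH
}

# Typical use at the top of the env level:
#   reset_path
#   PATH=$CONF/common/bin:$PATH
```

Because the saved value is exported, a shell started from a virtual terminal begins again from the same clean path as its parent session.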

Summary of Bash init sequence

(For reference.)

Detection:
Login shell: first char of $0 is '-'
Interactive shell: PS1 is set and $- includes i
  • Interactive

    • Login

      1. /etc/profile
      2. First of: ~/.bash_profile, ~/.bash_login, ~/.profile
    • Not-login

      1. ~/.bashrc
  • Non-interactive

    1. BASH_ENV, if set.
  • When a graphical session login script is run, what happens depends on the Xsession script, on which xdm is used, on the distribution, etc. For example, gdm checks manually for .profile, .xprofile and more.

Suggested directory hierarchy

Here is the suggested hierarchy for the configuration files directory:

Root of all configuration files:

$CONF

Scope specific directories (see scopes above):

$CONF/init/...                             global init files,
                                           include all the scopes

$CONF/common/...                           common (shared) scope

$CONF/plat/$PLAT/...                       platform scope

$CONF-$SITE/...                            site scope
$CONF-$SITE/plat/$PLAT/...                 site and platform scope
$CONF-$SITE/host/$HOST/...                 host scope
$CONF-$SITE/host/$HOST/plat/$PLAT/...      host and platform scope

In addition, I use the same principles to manage my local installs (see section on this above), that is:

$CONF_LOCAL/common/...      common stuff, install logs, very local
                            or installed scripts that I didn't write
$CONF_LOCAL/plat/$PLAT/...  platform-specific local installations of software

Under each of those directories the following files and directories can be found:

../etc/...         all dot files for that scope
../etc/env         environment variables (bourne-shell level)
../etc/bashrc      bash-specific settings and environment
../etc/inputrc     readline (for bash) keyboard changes
../etc/emacs       dot emacs
../etc/Xmodmap     modifications to be made to X keyboard mappings
../etc/Xclients    X clients to run upon logging in
../etc/procmailrc  procmail processing
../etc/            ... LOTS OF OTHER DOT FILES

../bin/            executable scripts
                   (few and only small binary executables)

../lib/            other files needed by scripts
../lib/python/     Only custom Python libraries/packages (.py)
../lib/perl/       Only custom Perl libraries/packages (.pm)
../lib/java/       java class files (very few and small)

../man/            man pages that I really need outside of
                   installed applications (very few)
../include/        non application-installed includes (extremely few)
../info/           texinfo documentation for stuff installed under
                   $CONF (not much)

../share/          application data
../share/images/   images needed for some config stuff
                   (really few, e.g. some personal window manager icons)
../share/sounds/   sounds needed for some config stuff
                   (very small files only, and only what is truly necessary)

../app-defaults/   X application-specific resource files

../elisp/          emacs-lisp files and custom packages
../elisp/emacs/    specific to GNU emacs
../elisp/xemacs/   specific to XEmacs (I almost never need it)

../tex/            TeX-related stuff
../tex/bib/        BibTeX bibliography files
../tex/bst/        BibTeX style files
../tex/doc/        LaTeX packages documentation
../tex/inputs/     LaTeX input files (.sty, .cls)

../gnupg/          public keyring and options file
                   (store your secret keyring somewhere else)

../src/            source code to be built for stuff that is
                   absolutely necessary for config purposes (almost nothing)

../log/            logs (note: in local installations subdirectory only)

Notes: in practice, some of the files branch on the configuration variables when it is convenient. For example, in my common scope .emacs, I often branch on the "site" within that single file because it is simpler than doing an override in the site-specific file or "not requiring some package" this way.

Personal Backups

Here are a few issues that you want to be aware of when designing a system/script for doing your personal backups:

Note

I have built Arnie, a Python script to do remote incremental backups in a simple, naive but reliable way.

Archiving and Indexing Email

Eventually, when gigabytes are a laughable and meaningless quantity of data, I would like to group all my email from all my backups and generate a searchable index of it, by parsing the headers and generating keywords from the body texts.

Search Path

Let's examine the path we need to set up for accessing all the scripts and other things that may be in this configuration. The following directories should be included in the path, if they exist:

$CONF/common/bin
$CONF/plat/$PLAT/bin
$CONF-$SITE/bin
$CONF-$SITE/plat/$PLAT/bin
$CONF-$SITE/host/$HOST/bin
$CONF-$SITE/host/$HOST/plat/$PLAT/bin
$CONF_LOCAL/common/bin
$CONF_LOCAL/plat/$PLAT/bin

This makes a pretty long addition to your path. With many UNIX shells, a longer path means a slower lookup for your executables.

A better way to make up the path is to have a script filter out those parts of the path that don't exist. Working this way, the same script can also be reused for sh- and csh-derived shells, helping maintain consistency and reducing the maintenance overhead of environment scripts if you use multiple shells. The same principle applies to the library search path LD_LIBRARY_PATH.
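
The filtering described above could be sketched as follows. The function name filter_path is hypothetical, and dropping duplicates is an addition that the text does not ask for. It is written as a function here; saved as a standalone script that prints its result, it could be called from a csh-derived shell as well.

```shell
# Hypothetical helper: print a colon-separated path made only of those
# arguments that are existing directories, dropping duplicates.
filter_path() {
    result=
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            continue                   # drop directories that do not exist
        fi
        case ":$result:" in
            *":$dir:"*) continue ;;    # already present
        esac
        result=${result:+$result:}$dir
    done
    echo "$result"
}

# Usage sketch (assumes the critical variables are set):
#   PATH=`filter_path "$CONF/common/bin" "$CONF/plat/$PLAT/bin" \
#       "$CONF-$SITE/bin" "$CONF_LOCAL/common/bin"`:$PATH
```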

User Software Installation

If you compile lots of free/open source software, it may be worthwhile to keep logs of how you compiled certain software. The need for this is reduced under Linux now that this information can often be packaged within the different packaging systems (such as RPM).

Autoconf

Installing new software that is using autoconf should be easy. You can use the following command:

./configure --prefix=$CONF_LOCAL/plat/$PLAT

...or if you use multiple platforms at your site:

./configure --prefix=$CONF_LOCAL/common --exec-prefix=$CONF_LOCAL/plat/$PLAT
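
To keep a log of how each package was configured, the configure invocation can be wrapped in a small helper that records the command line before running it. This is only a sketch: the function name log_command and the log file location are assumptions, the latter based on the log/ directory of the local installation tree.

```shell
# Hypothetical helper: append a command line, with the date and the
# current directory, to a log file, then run the command.
log_command() {
    log_file=$1
    shift
    mkdir -p "`dirname "$log_file"`"
    echo "`date '+%Y-%m-%d'` `pwd`: $*" >> "$log_file"
    "$@"
}

# Usage sketch:
#   log_command "$CONF_LOCAL/common/log/configure.log" \
#       ./configure --prefix="$CONF_LOCAL/plat/$PLAT"
```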

Propagating the DISPLAY environment variable

When logging in remotely to a machine from an X session, it is convenient to have the DISPLAY environment variable propagated automatically to the remote session. This is easily carried out if your account is NFS-mounted on the remote machine: a file containing your current DISPLAY can be read and the display set from that. This file can be created automatically at the time you log on to your X session.
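
The mechanism above could be sketched with two small functions. The names save_display and load_display and the file name $HOME/.display are hypothetical. Qualifying a local display such as :0 with the host name is an assumption on my part, made so that the saved value is meaningful on another machine.

```shell
# Hypothetical location of the file holding the display; it must live in
# the NFS-shared home directory to be visible from the remote machine.
DISPLAY_FILE=${DISPLAY_FILE:-$HOME/.display}

# Call when the X session starts: record the current display.
save_display() {
    if [ -z "${DISPLAY:-}" ]; then
        return 0                               # no X display, nothing to save
    fi
    case $DISPLAY in
        :*) echo "`hostname`$DISPLAY" ;;       # local display: add host name
        *)  echo "$DISPLAY" ;;
    esac > "$DISPLAY_FILE"
}

# Call from the remote shell's init files: set DISPLAY if it is not set.
load_display() {
    if [ -z "${DISPLAY:-}" ] && [ -r "$DISPLAY_FILE" ]; then
        DISPLAY=`cat "$DISPLAY_FILE"`
        export DISPLAY
    fi
}
```

An already set DISPLAY is left alone, so a value provided by the login mechanism itself takes precedence over the saved file.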

Acknowledgements

Many of the ideas present in this document originate from interesting discussions with Stefan Monnier and Dominik Madon, a long time ago while we were at EPFL (1994-1995).