Notes on database configuration issues (7/26/05)

The current SEED bootstrap configuration tools know how to completely set up a Postgres database environment dedicated to serving a single SEED. They also know how to set up a MySQL database environment configured to use a default system MySQL server.

1 Configuration basics

The basic mechanism is as follows. Each environment (the "arch" in the ./configure <arch> command) has a configuration file. The Postgres-based configs just do the following:

   $pg_script = "$ENV{FIGCONFIG_RELEASE_DIR}/FigCommon/envsub.postgres.pl"; 
   run_script($pg_script); 

Similarly, the MySQL-based ones do this:

   $pg_script = "$ENV{FIGCONFIG_RELEASE_DIR}/FigCommon/envsub.mysql.pl"; 
   run_script($pg_script); 

Each database configuration script is responsible for setting up the following keys in the $Config hash:

Key         Valid values  Description
dbms        mysql or Pg   Database type (the DBI identifier for the database in use)
db          string        Name of database
dbuser      string        Database username
dbpass      string        Database password
dbport      integer       TCP port the database listens on
dbsocket    pathname      Unix socket the database listens on
sproutDB    string        Name of the Sprout database
db_datadir  pathname      Directory containing the database's internal files
preIndex    0 or 1        Set to 1 if the index should be created before loading data tables

The configuration may also set up database-specific environment variables. These give the user the environment needed to run the database's command-line tools.

Database system  Environment variable  Value     Description
Postgres         PGHOST                string    Default server name. If this begins with a slash, it specifies Unix-domain communication rather than TCP/IP communication; the value is the name of the directory in which the socket file is stored (default /tmp).
Postgres         PGPORT                integer   Default TCP port number or Unix-domain socket file extension for communicating with the PostgreSQL backend.
MySQL            MYSQL_HOST            hostname  The default hostname used by the mysql command-line client.
MySQL            MYSQL_TCP_PORT        integer   The default TCP/IP port number.
MySQL            MYSQL_UNIX_PORT       pathname  The default Unix socket filename; used for connections to localhost.
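
As a rough illustration of the environment variables listed above, the following sketch prints the export lines that a generated environment file might contain. The helper name emit_db_env and its argument order are hypothetical and are not part of the SEED tools; only the variable names come from the table.

```shell
# Hypothetical helper (not part of SEED): print export lines for the
# environment variables that the chosen database's command-line tools read.
# Arguments: database type, host (or socket directory), TCP port, Unix socket.
emit_db_env() {
    env_type=$1
    env_host=$2
    env_port=$3
    env_socket=$4
    case "$env_type" in
        postgres)
            # A PGHOST value beginning with a slash names the socket directory.
            printf 'export PGHOST=%s\n' "$env_host"
            printf 'export PGPORT=%s\n' "$env_port"
            ;;
        mysql)
            printf 'export MYSQL_HOST=%s\n' "$env_host"
            printf 'export MYSQL_TCP_PORT=%s\n' "$env_port"
            # The socket variable is only emitted when a socket path is given.
            if [ -n "$env_socket" ]; then
                printf 'export MYSQL_UNIX_PORT=%s\n' "$env_socket"
            fi
            ;;
        *)
            echo "unknown database type: $env_type" >&2
            return 1
            ;;
    esac
}

emit_db_env postgres /tmp 5432 ""
# prints:
#   export PGHOST=/tmp
#   export PGPORT=5432
```

The values are printed rather than exported directly so that the same output could be written into a file for the user to source.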

2 Configuration choices

The choice of database system is made by the user at the time a SEED instance is configured. There are two main choices to be made: which database system to use, and whether the SEED installer should set up a local server to provide that service.

These choices result in two chunks of configuration logic being invoked. One involves the bootstrapping of the database directory, if the user has chosen to have the SEED instance host the database. The other involves the configuration of the location of the database. If the SEED is hosting the database, much of this choice can be made automatically. If it is not, the installer can guess (perhaps using the standard defaults for the database in question), but will generally have to defer to the user's knowledge.

The next hurdle is determining how the configuration script is notified of this information. An ideal solution would allow interactive configuration for new users, and batch mode via command-line options for advanced users or for script-based configuration. In either case, the following configuration choices must be made. (Note that this is distinct from the list of configuration settings with which we opened this discussion; those are implementation details for the following user-visible choices.)

Option             Valid values       Description
db-type            postgres or mysql  Database system
db-locally-hosted  boolean            True if the SEED instance is going to host the database
db-port            integer            TCP port the server listens on
db-socket          pathname           Unix socket the server listens on
db-host            hostname           Hostname the database is on. Defaults to localhost; implicit if db-locally-hosted is true
db-dir             pathname           Directory into which the database internal files will be installed. Defaults to the FIGdb directory in the FIGdisk
db-seed-name       string             Database instance name for the SEED database. Defaults to fig
db-sprout-name     string             Database instance name for the Sprout database. Defaults to Sprout
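
One way the batch-mode side of these choices might look is sketched below. The option names and the stated defaults (localhost, fig, Sprout) come from the list above; the --name=value syntax, the resolve_db_options function, and the fallback to the standard ports (5432 for Postgres, 3306 for MySQL) are assumptions made for illustration, not the installer's actual behavior. The db-dir default is left unset because the notes do not give its path.

```shell
# Sketch only: parse --db-* options and fill in defaults.
# Results are left in shell variables named after the options.
resolve_db_options() {
    db_type="" db_locally_hosted=false db_port="" db_socket="" db_dir=""
    db_host=localhost db_seed_name=fig db_sprout_name=Sprout
    for arg in "$@"; do
        case "$arg" in
            --db-type=*)         db_type=${arg#*=} ;;
            --db-locally-hosted) db_locally_hosted=true ;;
            --db-port=*)         db_port=${arg#*=} ;;
            --db-socket=*)       db_socket=${arg#*=} ;;
            --db-host=*)         db_host=${arg#*=} ;;
            --db-dir=*)          db_dir=${arg#*=} ;;
            --db-seed-name=*)    db_seed_name=${arg#*=} ;;
            --db-sprout-name=*)  db_sprout_name=${arg#*=} ;;
            *) echo "unknown option: $arg" >&2; return 1 ;;
        esac
    done
    # Assumed fallback: the standard port for the chosen database system.
    case "$db_type" in
        postgres) : "${db_port:=5432}" ;;
        mysql)    : "${db_port:=3306}" ;;
        *) echo "db-type must be postgres or mysql" >&2; return 1 ;;
    esac
    # A locally hosted database implies localhost, whatever --db-host said.
    if [ "$db_locally_hosted" = true ]; then
        db_host=localhost
    fi
}

resolve_db_options --db-type=mysql --db-locally-hosted
echo "$db_type $db_host $db_port $db_seed_name $db_sprout_name"
# prints: mysql localhost 3306 fig Sprout
```

An interactive front end could prompt for the same values and hand them to the same function, so both modes would share one set of defaults.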

3 How does it work

After FIG_Config is created with the proper settings, fig-user-env is created with the proper environment, and the user has sourced fig-user-env, the user can finish the configuration. This is done by running init_FIG, which does the following:

Since we will always need to perform the init step after a configure (*), we can roll it into the configure process, though it will most likely be its own program invoked at the end of configuration.

One twitchy part of this is that before the database instance can be created, the database server must be started. Hence we have the following sequence of events:

  1. Run configure to create FIG_Config.pm and fig-user-env.sh.
  2. Run init_dbserver to create the database directory, if we are hosting the database within the SEED environment.
  3. Run start_dbserver to bring up the database, if we are hosting the database within the SEED environment.
  4. Run init_FIG to set up the database instance.
  5. Run fig load_all to load the SEED data.
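
The ordering above can be sketched as a small function that prints the commands to run, skipping the two server steps when the database is hosted elsewhere. The function name seed_setup_plan is hypothetical, and command arguments are omitted because these notes do not specify them.

```shell
# Sketch only: print the setup commands in the order they must run.
# Argument: "true" if the SEED instance hosts the database itself.
seed_setup_plan() {
    plan_locally_hosted=$1
    echo "configure"
    if [ "$plan_locally_hosted" = true ]; then
        # The server must exist and be running before init_FIG can
        # create the database instance.
        echo "init_dbserver"
        echo "start_dbserver"
    fi
    echo "init_FIG"
    echo "fig load_all"
}

seed_setup_plan false
# prints:
#   configure
#   init_FIG
#   fig load_all
```

A driver script could execute each printed line in turn and stop at the first failure, which would keep init_FIG from running against a server that never started.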

(*) Is this true? What if we are reconfiguring an existing system to fix up perl paths? -- That should probably be made an explicit step - something on the order of "configure-environment".

