=========================================
GAVO DaCHS installation and configuration
=========================================

.. contents::
  :depth: 3
  :backlinks: entry
  :class: toc

These installation instructions cover the installation of the complete
data center suite.  Installing libraries or, say, the tapsh, is much less
involved.  See the respective pages at the `GAVO DC's software
distribution pages`_ for details on those.

.. _GAVO DC's software distribution pages: http://vo.ari.uni-heidelberg.de/soft

Basic Installation
==================

Dependencies
------------

DaCHS has quite a few dependencies, so you are doing yourself a favour if
you run Debian or some Debian-derived distribution.  On such a system, ::

  sudo aptitude install python-dev build-essential \
    python-nevow python-psycopg2 python-pyparsing python-pyfits \
    python-numpy python-imaging python-soappy python-zsi \
    python-setuptools libxml2 subversion \
    libedit-dev libxslt-dev libpam-dev libreadline-dev

should pull in everything that's required apart from the postgres engine
itself.  For that, install ``postgresql-``; versions supported right now
include 8.3 and 8.4, but it's worth trying more recent ones.  You will
also need to install the server development package.  On 2011 Debian
stable, this would look like this::

  PGVERSION=8.4
  sudo aptitude install postgresql-$PGVERSION postgresql-server-dev-$PGVERSION

PgSphere
''''''''

PgSphere is a postgres extension for spherical geometry.  It is needed
for support of the geometric types in DaCHS' ADQL implementation and in
the preferred SIAP backend, so you should definitely install it.  Obtain
the source from http://pgsphere.projects.postgresql.org/, and in the
source directory run::

  USE_PGXS=1 make
  sudo USE_PGXS=1 make install

Q3C
'''

Q3C is a library by Sergey Koposov et al,
http://www.sai.msu.su/~megera/oddmuse/index.cgi/SkyPixelization.
DaCHS uses it for positional indexes (the scs#q3cindex mixin) and in the
interpretation of ADQL.  It is therefore highly recommended to install it.
To do that, get the source from the web site given above, and in the
source directory run::

  make
  sudo make install

Debian systems
--------------

The preferred way to run DaCHS is on Debian stable or compatible systems.
However, for the complete data center suite, the apt-based approach is
not recommended yet, in particular since it is quite possible that you
will run into bugs, and it will be much easier for us if you can update
directly from our subversion repository.

Still, you should `add our apt repository`_ to your system's
sources.list.  When you have done this, you *could* do::

  sudo aptitude install gavodachs

But really, if you want to operate a server, you should, for now, work
from svn.

.. _add our apt repository: http://vo.ari.uni-heidelberg.de/soft/repo

Working with the subversion source
----------------------------------

Getting the source
''''''''''''''''''

Though we may provide releases_ now and then, you probably should just
check out whatever is in the subversion repository right now.  Say, at
some place you can write to::

  svn co http://svn.ari.uni-heidelberg.de/svn/gavo/python/trunk/ dachs

After that, the current source code is in the ``dachs`` subdirectory.

This is development code, so *please* do not hesitate to contact us if
something weird is going on with it.  We mean it; even trivial reports
help us to gauge where our software behaves contrary to expectations.
Plus, we don't have oodles of users, so chances are you won't get on our
nerves.  Try gavo@ari.uni-heidelberg.de using mail, or ++49 6221 541837
on the plain old telephone.  You can also use XMPP ("jabber"); we'll give
you an id on request.

.. _releases: http://vo.ari.uni-heidelberg.de/soft/

Installing from source
''''''''''''''''''''''

The DaCHS installer is based on setuptools.  We do not use setuptools'
dependency management, though, since in practice it seems more trouble
than it's worth; this means you need to manually install the
Dependencies_.

To install the software, in the ``dachs`` directory you checked out
above, say::

  sudo python setup.py develop

(there are various options to get the stuff installed when you prefer not
to install as root; refer to the `setuptools documentation`_ if
necessary).  The checkout itself needs to be readable by whoever later
runs the server in this mode.

You can also use ``install`` instead of ``develop``; in that case, you
will have to rerun setup.py every time you update the source.

.. _nevow: http://divmod.org/trac/wiki/DivmodNevow
.. _setuptools documentation: http://peak.telecommunity.com/DevCenter/EasyInstall

Setup
=====

Account Management
------------------

You should first create a user that the DaCHS server will later run as,
and a group for running DC-related processes in::

  sudo adduser --system gavo
  sudo addgroup --system gavo
  sudo adduser gavo gavo

(or similar, depending on your environment).  This user should not be
able to log in, but it should have a home directory.

Everyone who may issue a ``gavo serve debug`` must be in the group
created (this is because the log directory will be writable by this
group); in particular, you should add yourself::

  sudo adduser `id -nu` gavo

You may want to create another account for "maintenance", or just use
your normal account.  If more than one person will feed the data center,
you'll need more elaborate schemes; do not use the gavo group as the
"data center maintenance group".

Database setup
--------------

The most complicated step in setting up DaCHS is actually setting up the
database.  We currently only support postgres.
While it is conceivable to use DaCHS together with an existing postgres
database, we do not recommend trying this the first time.  Experiment
with a database dedicated to DaCHS first, then consider whether it's
worth interfacing to your existing database or whether a copy of that
data is more convenient.

Cluster Creation
''''''''''''''''

You first need a database to play with, preferably in a suitable cluster
(you could skip this, but then all bets are off as to whether you'll be
able to store non-ASCII characters in strings).  Database cluster
generation is very system-dependent, and ideally a database admin would
assist you.  On Debian systems dedicated to GAVO DaCHS, you can try the
following:

(#) Find out the version of the server you will be running (e.g., using
    ``dpkg -l``); in Debian, more than one version may be installed in
    parallel.  It's probably a good idea to use the most recent one.  Set
    your desired version for subsequent use::

      export PGVERSION=8.3

(#) Drop the Debian default cluster (this will delete everything in
    there -- for a fresh install, that doesn't matter, but don't do this
    if other people use the database).  If you don't do this, your
    database will listen on a different port, and you will have to adapt
    the default profiles::

      sudo pg_dropcluster --stop $PGVERSION main

(#) Create the new cluster used by DaCHS::

      sudo pg_createcluster -d / \
        --locale=C -e UNICODE \
        --lc-collate=C --lc-ctype=C $PGVERSION pgdata

    The locale should currently be C, because only the C locale will
    allow you to create databases with all kinds of encodings.  The
    database stores descriptions and similar entities, and you may
    encounter funny characters in there.  It would be a shame if you
    couldn't store them (plus, you would get odd error messages for
    those).
(#) Start the server::

      sudo /etc/init.d/postgresql-$PGVERSION start

(#) Create the database itself::

      sudo -u postgres createdb --encoding=UTF-8 gavo

On Debian, the configuration files for this cluster are at
``/etc/postgresql/$PGVERSION/pgdata/``.

Note that running a database server always is a security liability.  You
should make sure you understand what the pg_hba.conf (in postgres'
configuration directory) says.  As a minimum, you should have a line
like::

  local gavo gavo,gavoadmin,untrusted md5

in there, probably right below the line allowing the postgres user
complete access (the order of lines in pg_hba.conf is significant); it
allows password authentication for the three users above from the local
machine.  If you have two machines sitting on a reasonably trusted local
network, you could say something like::

  host gavo gavo,gavoadmin,untrusted xxx.xxx.xxx.0/24 md5

(where the x must be replaced with your network number).  If you insist
on having the data between DaCHS and the postgres server go through
untrusted lines, see the postgres docs on how to set up an SSL
connection.

Initial Account Setup
'''''''''''''''''''''

At least during setup, you also need superuser privileges on the
database.  For ``gavo init`` below to work, your normal account must have
such privileges.  On Debian systems, you can simply say::

  sudo -u postgres createuser -s `id -nu`

You can drop those privileges later if they make you nervous, but for
``gavo init`` you need to be DB superuser.

Also note that DaCHS assumes your server is trusted, and if people have
managed to take over an account in the gavo group, they can do with your
database whatever they please anyway.  In particular (don't complain we
didn't tell you), DaCHS currently encrypts *no* passwords; for the DB
passwords, sensible encryption would mean the software requires some
passphrase during startup, which we don't want.
For user passwords (for protecting web resources), encryption would make
no sense, since with HTTP basic authentication as employed by DaCHS, they
travel through the net unencrypted anyway (which is sometimes called
"mild security").

Configuration File
------------------

Next, you need to decide on a "root" directory for DaCHS.  Below it,
there are data descriptions, cache files, logs, etc. (these locations can
be changed later, but for a simple setup we recommend keeping everything
together).  By default, this is ``/var/gavo``.

DaCHS is configured in an INI-style configuration file in
``/etc/gavo.rc`` (overridable using the environment variable
``GAVOSETTINGS``).  In addition, users, in particular the gavo user, can
have ``~/.gavorc`` files, the contents of which override settings in
``/etc/gavo.rc``.

`Configuration Settings <./opguide.html#configuration-settings>`_ walks
through the most important settings; for now, you must set the DaCHS root
dir if you are not happy with ``/var/gavo``.  To do that, save something
like::

  [general]
  rootDir: /data/gavo

as ``/etc/gavo.rc``.

You can now let DaCHS create its file system hierarchy::

  gavo init

For this to work, ``rootDir`` must exist and be writable by you, or you
must have sufficient privileges to create it.  Do *not* run ``gavo init``
as root, since the files and directories it creates will be owned by
whoever ran the program.  In the typical situation in which you may not
write to ``rootDir``'s parent, do something like::

  sudo mkdir -p /data/gavo
  sudo chown `id -nu`:gavo /data/gavo

``gavo init`` may spit out a warning or two on the first run.  On
repeated runs no output at all should appear.

If your database server is not on the same machine as your web server
(which is not recommended for a test setup), you cannot use the automatic
database setup.  In that case, say::

  gavo --nodb init

and read `Manual Database Preparation`_.

You can later run ``gavo init`` again.
It will not clobber anything you did in the meantime (well, if it does,
it's a bug and you should fiercely complain).  In particular, this is the
most convenient way to create directories if you changed locations in
``gavo.rc``.

Binaries
========

If you want to provide previews for images, you must compile some
binaries.  These are in the ``src`` subdirectory of the source tree.
Just enter all subdirectories there in turn and say ``make install`` in
each.  Run these as yourself (i.e., as datacenter administrator), not as
root.

Specifying meta information fallbacks
=====================================

In the file ``$GAVO_ROOT/etc/defaultmeta.txt`` you should give some
information that is filled in when the resources do not give this kind of
metadata themselves.  Don't sweat it for now, but you must fix it before
you run your own registry.

Customizing the appearance on the web
=====================================

Uh.  I need to work on this.  Basically, you'd have to check out
``resources/web`` in the source distribution.  You can take these files
and copy them to ``$GAVO_ROOT/web/`` to edit them there.  The machinery
should then pick them up and use them instead of what comes with the
distribution (it doesn't, yet, for all such files).

This is clearly suboptimal.  Good ideas for "shallow" configurability are
welcome.

More dependencies
=================

The following software components are not really hard dependencies, but
they are used in some way by very common functions of DaCHS, and thus you
*should* install them unless you know what you are doing.

VOPlot
------

While DaCHS contains a very rudimentary Javascript-based plotting
component, in-browser plotting is better done through VO India's VOPlot.
Due to the applet security model, the applet has to originate from the
server that will deliver the data, and so you will need to install VOPlot
locally if you want to use this feature; if you don't do this, you should
comment out or delete :: from
``$INSTALLROOT/gavo/resources/templates/defaultresponse.html`` (or leave
this out from your local defaultresponse.html).

To make this work, first `download VOPlot `_ and unpack the distribution
at a convenient place.  In the resulting folder, you'll find a
subdirectory ``binaries``.  Its contents are expected at
``$GAVOROOT/web/nv_static/voplot``.  Thus, within this ``binaries``
subdirectory, you could do::

  DESTDIR=`gavo config webDir`/nv_static/voplot
  mkdir -p $DESTDIR
  cp -r voplot* $DESTDIR

(for VOPlot 1.2, which is recommended for its compact size, you can just
copy voplot.jar).

wcstools
--------

To support cutout services, you need getfits from the wcstools package
available at http://tdc-www.harvard.edu/software/wcstools/.  After
building wcstools, the binary is in /bin/getfits.  The system expects the
binary getfits in ``$GAVO_INPUTSDIR/cutout/bin/getfits``.

This concludes the installation instructions for the normal case.  Only
read on if you're curious and/or courageous.

Manual Database Preparation
===========================

Normally, the following steps are done by ``gavo init``.  So, on a normal
install you can stop reading here.

However, if you want to play tricks (e.g., remote database server), the
following instructions should help.

Creating database users
-----------------------

The data center software accesses the database in various functions.
These are mapped to profiles which correspond to access information
(basically, the DSN, user, and password).  There are three of them:

* feed -- the "admin" profile, used for feeding tables into the normal
  database, for user management, credentials checking and the like.
* trustedquery -- this profile is used for queries generated by the DC
  software (though usually on behalf of a user).  The corresponding DB
  role can access all "normal" tables; privilege management is supposed
  to happen through the web interface.
* untrustedquery -- the profile used for user-contributed SQL.  Only
  tables expressly opened up are accessible to it.

You can adapt those names as necessary in the corresponding profiles.
See the section on profiles in the `Operator's Guide `_ for details.

The following procedure sets up users and databases as expected by the
default profiles (if you made yourself a superuser account as described
above, you do not need the ``sudo -u postgres`` in these commands)::

  # create the database that'll hold your data
  sudo -u postgres createdb --encoding=UTF-8 gavo
  # create the user that feeds the db...
  sudo -u postgres createuser -P -ADsr gavoadmin
  # and a user that usually has no write privileges
  sudo -u postgres createuser -P -ADSR gavo
  # and a user for ADQL queries (i.e., untrusted queries from the net)
  sudo -u postgres createuser -P -ADSR untrusted

Enter the passwords you assign here into the ``feed``, ``trustedquery``,
and ``untrustedquery`` profiles, respectively.  These profiles are found
in ``$GAVO_ROOT/etc``.

Finally, you need to let the various roles you just created access the
database; you do this using the command line interface to postgres::

  sudo -u postgres psql gavo \
    -c "GRANT ALL ON DATABASE gavo TO gavoadmin"

For the individual tables, rights to gavo and untrusted are granted by
``gavo imp``, so you do not need to specify any rights for them.

Reading the extensions' SQL files
---------------------------------

Both pgsphere and q3c have files that define SQL functions and such.
You'll have to manually read them into your new database.  You can find
these SQL files in the source directories of the packages, or in your
server's contrib directory.
On Debian systems, these contribution directories are in
``/usr/share/postgresql//contrib``.  So, on postgres 8.4 you could say::

  SRCDIR=/usr/share/postgresql/8.4/contrib
  psql gavo < $SRCDIR/q3c.sql
  psql gavo < $SRCDIR/pg_sphere.sql

Importing basic resources
-------------------------

There are some built-in tables in DaCHS, related to metadata storage,
certain protocols, and the like.  You must import them before the DC
software can be used.  This also is a nice test that at least some things
work.  So, in this sequence, run::

  gavo imp --system //dc_tables
  gavo imp --system //services
  gavo imp --system //users
  gavo imp --system //tap
  gavo imp --system //products
  gavo imp --system //obscore

Output of the type ``Columns affected: 0`` is ok for these commands.

The double slash in the identifiers above means "use system resources".
All these really refer to resource descriptors (RD) in the ``__system__``
resource directory; at this point, they are the RDs shipped with DaCHS.

If you get error messages, add a ``--hints`` after the gavo command, like
this::

  gavo --hints imp --system //dc_tables

This will (for the ``gavo`` command in general) give additional error
info where available.

You should now be able to run the examples in the tutorial.
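
If you find yourself repeating the import sequence above, it could be
wrapped in a small shell function.  This is only a sketch under stated
assumptions: the function name ``import_system_rds`` and the ``DRY_RUN``
switch are made up for this example and are not part of DaCHS; only the
``gavo --hints imp --system`` invocation and the list of RDs are taken
from the text above.  With ``DRY_RUN`` at its default of 1, the commands
are merely printed, so you can check the order before touching the
database::

  #!/bin/sh
  # Hypothetical helper, not shipped with DaCHS.
  # DRY_RUN=1 (the default in this sketch) only prints the commands;
  # set DRY_RUN=0 to actually run them.
  DRY_RUN="${DRY_RUN:-1}"

  import_system_rds() {
      # The order follows the import sequence given above.
      for rd in dc_tables services users tap products obscore; do
          if [ "$DRY_RUN" = "1" ]; then
              echo "gavo --hints imp --system //$rd"
          else
              # Stop at the first failing import.
              gavo --hints imp --system "//$rd" || return 1
          fi
      done
  }

  import_system_rds

Stopping at the first failure is a choice made for this sketch, on the
assumption that later imports are not worth attempting once an earlier
one has failed.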