===========================
GAVO DaCHS operator's guide
===========================
:Author: Markus Demleitner
:Email: gavo@ari.uni-heidelberg.de
.. contents::
  :depth: 3
  :backlinks: entry
  :class: toc
This document details the configuration and operation of the GAVO DaCHS
server. For information on installing the software, please refer to
the `installation guide`_, to learn how to import data, see the
tutorial_. For an overview of the available documentation, see `DaCHS
documentation`_.
.. _DaCHS documentation: http://docs.g-vo.org/DaCHS
.. _installation guide: http://docs.g-vo.org/DaCHS/install.html
.. _tutorial: http://docs.g-vo.org/DaCHS/tutorial.html
Starting and stopping the server
================================
The ``gavo serve`` subcommand is used to control the server. ``gavo
serve start`` starts the server, changes the user to what is specified
in the [web] user config item if it has the privileges to do so
(that's "gavo" by default; you will already have
created that user if you followed the installation instructions) and
detaches from the terminal.
Analogously, ``gavo serve stop`` stops the server. To reload some of
the server configuration (e.g., the resource descriptors, the vanity
map, and the /etc/gavo.rc and ~/.gavorc files), run ``gavo
serve reload``. This does not reload database profiles, and not all
configuration items are applied (e.g., changes to the bind address and
port only take effect after a restart). If you remove a configuration
item entirely, its built-in default does not get restored on reload
either.
Finally, ``gavo serve restart`` restarts the server. The start, stop,
reload, and restart operations generally should be run as root; you
can run them as the server user (by default, gavo), too, as long as the
server doesn't try to bind to a privileged port (lower than 1024).
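Whether root is needed therefore comes down to the port number. As a minimal sketch (the helper name ``needs_root`` is made up for this illustration and is not part of DaCHS, and 8080 is only an assumed port), a wrapper script could decide like this:

```shell
#!/bin/sh
# Hypothetical helper, not part of DaCHS: succeeds if binding to the
# given TCP port requires root (ports below 1024 are privileged on
# typical Unix systems).
needs_root() {
    [ "$1" -lt 1024 ]
}

# Example: print a hint for the operator based on the configured port.
# 8080 is only an assumed value; use whatever your server binds to.
port=8080
if needs_root "$port"; then
    echo "port $port is privileged: run 'gavo serve start' as root"
else
    echo "port $port is unprivileged: the server user can run 'gavo serve start'"
fi
```

The check only looks at the port; the user change described above still only happens when the server is started with sufficient privileges.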
All this can and should be packed into a startup script. With a System
V init system, you could use something like this
(e.g., as ``/etc/init.d/dachs``; some paths may need adaptation for
non-Debian systems)::

  #!/bin/sh -e
  ### BEGIN INIT INFO
  # Provides: DaCHS
  # Required-Start: $local_fs $remote_fs $network
  # Required-Stop: $local_fs $remote_fs $network
  # Default-Start: 2 3 4 5
  # Default-Stop: 0 1 6
  # Short-Description: Start/stop DaCHS Virtual Observatory server
  ### END INIT INFO
  #
  ENV="env -i LANG=C PATH=/usr/local/bin:/usr/bin:/bin"
  . /lib/lsb/init-functions
  test -f /etc/default/rcS && . /etc/default/rcS
  test -f /etc/default/apache2 && . /etc/default/apache2
  SERVER_BIN="$ENV /usr/bin/gavo --disable-spew serve"
  case $1 in
    start)
      log_daemon_msg "Starting VO server" "dachs"
      if $SERVER_BIN start; then
        log_end_msg 0
      else
        log_end_msg 1
      fi
    ;;
    stop)
      log_daemon_msg "Stopping VO server" "dachs"
      if $SERVER_BIN stop; then
        log_end_msg 0
      else
        log_end_msg 1
      fi
    ;;
    reload | force-reload)
      log_daemon_msg "Reloading VO server config" "dachs"
      if $SERVER_BIN reload $2 ; then
        log_end_msg 0
      else
        log_end_msg 1
      fi
    ;;
    restart)
      log_daemon_msg "Restarting VO server" "dachs"
      if $SERVER_BIN restart; then
        log_end_msg 0
      else
        log_end_msg 1
      fi
    ;;
    *)
      log_success_msg "Usage: /etc/init.d/dachs {start|stop|restart|reload|force-reload}"
      exit 1
    ;;
  esac

For development work or to see what is going on, you can run ``gavo
serve debug``; this does not detach and does not change users.
Publication
===========
To "publish" a resource – which means including it either on your site's
home page or in what you report to the VO registry – add a ``publish``
element to a ``service`` element or a ``register`` element to data or
table elements. Both of these let you specify the sets the resources
shall be published to. Unless you have a specific application in mind, only two
sets are relevant: ``ivo_managed`` for publishing to the VO (see
`Registry Matters`_), and ``local`` for publishing to your data center's
service roster. Other sets can be introduced and used for, e.g.,
specific sub-rosters.
The ``publish`` element needs, in addition, a ``render`` attribute, giving a
comma-separated list of renderers the publication is for. The various
renderers are translated into capability elements in the VO resource
records. For example, a typical pattern could be::
This generates a capability each for the simple cone search and a
browser-based interface; the browser-based interface is, in addition,
listed in the local service roster.
You can also publish tables; for those, the notion of renderers makes no
sense, so the publish element doesn't have that. Instead, you could
define services that serve that data. For many cases, you don't even
need to do this, since for tables that have ``adql="True"``, the local
TAP service is automatically considered to be a service for that data.
So, to publish an ADQL-queriable table to the VO for querying via TAP,
just write::
within the table element. A table containing, e.g., data that's queried
in a SIAP service in a different RD, would require something like::
" in their mails.
* contact.address – A contact address for surface mail
* contact.email – An email address. It will be published on web pages,
so there probably should be some kind of spam filter in front of it.
* contact.telephone – A telephone number people can call if things
really look bad.
* creator.name – A name to use when you give no creator in your
resource descriptors. Could be some error sentinel ("we forgot to
give credit, please complain") or just contact.name if you produce
resources yourself.
* creator.logo – A URL for a logo to use when none is given in the
resource metadata. Use a small PNG here.
* site.description – A description of your site (i.e., "data center").
Example: ``The GAVO data center provides VO publication services to
all interested parties on behalf of the German Astrophysical Virtual
Observatory.`` (use backslashes at the end of the lines to break long
lines).
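To make the shape of these entries concrete, here is a sketch that writes an illustrative fragment to a scratch file instead of the live ``etc/defaultmeta.txt``. The ``key: value`` layout with a trailing backslash for continuation follows the description above; every value is a placeholder, and the scratch path is an assumption of this example.

```shell
#!/bin/sh
# Write an illustrative metadata fragment to a scratch file; nothing
# here touches a real DaCHS installation. All values are placeholders.
out="${TMPDIR:-/tmp}/defaultmeta-example.txt"
cat > "$out" <<'EOF'
contact.address: Observatory of Utopia, 1 Plain Mountains Road, Lilliput
contact.email: operator@example.org
contact.telephone: +00 000 0000000
creator.name: Utopia Data Center Team
creator.logo: http://example.org/logo_small.png
site.description: The Utopia data center provides VO publication \
  services to all interested parties.
EOF
# Count the keys that were written (continuation lines are not counted).
grep -c '^[a-zA-Z]*\.[a-zA-Z]*:' "$out"
```

Once the values look right, copy the lines you need into your own metadata file by hand.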
.. _tutorial chapter on the registry: tutorial.html#the-registry-interface
.. _authority query against the registry: http://dc.zah.uni-heidelberg.de/__system__/adql/query/form?query=select%20ivoid%20from%20rr.resource%20where%20res_type%3D%27vg%3Aauthority%27%20and%20ivoid%3D%27ivo%3A%2F%2Forg.gavo.dc%27
.. _meta stream format: ref.html#meta-stream-format
Then, fill out the metadata for the system registry resources
in your userconfig RD. This is the stuff in the
``registry-interfacerecords`` stream (which you can copy from
//userconfig if it's not yet in your ``etc/userconfig.rd``). Fill things
out, in particular everything where there's a ``\\metaString`` expansion.
This still is filled from defaultmeta.txt, but as we want to get rid of
that file in the long run, just enter the text as you see fit.
In authority, change in particular
* creationDate – A UTC datetime (with trailing Z);
technically, it should be the date the resource record is created, but
realistically, just use "now" at the time you're writing the
defaultmeta.txt. Example: ``2007-12-19T12:00:00Z``.
* title – A human-readable descriptor of what the authority
corresponds to. Example: ``The Utopia Observatory Data Center``
* description – A sentence or two on what the authority
you are using means. This could be the same as site.description if
all you're claiming authority for is that; if you're claiming
authority for your institute or organization, this obviously should
be different. Example: ``The Data Center at the Observatory of Utopia
manages lots of substantial data sets created by the bright scientists
all over the world`` (use backslashes at the end of the lines to break
long lines).
* shortName – a short (about 16 chars) identifier for your
authority. Example: GAVO DC.
* referenceURL – A URL at which people can learn more about
your data center. Example: ``http://www.g-vo.org``.
* managingOrg – an ivo id of the organization you're running
the dc for. The default is ivo:///org, the content
defined in the ``manager`` resRec. If
your institution has a registry entry independent of your DC, you
can enter that IVORN here as well (and you would remove the
``manager`` resRec).
In the ``manager`` resRec (if you have it), change:
* creationDate – as above for authority.
* title – the name of the organization on behalf of which
you are running the data center. Example: ``Observatory of
Utopia``
* description and referenceURL – as the
analogous items for authority, just for the organization for which
you are running the data center (e.g., your "home institute").
Example: ``The Observatory of Utopia is Lilliput's
largest astronomical institution with ten large telescopes spread
around the Plain Mountains. Beautiful vistas and lush valleys
make them an attractive holiday spot as well. Book now at the
Observatory's soft money department at 1-800-GOOD-GREED.``
After you've specified all that, you're ready to define your first
resources, viz., your registry itself, the authority, and the
organization that's managing it. These are predefined in the //services RD
using the data you just filled in. To publish them, you say::
gavo pub //services
Registering DaCHS-external Services
-----------------------------------
The registry interface of DaCHS can be used to register entities
external to DaCHS; actually, you're already doing this when you're
claiming an authority.
To register a non-service "resource", you can fill out a resRec_ RD
element. You could reserve an RD (say, ``GAVOROOT/inputs/ext.rd``) to
collect such external registrations, or you could put them alongside
internal services into their respective RDs. You will then usually just
use the resRec's id attribute to determine the IVORN of the resource record.
It will then be ``ivo:////``.
In all likelihood, however, you will want to register services. To
do that, use a normal service definition with a nullCore. You
probably need to manually give an accessURL. The most common case is
that of a service with a ``WebBrowser`` capability. These result from
``external`` or ``static`` renderers. Thus, the pattern here usually
is::
shortName: My external service
description: This service does wonderful things, even though\
it's not based on GAVO's DaCHS software.
http://wherever.else/svc
Of course, you will normally need to add further metadata as discussed
above. ``gavo pub`` should complain if there's metadata missing,
though.
The "services" can be fairly funky, actually; here's how GAVO registers
their ADQL reference card::
shortName: GAVO ADQL ref
creationDate: 2012-11-05T14:24:00Z
title: The GAVO ADQL reference card
subject:Virtual Observatory
subject:Standards
subject:ADQL
description: GAVO's ADQL reference card briefly gives an overview \
of the SQL dialect used in the VO. It is available as a PDF\
file and as Scribus source under the CC-BY license.
referenceURL:http://www.g-vo.org/pmwiki/About/ADQLReference
http://docs.g-vo.org/adqlref/adqlref.pdf
It is likely that if you register external services, you'll want to
manage authorities other than ``[ivoa]authority`` as used by DaCHS. If
you do, just add authority record(s) as before in the
``registry-interfacerecords`` STREAM in your `userconfig RD`_. And
do not forget to add lines like::
edu.euro-vo.org
within the ``
Form-based service
...
...
To publish
Simple OAI operation
--------------------
If you want to check what you have published, see the ``/oai.xml`` on
your server, e.g., http://localhost:8080/oai.xml. This is a plain
OAI-PMH interface with some style sheets (if you want to customize them,
copy them to ``rootDir/web/xsl/``). The default style sheets add a
link to "All identifiers defined here". Follow it to a list of all
records you currently publish.
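Since the endpoint speaks plain OAI-PMH, you can also query it from the command line. The following sketch only assembles request URLs. The function name ``oai_url`` is made up for this example, the base URL is the local test server mentioned above, and ``ivo_vor`` is assumed to be the metadata prefix your server accepts for VOResource records:

```shell
#!/bin/sh
# Build an OAI-PMH request URL for a DaCHS installation.
# $1: root URL of the server, $2: OAI-PMH verb, $3: optional extra
# query parameters. The function name is made up for this example.
oai_url() {
    printf '%s/oai.xml?verb=%s%s\n' "${1%/}" "$2" "${3:+&$3}"
}

# Assumed local test server, as in the text above.
base="http://localhost:8080"
oai_url "$base" Identify
oai_url "$base" ListIdentifiers "metadataPrefix=ivo_vor"

# To actually fetch a response (requires curl and a running server):
# curl -s "$(oai_url "$base" Identify)"
```

``Identify`` and ``ListIdentifiers`` are standard OAI-PMH verbs; the latter should list the same identifiers the style sheet link leads to.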
Making the VO see your Registry
-------------------------------
The VO registry is a distributed system. There still is some sort of
root, the `Registry of Registries`_ or RofR. Once your system provides
sufficient metadata, go to
http://rofr.ivoa.net/regvalidate/regvalidate.html and enter your
registry endpoint (i.e., your installation's root URL with /oai.xml
appended).
GAVO DaCHS is lenient with missing metadata and will deliver invalid
VOResource for records missing some. It is not unlikely that your
registry will not validate on the first attempt. Reading the error
messages should give you a hint what's wrong. You can also use the
``gavo val`` command on the RDs that generate invalid records to figure
out what's wrong.
Once your registry passes the validation test, you can add it to the
RofR, and the full registries will start to harvest your registry (after
a while).
.. _Registry of Registries: http://rofr.ivoa.net/
Adapting DaCHS for Your Site
============================
As delivered, the web interface of DaCHS will make it seem you're
running a copy of the GAVO data center, with some metadata defused such
that you are not actually disturbing our operation if you accidentally
activate your registry interface. You should thus first customize the
items given in ``etc/defaultmeta.txt`` (as discussed in `Registry
Matters`_).
The next adaptations are done through the configuration (as discussed in
`Configuration Settings`_, i.e., usually in /etc/gavo.rc). The most
relevant item here is ``[web]sitename``, which should contain a terse
identifier for the site (like "GAVO Data Center"). It is shown in
titles and top headlines in many places. If you plan to use DaCHS'
embargo feature together with user authorization, you must also set
``[web]realm`` to some characteristic string. You could use the site
name here; some user agents use it to display a prompt like "Credentials
for " or similar.
If you want, you can set ``[web]favicon`` to either a webDir-relative
path or a full URL to a `favicon`_.
It is also advisable to configure ``[general]maintainerAddress`` to a
mail address of a person who will read problem reports. DaCHS doesn't
send many of those yet, but it's still valuable if the software can
cry for help if necessary. Sending mail only works if the local machine
can actually send mail. If there is no MTA on your machine yet, we
recommend ``nullmailer`` as a lightweight and easy-to-configure sendmail
stand-in. If you use something else, you may need to adapt
``[general]sendmail``.
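Taken together, the settings discussed so far could look roughly like the fragment below. This is a sketch only: it writes to a scratch file rather than to /etc/gavo.rc, the scratch path is an assumption of this example, and all values are placeholders you will want to replace.

```shell
#!/bin/sh
# Write an illustrative configuration fragment to a scratch file; merge
# the settings into /etc/gavo.rc by hand once they look right.
out="${TMPDIR:-/tmp}/gavo.rc.example"
cat > "$out" <<'EOF'
[general]
maintainerAddress: operator@example.org

[web]
sitename: Utopia Data Center
realm: Utopia Data Center
EOF
cat "$out"
```

After merging such settings into the real configuration file, ``gavo serve reload`` should be enough to make the server pick them up.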
For the rest, you can customize almost everything by overriding built-in
resources. There are four major entities that you can override:
* `userconfig RD`_
* `Simple Web Resources`_
* `Templates`_
* `Overridden System RDs`_
If you find you need to override anything but the logo, please talk to
us first – we'd in general prefer to provide customization hooks.
Overridden distribution files are always a liability on upgrades.
.. _favicon: https://en.wikipedia.org/wiki/Favicon
Userconfig RD
-----------------
Fairly new in DaCHS is an RD exclusively for configuration. This is a
place in which you can put streams that fill certain hooks; we expect
to move more configuration into userconfig.
DaCHS has a builtin RD ``//userconfig`` that is updated as you update
DaCHS. It always contains fallbacks for everything that can be in
userconfig used by the core code. To override something, pull the
elements in question into your own userconfig RD and edit them there.
Your own userconfig RD is expected in ``$GAVO_DIR/etc/userconfig.rd``.
If it's not there yet, there's nothing wrong with starting with the
distributed one::
cd `gavo config configDir`
gavo admin dumpDF //userconfig > userconfig.rd
Once it's already there, use ``dumpDF //userconfig`` and, say, ``less``
to pick out the templates for whatever elements you need to copy.
Currently, userconfig is already used in configuring the registry
interface and extending the built-in obscore schema.
Simple Web Resources
--------------------
For items coming from ``static`` (e.g., images, css, javascript), this
overriding works by dropping same-named files in ``$GAVO_ROOT/web/nv_static``.
Thus, you should put a PNG of your logo into
``$GAVO_ROOT/web/nv_static/img/logo_medium.png``.
Other files you may want to override in this way include
* ``css/gavo_dc.css`` – the central CSS; you could use this for
skinning. However, we recommend you just add an
``@import url("");`` to the file the server
delivers by default, since some of the css is almost necessary,
and you want easy upgrade paths when we change the master CSS.
* ``help.shtml`` – the help file. Unfortunately, we blurb quite
a lot about GAVO in there right now. We'll think of something
more parametrizable, but meanwhile you may want to have your own version.
* ``img/logo_big.png``, ``img/logo_tiny.png`` –
scaled versions of your logo; logo_big should be 200 pixels wide or
more, logo_tiny of order 50 pixels wide.
* ``js/gavo.js`` – could be the place for additional javascript;
but frankly, if you want custom javascript, write to us and we'll
think of a sane mechanism.
* ``xsl/oai.xsl``, ``xsl/uws-joblist-to-html.xsl``,
``xsl/uws-job-to-html.xsl``, and ``vosi.xsl`` – XSLT stylesheet files.
If you override these to customize them, please let us know. We'd try
to put out generic stylesheets that are customizable without having
to muck around in stuff that's basically functionality.
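Dropping an override into place amounts to copying a file to the matching path below ``nv_static``. A minimal sketch follows. The helper name ``install_override`` is made up for this example, and the root directory in the usage comment is an assumption; use your own installation's root directory.

```shell
#!/bin/sh
# Hypothetical helper, not part of DaCHS: copy a file into the
# nv_static override tree, creating directories as needed.
# $1: DaCHS root directory, $2: source file,
# $3: target path relative to web/nv_static
install_override() {
    target="$1/web/nv_static/$3"
    mkdir -p "$(dirname "$target")" && cp "$2" "$target"
}

# Usage (both paths are assumptions; adapt them to your installation):
# install_override /var/gavo ./my_logo.png img/logo_medium.png
```

Keeping overrides in a directory of their own and installing them with a script like this makes it easier to see after an upgrade which distribution files you have replaced.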
Templates
---------
Note: If you just need some special behaviour in one or two services,
use the ``template`` children of `service elements`_.
You will find the built-in templates in the unpacked DaCHS distribution,
at ``gavo/resources/templates``; if you installed from a Debian package,
that would be
``/usr/lib/python*/dist-packages/gavo/resources/templates/``.
Alternatively, you can pull the `templates from the repository`_.
Copy the one you want to override to
``$GAVO_ROOT/web/templates`` and edit it there. You will need to
restart the server to make it pick up a new file; changes in the
templates should propagate without any intervention.
One likely candidate you might want to change is sidebar.html. However,
in particular with the sidebar, functional changes are likely to come
from us, so we'd put some work into making it possible to insert
custom stuff into the sidebar without having to change the template.
Please talk to us.
Much less critical is the ``root.html`` template – you'll still pretty
definitely want to change it. While you could use any XHTML, we
recommend basing your page on either the ``root.html`` template in the
distribution, which mainly gives a list of all services available on the
system, or ``root-tree.html``, which organizes services in trees and
only downloads metadata as necessary (a simple non-javascript fallback
is part of the distributed template, too). The latter is intended for
use on sites that have more than a few tens of services, when the plain
``root.html`` would expand to several 100s of kilobytes.
Whatever you use, it has to sit in
``$GAVO_ROOT/web/templates/root.html``, as that's where DaCHS looks for
the portal page.
Note that such static documents are fairly aggressively cached by DaCHS.
This means that changes to the document will not usually become
immediately visible in your browser. However, documents are not cached
when you are logged in. While you work on your root page, it is therefore
advisable to log in as administrator (see the [web]adminpasswd
config item; e.g., click on the little [s] on the root page and select
"Log in" from the sidebar). Once you're happy with your design, simply reload the
//services RD, which is where the root page cache lives, and your new
root page will be what's delivered to your "normal" users, too.
Another candidate is serviceinfo.html – it contains an explanation of
the various sets you may create. Grep for GAVO in that file to see how
we use it.
Currently, we don't have good docs on the template language in use.
There's http://docs.g-vo.org/meetstan.html on the templating engine,
but the section on the render functions and data generators available
is still missing. Complain to make us write it sooner.
.. _service elements: ref.html#element-service
.. _templates from the repository: http://svn.ari.uni-heidelberg.de/svn/gavo/python/trunk/gavo/resources/templates/
Overridden System RDs
---------------------
You can copy system RDs from ``gavo/resources/inputs/__system__`` in the
distribution to ``$GAVO_ROOT/inputs/__system__`` (adapt if you have
played tricks with ``inputsDir``) and edit them there. Again, if you
feel you need to do that, contact us first, maybe we can work something
out; it's a liability for upgrades.
Other documents
---------------
The default help file and the default sidebar link to a privacy policy
that you should put down in
``$GAVO_ROOT/web/nv_static/doc/privpol.shtml``. The document must be
well-formed XHTML. Also, files with an extension shtml will be
interpreted as templates over the service ``//services/root``, which
means that you can use the usual render functions and data items; the
same goes for ``disclaimer.html`` (referenced from the standard
sidebar) and, if you offer SOAP services, ``soaplocal.html``. See the
respective pages in the GAVO DC (``http://dc.g-vo.org/static/doc/...``)
for ideas as to what to include.
The Vanity Map
--------------
DaCHS' URL scheme leads to somewhat clunky URLs that, in particular,
reflect the file system underneath. While this doesn't matter to the VO
registry, it is possibly unwelcome when publishing URLs outside of the
VO. To overcome it, you can define "vanity names", single path elements
that are mapped to paths.
These mappings are read from the file ``GAVO_ROOT/etc/vanitymap.txt``.
The file contains lines of the format::
[