===========================
GAVO DaCHS operator's guide
===========================

:Author: Markus Demleitner
:Email: gavo@ari.uni-heidelberg.de
:Date: |date|

.. contents::
  :depth: 3
  :backlinks: entry
  :class: toc

This document details the configuration and operation of the GAVO DaCHS
server. For information on installing the software, please refer to the
`installation guide`_; to learn how to import data, see the tutorial_. For
an overview of the available documentation, see `DaCHS documentation`_.

.. _DaCHS documentation: http://docs.g-vo.org/DaCHS
.. _installation guide: http://docs.g-vo.org/DaCHS/install.html
.. _tutorial: http://docs.g-vo.org/DaCHS/tutorial.html


Starting and stopping the server
================================

The ``gavo serve`` subcommand is used to control the server.
``gavo serve start`` starts the server, changes the user to what is
specified in the ``[web]user`` config item if it has the privileges to do so
(that's "gavo" by default; you will already have created that user if you
followed the installation instructions) and detaches from the terminal.

Analogously, ``gavo serve stop`` stops the server. To reload some of the
server configuration (e.g., the resource descriptors, the vanity map, and
the /etc/gavo.rc and ~/.gavorc files), run ``gavo serve reload``. This does
not reload database profiles, and not all configuration items are applied
(e.g., changes to the bind address and port only take effect after a
restart). If you remove a configuration item entirely, its built-in default
does not get restored on reload either.

Finally, ``gavo serve restart`` restarts the server.

The start, stop, reload, and restart operations generally should be run as
root; you can run them as the server user (by default, gavo), too, as long
as the server doesn't try to bind to a privileged port (lower than 1024).

All this can and should be packed into a startup script or the equivalent
entity for the init system of your choice.
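
The privilege rule for these operations can be sketched as a small shell
check. This is only an illustration: ``needs_root`` and ``PORT`` are
hypothetical names that are not part of DaCHS, and the port value is an
assumption to be replaced with whatever your server actually binds to::

  # Hypothetical helper, not part of DaCHS: succeeds if binding to the
  # given port requires root (ports below 1024 are privileged).
  needs_root() {
      [ "$1" -lt 1024 ]
  }

  PORT=8080   # assumption: the port your server is configured to bind to

  if needs_root "$PORT"; then
      echo "port $PORT is privileged: run 'gavo serve start' as root"
  else
      echo "port $PORT is unprivileged: root or the server user may run 'gavo serve start'"
  fi

Whatever wraps ``gavo serve`` in your init system could perform a check of
this kind before deciding which user runs the command.
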
Our Debian package provides a System V-style init script; it is available
from
http://svn.ari.uni-heidelberg.de/svn/debian-package/gavodachs/trunk/debian/gavodachs-server.dachs.init
and should be installed to /etc/init.d/dachs (of course, if you installed
the Debian package, the system has already done this for you).

For development work or to see what is going on, you can run
``gavo serve debug``; this does not detach and does not change users.


Publication
===========

To "publish" a resource (which means to include it either on your site's
home page or in what you report to the VO registry), add a ``publish``
element to a ``service`` element or a ``register`` element to data or table
elements.

Both of these let you specify the sets the resources shall be published to.
Unless you have a specific application, only two sets are relevant:
``ivo_managed`` for publishing to the VO (see `Registry Matters`_), and
``local`` for publishing to your data center's service roster. Other sets
can be introduced and used for, e.g., specific sub-rosters.

The ``publish`` element needs, in addition, a render attribute, giving a
comma-separated list of renderers the publication is for. The various
renderers are translated into capability elements in the VO resource
records. For example, a typical pattern could be::

This generates a capability each for the simple cone search and a
browser-based interface; the browser-based interface is, in addition,
listed in the local service roster.

You can also publish tables; for those, the notion of renderers makes no
sense, so the publish element doesn't have that attribute. Instead, you
could define services that serve that data. For many cases, you don't even
need to do this, since for tables that have ``adql="True"``, the local TAP
service is automatically considered to be a service for that data. So, to
publish an ADQL-queryable table to the VO for querying via TAP, just
write::

within the table element.
A table containing, e.g., data that's queried in a SIAP service in a
different RD, would require something like::

" in their mails.

* contact.address – A contact address for surface mail.
* contact.email – An email address. It will be published on web pages, so
  there probably should be some kind of spam filter in front of it.
* contact.telephone – A telephone number people can call if things really
  look bad.
* creator.name – A name to use when you give no creator in your resource
  descriptors. Could be some error sentinel ("we forgot to give credit,
  please complain") or just contact.name if you produce resources yourself.
* creator.logo – A URL for a logo to use when none is given in the resource
  metadata. Use a small PNG here.
* site.description – A description of your site (i.e., "data center").
  Example: ``The GAVO data center provides VO publication services to all
  interested parties on behalf of the German Astrophysical Virtual
  Observatory.`` (use backslashes at the end of the lines to break long
  lines).

.. _tutorial chapter on the registry: tutorial.html#the-registry-interface
.. _authority query against the registry: http://dc.zah.uni-heidelberg.de/__system__/adql/query/form?query=select%20ivoid%20from%20rr.resource%20where%20res_type%3D%27vg%3Aauthority%27%20and%20ivoid%3D%27ivo%3A%2F%2Forg.gavo.dc%27
.. _meta stream format: ref.html#meta-stream-format

Then, fill out the metadata for the system registry resources in your
userconfig RD. This is the stuff in the ``registry-interfacerecords``
stream (which you can copy from //userconfig if it's not yet in your
``etc/userconfig.rd``). Fill things out, in particular everything where
there's a ``\\metaString`` expansion. This still is filled from
defaultmeta.txt, but as we want to get rid of that file in the long run,
just enter the text as you see fit.
In authority, change in particular

* creationDate – A UTC datetime (with trailing Z); technically, it should
  be the date the resource record is created, but realistically, just use
  "now" at the time you're writing the defaultmeta.txt. Example:
  ``2007-12-19T12:00:00Z``.
* title – A human-readable descriptor of what the authority corresponds to.
  Example: ``The Utopia Observatory Data Center``
* description – A sentence or two on what the authority you are using
  means. This could be the same as site.description if all you're claiming
  authority for is that; if you're claiming authority for your institute or
  organisation, this obviously should be different. Example: ``The Data
  Center at the Observatory of Utopia manages lots of substantial data sets
  created by the bright scientists all over the world`` (use backslashes at
  the end of the lines to break long lines).
* shortName – A short (about 16 chars) identifier for your authority.
  Example: ``GAVO DC``.
* referenceURL – A URL at which people can learn more about your data
  center. Example: ``http://www.g-vo.org``.
* managingOrg – An ivo id of the organisation you're running the dc for.
  The default is ivo:///org, the content defined in the ``manager`` resRec.
  If your institution has a registry entry independent of your DC, you can
  enter that IVORN here as well (and you would remove the ``manager``
  resRec).

In the ``manager`` resRec (if you have it), change:

* creationDate – As above for authority.
* title – The name of the organisation on behalf of which you are running
  the data center. Example: ``Observatory of Utopia``
* description and referenceURL – As the analogous items for authority, just
  for the organisation for which you are running the data center (e.g.,
  your "home institute"). Example: ``The Observatory of Utopia is
  Lilliput's largest astronomical institution with ten large telescopes
  spread around the Plain Mountains. Beautiful vistas and lush valleys make
  them an attractive holiday spot as well.
  Book now at the Observatory's soft money department at
  1-800-GOOD-GREED.``

After you've specified all that, you're ready to define your first
resources, viz., your registry itself, the authority, and the organisation
that's managing it. These are predefined in the //services RD using the
data you just filled in. To publish them, you say::

  gavo pub //services


Registering DaCHS-external Services
-----------------------------------

The registry interface of DaCHS can be used to register entities external
to DaCHS; actually, you're already doing this when you're claiming an
authority.

To register a non-service "resource", you can fill out a resRec_ RD
element. You could reserve an RD (say, ``GAVOROOT/inputs/ext.rd``) to
collect such external registrations, or you could put them alongside
internal services into their respective RDs. You will then usually just use
the resRec's id attribute to determine the IVORN of the resource record. It
will then be ``ivo:////``.

In all likelihood, however, you will want to register services. To do that,
use a normal service definition with a nullCore. You probably need to
manually give an accessURL. The most common case is that of a service with
a ``WebBrowser`` capability. These result from ``external`` or ``static``
renderers. Thus, the pattern here usually is::

  shortName: My external service
  description: This service does wonderful things, even though\
    it's not based on GAVO's DaCHS software.
  http://wherever.else/svc

Of course, you will normally need to add further metadata as discussed
above. ``gavo pub`` should complain if there's metadata missing, though.

The "services" can be fairly funky, actually; here's how GAVO registers
their ADQL reference card::

  shortName: GAVO ADQL ref
  creationDate: 2012-11-05T14:24:00Z
  title: The GAVO ADQL reference card
  subject:Virtual Observatory
  subject:Standards
  subject:ADQL
  description: GAVO's ADQL reference card briefly gives an overview \
    of the SQL dialect used in the VO.
    It is available as a PDF\
    file and as Scribus source under the CC-BY license.
  referenceURL:http://www.g-vo.org/pmwiki/About/ADQLReference
  http://docs.g-vo.org/adqlref/adqlref.pdf

It is likely that if you register external services, you'll want to manage
authorities other than ``[ivoa]authority`` as used by DaCHS. If you do, just
add authority record(s) as before in the ``registry-interfacerecords``
STREAM in your `userconfig RD`_. And do not forget to add lines like::

  edu.euro-vo.org

within the

Form-based service ... ... To publish


Simple OAI operation
--------------------

If you want to check what you have published, see the ``/oai.xml`` on your
server, e.g., http://localhost:8080/oai.xml. This is a plain OAI-PMH
interface with some style sheets (if you want to customize them, copy them
to ``rootDir/web/xsl/``).

The default style sheets add a link to "All identifiers defined here".
Follow it to a list of all records you currently publish.

The OAI endpoint can also help you debug validity problems with your
registry content. To XSD-validate your registry without bothering the RofR
(see above), you can do the following (with your own server's URL in place
of the example one; the quotes keep the shell from interpreting the
``&``)::

  curl "http://localhost:8080/oai.xml?verb=ListRecords&metadataPrefix=ivo_vor" |\
    xmlstarlet fo > toval.xml
  gavo admin xsdValidate toval.xml

This may result in a few error messages; if you don't understand them, it's
a good idea to just go to the respective line in toval.xml and give it a
long, hard look.


Making the VO see your Registry
-------------------------------

The VO registry is a distributed system. There still is some sort of root,
the `Registry of Registries`_ or RofR. Once your system provides sufficient
metadata, go to http://rofr.ivoa.net/regvalidate/regvalidate.html and enter
your registry endpoint (i.e., your installation's root URL with /oai.xml
appended).

GAVO DaCHS is lenient with missing metadata and will deliver invalid
VOResource for records missing some. It is quite possible that your
registry will not validate on the first attempt.
Reading the error messages should give you a hint as to what's wrong. You
can also use the ``gavo val`` command on the RDs that generate invalid
records to figure out what's wrong.

Once your registry passes the validation test, you can add it to the RofR,
and the full registries will start to harvest your registry (after a
while).

.. _Registry of Registries: http://rofr.ivoa.net/


Adapting DaCHS for Your Site
============================

As delivered, the web interface of DaCHS will make it seem you're running a
copy of the GAVO data center, with some metadata defused such that you are
not actually disturbing our operation if you accidentally activate your
registry interface. You should thus first customize the items given in
``etc/defaultmeta.txt`` (as discussed in `Registry Matters`_).

The next adaptations are done through the configuration (as discussed in
`Configuration Settings`_, i.e., usually in /etc/gavo.rc). The most relevant
item here is ``[web]sitename``, which should contain a terse identifier for
the site (like "GAVO Data Center"). It is shown in titles and top headlines
in many places.

If you plan to use DaCHS' embargo feature together with user authorisation,
you must also set ``[web]realm`` to some characteristic string. You could
use the site name here; some user agents use it to display a prompt like
"Credentials for <realm>" or similar.

If you want, you can set ``[web]favicon`` to either a webDir-relative path
or a full URL to a `favicon`_.

It is also advisable to configure ``[general]maintainerAddress`` to a mail
address of a person who will read problem reports. DaCHS doesn't send many
of those yet, but it's still valuable if the software can cry for help if
necessary.

Sending mail only works if the local machine can actually send mail. If
there is no MTA on your machine yet, we recommend ``nullmailer`` as a
lightweight and easy-to-configure sendmail stand-in. If you use something
else, you may need to adapt ``[general]sendmail``.
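
As a minimal sketch of the settings just discussed, the following writes an
illustrative fragment to a scratch file, so nothing in ``/etc/gavo.rc`` is
overwritten by accident. The file name ``gavo.rc.sample`` and all values are
placeholders invented for this example; only the item names are taken from
the text above::

  # Write an illustrative fragment to a scratch file; merge it into
  # /etc/gavo.rc by hand once the values are right for your site.
  # All values below are made-up placeholders.
  printf '%s\n' \
      '[general]' \
      'maintainerAddress: dachs-admin@utopia.example' \
      '' \
      '[web]' \
      'sitename: Utopia Observatory Data Center' \
      'realm: Utopia Observatory Data Center' \
      'favicon: https://www.utopia.example/favicon.png' \
      > gavo.rc.sample

Once such items are in ``/etc/gavo.rc``, ``gavo serve reload`` will probably
pick most of them up; since not all configuration items are applied on
reload, a restart is the safer choice if a change does not show.
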
For the rest, you can customize almost everything by overriding built-in
resources. There are five major entities that you can override:

* `customisation hooks`_
* `userconfig RD`_
* `Simple Web Resources`_
* `Templates`_
* `Overridden System RDs`_

If you find you need to override anything but the logo, please talk to us
first; we'd in general prefer to provide customisation hooks. Overridden
distribution files are always a liability on upgrades.

.. _favicon: https://en.wikipedia.org/wiki/Favicon


Customisation Hooks
-------------------

Operator CSS
''''''''''''

To override CSS rules we distribute or to add new rules, avoid changing
gavo_dc.css as described in `Simple Web Resources`_, as that will be a
liability when upgrading. Instead, drop a CSS file somewhere (recommended
location: GAVO_ROOT/web/nv_static/user.css) and add a configuration item in
``[web]operatorCSS``. With the recommended location, this would work out to
be::

  [web]
  operatorCSS: /static/user.css

in /etc/gavo.rc. This can also be an external URL, but we recommend against
that, as that would force a browser to open one external connection per web
page delivered.

By far the most common complaint is that we are limiting the width of p and
li elements to 40em. We believe that text lines longer than about 80
characters are hard to read and should be avoided. On pages with tables,
users might actually want to run browsers filling the entire screen, so the
line length cannot be controlled by a sensible choice of the window width
on the user side; it requires CSS intervention. Having said that, if you
really think you want window-filling text lines, just put::

  p, li {
    max-width: none;
  }

into your operator CSS.
XSL configuration
'''''''''''''''''

DaCHS employs client-side XSLT for some purposes, for instance to show
OAI-PMH (registry) responses in web browsers, to allow perusing datalink
results in the browser, and to allow web browsers some rudimentary
interaction with UWS applications like TAP. The default XSLT contains
references to the GAVO data center; to change these (or something else),
override the xsl config stylesheet, which is expected at
/static/xsl/dachs-xsl-config.xsl. The recommended way to go about this is::

  cd /var/gavo  # or wherever your DaCHS root is
  cd web/nv_static
  mkdir -p xsl
  cd xsl
  gavo admin dumpDF web/xsl/dachs-xsl-config.xsl > dachs-xsl-config.xsl

Then edit ``dachs-xsl-config.xsl``. Note that you have to restart the
server once to make it notice the override.


Userconfig RD
-------------

Fairly new in DaCHS is an RD exclusively for configuration. This is a place
in which you can put streams that fill certain hooks; we expect to move
more configuration into userconfig.

DaCHS has a builtin RD ``//userconfig`` that is updated as you update
DaCHS. It always contains fallbacks for everything that can be in
userconfig and is used by the core code. To override something, pull the
elements in question into your own userconfig RD and edit them there.

Your own userconfig RD is expected in ``$GAVO_DIR/etc/userconfig.rd``. If
it's not there yet, there's nothing wrong with starting with the
distributed one::

  cd `gavo config configDir`
  gavo admin dumpDF //userconfig > userconfig.rd

Once it's already there, use ``dumpDF //userconfig`` and, say, ``less`` to
pick out the templates for whatever elements you need to copy.

Currently, userconfig is already used in configuring the registry interface
and extending the built-in obscore schema.

Changes to userconfig.rd are picked up by DaCHS but will usually not be
visible in the RDs they end up in. This is because DaCHS does not track
which RDs make use of userconfig, so these will typically need to be
reloaded manually.
For instance, if you changed TAP examples, you'd need to run::

  gavo serve exp //tap

to make your change show up in the web interface. Although usually not
necessary, you can reload userconfig itself using::

  gavo serve exp %


Simple Web Resources
--------------------

For items coming from ``static`` (e.g., images, css, javascript), this
overriding works by dropping same-named files in
``$GAVO_ROOT/web/nv_static``.

Thus, you should put a PNG of your logo into
``$GAVO_ROOT/web/nv_static/img/logo_medium.png``.

Other files you may want to override in this way include

* ``css/gavo_dc.css`` – the central CSS; you could use this for skinning.
  However, we recommend you just add an ``@import url("");`` to the file
  the server delivers by default, since some of the css is almost
  necessary, and you want easy upgrade paths when we change the master CSS.
* ``help.shtml`` – the help file. Unfortunately, we blurb quite a lot about
  GAVO in there right now. We'll think of something more parametrisable,
  but meanwhile you may want to have your own version.
* ``img/logo_big.png``, ``img/logo_tiny.png`` – scaled versions of your
  logo; logo_big should be 200 pixels wide or more, logo_tiny of order 50
  pixels wide.
* ``js/gavo.js`` – could be the place for additional javascript; but
  frankly, if you want custom javascript, write to us and we'll think of a
  sane mechanism.
* ``xsl/oai.xsl``, ``xsl/uws-joblist-to-html.xsl``,
  ``xsl/uws-job-to-html.xsl``, and ``vosi.xsl`` – XSLT stylesheet files. If
  you override these to customize them, please let us know. We'd try to put
  out generic stylesheets that are customisable without having to muck
  around in stuff that's basically functionality.


Templates
---------

There is now a document on `HTML templating in DaCHS`_.

.. _HTML templating in DaCHS: http://docs.g-vo.org/DaCHS/templating.html


Overridden System RDs
---------------------

You can copy system RDs from ``gavo/resources/inputs/__system__`` in the
distribution to ``$GAVO_ROOT/inputs/__system__`` (adapt if you have played
tricks with ``inputsDir``) and edit them there. Again, if you feel you need
to do that, contact us first; maybe we can work something out, since
overridden system RDs are a liability for upgrades.


Other documents
---------------

The default help file and the default sidebar link to a privacy policy that
you should put down in ``$GAVO_ROOT/web/nv_static/doc/privpol.shtml``. The
document must be well-formed XHTML. Also, files with the extension
``shtml`` will be interpreted as templates over the service
``//services/root``, which means that you can use the usual render
functions and data items; the same goes for ``disclaimer.html`` (referenced
from the standard sidebar) and, if you offer SOAP services,
``soaplocal.html``. See the respective pages in the GAVO DC
(``http://dc.g-vo.org/static/doc/...``) for ideas as to what to include.


The Vanity Map
--------------

DaCHS' URL scheme leads to somewhat clunky URLs that, in particular,
reflect the file system underneath. While this doesn't matter to the VO
registry, it is possibly unwelcome when publishing URLs outside of the VO.
To overcome it, you can define "vanity names", single path elements that
are mapped to paths.

These mappings are read from the file ``GAVO_ROOT/etc/vanitymap.txt``. The
file contains lines of the format::

  [