========================== Common Problems with DaCHS ========================== :Author: Markus Demleitner :Email: gavo@ari.uni-heidelberg.de .. contents:: :depth: 1 :backlinks: entry :class: toc This document tries to discuss some error messages you might encounter while running DaCHS. The rough idea is that when you can grep in this file and get some deeper insight as to what happened and how to fix it. We freely admit that some error messages DaCHS spits out are not too helpful. Therefore, you're welcome to complain to the authors whenever you don't understand something DaCHS said. Of course, we're grateful if you checked this file first. DistributionNotFound ==================== When trying to run any program, you may see tracebacks like:: Traceback (most recent call last): File "/usr/local/bin/gavo", line 5, in from pkg_resources import load_entry_point [...] File "/usr/lib/python2.6/dist-packages/pkg_resources.py", line 552, in resolve raise DistributionNotFound(req) pkg_resources.DistributionNotFound: gavodachs==0.6.3 This is usually due to updates to the source code when you have installed your source in development mode. Simple do ``sudo python setup.py develop`` in the root of the source distribution again. Another source of this error can be unreadable source directories. Please check that the user that's trying to execute the command can actually read the sources you checked out. 'gavodachs' package upgrade fails ================================= If you try to upgrade an older version (< 0.9) of the 'gavodachs' package, e.g. by typing:: sudo apt-get update && sudo apt-get upgrade it could happen that you run into troubles when the gavodachs server is going to be stopped (and restarted). If the server stop fails, the installation of the 'gavodachs-server' package will aborted which leaves the package in a half-configured state. The corresponding error message would be something similar to:: Stopping VO server: dachsTraceback (most recent call last): [...] File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 584, in resolve raise DistributionNotFound(req) pkg_resources.DistributionNotFound: gavodachs==0.9 In that case try:: sudo dpkg --remove --force-all python-gavodachs sudo apt-get -f install gavodachs-server With these commands you should end up in the state obtained by a successful package upgrade. ignoreOn in a rowmaker doesn't seem to work =========================================== The most likely reason is that you are testing for the presence of a key that is within the table. This will not work since rowmakers add key->None mapping for all keys missing but metioned in a map (also implicitely via ``idmaps``. If more than one rowmake operate on a source, things get really messy since rowmakers *change* the row dictionaries in place. Maybe this should change at some point, but right now that's the way it is. Thus, you can *never* reliably expect keys used by other tables to be present or absent since you cannot predict the order in which the various table's rowmakers will run. To fix this, you can check against that key's value being NULL, e.g., like this:: You could also instruct the rowmaker to ignore that key; this would require you to enumerate all rows you want mapped. Import fails with "Column xy missing" and very few keys ======================================================= This error is caused by the row validation of the table ingestor – it wants to see values for all table columns, and it's missing one or more. This may be flabbergasting when your grammar yields the fields that are missing here. The reason is simple: You must map them in the rowmaker. If you see this error, you probably wanted to write ``idmaps="*"`` or something like that in the rowmaker. Server is Only Visible from the Local Host ========================================== When the server is running (``gavo serve start``) and you can access pages from the machine the server runs on just fine, but no other machines can access the server, you run the server with the default web configuration. It tells the server to only bind to the loopback interface (127.0.0.1, a.k.a. localhost). To fix this, say:: [web] bindAddress: in your /etc/gavo.rc. Transaction Deadlocking ======================= When gavo imp (or possibly requests to the server) just hangs without consuming CPU but not doing anything useful, it is quite likely that you managed to provoke a deadlock. This happens when you have a database transaction going on a table while trying to access it from the outside. To give an example:: from gavo import base from gavo import rsc t = rsc.TableForDef(tableDefForFoo) q = base.SimpleQuerier().query("select * from foo") This will deadlock if tableDefForFoo actually defines an onDisk table foo. The reason is that instanciating a database table object will create a connection and start a transaction (e.g., to see if the table is actually present on disk). SimpleQuerier, on the other hand, creates another connection and another transaction. In general, the result of this second transaction will depend on the outcome of the first one. Postgres will notice that and postpone creating the result until the t's transaction if finished. That will never happen with this code. To diagnose what's happening, it is useful to see the server's idea of what is going on inside itself. The following script (that you might call psdb) will help you:: #!/bin/sh psql gavo << EOF select procpid, usename, current_query, date_trunc('seconds', query_start::time) from pg_stat_activity order by procpid EOF (this assumes your database is called gavo and you have sufficient rights on that database; it's not hard to figure out the psql command line for other scenarios). This could output something like:: procpid | usename | current_query | date_trunc ---------+-----------+-------------------------+------------ 9301 | gavoadmin | | 16:55:39 9302 | gavoadmin | in transaction | 16:55:39 9303 | gavoadmin | in transaction | 16:55:39 9306 | gavoadmin | in transaction | 16:55:43 9309 | gavoadmin | SELECT calPar FROM l... | 16:55:43 (5 Zeilen) The procpid is the pid of the process handling the connection. Usually, you will see one running query and possibly quite a few connections that are idle in transaction (which are tables waiting to be fed, etc.). The query should give you some idea where the deadlock occurs. To escape the deadlock (which, under CPython, will block ^C as well), kill the process trying the query -- this will give you a traceback to the offending instruction. Of course, you will need to become the postgres or root user to do that, so it may be easier to forego the traceback and just kill gavo imp. To fix such a situation, there are various options. You could commit the table's transaction:: from gavo import base from gavo import rsc t = rsc.TableForDef(tableDefForFoo) t.commit() q = base.SimpleQuerier().query("select * from foo") but that is not usally what you want to do. Much more often, you want to execute the second query in t's transaction. In this case, this could work like this:: from gavo import base from gavo import rsc t = rsc.TableForDef(tableDefForFoo) q = base.SimpleQuerier(connection=t.connection).query("select * from foo") Of course, it is not always easy to access the connection object. Note, however, that in most procedure definitions, you have the target data set available as data. If you have that, you can usually obtain the current connection (and thus transaction) via:: data.getPrimaryTable().connection -- at least if you designate one of the data's tables as primary through its make elements. 'prodtblAccref' not found in a mapping ====================================== You get this error message when you make a table that mixes in //products#table (possibly indirectly, e.g., via SSAP or SIAP mixins) with a grammar that does not use the //products#define row filter. So, within the grammar, say (at least, see the reference documentation for other parameters for rowgen):: \schema.table -- substituting dest.table with the actual name of the table fed. The reason why you need to manually give the table is that the grammar doesn't now what table the rows generated end up in. On the other hand, the information needs to be added in the grammar, since it is fed both to your table and the system-wide products table. I get "Column ssa_dstitle missing" when importing SSA tables ============================================================ The ``//ssap#setMeta`` rowmaker application does not directly fill the output rowdict but rather defines new input symbols. This is done to give you a chance to map things set by it, but it means that you must idmap at least all ssa symbols (or map them manually, but that's probably too tedious). So, in the rowmaker definition, you write:: "unpack requires a string argument of length" ============================================= These typically come from a binary grammar parsing from a source with armor=fortran. Then, the input parser delivers data in parcels given by the input file, and the grammar tries to parse it into the fields given in your binaryRecordDef. The error message means that the two don't match. This can be because the input file is damaged, you forgot to skip some header, but it can also be because you forgot fields or your binaryRecordDef doesn't match the input in some other way. "resource directory '' does not exist" ================================================ DaCHS expects each RD to have a "resource directory" that contains input files, auxillary data, etc. Multiple RDs may share a single resource directory. By default, the resource directory is /. If you don't need any auxillary files, the resdir doesn't need to exist. In that case, you'll see the said warning. To suppress it, you could just say:: The ``__system`` resource directory is used by the built-in RDs and thus should in general exist. However, the recommended layout is, below ``inputsDir``, a subdirectory named like the resource schema, and the RD immediately within that subdirectory. In that case, you don't need a resdir attribute. Only RDs from below inputsDir may be imported ============================================= RDs in DaCHS must reside below ``inputsDir`` (to figure out what that is on your installation, say ``gavo config inputsDir``). The main reason for that restriction is that RDs have identifiers, and these are essentially the inputsDir-relative paths of the file. Out-of-tree RDs just cannot compute this. Therefore, most subcommands that accept file paths just refuse to work when the file in question is not below inputsDir. Not reloading services RD on server since no admin password available ===================================================================== That's a warning you can get when you run ``gavo pub``. The reason for it is that the DaCHS server caches quite a bit of information (e.g., the root page) that may depend on the table of published services (see also `Managing Runtime Resources`_. Therefore, ``gavo pub`` tries to make the running server discard such caches. To do this, it inspects the ``serverURL`` config item and tries access a protected resource. Thus, it needs the value of the config setting ``adminpasswd`` (if set), and that needs to be identical on the machine gavo pub executes on and on whatever serverURL points to. If anything goes wrong, a warning is emitted. The publication has happened still, but you may need to run ``gavo serve reload`` on the server to make it visible. .. _Managing Runtime Resources: opguide.html#managing-runtime-resources I'm getting "No output columns with these settings." instead of result rows =========================================================================== This is particularly likely to happen with the scs.xml renderer. There, it can happen the the server doesn't even bother to run database queries but instead keeps coming back with an error message ``No output columns with these settings.``. This happens when the "verbosity" (in SCS, this is computed as 10*VERB) of the query is lower than the verbLevel of all the columns. By default, this verbLevel is 20. In order to ensure that a column is returned even with VERB=1, say:: gavo imp dies with Permission denied: u'/home/gavo/logs/dcErrors' ================================================================= (or something similar). The reason for these typically is that the user that runs ``gavo imp`` is not in the ``gavo`` group (actually, whatever [general]gavoGroup says). To fix it, add that user. If that user was, say, fred, you'd say:: sudo adduser fred gavo Note that fred will either have to log in and out (or similar) or say ``newgrp gavo`` after that. Warnings about popen, md5, etc being deprecated =============================================== The python-nevow package that comes with Debian sequeeze is outdated. Install http://docs.g-vo.org/python-nevow_0.11.0-1_all.deb I'm using reGrammar to parse a file, but no splitting takes place ================================================================= This mostly happens for input lines like ``a|b|c``; the underlying problem is that you're trying to split along regular expression metacharacters. The solution is to escape the the metacharacter. In the example, you wouldn't write:: but rather:: IntegrityError: duplicate key value violates unique constraint "products_pkey" ============================================================================== This happens when you try to import the same "product" twice. There are many possible reasons why this might happen, but the most common (of the non-obvious ones) probably is the use of updating data items with row triggers. If you say something like:: ... ... you're doing it wrong. The reason this yields IntegrityErrors is that if the ignoreOn trigger fires, the row will not be inserted into the table ``data``. However, the make feeding the ``dc.products`` table implicitely inserted by the ``//products#table`` mixin will not skip an ignored image. So, it will end up in ``dc.product``, but on the next import, that source will be tried again – it didn't end up in my.data, which is where ignoreSources takes its file names from –, and boom. If you feed multiple tables in one data and you need to skip an input row entirely, the only way to do that reliably is to trigger in the grammar, like this::
... ... gavo init/installation dies with UnicodeDecodeError: 'ascii' codec... ===================================================================== The full signature is something like:: File "/usr/lib/python2.7/dist-packages/pyfits/core.py", line 103, in formatwarning return unicode(message) + '\n' UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 65: ordinal not in range(128) This is a bug in pyfits, together with carelessness in passing through error messages on our side. We'll see which side will fix this first; meanwhile, the easy workaround is to set ``lc_messages = 'C'`` in ``postgresql.conf`` (e.g., ``/etc/postgresql/9.1/main/postgresql.conf`` on Debian wheezy). That's probably a good idea anyway since TAP may expose postgres error messages to the user, and these aren't nearly as useful to remote users as to you if they're in your local language. relation "dc.datalinkjobs" does not exist ========================================= This happens when you try to run asynchronous datalink (the dlasync renderer) when you've not created the datalink jobs table. This is not (yet) done automatically on installation since right now we consider async datalink to be a bit of an exotic case. To fix this, run:: gavo imp //datalink