=========
How Do I?
=========
Recipes and tricks for solving problems using GAVO DaCHS
=========================================================================
...skip a row from a rowmaker?
------------------------------
Raise IgnoreThisRow in a procedure application, like this::

  if 2+colX>22:
    raise IgnoreThisRow()

However, it's probably more desirable to use the rowmakers' built-in
``ignoreOn`` feature, possibly in connection with a procedure, since it is
more declarative.
Still, the following is the recommended way to selectively ignore broken
records defined via certain identifiers::

  This proc filters out records too broken to ingest.
  for the set of ids in the toIgnorePar
  set([
    202, 405])
  if @catid in toIgnore:
    raise IgnoreThisRow("Manually ignored from RD")

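The control flow behind both variants is plain exception handling. The following stand-alone sketch is not DaCHS code: the ``IgnoreThisRow`` class, ``ignore_broken_records`` and ``make_rows`` are hypothetical stand-ins that only mimic how a rowmaker could drop a record when a procedure raises.

```python
class IgnoreThisRow(Exception):
    """Hypothetical stand-in for the exception DaCHS offers to rowmakers."""


# The ids of the records considered too broken to ingest.
TO_IGNORE = {202, 405}


def ignore_broken_records(row):
    """Raise IgnoreThisRow for records whose catid is listed in TO_IGNORE."""
    if row["catid"] in TO_IGNORE:
        raise IgnoreThisRow("Manually ignored from RD")
    return row


def make_rows(raw_rows, proc):
    """Apply proc to each raw row, dropping the rows it asks to ignore."""
    result = []
    for row in raw_rows:
        try:
            result.append(proc(row))
        except IgnoreThisRow:
            continue
    return result


rows = make_rows(
    [{"catid": 201}, {"catid": 202}, {"catid": 406}],
    ignore_broken_records)
```

Only the record with ``catid`` 202 is dropped here; the other two pass through unchanged.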
...skip a single source?
------------------------
If you want to skip processing of a source, you can raise SkipThis from
an appropriate place. Usually, this will be a sourceFields element,
like this::

  if len(sourceToken)>22:
    raise base.SkipThis("%s skipped since I didn't like the name"%
      sourceToken)

...set a constant input to a core?
----------------------------------
Use a service input key with a Hidden widget factory and a default::
...
...get a multi-line text input for an input key?
------------------------------------------------
Use a widgetFactory, like this::
...add computed columns to a dbCore output?
-------------------------------------------
Easy: define an output field with a select attribute, e.g.::
This will add an output field that looks to the service like it comes
from the DB proper but contains the value of the ``ev_i`` column
multiplied by 5.434.
The expression must be valid SQL.
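To illustrate what "valid SQL" means here, this is a small sketch using Python's built-in ``sqlite3`` module rather than DaCHS and PostgreSQL. The table name ``events`` and the alias ``scaled`` are made up for the illustration; only the ``ev_i`` column and the factor come from the text above. The point is merely that the expression is evaluated by the database, not by Python.

```python
import sqlite3

# An in-memory database standing in for the table queried by the core.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ev_i REAL)")
conn.executemany("INSERT INTO events VALUES (?)", [(1.0,), (2.0,)])

# The kind of expression one would put into the select attribute.
select_expr = "ev_i*5.434"

computed = conn.execute(
    "SELECT ev_i, %s AS scaled FROM events ORDER BY ev_i" % select_expr
).fetchall()
conn.close()
```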
...make an input widget to select which columns appear in the output table?
---------------------------------------------------------------------------
In general, selecting fancy output options currently requires custom
cores or custom renderers. Ideas on how to change this are welcome.
For this specific purpose, however, you can simply define a service key
named _ADDITEM. This would look like this::
....
...
Setting showItems to -1 gives you checkboxes rather than a select list,
which is mostly what you want. Try with and without and see what you
like better.
If you do that, you *probably* do not want the standard "additional
fields" widget at the bottom of the form. To suppress it, add a line
::
True
to the service definition. The "True" in there is actually a bit of a
red herring: the widget is suppressed for any value.
...add an image to query forms?
--------------------------------
There are several variations on that theme -- you could go for a
custom template if you want to get fancy, but usually putting an image
into an _intro or _bottominfo meta section should do.
In both cases, you need a place to get your image from. While you could
put it somewhere into rootDir/web/nv_static, it's probably nicer to have
it within a resource's input directory. So, add a static renderer to
your service, like this::
static
This lets you put service-local static data into resdir/static/ and
access it as /static/
Usually, your _intro or _bottominfo will be in reStructuredText. Plain
images work in there using substitution references or simply the naked
image directive::

  The current data set comprises these fields:

  .. image:: \servicelink{cars/q/cat/static/fields-inline.png}

The servicelink macro ensures that the image is still found
if the server runs off-root.
This is the recommended way of doing things. If, however, you insist on
fancy layouts or need complete control over the appearance of your
image (or whatever), you can use the evil "raw" meta format::
]]>
Make sure you enter valid HTML here, no checks are done by the DC
software.
...import data coming in to a service?
--------------------------------------
In a custom renderer or core, you can use code like::

  from gavo import api

  ...
  def importData(self, srcName, srcFile):
    # "import" is a reserved word in Python and cannot be a method name
    connection = api.getDBConnection("admin")
    dd = self.service.rd.getById("myDataId")
    self.nAffected = api.makeData(dd, forceSource=srcFile,
      connection=connection).nAffected
    connection.commit()
    connection.close()

You want to use a separate connection since the default connections
obtained by cores and friends are unprivileged and typically cannot
write to tables.

``nAffected`` then contains the total number of records imported and
could be used in a custom render function.

srcName and srcFile come from a formal File form item. In
submitActions, you obtain them like::

  srcName, srcFile = data["inFile"]

Note that you can get really fancy and manipulate data in some way up
front. That could look like this::

  # rsc is gavo.rsc here (from gavo import rsc)
  data = rsc.Data.create(dd, parseOptions=api.parseValidating,
    connection=connection)
  data.addMeta("_stationId", self.stationRecord["stationId"])
  self.nAffected = api.makeData(dd, forceSource=srcFile, data=data,
    connection=connection).nAffected

...change the query issued on SCS queries?
------------------------------------------
You may want to do that because for some reason there is no q3c index on
the queried table, or the semantics aren't quite a point vs. point cone
search but close enough.
Sadly, this is quite complicated right now since our inheritance
mechanism ("original") is so simple-minded. This will hopefully improve
with a generic record/replay mechanism we're thinking about.
That said, the current way looks like this (for a query that does a
proximity search on bboxes)::
bbox < %%(%s)s"%(
vizierexprs.getSQLKey("RA", inPars["RA"], outPars),
vizierexprs.getSQLKey("DEC", inPars["DEC"], outPars),
vizierexprs.getSQLKey("SR", inPars["SR"], outPars))
]]>
-- so, you are inheriting from the SCS condition on three levels and then
override the genQuery function defined in the common setup code. The
way the condDescs are written, you must return rather than yield
the actual query string. See the tutorial on how condDesc code
works in general. The semi-good news is that if you want the same thing
for an SCS query, you can reuse part of what you did above::
...create database views in data elements?
------------------------------------------
There's a catch going beyond using simpleView or tables with
viewStatements: views evidently depend on the existence of tables.
It would seem this does not hurt when you have a data definition like::
where ``v`` is the view depending on the table ``a``. The trouble with
this is that as soon as you change ``a``, the data build cannot be
re-created; you cannot even drop it if you deleted the view ``v`` (in that
case, just manually drop table ``a``, and things work again).
The reason for this odd behaviour is somewhat subtle, and I'll explain
it here when someone asks. The upshot, however, is: Never ``make`` a
view in the same ``data`` as a table it depends on. The right way to
do what's intended above is::
...fix duplicate values?
------------------------
There are many reasons why you could violate the uniqueness constraints on
primary keys, but let's say you just got a message saying::
Primary key could not be added ('could not create unique
index "data_pkey" DETAIL: Table contains duplicated values.)'
The question at this point is: What are the duplicated values? For a
variety of reasons, DaCHS applies constraints only after inserting
all the data, so the error will occur at the end of the input. Not even
the ``-b1`` trick will help you here.
Instead, temporarily remove the primary key condition from the RD and
import your data.
Then, execute a query like::

  select *
  from (
    select <primary-key-columns>, count(*) as ct
    from <your-table>
    group by <primary-key-columns>) as q
  where ct>1;

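To see the grouping pattern at work without touching a production database, here is a small sketch using Python's built-in ``sqlite3`` module. The table ``data`` and the columns ``obs_id`` and ``flux`` are invented for the illustration; in a real case you would put your own table and the columns of the intended primary key in their place and run the query in PostgreSQL.

```python
import sqlite3

# A throwaway table in which obs_id is supposed to be unique but is not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (obs_id INTEGER, flux REAL)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(1, 0.5), (2, 1.5), (2, 1.7), (3, 2.0), (3, 2.1), (3, 2.2)])

# Count the rows per would-be key and keep the keys occurring more than once.
duplicates = conn.execute("""
    SELECT *
    FROM (
        SELECT obs_id, count(*) AS ct
        FROM data
        GROUP BY obs_id) AS q
    WHERE ct>1
    ORDER BY obs_id""").fetchall()
conn.close()
```

Each result row gives a key value together with the number of times it occurs, which tells you which input records to inspect.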
...define an input field doing google-type full text searches?
--------------------------------------------------------------
Since version 8.3 (or so), postgres supports query modes inspired by
information retrieval on text columns -- basically, you enter a couple
of terms, and postgres matches all strings containing them.
Within DaCHS, this is currently only supported using custom phrase
makers. This would look like this::

  yield ("to_tsvector('english', description)"
    " @@ plainto_tsquery('english', %%(%s)s)"%(
    base.getSQLKey("columnwords", inPars["columnwords"], outPars)))

-- here, ``description`` is the column containing the strings, and the
``'english'`` in the function arguments gives the language according
to which the strings should be interpreted.
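The ``%%(%s)s`` in the phrase maker can look puzzling at first. The sketch below shows the string handling only; ``get_sql_key`` is a hypothetical, much simplified stand-in for ``base.getSQLKey``, assumed here to store the value under a unique key in ``out_pars`` and return that key. The real function may behave differently in its details.

```python
def get_sql_key(name, value, out_pars):
    """Hypothetical stand-in: store value under a unique key, return the key."""
    key = name
    counter = 0
    while key in out_pars:
        key = "%s%d" % (name, counter)
        counter += 1
    out_pars[key] = value
    return key


out_pars = {}

# %% collapses to a literal %, and %s is replaced by the key, so the result
# contains a named placeholder that the database driver fills in later.
condition = ("to_tsvector('english', description)"
    " @@ plainto_tsquery('english', %%(%s)s)" % (
    get_sql_key("columnwords", "hot white dwarf", out_pars)))
```

After this, ``condition`` ends in ``%(columnwords)s`` and the search terms sit in ``out_pars``, so user input never gets pasted into the SQL string itself.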
You may want to create an index supporting this type of query on the
respective columns, too. To do that, say::
to_tsvector('english', bibref)
...put more than one widget into a line in web forms?
-----------------------------------------------------
Use input table groups with a ``compact`` style property.
In DB cores, however, you probably do not want to give inputTables
explicitly since it's much less hassle to have them computed from
the condDescs. In this case, the widgets you want to group probably
come from a single condDesc. To have them in a group, define a
group within the condDesc without any paramRefs (or colRefs) -- they
cannot be resolved anyway. Give the group style and label properties,
and it will be added to the input table for all fields of the condDesc::
compactExample vals
If you are doing this, you probably want to use the ``cssClass``
property of input keys and the ``customCSS`` property of services. The
latter can contain css specifications. They are added into form pages
by the defaultresponse template (in your custom templates, you should
have ```` in the head if
you want this functionality). These can be used to style form elements
by their css class, which in turn is given by specifying ``cssClass``
properties on inputKeys.
Here's an example that uses CSS to insert material, which currently is
the only way to input something between the fields (short of redefining
the widgets). This may be overdoing it since the usability of this
widget without CSS is questionable (whereas it would be fine if the
group were non-compact and no CSS tricks were played); first a condDesc
template::
A condDesc for a mass fraction. These consist of an element label,
a center value and a weird way of specifying the error.
There can be a few of them for a given service, and thus you need to
define the macro item. It is used to disambiguate the names.
a_val0.3a_fuzzMass fraction of an element. The condition expands
to c/10^r ≤ mass fraction(Element) ≤ c*10^rMass Fraction \itemcompact
And here's a service with that ``condDesc``, including the custom css::
Theoretical spectra of hot compact stars
input.a_val {width:3em}
input.a_fuzz {width: 3em}
span.a_val:before { content:" in "; }
span.a_fuzz:before { content:" ± "; }
span.a_fuzz:after { content:" dex"; }
Note that we are styling both ``input`` and ``span`` elements with the
css class we set. Before and after can only be used on the span since
input elements do not support before and after. For that reason, DaCHS
wraps each element within a compact group with a span of the same css
class.
Also see the next question.
...get a range input widget?
----------------------------------------------------
Well, VizieR expressions let your users specify intervals and more, but
ok, they would need to read docs to know that, so there's a case to be
made for widgets like::
Phlogistics between __________ and ___________
These basically work as discussed in grouping widgets above, but since
stuff like this is fairly common, there's built-in support for this in
//procs#rangeCond. This is a stream requiring three definitions:
* name -- the column name in the core's queried table
* groupdesc -- a terse phrase describing the range. This will be
used in the description of both the input keys and the group
* grouplabel -- a label (include the unit, it is not taken from InputKey)
written in front of the form group
groupdesc has to work after "Range of", "Lower bound of", and
"Upper bound of". Do not include a concluding period in groupdesc.
Here's an example::
...use a custom input widget for a field?
----------------------------------------------------
Right now, you cannot really; we're still waiting for enlightenment
on how to sensibly do that from RDs. Pester us if you really want this.
Meanwhile, to have little tags like the explanations of the
vizier-like-expressions, you can use a custom widget with fields. This
could look like this::
Designated object on the plate (i.e.,
the object the observers entered into their journal).
You can use wildcards if you prefix your expression with "~".
widgetFactory(StringFieldWithBlurb, additionalMaterial=
T.a(href="/objectlist")[
"[List of known objects]"])
Here, instead of the String in ``StringFieldWithBlurb``, you could have
used Numeric or Date, and then used vexpr-float or vexpr-date,
respectively, for the inputKey's type.
The value of the ``additionalMaterial`` argument is some nevow stan.
Info on what you can write there can be found elsewhere.
...cope with "Undefined"s in FITS headers?
------------------------------------------
Pyfits returns some null values from FITS headers as instances of
"Undefined" (note that this is unrelated to DaCHS' base.Undefined). If
you want to parse from values that *sometimes* are Undefined, use code
like::

  parseWithNull(@RA,
    lambda l: hmsToDeg(l, ":"),
    checker=lambda l: isinstance(l, utils.pyfits.Undefined))

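The idea behind the ``checker`` argument can be tried out without DaCHS or pyfits. Everything in the following sketch is a simplified, hypothetical stand-in: ``Undefined`` plays the role of the pyfits class, and ``parse_with_null`` and ``hms_to_deg`` only imitate the behaviour relevant here, not the real functions' full signatures.

```python
class Undefined:
    """Hypothetical stand-in for the pyfits Undefined class."""


def hms_to_deg(literal, sep):
    """Convert an h:m:s time angle to degrees (simplified, no validation)."""
    hours, minutes, seconds = (float(p) for p in literal.split(sep))
    return (hours + minutes/60. + seconds/3600.)*15


def parse_with_null(literal, parser, checker=None):
    """Return None if checker flags literal as null, else parser(literal)."""
    if checker is not None and checker(literal):
        return None
    return parser(literal)


def parse_ra(value):
    """Parse a right ascension that may arrive as an Undefined instance."""
    return parse_with_null(value,
        lambda l: hms_to_deg(l, ":"),
        checker=lambda l: isinstance(l, Undefined))
```

With this, ``parse_ra(Undefined())`` yields ``None`` instead of failing in the string operations, while regular header values are converted as usual.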
...force the software to accept weird file names?
-------------------------------------------------
When importing products (using //products#define or derived procDefs),
you may see a message like::
File path
'rauchspectra/spec/0050000_5.00_H+He_0.000_1.meta'
contains characters known to the GAVO staff to be hazardous in URLs.
Please defuse the name before using it for published names.
The machinery warns you against using characters that need escaping in
URLs. While the software itself should be fine with them (it's a bug if
it isn't), such characters make other software's lives much harder – the
plus above, for example, may be turned to a space if used in a URL.
Thus, we discourage the use of such names, and if at all possible, you
should try and use simpler names. If, however, you insist on such
names, you can simply write something like::
"\schema.newdata"\inputRelativePath{True}
(plus whatever else you want to define for that rowfilter, of course) in
the respective data element.
...handle formats in which the first line is metadata?
---------------------------------------------------------
Consider a format like the metadata for stacked spectra::

  C 2.3 5.9 0
  1 0 0 1
  2 1 0 1
  ...

-- here, the first line gives a center position in degrees, the following
lines offsets to that.
For this situation, there's the grammar's sourceFields element. This is
a python code fragment returning a dictionary. That dictionary's
key-value pairs are added to each record the grammar returns.
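What "added to each record" amounts to can be sketched in plain Python, outside of any grammar. The functions ``source_fields`` and ``iter_records`` and the keys ``num``, ``dra``, ``dde`` and ``flag`` are made up for this illustration (the source does not name the columns of the data lines); only ``cra`` and ``cde`` for the center position follow the example discussed here.

```python
import os
import tempfile


def source_fields(source_token):
    """Return the values from the first line that apply to every record."""
    with open(source_token) as f:
        _, cra, cde, _ = f.readline().split()
    return {"cra": float(cra), "cde": float(cde)}


def iter_records(source_token):
    """Yield one dict per data line, merged with the first-line metadata."""
    extra = source_fields(source_token)
    with open(source_token) as f:
        f.readline()  # skip the metadata line
        for line in f:
            if not line.strip():
                continue
            num, dra, dde, flag = line.split()
            record = {"num": int(num), "dra": float(dra),
                "dde": float(dde), "flag": int(flag)}
            record.update(extra)
            yield record


# Write a small sample file in the format shown above and parse it.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("C 2.3 5.9 0\n1 0 0 1\n2 1 0 1\n")
records = list(iter_records(tmp.name))
os.unlink(tmp.name)
```

Every record then carries the center position next to its own offsets, which is what a rowmaker needs to compute absolute positions.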
For this example, you could use the following grammar::

  with open(sourceToken) as f:
    _, cra, cde, _ = f.readline().split()
  return {
    "cra": float(cra),
    "cde": float(cde)}

In the rowmaker, you could then do something like this::
...use binaries when the gavo directory is mounted from multiple hosts?
-----------------------------------------------------------------------
If your GAVO_ROOT is accessible from more than one machine and the
machines have different architectures (e.g., i386 and amd64 or ppc,
corresponding to test machine and production server), compiled binaries
(e.g., getfits, or the preview generating code) will only work on one of
the machines.
To fix this, set platform in the [general] section of your config file.
You can then rename any platform-dependent executable to ``base-<platform>``,
and on the respective platform, that binary will be used. This also
works for computed resources using binaries, and those parts of the DC
software that build binaries (e.g., the booster machinery) will
automatically add the platform postfix.
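The lookup described above can be pictured as follows. This is only a sketch of the idea, not the code DaCHS uses: ``resolve_binary`` is a hypothetical helper, and the assumption that an unsuffixed binary serves as fallback is mine.

```python
import os
import tempfile


def resolve_binary(base_path, platform):
    """Return base_path-platform if such a file exists, else base_path."""
    if platform:
        candidate = "%s-%s" % (base_path, platform)
        if os.path.exists(candidate):
            return candidate
    return base_path


# Try it with a fake binary that exists for the amd64 platform only.
with tempfile.TemporaryDirectory() as bin_dir:
    base = os.path.join(bin_dir, "getfits")
    open(base + "-amd64", "w").close()
    on_amd64 = os.path.basename(resolve_binary(base, "amd64"))
    on_i386 = os.path.basename(resolve_binary(base, "i386"))
```

On the machine configured as ``amd64``, the suffixed file is picked; on the other one, the plain name is returned since no matching suffixed file exists.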
If you build your own software, a make file like the following could be
helpful::

  PLATFORM=$(shell gavo config platform)
  TARGET=@@@binName@@@-$(PLATFORM)
  OBJECTS=@@@your object files@@@

  # Recipe lines must start with a tab character in the actual makefile.
  $(TARGET): buildstamp-$(PLATFORM) $(OBJECTS)
          $(CC) -o $@ $(OBJECTS)

  buildstamp-$(PLATFORM):
          $(MAKE) clean
          rm -f buildstamp-*
          touch buildstamp-$(PLATFORM)

You'll have to fill in the @@@.*@@@ parts and probably write rules for
building the files in $(OBJECTS), but otherwise it's a simple hack to make
sure a make on the respective machine builds a suitable binary.