(Main tables of) the Gaia data releases.

DR3 sources
-----------

Docs at https://gea.esac.esa.int/archive/documentation/GDR3/

A whopping 175M of PDF of that is on Markus' box at
/media/incoming/GaiaDR3_documentation_1.1.pdf for a while.

We're carrying a gaia_source lite of these as usual.  The dump was made
with bin/dumpdr3sources.sh.

The test set was obtained with

  zgrep "^206259.............[[:space:]]" data3/sources-dump.txt.gz > sample.txt

(the odd selection is because of what happened to be the test set for
edr3, which is then reflected in several other data sets here).

DR3 spectra
-----------

[Not up-to-date for MC-sampling yet; gist here: run
bin/dr3_to_mcsampled.py]

Our main thing here is the spectra.  Upstream has the majority of them
in "continuous" form, which we want to rectify by providing sampled (if
noisy) spectra for all.

To get there, we first dumped the summary and continuous tables using
bin/dr3dumpxpspec.sh from Gaia-ARI.  Warning: for the cont spectra, this
takes many days.

Then, we ran bin/dr3_to_sampled.py; again, that's long running (~2
days).  That creates data3/xp_sampled_computed.txt, which is pigz-ed and
then imported.  The intermediate xp_continuous_dump.txt.gz is too large
to just keep around and is removed.

Test set for the sampled spectra (in order to have spectra for eDR3 test
set members):

  zgrep "^2062598............[[:space:]]" data3/xp_sampled_computed.txt.gz > sample.txt

Note that the test set needs to be drawn from sampled_computed, as
re-sampling locally will result in different spectra.

eDR3
----

Again, we produced an intermediate dump from mintaka using
bin/dumpedr3lite.sh (no longer in version control; we don't do any edr3
stuff any more).

Test set: first 3000 rows of that dump.

DR1
---

DR1 data we pulled directly from Gregory's machine mintaka; this assumes
the importing user's pubkey is in msdemlei's authorized_keys.

Test data: In a psql on the source server:

  \copy (select * from gaia.dr1 where source_id between 34359738368*57575896 and 34359738368*57575897) to 'tmp.dump';

Move tmp.dump to the test server.  Briefly have a in the q.rd and run
``dachs imp q tmp``.  Then, in a psql:

  \copy gaia.dr1 from tmp.dump

DR2 light+epochs
----------------

DR2 DM documentation is at
http://gea.esac.esa.int/archive/documentation/GDR2/pdf/GaiaDR2_documentation_1.0.pdf

The epoch photometry isn't documented there; I took what I could from
the rendered epoch photometry, e.g.
http://geadata.esac.esa.int/data-server/data?RETRIEVAL_TYPE=epoch_photometry&ID=1000103304040175360

DR2 "light" we dumped using bin/dumpdr2light.sh

DR2 RUWE
--------

http://www.rssd.esa.int/doc_fetch.php?id=3757412
(GAIA-C3-TN-LU-LL-124-01) discusses a quality measure called RUWE.
Since we're stingy on the normal quality measures, we've added this one
into DR2 light.

There's code to compute the RUWE in res/ruwe.py; that's adapted from
code I got from Jan Rybizki and was used to create data2/ruwes.txt
(which was a lengthy process, so don't repeat it unless necessary).

To add the ruwes to dr2light, use

  dachs imp q2 import_ruwes
  dachs imp -R q2 insert_ruwes
  dachs drop q2 import_ruwes

Samples
-------

The test set assumes data from 7-Healpix 49384, i.e. between source_ids
1737527439048835072 and 1737562623420923904.

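
The factor 34359738368 in the DR1 test data query is 2**35.  As far as
we understand the Gaia source_id layout, dividing a source_id by that
number gives its level-12 nested HEALPix index, so the query picks out
one level-12 pixel.  A minimal sketch of that arithmetic follows; the
function names are made up for illustration and are not part of this
repository:

```python
# Assumption: the level-12 nested HEALPix index of a Gaia source is
# source_id // 2**35 (the factor used in the DR1 test data query).
LEVEL12_FACTOR = 2**35  # == 34359738368


def level12_pixel(source_id):
    """Return the level-12 HEALPix index encoded in a source_id."""
    return source_id // LEVEL12_FACTOR


def level12_bounds(pixel):
    """Return the (low, high) pair as used in the DR1 BETWEEN clause."""
    return pixel * LEVEL12_FACTOR, (pixel + 1) * LEVEL12_FACTOR


low, high = level12_bounds(57575896)
```

Since SQL's BETWEEN is inclusive on both ends, the upper bound here is
already the smallest possible source_id of the next pixel.
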
To re-generate the sample (don't do this on alnilam, obviously), go to
data2 and say:

  ssh alnilam "psql gavo -c '\copy (SELECT source_id, random_index, \
    ra, dec, pmra, pmdec, parallax, ra_error, dec_error, pmra_error, \
    pmdec_error, parallax_error, astrometric_gof_al, \
    astrometric_params_solved, phot_g_mean_flux, phot_g_mean_mag, \
    phot_g_mean_flux_error, phot_rp_mean_flux, phot_rp_mean_mag, \
    phot_rp_mean_flux_error, phot_bp_mean_flux, phot_bp_mean_mag, \
    phot_bp_mean_flux_error, phot_rp_bp_excess_factor, \
    radial_velocity, radial_velocity_error \
    FROM gaia.dr2light \
    WHERE source_id BETWEEN 1737527439048835072 AND 1737562623420923904) \
    to stdout'" | gzip > dump.txt.gz

  ssh alnilam "psql gavo -c '\copy (SELECT source_id, transit_id, \
    g_transit_time, g_transit_flux, g_transit_flux_error, bp_obs_time, \
    bp_flux, bp_flux_error, rp_obs_time, rp_flux, rp_flux_error, \
    solution_id \
    FROM gaia.dr2epochflux \
    WHERE source_id BETWEEN 1737527439048835072 AND 1737562623420923904) \
    to stdout'" | gzip > epoch_dump.txt.gz
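
After regenerating, it may be worth checking that the dumps really only
contain rows from the intended source_id range.  The following is a
rough sketch, not something that lives in bin/; count_out_of_range is a
hypothetical helper.  It assumes psql's default \copy text format
(tab-separated columns) with source_id as the first column, as in both
queries above:

```python
import gzip

# Bounds copied from the WHERE clauses of the sample queries.
LOW = 1737527439048835072
HIGH = 1737562623420923904


def count_out_of_range(path, low=LOW, high=HIGH):
    """Count rows whose first tab-separated field lies outside [low, high].

    Assumes a gzipped dump in psql's default text format with source_id
    in the first column.
    """
    bad = 0
    with gzip.open(path, "rt", encoding="utf-8") as dump:
        for line in dump:
            if not line.strip():
                continue  # skip blank lines
            source_id = int(line.split("\t", 1)[0])
            if not low <= source_id <= high:
                bad += 1
    return bad
```

Calling count_out_of_range("dump.txt.gz") and
count_out_of_range("epoch_dump.txt.gz") in data2 should both give 0 if
the dumps match the queries.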