This is a selection of spectral lines from Splatalogue for distribution with CASA. Markus got it from https://safe.nrao.edu/wiki/bin/view/ALMA/CASA_Offline_Splat_list, mainly for LineTAP experiments. Some documentation can also be found through https://splatalogue.online/.

The one big challenge here is the InChIs: whatever the NRAO folks use as identifiers is a crazy hodgepodge that in part doesn't even seem to make chemical sense (like stereo markers (?) on non-chiral molecules) and in any case cannot be reliably parsed by a machine.

To come up with halfway reliable InChIs anyway, I extract the "Species" and "Chemical Name" columns and sort them so that similar things hopefully end up next to each other; that's the root of the res/species-key.txt file and is implemented in bin/extract_species.py. Next, I manually annotate that file with a third column containing the structural formula in the notation understood by our heuristic protocols.linetap code. The result is then processed with bin/compute_inchis.py, which adds the InChIs to res/species_key.txt. That file is then used by the importer.

If you make big changes here, see species/README on how to propagate the fixes from here to there.

The test set is the first 1000 lines of the TSV file.
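
The extraction step described above can be sketched roughly as follows. This is only an illustration, not the actual bin/extract_species.py: the function names, the sort key (case-insensitive chemical name, then species), the "Freq" column and the sample rows are all assumptions. Only the "Species" and "Chemical Name" column names come from the description above.

```python
import csv
import io


def extract_species(tsv_file):
    """Return sorted, de-duplicated (species, chemical name) pairs.

    tsv_file is any iterable of text lines with a header row that
    contains "Species" and "Chemical Name" columns.
    """
    # QUOTE_NONE: treat quote characters in the TSV as ordinary text.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    pairs = {
        (row["Species"].strip(), row["Chemical Name"].strip())
        for row in reader
    }
    # Assumed sort key: case-insensitive name first, then species, so that
    # entries for the same molecule end up next to each other.
    return sorted(pairs, key=lambda p: (p[1].lower(), p[0].lower()))


def write_species_key(pairs, out_file):
    """Write one tab-separated line per pair, leaving room for a
    manually added third column."""
    for species, name in pairs:
        out_file.write(f"{species}\t{name}\n")


if __name__ == "__main__":
    # Made-up sample rows; the "Freq" column and its values are placeholders.
    sample = io.StringIO(
        "Species\tChemical Name\tFreq\n"
        "HCN\tHydrogen Cyanide\t1.0\n"
        "CO\tCarbon Monoxide\t2.0\n"
        "CO\tCarbon Monoxide\t3.0\n"
    )
    out = io.StringIO()
    write_species_key(extract_species(sample), out)
    print(out.getvalue(), end="")
```

Sorting on the chemical name rather than on the raw species string is just one plausible way to bring similar entries together; the real script may well use a different heuristic.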