
Integration with pyhf JSON workspaces #98

Open
kratsg opened this issue Jun 17, 2019 · 7 comments



kratsg commented Jun 17, 2019

/cc @matthewfeickert @lukasheinrich -- we should probably file an issue here to investigate the possibility of getting HEPData to handle pyhf JSON specifications (additionally teaching it to export a given specification to root+xml as well, if needed).

I'm hoping to use this issue as a place to hold discussion on this. For reference, we do have a JSON schema that fully specifies the workspace, and we will shortly be releasing a pyhf version on PyPI that contains v1.0.0 of this schema.
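For context, loading such a specification might look roughly like the sketch below (workspace.json is a placeholder filename; the sketch assumes that constructing pyhf.Workspace validates the document against the workspace schema, as the pyhf releases around this time do):

```python
import json

import pyhf

# Load the single-document JSON workspace specification.
with open("workspace.json") as spec_file:
    spec = json.load(spec_file)

# Constructing a Workspace checks the spec against the pyhf
# workspace JSON schema and raises if the document is invalid.
workspace = pyhf.Workspace(spec)
print(workspace.channels)
```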

@lukasheinrich

Yes, this has been a long-term (read: years) project, and I initially came up with some code that reads in ROOT workspaces and spits out HEPData records:

https://github.com/lukasheinrich/hf2hd-demo

but we should absolutely revisit this. (Though arguably, just uploading the likelihood is sufficient if all the HEPData records can be fully generated from it.)


kratsg commented Jun 25, 2019

Just a quick note that we do have a very nice feature of pyhf that allows you to produce summaries of the JSON specifications. See diana-hep/pyhf#443 for details. We currently provide a (beta) pyhf inspect command-line tool that pretty-prints a summary of the JSON specification in a human-readable format. It could (and probably should) emit the summary as JSON as well, so it can be consumed in an automated fashion. Is this something of interest for HEPData to use?
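For illustration, a machine-readable summary of that kind can be assembled directly from the workspace document using only the standard keys of the pyhf workspace schema (channels, samples, measurements); this is just a sketch of the idea, not the actual pyhf inspect implementation:

```python
import json

with open("workspace.json") as spec_file:
    spec = json.load(spec_file)

# Summarise the workspace: per-channel sample counts, the set of all
# sample names, and the declared measurements.
summary = {
    "channels": [
        {"name": channel["name"], "n_samples": len(channel["samples"])}
        for channel in spec["channels"]
    ],
    "samples": sorted(
        {sample["name"] for channel in spec["channels"] for sample in channel["samples"]}
    ),
    "measurements": [measurement["name"] for measurement in spec.get("measurements", [])],
}
print(json.dumps(summary, indent=2))
```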

@clelange (Collaborator)

Hi @kratsg - mind that this library is mainly meant for converting input into a format that can be ingested by HEPData. Once that is the case for pyhf workspaces (my understanding is that it currently is not), it'd be great if you added this to hepdata_lib. For discussion on what can be added to HEPData and how, you will probably have to communicate with the HEPData developers/maintainers directly (preferably by email, I guess).


kratsg commented Jun 27, 2019

Hi @clelange, I did not realize the two were somewhat separate. Should hepdata_lib effectively support something like yaml.dump(json.load(open('workspace.json')))? Really, that's most of the work, as the entire specification is in a single JSON document.

So HEPData needs to support this first, before hepdata_lib can provide a converter for it?
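As a rough sketch of that one-liner spelled out (the filenames are placeholders and PyYAML is assumed as a dependency):

```python
import json

import yaml

# Read the pyhf workspace (a single JSON document) and re-serialise it
# as YAML, e.g. for inclusion as additional material in a submission.
with open("workspace.json") as json_file:
    spec = json.load(json_file)

with open("workspace.yaml", "w") as yaml_file:
    yaml.safe_dump(spec, yaml_file, default_flow_style=False)
```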

@clelange (Collaborator)

I think I wasn't reading carefully, sorry. If you contribute code similar to https://github.com/lukasheinrich/hf2hd-demo that converts the workspace.json into the YAML format understood by HEPData as part of the submission.tar.gz (which is effectively what hepdata_lib already does for other formats such as ROOT histograms), that is perfectly fine. Do I understand correctly that this is your plan?
I'm not sure I understand why additional exports to root+xml are needed.
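A minimal sketch of such a converter, assuming hepdata_lib's documented Submission/Table/Variable interface and picking a purely illustrative layout of one observed-counts table per channel (table naming and bin indexing are placeholders):

```python
import json

from hepdata_lib import Submission, Table, Variable

with open("workspace.json") as spec_file:
    spec = json.load(spec_file)

submission = Submission()

# One illustrative table per channel: observed event counts per bin,
# taken from the "observations" block of the pyhf workspace.
for observation in spec["observations"]:
    table = Table(f"Observed counts, {observation['name']}")
    bins = Variable("Bin index", is_independent=True, is_binned=False)
    bins.values = list(range(len(observation["data"])))
    counts = Variable("Observed events", is_independent=False, is_binned=False)
    counts.values = observation["data"]
    table.add_variable(bins)
    table.add_variable(counts)
    submission.add_table(table)

# Write the HEPData YAML submission files for upload.
submission.create_files("output")
```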


kratsg commented Jun 27, 2019

> I'm not sure I understand why additional exports to root+xml are needed.

This is usually because "ROOT+XML" is what people already use (the HistFactory workspace), and I think these have, in some cases, already been uploaded for an analysis or two in the past (but I'm really not sure here). The fact that this functionality is possible means it could be useful to have the likelihood exported into different formats depending on what you want, but I don't know whether this is something HEPData wants to do or not.

> I think I wasn't reading carefully, sorry. If you contribute code similar to https://github.com/lukasheinrich/hf2hd-demo that converts the workspace.json into the YAML format understood by HEPData as part of the submission.tar.gz (which is effectively what hepdata_lib already does for other formats such as ROOT histograms), that is perfectly fine. Do I understand correctly that this is your plan?

Yeah, that should be roughly what we want :)

@lukasheinrich

Just note that a conversion into HEPData YAML will always be lossy. The full likelihood will probably require uploading the full spec to HEPData (either as auxiliary material or as a native integration, as @GraemeWatt suggested). But a lossy projection can still be useful: the generated HEPData tables can be, e.g., the equivalent of the pre/post-fit plots we usually produce.
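As one example of such a lossy projection, the pre-fit expected yields can be tabulated directly from the likelihood; this sketch assumes the pyhf Workspace/Model interface (model(), config.suggested_init(), expected_actualdata()) and a placeholder workspace.json:

```python
import json

import pyhf

with open("workspace.json") as spec_file:
    workspace = pyhf.Workspace(json.load(spec_file))

# Build the HistFactory model and evaluate the expected per-bin event
# rates at the suggested initial parameter values (the pre-fit model).
model = workspace.model()
prefit_parameters = model.config.suggested_init()
expected_yields = model.expected_actualdata(prefit_parameters)

# These per-bin yields (together with the observed counts) are the kind
# of information a pre-fit-style HEPData table could carry.
print(expected_yields)
```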
