It would be good to be able to query for summary statistics (e.g., log-likelihood of the observed data, MAP estimates of various parameters, etc.) from a BLOG program itself. Even better would be to combine the two. I appreciate any feedback on the following thoughts:
Some support for summary statistics already exists in BLOG (e.g., https://github.com/lileicc/dblog/blob/master/src/blog/sample/LWSampler.java#L181), but making use of it requires code changes. I think the user should be able to call a built-in function LogLikelihood and write something like:
//parameters we wish to learn
random Real[] theta ~ ...;
//let's say we have a 1000-dim array of dataPoints
obs ObsValue(dataPoint[0]) = ...;
...
obs ObsValue(dataPoint[999]) = ...;
//return log-likelihood of dataPoint[*]
query LogLikelihood();
One issue is: for which estimate of the model parameters should we return the LL? If we added ways to easily reference various estimates of parameters (e.g., the MAP estimate: query MAP(theta); the current sample: CURRENT(theta); or various other summarizations), we could make the LogLikelihood syntax slightly richer:
- LogLikelihood([model parameters]): evaluate LL using the model parameters provided
- LogLikelihood(): evaluate LL for all accepted model parameters
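To make the two proposed forms concrete, here is a minimal Java sketch under stated assumptions: a single real-valued parameter theta, unit-variance Gaussian observations, and "all accepted model parameters" read as "one LL value per accepted sample". The class and method names are hypothetical and do not exist in the BLOG code base.

```java
/** Hypothetical sketch; not part of BLOG. Assumes data ~ Normal(theta, 1). */
final class LogLikelihoodSketch {
    /** Log-density of x under Normal(mean, 1). */
    static double logNormal(double x, double mean) {
        double d = x - mean;
        return -0.5 * Math.log(2 * Math.PI) - 0.5 * d * d;
    }

    /** LogLikelihood([model parameters]): LL of the data under one given theta. */
    static double logLikelihood(double theta, double[] data) {
        double ll = 0.0;
        for (double x : data) {
            ll += logNormal(x, theta); // observations assumed independent given theta
        }
        return ll;
    }

    /** LogLikelihood(): one LL value per accepted sample of theta (one possible reading). */
    static double[] logLikelihoodPerSample(double[] acceptedThetas, double[] data) {
        double[] out = new double[acceptedThetas.length];
        for (int i = 0; i < acceptedThetas.length; i++) {
            out[i] = logLikelihood(acceptedThetas[i], data);
        }
        return out;
    }
}
```

Under this reading, LogLikelihood(MAP(theta)) would reduce to the first method applied to whatever MAP returns, while the zero-argument form returns a distribution of LL values that could be summarized further.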
This brings up a number of issues to resolve:
- How should the estimation functions (MAP, etc.) be implemented? My first thought would be to enrich the hierarchy in blog.model to include a BuiltInQuery type. This would need access to all variables.
- How should model parameters be represented? The simplest option might be a TabularInterp.
- What should the convention be for built-in functions? Plain names as above, a BLOG "namespace" (BLOG.LogLikelihood(...)), or something more Pythonic (__LogLikelihood__(...))?
- (More forward-thinking:) Do we allow people to override built-in functions?
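As a starting point for the first question, here is a rough Java sketch of what a BuiltInQuery-style type could look like. Everything in it is an assumption for illustration: the interface, its methods, and the representation of a sampled world as a map from variable names to values are hypothetical and do not reflect the existing blog.model classes. It also assumes the sampler can report an unnormalized log joint probability for each sample.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical: a query that sees every variable of every sample. */
interface BuiltInQuerySketch {
    /** Called once per sample with the full assignment and its unnormalized log joint. */
    void observeSample(Map<String, Double> world, double logJoint);

    /** The summary computed from the samples seen so far. */
    Map<String, Double> result();
}

/** Hypothetical MAP query: keeps the sampled world with the highest log joint. */
final class MapQuerySketch implements BuiltInQuerySketch {
    private double bestLogJoint = Double.NEGATIVE_INFINITY;
    private Map<String, Double> best = new HashMap<>();

    @Override
    public void observeSample(Map<String, Double> world, double logJoint) {
        if (logJoint > bestLogJoint) {
            bestLogJoint = logJoint;
            best = new HashMap<>(world); // copy, since the sampler may reuse the map
        }
    }

    @Override
    public Map<String, Double> result() {
        return best;
    }
}
```

This only approximates the MAP by the best sample seen; a real implementation would have to decide whether that is acceptable or whether an optimization step is needed.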