Meeting Notes
rbharath edited this page Aug 29, 2012
·
19 revisions
Memos from DBLOG implementation meetings. These are for reference only; for discussions, use the issue tracker.
Topics:
- Release 0.4 (Now): a) modify bayesianlogic.cs to have a manual and a changelog and to link to git b) Set up a public github with a version of our code c) Set up a user group d) Contains basic examples, HMMs, PCFGs
- Release 0.5 (Sept 31st): a) Arrays, LDA, Kalman Filter, Mixed Membership b) Basic operations: +,*,&,|, etc. c) Gibbs Sampling
- Release 0.6 (Oct 31st): a) Particle Filtering Support b) Type Checking c) Block Sampling/Modular Sampling for general functions
- NIPS Demo: a) Urns and Balls Live Video b) Background Subtraction from Live Video c) These examples take the form video => text => blog => text => video. Separate out (video => text) as its own piece? OpenCV hookup?
- SIGMOD Demo: a) Run Mixed Membership model on DBLP dataset
- Independent Parallel MCMC Sampling
- Berkeley Update
- Progress: Symbol evidence fixed, along with associated models
- Problems: PCFG shows evaluation problems; still need arrays for LDA
- JHU Update
- Yanif working on C code generation
- Each component of BLOG code will generate a corresponding chunk of C code
- Sampling engine:
- To be tested on subset of BLOG; results expected after September
- Santa Barbara
- Meeting 9/18 to 9/20 with DARPA
- I-Jeng (JHU), Sidd/Lei (Berkeley) will give talks corresponding to end-of-year report
- Will learn more about data sets, queries of interest (likely fixed webcams for now)
- Questions: how to perform sensor planning? What experimental results to present?
- Next Steps
- Specifying MCMC proposals in BLOG? (Only supported in Java for now)
- VLDB/SIGMOD demos: test analysis, social networks
- Next meeting: Tuesday 9/4, 12 PM
- Discussed the paper "Gibbs Sampling in Open-Universe Stochastic Languages," specifically the implementation of MCMC
- Unused Functions: AbstractPartialWorld.getInverseTuples(), AbstractFunctionInterp.getInverseArgs
- MCMC Implementation: src/blog/GenericProposer GenericProposer.java:144 world.getNewlyBarrenVars ParentUpdateDGraph:213
- Comments on ParentUpdateDGraph: Directed graph that is backed by an underlying DGraph, but represents changes to the set of nodes and to the parent sets of some existing nodes (and thus to the child sets of some nodes as well).
- Barren Node Types: a) Add a node, but add no children b) Remove all children from a node
- Organize meeting sometime next week. a) Aaron: Complete PCFG and other Examples b) Bharath: Get an implementation of Gibbs sampling and continue CBN refactoring c) Aki: Look into equipment for remote meetings.
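The overlay-graph idea and the two barren node types noted above can be illustrated with a minimal Python sketch. This is not the BLOG Java code: the class `ParentUpdateOverlay` and the methods `set_parents`, `remove_node`, and `newly_barren` are hypothetical names chosen for illustration. It assumes a node counts as "newly barren" when it has no children in the updated graph and either did not exist in the base graph or had children there.

```python
class DGraph:
    """Minimal directed graph that stores each node's parent set."""

    def __init__(self):
        self.parents = {}

    def add_node(self, node, parents=()):
        self.parents[node] = set(parents)

    def nodes(self):
        return set(self.parents)

    def get_parents(self, node):
        return set(self.parents[node])

    def get_children(self, node):
        return {n for n, ps in self.parents.items() if node in ps}


class ParentUpdateOverlay:
    """Hypothetical overlay: records changes without modifying the base graph."""

    def __init__(self, base):
        self.base = base
        self.changes = {}  # node -> new parent set, or None if removed

    def set_parents(self, node, parents):
        # Adds a new node or replaces the parent set of an existing one.
        self.changes[node] = set(parents)

    def remove_node(self, node):
        self.changes[node] = None

    def nodes(self):
        candidates = self.base.nodes() | set(self.changes)
        return {n for n in candidates if self.changes.get(n, ()) is not None}

    def get_parents(self, node):
        if node in self.changes:
            parents = self.changes[node]
            if parents is None:
                raise KeyError(node)
            return set(parents)
        return self.base.get_parents(node)

    def get_children(self, node):
        return {n for n in self.nodes() if node in self.get_parents(n)}

    def newly_barren(self):
        """Nodes with no children now that were absent or had children before."""
        base_nodes = self.base.nodes()
        result = set()
        for n in self.nodes():
            if self.get_children(n):
                continue
            if n not in base_nodes or self.base.get_children(n):
                result.add(n)
        return result


# Base graph: A -> B -> C (C is already barren).
base = DGraph()
base.add_node("A")
base.add_node("B", parents=["A"])
base.add_node("C", parents=["B"])

overlay = ParentUpdateOverlay(base)
overlay.set_parents("D", ["A"])  # type (a): add a node, but add no children
overlay.set_parents("C", [])     # type (b): B loses its only child
barren = overlay.newly_barren()
```

Here `barren` contains `B` and `D` but not `C`, since `C` had no children in the base graph either; the base graph itself is left unchanged.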
- Tabular CPD
- New syntax: {a -> ~D}
- Cannot handle "compound" distributions (those taking arguments)
- Depends on modifying MapSpec to handle Clauses
- Million object cases
- JHU vision: handle BLOG models over 1 million objects
- Sybil attacks
- Mixed membership models: LDA
- Caching models
- Image processing
- Information extraction
- Never-Ending Learning (NELL, @CMU)
- Data integration
- Critical for large-scale models (e.g. LDA, NELL)
- Want general interface for data sources, RDBMS and otherwise
- Leaving CV models for later: English is easier to process
- Short-term plan
- Block sampling (Gibbs too?)
- Parallelism: for now, run multiple samplers in parallel
- Data integration
- ???
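As a rough illustration of the "run multiple samplers in parallel" item in the short-term plan, the sketch below runs several independent random-walk Metropolis chains, each with its own seed, and pools their samples. It is a toy Python example, not BLOG code: the target (a standard normal), the function names `run_chain` and `run_independent_chains`, and the use of a thread pool are all assumptions made for illustration. Swapping in `ProcessPoolExecutor` would be one way to get CPU-level parallelism.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def log_target(x):
    # Unnormalized log density of a standard normal (toy stand-in for a model).
    return -0.5 * x * x


def run_chain(seed, n_samples=2000, step=1.0):
    """One random-walk Metropolis chain with its own random number generator."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_ratio = log_target(proposal) - log_target(x)
        # Accept with probability min(1, exp(log_ratio)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples


def run_independent_chains(seeds, n_samples=2000):
    """Run one chain per seed concurrently; chains share no state."""
    with ThreadPoolExecutor(max_workers=len(seeds)) as pool:
        return list(pool.map(lambda s: run_chain(s, n_samples), seeds))


chains = run_independent_chains([1, 2, 3, 4])
pooled = [x for chain in chains for x in chain]
pooled_mean = sum(pooled) / len(pooled)
```

Because the chains never communicate, each one is reproducible from its seed alone, and estimates are formed simply by pooling the samples.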
- Milestone: 2012-07-27
- complete semant class
- refactor model/partial world/blog.util
- CBN
- (sample manager)
(lei & bharath & aki) (lei, andreas, yanif)
- do we need sample class, or just partial world?
- partial world with probability/likelihood
- (lei) interface for CBN
- given a variable node, to get its parents, and children nodes.
- sample manager class
- schedule which node(s) to sample next
- naive (plain) parallelization, independent possible worlds
- semant class, finish functionality
- run through examples of blog
- automatic test
- structure for parallel on cbn node, part of possible world
- (andreas) performance profiling for java packages.
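The CBN interface and sample manager items above (given a variable node, get its parents and children; schedule which node to sample next) could look roughly like the following Python sketch. The class and method names (`CBN`, `add_var`, `get_parents`, `get_children`, `SampleManager`, `next_var`) are hypothetical and do not come from the BLOG code base, and the round-robin schedule is only one possible policy.

```python
from collections import deque


class CBN:
    """Hypothetical network view: look up parents and children of a variable."""

    def __init__(self):
        self._parents = {}
        self._children = {}

    def add_var(self, var, parents=()):
        self._parents.setdefault(var, set())
        self._children.setdefault(var, set())
        for p in parents:
            self._parents.setdefault(p, set())
            self._children.setdefault(p, set())
            self._parents[var].add(p)
            self._children[p].add(var)

    def variables(self):
        return set(self._parents)

    def get_parents(self, var):
        return set(self._parents[var])

    def get_children(self, var):
        return set(self._children[var])


class SampleManager:
    """Hypothetical scheduler: round-robin over the non-evidence variables."""

    def __init__(self, cbn, evidence=()):
        free = sorted(v for v in cbn.variables() if v not in evidence)
        self._queue = deque(free)

    def next_var(self):
        var = self._queue[0]
        self._queue.rotate(-1)  # move the chosen variable to the back
        return var


cbn = CBN()
cbn.add_var("Alarm", parents=["Burglary", "Earthquake"])
cbn.add_var("Call", parents=["Alarm"])

manager = SampleManager(cbn, evidence={"Call"})
schedule = [manager.next_var() for _ in range(4)]
```

With `Call` treated as evidence, the manager cycles through `Alarm`, `Burglary`, `Earthquake` and then returns to `Alarm`. A sampler would use `get_parents` and `get_children` on the scheduled variable to find the factors it needs to re-evaluate.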
(Lei & Bharath) detailed blog code refactoring meeting
- put blog.util inference related code in AbstractPartialWorld
- add blog.bn.CBN
- partial world responsible for creating instantiating context
- VarWithDistrib (?) some lines of code should be refactored, e.g. creating an EvalContext every time
- EvalContext moved to a separate package
- NumberVar, check the get origin values
- Cardinality spec
- randomfunction, getvalueincontext
- the creating of new random variable should be in CBN or partialworld
- move experimental code to a separate
- Lei and UCB Side
- Complete the semantic analysis and translator part
- Design the new internal representation, also allowing future extension to parallel/compiler/database
- Modular sampling framework and specific expert modules
- Check out the BUGS implementation
- Make sure that we include all distributions in BUGS
- Andreas and Yanif
- Compiler part, how to integrate into BLOG
- Parallel engine, to start, just independent parallel inference by multiple MCMC chains and LWSampler
- More examples of BLOG/DBLOG:
- PCFG, AND-OR graph for images
- LDA and topic models for documents
- Mixed membership models for relationships and social networks
- Switching linear dynamical systems (and simple case) [Question: How to sample properly for covariance matrix?]
- Performance comparison:
- Design an evaluation component to easily evaluate the performance of each technique or combination of techniques.