BIC score for FCI #90

Open
vlad43210 opened this issue Sep 20, 2019 · 13 comments

Comments

@vlad43210

vlad43210 commented Sep 20, 2019

Hi,

I have a similar issue to #86 -- I would like to get the BIC score for the FCI algorithm. I'm working with a team and they confirm that while FCI does not use the BIC for search, it does output it. However, when I run:

graph = tetrad.getTetradGraph()
print('Graph BIC: {}'.format(graph.getAttribute('BIC')))

I get:

Graph BIC: None

when I choose the fci algorithm. Any chance you could surface this score for me? I was told by the same team that TETRAD does output it for FCI, so it should be in the source somewhere.

@jdramsey
Contributor

jdramsey commented Sep 20, 2019 via email

@vlad43210
Author

vlad43210 commented Sep 20, 2019

So my colleague says:

BIC scores make sense if you have a likelihood that comes from a curved exponential family. FCI may certainly be applicable to data that comes from such a model, and in general you have no choice but to use FCI, because model scoring is only usable with a tractable local search procedure, and such procedures are only known to exist for undirected and directed acyclic graph models, but not, for example, for MAGs, even if MAGs form a curved exponential family and thus have a well-defined BIC score, and this score is even consistent.
To summarize:
(a) the BIC score may be defined on any model with a fixed dimension;
(b) the BIC score is further consistent if the model is also curved exponential;
(c) the BIC score may further be used in a computationally tractable model selection procedure if the model corresponds to a directed acyclic, undirected, or bidirected graph.

Our work is in the above three settings, and it would be very useful to get the BIC score for DAGs generated by pycausal in these cases! Especially since it's already a feature in TETRAD, but our solution is programmatic and we don't want to manually fire up TETRAD every time we want to calculate a BIC.

To give a little more clarity, we are interested in calculating BICs for DAGs generated at various levels of confidence (alpha value) to select the optimal alpha value for a particular DAG generation process.
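The alpha-selection loop described here could be sketched roughly as follows. This is only an illustration: `select_alpha` and `score_graph_at` are hypothetical names, not part of pycausal or TETRAD, and the sketch assumes a "higher is better" BIC convention and that a missing score comes back as `None`, as in the output above.

```python
def select_alpha(alphas, score_graph_at):
    """Pick the alpha whose graph has the highest BIC.

    `score_graph_at` is a hypothetical callable: it runs the search at the
    given alpha and returns a BIC score, or None if no score is available.
    Assumes a higher score is better.
    """
    scores = {}
    for alpha in alphas:
        bic = score_graph_at(alpha)
        if bic is not None:  # skip runs where no BIC was produced
            scores[alpha] = bic
    if not scores:
        return None, scores
    best_alpha = max(scores, key=scores.get)
    return best_alpha, scores
```

If the scores follow a "lower is better" convention instead, `max` would need to become `min`.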

Hope this helps!

Vlad

@jdramsey
Contributor

jdramsey commented Sep 21, 2019 via email

@jdramsey
Contributor

jdramsey commented Sep 23, 2019 via email

@bja43
Contributor

bja43 commented Sep 23, 2019

Joe, isn't the iterative conditional fitting algorithm of Drton and Richardson implemented in TETRAD?

This can be used to calculate the maximum likelihood term of BIC for linear Gaussian data. Additionally, I believe the number of parameters is proportional to the number of edges. I would run this by Peter to be sure, but I think you can calculate BIC on a MAG (choose a MAG in the output PAG of FCI) and linear Gaussian data using these two quantities.

The only issue I can think of is that FCI doesn't guarantee a PAG output. That being said, if it does output a PAG, you should be able to calculate its BIC score.

@bja43
Contributor

bja43 commented Sep 23, 2019

To be clear, this is what I suggest:

  1. run FCI
  2. if FCI does not return a PAG, then return none
  3. if FCI does return a PAG, then sample a MAG from that PAG
  4. calculate the MLE covariance matrix for the MAG using (R)ICF
  5. use the MLE covariance matrix to calculate the log-likelihood term
  6. BIC = log-likelihood - (num edges / 2) * log(num samples)
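Steps 5 and 6 could look roughly like the sketch below for linear Gaussian data. It does not implement (R)ICF: it assumes the MLE covariance matrix from step 4 is already available, that the data are centered, and that the sample covariance is normalized by the number of samples. All function names are illustrative and not part of TETRAD or pycausal.

```python
import math

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    p = len(a)
    low = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def solve_spd(low, b):
    """Solve (L L^T) x = b given the Cholesky factor L."""
    p = len(low)
    y = [0.0] * p
    for i in range(p):  # forward substitution: L y = b
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * p
    for i in reversed(range(p)):  # back substitution: L^T x = y
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, p))) / low[i][i]
    return x

def gaussian_log_likelihood(sigma_hat, sample_cov, n):
    """Zero-mean Gaussian log-likelihood of n samples with sample covariance S:
    -(n/2) * (p*log(2*pi) + log det(Sigma) + tr(Sigma^-1 S))."""
    p = len(sigma_hat)
    low = cholesky(sigma_hat)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(p))
    trace = 0.0
    for j in range(p):  # diagonal entries of Sigma^-1 S, one column at a time
        col = [sample_cov[i][j] for i in range(p)]
        trace += solve_spd(low, col)[j]
    return -0.5 * n * (p * math.log(2.0 * math.pi) + log_det + trace)

def bic_from_mle(sigma_hat, sample_cov, n, num_edges):
    """Step 6 as written above: log-likelihood - (num edges / 2) * log(num samples)."""
    log_lik = gaussian_log_likelihood(sigma_hat, sample_cov, n)
    return log_lik - (num_edges / 2.0) * math.log(n)
```

The pure-Python linear algebra is only there to keep the sketch dependency-free; with NumPy available, a log-determinant and a linear solve would replace the two helper functions.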

@jdramsey
Contributor

Thanks! I'll run that by Peter. Maybe I should get Ilya to agree as well? @vlad43210

@vlad43210
Author

vlad43210 commented Sep 23, 2019 via email

@bja43
Contributor

bja43 commented Sep 23, 2019

I am going off of the statement of BIC here: https://projecteuclid.org/download/pdf_1/euclid.aos/1176350709.

In our case, k is proportional to the number of edges and n is the number of samples so the parameter penalty in the criterion becomes (num edges / 2) log (num samples).

@vlad43210
Author

vlad43210 commented Sep 23, 2019 via email

@jdramsey
Contributor

@bja43 @vlad43210 Of course all of our other BIC scores in Tetrad use the formula 2L - k ln n (higher is better).
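For reference, the two conventions that appear in this thread differ only by a factor of 2, so they should rank candidate graphs identically. The symbols below are assumptions made for illustration: $\ell$ is the maximized log-likelihood (taken to be the L in the formula above), $k$ the number of parameters (taken as proportional to the number of edges, as suggested earlier), and $n$ the number of samples.

```latex
\mathrm{BIC}_{\text{Tetrad}} = 2\ell - k \ln n
  = 2\left(\ell - \frac{k}{2}\ln n\right)
```

Under both forms a higher score is better. Scores are only comparable when they are all computed with the same convention.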

@jdramsey
Contributor

@bja43 @vlad43210 Peter agrees with Bryan. Yay! He says we should use Jiji's method for getting a MAG from a PAG. Bryan, do we have that?

@bja43
Contributor

bja43 commented Sep 25, 2019

There is a pagToMag function in SearchGraphUtils.java but I have never confirmed that it is Jiji's method. Perhaps Peter knows?
