BIC score for FCI #90
Comments
The problem is that BIC scores only make sense if you have a DAG model, or can
get one, since you need to know the parents of every node.
For a PAG you don't necessarily know this, so BIC scores for FCI
outputs usually don't make theoretical sense.
On Fri, Sep 20, 2019, 3:58 PM vlad43210 wrote:
Hi,
I have a similar issue to #86 -- I would like to get
the BIC score for the FCI algorithm. I'm working with Ilya Shpitser's team
at JHU and they confirm that while FCI does not use the BIC for search, it
does output it. However, when I run:
    graph = tetrad.getTetradGraph()
    print('Graph BIC: {}'.format(graph.getAttribute('BIC')))
I get:
Graph BIC: None
when I choose the fci algorithm. Any chance you could surface this score
for me? I was told by the same team that TETRAD does output it for FCI, so
it should be in the source somewhere.
|
OK, ask Ilya how to calculate it. :)
On Fri, Sep 20, 2019 at 7:14 PM vlad43210 wrote:
So Prof. Shpitser says:
BIC scores make sense if you have a likelihood that comes from a curved
exponential family. FCI may certainly be applicable to data that comes from
such a model, and in general you have no choice but to use FCI, because the
only way you can use model scoring is if you had a tractable local search
procedure, and such procedures are only known to exist for undirected and
directed acyclic graph models, but not, for example, for MAGs, even if MAGs
form a curved exponential family and thus have a very much defined BIC
score, and this score is even consistent.
To summarize:
(a) the BIC score may be defined on any model with a fixed dimension;
(b) the BIC score is further consistent if the model is also curved
exponential;
(c) the BIC score may further be used to have a computationally tractable
model selection procedure if it corresponds to directed acyclic,
undirected, or bidirected graphs.
Our work is in the above three settings, and it would be very useful to
get the BIC score for DAGs generated by pycausal in these cases! Especially
since it's already a feature in TETRAD, but our solution is programmatic
and we don't want to manually fire up TETRAD every time we want to
calculate a BIC.
To give a little more clarity, we are interested in calculating BICs for
DAGs generated at various levels of confidence (alpha value) to select the
optimal alpha value for a particular DAG generation process.
Hope this helps!
Vlad
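The alpha-selection idea described above could be sketched roughly as follows. This assumes a BIC value has already been obtained for the graph produced at each alpha (the thread has not established how to do that for FCI output) and that higher scores are better. The function name `select_alpha` and all numbers are hypothetical, not part of pycausal or TETRAD.

```python
def select_alpha(scores_by_alpha):
    """Return the alpha whose graph has the highest score.

    Assumes a 'higher is better' scoring convention and that each
    score was computed on the same data set.
    """
    if not scores_by_alpha:
        raise ValueError("no scores supplied")
    # max over the dict keys, ranked by their associated score
    return max(scores_by_alpha, key=scores_by_alpha.get)


# Made-up scores, one per search run at a given alpha
example_scores = {0.001: -3105.2, 0.01: -3088.7, 0.05: -3091.4, 0.1: -3120.9}
best_alpha = select_alpha(example_scores)
print("selected alpha:", best_alpha)
```

Scores are only comparable across alphas when every run uses the same data and the same scoring convention.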
--
Joseph D. Ramsey
Special Faculty and Director of Research Computing
Department of Philosophy
135 Baker Hall
Carnegie Mellon University
Pittsburgh, PA 15213
[email protected]
Office: (412) 268-8063
http://www.andrew.cmu.edu/user/jdramsey
|
By the way, I was serious. If you or Ilya can say how to calculate BIC for
an FCI output correctly, I'll run it by Peter Spirtes, and if he agrees,
I'll code it up for you.
|
Joe, isn't the iterative conditional fitting algorithm of Drton and Richardson implemented in TETRAD? It can be used to calculate the maximum likelihood term of BIC for linear Gaussian data. Additionally, I believe the number of parameters is proportional to the number of edges. I would run this by Peter to be sure, but I think you can calculate BIC for a MAG (choose a MAG in the output PAG of FCI) on linear Gaussian data using these two quantities. The only issue I can think of is that FCI doesn't guarantee a PAG output. That being said, if it does output a PAG, you should be able to calculate its BIC score. |
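For reference, the likelihood term for linear Gaussian data can be written with the standard multivariate normal log-likelihood. This is a sketch under stated assumptions, not taken from TETRAD's source: $\hat\Sigma$ is assumed to be the covariance matrix implied by the fitted MAG (for example, the result of iterative conditional fitting), $S$ is the sample covariance of the centered data with divisor $n$, $p$ is the number of variables, and $n$ is the number of samples.

```latex
\[
\ell(\hat\Sigma)
  = -\frac{n}{2}\left[\, p \log(2\pi)
      + \log\det\hat\Sigma
      + \operatorname{tr}\!\left(\hat\Sigma^{-1} S\right) \right]
\]
```

This value is the likelihood term that enters the BIC, alongside a penalty based on the number of parameters.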
To be clear, this is what I suggest:
|
Thanks! I'll run that by Peter. Maybe I should get Ilya to agree as well? @vlad43210 |
Can you say why model dimension is num_edges / 2, rather than num_edges or
perhaps num_edges * 2 (depending on whether you standardize)?
Vlad
|
I am going off of the statement of BIC here: https://projecteuclid.org/download/pdf_1/euclid.aos/1176350709. In our case, k is proportional to the number of edges and n is the number of samples, so the parameter penalty in the criterion becomes (num edges / 2) log (num samples). |
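Written out, the penalty described above corresponds to a criterion of the following form. This is a sketch: taking $k$ to be exactly the number of edges is a simplifying assumption (the comment only says "proportional"), and the numbers in the worked line are illustrative, not from this thread.

```latex
\[
\mathrm{BIC} = \log L(\hat\theta) - \frac{k}{2}\,\log n
\]
% Illustrative values: k = 10 edges, n = 1000 samples (natural log)
\[
\frac{k}{2}\,\log n = \frac{10}{2}\,\log 1000 \approx 5 \times 6.908 \approx 34.54
\]
```

Under this form a larger value is better, and each additional edge costs $\tfrac{1}{2}\log n$ in the score.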
Got it, thanks!
Vlad
|
@bja43 @vlad43210 Of course all of our other BIC scores in Tetrad use the formula 2L - k ln n (higher is better). |
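The two forms mentioned in this thread, the one with penalty (num edges / 2) log (num samples) and the 2L - k ln n form, differ only by a factor of two, so they rank candidate models identically. A small sketch of that relationship follows; the function names and the log-likelihood values are hypothetical and are not taken from Tetrad's code.

```python
import math


def half_penalty_bic(log_lik, k, n):
    """log L - (k / 2) * ln n (higher is better)."""
    return log_lik - 0.5 * k * math.log(n)


def two_l_bic(log_lik, k, n):
    """2L - k ln n, the form quoted above (higher is better)."""
    return 2.0 * log_lik - k * math.log(n)


# Made-up (log-likelihood, parameter count) pairs for two candidate models
candidates = {"sparse": (-1520.0, 8), "dense": (-1510.0, 14)}
n_samples = 1000

best_half = max(candidates, key=lambda m: half_penalty_bic(*candidates[m], n_samples))
best_two_l = max(candidates, key=lambda m: two_l_bic(*candidates[m], n_samples))
print(best_half, best_two_l)
```

Because one score is exactly twice the other, either can be used for model comparison as long as the same form is used for every candidate.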
@bja43 @vlad43210 Peter agrees with Bryan. Yay! He says we should use Jiji's method for getting a MAG from a PAG. Bryan, do we have that? |
There is a pagToMag function in SearchGraphUtils.java but I have never confirmed that it is Jiji's method. Perhaps Peter knows? |
Hi,
I have a similar issue to #86 -- I would like to get the BIC score for the FCI algorithm. I'm working with a team and they confirm that while FCI does not use the BIC for search, it does output it. However, when I run:
    graph = tetrad.getTetradGraph()
    print('Graph BIC: {}'.format(graph.getAttribute('BIC')))
I get:
Graph BIC: None
when I choose the fci algorithm. Any chance you could surface this score for me? I was told by the same team that TETRAD does output it for FCI, so it should be in the source somewhere.