Deterministic nodes with factor parents #65

Open
jarrod-dalton opened this issue Jun 9, 2015 · 12 comments
Comments

@jarrod-dalton
Collaborator

This can be difficult, since we have to write formulas which reference factor levels. Currently, we have to do so by referencing the integer index of the factor level we want.

Rather than fumble around trying to give you a closed-form example of R code that describes what I'm talking about, I will invite you to go into the Decision Networks vignette and attempt to define the payoff utility node using setNode(). Or, if that's actually not possible, try modifying the network structure (e.g., by adding nodes) in such a way that the payoff can be calculated.

In the meantime, I'll switch to working on the Setting Nodes and Getting Started vignettes.

@nutterb
Owner

nutterb commented Jun 16, 2015

This is another one I'll spend some time thinking about. I'll come up with something.

@nutterb
Owner

nutterb commented Oct 30, 2015

Consider this example:

net <- setNode(net, payoff, "determ", define=fromFormula(),
         nodeFormula = payoff ~
                         ifelse(playerFinalPoints > 21, -1,
                           ifelse(playerFinalPoints == 21,
                             ifelse(dealerOutcome == 1, 0,
                               ifelse(dealerOutcome == 7, 0, 1)),
                             ifelse(dealerOutcome == 2,
                               ifelse(playerFinalPoints < 22, 1, -1),
                               ifelse(dealerOutcome == 3,
                                 ifelse(playerFinalPoints == 17, 0,
                                 ifelse(playerFinalPoints > 17, 1, -1)),
                                 ifelse(dealerOutcome == 4,
                                   ifelse(playerFinalPoints == 18, 0,
                                     ifelse(playerFinalPoints > 18, 1, -1)),
                                   ifelse(dealerOutcome == 5,
                                     ifelse(playerFinalPoints == 19, 0,
                                       ifelse(playerFinalPoints > 19, 1, -1)),
                                     ifelse(dealerOutcome == 6,
                                       ifelse(playerFinalPoints == 20, 0,
                                         ifelse(playerFinalPoints > 20, 1, -1)),
                                       ifelse(playerFinalPoints == 21, 0, -1)))))))))

Given the current structure, the only thing I can think of that would make it feasible to give the factor level would be to use a utility function here. So if I wanted the equivalent of dealerOutcome == 2, I could use a utility such as

dealerOutcome == numericLevel("Bust", BJDealer$dealerOutcome)

The numericLevel function would then return the number 2.

The upside is that you don't have to remember all of the variable codings. The downside is that it has the potential to be much more typing. But the only other place this gets processed is in writing the JAGS code, and there's no good way to tie the numeric coding to a factor variable at that point.
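
A helper like the numericLevel described above could be sketched in a few lines of R. This is only an illustration, not the function that was eventually added to HydeNet; the toy factor and its levels are made up for the example, and the real coding of BJDealer$dealerOutcome may differ.

```r
# Hypothetical sketch of a numericLevel-style helper: return the integer
# code of a factor level so it can be used in a node formula.
numericLevel <- function(level, x) {
  stopifnot(is.factor(x))
  code <- match(level, levels(x))
  if (is.na(code)) {
    stop("'", level, "' is not a level of the supplied factor")
  }
  code
}

# Toy factor standing in for a dealer outcome variable (levels are assumed)
outcome <- factor(c("Bust", "Stand", "Bust"), levels = c("Stand", "Bust"))
numericLevel("Bust", outcome)
```

Failing loudly on an unknown level keeps a typo from silently turning into a missing value inside the formula.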

I'll write up the function. You can tell me if you want to use it at all. :)

@jarrod-dalton
Collaborator Author

Is there maybe an escape character that we could use instead of quotation marks, like

#Bust#

which would tell HydeNet to call such a function?


@nutterb
Owner

nutterb commented Oct 30, 2015

It's possible we could use something like dealerOutcome == "#Bust,BJDealer$dealerOutcome#", but I don't think that saves much typing. The major issue is that the rToJags function deals with converting R code into JAGS and only takes a single argument--a formula object. The variable has to be passed with the variable level.

However, as I think about it, we could create our own handy dandy little intermediary function with a weird syntax. For example:

jagsFunc(formula, ...)

where the ... argument takes named arguments, each giving a factor variable referenced in formula.

jagsFunc(payoff ~ dealerOutcome == "#Bust:dealerOutcome#",
         dealerOutcome = BJDealer$dealerOutcome)

returns a formula object payoff ~ dealerOutcome == 2.

Alternatively, we might have jagsFunc take a character argument, which would allow jagsFunc("payoff ~ #dealerOutcome == 'Bust'#"). I'm a little nervous about this one, however, because I think it will likely fail if someone tries to use it in any way other than the == sense. I can't think of why anyone would do something like dealerOutcome * "Bust" or what that would mean. Perhaps I'm being paranoid?
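
To make the idea concrete, here is a rough sketch of a formula-rewriting helper along these lines. It is not the factorFormula function that ended up in HydeNet: the names levelsToCodes and replaceLevels are invented for illustration, plain quoted levels are used instead of the # markers, the toy factor's levels are assumed, and only comparisons of the form variable == "Level" are handled.

```r
# Hypothetical sketch: walk an expression and replace quoted levels in
# `var == "Level"` comparisons with the numeric code of that level.
replaceLevels <- function(expr, factors) {
  if (!is.call(expr)) return(expr)
  isComparison <- identical(expr[[1]], as.name("==")) &&
    is.name(expr[[2]]) && is.character(expr[[3]])
  if (isComparison && as.character(expr[[2]]) %in% names(factors)) {
    code <- match(expr[[3]], levels(factors[[as.character(expr[[2]])]]))
    if (is.na(code)) stop("Unknown factor level: ", expr[[3]])
    expr[[3]] <- as.numeric(code)
    return(expr)
  }
  # Not a recognized comparison: recurse into the call's components
  as.call(lapply(as.list(expr), replaceLevels, factors = factors))
}

# Rewrite only the right-hand side of a two-sided formula; the named
# arguments in ... supply the factor for each variable.
levelsToCodes <- function(formula, ...) {
  formula[[3]] <- replaceLevels(formula[[3]], list(...))
  formula
}

# Toy factor with assumed levels, where "Bust" is the second level
dealerOutcome <- factor("Bust", levels = c("Stand", "Bust"))
f <- levelsToCodes(payoff ~ ifelse(dealerOutcome == "Bust", 1, -1),
                   dealerOutcome = dealerOutcome)
f
```

Working on the parsed formula rather than on a character string sidesteps the worry about operators other than ==, since anything that is not a recognized comparison is simply left alone.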

I'm rambling. That might work, actually.

nutterb added a commit that referenced this issue Oct 30, 2015
@nutterb
Owner

nutterb commented Oct 30, 2015

This is now implemented in the current-devel branch. The final function name is factorFormula, and I even implemented it in the Decision Networks vignette if you'd like to see it in action. If you feel like you can get behind this, let me know.

@jarrod-dalton
Collaborator Author

Beautiful. Now on to the beggar->chooser transition: is it possible to build logic into the formula evaluation such that if it sees any quoted elements it knows to pass it through factorFormula() without the user explicitly calling it?

  • Jarrod


@nutterb
Owner

nutterb commented Oct 30, 2015

Truthfully, yes. It just means passing every formula through factorFormula within setNode and not exporting factorFormula. (Well, we could still export it, we just wouldn't have to, and I would probably opt not to, since there isn't much need for it otherwise.) Would you like to beg and choose that option?

@jarrod-dalton
Collaborator Author

I like that. All under the hood. Thanks!


nutterb added a commit that referenced this issue Oct 30, 2015
@nutterb nutterb closed this as completed Oct 30, 2015
@jarrod-dalton
Collaborator Author

I seem to be unable to pass node formulas through factorFormula() when the node is not deterministic. Below, I attempt to manually write a logistic regression equation for pe given wells, where wells is treated as a three-level categorical variable.

# Set up some stuff...
net <- HydeNetwork(~ wells
                   + pe | wells
                   + d.dimer | pregnant*pe
                   + angio | pe
                   + treat | d.dimer*angio
                   + death | pe*treat)

net <- setNode(net, wells,
               nodeType = "dcat",
               pi = vectorProbs(p = c(37, 164, 49), wells),
               factorLevels = c("Low","Medium","High"))

# These two attempts do not work...
net <- setNode(net, "pe", nodeType = "dbern", 
               define = fromFormula(),
               nodeFormula = pe ~ ilogit(-2.94
                                         + 1.56*(wells == "Medium")
                                         + 3.14*(wells == "High")))  

net <- setNode(net, "pe", nodeType = "dbern", 
               p = plogis(-2.94 + 1.56*(wells == "Medium")
                          + 3.14*(wells == "High")))

@jarrod-dalton jarrod-dalton reopened this Nov 23, 2015
@jarrod-dalton
Collaborator Author

I think I got it...

net <- setNode(net, "pe", nodeType = "dbern", 
                p = fromFormula(),
                nodeFormula = pe ~ ilogit(-2.94
                                          + 1.56*(wells == "Medium")
                                          + 3.14*(wells == "High")))  
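
As a quick sanity check on what this formula implies, the probability of pe at each level of wells can be computed directly in R. This is a side calculation rather than HydeNet code; it relies on plogis() being R's inverse logit, the counterpart of ilogit() in JAGS, and uses the level names from the factorLevels given earlier.

```r
# Implied P(pe = 1) for each level of wells under the coefficients above.
wellsLevels <- c("Low", "Medium", "High")
p <- sapply(wellsLevels, function(w) {
  plogis(-2.94 + 1.56 * (w == "Medium") + 3.14 * (w == "High"))
})
round(p, 3)
```

"Low" is the reference level, so its probability comes from the intercept alone.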

@jarrod-dalton
Collaborator Author

Do we want to alert the user to unconverted factor levels? In the example below, we try to use a factor level for node pe in the regression equation for d.dimer before we've used setNode() to define node pe (and told it that the factorLevels are c("No","Yes")).

It is generally a good idea to proceed through the network in topological order (basically starting from the root nodes and populating children only when all parent nodes have been populated). Doing so will avoid issues like this.

Do we want to go so far as disallowing setNode() from working if all parents' models have not yet been specified? This wouldn't catch all possible ways to screw up inputting node distributions via setNode() (as I seem to be adept at demonstrating), but on the other hand I can't seem to think of a good reason not to work under this restriction.

net <- HydeNetwork(~ wells
                   + pe | wells
                   + d.dimer | pregnant*pe)

net <- setNode(network = net, node = pregnant,
               nodeType = "dbern", p=.4,
               factorLevels = c("No","Yes"))

wells.p <- paste("pi.wells[1] <- 0.148",
                 "pi.wells[2] <- 0.656",
                 "pi.wells[3] <- 0.196",
                 sep = "; ")
net <- setNode(net, wells, nodeType = "dcat", pi = wells.p)

# Not run, but it should be...
#
#net <- setNode(net, "pe", nodeType = "dbern", 
#                p = fromFormula(),
#                nodeFormula = pe ~ ilogit(-2.94
#                                          + 1.56*(wells == "Medium")
#                                          + 3.14*(wells == "High")))

net <- setNode(net, d.dimer, nodeType="dnorm",
               mu=fromFormula(), tau=1/30,  #sigma^2 = 30
               nodeFormula = d.dimer ~ 210 + 29*(pregnant=="Yes") + 68*(pe=="Yes"))

net$nodeFormula$d.dimer

d.dimer ~ 210 + 29 * (pregnant == 1) + 68 * (pe == character(0))
<environment: 0x10615f5c0>
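
The topological-order suggestion above can be sketched independently of HydeNet. The function below is a hypothetical helper (topoOrder is not part of the package) that takes a named list of each node's parents, written out by hand here for the example network, and returns an order in which every node comes after all of its parents.

```r
# Hypothetical sketch: order nodes so that parents always precede children.
topoOrder <- function(parents) {
  ordered <- character(0)
  remaining <- names(parents)
  while (length(remaining) > 0) {
    # A node is ready once all of its parents have been placed
    isReady <- vapply(remaining,
                      function(n) all(parents[[n]] %in% ordered),
                      logical(1))
    ready <- remaining[isReady]
    if (length(ready) == 0) {
      stop("No valid order: the graph has a cycle or an undeclared parent")
    }
    ordered <- c(ordered, ready)
    remaining <- setdiff(remaining, ready)
  }
  ordered
}

# Parent lists for the example network, transcribed by hand
parents <- list(wells    = character(0),
                pe       = "wells",
                d.dimer  = c("pregnant", "pe"),
                pregnant = character(0))
topoOrder(parents)
```

Calling setNode() in the returned order would define pe before d.dimer, which avoids the unconverted level shown in the output above.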

@jarrod-dalton jarrod-dalton reopened this Nov 23, 2015
nutterb added a commit that referenced this issue Dec 11, 2015
@nutterb
Owner

nutterb commented Dec 11, 2015

I added an error in circumstances where there is no accompanying factorLevels entry for the variable. I think it's important to make this a hard error--the downstream consequences are catastrophic. Let me know if you think the error message is sufficient or if it needs more information.
