Nice inline export for Rmd documents #78

Open
simonthelwall opened this issue Jan 15, 2017 · 8 comments

Comments

@simonthelwall

I was thinking that it would be great if R had a set of functions for including regression output in body text and was wondering whether you thought this would be a good fit for pixiedust, or whether it would be out of scope for the package?

I imagine it working something like the following:

"Risk of cardiovascular events increased with increasing BMI `r sprinkle_text(x, y, form = "odds_ratio", conf.int = TRUE, p = TRUE)`..."

where x is a regression model and y an independent variable.

The output would look like

"Risk of cardiovascular events increased with increasing BMI (OR: 4.5, 95% CI: 4.1-4.7)..."

What do you think?
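
To make the request concrete, here is a minimal base R sketch of the kind of helper being described. The name `inline_or` and its arguments are hypothetical and are not part of pixiedust; the interval is a Wald interval from `confint.default`, chosen only to keep the example short.

```r
# Hypothetical helper: the name inline_or and its arguments are illustrative
# assumptions, not functions from pixiedust.
inline_or <- function(model, term, conf.int = TRUE, digits = 2) {
  est <- coef(model)
  if (!term %in% names(est)) {
    stop("Term '", term, "' was not found in the model.")
  }
  fmt <- function(v) formatC(v, format = "f", digits = digits)
  # Exponentiate the log-odds coefficient to obtain the odds ratio.
  out <- paste0("OR: ", fmt(exp(est[[term]])))
  if (conf.int) {
    # Wald interval on the log-odds scale, then exponentiated.
    ci <- exp(confint.default(model, parm = term))
    out <- paste0(out, ", 95% CI: ", fmt(ci[1, 1]), "-", fmt(ci[1, 2]))
  }
  paste0("(", out, ")")
}

# Example model on a built-in data set.
fit <- glm(am ~ wt, data = mtcars, family = binomial)
inline_or(fit, "wt")
```

In an Rmd document the call would sit inline in the body text, for example `r inline_or(fit, "wt")`.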

@nutterb
Owner

nutterb commented Jan 15, 2017

I'm not opposed to the idea, but there are a few questions I would want to resolve prior to taking on a project like this. For instance:

  1. How should interaction terms be addressed?
  2. How should factor variables be retrieved?
  3. Most model objects return a term column when tidied via broom, but what should happen if there is no term column?
  4. What should be the behavior if there is no confidence interval method for the object? When no p-value is available?
  5. Are there specific formats that should be followed? For example, APA formats? (Ugh)
  6. What should happen if an inappropriate form is requested? For example, if I request an odds ratio with a t.test object?

Questions 1-4 seem like they would need pretty firm answers before committing any serious code effort. 5 and 6 are things I think could become headaches in the future.

Just brainstorming out loud, but would be interested in your thoughts on 1-4.

@simonthelwall
Author

All excellent points. Some thoughts below.

  1. I had wondered about this. I'll confess that I don't fully understand interactions in R. I think we would want some way to specify the stratum-specific effect (the linear combination).
  2. I think an optional argument specifying the factor level.
  3. I'm not familiar with any model objects that would not return a term. Perhaps print a warning in place of the term?
  4. I think CIs and p-values should be optional arguments, defaulting to FALSE. If users then specify an illegal choice an error message should be printed.
  5. Really good point. I think a combination of two options:
    • some built in styles that can be specified by name
    • another function by which a user can specify their own format that will be used universally through the document.
  6. Again, I think print a warning.
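
The two options in point 5 (built-in styles chosen by name, plus a user-supplied format applied throughout the document) could be sketched as follows. Everything here is an assumption for illustration: the function names, the `inline.style` option, and the `sprintf` templates are hypothetical and are not part of pixiedust.

```r
# Hypothetical style registry; names and templates are illustrative assumptions.
inline_styles <- list(
  default  = "(%s: %.2f, 95%% CI: %.2f-%.2f)",
  brackets = "%s = %.2f [%.2f, %.2f]"
)

# Set the format once (for example in the setup chunk). A known style name is
# looked up in the registry; any other string is used as a custom template.
set_inline_style <- function(style) {
  if (style %in% names(inline_styles)) style <- inline_styles[[style]]
  options(inline.style = style)
  invisible(style)
}

# Format one estimate with whichever template is currently active.
format_inline <- function(label, estimate, lower, upper) {
  template <- getOption("inline.style", inline_styles$default)
  sprintf(template, label, estimate, lower, upper)
}

set_inline_style("brackets")
format_inline("OR", 4.5, 4.1, 4.7)
# "OR = 4.50 [4.10, 4.70]"
```

Storing the template in `options()` is only one possible design; it keeps the setting global to the session, which matches the idea of a format used universally through the document.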

@nutterb
Owner

nutterb commented Jan 17, 2017

As I've thought about it more, I've decided that this really ought to be a generic for which additional methods may be written. For the generic, I propose the following functional requirements:

  1. Accepts an object that may be successfully tidied.
  2. Returns the error message from tidy when tidy is not successful
  3. Returns any warnings generated by tidy
  4. Accepts a character (1) argument that can determine the output format (overriding other formal arguments)

For now, I would set style = "none" to indicate the formal arguments should be used to determine the format. Other styles, such as APA may follow later.
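
As a rough illustration of the four requirements for the generic, here is a base R sketch. The names `inline_summary`, `tidier`, and `toy_tidier` are hypothetical; `tidier` stands in for `broom::tidy` so that the example runs without broom installed, and returning warnings in a list is just one way to read "returns any warnings".

```r
# Hypothetical names throughout: inline_summary, tidier and toy_tidier are
# assumptions for illustration, not pixiedust's API.
inline_summary <- function(x, ..., style = "none") {
  # Requirement 4: style is a character(1).
  stopifnot(is.character(style), length(style) == 1)
  UseMethod("inline_summary")
}

inline_summary.default <- function(x, ..., style = "none", tidier) {
  caught <- character(0)
  tidied <- withCallingHandlers(
    tidier(x),  # an error raised here propagates with the tidier's own message
    warning = function(w) {
      # Collect each warning message, then let the tidier continue.
      caught <<- c(caught, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(tidy = tidied, warnings = caught)
}

# Toy stand-in for broom::tidy that only understands lm objects.
toy_tidier <- function(x, ...) {
  if (!inherits(x, "lm")) {
    stop("No tidy method for objects of class ", class(x)[1])
  }
  cf <- coef(summary(x))
  data.frame(term = rownames(cf), estimate = cf[, "Estimate"],
             row.names = NULL, stringsAsFactors = FALSE)
}

fit <- lm(mpg ~ wt, data = mtcars)
inline_summary(fit, tidier = toy_tidier)
```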

As an example of the lm method, I would add the following requirements.

  1. Accepts a character vector naming the term to be summarised. A length one vector returns the main effect. A length two vector returns the interaction between two terms, etc.
  2. Return an error if no term exists that satisfies the linear combination.
  3. Accepts a vector or list of characters, optionally named, specifying the level for any factors named in term. If unnamed, the levels are assumed to follow the same order of factors in term.
  4. Returns an error if any levels in level cannot be found in its corresponding term.
  5. Accepts a logical (1) indicating if the confidence interval is to be included in the summary
  6. Accepts a logical (1) indicating if the SE is to be included in the summary
  7. Accepts a logical (1) indicating if the test statistic is to be included in the summary
  8. Accepts a logical (1) indicating if the p-value is to be included in the summary
  9. Accepts a character (1) designating the text label for the coefficient (beta, OR, etc)
  10. Accepts a function to apply to the coefficient and CI
  11. Accepts additional arguments to the transformation function
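
A partial sketch of how an lm method along these lines might look, in base R only. The name `inline_lm` and its argument names are hypothetical. It covers the term lookup (including a two-term interaction), the error for a missing term, the CI and p-value switches, the label, and the transformation function with extra arguments; factor levels, SE, and the test statistic are left out to keep it short.

```r
# Hypothetical function; inline_lm and its arguments are illustrative
# assumptions that mirror some of the requirements, not pixiedust's API.
inline_lm <- function(fit, term, conf_int = TRUE, p_value = FALSE,
                      label = "beta", fun = identity, ..., digits = 2) {
  cf <- coef(summary(fit))
  # A length-two `term` is looked up as the interaction "a:b".
  # Assumption: terms are given in the order R used to name the coefficient.
  key <- paste(term, collapse = ":")
  if (!key %in% rownames(cf)) {
    stop("No coefficient named '", key, "' in the model.")
  }
  fmt <- function(v) formatC(v, format = "f", digits = digits)
  # Apply the transformation (for example exp) to the estimate.
  out <- paste0(label, ": ", fmt(fun(cf[key, "Estimate"], ...)))
  if (conf_int) {
    ci <- confint(fit, parm = key)
    # The same transformation is applied to both interval limits.
    out <- paste0(out, ", 95% CI: ", fmt(fun(ci[1, 1], ...)),
                  " to ", fmt(fun(ci[1, 2], ...)))
  }
  if (p_value) {
    p <- format.pval(cf[key, "Pr(>|t|)"], digits = digits, eps = 0.001)
    out <- paste0(out, ", p = ", p)
  }
  paste0("(", out, ")")
}

fit <- lm(mpg ~ wt * hp, data = mtcars)
inline_lm(fit, "wt")                           # main effect
inline_lm(fit, c("wt", "hp"), p_value = TRUE)  # interaction wt:hp
```

The limits are joined with "to" rather than a hyphen so that negative values stay readable.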

How would this work for getting started?

@ckraner

ckraner commented Jan 20, 2017

If you are using LaTeX, just use knitr. Here is how I report chi-square values from the lm objects:
\({\chi}^2(\Sexpr{PreviousChiSq$df})=\Sexpr{round(PreviousChiSq$dx,2)}, p=\Sexpr{round(PreviousChiSq$chi,24)}\).

It doesn't give you the label for the value, but it's there and easy.

Edit: For percents look at something like this first: http://stackoverflow.com/questions/7145826/how-to-format-a-number-as-percentage-in-r

@nutterb
Owner

nutterb commented Jan 20, 2017

Here's a first attempt. How does this look as proof of concept?

Use devtools::install_github("nutterb/pixiedust", ref = "new-latex-tables-inline-dust") to install the package with these utilities.

The source code to generate the document displayed below is at https://gist.githubusercontent.com/nutterb/bcc3c04bc4c807cb9753f74820584cf5/raw/dfe78db875de0a314d4e87126ab2cdf5548173d8/dust_inline_example.Rmd

[Image: test]

@simonthelwall
Author

I think that works really nicely.
One thing I noticed is that the upper confidence limit does not appear to be formatted to two decimal places.
[Image]

@simonthelwall
Author

I don't know whether I was doing something wrong or whether it was something else. I had just had to update R to 3.3.2 and reinstall all my libraries. When trying to install pixiedust as above, I also had to install the packages below, one by one.

Formula
acepack
latticeExtra
gridExtra
htmlTable
data.table

@nutterb
Owner

nutterb commented Jan 22, 2017

This sounds like something in the dependency chain: a dependency of one of the dependencies is not being installed. When upgrading R, I would recommend setting dependencies = TRUE in install.packages or any of its devtools variants. You can piece together why by reading about the dependencies argument in install.packages and install_github.
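
A small sketch of that recommendation. The wrapper name `install_with_deps` is hypothetical; `install.packages()` and its `dependencies` argument are part of base R's utils package. The calls are left commented out because they need network access.

```r
# Wrapper name is hypothetical; install.packages() and its `dependencies`
# argument come from base R's utils package.
install_with_deps <- function(pkgs, ...) {
  # dependencies = TRUE installs Depends, Imports and LinkingTo, plus the
  # Suggests of the packages named in `pkgs`.
  utils::install.packages(pkgs, dependencies = TRUE, ...)
}

# Example calls (not run here because they need network access):
# install_with_deps(c("Formula", "acepack", "latticeExtra"))
# devtools::install_github("nutterb/pixiedust",
#                          ref = "new-latex-tables-inline-dust",
#                          dependencies = TRUE)
```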
