Test bplyr integration #9

Open
jonocarroll opened this issue Jan 28, 2020 · 5 comments

@jonocarroll
Owner

https://github.com/yonicd/bplyr

Appears to work for mutate and filter, even processing an S4 column

library(S4Vectors)
m <- mtcars[, c("cyl", "hp", "am", "gear", "disp")]
d <- as(m, "DataFrame")
d$gr <- GenomicRanges::GRanges("chrY", IRanges::IRanges(1:32, width=10))
d$gr2 <- GenomicRanges::GRanges("chrX", IRanges::IRanges(1:32, width = 10))
d$nl <- IRanges::NumericList(lapply(d$gear, function(n) round(rnorm(n), 2)))
d
#> DataFrame with 32 rows and 8 columns
#>                         cyl        hp        am      gear      disp
#>                   <numeric> <numeric> <numeric> <numeric> <numeric>
#> Mazda RX4                 6       110         1         4       160
#> Mazda RX4 Wag             6       110         1         4       160
#> Datsun 710                4        93         1         4       108
#> Hornet 4 Drive            6       110         0         3       258
#> Hornet Sportabout         8       175         0         3       360
#> ...                     ...       ...       ...       ...       ...
#> Lotus Europa              4       113         1         5      95.1
#> Ford Pantera L            8       264         1         5       351
#> Ferrari Dino              6       175         1         5       145
#> Maserati Bora             8       335         1         5       301
#> Volvo 142E                4       109         1         4       121
#>                           gr        gr2                   nl
#>                    <GRanges>  <GRanges>        <NumericList>
#> Mazda RX4          chrY:1-10  chrX:1-10 -0.26,0.22,-1.33,...
#> Mazda RX4 Wag      chrY:2-11  chrX:2-11    0.35,0.67,2.5,...
#> Datsun 710         chrY:3-12  chrX:3-12 0.47,-0.76,-1.91,...
#> Hornet 4 Drive     chrY:4-13  chrX:4-13     -2.78,-1.82,0.81
#> Hornet Sportabout  chrY:5-14  chrX:5-14      0.03,-1.51,1.01
#> ...                      ...        ...                  ...
#> Lotus Europa      chrY:28-37 chrX:28-37  0.29,1.11,-0.13,...
#> Ford Pantera L    chrY:29-38 chrX:29-38   1.9,-1.43,-0.6,...
#> Ferrari Dino      chrY:30-39 chrX:30-39  0.76,0.28,-0.16,...
#> Maserati Bora     chrY:31-40 chrX:31-40  -0.14,0.96,1.52,...
#> Volvo 142E        chrY:32-41 chrX:32-41 -0.49,0.54,-1.55,...

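# Capture each named expression, then run it as an assignment inside the DataFrame
mutateDF <- function(.data, ...) {
  # Strip the quosures down to bare expressions
  FNS <- lapply(rlang::quos(...), rlang::quo_expr)

  # Build one "name <- expression" string per argument
  EXPRS <- lapply(names(FNS), function(x) {
    sprintf('%s <- %s', x, deparse(FNS[[x]]))
  })

  # Evaluate the assignments with the columns of .data in scope
  within(.data, eval(parse(text = paste0(unlist(EXPRS), collapse = '\n'))))
}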
mutateDF(d, nl2 = 2 * nl)
#> Warning: `quo_expr()` is deprecated as of rlang 0.2.0.
#> Please use `quo_squash()` instead.
#> This warning is displayed once per session.
#> DataFrame with 32 rows and 9 columns
#>                         cyl        hp        am      gear      disp
#>                   <numeric> <numeric> <numeric> <numeric> <numeric>
#> Mazda RX4                 6       110         1         4       160
#> Mazda RX4 Wag             6       110         1         4       160
#> Datsun 710                4        93         1         4       108
#> Hornet 4 Drive            6       110         0         3       258
#> Hornet Sportabout         8       175         0         3       360
#> ...                     ...       ...       ...       ...       ...
#> Lotus Europa              4       113         1         5      95.1
#> Ford Pantera L            8       264         1         5       351
#> Ferrari Dino              6       175         1         5       145
#> Maserati Bora             8       335         1         5       301
#> Volvo 142E                4       109         1         4       121
#>                           gr        gr2                   nl
#>                    <GRanges>  <GRanges>        <NumericList>
#> Mazda RX4          chrY:1-10  chrX:1-10 -0.26,0.22,-1.33,...
#> Mazda RX4 Wag      chrY:2-11  chrX:2-11    0.35,0.67,2.5,...
#> Datsun 710         chrY:3-12  chrX:3-12 0.47,-0.76,-1.91,...
#> Hornet 4 Drive     chrY:4-13  chrX:4-13     -2.78,-1.82,0.81
#> Hornet Sportabout  chrY:5-14  chrX:5-14      0.03,-1.51,1.01
#> ...                      ...        ...                  ...
#> Lotus Europa      chrY:28-37 chrX:28-37  0.29,1.11,-0.13,...
#> Ford Pantera L    chrY:29-38 chrX:29-38   1.9,-1.43,-0.6,...
#> Ferrari Dino      chrY:30-39 chrX:30-39  0.76,0.28,-0.16,...
#> Maserati Bora     chrY:31-40 chrX:31-40  -0.14,0.96,1.52,...
#> Volvo 142E        chrY:32-41 chrX:32-41 -0.49,0.54,-1.55,...
#>                                    nl2
#>                          <NumericList>
#> Mazda RX4         -0.52,0.44,-2.66,...
#> Mazda RX4 Wag           0.7,1.34,5,...
#> Datsun 710        0.94,-1.52,-3.82,...
#> Hornet 4 Drive        -5.56,-3.64,1.62
#> Hornet Sportabout      0.06,-3.02,2.02
#> ...                                ...
#> Lotus Europa       0.58,2.22,-0.26,...
#> Ford Pantera L      3.8,-2.86,-1.2,...
#> Ferrari Dino       1.52,0.56,-0.32,...
#> Maserati Bora      -0.28,1.92,3.04,...
#> Volvo 142E         -0.98,1.08,-3.1,...


# Evaluate the captured condition with the columns of .data in scope and keep matching rows
filterDF <- function(.data, ...) {
  subset(.data, {
    eval(rlang::quo_expr(rlang::quo(...)))
  })
}
filterDF(d, lengths(nl) == 5)
#> DataFrame with 5 rows and 8 columns
#>                      cyl        hp        am      gear      disp
#>                <numeric> <numeric> <numeric> <numeric> <numeric>
#> Porsche 914-2          4        91         1         5     120.3
#> Lotus Europa           4       113         1         5      95.1
#> Ford Pantera L         8       264         1         5       351
#> Ferrari Dino           6       175         1         5       145
#> Maserati Bora          8       335         1         5       301
#>                        gr        gr2                  nl
#>                 <GRanges>  <GRanges>       <NumericList>
#> Porsche 914-2  chrY:27-36 chrX:27-36  0.27,0.77,0.38,...
#> Lotus Europa   chrY:28-37 chrX:28-37 0.29,1.11,-0.13,...
#> Ford Pantera L chrY:29-38 chrX:29-38  1.9,-1.43,-0.6,...
#> Ferrari Dino   chrY:30-39 chrX:30-39 0.76,0.28,-0.16,...
#> Maserati Bora  chrY:31-40 chrX:31-40 -0.14,0.96,1.52,...

Created on 2020-01-29 by the reprex package (v0.3.0)

(with dispatch, of course).
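
The warning in the reprex points at `quo_squash()` as the replacement for `quo_expr()`. As a rough sketch only (not the code used in bplyr or DFplyr), the same two verbs could also be written without the deprecated call and without the deparse/parse round trip by evaluating each quosure against the columns with `rlang::eval_tidy()`. The names `mutate_sketch` and `filter_sketch` are hypothetical, and the example is run on a plain `data.frame`; whether it behaves the same on a `DataFrame` with S4 columns is an assumption that would need checking.

```r
# Hypothetical helpers; assumes the rlang package is installed.

# Add or overwrite columns, one named expression at a time.
# Assumes every argument in ... is named.
mutate_sketch <- function(.data, ...) {
  quosures <- rlang::quos(...)
  for (nm in names(quosures)) {
    # Columns are re-listed on every pass so later expressions
    # can refer to columns created by earlier ones.
    .data[[nm]] <- rlang::eval_tidy(quosures[[nm]], data = as.list(.data))
  }
  .data
}

# Keep rows for which every condition is TRUE.
# Assumes at least one condition is supplied.
filter_sketch <- function(.data, ...) {
  conds <- lapply(rlang::quos(...), rlang::eval_tidy, data = as.list(.data))
  keep <- Reduce(`&`, conds)
  # which() drops rows where the condition is NA
  .data[which(keep), , drop = FALSE]
}

df <- data.frame(x = 1:3)
mutated <- mutate_sketch(df, y = x * 2, z = y + 1)
filtered <- filter_sketch(mutated, x > 1, z < 7)
```

Evaluating quosures directly keeps the caller's environment attached to each expression, which the string-based version loses when it deparses and re-parses the code.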

Calling the b_mutate methods internally doesn't seem to work, but maybe I'm doing something wrong. Want to collaborate, @yonicd?

@yonicd

yonicd commented Jan 28, 2020

I’ll take a look on my end

@jonocarroll
Owner Author

Progress... https://github.com/jonocarroll/DFplyr/tree/bplyr_integration

The README renders in the current form (including S4 columns). I haven't finished, but I found a lot of edge cases and have dealt with them.

@yonicd

yonicd commented Feb 2, 2020

Looks better!
A few q’s (probably me not grokking)

You are importing dplyr?

Aren’t the Fn names causing ns conflicts?

If you are using base underneath why would the user want to install dplyr?

@jonocarroll
Owner Author

I only import the generics - without those there's no dispatch. You reclassed everything and wrote new generics but this is 'supposed' to be the way to extend a generic - write the method for a new class. Plus this way mutate works whether you pass it a data.frame or a DataFrame. My original idea was to use the tbl methods under the hood but there are glaring issues with that.

I could write new generics but that breaks dplyr if it's also attached.
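
To illustrate the point about extending a generic rather than replacing it, here is a minimal base-R sketch. It is not DFplyr's code: the local `mutate` generic is a stand-in for the one that would be imported from dplyr, and `ToyDF` is a hypothetical S4 class standing in for `S4Vectors::DataFrame`. In a package, the equivalent step would presumably be importing the generic and exporting an S3 method for the new class.

```r
library(methods)

# Stand-in for the generic that a package would import from dplyr.
mutate <- function(.data, ...) UseMethod("mutate")

# Existing method: what users of plain data.frames already get.
mutate.data.frame <- function(.data, ...) {
  "data.frame method"
}

# Hypothetical S4 class playing the role of DataFrame.
setClass("ToyDF", representation(cols = "list"))

# Extending the generic: one new method for the new class.
# The generic itself is untouched, so existing methods keep working.
mutate.ToyDF <- function(.data, ...) {
  "ToyDF method"
}

res_df  <- mutate(data.frame(x = 1))
res_toy <- mutate(new("ToyDF", cols = list(x = 1)))
```

Because only a method is added, the same verb serves both classes, and nothing is masked if the package that owns the generic is attached as well.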

@yonicd

yonicd commented Feb 3, 2020

Ok. The original noplyr was like that but still caused tons of ns problems. I’ll look more closely at how you did it to figure out what I did wrong there. Cheers ;)
