Robert Schnitman
2018-06-23
Recommended Citation:
Schnitman, Robert (2018). afp v0.0.0.1. https://github.com/robertschnitman/afp
- Installation
- Introduction
do.bind()
mapreduce()
telecast()
- Conclusion
- References
- See also
## Dependencies: R >= 3.5
# install.packages("devtools")
devtools::install_github("robertschnitman/afp")
The afp
package--Applied Functional Programming--provides
functionals to simplify iterative processes in R. The Base R *apply()
family, purrr
library, and Julia programming
language are the principal influences. The
former two contain tools essential for functional programming that
minimize the need to incorporate loops and increase code brevity;
however, there is inelegance with respect to specific situations.
For example, to map a function and consequently bind rows or columns,
purrr
splits the decision into two functions rather than within one:
map_dfr()
and map_dfc()
, both of which only output to data
frames--while this feature is as intended, they nonetheless omit the
possibility of a matrix when such a data type is preferred.
Additionally, to reduce the results of a mapping, one must encase
Map()
/lapply()
in Reduce()
, while Julia blends the routine into
mapreduce()
.
As such, afp
exists to supplement the functionals in Base R,
purrr
, and others to support efficient and concise programming.
The following sections provide examples for the primary functions:
do.bind()
, mapreduce()
, and telecast()
. For a complete description of these and other functions within the packages, see the afp Handbook.
When executing lapply()
to manipulate subsets of data, calling
rbind()
or cbind()
within do.call()
is common to fuse the
transformed partitions. While the implementation is possible on one line
by way of the do.call(*bind, lapply(x, f))
form, the readability and
intended concision decreases as the complexity of the anonymous function
increases. Even if the lapply()
portion is stored in an object before
hand, the intention of do.call()
is not clear until *bind
is stated.
To minimize these issues, do.bind()
wraps this process and clarifies
its purpose: bind the results of the given function.
One may obtain similar results with map_dfr()
/map_dfc()
from
purrr
; but the output would always result in a data frame rather than
the possibility of a matrix. Additionally, the binding rows or columns
must be done in different functions, whereas it can be defined within
do.bind()
.
There are three required parameters in this function: f
, x
, and
m
--respectively the function, collection (e.g. data frame), and
margin (rbind/cbind designation). If m = 1
(the default), the results are
combined row-wise; 2
for column-wise. A fourth parameter ...
passes
to do.call()
.
To note, the naming of this function is to be consistent with the previously mentioned function.
split1 <- split(mtcars, mtcars$gear)
adhoc1 <- function(s) {coef(lm(mpg ~ disp + wt + am, s))}
output1 <- do.bind(adhoc1, split1) # == do.call(rbind, lapply(split1, adhoc1)).
output1 # Rownames indicate subset.
## (Intercept) disp wt am
## 3 27.99461 -0.007982643 -2.384834 NA
## 4 46.68250 -0.097327135 -3.171284 -2.817164
## 5 41.77904 -0.006730729 -7.230952 NA
airquality2 <- na.omit(airquality) # NA values exist in airquality.
split2 <- split(airquality2, airquality2$Month)
adhoc2 <- function(x, y) {median(x/y)}
output2 <- do.bind(function(s) with(s, adhoc2(Ozone, Temp)), split2, 2)
output2 # == do.call(cbind, lapply(split2, function(s) with(s, adhoc2(Ozone, Temp))))
## 5 6 7 8 9
## [1,] 0.3003337 0.3076923 0.7272928 0.5696203 0.2948718
Base R has functionals that output a list by way of lapply()
and
Map()
(among others), while Reduce()
diminishes given elements in a
consecutive manner until a single result remains. While these functions
are relatively straightforward to combine (depending on the functions
being passed), R does not inherently possess a singular function to
accomplish this operation unlike Julia with mapreduce()
. Therefore, mapreduce()
in afp
offers a simplified Julia-equivalent to
streamline the intended procedure in R.
The three required parameters are f
, o
, and x
("fox",
collectively)--function, (binary) operator, and collection (e.g. matrix). If f
is multivariate, the fourth parameter y
can take multiple arguments much like MoreArgs
in mapply()
. The final parameter ...
passes to Reduce()
.
matrixl <- list(A = matrix(c(1:9), 3, 3), B = matrix(10:18, 3, 3), C = matrix(19:27, 3, 3))
output1 <- mapreduce(function(x) x^2 + 1, `/`, matrixl) # == Reduce(`/`, Map(function(x) x^2 + 1, matrixl))
output1
## [,1] [,2] [,3]
## [1,] 0.0000547016 0.0002061856 0.0003107868
## [2,] 0.0001022035 0.0002490183 0.0003310752
## [3,] 0.0001560306 0.0002837380 0.0003456270
# Using the same list from the previous example...
output2 <- with(matrixl, mapreduce(function(i, j, k) i*j - k, `/`, A, list(B, C)))
output2
## [,1] [,2] [,3]
## [1,] -1.387799e-10 -3.408547e-12 -3.442132e-13
## [2,] -2.925642e-11 -1.456427e-12 -1.839155e-13
## [3,] -9.059628e-12 -6.829299e-13 -1.031438e-13
Map()
/mapply()
from Base R executes functions pairwise when given
multiple data objects, as do map2()
/pmap()
from purrr
. While
beneficial in their own right, said functions cannot concisely map over
datasets independently of each other, which would be useful for
storing disparate information into a single list.
Inspired by broadcast()
from
Julia,
telecast()
essentially wraps mapply()
within lapply()
to achieve
this outcome.
The two functions (both required) are f
and l
, respectively a function and list. The third parameter as.vector
, which is optional, converts the output to a vector if set to TRUE
(and thus will resemble the output from rapply()
); by default, it is FALSE
for a list format.
l <- list(mc = mtcars, aq = airquality, lcs = LifeCycleSavings)
mean.nr <- function(x) mean(x, na.rm = TRUE) # airquality has NA values.
output1 <- telecast(mean.nr, l)
output1 # Compare: lapply(l, function(x) mapply(mean.nr, x))
## $`mc`
## mpg cyl disp hp drat wt
## 20.090625 6.187500 230.721875 146.687500 3.596563 3.217250
## qsec vs am gear carb
## 17.848750 0.437500 0.406250 3.687500 2.812500
##
## $aq
## Ozone Solar.R Wind Temp Month Day
## 42.129310 185.931507 9.957516 77.882353 6.993464 15.803922
##
## $lcs
## sr pop15 pop75 dpi ddpi
## 9.6710 35.0896 2.2930 1106.7584 3.7576
EXAMPLE - telecast()
#2: Iteratively reduce each variable for all datasets in a list and store in a vector.
# With the same list in the previous example...
red.div <- function(y) Reduce(`/`, na.omit(y))
output2 <- telecast(red.div, l, as.vector = TRUE)
output2 # Compare: rapply(l, red.div)
## mc.mpg mc.cyl mc.disp mc.hp mc.drat
## 3.488260e-39 6.971465e-24 9.175733e-70 1.724534e-64 3.483381e-17
## mc.wt mc.qsec mc.vs mc.am mc.gear
## 1.767344e-15 2.807034e-38 NaN Inf 2.126822e-17
## mc.carb aq.Ozone aq.Solar.R aq.Wind aq.Temp
## 1.149781e-11 1.016406e-169 1.081719e-313 5.634846e-147 5.949698e-286
## aq.Month aq.Day lcs.sr lcs.pop15 lcs.pop75
## 3.666302e-127 2.556316e-167 2.774085e-44 2.731185e-74 7.078060e-14
## lcs.dpi lcs.ddpi
## 2.849951e-136 5.292273e-23
The functions discussed and demonstrated will be improved on a continuous basis to (1) minimize repetitive iterative processing and (2) emphasize code efficiency and brevity. New functions to be added based on feasibility and future needs as appropriate.
afp Handbook
Julia programming language
Julia - broadcast()
Julia - mapreduce()
Tidyverse - purrr
afpj
, the Julia-equivalent library of afp
Advanced R by Hadley Wickham
End of Document