GH-44953: [R] Add R bindings for new compute functions #44971

Open
wants to merge 10 commits into base: main
4 changes: 4 additions & 0 deletions r/NEWS.md
@@ -19,6 +19,10 @@

 # arrow 18.1.0.9000

+## Minor improvements and fixes
+
+- Added bindings for atan, sinh, cosh, tanh, asinh, acosh, atanh, and expm1 (#44953)
+
 # arrow 18.1.0

 ## Minor improvements and fixes
14 changes: 7 additions & 7 deletions r/R/arrow-datum.R
@@ -128,6 +128,7 @@ Math.ArrowDatum <- function(x, ..., base = exp(1), digits = 0) {
     log10 = eval_array_expression("log10_checked", x),
     log2 = eval_array_expression("log2_checked", x),
     log1p = eval_array_expression("log1p_checked", x),
+    expm1 = eval_array_expression("expm1", x),
     round = eval_array_expression(
       "round",
       x,
@@ -139,17 +140,16 @@ Math.ArrowDatum <- function(x, ..., base = exp(1), digits = 0) {
     cumprod = eval_array_expression("cumulative_prod_checked", x),
     cummax = eval_array_expression("cumulative_max", x),
     cummin = eval_array_expression("cumulative_min", x),
+    cosh = eval_array_expression("cosh", x),
+    sinh = eval_array_expression("sinh", x),
+    tanh = eval_array_expression("tanh", x),
+    acosh = eval_array_expression("acosh_checked", x),
+    asinh = eval_array_expression("asinh", x),
+    atanh = eval_array_expression("atanh_checked", x),
Comment on lines +143 to +148
Member

Mostly out of curiosity: did we miss these earlier? Or are they also new?

Member Author

They're new as of #44630 which hasn't been released.

     signif = ,
-    expm1 = ,
     cospi = ,
     sinpi = ,
     tanpi = ,
-    cosh = ,
-    sinh = ,
-    tanh = ,
-    acosh = ,
-    asinh = ,
-    atanh = ,
     lgamma = ,
     gamma = ,
     digamma = ,
15 changes: 12 additions & 3 deletions r/R/dplyr-funcs-doc.R
@@ -21,7 +21,7 @@
 #'
 #' The `arrow` package contains methods for 37 `dplyr` table functions, many of
 #' which are "verbs" that do transformations to one or more tables.
-#' The package also has mappings of 212 R functions to the corresponding
+#' The package also has mappings of 221 R functions to the corresponding
 #' functions in the Arrow compute library. These allow you to write code inside
 #' of `dplyr` methods that call R functions, including many in packages like
 #' `stringr` and `lubridate`, and they will get translated to Arrow and run
@@ -42,7 +42,7 @@
 #' * [`collect()`][dplyr::collect()]
 #' * [`compute()`][dplyr::compute()]
 #' * [`count()`][dplyr::count()]
-#' * [`distinct()`][dplyr::distinct()]: `.keep_all = TRUE` not supported
+#' * [`distinct()`][dplyr::distinct()]: `.keep_all = TRUE` returns a non-missing value if present, only returning missing values if all are missing.
 #' * [`explain()`][dplyr::explain()]
 #' * [`filter()`][dplyr::filter()]
 #' * [`full_join()`][dplyr::full_join()]: the `copy` argument is ignored
@@ -83,7 +83,7 @@
 #' Functions can be called either as `pkg::fun()` or just `fun()`, i.e. both
 #' `str_sub()` and `stringr::str_sub()` work.
 #'
-#' In addition to these functions, you can call any of Arrow's 262 compute
+#' In addition to these functions, you can call any of Arrow's 271 compute
 #' functions directly. Arrow has many functions that don't map to an existing R
 #' function. In other cases where there is an R function mapping, you can still
 #' call the Arrow function directly if you don't want the adaptations that the R
@@ -96,6 +96,7 @@
 #'
 #' * [`add_filename()`][arrow::add_filename()]
 #' * [`cast()`][arrow::cast()]
+#' * [`one()`][arrow::one()]
 #'
 #' ## base
 #'
@@ -119,6 +120,7 @@
 #' * [`^`][^()]
 #' * [`abs()`][base::abs()]
 #' * [`acos()`][base::acos()]
+#' * [`acosh()`][base::acosh()]
 #' * [`all()`][base::all()]
 #' * [`any()`][base::any()]
 #' * [`as.Date()`][base::as.Date()]: Multiple `tryFormats` not supported in Arrow.
@@ -130,14 +132,19 @@
 #' * [`as.logical()`][base::as.logical()]
 #' * [`as.numeric()`][base::as.numeric()]
 #' * [`asin()`][base::asin()]
+#' * [`asinh()`][base::asinh()]
+#' * [`atan()`][base::atan()]
+#' * [`atanh()`][base::atanh()]
 #' * [`ceiling()`][base::ceiling()]
 #' * [`cos()`][base::cos()]
+#' * [`cosh()`][base::cosh()]
 #' * [`data.frame()`][base::data.frame()]: `row.names` and `check.rows` arguments not supported;
 #' `stringsAsFactors` must be `FALSE`
 #' * [`difftime()`][base::difftime()]: only supports `units = "secs"` (the default);
 #' `tz` argument not supported
 #' * [`endsWith()`][base::endsWith()]
 #' * [`exp()`][base::exp()]
+#' * [`expm1()`][base::expm1()]
 #' * [`floor()`][base::floor()]
 #' * [`format()`][base::format()]
 #' * [`grepl()`][base::grepl()]
@@ -171,6 +178,7 @@
 #' * [`round()`][base::round()]
 #' * [`sign()`][base::sign()]
 #' * [`sin()`][base::sin()]
+#' * [`sinh()`][base::sinh()]
 #' * [`sqrt()`][base::sqrt()]
 #' * [`startsWith()`][base::startsWith()]
 #' * [`strftime()`][base::strftime()]
@@ -183,6 +191,7 @@
 #' * [`substring()`][base::substring()]
 #' * [`sum()`][base::sum()]
 #' * [`tan()`][base::tan()]
+#' * [`tanh()`][base::tanh()]
 #' * [`tolower()`][base::tolower()]
 #' * [`toupper()`][base::toupper()]
 #' * [`trunc()`][base::trunc()]
8 changes: 8 additions & 0 deletions r/R/dplyr-funcs-simple.R
@@ -32,14 +32,22 @@
   "base::log1p" = "log1p_checked",
   "base::log2" = "log2_checked",
   "base::sign" = "sign",
+  "base::expm1" = "expm1",
   # trunc is defined in dplyr-functions.R

   # trigonometric functions
   "base::acos" = "acos_checked",
   "base::asin" = "asin_checked",
   "base::cos" = "cos_checked",
+  "base::atan" = "atan",
   "base::sin" = "sin_checked",
   "base::tan" = "tan_checked",
+  "base::cosh" = "cosh",
+  "base::sinh" = "sinh",
+  "base::tanh" = "tanh",
+  "base::acosh" = "acosh_checked",
+  "base::asinh" = "asinh",
+  "base::atanh" = "atanh_checked",

   # logical functions
   "!" = "invert",
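Note on the `_checked` mappings above: `acosh` maps to `acosh_checked`, and the tests in this PR expect `Invalid: domain error` for out-of-domain input, whereas base R's `acosh()` returns `NaN` with a warning. The base R sketch below only illustrates that difference in behaviour. `acosh_strict()` is a hypothetical helper, not part of arrow, and it does not reproduce Arrow's implementation or its exact error message.

```r
# Hypothetical helper (not part of arrow): raise an error on out-of-domain
# input instead of returning NaN, loosely mimicking a "checked" kernel.
acosh_strict <- function(x) {
  # acosh is only defined for x >= 1; NA values are passed through
  if (any(x < 1, na.rm = TRUE)) {
    stop("domain error: acosh requires x >= 1")
  }
  acosh(x)
}

# Base R behaviour: NaN (plus a warning, suppressed here) for x < 1
base_result <- suppressWarnings(acosh(0.6))

# Strict behaviour: the same input raises an error, captured as a message
strict_result <- tryCatch(
  acosh_strict(0.6),
  error = function(e) conditionMessage(e)
)
```

In-domain input such as `acosh_strict(c(1, 2))` gives the same values as `acosh(c(1, 2))`.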
36 changes: 0 additions & 36 deletions r/extra-tests/helpers.R

This file was deleted.

8 changes: 5 additions & 3 deletions r/extra-tests/test-read-files.R
@@ -18,6 +18,8 @@
 library(arrow)
 library(testthat)

+source("tests/testthat/helper-skip.R")
+
 pq_file <- "files/ex_data.parquet"

 test_that("Can read the file (parquet)", {
@@ -30,7 +32,7 @@ test_that("Can read the file (parquet)", {

 ### Parquet
 test_that("Can see the metadata (parquet)", {
-  skip_if_version_less_than("2.0.0", "Version 1.0.1 can't read new version metadata.")
+  skip_if_arrow_version_less_than("2.0.0", "Version 1.0.1 can't read new version metadata.")

   df <- read_parquet(pq_file)
   expect_s3_class(df, "tbl")
@@ -74,7 +76,7 @@ for (comp in c("lz4", "uncompressed", "zstd")) {
   })

   test_that(paste0("Can see the metadata (feather ", comp, ")"), {
-    skip_if_version_less_than("2.0.0", "Version 1.0.1 can't read new version metadata.")
+    skip_if_arrow_version_less_than("2.0.0", "Version 1.0.1 can't read new version metadata.")

     df <- read_feather(feather_file)
     expect_s3_class(df, "tbl")
@@ -132,7 +134,7 @@ test_that("Can read the file (parquet)", {
 })

 test_that("Can see the metadata (stream)", {
-  skip_if_version_less_than("2.0.0", "Version 1.0.1 can't read new version metadata.")
+  skip_if_arrow_version_less_than("2.0.0", "Version 1.0.1 can't read new version metadata.")
   df <- read_ipc_stream(stream_file)

   expect_s3_class(df, "tbl")
20 changes: 20 additions & 0 deletions r/tests/testthat/helper-skip.R
@@ -124,6 +124,26 @@ skip_on_python_older_than <- function(python_version) {
   }
 }

+if_arrow_version <- function(version, op = `==`) {
+  op(packageVersion("arrow"), version)
+}
+
+if_arrow_version_less_than <- function(version) {
+  if_arrow_version(version, op = `<`)
+}
+
+skip_if_arrow_version_less_than <- function(version, msg) {
+  if (if_arrow_version(version, `<`)) {
+    skip(msg)
+  }
+}
+
+skip_if_arrow_version_equals <- function(version, msg) {
+  if (if_arrow_version(version, `==`)) {
+    skip(msg)
+  }
+}
+
 process_is_running <- function(x) {
   if (force_tests()) {
     # Return TRUE as this is used as a condition in an if statement
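Note on the version comparison used by these helpers: `packageVersion()` returns a `package_version` object, and comparing it against a string compares the version components in order. The base R sketch below shows how a development version such as `18.1.0.9000` orders against release versions. `version_less_than()` is a hypothetical stand-in that takes the installed version as an argument, so it runs without arrow installed.

```r
# Hypothetical stand-in for the helper-skip.R logic: `installed` replaces
# packageVersion("arrow") so the comparison can be tried without arrow.
version_less_than <- function(installed, required) {
  package_version(installed) < package_version(required)
}

# A released 18.1.0 sorts before the 18.1.0.9000 development version,
# so a skip guarded by "18.1.0.9000" would trigger on the release.
release_is_older <- version_less_than("18.1.0", "18.1.0.9000")

# The development version itself and any later release are not "less than".
dev_is_older <- version_less_than("18.1.0.9000", "18.1.0.9000")
next_major_is_older <- version_less_than("19.0.0", "18.1.0.9000")
```

Under these assumptions `release_is_older` is `TRUE` and the other two are `FALSE`, which matches the intent of skipping the new tests on versions older than `18.1.0.9000`.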
28 changes: 19 additions & 9 deletions r/tests/testthat/test-compute-arith.R
@@ -223,22 +223,32 @@ test_that("Math group generics work on Array objects", {
   )

   expect_error(signif(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")
-  expect_error(expm1(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")

   expect_error(cospi(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")
   expect_error(sinpi(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")
   expect_error(tanpi(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")

-  expect_error(cosh(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")
-  expect_error(sinh(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")
-  expect_error(tanh(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")
-
-  expect_error(acosh(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")
-  expect_error(asinh(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")
-  expect_error(atanh(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")
-
   expect_error(lgamma(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")
   expect_error(gamma(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")
   expect_error(digamma(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")
   expect_error(trigamma(Array$create(c(4L, 1L))), "Unsupported operation on `Array`")
 })
+
+test_that("hyperbolic trig functions work on Array objects", {
+  skip_if_arrow_version_less_than("18.1.0.9000", "Hyperbolic trig functions not available until version 19.")
+
+  expect_equal(sinh(Array$create(c(0.6, 0.9))), Array$create(sinh(c(0.6, 0.9))))
+  expect_equal(cosh(Array$create(c(0.6, 0.9))), Array$create(cosh(c(0.6, 0.9))))
+  expect_equal(tanh(Array$create(c(0.6, 0.9))), Array$create(tanh(c(0.6, 0.9))))
+  expect_equal(asinh(Array$create(c(0.6, 0.9))), Array$create(asinh(c(0.6, 0.9))))
+  expect_error(acosh(Array$create(c(0.6, 0.9))), "Invalid: domain error")
+  expect_equal(acosh(Array$create(c(1, 2))), Array$create(acosh(c(1, 2))))
+  expect_error(atanh(Array$create(c(-1, 1))), "Invalid: domain error")
+  expect_equal(atanh(Array$create(c(0.6, 0.9))), Array$create(atanh(c(0.6, 0.9))))
+})
+
+test_that("expm1 works on Array objects", {
+  skip_if_arrow_version_less_than("18.1.0.9000", "expm1 not available until version 19.")
+
+  expect_equal(expm1(Array$create(c(0.00000001, 10))), Array$create(expm1(c(0.00000001, 10))))
+})
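
Note on the `expm1` test input: the value `0.00000001` presumably exercises the small-`x` regime, where `expm1(x)` stays accurate and `exp(x) - 1` loses digits to cancellation. The base R sketch below illustrates this without arrow. Using the two-term series `x + x^2 / 2` as the reference is an assumption made for this illustration, not something taken from the PR.

```r
# Base R only: compare the naive and the numerically stable form near zero.
x <- 1e-8

naive <- exp(x) - 1   # exp(x) is rounded near 1, so the subtraction cancels digits
stable <- expm1(x)    # computed directly, keeps full precision for small x

# Reference from the Taylor series; higher-order terms are negligible at 1e-8
reference <- x + x^2 / 2

rel_err_naive <- abs(naive - reference) / reference
rel_err_stable <- abs(stable - reference) / reference
```

Here `rel_err_stable` should sit near machine precision, while `rel_err_naive` is many orders of magnitude larger, which is why a dedicated `expm1` binding is useful.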