Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use st_apply to calculate mutltiple aggregates for each pixel in one function-call #726

Open
RobinKohrs opened this issue Nov 29, 2024 · 3 comments

Comments

@RobinKohrs
Copy link

I have a stars-object like this one:
https://drive.google.com/file/d/1qcMATPjSyw-Rk9WpfFTa1rpr7dtOUxsu/view?usp=drive_link

It looks like this:

nc = read_stars("~/Desktop/SPARTACUS - Spatial Dataset for Climate in Austria Datensatz_202301_202410.nc") %>% 
  rename(value=1) %>% 
  mutate(value=as.numeric(value))
nc
stars object with 3 dimensions and 1 attribute
attribute(s):
       Min. 1st Qu. Median     Mean 3rd Qu.  Max.
value   5.2    54.8   86.7 105.2152   131.6 651.3
dimension(s):
     from  to offset delta                   refsys                    values x/y
x       1 136 487000  1000 ETRS89 / Austria Lambert                      NULL [x]
y       1  99 491000 -1000 ETRS89 / Austria Lambert                      NULL [y]
time    1  22     NA    NA                  POSIXct 2023-01-01,...,2024-10-01   

Now I want to find some aggregates for each pixel.

1. For each pixel how often was the value above 0

2. When (which date) was the last time a value was above 0

I think I can compute both statistics with two calls to st_apply like this (although I'd prefer to get the date in the second directly instead of the index, yet I don't know how to do that...)

times = st_apply(nc, 1:2, function(x){
  ifelse(all(is.na(x)), NA, sum(x > 0))}, .fname = "times"
)

last_time = st_apply(nc, 1:2, function(x) {
  ifelse(all(is.na(x)), NA, max(which(x > 0)))}, .fname = "last_time")
)

And the best case would be to compute the two statistics in one call to st_apply. I'm thinking something similar to the summarise from dplyr where I can compute as many summaries for each group as I want. Is that possible with stars? Am I missing something obvious and should use other functions?

@edzer
Copy link
Member

edzer commented Dec 1, 2024

The README contains an example where you get time of maximum back as array value. st_apply returns multiple values into a new dimension, so you cannot combine that with your "times". But you can combine them if you'd just want the numbers:

fn = function(x) {
	if (all(is.na(x))) {
		ret = c(NA, NA) 
	} else {
		ret = c(sum(x > 0), max(which(x > 0)))
	}
	setNames(ret, c("times", "last_time"))
}
st_apply(nc, 1:2, fn)
# stars object with 3 dimensions and 1 attribute
# attribute(s):
#         Min. 1st Qu. Median Mean 3rd Qu. Max.
# value    22      22     22   22      22   22
# dimension(s):
#    from  to         refsys                           values x/y
# fn    1   2             NA             times    , last_time    
# x     1 136 WGS 84 (CRS84) [136x99] 14.49 [°],...,16.33 [°] [x]
# y     1  99 WGS 84 (CRS84) [136x99] 47.39 [°],...,48.31 [°] [y]
# curvilinear grid

note that the result dimension is now the first.

@RobinKohrs
Copy link
Author

Thank you very much @edzer! I think I'm still thinking a bit wrongly about dimensions and attributes... In my mind it made more 'sense' if both values from the results-vector would have become a new attribute instead of one being a dimension and the other an attribute. Similar to this:

res = c(times, last_time)
res

which gives me this

stars object with 2 dimensions and 2 attributes
attribute(s):
           Min. 1st Qu. Median Mean 3rd Qu. Max.
times        22      22     22   22      22   22
last_time    22      22     22   22      22   22
dimension(s):
  from  to offset delta                   refsys x/y
x    1 136 487000  1000 ETRS89 / Austria Lambert [x]
y    1  99 491000 -1000 ETRS89 / Austria Lambert [y]

Does that make any sense?

@edzer
Copy link
Member

edzer commented Dec 2, 2024

Does that make any sense?

If they are of the same mode (numeric) it does not matter; choose what's convenient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants