Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I38 generic var templates #40

Open
wants to merge 77 commits into
base: master
Choose a base branch
from
Open

Conversation

QSparks
Copy link
Contributor

@QSparks QSparks commented Oct 11, 2024

Description

This PR introduces new classes and functionality to handle generic scalar and vector climate variables, allowing users to compute statistics for a broader range of climate data.

Generic Scalar and Vector Classes

  • Added ClimdexGenericScalar and ClimdexGenericVector classes for scalar variables (e.g., humidity, snow depth) and vector variables (e.g., wind speed and direction).
  • Both classes support raw data input and reading from CSV files.

Scalar Data Handling

  • compute.stat.scalar() function to compute statistics (mean, max, min, sum, sd, var) for scalar data.
  • Support for exact date calculations for max and min statistics.
  • Functions to read scalar data from CSV (climdexGenericScalar.csv) and construct raw scalar objects (climdexGenericScalar.raw).

Vector Data Handling

  • compute.stat.vector() function for vector data, supporting statistics: max, min, mean, sum, circular_mean, sd, and circular_sd.
  • Supports formats: "polar", "cartesian", and "cardinal".
  • Direction-based filtering using direction.range.
  • Functions to read vector data from CSV (climdexGenericVector.csv) and construct raw vector objects (climdexGenericVector.raw).

Circular Statistics Support

  • Added compute_circular_mean() and compute_circular_sd() for directional data (e.g., wind direction), using the circular package.

Data Filtering by Direction

  • Added filter_by_direction_range() to filter vector data based on direction range.

Utility Functions

  • Conversion functions:
    • convert_cartesian_to_polar()
    • convert_polar_to_cartesian()
    • convert_degrees_to_cardinal()
      for vector data in different coordinate formats.

Tests Added

  • Unit tests for scalar and vector statistics:
    • Scalar statistics: mean, sum, sd, var, max, min, and exact date support.
    • Vector statistics for magnitude and direction, including circular_mean and circular_sd.
    • Direction-based filtering and support for different vector formats.
    • NA handling.

This commit adds the 'as.df' parameter to relevant functions, allowing users to choose whether results should be returned as data frames, including the exact date, or start and end dates of climate extremes. Additionally, utility functions like 'ymd.dates' and 'exact.date' are introduced for date handling.
@QSparks QSparks self-assigned this Oct 11, 2024
@QSparks QSparks marked this pull request as ready for review October 28, 2024 15:29
Copy link

@rod-glover rod-glover left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, Quintin! The code is very clean and clear.

The tests look solid, a lot of obvious cases covered, but I have not dug in to think through all the non-obvious ones. Are there any that occur to you after the fact?

A kind of generic question: Is there a prettifier/formatter for R? It seems possible it might be useful to standardize the code format, as I see a few indentation mistakes/inconsistencies and the like. (For application in a separate PR, I would think.)

R/generic_stats.R Outdated Show resolved Hide resolved
#' If `include.exact.dates = TRUE`, exact dates are returned for max/min statistics.
#' The function can also filter the data based on a specified degree range of directions (using `direction.range`).
#'
#' @seealso \code{\link{compute.stat.scalar}}, \code{\link{compute.gen.stat}}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible something might be a bit wonky with the docs ... this doesn't look as if it corresponds to what's in the doc PDF you sent me, see pp 64-65 of that document.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind providing a few more details on this? If it pertains to section reordering, that behavior is expected from roxygen.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks right down to the Details section bottom of p. 64. Then text reading

Value

A list containing the computed statistic for magnitude and, if applicable, the computed statistic for
direction. For circular statistics (e.g., circular_mean, circular_sd), the result is returned for
directions in degrees. If include.exact.dates = TRUE, the function returns exact dates for the
max/min statistics.

Note

This function is designed for vector climate data, where the data includes both a magnitude and
direction component. For scalar data, use compute.stat.scalar instead.

appears to be inserted before the See Also section on p. 65, where everything appears to pick up normally again.

Maybe this is roxygen, but it doesn't semantically make sense to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing out this discrepancy. I've adjusted the roxygen comments so that their ordering matches the generated PDFs.

Roxygen automatically reorders the documentation tags to match the base R standard, as mentioned here:

"Note also that the order in which tags appear in your roxygen comments (or even in handwritten .Rd files) does not dictate the order in rendered documentation. The order of presentation is determined by tooling within base R."

I looked for an official documentation order, and the closest I found was this Writing R Extensions manual, but it doesn't encompass the full list of available tags in roxygen.

@QSparks
Copy link
Contributor Author

QSparks commented Oct 28, 2024

Great work, Quintin! The code is very clean and clear.

Thanks, Rod!

The tests look solid, a lot of obvious cases covered, but I have not dug in to think through all the non-obvious ones. Are there any that occur to you after the fact?

I will continue to think of test cases while I work on #39. They will likely be focused on input validation.

A kind of generic question: Is there a prettifier/formatter for R? It seems possible it might be useful to standardize the code format, as I see a few indentation mistakes/inconsistencies and the like. (For application in a separate PR, I would think.)

There are two complementary packages, styler and lintr. Styler will prettify and lintr ensures conformity with a specific style (the Tidyverse style) and can be added to our CI workflow. I'll open a new issue for this.

@QSparks QSparks mentioned this pull request Oct 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add generic templates to support additional scalar and vector climate variables
2 participants