Wrong parsing of numeric values as part of units #383

Open
bergsmat opened this issue Feb 4, 2025 · 7 comments

bergsmat commented Feb 4, 2025

It seems like strings representing integers cannot be coerced to units (except for 1) and "dot" means exponentiation (except after zero). Can you point me to the documentation? Thanks in advance.

> as_units('1')
1 [1]
> as_units('2')
Error in as_units.call(expr, check_is_valid = check_is_valid) : 
  is.language(x) is not TRUE
> as_units('2.1')
1 [2.]
> as_units('2.2')
1 [2.^2]
> as_units('2.3')
1 [2.^3]
> as_units('0.3')
Error: In ‘`0.`^(3)’, ‘0.’ is not recognized by udunits.

Enchufa2 (Member) commented Feb 4, 2025

Sure, in ?as_units:

‘as_units’, a generic with methods for a character string and
for quoted language. Note, direct usage of this function by
users is typically not necessary, as coercion via ‘as_units’
is automatically done with ‘units<-’ and ‘set_units’.

:) Basically, if you use it, use it in the same way as set_units, i.e. with unit strings, not numbers. The only exception is 1, which has a special meaning in udunits2: it means unitless, basically.

bergsmat (Author) commented Feb 4, 2025

Thanks. I read the help for ?set_units. I didn't see text or examples indicating that dot implies exponentiation (nor in the vignettes). I'm not familiar with that convention so I'm just trying to reassure myself that this is so. It becomes relevant, say, in physiology where estimated glomerular filtration rate is commonly expressed in ml/min/1.73m^2, which could be problematic if not handled delicately.

Enchufa2 (Member) commented Feb 5, 2025

Wow, that's some unit! :) Question: do you mean ml/min/1.73/m^2, in other words, ml/min/(1.73m^2)? Or do you mean exactly ml/min/1.73m^2?

I'm not sure we meant the dot to imply anything. Probably it's just that parsing units is a complex problem, and we may simply not have considered a case like this. We'll need to check the parser and adapt it to support this.

Enchufa2 (Member) commented Feb 5, 2025

In fact, when there is an actual unit after the number, the problem is a different one:

# this is wrong: "1.73m" appears twice in the denominator
unclass(as_units("ml/min/1.73m^2"))
#> [1] 1
#> attr(,"units")
#> $numerator
#> [1] "ml"
#> 
#> $denominator
#> [1] "1.73m" "1.73m" "min"  
#> 
#> attr(,"class")
#> [1] "symbolic_units"

Enchufa2 added the bug label Feb 5, 2025
Enchufa2 changed the title from "How does as_units() treat the dot in a character string?" to "Wrong parsing of numeric values as part of units" Feb 5, 2025

Enchufa2 (Member) commented Feb 5, 2025

Some more digging:

library(units)
#> udunits database from /usr/share/udunits/udunits2.xml

# supported by udunits2, but I think this is NOT what we want
units:::R_ut_format(units:::R_ut_parse("ml/min/1.73m^2"))
#> [1] "9.63391136801542e-09 m⁵·s⁻¹"

# supported too
units:::R_ut_format(units:::R_ut_parse("ml/min/1.73/m^2"))
#> [1] "9.63391136801542e-09 m·s⁻¹"

# avoid our parsing
x <- structure(
  1, units = structure(
    list(numerator = "ml/min/1.73/m^2", denominator = NULL),
    class="symbolic_units"), 
  class="units")

# conversion works as expected
set_units(x, "m/s")
#> 9.633911e-09 [m/s]

# fine... in a way
unclass(as_units("ml/min/1.73/m^2"))
#> [1] 0.5780347
#> attr(,"units")
#> $numerator
#> [1] "ml"
#> 
#> $denominator
#> [1] "m"   "m"   "min"
#> 
#> attr(,"class")
#> [1] "symbolic_units"

# but we ignore the number
set_units(1, "ml/min/1.73/m^2")
#> Warning in `units<-.numeric`(`*tmp*`, value = as_units(value, ...)): numeric
#> value 0.578034682080925 is ignored in unit assignment
#> 1 [ml/m^2/min]

bergsmat (Author) commented Feb 5, 2025

@Enchufa2 To your question above, I think I mean ml/min/(1.73m^2). One might express filtration rate as volume per time. Since values vary by body size, a common practice is to express the result relative to the body surface area of a reference adult (63 kg body weight, 1.7m height). Literally: "milliliters per minute per 1.73 square meters of body surface area". https://en.wikipedia.org/wiki/Glomerular_filtration_rate
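
For reference, the arithmetic behind that reading can be checked in base R without the units package. This is only a sketch: the variable names are made up for illustration, and it assumes the intended meaning is ml/min/(1.73 m^2). It reproduces the conversion factor to m/s that udunits2 reported above for "ml/min/1.73/m^2", and the 1/1.73 factor that set_units() warned it was ignoring.

```r
# Base R only; all variable names here are illustrative assumptions.
ml_in_m3   <- 1e-6   # 1 ml = 1e-6 m^3
min_in_s   <- 60     # 1 min = 60 s
ref_bsa_m2 <- 1.73   # reference body surface area in m^2

# Intended reading: ml/min/(1.73 m^2), expressed in m/s
intended_factor <- ml_in_m3 / min_in_s / ref_bsa_m2

# Numeric factor that is dropped when only "ml/m^2/min" is kept
dropped_factor <- 1 / ref_bsa_m2

print(signif(intended_factor, 7))
#> [1] 9.633911e-09
print(signif(dropped_factor, 7))
#> [1] 0.5780347
```

If the 1.73 is dropped, a value stored as 1 [ml/m^2/min] is about 1.73 times larger than the intended 1 ml/min/(1.73 m^2), which is why the ignored-number warning matters for this use case.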

Enchufa2 (Member) commented Feb 5, 2025

I suspected that much, thanks for confirming. We'll look into this to support this use case.
