Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

type.convert arg #29

Merged
merged 17 commits into from
Dec 20, 2024
Merged

type.convert arg #29

merged 17 commits into from
Dec 20, 2024

Conversation

tdhock
Copy link
Owner

@tdhock tdhock commented Dec 16, 2024

Closes #28

TODO add tests

":",
chromStart=".*?",
"-",
chromEnd="[0-9,]*", keep.digits,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, really like that we get this "default with override" behavior "for free"

Copy link
Contributor

@MichaelChirico MichaelChirico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and it's definitely what I had in mind.

Glancing at the PR, I wonder if it would be a better API to instead offer "default.coercion=identity", then I can use default.coercion=type.convert but also we allow even more flexibility.

put another way, before we have (1) only custom column-wise conversion allowed, here we have (2) custom column-wise conversion, possibly defaulting to type.convert(). But maybe we could get (3) custom column-wise conversion backing up to custom default conversion.

@codecov-commenter
Copy link

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (487f89b) to head (1f364b1).
Report is 25 commits behind head on master.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff            @@
##            master       #29   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           18        18           
  Lines          433       437    +4     
=========================================
+ Hits           433       437    +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tdhock
Copy link
Owner Author

tdhock commented Dec 17, 2024

thanks for sharing your additional ideas. That makes sense to me but I kept the argument name as type.convert instead of your suggested default.coercion because "coercion" was not mentioned anywhere in the code/docs, but "convert" and "conversion" were.

@tdhock
Copy link
Owner Author

tdhock commented Dec 17, 2024

also I noticed that I had to specify type.convert(x,as.is=TRUE) to get character output and avoid a warning, so I kept the type.convert=TRUE alias for that as a shortcut.

@tdhock
Copy link
Owner Author

tdhock commented Dec 17, 2024

do you have a compelling example in mind that we could add to a man page or vignette?

For the SLURM sacct example (v1-capture-first vignette) we convert everything to integer but the existing example handles that by integrating the conversion function with the pattern:

(sacct.df <- data.frame(
  Elapsed = c(
    "07:04:42", "07:04:42", "07:04:49",
    "00:00:00", "00:00:00"),
  JobID=c(
    "13937810_25",
    "13937810_25.batch",
    "13937810_25.extern",
    "14022192_[1-3]",
    "14022204_[4]"),
  stringsAsFactors=FALSE))
int.pattern <- list("[0-9]+", as.integer)
range.pattern <- list(
  "\\[",
  task1=int.pattern,
  list(
    "-",#begin optional end of range.
    taskN=int.pattern
  ), "?", #end is optional.
  "\\]")
nc::capture_first_df(sacct.df, JobID=range.pattern, nomatch.error=FALSE)

gives

> nc::capture_first_df(sacct.df, JobID=range.pattern, nomatch.error=FALSE)
    Elapsed              JobID task1 taskN
     <char>             <char> <int> <int>
1: 07:04:42        13937810_25    NA    NA
2: 07:04:42  13937810_25.batch    NA    NA
3: 07:04:49 13937810_25.extern    NA    NA
4: 00:00:00     14022192_[1-3]     1     3
5: 00:00:00       14022204_[4]     4    NA

@MichaelChirico
Copy link
Contributor

MichaelChirico commented Dec 20, 2024

do you have a compelling example in mind that we could add to a man page or vignette?

I came to this via Advent of Code, specifically this year's Day 14 problem:

https://github.com/MichaelChirico/advent-of-code/blob/dc272d9c342fa4cb35e8519f5f41cd8d3e17b54d/2024/14.R#L1-L14

https://adventofcode.com/2024/day/14

In case that requires log-in to view, here's a sample file:

https://github.com/MichaelChirico/advent-of-code/blob/main/2024/input-data/14-test

The Advent of Code problems generally provide good motivating examples for {nc}, as the problem input often takes the form of structured text files from which you need to extract data in a friendlier format.

@tdhock
Copy link
Owner Author

tdhock commented Dec 20, 2024

thanks for sharing the example.

@tdhock tdhock merged commit 16306ff into master Dec 20, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Offer 'type.convert' in interface?
3 participants