Jannocoalesce #282
Codecov Report
Attention: Patch coverage is
@@            Coverage Diff             @@
##           master     #282      +/-   ##
==========================================
- Coverage   68.32%   68.22%   -0.11%
==========================================
  Files          25       26       +1
  Lines        3375     3468      +93
  Branches      376      390      +14
==========================================
+ Hits         2306     2366      +60
- Misses        693      712      +19
- Partials      376      390      +14
OK, @nevrome, this is ready for review.
Looks pretty cool. I will run some tests on the command line now.
OK, so I tried to run it for the open minotaur-archive PR here: poseidon-framework/minotaur-archive#2. Some minor observations:
Beyond that and most importantly ❗: I see two ways to fix this. Either we go with your initial idea and allow coalescing (?) by additional columns, e.g. the
Funny how this issue goes back to the question of what exactly a Poseidon_ID is. Note that I wrote about this for the paper in Supp. Text 5.2.
I was thinking that the suffixes would cause issues. I wonder if a
Potentially yes. If it was an
Edit: I still think the suffix is a good thing, btw.
OK, I think we could easily add a flag to ignore suffixes in the target, as well as custom key-lookup columns. I can implement both of these changes easily. Thanks for the feedback.
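For readers following along, here is a minimal Python sketch of the two ideas proposed above: a configurable key column for the lookup, and an optional regex that strips a suffix from the target key before matching. It is only an illustration, not the code in this PR. The names `coalesce`, `source_key`, `target_key` and `strip_regex`, the `_MNT` suffix and the sample rows are all hypothetical, and the sketch assumes that coalescing means "fill a target field only when it is empty".

```python
import re


def coalesce(source_rows, target_rows, source_key="Poseidon_ID",
             target_key="Poseidon_ID", strip_regex=None):
    """Fill empty target fields from the source row with the matching key.

    Hypothetical sketch: rows are plain dicts, and a field counts as empty
    when it is missing, None or "".
    """
    pattern = re.compile(strip_regex) if strip_regex else None
    # Index the source rows by their key column for direct lookup.
    source_by_key = {row[source_key]: row for row in source_rows}

    merged_rows = []
    for target in target_rows:
        lookup = target[target_key]
        if pattern is not None:
            # Strip the matched part (e.g. a suffix) for the lookup only;
            # the target row keeps its original key.
            lookup = pattern.sub("", lookup)
        merged = dict(target)  # copy, so the input rows are not mutated
        source = source_by_key.get(lookup)
        if source is not None:
            for column, value in source.items():
                if column == source_key or value in (None, ""):
                    continue
                if merged.get(column) in (None, ""):
                    merged[column] = value
        merged_rows.append(merged)
    return merged_rows


# Made-up example rows.
source = [{"Poseidon_ID": "Ind1", "Country": "Greece", "Genetic_Sex": "F"}]
target = [
    {"Poseidon_ID": "Ind1_MNT", "Country": "", "Genetic_Sex": "U"},
    {"Poseidon_ID": "Ind2_MNT", "Country": ""},
]
result = coalesce(source, target, strip_regex=r"_MNT$")
```

In this sketch the first target row receives `Country` from the source but keeps its own `Genetic_Sex` and its suffixed ID, while the second row has no source counterpart and is returned unchanged. Without `strip_regex`, neither row would match.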
OK, so I've added quite a number of features:
This should now help with the cases we discussed above. Note that I had to put in one safety check: If regexes are given, it is possible that Janno rows differ in the
It would be nice if you could try things out. I've added a number of automatic tests now to
Note that I think we need a bit more output with statistics on how many rows have been edited. I will work on that now.
OK, I've added some log-output. Here is an example run on the test-data in the package:
And I also checked whether the in-place writing works, and it does (if you omit the
So I think this is ready for another try, @nevrome
I looked at the code and found it quite brilliant. Only two comments:
I see that the tests are failing because of some issue with the base monad. I'll look into that now to see if I can easily fix it. Afterwards I'll go back to my use case in and for the minotaur archive.
So the first thing I tried was this
This is obviously wrong (missing qualifier before the
in this case, which left me wondering if it actually worked or not. I think we should report this (probably common) failure more clearly, maybe with a message like this:
When I used a correct regex (
I wonder, though, if we should rename
I'll now prepare a PR with the changes I suggested here and in the comment above.
See my PR #284. When running
I want to point out how well the regex solution works here. Still a big fan of that.
Great, thanks. I'll take a look at your PR at once. Sorry for letting this hang stale for so long!
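One way to make a non-matching regex visible, sketched here in Python under stated assumptions: collect the target keys that find no source row after stripping, and print a summary. This is not the log output of the tool or of this PR. The function names, the two regexes, the sample IDs and the summary wording are all made up for illustration.

```python
import re


def unmatched_target_keys(source_keys, target_keys, strip_regex=None):
    """Return the target keys that have no source key after stripping."""
    pattern = re.compile(strip_regex) if strip_regex else None
    known = set(source_keys)
    missing = []
    for key in target_keys:
        lookup = pattern.sub("", key) if pattern is not None else key
        if lookup not in known:
            missing.append(key)
    return missing


def summarise(target_keys, missing):
    """Build a one-line summary (hypothetical wording)."""
    matched = len(target_keys) - len(missing)
    return (f"{matched} of {len(target_keys)} target rows matched a source row; "
            f"{len(missing)} unmatched")


# Made-up IDs with a hypothetical suffix.
source_keys = ["Ind1", "Ind2"]
target_keys = ["Ind1_MNT", "Ind2_MNT"]

# A regex that never matches these keys leaves them unchanged, so nothing is found.
bad = unmatched_target_keys(source_keys, target_keys, strip_regex=r"_MNT.+$")
# A regex that strips the suffix lets every target key find its source row.
good = unmatched_target_keys(source_keys, target_keys, strip_regex=r"_MNT$")

print(summarise(target_keys, bad))
print(summarise(target_keys, good))
```

A summary of this kind would make a run in which nothing matched immediately distinguishable from a successful one.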
…ustments Jannocoalesce minor adjustments
OK, just for the record, I think we discussed that you, @nevrome, would perhaps add a few small things to this PR based on your experience with poseidon-framework/minotaur-archive#5. So I will stand back for now unless told otherwise.
… work, but I'm not sure if it's fast enough
…ustments2 more adjustements for jannocoalesce
OK, I've finished the CLI API and functionality, and it compiles. I haven't tested it yet.