Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Edit Domain: Improve Text/Categorical to Time conversion #4601

Merged
merged 1 commit into from
Apr 5, 2020

Conversation

robertcv
Copy link
Collaborator

Issue

New TimeVariable is initialized considering its parsing of string but actual data is converted by pandas.
This results in two issues:

  • Edit Domain can convert more formats (like m/d/y) but the TimeVariable is not correctly initialized (have_date and have_time)
  • pandas conversion is applied only when converting from Text but not when converting from Categorical.
Description of changes

Move pandas transformation into a separate function and apply it accordingly.

Includes
  • Code changes
  • Tests
  • Documentation

@codecov
Copy link

codecov bot commented Mar 31, 2020

Codecov Report

Merging #4601 into master will increase coverage by 0.07%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #4601      +/-   ##
==========================================
+ Coverage   83.55%   83.62%   +0.07%     
==========================================
  Files         281      276       -5     
  Lines       56104    55462     -642     
==========================================
- Hits        46877    46382     -495     
+ Misses       9227     9080     -147     

@@ -917,6 +918,19 @@ def test_column_str_repr(self):
d = column_str_repr(v, np.array([0., np.nan, 1.0]))
assert_array_equal(d, ["00:00:00", "?", "00:00:01"])

def test_time_parse(self):
"""parsing additional datetimes by pandas"""
date = ["1/22/20", "1/23/20", "1/24/20"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about when dates are 1/1/20? How do you know, which is day and which is month? M/D/Y is not universal.

Copy link
Collaborator Author

@robertcv robertcv Apr 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If data is not in iso8601 format then month first is assumed. Trying to parse dates without specifying the format is always tricky. Perhaps an additional field to specify d-m-y order would be the solution (to_datetime has dayfirst and yearfirst parameters). However, that would probably be in a new pr. I would really like to merge this since I need it to parse such dates.

@ajdapretnar ajdapretnar changed the title [FIX] OWEeditDomain: Text/Categorical to Time conversion [FIX] OWEditDomain: Text/Categorical to Time conversion Apr 1, 2020
@janezd janezd added the urgent Requires immediate action label Apr 3, 2020
Comment on lines 2533 to 2534
dti = pd.to_datetime(values, errors="coerce")
_values = str_to_time(np.array(values))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You extracted some code into a new function str_to_time to reuse it here, but then have to call pd.to_datetime outside of the new function anyway, to get some extra info...

@lanzagar lanzagar changed the title [FIX] OWEditDomain: Text/Categorical to Time conversion [FIX] Edit Domain: Improve Text/Categorical to Time conversion Apr 5, 2020
@lanzagar lanzagar merged commit 3f4c786 into biolab:master Apr 5, 2020
@robertcv robertcv deleted the enh/editdomain_time branch April 9, 2020 14:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
urgent Requires immediate action
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants