Phone_Keyboard - Behavior issues with session construction and feature differences between documentation and code #220
-
Hi @DouglasBellew, thank you for using RAPIDS and for this thorough summary of your investigation into the phone keyboard features. We will add a complete audit of the RAPIDS feature provider for this sensor to our roadmap and will correct any bugs and/or discrepancies in our documentation in a future release. Thank you again!
-
@jenniferfedor Hello! Sorry for the delayed response; I was on vacation for a few weeks. I'm asking these questions because I want to understand how you believe this feature should work. I need this feature for my own analysis, so I have already programmed fixes for these issues (and added automated test cases to confirm they work). I just want to confirm how you would like them to behave so that I can match my changes to that, update the documentation, and upload the fixes in a pull request. No work is required from you other than direction. Questions (with details above): Thanks!
-
Hello,
I was doing some work on RAPIDS that touched on the Phone Keyboard feature. However, when I checked the results, I wasn't getting the answers I expected. Looking through the code, I think there are a few bugs, plus a few behaviors that might be intended (or might be bugs) but don't match what the documentation says.
The first problem is with how data is separated into "sessions". Sessions can currently cross time-segment boundaries, which, depending on where the boundary falls, can leave you with time segments that have sessions with no data, or time segments with data but no sessions. Also, the way sessions are calculated for repeating time segments (the test case is a particular 30-minute segment every day) can leave you with inter-keystroke durations that last from one instance of the time segment to the next. In the test data, one keystroke duration lasts 8 months, on the order of 10^10 ms, which can throw off the average keystroke delay immensely. I've created a solution to these problems: a parameter that lets you keep sessions from breaking across time segments (if desired), plus a change that keeps inter-keystroke durations within a single session instead of the way they are calculated now.
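To make the proposed session rule concrete, here is a minimal sketch in plain Python. It is not the RAPIDS implementation: the function name `split_sessions`, the `(segment_id, timestamp_ms)` event shape, and the 5-second `max_gap_ms` threshold are all assumptions for illustration.

```python
from itertools import groupby


def split_sessions(events, max_gap_ms=5000):
    """Split (segment_id, timestamp_ms) events into sessions.

    A session never crosses a time-segment instance: a change of
    segment_id always starts a new session, even if the gap is short.
    """
    sessions = []
    ordered = sorted(events)  # by segment_id, then by timestamp
    for segment_id, group in groupby(ordered, key=lambda e: e[0]):
        current = []
        for _, ts in group:
            # A gap above the (assumed) threshold closes the running session.
            if current and ts - current[-1] > max_gap_ms:
                sessions.append((segment_id, current))
                current = []
            current.append(ts)
        if current:
            sessions.append((segment_id, current))
    return sessions


# Hypothetical data: typing continues across a segment boundary
# with only a 400 ms gap between the two segment instances.
events = [
    ("seg_a", 0), ("seg_a", 400),
    ("seg_b", 800), ("seg_b", 1100),
]
sessions = split_sessions(events)
```

With this rule each time-segment instance owns its own sessions, so no segment ends up with a session but no data, or data but no session.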
Issues with feature creation:
sessioncount can currently be incorrect for the reasons stated above. With the proposed fix, the existing code for this feature works correctly.
averagesessionlength is also fixed once sessions are created correctly.
averageinterkeydelay is currently calculated by summing the inter-key delays for keystrokes 2->n of a session. This fails because of the frequency issue stated above, and it also gives incorrect results because strokeduration is incorrect for the last keypress of a session.
I've changed the calculation to add up the inter-key durations for keystrokes 1->(n-1) instead, which seems to work correctly.
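As a rough illustration of why the gaps need to stay inside a session, the sketch below compares the two approaches on made-up timestamps. The helper `average_interkey_delay` is hypothetical and only mirrors the idea described here, not the RAPIDS code.

```python
def average_interkey_delay(sessions):
    """Mean gap in ms between consecutive keystrokes, within sessions only."""
    gaps = []
    for timestamps in sessions:
        # n keystrokes yield n-1 gaps; the last keystroke has no successor.
        gaps.extend(b - a for a, b in zip(timestamps, timestamps[1:]))
    return sum(gaps) / len(gaps) if gaps else None


DAY_MS = 24 * 60 * 60 * 1000

# Made-up data: two sessions in consecutive instances of a daily segment.
sessions = [[0, 200, 500], [DAY_MS, DAY_MS + 300]]

within = average_interkey_delay(sessions)

# Treating all keystrokes as one stream lets the day-long gap dominate.
merged = average_interkey_delay([[t for s in sessions for t in s]])
```

Here `within` stays in the hundreds of milliseconds, while `merged` is pulled into the tens of millions by the single gap between segment instances.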
changeintextlengthlessthanminusone
changeintextlengthequaltominusone
changeintextlengthequaltoone
changeintextlengthmorethanone
These features differ between what the documentation says and what the code does. The documentation says: "Number of times a keyboard typing or swiping event changed the length of the current text to less than one fewer character." However, what the code actually gives you is the number of sessions containing keystrokes of each of these types:
Example:
For this, you can see that each of the two time segment groupings has a single session that contains 6 minus-one keypresses. However, what gets reported in the data is "1" (for the single session) instead of "6" (for the number of keypresses).
My question is... which behavior is correct? I could see uses for both ways of doing it. I could add "sessionswithchangeintextlengthequaltominusone" as another feature if both features are desirable?
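The two candidate behaviors can be sketched side by side. The function names and the session contents below are made up for illustration; each session is represented as a list of per-keystroke changes in text length.

```python
def count_keypresses(sessions, delta):
    """Documented reading: number of keystrokes with this length change."""
    return sum(changes.count(delta) for changes in sessions)


def count_sessions(sessions, delta):
    """Observed behavior: number of sessions containing such a keystroke."""
    return sum(1 for changes in sessions if delta in changes)


# Hypothetical segment with one session holding six single-character deletions.
sessions = [[1, 1, -1, -1, 1, -1, -1, -1, 1, -1]]

per_keypress = count_keypresses(sessions, -1)
per_session = count_sessions(sessions, -1)
```

On this data `per_keypress` is 6 and `per_session` is 1, which is the same 6-versus-1 gap described in the example above.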
maxtextlength - the documentation states "Length in characters of the longest sentence(s) contained in the typing text box of any app during the time segment." However, what's actually being reported is the average, across all sessions in that time segment, of each session's maximum text length. What is the desired behavior for this feature?
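A small sketch with made-up text lengths shows how far the two readings of maxtextlength can diverge. The variable names are hypothetical, and each inner list stands for the text lengths seen during one session.

```python
from statistics import mean

# Hypothetical text lengths observed in three sessions of one time segment.
session_text_lengths = [[3, 10, 42], [5, 8], [1, 2, 4]]

per_session_max = [max(lengths) for lengths in session_text_lengths]

# Reading suggested by the documentation: longest text in the whole segment.
documented = max(per_session_max)

# Behavior described in this post: average of the per-session maxima.
observed = mean(per_session_max)
```

For this data `documented` is 42 while `observed` is 18, so the choice matters whenever session lengths vary a lot.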
lastmessagelength - the documentation states "Length of the last text in characters of the sentence(s) contained in the typing text box of any app during the time segment." This description reads as though the feature should return some type of list, but what it returns is the average of the last message length of each session in the time segment. Is this correct?
totalkeyboardtouches - the documentation states "Average number of typing events across all sessions in a time segment instance." It does seem to return the average number of keystrokes per session for a time segment.