Add support for finding PVR Channels and PVR Broadcasts #7

Open
wants to merge 3 commits into
base: master

ta264 commented Jun 3, 2017

This is based on the work by @freemans13 here:
m0ngr31/kanzi#47

Hopefully I've managed to make it fit in with the new flask-ask layout.

    except:
      pass
    wordified = wordified + word + " "
  return wordified[:-1]

Collaborator

We have a function for this already -- words2digits(). It also supports German.

Author

That was added after I forked. Should have checked, sorry. I've rebased onto the latest version and switched to the proper function.

name = re.sub(r"\sch\s", " channel ", name)
name = re.sub("^channel", "", name)
name = re.sub(r"(?<=\D)(?=\d)|(?<=\d)(?=\D)", " ", name)
name = words2digits(name, lang=lang)

Collaborator

You shouldn't call words2digits() here because the fuzzy matcher will do it later on. Also, have a look at the other things we added to sanitize_name() -- Amazon's new builder is more restrictive about which characters are allowed in slots.

Author

It might be easier just to call sanitize_name from within sanitize_channel?

I put in the words2digits here because I couldn't get the matching to work reliably otherwise, but I've probably misunderstood something somewhere...

Channels on the backend often have "number words" in the name, e.g. "BBC One HD". When you say "ask kodi to switch to channel BBC One HD", you seem to get the string "BBC 1 hd" in the JSON from Amazon. I think it's this string that gets words2digits applied to it by the fuzzy matcher?

Without this words2digits call in sanitize_channel, the simple match fails (BBC 1 hd != BBC One HD) and the digits2roman fuzzy match kept returning CBBC HD instead. What I've tried to do in sanitize_channel is make all the channel names more similar to the string you get from Amazon so that the simple match stands a chance.
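To make the mismatch described above concrete, here is a minimal sketch. The names `NUMBER_WORDS`, `normalize`, `simple_match` and `normalized_match` are hypothetical and exist only for this illustration; the tiny word map is a stand-in for the project's words2digits(), whose actual behaviour is not shown in this thread. It takes no position on where such normalization should live.

```python
# Hypothetical, deliberately tiny word map; the project's words2digits() is the real helper.
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5"}


def normalize(text):
    """Lower-case the text and swap known number words for digits."""
    return " ".join(NUMBER_WORDS.get(w, w) for w in text.lower().split())


def simple_match(heard, channels):
    """Exact, case-insensitive comparison with no number handling."""
    for channel in channels:
        if channel.lower() == heard.lower():
            return channel
    return None


def normalized_match(heard, channels):
    """Same comparison, but with both sides passed through normalize()."""
    for channel in channels:
        if normalize(channel) == normalize(heard):
            return channel
    return None


channels = ["BBC One HD", "CBBC HD"]
print(simple_match("BBC 1 hd", channels))      # None
print(normalized_match("BBC 1 hd", channels))  # BBC One HD
```

The exact comparison fails only because one side spells the number and the other uses a digit; once both sides go through the same normalization, the straight match succeeds without any fuzzy step.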

Collaborator

Text mangling is what the fuzzy matching is for; it shouldn't happen in sanitize_channel().
If we need to adjust the fuzzy matching to make it work more reliably, we can, but sanitize_channel() isn't the place to do that.

Furthermore, in German, Amazon spits out number-words rather than digits.

Author

Fair enough. In that case I'll change it to call sanitize_name from sanitize_channel and remove the words2digits call? Are you happy with keeping the rest of sanitize_channel?

  • switching + to plus (so channel 4+1 -> channel 4 plus 1)
  • removing "channel" since it's so common it doesn't seem to be useful in the matching (I can combine the "ch" and "channel" regexps into one)
  • putting breaks between numbers and words (4seven -> 4 seven)
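The three steps listed above could look roughly like this. It is a sketch only: `sanitize_channel_sketch` is a hypothetical name, the combined "ch"/"channel" pattern is one possible way to merge the two regexps, and the digit split runs before the word removal so that names like "channel4" are handled too. The digit-boundary regexp is the one quoted earlier in this review.

```python
import re


def sanitize_channel_sketch(name):
    """Hypothetical sketch of the three proposed steps; not the PR's actual code."""
    # Spell out "+" so that "4+1" becomes "4 plus 1".
    name = name.replace("+", " plus ")
    # Put a break wherever a digit meets a non-digit ("4seven" -> "4 seven").
    name = re.sub(r"(?<=\D)(?=\d)|(?<=\d)(?=\D)", " ", name)
    # Drop the filler words "ch" and "channel" with a single combined pattern.
    name = re.sub(r"\b(?:channel|ch)\b", " ", name, flags=re.IGNORECASE)
    # Collapse the extra whitespace the steps above leave behind.
    return " ".join(name.split())


print(sanitize_channel_sketch("Channel 4+1"))  # 4 plus 1
print(sanitize_channel_sketch("4seven"))       # 4 seven
```

Note that there is no words2digits() call here, in line with the review comment that number handling belongs to the fuzzy matcher.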

Collaborator

jingai, Jun 4, 2017

I think where you're getting confused is that sanitize_name() was for sanitizing the output from Kodi for the slots, primarily. It's not intended to sanitize the user's input, with the sole exception of a quick attempt at a straight ('simple') match.

We handle input-mangling (that is, stuff from Alexa) in Kodi.matchHeard(), where it tries to massage the input to more closely match how it's stored in Kodi.
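The split described above (a quick straight match first, then input massaging) can be sketched as follows. This is not the project's Kodi.matchHeard(): `match_heard_sketch` is a hypothetical name, and the standard library's `difflib.get_close_matches` is swapped in as a generic stand-in for whatever fuzzy matcher the project actually uses.

```python
import difflib


def match_heard_sketch(heard, stored_names, cutoff=0.6):
    """Return the stored name that best matches what was heard, or None."""
    # Index the stored names by a lower-cased key so the original spelling is returned.
    lookup = {name.lower(): name for name in stored_names}
    key = heard.lower().strip()

    # Stage 1: the quick straight ('simple') match.
    if key in lookup:
        return lookup[key]

    # Stage 2: fall back to fuzzy matching on the heard text (difflib is an assumption here).
    close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    return lookup[close[0]] if close else None


print(match_heard_sketch("bbc one hd", ["BBC One HD", "CBBC HD"]))  # BBC One HD
print(match_heard_sketch("chanel five", ["Channel Five", "Dave"]))  # Channel Five
```

Keeping the stored names untouched and doing all the adjustment on the heard text means the sanitizer only has to care about what the slots accept.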

@@ -479,6 +486,44 @@ def FindSong(self, heard_search):

    return None, None

  def FindPVRChannel(self, heard_search):
    print 'Searching for channel "%s"' % (sanitize_name(heard_search))

Collaborator

Use heard_search.encode("utf-8") instead of sanitize_name(heard_search) here and in all other print statements. I know we used to do this, but with the more restrictive sanitization especially, it makes debugging a bit harder since the string we see in the log might not look anything like the actual input string.
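A small illustration of why logging the sanitized string can mislead. `restrictive_sanitize` is a hypothetical stand-in for a strict slot sanitizer, not the project's sanitize_name(), and the sketch is written for Python 3, where a str can be printed directly; the `.encode("utf-8")` call suggested above serves the Python 2 code in this PR.

```python
import re


def restrictive_sanitize(name):
    """Hypothetical strict sanitizer: keep only ASCII letters, digits and spaces."""
    return " ".join(re.sub(r"[^A-Za-z0-9 ]", " ", name).split())


def search_log_line(text):
    """Build the debug message for a channel search."""
    return 'Searching for channel "%s"' % text


heard = "Channel 4+1"
print(search_log_line(restrictive_sanitize(heard)))  # Searching for channel "Channel 4 1"
print(search_log_line(heard))                        # Searching for channel "Channel 4+1"
```

The first line hides the "+" that was actually heard, which is exactly the detail someone debugging a failed match would need.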

Collaborator

jingai commented Jun 4, 2017

edit: sorry, I posted my comments on the wrong PR >.>
