Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Summarizer still tries to process video links and picture galleries #27

Open
jake-patt opened this issue Aug 12, 2013 · 6 comments
Open
Assignees

Comments

@jake-patt
Copy link
Collaborator

No description provided.

@ghost ghost assigned Taiiwo Aug 12, 2013
@Taiiwo
Copy link
Collaborator

Taiiwo commented Aug 12, 2013

Pretty sure this is fixed now. Can you find some broken links?
EDIT: Oh, I see what you mean. That's a bug in the Readablilty API. Not sure how to fix that other than to revert back to the old version, or do a really dirty fix like: Only use the old version when 'video' is in the article title.

@jake-patt
Copy link
Collaborator Author

Environment and 2 weeks is an example. How did the old API handle it? Even returning 'cannot summarize article' would be an improvement

@Taiiwo
Copy link
Collaborator

Taiiwo commented Aug 12, 2013

The old API looked at all text inside a div with a specific id/class. The current one takes an URL and looks for the largest body of text within the page. Checking for the id before loading the page would take too long to load, so the only thing I can think of would be to use the old method for all URLs with theguardian.com/video/ in the URL.

@jake-patt
Copy link
Collaborator Author

Good idea, what did the old method return? If it still doesn't make sense then returning an error would be a better option

@Taiiwo
Copy link
Collaborator

Taiiwo commented Aug 12, 2013

Pushed quick fix to master. Will now error if it starts talking about autoplay. '/video/' is floating in the middle of the URL, and can't be parsed nicely without affecting future links.

@edlongman
Copy link
Owner

Can't a regular expression sort that?
Ok can you pull since I'm on mobile
On 12 Aug 2013 18:56, "Taiiwo" [email protected] wrote:

Pushed quick fix to master. Will now error if it starts talking about
autoplay. '/video/' is floating in the middle of the URL, and can't be
parsed nicely without affecting future links.


Reply to this email directly or view it on GitHubhttps://github.com//issues/27#issuecomment-22512145
.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants