Feature request - Option to skip over an element #79
Comments
I have a similar issue. I have custom tags that I want to retain when converting, e.g. I'd like to be able to call something like:
md("<ul><li><foo>bar</foo></li></ul>", keep=['foo'])
and get back:
* <foo>bar</foo>
instead of:
* bar
Or, alternatively (or in addition), have an option to keep all unrecognized elements. I can handle this with a custom converter, but it seems like it should be a pretty common use case, so it'd be nice if there were a simple option for it.
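To make the requested behaviour concrete, here is a minimal standard-library sketch of a "keep" option. It is not markdownify's code or API: `KeepTagsConverter` and `md_keep` are hypothetical names, and the toy converter only understands `<ul>`/`<li>`. It simply shows listed tags being passed through verbatim while everything else is converted.

```python
from html.parser import HTMLParser


class KeepTagsConverter(HTMLParser):
    """Toy HTML-to-Markdown converter: handles <li>, keeps listed tags verbatim."""

    def __init__(self, keep=()):
        super().__init__(convert_charrefs=True)
        # Hypothetical option: lowercase tag names to pass through unchanged.
        self.keep = set(keep)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in self.keep:
            # get_starttag_text() returns the start tag as written in the source.
            self.out.append(self.get_starttag_text())
        elif tag == "li":
            self.out.append("* ")

    def handle_endtag(self, tag):
        if tag in self.keep:
            self.out.append(f"</{tag}>")
        elif tag == "li":
            self.out.append("\n")

    def handle_data(self, data):
        self.out.append(data)


def md_keep(html, keep=()):
    """Convert html to a Markdown bullet list, keeping the tags named in keep."""
    parser = KeepTagsConverter(keep)
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()


print(md_keep("<ul><li><foo>bar</foo></li></ul>", keep=["foo"]))  # * <foo>bar</foo>
print(md_keep("<ul><li><foo>bar</foo></li></ul>"))                # * bar
```

An option to keep all unrecognized elements could follow the same shape, by passing through any tag the converter has no handler for instead of checking a fixed list.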
I want to keep something like:
I know I can edit __init__.py at line ~143 like this, but I want to know how to achieve this in a custom converter.
Something like this, I guess:

from markdownify import MarkdownConverter

class MyConverter(MarkdownConverter):
    def convert_span(self, el, text, convert_as_inline):
        # Keep spans that carry a custom-style attribute as raw HTML.
        if el.get('custom-style'):
            return str(el)
        # Other spans: return the already-converted inner text.
        return text

Then you get:
I'm using markdownify 0.11.6, and it is working like a charm except for one thing. I'm scraping a site that has a YouTube video embedded in an iframe. In this case I just need to keep it unchanged from how the site had it.
An option like --skip 'iframe', for example, would be great (ideally with some criteria, such as matching an id or regex).
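As a rough illustration of what such criteria might look like, the sketch below matches an element by tag name and, optionally, by a regex on its id. `SkipRule` and `should_skip` are hypothetical names, not part of markdownify or its command line. A converter hook could call `should_skip` and return the element's original markup when it is true.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class SkipRule:
    """Hypothetical rule: leave an element unconverted if it matches."""
    tag: str
    id_pattern: Optional[str] = None  # regex searched in the id attribute

    def matches(self, tag: str, attrs: dict) -> bool:
        if tag.lower() != self.tag.lower():
            return False
        if self.id_pattern is None:
            # No id criterion: any element with this tag name matches.
            return True
        return re.search(self.id_pattern, attrs.get("id") or "") is not None


def should_skip(tag: str, attrs: dict, rules) -> bool:
    """Return True if any rule says the element should be kept as raw HTML."""
    return any(rule.matches(tag, attrs) for rule in rules)


rules = [SkipRule("iframe", id_pattern=r"^yt-")]
print(should_skip("iframe", {"id": "yt-player"}, rules))  # True
print(should_skip("iframe", {"id": "ad-frame"}, rules))   # False
print(should_skip("div", {"id": "yt-player"}, rules))     # False
```

Keeping the rule as data rather than code means the same check could be driven by a command-line flag or by a keyword argument.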
The following change produces the desired outcome. It's obviously just a quick hack, but it demonstrates the functionality.
In __init__.py, line ~143:
test.html
Produces: