https://lyricsnary.com/dirty-little-secret-nora-fatehi-zack-knight/
From this page I am not able to extract the images and the iframe, which is basically an embedded YouTube URL.
Basically, I am trying to extract lyrics from a lyrics website. I am able to extract the HTML elements, but not the images and iframes, which contain images from YouTube and the embedded YouTube video URL.
The library is geared towards text extraction; on the page you mention, all of the main text is extracted correctly. Keeping elements containing YouTube videos would require additional code.
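As a rough illustration of what such additional code could look like, here is a minimal sketch that collects `img` and `iframe` sources from raw HTML using only Python's standard library. It is not part of trafilatura: the names `MediaCollector`, `collect_media`, and `SAMPLE_HTML` are hypothetical, the sample markup is made up for the example, and the fallback to a `data-src` attribute is an assumption about lazy-loaded images that may not apply to the page in question.

```python
from html.parser import HTMLParser


class MediaCollector(HTMLParser):
    """Collect the source URLs of <img> and <iframe> elements."""

    def __init__(self):
        super().__init__()
        self.images = []
        self.iframes = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Assumption: lazy-loaded media may keep the URL in data-src.
        src = attributes.get("src") or attributes.get("data-src")
        if not src:
            return
        if tag == "img":
            self.images.append(src)
        elif tag == "iframe":
            self.iframes.append(src)


def collect_media(html):
    """Return the image and iframe URLs found in an HTML string."""
    parser = MediaCollector()
    parser.feed(html)
    parser.close()
    return {"images": parser.images, "iframes": parser.iframes}


# Made-up markup standing in for a downloaded lyrics page.
SAMPLE_HTML = """
<article>
  <p>Some lyrics text</p>
  <img src="https://example.org/cover.jpg" alt="cover">
  <iframe src="https://www.youtube.com/embed/VIDEO_ID"></iframe>
</article>
"""

media = collect_media(SAMPLE_HTML)
# Keep only the iframes that look like YouTube embeds.
youtube_embeds = [url for url in media["iframes"] if "youtube.com/embed/" in url]
print(media["images"])
print(youtube_embeds)
```

In practice the HTML string would be the downloaded page, and the collected URLs could be stored alongside the text that trafilatura extracts.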
adbar changed the title from "is there a way to extract iframe tag including with other tags" to "Extraction of Youtube iframes and img elements with links" on Dec 5, 2022
Not able to fetch image tags
Not able to fetch iframe tags.
Commands run from the command prompt on a Windows machine:
trafilatura --sitemap "https://www.lyricspulp.com/" --list > linklist.txt
trafilatura --sitemap homepage --list > linklist.txt
trafilatura -i linklist.txt --xml -o outputfile.txt
trafilatura -i linklist.txt --formatting --links --images --no-comments --xml -o outputfile.txt