This repository has been archived by the owner on Dec 9, 2018. It is now read-only.
I'm thinking about reading some of the metadata in a PDF file and converting it into accessible HTML "elements".
This metadata could be external links or, why not, embedded videos.
I don't know whether you have thought about this or whether it is planned for the future. I think it would be a very good improvement.
Thanks a lot, and good job.
In a very early version of pdf2htmlEX, some metadata (e.g. title, author) was retrieved; I wanted to extract the title and use it as the title of the HTML. However, I found that this information was inaccurate in many PDFs generated from LaTeX, because of the intermediate conversions (tex -> dvi -> ps -> pdf). So I dropped that code.
It's not hard to extract this information, but I'm not sure what the proper way of storing it in HTML is. I'm almost sure that I should use the <meta> tag, but I'm not sure about the property names.
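One possible way to approach the naming question is to reuse the standard HTML metadata names (`author`, `description`, `keywords`, `generator`) and put the title in `<title>`. The sketch below is in Python and is not pdf2htmlEX code. The function `pdf_info_to_head` and the mapping table are hypothetical, and mapping the PDF `Subject` field to `description` and `Producer` to `generator` is an assumption rather than an established convention. It takes the PDF Info fields as a plain dictionary, so how they are extracted is left open.

```python
from html import escape

# Assumed mapping from PDF Info dictionary keys to standard HTML metadata names.
# Subject -> description and Producer -> generator are illustrative choices only.
PDF_TO_META = {
    "Author": "author",
    "Subject": "description",
    "Keywords": "keywords",
    "Producer": "generator",
}


def pdf_info_to_head(info):
    """Render PDF Info fields as HTML <title> and <meta> lines.

    Empty or missing fields are skipped, so inaccurate placeholders left by
    intermediate conversions can be filtered out by the caller beforehand.
    """
    lines = []
    title = info.get("Title", "").strip()
    if title:
        lines.append(f"<title>{escape(title)}</title>")
    for key, name in PDF_TO_META.items():
        value = info.get(key, "").strip()
        if value:
            # escape() also escapes quotes, so the value is safe in an attribute.
            lines.append(f'<meta name="{name}" content="{escape(value)}">')
    return "\n".join(lines)


if __name__ == "__main__":
    print(pdf_info_to_head({"Title": "A & B", "Author": "Jane Doe"}))
```

Fields that do not fit any standard name (for example creation dates) are simply ignored here; a real implementation would have to decide on names for them, for instance Dublin Core style names, which is exactly the open question above.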
Hi, Toneti.
Thanks for sharing your problem. Have you found a solution? On the subject of PDF text extraction, I wonder whether extracting text from PDF files is much simpler than a full PDF-to-text conversion. Something is wrong with my PDF viewer, and I am looking for a method that helps with this process. Any suggestions would be appreciated. Thanks in advance.
Best regards,
Pan