feat: add arXiv paper link to header and adjust PDF parsing logic

- Add arXiv paper link to the header template for easy access to the latest research paper.

- Modify the PDF parsing logic to handle edge cases more accurately, particularly in determining the number of lines in a block based on its height.
myhloli committed Oct 8, 2024
1 parent de60127 commit a71db70
Showing 2 changed files with 11 additions and 1 deletion.
2 changes: 1 addition & 1 deletion magic_pdf/pdf_parse_union_core_v2.py
@@ -191,7 +191,7 @@ def insert_lines_into_block(block_bbox, line_height, page_w, page_h):
     # If the block height is less than n lines of body text, return the block's bbox directly
     if line_height*3 < block_height:
         if block_height > page_h*0.25 and page_w*0.5 > block_weight > page_w*0.25:  # Possibly a two-column layout, so it can be split more finely
-            lines = int(block_height/line_height)
+            lines = int(block_height/line_height)+1
         else:
             # If the block width exceeds 0.4 of the page width, split the block into 3 lines
             if block_weight > page_w*0.4:
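
The one-line change in this hunk only affects the branch for tall, medium-width blocks. As a minimal sketch of the arithmetic involved (the helper `estimate_line_count`, its `add_one` flag, and the sample numbers are hypothetical and not part of `magic_pdf`), `int()` truncates the quotient, so a partial trailing line was previously dropped; with `+1` it is counted. Note that `+1` also adds a line when the height is an exact multiple of `line_height`.

```python
def estimate_line_count(block_height, line_height, add_one=True):
    """Hypothetical helper mirroring the expression changed in the diff."""
    lines = int(block_height / line_height)  # int() truncates toward zero
    if add_one:
        lines += 1  # expression after this commit: int(...) + 1
    return lines


# Illustrative values only: a block 105 units tall with 10-unit lines.
before = estimate_line_count(105.0, 10.0, add_one=False)
after = estimate_line_count(105.0, 10.0, add_one=True)
print(before, after)  # prints "10 11"
```

Whether the extra line is desirable for exact multiples depends on how the caller distributes the lines over the block, which this hunk does not show.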
10 changes: 10 additions & 0 deletions projects/gradio_app/header.html
@@ -90,6 +90,16 @@
 </a>
 </span>

+<!-- arXiv Link. -->
+<span class="link-block">
+<a href="https://arxiv.org/abs/2409.18839" class="external-link button is-normal is-rounded is-dark" style="text-decoration: none; cursor: pointer">
+<span class="icon" style="margin-right: 8px">
+<i class="fas fa-file" style="color: white"></i>
+</span>
+<span style="color: white">Paper</span>
+</a>
+</span>
+
 <!-- Homepage Link. -->
 <span class="link-block">
 <a href="https://opendatalab.com/" class="external-link button is-normal is-rounded is-dark" style="text-decoration: none; cursor: pointer">