Follow these steps, in order, to run the pipeline in its entirety.
- downloading_arxiv.py
- filtering_pdf.py
- extracting_imgs.md
- crop2pdf.py
- isLine.py
- Install pdf2svg: conda install -c conda-forge pdf2svg
- Run pdf2svg.py (it also matches IDs with isLine)
- svg2time_series.py (still pending)
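
The steps above could be chained with a small driver script. The following is only a sketch, not part of the repository: it assumes each listed .py script sits in one directory and runs without command-line arguments, which the list does not confirm. The names STEPS, build_commands and run_pipeline are hypothetical. extracting_imgs.md is not a Python script and svg2time_series.py is still pending, so both are left out.

```python
import subprocess
import sys
from pathlib import Path

# Script names copied from the list above, in the listed order.
# Assumption: each one can be run as "python <script>" with no arguments.
STEPS = [
    "downloading_arxiv.py",
    "filtering_pdf.py",
    "crop2pdf.py",
    "isLine.py",
    "pdf2svg.py",
]


def build_commands(steps, script_dir="."):
    """Return one argv list per step, using the current Python interpreter."""
    return [[sys.executable, str(Path(script_dir) / name)] for name in steps]


def run_pipeline(steps=STEPS, script_dir=".", dry_run=True):
    """Go through the steps in order and return the commands.

    With dry_run=True (the default) the commands are only printed.
    With dry_run=False each command is executed, stopping at the first failure.
    """
    commands = build_commands(steps, script_dir)
    for cmd in commands:
        print("running:", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Print the planned commands without executing anything.
    run_pipeline(dry_run=True)
```

Calling run_pipeline(dry_run=False) would execute the scripts for real. pdf2svg must already be installed (see the conda command above) before the pdf2svg.py step is reached.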