Here is the PDF that fails: `s3://ai2-s2-pdfs/e824/7449ba86efa714e39f8918b750654fc6284e.pdf` (copied locally to `./7449ba86efa714e39f8918b750654fc6284e.pdf`).
See the Slack thread for the discussion.
Stack trace:
```
Input In [90], in generate_mmda_figure_table_pdf(sha, doc_dict, display_)
      9 else:
     10     recipe_doc = CoreRecipe()
---> 11 doc = recipe_doc.from_path(os.path.join(dir_name, name))
     13 doc_dict[name] = doc
     15 figure_table_pred = FigureTablePredictions(doc).predict()

File ~/Documents/codes/git/ai2/s2/mmda/src/mmda/recipes/core_recipe.py:54, in CoreRecipe.from_path(self, pdfpath)
     52 blocks = self.effdet_publaynet_predictor.predict(document=doc)
     53 equations = self.effdet_mfd_predictor.predict(document=doc)
---> 54 doc.annotate(blocks=blocks + equations)
     56 logger.info("Predicting vila...")
     57 vila_span_groups = self.vila_predictor.predict(document=doc)

File ~/Documents/codes/git/ai2/s2/mmda/src/mmda/types/document.py:96, in Document.annotate(self, is_overwrite, **kwargs)
     91     span_groups = self._annotate_span_group(
     92         span_groups=annotations, field_name=field_name
     93     )
     94 elif annotation_type == BoxGroup:
     95     # TODO: not good. BoxGroups should be stored on their own, not auto-generating SpanGroups.
---> 96     span_groups = self._annotate_box_group(
     97         box_groups=annotations, field_name=field_name
     98     )
     99 else:
    100     raise NotImplementedError(
    101         f"Unsupported annotation type {annotation_type} for {field_name}"
    102     )

File ~/Documents/codes/git/ai2/s2/mmda/src/mmda/types/document.py:175, in Document._annotate_box_group(self, box_groups, field_name)
    168 for box in box_group.boxes:
    169
    170     # Caching the page tokens to avoid duplicated search
    171     if box.page not in all_page_tokens:
    172         cur_page_tokens = all_page_tokens[box.page] = list(
    173             itertools.chain.from_iterable(
    174                 span_group.spans
--> 175                 for span_group in self.pages[box.page].tokens
    176             )
    177         )
    178     else:
    179         cur_page_tokens = all_page_tokens[box.page]

IndexError: list index out of range
```
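The last frame indexes `self.pages[box.page]`, so the `IndexError` suggests that at least one predicted box carries a page index with no matching entry in the document's pages. The trace alone does not say why. A minimal diagnostic sketch for confirming this is shown here; the `Box` class is a hypothetical stand-in that only models the `page` attribute seen in the trace, and `split_boxes_by_page_range` is an illustrative helper, not part of mmda.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Box:
    """Hypothetical stand-in for a predicted box; only `page` matters here."""
    page: int


def split_boxes_by_page_range(
    boxes: List[Box], num_pages: int
) -> Tuple[List[Box], List[Box]]:
    """Separate boxes with a valid page index from those that would overflow."""
    valid: List[Box] = []
    out_of_range: List[Box] = []
    for box in boxes:
        # Reject negatives too: Python would silently wrap them around.
        if 0 <= box.page < num_pages:
            valid.append(box)
        else:
            out_of_range.append(box)
    return valid, out_of_range


# Example: three boxes, but the parsed document only has two pages.
boxes = [Box(page=0), Box(page=1), Box(page=2)]
valid, out_of_range = split_boxes_by_page_range(boxes, num_pages=2)
```

Running a check like this on the predictor output before `doc.annotate(...)` would show whether the boxes reference more pages than the parsed document contains. Whether to drop such boxes or treat them as an upstream bug is a separate decision.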