fix(magic_pdf): correct range for images in document analysis

- Update the range used to generate images_with_extra_info to match the number of images
- This fixes a potential IndexError when the number of images differs from the dataset length
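
The failure mode can be reproduced with a minimal sketch. It assumes that only a subset of the dataset's pages ends up in `images` (for example, because of a page-range filter); the `dataset_pages`, `start_page`, `end_page`, and `lang` names are hypothetical stand-ins, not the real magic_pdf objects.

```python
# Hypothetical stand-ins for the real objects in doc_analyze().
dataset_pages = ["p0", "p1", "p2", "p3", "p4"]  # plays the role of `dataset`
start_page, end_page = 1, 3                     # assumed page filter
ocr, lang = True, "en"                          # plays the role of `ocr`, `dataset._lang`

# Only the pages inside the assumed filter are collected, so `images`
# is shorter than the dataset.
images = [
    page for i, page in enumerate(dataset_pages)
    if start_page <= i <= end_page
]


def build_old():
    # Old behaviour: iterates over the dataset length and indexes into
    # the shorter `images` list.
    return [(images[index], ocr, lang) for index in range(len(dataset_pages))]


def build_new():
    # Fixed behaviour: iterates over the list that is actually indexed.
    return [(images[index], ocr, lang) for index in range(len(images))]


try:
    build_old()
    old_raised = False
except IndexError:
    old_raised = True

images_with_extra_info = build_new()
```

Bounding the range by `len(images)` keeps the loop tied to the list being indexed, so the comprehension cannot run past its end whichever pages were collected.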

Commit 67b31a78 authored Apr 14, 2025 by myhloli

Showing with 1 addition and 1 deletion

magic_pdf/model/doc_analyze_by_custom_model.py +1 -1

--- a/magic_pdf/model/doc_analyze_by_custom_model.py
+++ b/magic_pdf/model/doc_analyze_by_custom_model.py
@@ -147,7 +147,7 @@ def doc_analyze(
            images.append(img_dict['img'])
            page_wh_list.append((img_dict['width'], img_dict['height']))
-    images_with_extra_info = [(images[index], ocr, dataset._lang) for index in range(len(dataset))]
+    images_with_extra_info = [(images[index], ocr, dataset._lang) for index in range(len(images))]
    if len(images) >= MIN_BATCH_INFERENCE_SIZE:
        batch_size = MIN_BATCH_INFERENCE_SIZE