FatalError: Segmentation fault is detected by the operating system. #362
Comments
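I tried following this tutorial: https://github.com/opendatalab/MinerU/blob/master/docs/README_Ubuntu_CUDA_Acceleration_zh_CN.md
My CUDA and driver versions are as follows:
Question: will a mismatch between the CUDA version and the driver version affect normal operation of this project? This is a shared server, so I cannot change the driver version.
A driver version mismatch is not a big problem. Did this error occur at step 9 or step 10 of the tutorial?
I just tested it: with CUDA acceleration enabled it fails, while CPU mode works fine. Step 9 reports the error, and the problem also occurs at step 10.
Please upload a sample PDF here so we can debug it. If the problem already appears at step 9 of the tutorial, the system is incompatible, and you may need to try an Ubuntu 22.04 Docker container.
Here are several of my test cases. They are all scanned documents containing plain text, simple tables, complex tables, images, and other elements. The files also all have page headers and watermarks, so they are fairly hard to recognize.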
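I followed https://github.com/opendatalab/MinerU/blob/master/docs/README_Ubuntu_CUDA_Acceleration_zh_CN.md starting from step 4. It reports an error and I don't know why.
I went step by step from step 4, and this error appears when I run the demo at step 8. It occurs with both CPU and CUDA.
Please upload the complete error stack trace.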
`(min) bigdata@gpu2 Miner $ magic-pdf pdf-command --pdf small_ocr.pdf
sys.platform linux
PyTorch built with:
[08/08 17:08:52 detectron2]: Command line arguments: {'config_file': '/home/bigdata/.conda/envs/min/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml', 'resume': False, 'eval_only': False, 'num_gpus': 1, 'num_machines': 1, 'machine_rank': 0, 'dist_url': 'tcp://127.0.0.1:57823', 'opts': ['MODEL.WEIGHTS', '/home/bigdata/projects/ysl/paint/Miner/PDF-Extract-Kit/models/Layout/model_final.pth']}
[08/08 17:08:54 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /home/bigdata/projects/ysl/paint/Miner/PDF-Extract-Kit/models/Layout/model_final.pth ...
C++ Traceback (most recent call last):
0 at::_ops::conv2d::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::SymInt)
Error Message Summary:
FatalError: Segmentation fault (core dumped)`
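If you had installed from step 4 of the tutorial, PyTorch would not be the cu118 build. Following the tutorial normally installs the cu121 build of PyTorch. Did you really install according to the tutorial?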
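The cu118 and cu121 labels in the comment above are the local version tag of the PyTorch wheel, the same tag visible in strings such as `2.3.1+cu121` in the log of this issue. A minimal sketch of reading that tag from a version string follows. The helper name `cuda_tag` is hypothetical, and the rule that the last digit is the minor version is an assumption that holds for tags like cu118 and cu121.

```python
import re
from typing import Optional


def cuda_tag(version: str) -> Optional[str]:
    """Return the CUDA version encoded in a wheel version string, e.g. '12.1'.

    Hypothetical helper: assumes the local tag looks like '+cuNNN' and that
    the last digit is the minor version (cu118 -> 11.8, cu121 -> 12.1).
    Returns None for CPU-only builds or strings without such a tag.
    """
    match = re.search(r"\+cu(\d{2,})$", version)
    if match is None:
        return None
    digits = match.group(1)
    return f"{int(digits[:-1])}.{digits[-1]}"


if __name__ == "__main__":
    # Sample strings; in a live environment pass torch.__version__ instead.
    for v in ("2.3.1+cu121", "2.3.1+cu118", "2.3.1+cpu"):
        print(v, "->", cuda_tag(v))
```

Running the same function on `torch.__version__` and `torchvision.__version__` inside the affected environment is one quick way to answer the maintainer's question about which build was actually installed.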
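This is expected, because the 0.6.x versions have no table parsing feature.
My local test results match the online results. I'll send you my local parsing results to look at: Alternatively, you can package all the files in the output directory for us to analyze.
Because CPU processing is slow on my end, I only tested
Looking at the intermediate files, some formula regions that should not appear are affecting the parsing results. I'm not sure whether this is caused by incompatible dependency versions. If possible, please run pip list and upload the output for us to analyze.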
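When the full `pip list` output is long, it can help to pull out only the packages discussed in this thread. The sketch below uses the standard-library `importlib.metadata` module for that. The function name `installed_versions` and the package list are illustrative choices, not something the maintainers asked for, and this supplements the full `pip list` rather than replacing it.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from this thread; adjust the list as needed.
    packages = ["magic-pdf", "torch", "torchvision", "detectron2", "paddlepaddle-gpu"]
    for name, version in installed_versions(packages).items():
        print(f"{name:20s} {version or 'not installed'}")
```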
`absl-py 2.1.0`
I followed item 10 in the GPU documentation, "10. Enable CUDA Acceleration for OCR".
After doing this step and installing paddlepaddle-gpu, I got Segmentation fault (core dumped), #748. Could this be caused by the CUDA version mismatch (118/121)?
Description of the bug
I saw similar problems in other issues, but none of their solutions worked for me. Running from the command line fails. Could someone please take a look?
magic-pdf == 0.6.2b1
How to reproduce the bug
1. Command
magic-pdf pdf-command --pdf "testfile_1.pdf" --inside_model true
2. Log
`2024-08-08 14:51:32.631 | WARNING | magic_pdf.cli.magicpdf:get_model_json:312 - not found json testfile_1.json existed
2024-08-08 14:51:32.631 | WARNING | magic_pdf.libs.config_reader:get_local_dir:64 - 'temp-output-dir' not found in magic-pdf.json, use '/tmp' as default
2024-08-08 14:51:32.798 | INFO | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 1, cid_chars_radio: 0.0
2024-08-08 14:51:32.798 | WARNING | magic_pdf.filter.pdf_classify_by_type:classify:334 - pdf is not classified by area and text_len, by_image_area: False, by_text: False, by_avg_words: False, by_img_num: True, by_text_layout: False, by_img_narrow_strips: True, by_invalid_chars: True
INFO:datasets:PyTorch version 2.3.1 available.
2024-08-08 14:51:40.728 | INFO | magic_pdf.model.pdf_extract_kit:init:99 - DocAnalysis init, this may take some times. apply_layout: True, apply_formula: True, apply_ocr: True
2024-08-08 14:51:40.728 | INFO | magic_pdf.model.pdf_extract_kit:init:107 - using device: cuda
2024-08-08 14:51:40.729 | INFO | magic_pdf.model.pdf_extract_kit:init:109 - using models_dir: /root/.cache/modelscope/hub/wanderkid/PDF-Extract-Kit/models
CustomVisionEncoderDecoderModel init
CustomMBartForCausalLM init
CustomMBartDecoder init
[08/08 14:51:54 detectron2]: Rank of current process: 0. World size: 1
cuobjdump info : File '/root/anaconda3/envs/MinerU/lib/python3.10/site-packages/detectron2/_C.cpython-310-x86_64-linux-gnu.so' does not contain device code
[08/08 14:51:54 detectron2]: Environment info:
sys.platform linux
Python 3.10.14 (main, May 6 2024, 19:42:50) [GCC 11.2.0]
numpy 1.26.4
detectron2 0.6 @/root/anaconda3/envs/MinerU/lib/python3.10/site-packages/detectron2
detectron2._C not built correctly: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /root/anaconda3/envs/MinerU/lib/python3.10/site-packages/detectron2/_C.cpython-310-x86_64-linux-gnu.so)
Compiler ($CXX) c++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CUDA compiler Build cuda_11.8.r11.8/compiler.31833905_0
detectron2 arch flags /root/anaconda3/envs/MinerU/lib/python3.10/site-packages/detectron2/_C.cpython-310-x86_64-linux-gnu.so
DETECTRON2_ENV_MODULE
PyTorch 2.3.1+cu121 @/root/anaconda3/envs/MinerU/lib/python3.10/site-packages/torch
PyTorch debug build False
torch._C._GLIBCXX_USE_CXX11_ABI False
GPU available Yes
GPU 0,1,2,3 NVIDIA GeForce RTX 3090 (arch=8.6)
CUDA_HOME /usr/local/cuda
Pillow 10.4.0
torchvision 0.18.1+cu121 @/root/anaconda3/envs/MinerU/lib/python3.10/site-packages/torchvision
torchvision arch flags 5.0, 6.0, 7.0, 7.5, 8.0, 8.6, 9.0
fvcore 0.1.5.post20221221
iopath 0.1.9
cv2 4.6.0
PyTorch built with:
[08/08 14:51:54 detectron2]: Command line arguments: {'config_file': '/root/anaconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml', 'resume': False, 'eval_only': False, 'num_gpus': 1, 'num_machines': 1, 'machine_rank': 0, 'dist_url': 'tcp://127.0.0.1:57823', 'opts': ['MODEL.WEIGHTS', '/root/.cache/modelscope/hub/wanderkid/PDF-Extract-Kit/models/Layout/model_final.pth']}
[08/08 14:51:54 detectron2]: Contents of args.config_file=/root/anaconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml:
AUG:
DETR: true
CACHE_DIR: ~/cache/huggingface
CUDNN_BENCHMARK: false
DATALOADER:
ASPECT_RATIO_GROUPING: true
FILTER_EMPTY_ANNOTATIONS: false
NUM_WORKERS: 4
REPEAT_THRESHOLD: 0.0
SAMPLER_TRAIN: TrainingSampler
DATASETS:
PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
PROPOSAL_FILES_TEST: []
PROPOSAL_FILES_TRAIN: []
TEST:
TRAIN:
GLOBAL:
HACK: 1.0
ICDAR_DATA_DIR_TEST: ''
ICDAR_DATA_DIR_TRAIN: ''
INPUT:
CROP:
ENABLED: true
SIZE:
TYPE: absolute_range
FORMAT: RGB
MASK_FORMAT: polygon
MAX_SIZE_TEST: 1333
MAX_SIZE_TRAIN: 1333
MIN_SIZE_TEST: 800
MIN_SIZE_TRAIN:
MIN_SIZE_TRAIN_SAMPLING: choice
RANDOM_FLIP: horizontal
MODEL:
ANCHOR_GENERATOR:
ANGLES:
ASPECT_RATIOS:
NAME: DefaultAnchorGenerator
OFFSET: 0.0
SIZES:
BACKBONE:
FREEZE_AT: 2
NAME: build_vit_fpn_backbone
CONFIG_PATH: ''
DEVICE: cuda
FPN:
FUSE_TYPE: sum
IN_FEATURES:
NORM: ''
OUT_CHANNELS: 256
IMAGE_ONLY: true
KEYPOINT_ON: false
LOAD_PROPOSALS: false
MASK_ON: true
META_ARCHITECTURE: VLGeneralizedRCNN
PANOPTIC_FPN:
COMBINE:
ENABLED: true
INSTANCES_CONFIDENCE_THRESH: 0.5
OVERLAP_THRESH: 0.5
STUFF_AREA_LIMIT: 4096
INSTANCE_LOSS_WEIGHT: 1.0
PIXEL_MEAN:
PIXEL_STD:
PROPOSAL_GENERATOR:
MIN_SIZE: 0
NAME: RPN
RESNETS:
DEFORM_MODULATED: false
DEFORM_NUM_GROUPS: 1
DEFORM_ON_PER_STAGE:
DEPTH: 50
NORM: FrozenBN
NUM_GROUPS: 1
OUT_FEATURES:
RES2_OUT_CHANNELS: 256
RES5_DILATION: 1
STEM_OUT_CHANNELS: 64
STRIDE_IN_1X1: true
WIDTH_PER_GROUP: 64
RETINANET:
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_WEIGHTS:
FOCAL_LOSS_ALPHA: 0.25
FOCAL_LOSS_GAMMA: 2.0
IN_FEATURES:
IOU_LABELS:
IOU_THRESHOLDS:
NMS_THRESH_TEST: 0.5
NORM: ''
NUM_CLASSES: 10
NUM_CONVS: 4
PRIOR_PROB: 0.01
SCORE_THRESH_TEST: 0.05
SMOOTH_L1_LOSS_BETA: 0.1
TOPK_CANDIDATES_TEST: 1000
ROI_BOX_CASCADE_HEAD:
BBOX_REG_WEIGHTS:
IOUS:
ROI_BOX_HEAD:
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_LOSS_WEIGHT: 1.0
BBOX_REG_WEIGHTS:
CLS_AGNOSTIC_BBOX_REG: true
CONV_DIM: 256
FC_DIM: 1024
NAME: FastRCNNConvFCHead
NORM: ''
NUM_CONV: 0
NUM_FC: 2
POOLER_RESOLUTION: 7
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
SMOOTH_L1_BETA: 0.0
TRAIN_ON_PRED_BOXES: false
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 512
IN_FEATURES:
IOU_LABELS:
IOU_THRESHOLDS:
NAME: CascadeROIHeads
NMS_THRESH_TEST: 0.5
NUM_CLASSES: 10
POSITIVE_FRACTION: 0.25
PROPOSAL_APPEND_GT: true
SCORE_THRESH_TEST: 0.05
ROI_KEYPOINT_HEAD:
CONV_DIMS:
LOSS_WEIGHT: 1.0
MIN_KEYPOINTS_PER_IMAGE: 1
NAME: KRCNNConvDeconvUpsampleHead
NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
NUM_KEYPOINTS: 17
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
ROI_MASK_HEAD:
CLS_AGNOSTIC_MASK: false
CONV_DIM: 256
NAME: MaskRCNNConvUpsampleHead
NORM: ''
NUM_CONV: 4
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
RPN:
BATCH_SIZE_PER_IMAGE: 256
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_LOSS_WEIGHT: 1.0
BBOX_REG_WEIGHTS:
BOUNDARY_THRESH: -1
CONV_DIMS:
HEAD_NAME: StandardRPNHead
IN_FEATURES:
IOU_LABELS:
IOU_THRESHOLDS:
LOSS_WEIGHT: 1.0
NMS_THRESH: 0.7
POSITIVE_FRACTION: 0.5
POST_NMS_TOPK_TEST: 1000
POST_NMS_TOPK_TRAIN: 2000
PRE_NMS_TOPK_TEST: 1000
PRE_NMS_TOPK_TRAIN: 2000
SMOOTH_L1_BETA: 0.0
SEM_SEG_HEAD:
COMMON_STRIDE: 4
CONVS_DIM: 128
IGNORE_VALUE: 255
IN_FEATURES:
LOSS_WEIGHT: 1.0
NAME: SemSegFPNHead
NORM: GN
NUM_CLASSES: 10
VIT:
DROP_PATH: 0.1
IMG_SIZE:
NAME: layoutlmv3_base
OUT_FEATURES:
POS_TYPE: abs
WEIGHTS:
OUTPUT_DIR:
SCIHUB_DATA_DIR_TRAIN: ~/publaynet/layout_scihub/train
SEED: 42
SOLVER:
AMP:
ENABLED: true
BACKBONE_MULTIPLIER: 1.0
BASE_LR: 0.0002
BIAS_LR_FACTOR: 1.0
CHECKPOINT_PERIOD: 2000
CLIP_GRADIENTS:
CLIP_TYPE: full_model
CLIP_VALUE: 1.0
ENABLED: true
NORM_TYPE: 2.0
GAMMA: 0.1
GRADIENT_ACCUMULATION_STEPS: 1
IMS_PER_BATCH: 32
LR_SCHEDULER_NAME: WarmupCosineLR
MAX_ITER: 20000
MOMENTUM: 0.9
NESTEROV: false
OPTIMIZER: ADAMW
REFERENCE_WORLD_SIZE: 0
STEPS:
WARMUP_FACTOR: 0.01
WARMUP_ITERS: 333
WARMUP_METHOD: linear
WEIGHT_DECAY: 0.05
WEIGHT_DECAY_BIAS: null
WEIGHT_DECAY_NORM: 0.0
TEST:
AUG:
ENABLED: false
FLIP: true
MAX_SIZE: 4000
MIN_SIZES:
DETECTIONS_PER_IMAGE: 100
EVAL_PERIOD: 1000
EXPECTED_RESULTS: []
KEYPOINT_OKS_SIGMAS: []
PRECISE_BN:
ENABLED: false
NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 0
[08/08 14:51:56 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /root/.cache/modelscope/hub/wanderkid/PDF-Extract-Kit/models/Layout/model_final.pth ...
[08/08 14:51:56 fvcore.common.checkpoint]: [Checkpointer] Loading from /root/.cache/modelscope/hub/wanderkid/PDF-Extract-Kit/models/Layout/model_final.pth ...
2024-08-08 14:51:57.268 | INFO | magic_pdf.model.pdf_extract_kit:init:132 - DocAnalysis init done!
2024-08-08 14:51:57.268 | INFO | magic_pdf.model.doc_analyze_by_custom_model:custom_model_init:92 - model init cost: 24.469878435134888`
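One line in the log above stands out: detectron2 reports that `detectron2._C` was not built correctly because `GLIBC_2.32` was not found in `/lib/x86_64-linux-gnu/libc.so.6`. The compiler string (`Ubuntu 7.5.0-3ubuntu1~18.04`) suggests an Ubuntu 18.04 host, which ships an older glibc. Whether this is related to the segmentation fault is not established in this thread. A minimal sketch for checking the system glibc against that requirement follows; the helper names are hypothetical.

```python
import platform


def parse_version(text):
    """Turn a dotted version such as '2.32' into a tuple of ints: (2, 32)."""
    return tuple(int(part) for part in text.split("."))


def glibc_satisfies(required, found):
    """Return True when the found glibc version is at least the required one."""
    return parse_version(found) >= parse_version(required)


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    # 2.32 is the symbol version named in the detectron2 log line.
    if lib == "glibc" and version:
        status = "ok" if glibc_satisfies("2.32", version) else "older than 2.32"
        print("glibc", version, status)
    else:
        print("could not determine the glibc version on this system")
```

Comparing integer tuples rather than strings matters here, since a plain string comparison would rank "2.4" above "2.32".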
3. Traceback
`--------------------------------------
C++ Traceback (most recent call last):
0 at::_ops::conv2d::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::SymInt)
1 at::native::conv2d_symint(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::SymInt)
2 at::_ops::convolution::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, bool, c10::ArrayRef<c10::SymInt>, c10::SymInt)
3 at::_ops::convolution::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, bool, c10::ArrayRef<c10::SymInt>, c10::SymInt)
4 at::native::convolution(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, bool, c10::ArrayRef, long)
5 at::_ops::_convolution::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, bool, c10::ArrayRef<c10::SymInt>, c10::SymInt, bool, bool, bool, bool)
6 at::native::_convolution(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, bool, c10::ArrayRef, long, bool, bool, bool, bool)
7 at::_ops::cudnn_convolution::call(at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::SymInt, bool, bool, bool)
8 at::native::cudnn_convolution(at::Tensor const&, at::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, long, bool, bool, bool)
Error Message Summary:
FatalError:
Segmentation fault
is detected by the operating system.[TimeInfo: *** Aborted at 1723099917 (unix time) try "date -d @1723099917" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0x59) received by PID 37094 (TID 0x7f617e0133c0) from PID 89 ***]
Segmentation fault`
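The TimeInfo line suggests converting the Unix timestamp with `date -d @1723099917`. The same conversion in Python is sketched below, which makes it easy to line the crash up with the timestamps in the log. The function name `crash_time` is hypothetical, and the UTC+8 offset is an assumption about the server's time zone.

```python
from datetime import datetime, timedelta, timezone


def crash_time(unix_seconds, utc_offset_hours=0):
    """Convert a Unix timestamp into an aware datetime at the given UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(unix_seconds, tz=tz)


if __name__ == "__main__":
    # 1723099917 is the value printed in the TimeInfo line of the traceback.
    print(crash_time(1723099917).isoformat())     # 2024-08-08T06:51:57+00:00
    # UTC+8 is an assumption about the server's time zone.
    print(crash_time(1723099917, 8).isoformat())  # 2024-08-08T14:51:57+08:00
```

Under the UTC+8 assumption the abort falls in the same second as the `DocAnalysis init done!` line (14:51:57.268), which is consistent with the crash happening at the first convolution after model loading, as the `cudnn_convolution` frames indicate.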
Operating system
Linux
Python version
3.10
Software version (magic-pdf --version)
0.6.x
Device mode
cuda