Segmentation fault (core dumped) on importing torch_geometric after installing torch_sparse #9925

Open
msolki opened this issue Jan 8, 2025 · 0 comments
🐛 Describe the bug

On a freshly installed Ubuntu 22.04.5 LTS (x86_64) VM (version details below), I followed these steps to install Python 3.11:

sudo apt update && sudo apt upgrade

sudo apt install -y python3.11 python3.11-venv python3.11-dev python3.11-distutils

and created a virtual environment:

python3.11 -m venv venv
source venv/bin/activate

Inside the virtual environment, I installed PyTorch following the PyTorch docs:

pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu124

and then torch_geometric following the PyG docs:

pip install torch_geometric

Up until this point everything is fine. Running:

import torch_geometric
print(f'version: {torch_geometric.__version__}')

prints version: 2.6.1. But after installing the optional dependencies with:

pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.4.1+cu124.html

running the same import again:

import torch_geometric

fails with:

Segmentation fault (core dumped)

The same segmentation fault occurs when importing torch_sparse on its own, whereas importing pyg_lib, torch_scatter, torch_cluster, and torch_spline_conv works fine.
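To reproduce this per-package check without losing the interpreter to each crash, one option is to import every extension in its own child process and inspect the exit status. The following is only a minimal sketch, not a tool shipped with PyG; the helper name `probe_import` is hypothetical. It assumes a POSIX system, where `subprocess` reports a child killed by a signal as a negative return code, and it passes `-X faulthandler` so the child writes a Python-level traceback to stderr before it dies.

```python
import signal
import subprocess
import sys

# Package names taken from the pip commands in this report.
MODULES = ["pyg_lib", "torch_scatter", "torch_sparse",
           "torch_cluster", "torch_spline_conv", "torch_geometric"]


def probe_import(module, timeout=300.0):
    """Import `module` in a child interpreter; return (status, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", f"import {module}"],
        capture_output=True, text=True, timeout=timeout,
    )
    if proc.returncode == 0:
        status = "ok"
    elif proc.returncode < 0:
        # On POSIX a negative return code is the number of the fatal signal.
        try:
            name = signal.Signals(-proc.returncode).name
        except ValueError:
            name = f"signal {-proc.returncode}"
        status = f"killed by {name}"
    else:
        status = f"failed (exit code {proc.returncode})"
    return status, proc.stderr


if __name__ == "__main__":
    for name in MODULES:
        status, stderr = probe_import(name)
        print(f"{name}: {status}")
        if status != "ok":
            print(stderr.strip())
```

If the behaviour matches the description above, only the torch_sparse and torch_geometric lines should report a fatal signal, and the faulthandler traceback shows which Python-level import was running when the crash happened.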

This may not be a new issue: I found similar reports in both the pytorch_sparse and pytorch_geometric issue trackers, but those were most likely caused by incompatible dependency versions.
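To rule out the version-mismatch case, the build tag of each extension wheel can be compared against the installed torch build without importing anything that might crash. The sketch below reads the versions through `importlib.metadata`. The helper names are hypothetical, and the tag convention (`pt` + torch major and minor digits + CUDA tag, e.g. `pt24cu124`) is an assumption inferred from the version strings listed in this report, not a documented rule. It also prints the interpreter release level, since the environment dump shows Python 3.11.0rc1; whether that plays any role here is not established.

```python
import re
import sys
from importlib import metadata

# Extensions whose versions in this report carry a "+pt24cu124" style tag.
EXTENSIONS = ["torch_scatter", "torch_sparse", "torch_cluster", "torch_spline_conv"]


def torch_build_tag(torch_version):
    """Map '2.4.0+cu124' to 'pt24cu124'; return None for other formats."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.\d+\+(\w+)", torch_version)
    if match is None:
        return None
    major, minor, accel = match.groups()
    return f"pt{major}{minor}{accel}"


def extension_build_tag(ext_version):
    """Return the local part of '0.6.18+pt24cu124', or None if absent."""
    _, sep, local = ext_version.partition("+")
    return local if sep else None


def installed_version(dist):
    """Look up an installed distribution's version without importing it."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("python:", sys.version.split()[0], f"({sys.version_info.releaselevel})")
    torch_version = installed_version("torch")
    if torch_version is None:
        print("torch is not installed")
    else:
        expected = torch_build_tag(torch_version)
        print(f"torch {torch_version} -> expected tag {expected}")
        for dist in EXTENSIONS:
            version = installed_version(dist)
            if version is None:
                print(f"{dist}: not installed")
                continue
            tag = extension_build_tag(version)
            verdict = "matches" if tag == expected else "DIFFERS"
            print(f"{dist} {version}: tag {tag} {verdict}")
```

For the versions listed in this report, every extension carries the tag `pt24cu124`, which agrees with torch 2.4.0+cu124, so a check like this would only confirm that the tags line up; it does not explain the crash.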

I appreciate your time and I can provide any additional information you might need.

Versions

Collecting environment information...
PyTorch version: 2.4.0+cu124
Is debug build: False
CUDA used to build PyTorch: 12.4
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.35

Python version: 3.11.0rc1 (main, Aug 12 2022, 10:02:14) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.0-130-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA A40
Nvidia driver version: 550.120
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                         x86_64
CPU op-mode(s):                       32-bit, 64-bit
Address sizes:                        48 bits physical, 48 bits virtual
Byte Order:                           Little Endian
CPU(s):                               16
On-line CPU(s) list:                  0-15
Vendor ID:                            AuthenticAMD
Model name:                           AMD EPYC 75F3 32-Core Processor
CPU family:                           25
Model:                                1
Thread(s) per core:                   1
Core(s) per socket:                   8
Socket(s):                            2
Stepping:                             1
BogoMIPS:                             5888.90
Flags:                                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch bpext invpcid_single ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr arat umip pku ospke vaes vpclmulqdq rdpid
Hypervisor vendor:                    Xen
Virtualization type:                  full
L1d cache:                            512 KiB (16 instances)
L1i cache:                            512 KiB (16 instances)
L2 cache:                             8 MiB (16 instances)
L3 cache:                             4 GiB (16 instances)
NUMA node(s):                         1
NUMA node0 CPU(s):                    0-15
Vulnerability Gather data sampling:   Not affected
Vulnerability Itlb multihit:          Not affected
Vulnerability L1tf:                   Not affected
Vulnerability Mds:                    Not affected
Vulnerability Meltdown:               Not affected
Vulnerability Mmio stale data:        Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed:               Not affected
Vulnerability Spec rstack overflow:   Mitigation; safe RET
Vulnerability Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:             Mitigation; Retpolines; IBPB conditional; IBRS_FW; STIBP disabled; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds:                  Not affected
Vulnerability Tsx async abort:        Not affected

Versions of relevant libraries:
[pip3] numpy==1.26.3
[pip3] nvidia-cublas-cu12==12.4.2.65
[pip3] nvidia-cuda-cupti-cu12==12.4.99
[pip3] nvidia-cuda-nvrtc-cu12==12.4.99
[pip3] nvidia-cuda-runtime-cu12==12.4.99
[pip3] nvidia-cudnn-cu12==9.1.0.70
[pip3] nvidia-cufft-cu12==11.2.0.44
[pip3] nvidia-curand-cu12==10.3.5.119
[pip3] nvidia-cusolver-cu12==11.6.0.99
[pip3] nvidia-cusparse-cu12==12.3.0.142
[pip3] nvidia-nccl-cu12==2.20.5
[pip3] nvidia-nvjitlink-cu12==12.4.99
[pip3] nvidia-nvtx-cu12==12.4.99
[pip3] torch==2.4.0+cu124
[pip3] torch_cluster==1.6.3+pt24cu124
[pip3] torch-geometric==2.6.1
[pip3] torch_scatter==2.1.2+pt24cu124
[pip3] torch_sparse==0.6.18+pt24cu124
[pip3] torch_spline_conv==1.2.2+pt24cu124
[pip3] torchaudio==2.4.0+cu124
[pip3] torchvision==0.19.0+cu124
[pip3] triton==3.0.0
[conda] Could not collect
msolki added the bug label Jan 8, 2025