Error during vacuum: the length + offset of the sliced StructArray cannot exceed the existing length #3237

Open
Menziess opened this issue Feb 21, 2025 · 1 comment
Labels: bug (Something isn't working), mre-needed (Whether an MRE needs to be provided)

@Menziess

Environment

Delta-rs version: deltalake==0.25.0

Binding: Python (3.11?)

Environment:

  • Cloud provider: Azure
  • OS: macOS

Bug

What happened:

While vacuuming the Delta table, the following error was raised:

PanicException: the length + offset of the sliced StructArray cannot exceed the existing length.

The call that triggered it:

delta.vacuum(
    retention_hours=1,
    dry_run=False,
    enforce_retention_duration=False
)

What you expected to happen:

The same program worked with deltalake==0.23.2. I expected the Delta table to be vacuumed after being optimized, without raising an exception.

How to reproduce it:

Not quite sure yet.

More details:

INFO:main.py:Optimizing delta table.
INFO:main.py:Vacuuming delta table.
thread '<unnamed>' panicked at /Users/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-array-54.2.0/src/array/struct_array.rs:275:9:
the length + offset of the sliced StructArray cannot exceed the existing length
stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_fmt
   2: arrow_array::array::struct_array::StructArray::slice
   3: <arrow_array::array::struct_array::StructArray as arrow_array::array::Array>::slice
   4: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
   5: arrow_array::array::struct_array::StructArray::slice
   6: <arrow_array::array::struct_array::StructArray as arrow_array::array::Array>::slice
   7: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
   8: arrow_array::array::struct_array::StructArray::slice
   9: <arrow_array::array::struct_array::StructArray as arrow_array::array::Array>::slice
  10: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
  11: arrow_array::array::struct_array::StructArray::slice
  12: <arrow_array::array::struct_array::StructArray as arrow_array::array::Array>::slice
  13: arrow_select::filter::filter_array
  14: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
  15: core::iter::adapters::try_process
  16: arrow_select::filter::filter_record_batch
  17: deltalake_core::kernel::snapshot::replay::LogReplayScanner::process_files_batch
  18: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold
  19: <core::iter::adapters::chain::Chain<A,B> as core::iter::traits::iterator::Iterator>::try_fold
  20: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
  21: core::iter::adapters::try_process
  22: deltalake_core::kernel::snapshot::EagerSnapshot::advance
  23: deltalake_core::operations::transaction::PostCommit::run_post_commit_hook::{{closure}}
  24: <deltalake_core::operations::transaction::PostCommit as core::future::into_future::IntoFuture>::into_future::{{closure}}
  25: <deltalake_core::operations::transaction::PreCommit as core::future::into_future::IntoFuture>::into_future::{{closure}}
  26: <deltalake_core::operations::vacuum::VacuumBuilder as core::future::into_future::IntoFuture>::into_future::{{closure}}
  27: tokio::runtime::park::CachedParkThread::block_on
  28: tokio::runtime::context::runtime::enter_runtime
  29: tokio::runtime::runtime::Runtime::block_on
  30: deltalake::RawDeltaTable::vacuum
  31: deltalake::RawDeltaTable::__pymethod_vacuum__
  32: pyo3::impl_::trampoline::trampoline
  33: deltalake::<impl pyo3::impl_::pyclass::PyMethods<deltalake::RawDeltaTable> for pyo3::impl_::pyclass::PyClassImplCollector<deltalake::RawDeltaTable>>::py_methods::ITEMS::trampoline
  34: _cfunction_call
  35: __PyObject_MakeTpCall
  36: __PyEval_EvalFrameDefault
  37: __PyEval_Vector
  38: _context_run
  39: _cfunction_vectorcall_FASTCALL_KEYWORDS
  40: _partial_vectorcall
  41: __PyEval_EvalFrameDefault
  42: __PyEval_Vector
  43: __PyEval_EvalFrameDefault
  44: __PyEval_Vector
  45: _method_vectorcall
  46: __PyEval_EvalFrameDefault
  47: __PyEval_Vector
  48: __PyObject_FastCallDictTstate
  49: __PyObject_Call_Prepend
  50: _slot_tp_call
  51: __PyObject_Call
  52: _thread_run
  53: _pythread_wrapper
  54: __pthread_deallocate
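
For readers unfamiliar with the message: the panic in frame 2 comes from a bounds check in `StructArray::slice`, which refuses a slice whose offset plus length runs past the end of the array. The plain-Python sketch below only mimics that kind of check; `checked_slice` is a hypothetical helper, not part of Arrow or deltalake, and why the bounds are exceeded during log replay is not established in this report.

```python
def checked_slice(values, offset, length):
    """Return values[offset:offset + length], rejecting out-of-range requests.

    Illustrative only: Python's built-in slicing would silently truncate,
    whereas this helper raises when the request does not fit.
    """
    if offset + length > len(values):
        raise ValueError(
            "the length + offset of the slice cannot exceed the existing length"
        )
    return values[offset:offset + length]


rows = ["add-0", "add-1", "add-2"]

# Within bounds: 1 + 2 <= 3.
ok = checked_slice(rows, 1, 2)

# Out of bounds: 2 + 5 > 3, so the check fails.
try:
    checked_slice(rows, 2, 5)
    error = None
except ValueError as exc:
    error = str(exc)
```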
Menziess added the bug label on Feb 21, 2025
@ion-elgreco
Collaborator

Please create an MRE (minimal reproducible example).

ion-elgreco added the mre-needed label on Feb 21, 2025