Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ffi: cancelling media upload results in panic #3573

Closed
stefanceriu opened this issue Jun 18, 2024 · 6 comments · Fixed by #4141
Closed

ffi: cancelling media upload results in panic #3573

stefanceriu opened this issue Jun 18, 2024 · 6 comments · Fixed by #4141
Assignees
Labels
bug Something isn't working panic An issue that leads to a panic/crash

Comments

@stefanceriu
Copy link
Member

Found on https://github.com/element-hq/element-x-ios-rageshakes/issues/1895

As easy to reproduce as cancelling an ongoing media upload i.e. handle + cancellation

2024-06-18T15:02:29.476321Z  INFO elementx: Transitioning from `mediaUploadPreview(fileURL: file:///Users/stefanceriu/Library/Developer/CoreSimulator/Devices/614B4262-0794-4800-BD44-CC9C7BFF3F31/data/Containers/Data/Application/D66733C1-0BAC-4BED-BFB3-661FFBF1DD77/tmp/IMG_0005.jpeg)` to `room` with event `dismissMediaUploadPreview` | RoomFlowCoordinator.swift:486 | spans: root
2024-06-18T15:02:29.528232Z ERROR log: thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: JoinError::Cancelled(Id(23375))': bindings/matrix-sdk-ffi/src/timeline/mod.rs:1068
   0: backtrace::capture::Backtrace::new
   1: log_panics::Config::install_panic_hook::{{closure}}
   2: std::panicking::rust_panic_with_hook
   3: std::panicking::begin_panic_handler::{{closure}}
   4: std::sys_common::backtrace::__rust_end_short_backtrace
   5: _rust_begin_unwind
   6: core::panicking::panic_fmt
   7: core::result::unwrap_failed
   8: tokio::runtime::task::core::Core<T,S>::poll
   9: tokio::runtime::task::raw::poll
  10: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
  11: tokio::runtime::task::raw::poll
  12: std::sys_common::backtrace::__rust_begin_short_backtrace
  13: core::ops::function::FnOnce::call_once{{vtable.shim}}
  14: std::sys::pal::unix::thread::Thread::new::thread_start
  15: __pthread_start
     | spans: root
2024-06-18T15:02:29.536977Z ERROR log: thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: JoinError::Panic(Id(23377), ...)': bindings/matrix-sdk-ffi/src/timeline/mod.rs:1068
   0: backtrace::capture::Backtrace::new
   1: log_panics::Config::install_panic_hook::{{closure}}
   2: std::panicking::rust_panic_with_hook
   3: std::panicking::begin_panic_handler::{{closure}}
   4: std::sys_common::backtrace::__rust_end_short_backtrace
   5: _rust_begin_unwind
   6: core::panicking::panic_fmt
   7: core::result::unwrap_failed
   8: <uniffi_core::ffi::rustfuture::future::RustFuture<F,T,UT> as uniffi_core::ffi::rustfuture::future::RustFutureFfi<<T as uniffi_core::ffi_converter_traits::LowerReturn<UT>>::ReturnType>>::ffi_poll
   9: $sSo015ffi_matrix_sdk_A22_rust_future_poll_voidyys6UInt64V_yAC_s4Int8VtXCACtFTO
  10: $s13MatrixRustSDK06uniffiB9CallAsync33_D13DA838D267D18EF7DBCAF4C15FEA09LL14rustFutureFunc04pollQ008completeQ004freeQ004liftQ012errorHandlerq_s6UInt64VyXE_yAK_yAK_s4Int8VtXCAKtXExAK_SpySo0bE6StatusVGtXEyAKXEq_xKXEs5Error_pSo0B6BufferVKcSgtYaKr0_lFySccyAMs5NeverOGXEfU_
             at /Users/stefanceriu/Library/Developer/Xcode/DerivedData/ElementX-hkylowshzurzvnaknoxrywlcjybe/SourcePackages/checkouts/matrix-rust-components-swift/Sources/MatrixRustSDK/matrix_sdk_ffi.swift:24865:13
  11: $ss22withUnsafeContinuationyxySccyxs5NeverOGXEYalFyBcXEfU_
  12: $ss22withUnsafeContinuationyxySccyxs5NeverOGXEYalF
  13: __ZN5swift34runJobInEstablishedExecutorContextEPNS_3JobE
  14: __ZL17swift_job_runImplPN5swift3JobENS_11ExecutorRefE
  15: __dispatch_root_queue_drain
  16: __dispatch_worker_thread2
  17: __pthread_wqthread
     | spans: root
2024-06-18T15:02:29.537187Z ERROR log: Caught a panic calling rust code: "called `Result::unwrap()` on an `Err` value: JoinError::Panic(Id(23377), ...)"     | spans: root
2024-06-18T15:02:29.539976Z ERROR elementx: Failed sending image with error: rustPanic("called `Result::unwrap()` on an `Err` value: JoinError::Panic(Id(23377), ...)") | TimelineProxy.swift:249 | spans: root
2024-06-18T15:02:29.604131Z ERROR elementx: Failed sending attachment with error: sdkError(MatrixRustSDK.(unknown context at $107772bc0).UniffiInternalError.rustPanic("called `Result::unwrap()` on an `Err` value: JoinError::Panic(Id(23377), ...)")) | MediaUploadPreviewScreenViewModel.swift:67 | spans: root
@Hywan Hywan self-assigned this Jun 27, 2024
@bnjbvr bnjbvr added bug Something isn't working panic An issue that leads to a panic/crash labels Jul 15, 2024
@bnjbvr
Copy link
Member

bnjbvr commented Oct 16, 2024

@stefanceriu is the calling code calling join() after abort()?

@bnjbvr
Copy link
Member

bnjbvr commented Oct 16, 2024

Can you try #4141 and let me know if it fixes the issue, please?

@stefanceriu
Copy link
Member Author

Can you try #4141 and let me know if it fixes the issue, please?

Sure can, but first, it does indeed call cancel before join. But seems it was like that from the very begining even though I honestly can't remember why. Does that make any sense?

element-hq/element-x-ios@ce7ca32#diff-3dd254084ff8f4b162e0aa8fb20ca0ebbe4223a75feb5ed3c6a133237fa64a83R85

@bnjbvr
Copy link
Member

bnjbvr commented Oct 16, 2024

It makes sense to me in terms of API usage; there's no good reason why you couldn't join after cancelling, it just means you're joining immediately. But the code didn't handle it at all, and thus would crash (also there's a weird entanglement of a task spawned in a task when joining, causing the cascading failure from Cancelled to Panic).

@stefanceriu
Copy link
Member Author

Well sure yeah, but why were we doing that as soon as receiving the handle? Surely can't be a plain old bug, we tested this feature a bunch after implementing it 🤔

@bnjbvr
Copy link
Member

bnjbvr commented Oct 16, 2024

I honestly don't know. It seems the first implementation of the join/cancel pair already had this issue, unless the behavior of tokio itself changed with respect to joining-after-cancelling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working panic An issue that leads to a panic/crash
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants