Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance Thread Safety in Core Data Management #12805

Open
wants to merge 17 commits into
base: trunk
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 9 commits
Commits
Show all changes
17 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion RELEASE-NOTES.txt
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,9 @@
18.8
-----
- [*] Menu > Payments > Collect Payment: fix the the "card reader" & "Tap To Pay" payment methods where it was no-op before. [https://github.com/woocommerce/woocommerce-ios/pull/12801]
- [internal] Optimize API calls sent for Dashboard screen. [https://github.com/woocommerce/woocommerce-ios/pull/12775]
- [**] Adds edit support for the Min/Max Quantities extension in product details. [https://github.com/woocommerce/woocommerce-ios/pull/12758]
- [internal] Optimize API calls sent for Dashboard screen. [https://github.com/woocommerce/woocommerce-ios/pull/12775]
- [internal] Enhanced Thread Safety in Core Data Management. [https://github.com/woocommerce/woocommerce-ios/pull/12805]
pmusolino marked this conversation as resolved.
Show resolved Hide resolved

18.7
-----
Expand Down
108 changes: 59 additions & 49 deletions Storage/Storage/CoreData/CoreDataManager.swift
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,9 @@ public final class CoreDataManager: StorageManagerType {

private let modelsInventory: ManagedObjectModelsInventory

// A dispatch queue for synchronizing access to shared attributes
private let syncQueue = DispatchQueue(label: "com.automattic.woocommerce.CoreDataManager.syncQueue")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This queue is used to prevent concurrent access to the read context and write context themselves. They can still be used to perform concurrent Core Data changes.

If I understand it correctly, you'd want synchronized data writing to prevent database corruption. I don't think this new serial queue will help that.


/// Module-private designated Initializer.
///
/// - Parameter name: Identifier to be used for: [database, data model, container].
Expand Down Expand Up @@ -59,71 +62,76 @@ public final class CoreDataManager: StorageManagerType {
/// Returns the Storage associated with the View Thread.
///
public var viewStorage: StorageType {
return persistentContainer.viewContext
return syncQueue.sync {
return persistentContainer.viewContext
}
}

/// Returns a shared derived storage instance dedicated for write operations.
///
public lazy var writerDerivedStorage: StorageType = {
let childManagedObjectContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
childManagedObjectContext.parent = persistentContainer.viewContext
childManagedObjectContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy
return childManagedObjectContext
return syncQueue.sync {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using DispatchQueue and closures, how about making CoreDataManager an actor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another possibility, but do you see a real benefit over DispatchQueue? I'm concerned that making this class an actor might entail many changes elsewhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A benefit is to have a more modern syntax with no requirement for using closures, making the code more readable. I understand the potential side effects of changes required at call sites though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe, and let me know if you agree, that if this PR is merged and it fixes the crashes on Core Data, it's worth investing some time later in migrating it to actor, unless the impact is huge.

let childManagedObjectContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
childManagedObjectContext.parent = persistentContainer.viewContext
childManagedObjectContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy
return childManagedObjectContext
}
}()

/// Persistent Container: Holds the full CoreData Stack
///
public lazy var persistentContainer: NSPersistentContainer = {
let container = NSPersistentContainer(name: name, managedObjectModel: modelsInventory.currentModel)
container.persistentStoreDescriptions = [storeDescription]

let migrationDebugMessages = migrateDataModelIfNecessary(using: container.persistentStoreCoordinator)
let container = NSPersistentContainer(name: name, managedObjectModel: modelsInventory.currentModel)
container.persistentStoreDescriptions = [storeDescription]

container.loadPersistentStores { [weak self] (storeDescription, error) in
guard let `self` = self, let persistentStoreLoadingError = error else {
return
}
let migrationDebugMessages = migrateDataModelIfNecessary(using: container.persistentStoreCoordinator)

DDLogError("⛔️ [CoreDataManager] loadPersistentStore failed. Attempting to recover... \(persistentStoreLoadingError)")

/// Remove the old Store which is either corrupted or has an invalid model we can't migrate from
///
var persistentStoreRemovalError: Error?
do {
try container.persistentStoreCoordinator.destroyPersistentStore(at: self.storeURL,
ofType: storeDescription.type,
options: nil)
} catch {
persistentStoreRemovalError = error
}

/// Retry!
///
container.loadPersistentStores { [weak self] (storeDescription, underlyingError) in
guard let underlyingError = underlyingError as NSError? else {
container.loadPersistentStores { [weak self] (storeDescription, error) in
guard let `self` = self, let persistentStoreLoadingError = error else {
return
}

let error = CoreDataManagerError.recoveryFailed
DDLogError("⛔️ [CoreDataManager] loadPersistentStore failed. Attempting to recover... \(persistentStoreLoadingError)")

/// Remove the old Store which is either corrupted or has an invalid model we can't migrate from
///
var persistentStoreRemovalError: Error?
do {
try container.persistentStoreCoordinator.destroyPersistentStore(at: self.storeURL,
ofType: storeDescription.type,
options: nil)
} catch {
persistentStoreRemovalError = error
}

/// Retry!
///
container.loadPersistentStores { [weak self] (storeDescription, underlyingError) in
guard let underlyingError = underlyingError as NSError? else {
return
}

let error = CoreDataManagerError.recoveryFailed
let logProperties: [String: Any?] = ["persistentStoreLoadingError": persistentStoreLoadingError,
"persistentStoreRemovalError": persistentStoreRemovalError,
"retryError": underlyingError,
"appState": UIApplication.shared.applicationState.rawValue,
"migrationMessages": migrationDebugMessages]
self?.crashLogger.logFatalErrorAndExit(error,
userInfo: logProperties.compactMapValues { $0 })
}

let logProperties: [String: Any?] = ["persistentStoreLoadingError": persistentStoreLoadingError,
"persistentStoreRemovalError": persistentStoreRemovalError,
"retryError": underlyingError,
"appState": UIApplication.shared.applicationState.rawValue,
"migrationMessages": migrationDebugMessages]
self?.crashLogger.logFatalErrorAndExit(error,
userInfo: logProperties.compactMapValues { $0 })
self.crashLogger.logMessage("[CoreDataManager] Recovered from persistent store loading error",
properties: logProperties.compactMapValues { $0 },
level: .info)
}

let logProperties: [String: Any?] = ["persistentStoreLoadingError": persistentStoreLoadingError,
"persistentStoreRemovalError": persistentStoreRemovalError,
"appState": UIApplication.shared.applicationState.rawValue,
"migrationMessages": migrationDebugMessages]
self.crashLogger.logMessage("[CoreDataManager] Recovered from persistent store loading error",
properties: logProperties.compactMapValues { $0 },
level: .info)
}

return container
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff seems unnecessary and just adds mental overhead to review the PR/find changes in the history.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff seems unnecessary and just adds mental overhead to review the PR/find changes in the history.

I think there was a wrong indentation in the code, that has been like that for years

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the updated indent is incorrect, can you revert that please?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check if they have been fixed now?

return container
}()

/// Performs the received closure in Background. Note that you should use the received Storage instance (BG friendly!).
Expand All @@ -150,14 +158,16 @@ public final class CoreDataManager: StorageManagerType {
/// This method effectively destroys all of the stored data, and generates a blank Persistent Store from scratch.
///
public func reset() {
let viewContext = persistentContainer.viewContext
syncQueue.sync {
let viewContext = persistentContainer.viewContext

viewContext.performAndWait {
viewContext.reset()
self.deleteAllStoredObjects()
viewContext.performAndWait {
viewContext.reset()
self.deleteAllStoredObjects()

DDLogVerbose("💣 [CoreDataManager] Stack Destroyed!")
NotificationCenter.default.post(name: .StorageManagerDidResetStorage, object: self)
DDLogVerbose("💣 [CoreDataManager] Stack Destroyed!")
NotificationCenter.default.post(name: .StorageManagerDidResetStorage, object: self)
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there is a potential dead lock here.

Imagine reset is called on the main thread. The main thread now waits on the syncQueue to finish running. The syncQueue waits viewContext to complete. If the viewContext is bound to the main context, it waits for the main thread, which is blocked.

The above scenario is pretty easy to set up in unit tests. I'd suggesting adding an unit test to see if there is a dead-lock here.


Expand Down
73 changes: 73 additions & 0 deletions Storage/StorageTests/CoreData/CoreDataManagerTests.swift
Original file line number Diff line number Diff line change
Expand Up @@ -223,6 +223,79 @@ final class CoreDataManagerTests: XCTestCase {
insertAccount(to: manager.viewStorage)
XCTAssertEqual(manager.viewStorage.countObjects(ofType: Account.self), 3)
}

func test_viewStorage_is_thread_safe() {
// Given
let manager = CoreDataManager(name: storageIdentifier, crashLogger: MockCrashLogger())
let expectation = self.expectation(description: "Thread Safety")

// When
DispatchQueue.global(qos: .background).async {
let viewContext = manager.viewStorage as? NSManagedObjectContext
// Then
XCTAssertNotNil(viewContext)
expectation.fulfill()
}

wait(for: [expectation], timeout: Constants.expectationTimeout)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see expectationTimeout has been with us for a long time, but 10 seconds feels extreme. Having a specific number of seconds for timeout is arbitrary, but if we change it to 1 second the test suite still passes, and around 0.1-0.5 starts to become flaky.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This constant is used almost everywhere in the unit tests, so if I need a custom timeout for these methods, I would prefer to set it individually rather than change it globally. This is because tests might still pass on your local machine but could require more time on the CI server. In any case, for most methods where it is applied, this value simply acts as the timeout limit. However, if a test is successful, it typically doesn't require the full duration of this timeout to complete. I would make a separate PR to change the timeout since it impacts hundreds of tests

}

func test_reset_is_thread_safe() {
// Given
let manager = CoreDataManager(name: storageIdentifier, crashLogger: MockCrashLogger())
let expectation = self.expectation(description: "Thread Safety")

// When
DispatchQueue.global(qos: .background).async {
manager.reset()
// Then
expectation.fulfill()
}

wait(for: [expectation], timeout: Constants.expectationTimeout)
}

func test_concurrent_access_to_core_data_manager() {
// Given
let manager = CoreDataManager(name: storageIdentifier, crashLogger: MockCrashLogger())
let expectation1 = self.expectation(description: "Thread 1")
let expectation2 = self.expectation(description: "Thread 2")

// When
DispatchQueue.global(qos: .background).async {
manager.reset()
// Then
expectation1.fulfill()
}

DispatchQueue.global(qos: .background).async {
let viewContext = manager.viewStorage as? NSManagedObjectContext
// Then
XCTAssertNotNil(viewContext)
expectation2.fulfill()
}

wait(for: [expectation1, expectation2], timeout: Constants.expectationTimeout)
}

func test_saving_operations_are_thread_safe() {
// Given
let manager = CoreDataManager(name: storageIdentifier, crashLogger: MockCrashLogger())
let expectation = self.expectation(description: "Thread Safety")

// When
DispatchQueue.global(qos: .background).async {
let derivedContext = manager.writerDerivedStorage as? NSManagedObjectContext
derivedContext?.performAndWait {
_ = derivedContext?.insertNewObject(ofType: ShippingLine.self)
derivedContext?.saveIfNeeded()
}
// Then
expectation.fulfill()
}

wait(for: [expectation], timeout: Constants.expectationTimeout)
}
Comment on lines +226 to +298
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure how useful are these tests or if are actually testing the changes as we intend to, if we remove the syncQueue.sync in this PR and re-run the tests, all of them still pass.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure how useful are these tests or if are actually testing the changes as we intend to, if we remove the syncQueue.sync in this PR and re-run the tests, all of them still pass.

I'm not sure why I lost this comment. By the way, @iamgabrielma, you're right, but at the same time, it's quite difficult to replicate thread safety issues, so these tests serve more as a sort of sanity check rather than unit tests. By the way, if you have any suggestions for improvement, I'm open to hearing them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely tricky.

I don't have a better suggestion for the test themselves, but I think that we should change the names at a minimum so they doesn't conflate that are testing something that aren't, in this case we don't know if the operations are thread safe, for example: Accessing viewStorage from a background thread and verifying its non-nil state does not confirm thread-safety, or the "reset" test just demonstrates that the function completes without immediate errors on a background thread, not its thread safety.

Which bring the point of: Are these tests really necessary? If they are, they should be renamed to not miss-guide what they test.

}

// MARK: - Helpers
Expand Down
32 changes: 18 additions & 14 deletions Yosemite/Yosemite/Tools/EntityListener.swift
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,8 @@ public class EntityListener<T: ReadOnlyType> {
///
public var onDelete: (() -> Void)?

/// Serial queue for thread safety.
private let queue = DispatchQueue(label: "com.automattic.woocommerce.CoreDataManager.syncQueue")

/// Designated Initializer.
///
Expand Down Expand Up @@ -66,22 +68,24 @@ private extension EntityListener {
/// Handles the ContextObjectDidChange Notification triggered by the ViewContext.
///
func viewContextDidChange(notification: Notification) {
guard let note = ManagedObjectsDidChangeNotification(notification: notification) else {
return
}
queue.sync {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ Do we also need a queue here? i.e. Do we have to execute these callbacks synchronously?

guard let note = ManagedObjectsDidChangeNotification(notification: notification) else {
return
}

/// Scenario: Upsert (Insert + Update + Refresh)
///
if let storageEntity = readOnlyConvertible(from: note.upsertedObjects, representing: readOnlyEntity),
let updatedEntity = storageEntity.toTypeErasedReadOnly() as? T {
readOnlyEntity = updatedEntity
onUpsert?(readOnlyEntity)
}
/// Scenario: Upsert (Insert + Update + Refresh)
///
if let storageEntity = readOnlyConvertible(from: note.upsertedObjects, representing: readOnlyEntity),
let updatedEntity = storageEntity.toTypeErasedReadOnly() as? T {
readOnlyEntity = updatedEntity
onUpsert?(readOnlyEntity)
}

/// Scenario: Nuked
///
if let _ = readOnlyConvertible(from: note.deletedObjects, representing: readOnlyEntity) {
onDelete?()
/// Scenario: Nuked
///
if let _ = readOnlyConvertible(from: note.deletedObjects, representing: readOnlyEntity) {
onDelete?()
}
}
}

Expand Down
43 changes: 43 additions & 0 deletions Yosemite/YosemiteTests/Tools/EntityListenerTests.swift
Original file line number Diff line number Diff line change
Expand Up @@ -145,4 +145,47 @@ class EntityListenerTests: XCTestCase {

wait(for: [expectation], timeout: Constants.expectationTimeout)
}


/// Tests the thread safety of the readOnlyEntity property.
///
func test_thread_safety_for_readOnlyEntity() {
// Given: A sample storage account is inserted and saved.
let storageAccount = storageManager.insertSampleAccount()
viewContext.saveIfNeeded()

// And: An EntityListener is set up with the readOnly entity being the inserted storage account.
let listener = EntityListener(viewContext: viewContext, readOnlyEntity: storageAccount.toReadOnly())

// And: Expectations for multiple threads.
let expectation1 = self.expectation(description: "Thread 1")
let expectation2 = self.expectation(description: "Thread 2")
let expectation3 = self.expectation(description: "Thread 3")

// When: Multiple threads access the readOnlyEntity property concurrently.
DispatchQueue.global().async {
for _ in 0..<1000 {
_ = listener.readOnlyEntity
}
expectation1.fulfill()
}

DispatchQueue.global().async {
for _ in 0..<1000 {
_ = listener.readOnlyEntity
}
expectation2.fulfill()
}

DispatchQueue.global().async {
for _ in 0..<1000 {
_ = listener.readOnlyEntity
}
expectation3.fulfill()
}

// Then
wait(for: [expectation1, expectation2, expectation3], timeout: 0.1)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as with the above tests, if we remove the syncqueue changes to make the calls single-threaded this test still passes, so it's unlikely that's testing the changes in the way we want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this either. How about using DispatchQueue.concurrentPerform like this suggestion?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it, implementing it in this way:

func test_thread_safety_for_readOnlyEntity() {
    // Given: A sample storage account is inserted and saved.
    let storageAccount = storageManager.insertSampleAccount()
    viewContext.saveIfNeeded()

    // And: An EntityListener is set up with the readOnly entity being the inserted storage account.
    let listener = EntityListener(viewContext: viewContext, readOnlyEntity: storageAccount.toReadOnly())

    // When: Multiple threads access the readOnlyEntity property concurrently.
    let iterations = 100
    let expectation = self.expectation(description: "Concurrent access")

    DispatchQueue.global(qos: .userInitiated).async {
        DispatchQueue.concurrentPerform(iterations: iterations) { index in
            print("Iteration \(index)")
            _ = listener.readOnlyEntity
        }
        expectation.fulfill()
    }

    // Then: If no crashes or exceptions occurred, the test is considered successful.
    wait(for: [expectation], timeout: 10.0)
    XCTAssert(true)
}

But no luck, the test still passes. I would say that it's probably also "normal"? Since it's quite difficult to replicate thread safety issues?


}