feat: refactor file storage to fix thread safety issues #144
51229e4 to 095cfdb
android/src/main/java/com/amplitude/android/migration/StorageKeyMigration.kt
val readMutex = Mutex()
val filePathSet: MutableSet<String> = Collections.newSetFromMap(ConcurrentHashMap<String, Boolean>())
val curFile: MutableMap<String, File> = ConcurrentHashMap<String, File>()
private val writeMutex = Mutex()
Thanks @falconandy.
I was hoping that we could resolve the issue by updating the mutexes so that a single instance is shared for a given instanceName, rather than each storage instance having its own.
I would like to avoid changing the storage format or file structure if possible.
Some ideas:
getMutex(directory, storageKey): Mutex
a provider that creates a mutex, or returns the existing mutex, for this directory/storageKey. This could allow multiple file managers to access the same files without concurrency issues.
getEventsFileManager(): EventsFileManager
a provider for the EventsFileManager instance used in the FileStorage, which could ensure a single instance of EventsFileManager for a given directory/storageKey. This may be less code, but it is also less ideal because logger and other constructor args would likely need to be a single instance as well.
wdyt?
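A minimal sketch of the first idea, under stated assumptions: the names `StorageLocks` and `getLock` are hypothetical and not part of the SDK, and `java.util.concurrent.locks.ReentrantLock` stands in for the coroutine `Mutex` so that the example needs only the standard library. It shows one lock being handed out per (directory, storageKey) pair regardless of how many storage instances ask for it.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.locks.ReentrantLock

// Hypothetical registry: one lock per (directory, storageKey) pair, shared process-wide.
// ReentrantLock is a stand-in for kotlinx.coroutines.sync.Mutex (assumption, to stay stdlib-only).
object StorageLocks {
    private val locks = ConcurrentHashMap<Pair<String, String>, ReentrantLock>()

    fun getLock(directory: String, storageKey: String): ReentrantLock =
        // computeIfAbsent is atomic, so concurrent callers with the same key
        // always receive the same lock instance.
        locks.computeIfAbsent(directory to storageKey) { ReentrantLock() }
}

fun main() {
    val a = StorageLocks.getLock("/data/amplitude", "instance-1")
    val b = StorageLocks.getLock("/data/amplitude", "instance-1")
    val c = StorageLocks.getLock("/data/amplitude", "instance-2")
    println(a === b) // true: same directory and storageKey share one lock
    println(a === c) // false: a different storageKey gets its own lock
}
```

Keying on a pair rather than a concatenated string avoids accidental collisions between a directory that contains a separator and a storageKey that does.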
Current format: a single line containing a JSON array. If the app crashes or terminates for some reason, the line can be left unfinished (an event object can be partially written). In that case we can't easily repair the JSON array, and all events are lost. With the new format (one line per event) we can skip only the broken lines, so most lines/events will still be readable.
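As an illustration of the recovery argument above, here is a hedged sketch of reading a one-event-per-line file. The function names `readEvents` and `parseEventOrThrow` are hypothetical, and the "parser" is only a stand-in that checks whether a line looks like a complete object; real code would use an actual JSON parser in its place.

```kotlin
// Stand-in for a real JSON parser (assumption): accepts only lines that look like
// a complete object and throws on anything else, e.g. a line cut off by a crash.
fun parseEventOrThrow(line: String): String {
    val trimmed = line.trim()
    require(trimmed.startsWith("{") && trimmed.endsWith("}")) { "truncated event" }
    return trimmed
}

// Reads one event per line and silently skips lines that fail to parse,
// so a single partially written event does not discard the others.
fun readEvents(content: String, parse: (String) -> String = ::parseEventOrThrow): List<String> =
    content.lineSequence()
        .filter { it.isNotBlank() }
        .mapNotNull { line -> runCatching { parse(line) }.getOrNull() }
        .toList()

fun main() {
    val fileContent = """
        {"event_type":"open"}
        {"event_type":"click"}
        {"event_type":"pur
    """.trimIndent()
    println(readEvents(fileContent).size) // 2: the truncated last line is skipped
}
```

With a single-line JSON array, the same truncation would make the whole array unparseable, which is the loss described above.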
I would prefer timestamps instead of indexes. Indexes have to be stored in storage, which complicates the code.
val client1 = Amplitude(Configuration())
val client2 = Amplitude(Configuration())
Should client1.flush() flush client2's events? What if the flush options are different?
Right now the code doesn't support "sharing" properly: there is local state that is not shared. A couple of examples:
https://github.com/amplitude/Amplitude-Kotlin/blob/main/core/src/main/java/com/amplitude/core/platform/EventPipeline.kt#L25
https://github.com/amplitude/Amplitude-Kotlin/blob/main/core/src/main/java/com/amplitude/core/platform/intercept/IdentifyInterceptor.kt#L29
Also look at https://github.com/amplitude/Amplitude-Kotlin/blob/main/core/src/main/java/com/amplitude/core/utilities/EventsFileManager.kt#L30-L32 to see how long-lived read "locks" are implemented to prevent reading the same files.
It looks like we should ensure there are no duplicate Amplitude instances with the same instance name. Maybe Amplitude.getInstance(instanceName) is a better way than Amplitude(...). Alternatively, we can throw or log an error in Amplitude(...) if a new instance is created with an existing instance name.
wdyt?
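The two options above could look roughly like the sketch below. Everything here is an assumption for illustration: `Client` is a stand-in for the SDK client (the real `Amplitude` class takes a `Configuration`), and `ClientRegistry`, `getInstance`, and `register` are hypothetical names, not existing SDK API.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for the SDK client class.
class Client(val instanceName: String)

object ClientRegistry {
    private val instances = ConcurrentHashMap<String, Client>()

    // Option 1: return the existing client for this name, creating it at most once.
    fun getInstance(instanceName: String): Client =
        instances.computeIfAbsent(instanceName) { Client(it) }

    // Option 2: keep direct construction, but report a duplicate instance name.
    // Returns false (and logs) when the name is already taken.
    fun register(client: Client): Boolean {
        val previous = instances.putIfAbsent(client.instanceName, client)
        if (previous != null) {
            System.err.println("Duplicate instance name: ${client.instanceName}")
        }
        return previous == null
    }
}

fun main() {
    val a = ClientRegistry.getInstance("default")
    val b = ClientRegistry.getInstance("default")
    println(a === b) // true: one client per instance name
    println(ClientRegistry.register(Client("default"))) // false: name already in use
}
```

Option 1 makes duplicates impossible by construction, while option 2 only detects them, so it still leaves the caller holding a second client that shares files with the first.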
Resolved in #181
Summary
Refactored file storage to fix thread safety issues.
json-event-per-line structure instead of single-line-with-json-event-array.
Checklist