Add sampling period option #14

Closed
wants to merge 1 commit into from

longguzzz

I added a new way to limit how often edits are stored: a fixed sampling period. It prevents time deviation from accumulating.
For example:
Store times with inSecondsBetweenEdits set to 60 when editing continuously: 12:00:50, 12:01:53, 12:02:54, 12:03:57, 12:05:00
Store times with secondsOfEditSamplingPeriod set to 60 when editing continuously: 12:00:50, 12:01:01, 12:02:03, 12:03:02, 12:04:01, 12:05:03

This is useful when the time data in the .edtz file is used for statistics, especially with a short interval, e.g. 20 s.
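
The difference between the two rules can be sketched as a pair of skip checks. This is a minimal illustration, not the plugin's code: the two comparisons are the ones quoted later in this thread, while the function names, the `at` helper, and the use of milliseconds since midnight are assumptions made for the example.

```typescript
// Hypothetical helper: milliseconds since midnight for a wall-clock time.
function at(hours: number, minutes: number, seconds: number): number {
    return ((hours * 60 + minutes) * 60 + seconds) * 1000;
}

// Minimum-gap rule: skip the edit if it is too close to the last stored one.
function skipWithMinGap(editMs: number, lastStoredMs: number, minMsBetweenEdits: number): boolean {
    return (editMs - lastStoredMs) < minMsBetweenEdits;
}

// Sampling-period rule: skip the edit if it falls in the same (or an earlier)
// period bucket as the last stored one.
function skipWithSamplingPeriod(editMs: number, lastStoredMs: number, msOfEditSamplingPeriod: number): boolean {
    return Math.floor(editMs / msOfEditSamplingPeriod) <= Math.floor(lastStoredMs / msOfEditSamplingPeriod);
}

const lastStored = at(12, 1, 53);
const nextEdit = at(12, 2, 3);

// Only 10 s have passed, so the minimum-gap rule skips this edit.
const skippedByMinGap = skipWithMinGap(nextEdit, lastStored, 60_000);
// The edit is in a new 60 s bucket, so the sampling rule stores it.
const skippedBySampling = skipWithSamplingPeriod(nextEdit, lastStored, 60_000);
```

With the sampling rule, a late store in one period does not push the next store later, which is the property the description above relies on.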

@longguzzz longguzzz mentioned this pull request Mar 20, 2024
@antoniotejada
Owner

antoniotejada commented Mar 21, 2024

Thanks for taking the time to do this pull request, but I fail to see the value.

I assume you are seeing the "time deviation" in the drop down of versions. Any "time deviation" there is probably due to:

  • zip dates having DOS precision, see
    // jszip stores dates in UTC but the zip standard and zip tools
    // expect the date in local times (DOS times). Also, note that
    // dates in zip are only accurate to even seconds because DOS
    // times only use 16 bits, which can only fit 5 bits for
    // seconds.
  • the time it takes to save the zip file and the code using the zip file date as proxy for the date of the most recent file in the zip, essentially this comparison:
    (zipFile != null) && ((file.stat.mtime - zipFile.stat.mtime) < this.minMsBetweenEdits)) {

    Unzipping on every modification is out of the question because it's a hotpath, but an easy fix could be to change the date of the zip file when saved to that of the last file in the zip. I can't remember if Obsidian regular file api allows that or you have to use the other file api.

At any rate, I don't think it's a problem that needs fixing (especially given DOS date precision of zip contents).
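
To make the precision point concrete, here is a small sketch of the 16-bit MS-DOS time field used by zip entries (5 bits for hours, 6 for minutes, 5 for seconds divided by two). The function names are made up for illustration; this is not how jszip or the plugin exposes it.

```typescript
// Pack a wall-clock time into a 16-bit DOS time: hhhhh mmmmmm sssss.
// The 5-bit seconds field stores seconds / 2, so odd seconds are lost.
function toDosTime(hours: number, minutes: number, seconds: number): number {
    return (hours << 11) | (minutes << 5) | (seconds >> 1);
}

// Unpack a 16-bit DOS time back into hours, minutes and (even) seconds.
function fromDosTime(dosTime: number): { hours: number; minutes: number; seconds: number } {
    return {
        hours: (dosTime >> 11) & 0x1f,
        minutes: (dosTime >> 5) & 0x3f,
        seconds: (dosTime & 0x1f) * 2,
    };
}

// 12:01:53 comes back as 12:01:52 after a round trip.
const roundTripped = fromDosTime(toDosTime(12, 1, 53));
```

Any timestamp read back from a zip entry can therefore be up to one second earlier than the time that was written.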

@longguzzz longguzzz closed this Mar 21, 2024
@longguzzz
Author

Thanks for the reminder, you are right. My approach, Math.floor(file.stat.mtime / this.msOfEditSamplingPeriod) <= Math.floor(zipFile.stat.mtime / this.msOfEditSamplingPeriod), only helps in a statistical sense; it cannot eliminate the deviation of an individual timestamp caused by DOS date precision.

@antoniotejada
Owner

FYI I created #15 to implement changing the zip file date to match the last version file date, which should help prevent time drift.

@longguzzz
Author

FYI I created #15 to implement changing the zip file date to match the last version file date, which should help prevent time drift.

Thank you. However, after giving it some more thought, I've come to believe that behavioral statistics and tracking should be implemented independently from the version history recording feature. If the timing information used for statistics is strongly bound to the complete text of the file, the space consumed could end up being tens of times that of the original note after a long period of statistics gathering. To save space, we would eventually be forced to discard older historical statistical information. Therefore, I will look for other solutions for tracking and statistics. Your code is a very good example of how to detect Obsidian events and track edit behaviors. Thank you very much!

@antoniotejada
Owner

No problem, whatever works for your use case.

Having said that,

  • I remembered that I don't use the DOS zip file dates for the dropdown; I actually use the timestamp encoded in the file name, see
    for (let filepath of filepaths) {
        const utcepoch = this.plugin.getEditEpoch(filepath);
        const date = new Date(utcepoch);
        select.addOption(filepath, date.toLocaleString());
    }
  • I doubt you have to be concerned about saving space, the versions are stored as diffs (deltas), so storing the version every 20 minutes takes the same space as storing the version every 20 seconds (ignoring text deletions and the overhead of diff marks and each file entry in the zip).
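
The space argument can be illustrated with a deliberately simplified model. It assumes a note that only grows by appending, and it models each stored delta as just the characters appended since the previous snapshot; the plugin's real diff format, its diff marks, and the per-entry zip overhead are not modeled, and all names below are hypothetical.

```typescript
interface StorageCost {
    snapshots: number;   // number of stored versions
    deltaChars: number;  // total characters across all stored deltas
}

// Store a snapshot every `charsPerSnapshot` appended characters and add up
// the size of the appended-text deltas.
function appendOnlyCost(finalLength: number, charsPerSnapshot: number): StorageCost {
    let snapshots = 0;
    let deltaChars = 0;
    let storedLength = 0;
    while (storedLength < finalLength) {
        const nextLength = Math.min(storedLength + charsPerSnapshot, finalLength);
        deltaChars += nextLength - storedLength; // only the newly appended text
        storedLength = nextLength;
        snapshots += 1;
    }
    return { snapshots, deltaChars };
}

const frequent = appendOnlyCost(6000, 100);  // many small deltas
const rare = appendOnlyCost(6000, 6000);     // one large delta
```

In this model both runs store the same number of delta characters; only the number of entries differs, which corresponds to the per-entry overhead mentioned above.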

@longguzzz
Author

obsidian-edit-history is good at detecting edit behavior, which gave me the idea of using it for statistics. So I tried to figure out whether its time information could be used for that, which is why this PR was created. The time deviation "problem" only occurs when someone tries to use it to record high-frequency editing behavior.

In my opinion, the drift comes mainly from (file.stat.mtime - zipFile.stat.mtime) < this.minMsBetweenEdits. minMsBetweenEdits is a lower bound on the time until the next stored edit. There is always a gap between the moment last_modified_time + minMsBetweenEdits and the first valid edit detected after it, and that gap accumulates. Therefore, sampling cannot avoid the time deviation.
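
A small simulation can show how the gap adds up. It is only a sketch under stated assumptions: edits arrive every 7 seconds, the zip file's mtime is taken to equal the time of the last stored edit (save time and DOS rounding are ignored), and the helper names are hypothetical. It compares the minimum-gap rule with the period-bucket rule from this PR.

```typescript
// Returns true when the edit at `editSec` should NOT be stored.
type SkipRule = (editSec: number, lastStoredSec: number) => boolean;

// Replay a sequence of edit times and collect the ones that get stored.
function storedTimes(editTimesSec: number[], skip: SkipRule): number[] {
    const stored: number[] = [];
    for (const t of editTimesSec) {
        if (stored.length === 0 || !skip(t, stored[stored.length - 1])) {
            stored.push(t);
        }
    }
    return stored;
}

// Continuous editing: one edit every 7 s, from 0 s to 420 s.
const edits = Array.from({ length: 61 }, (_, i) => i * 7);

// Minimum gap of 60 s since the last stored edit.
const minGap = storedTimes(edits, (t, last) => t - last < 60);
// Fixed 60 s sampling periods.
const sampled = storedTimes(edits, (t, last) => Math.floor(t / 60) <= Math.floor(last / 60));

// Offset of each stored time from the start of its 60 s slot.
const minGapOffsets = minGap.map((t, i) => t - i * 60);
const sampledOffsets = sampled.map((t, i) => t - i * 60);
```

Under these assumptions the minimum-gap offsets grow by 3 seconds per stored edit, while the bucketed offsets stay below the 7-second edit spacing. Individual timestamps still deviate in both cases; only the accumulation differs.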

However, I believe my use case is rare (so I decided to find another way to do statistics). Most people use this plugin mainly for ordinary backups, so there is actually no problem for obsidian-edit-history.
