The callback is invoked by librsync, so there is no direct control over the parameters it is given.
You could add a kwarg to the patch function that lets the caller supply their own callback instead of the default one. That callback could then do its own buffering.
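A minimal sketch of what such a buffering layer might look like, assuming the request pattern reported in the issue (same start position, growing length). `CachingReader` and `read_at` are hypothetical names, not part of python-librsync or librsync; a custom callback would call something like `read_at(pos, length)` instead of seeking and reading the file directly.

```python
import io


class CachingReader(object):
    """Hypothetical helper: keeps one contiguous window of the file in memory."""

    def __init__(self, fileobj):
        self._f = fileobj
        self._start = 0             # file offset of the first cached byte
        self._cache = bytearray()   # bytes cached from self._start onward
        self.bytes_from_file = 0    # demonstration counter only

    def read_at(self, pos, length):
        end = pos + length
        cache_end = self._start + len(self._cache)
        if not (self._start <= pos <= cache_end):
            # Request starts outside the cached window: start a new window.
            self._start, self._cache = pos, bytearray()
            cache_end = pos
        if end > cache_end:
            # Fetch only the tail that is not cached yet.
            self._f.seek(cache_end)
            tail = self._f.read(end - cache_end)
            self.bytes_from_file += len(tail)
            self._cache += tail
        offset = pos - self._start
        return bytes(self._cache[offset:offset + length])


# Replay the first three requests from the issue against an in-memory file.
reader = CachingReader(io.BytesIO(b"\0" * 200000))
for n in (2048, 65536, 131072):
    reader.read_at(0, n)
print(reader.bytes_from_file)  # 131072, each byte fetched once
```

With that pattern, each call fetches only the bytes past the previous request. The window is never trimmed in this sketch, so a real version would need to cap its size for very large files.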
It is possible the OS is buffering the read data already, obviating the need for actually going to disk, but YMMV.
On the other hand perhaps some changes could happen in librsync itself that would eliminate this reading pattern (algorithm optimizations). Of course librsync is old and mature so that is likely easier said (or typed) than done.
In the patch function, the read_cb callback reads the same file position repeatedly.
This leads to poor performance on large files.
How to reproduce:
To see the problem, add a line to the read_cb function.
See:
https://github.com/smartfile/python-librsync/blob/master/librsync/__init__.py#L221
Add the following line before the f.seek(pos) line:
print "pos:",pos," length:",length
The code should now look like this:
Now store this test case as a file and execute it:
The output is:
python redundant-read.py
pos: 0 length: 2048
pos: 0 length: 65536
pos: 0 length: 131072
pos: 0 length: 196608
pos: 0 length: 262144
pos: 0 length: 327680
pos: 0 length: 393216
pos: 0 length: 458752
pos: 0 length: 524288
pos: 0 length: 589824
pos: 0 length: 655360
pos: 0 length: 720896
pos: 0 length: 786432
pos: 0 length: 851968
pos: 0 length: 917504
pos: 0 length: 983040
Here, the read_cb callback reads the file from position 0 again and again, each time with a larger length.
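The same pattern could probably be observed without editing the library by wrapping the basis file in a logging object. This is only a sketch: it assumes patch touches the file solely through seek() and read() with explicit lengths, which is not verified here, and `ReadLogger` and `summarize` are hypothetical names. The demo replays the lengths from the output above against an in-memory file rather than calling librsync.

```python
import io


class ReadLogger(object):
    """Hypothetical wrapper that records (pos, length) for every read."""

    def __init__(self, fileobj):
        self._f = fileobj
        self.reads = []

    def seek(self, pos, whence=io.SEEK_SET):
        return self._f.seek(pos, whence)

    def tell(self):
        return self._f.tell()

    def read(self, length):
        self.reads.append((self._f.tell(), length))
        return self._f.read(length)


def summarize(reads):
    """Return (bytes requested in total, distinct bytes covered)."""
    requested = sum(length for _, length in reads)
    covered, reach = 0, 0
    for pos, length in sorted(reads):
        start, end = max(pos, reach), pos + length
        if end > start:
            covered += end - start
            reach = end
    return requested, covered


# Replay the request lengths from the output above.
lengths = [2048] + [65536 * k for k in range(1, 16)]
logged = ReadLogger(io.BytesIO(b"\0" * 983040))
for n in lengths:
    logged.seek(0)
    logged.read(n)
print(summarize(logged.reads))  # (7866368, 983040)
```

For the 16 reads shown, roughly 7.5 MiB is requested to cover 960 KiB of the file, about eight times the data actually needed.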