Releases: earthstar-project/earthstar
v5.8.0: pathSuffix queries
pathSuffix queries
Added pathSuffix as a query option, to get documents whose paths END WITH the given string.
storage.documents({ pathSuffix: '.txt' })
It can be combined with pathPrefix.
Note that pathPrefix and pathSuffix can overlap: for example
storage.documents({
    pathPrefix: "/abc",
    pathSuffix: "bcd"
})
will match /abcd as well as /abc/xxxxxxx/bcd.
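The overlap case can be checked with a small sketch. The `pathMatches` helper and `PathQuery` type below are hypothetical and exist only for illustration; they are not Earthstar's query code, just plain string tests that behave the way the note above describes.

```typescript
// Hypothetical helper, not part of Earthstar: a path matches when it
// satisfies every option that is present in the query.
interface PathQuery {
    pathPrefix?: string;
    pathSuffix?: string;
}

function pathMatches(path: string, query: PathQuery): boolean {
    if (query.pathPrefix !== undefined && !path.startsWith(query.pathPrefix)) {
        return false;
    }
    if (query.pathSuffix !== undefined && !path.endsWith(query.pathSuffix)) {
        return false;
    }
    return true;
}

const overlapQuery: PathQuery = { pathPrefix: "/abc", pathSuffix: "bcd" };

// The prefix and suffix share the characters "bc" in this path.
const matchesOverlapping = pathMatches("/abcd", overlapQuery);
// The prefix and suffix are far apart in this path.
const matchesSeparate = pathMatches("/abc/xxxxxxx/bcd", overlapQuery);
// The prefix matches here but the suffix does not.
const matchesPrefixOnly = pathMatches("/abc/xyz", overlapQuery);
```

Because the two checks are independent, nothing stops the prefix and the suffix from covering the same characters of a short path.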
pathSuffix may sometimes be a slower query to run than pathPrefix, depending on the underlying storage implementation. For now it has the same performance as pathPrefix on the memory and sqlite storage backends.
v5.7.6: deleteMyDocuments
Delete My Documents
Added a new function to delete all your documents from a workspace: deleteMyDocuments.
There are a lot of caveats -- please read the full notes in the README.
Other small changes
...since the last release notes for 5.7.3:
- Set up TypeDoc to generate documentation pages. I'm not sure this is very helpful. To use it, run npm run typedoc and then view typedoc/index.html
- Wrote more comments, using TypeDoc format
- Our subscription class, Emitter, now uses Sets instead of arrays so is more efficient with very large numbers of subscribers
- Wrote a Tutorial
- Added a standalone UMD build which can be directly pulled into browsers without any further build steps. It's in dist/. I hope this works :)
- Set up GitHub CI
- Wrote some speed benchmarks (run with npm run benchmark) and started collecting results in benchmark-results/. This is mostly used to see how the beta branch is coming along, so look at that folder in the beta branch.
v6.0.0-beta.3
Changes
No API changes.
Updated dependencies including Typescript.
Optimized StorageMemory:
- Queries with pathPrefix or limit are up to 5 times faster
- Fixed speed regression on queries with a specific path, getDocument, getContent; now this is as fast as the 5.x versions again
v6.0.0-beta.1
This is a big under-the-hood rewrite with minor user-visible changes. Existing pubs and data will interoperate.
- Javascript API: minor changes
- SQLite schema: unchanged, but now it contains a schema version of 1
- Document format: unchanged
- HTTP pub syncing: unchanged
Javascript API changes
Query objects
Query objects are different. Here's the new type definition.
includeHistory: boolean is now history: "all" | "latest"
Added continueAfter for resuming a query after hitting its limit.
Added limitBytes to limit by total bytes of content returned.
There's now only one way to query by author, just called author. Read the comments (linked above) for details.
contentIsEmpty is now contentLength, contentLength_gt, contentLength_lt
Added timestamp, timestamp_gt, timestamp_lt
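As a rough illustration of how limit and continueAfter can work together, here is a self-contained sketch that pages through an in-memory list. Everything in it is an assumption made for the example: the `PageCursor` shape (a path plus an author), the sort order (path, then author), and the `queryDocsSketch` and `fetchAllPages` helpers. Check the linked type definition for the real query fields.

```typescript
// Hypothetical types for illustration; the real query type may differ.
interface PageCursor { path: string; author: string; }
interface PagedDoc { path: string; author: string; content: string; }
interface PagedQuery { limit?: number; continueAfter?: PageCursor; }

// Assumed ordering: by path, then by author.
function comparePosition(a: PageCursor, b: PageCursor): number {
    if (a.path !== b.path) { return a.path < b.path ? -1 : 1; }
    if (a.author !== b.author) { return a.author < b.author ? -1 : 1; }
    return 0;
}

// A stand-in for storage.documents(query) over an in-memory array.
function queryDocsSketch(allDocs: PagedDoc[], query: PagedQuery): PagedDoc[] {
    let results = allDocs.slice().sort(comparePosition);
    const cursor = query.continueAfter;
    if (cursor !== undefined) {
        // Keep only documents strictly after the cursor position.
        results = results.filter((doc) => comparePosition(doc, cursor) > 0);
    }
    if (query.limit !== undefined) {
        results = results.slice(0, query.limit);
    }
    return results;
}

// Fetch everything, one page at a time, resuming after the last doc seen.
function fetchAllPages(allDocs: PagedDoc[], pageSize: number): PagedDoc[] {
    if (pageSize < 1) { throw new Error("pageSize must be at least 1"); }
    const collected: PagedDoc[] = [];
    let cursor: PageCursor | undefined = undefined;
    while (true) {
        const page = queryDocsSketch(allDocs, { limit: pageSize, continueAfter: cursor });
        collected.push(...page);
        if (page.length < pageSize) { break; } // a short page means we reached the end
        const last = page[page.length - 1];
        cursor = { path: last.path, author: last.author };
    }
    return collected;
}
```

The loop stops on the first page that comes back shorter than the requested limit, so a result set whose size is an exact multiple of the page size costs one extra, empty query.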
Storage objects
IMPORTANT -- You now must run storage.close() when done with a storage object, or your program will hang forever. Storage objects are now clearing out their old expired docs once an hour, and those timers will hang around and prevent your program from exiting unless you close the storage.
The IStorage classes have slightly different methods:
Removed storage.deleteAndClose; now you can pass a delete option to the main close method.
New events: onWillClose and onDidClose. Removed deprecated onChange. onWrite remains the same.
Removed storage.sync method. Instead use the standalone function syncLocal(storage1, storage2).
Added methods for storing config info in a Storage -- it's like another little key-value store besides the Earthstar documents. getConfig, setConfig, etc. This is used to remember which workspace the storage is for, sqlite schema version, and maybe other things like pub URLs in the future. This info is not synced, it's just local.
New StorageLocalStorage class which persists its data to browser LocalStorage. Not really tested yet :)
The live sync algorithms had a problem where they would bounce changed documents right back to the peer that sent them, wasting bandwidth. To fix this, storage objects now have a sessionId which is randomly generated each time they're instantiated. (It's not saved to disk.) When you ingestDocument(doc, fromSessionId) you now also have to tell it which session id gave you the document (from the most direct hop, not the original source). fromSessionId is also available in the onWrite event. This should soon be used to solve the live sync bounce problem, but it's not hooked up there yet.
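Since the bounce fix is not hooked up yet, the following is only a guess at how fromSessionId could be used once it is. The `SessionWriteEvent` shape and the `peersToNotify` helper are hypothetical; the idea is simply to skip the peer that handed us the document.

```typescript
// Hypothetical event shape: only the fields this sketch needs.
interface SessionWriteEvent {
    path: string;
    fromSessionId: string; // the most direct hop that gave us the document
}

// Decide which connected peers should receive a newly written document.
// The peer that sent it to us already has it, so it is left out.
function peersToNotify(event: SessionWriteEvent, connectedSessionIds: string[]): string[] {
    return connectedSessionIds.filter((id) => id !== event.fromSessionId);
}

const bounceExample = peersToNotify(
    { path: "/about/foo.txt", fromSessionId: "session-b" },
    ["session-a", "session-b", "session-c"],
);
// bounceExample contains "session-a" and "session-c" only
```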
Async Storage!
So far all of our storage types have been synchronous. Now there's an IStorageAsync type which returns promises for every method, and we can use that to build IndexedDb support soon.
For testing purposes, you can convert one of the existing synchronous storage types to an async one with the StorageToAsync wrapper class.
Storage subclasses
The various storage types are now implemented as subclasses of the StorageBase class. This will make it easier to add new storage types with less code duplication.
The base class implements most of its functionality (for paths(), authors(), etc) by using just the documents(query) method. When you write a new storage type you can start by just implementing that one method and everything will work, but might be slow. Then you can eventually override the other methods with more optimized versions.
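The pattern looks roughly like the sketch below. The class and type names (`BaseStorageSketch`, `MemoryStorageSketch`, `BaseDoc`, `BaseQuery`) are invented for this example and the real StorageBase has many more methods; the point is only that paths() and authors() fall out of documents(query).

```typescript
// Hypothetical, trimmed-down types for illustration.
interface BaseDoc { path: string; author: string; content: string; }
interface BaseQuery { pathPrefix?: string; }

abstract class BaseStorageSketch {
    // The one method a new storage type has to provide.
    abstract documents(query?: BaseQuery): BaseDoc[];

    // Generic versions built on documents(). Correct, but possibly slow;
    // a subclass can override them with something more direct.
    paths(query?: BaseQuery): string[] {
        const unique = new Set(this.documents(query).map((doc) => doc.path));
        return Array.from(unique).sort();
    }

    authors(): string[] {
        const unique = new Set(this.documents().map((doc) => doc.author));
        return Array.from(unique).sort();
    }
}

// A minimal subclass that only implements documents().
class MemoryStorageSketch extends BaseStorageSketch {
    private stored: BaseDoc[] = [];

    add(doc: BaseDoc): void {
        this.stored.push(doc);
    }

    documents(query: BaseQuery = {}): BaseDoc[] {
        const prefix = query.pathPrefix;
        return this.stored.filter((doc) => prefix === undefined || doc.path.startsWith(prefix));
    }
}
```

A backend such as SQLite could later override paths() with a single SELECT DISTINCT instead of loading every document.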
More efficient sync, soon
There's a new fancy sync algorithm which is not quite hooked up all the way.
It queries the pub for "fingerprints" of all its documents, incrementally, with limit and continueAfter. It compares those fingerprints to the local documents, then pushes and pulls only the docs that need to be sync'd.
A "fingerprint" is a unique identifier of a document: [path, author, timestamp, signature].
This sync algorithm runs as a pipeline of about 10 parallel threads connected by go-style channels from the concurrency-friends package so it should be quick even with network and database latency.
All the brains of the sync algorithm are in the client side. The pub just needs to expose a new fingerprints endpoint.
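The comparison step can be sketched in a few lines. This is a simplification written for these notes, not the unfinished algorithm itself: it works on complete arrays rather than an incremental pipeline, and the `planSync` and `fingerprintKey` names are made up.

```typescript
// [path, author, timestamp, signature], as described above.
type Fingerprint = [string, string, number, string];

// Turn a fingerprint into a string so it can be stored in a Set.
function fingerprintKey(fingerprint: Fingerprint): string {
    return JSON.stringify(fingerprint);
}

// Work out which documents need to travel in each direction.
function planSync(
    local: Fingerprint[],
    remote: Fingerprint[],
): { toPush: Fingerprint[]; toPull: Fingerprint[] } {
    const localKeys = new Set(local.map(fingerprintKey));
    const remoteKeys = new Set(remote.map(fingerprintKey));
    return {
        // We have it and the pub does not.
        toPush: local.filter((f) => !remoteKeys.has(fingerprintKey(f))),
        // The pub has it and we do not.
        toPull: remote.filter((f) => !localKeys.has(fingerprintKey(f))),
    };
}
```

Documents whose fingerprints appear on both sides are never transferred, which is where the bandwidth saving comes from.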
TODO
- Optimize some of the storage methods for Memory and Sqlite
- Test StorageLocalStorage in the browser
- Use sessionId to prevent extra echoing back of documents in live sync (it needs to be provided in the HTTP request by the client)
- Finish the new sync algorithm
- Add an EarthstarPeer class to manage multiple workspace storages at once, and their settings
- Make syncing smarter about dropped server-sent event connections (use Last-Event-Id or handle error events)
- Add RPC-based syncing with mini-rpc
v5.7.3
New method to delete a whole IStorage from disk
The new method storage.deleteAndClose() will close the IStorage instance and then remove all of the storage's data. This is a local deletion only. It does not propagate to other peers and pubs.
For memory storage, this empties out the data from memory.
For SQLite, it deletes the sqlite file from disk.
After an IStorage instance is closed, the only methods you can call are close(), isClosed(), and deleteAndClose(). It's safe to call close and deleteAndClose more than once. Calling any other method will throw a StorageIsClosedError.
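Here is a toy model of those rules, to make the allowed and forbidden calls concrete. `ClosableStorageSketch` and `StorageIsClosedErrorSketch` are stand-ins written for this example; the real classes do much more.

```typescript
// Stand-in for the real error class.
class StorageIsClosedErrorSketch extends Error {
    constructor() {
        super("the storage is closed");
        this.name = "StorageIsClosedError";
    }
}

// A toy storage that follows the closing rules described above.
class ClosableStorageSketch {
    private closed = false;
    private contents = new Map<string, string>();

    isClosed(): boolean {
        return this.closed;
    }

    // Safe to call more than once.
    close(): void {
        this.closed = true;
    }

    // Local deletion only. Also safe to call more than once.
    deleteAndClose(): void {
        this.contents.clear();
        this.closed = true;
    }

    setContent(path: string, content: string): void {
        this.assertOpen();
        this.contents.set(path, content);
    }

    getContent(path: string): string | undefined {
        this.assertOpen();
        return this.contents.get(path);
    }

    // Every method other than the three above goes through this check.
    private assertOpen(): void {
        if (this.closed) {
            throw new StorageIsClosedErrorSketch();
        }
    }
}
```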
v5.7.1
Adding a new helper class: Bus
Commit 3733bf6
Throughout Earthstar we use the Emitter class to subscribe to and send events. I've added a similar class, Bus, which lets you separate your events into channels.
This way if you have several kinds of events you don't need to make several Emitter instances, you can just use one Bus.
// define your channel names and their types
interface Channels {
    click: string,
    drag: number,
}
let bus = new Bus<Channels>();
bus.subscribe("click", (msg) => { console.log("click happened", msg) });
bus.subscribe("drag", (msg) => { console.log("drag happened", msg) });
bus.send("click", "this is the click message");
bus.send("drag", 12345);
// send() returns after the callbacks are done running.
It can also handle async functions:
bus.subscribe("click", async (msg) => {
    console.log("click happened", msg);
    await sleep(1000);
});
await bus.send("click", "hello");
// await send() will block until the callbacks are all done, even the async callbacks
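To show how channels and awaited callbacks can fit together, here is a deliberately small sketch. It is not the real Bus from src/util/emitter.ts: `MiniBus` always returns a promise from send(), runs callbacks one after another, has no "*" channel, and returns an unsubscribe function from subscribe(). All of those are assumptions made for the example.

```typescript
type BusCallback<T> = (msg: T) => void | Promise<void>;

class MiniBus<Channels> {
    // One Set of callbacks per channel name.
    private subscribers = new Map<keyof Channels, Set<BusCallback<any>>>();

    // Register a callback on one channel. Returns an unsubscribe function.
    subscribe<K extends keyof Channels>(channel: K, callback: BusCallback<Channels[K]>): () => void {
        let callbacks = this.subscribers.get(channel);
        if (callbacks === undefined) {
            callbacks = new Set<BusCallback<any>>();
            this.subscribers.set(channel, callbacks);
        }
        callbacks.add(callback);
        const registered = callbacks;
        return () => {
            registered.delete(callback);
        };
    }

    // Resolves once every callback on the channel has finished,
    // including the async ones.
    async send<K extends keyof Channels>(channel: K, msg: Channels[K]): Promise<void> {
        const callbacks = this.subscribers.get(channel);
        if (callbacks === undefined) {
            return;
        }
        // Copy first so callbacks can unsubscribe while we iterate.
        for (const callback of Array.from(callbacks)) {
            await callback(msg);
        }
    }
}
```

Keeping a Set per channel makes both subscribing and unsubscribing cheap, which matters when components re-subscribe often.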
I expect this will be useful when building Layers that you can subscribe to. Imagine something like this:
// imaginary code
let todoLayer = new TodoLayer(myStorage);
todoLayer.bus.subscribe("todo:new", (todo) => { ...... });
// Each Todo could have its own channel named after its id
todoLayer.bus.subscribe("todo:update:1833298", (todo) => { ...... });
Bus scales efficiently to thousands of channels and can handle frequent re-subscriptions quickly, so it can be used from React to subscribe to each Todo separately.
This might also be used in IStorage, like:
// imaginary code
// subscribe to a single document
myStorage.bus.subscribe("doc:update:/about/foo/bar.txt", (doc) => { ...... });
Lastly, you can subscribe to all channels using "*", but read the comments below for details.
There are extensive comments in the code describing how it works in more detail.
Code: https://github.com/earthstar-project/earthstar/blob/master/src/util/emitter.ts
Tests: https://github.com/earthstar-project/earthstar/blob/master/src/test/emitter.test.ts
v5.7.0
Improved Syncer API
The new syncer class has a new method:
// (syncer is an instance of OnePubOneWorkspaceSyncer)
syncer.syncOnceAndContinueLive();
This will start a bulk sync and then continue with a live-streaming sync. In most cases this is what you want to do when your page loads.
As before, you can stop sync by:
syncer.stopPushStream();
syncer.stopPullStream();
This class is too complex and will change again soon -- here's how it's being used from Foyer, which now has just a single "sync switch" which controls both bulk and live syncing:
v5.6.0
Live streaming of sync'd changes
There's a new kind of Syncer class which can do live sync -- sending and receiving documents as they change. It can also do bulk sync, which is how it worked before.
The old Syncer class is now deprecated but still works.
The new class is OnePubOneWorkspaceSyncer which is in sync2.ts.
Note that live sync captures write events so it only includes document changes that occur after the sync begins. It does not include existing documents. The expected use case is: get a live sync running first to start capturing documents, then fire off syncOnce on top of it.
Also note that the new syncer is only for one pub and one workspace, so you'll have a lot more of these objects to manage than before. You must call close() on them when you no longer need them (e.g. when removing a pub or switching to a new workspace).
Changes to earthstar-pub
A new version earthstar-pub ^5.6.1 adds support for pull streaming from the pub to the client.
Existing earthstar-pub versions will continue to work for batch sync and for push streaming to the pub.
New Syncer API examples
let syncer = new OnePubOneWorkspaceSyncer(myStorage, myPubUrl);
// live streams
syncer.startPullStream(); // begin listening for newly changed documents on the pub
syncer.startPushStream(); // begin sending new locally changed documents to the pub
syncer.stopPullStream();
syncer.stopPushStream();
// batch upload and download
await syncer.pushOnce();
await syncer.pullOnce();
await syncer.syncOnce(); // a sync is just a push and a pull
// get notified when the syncer changes state
syncer.onStateChange.subscribe(state => {...});
// make sure to close the syncer when you're discarding it.
// this closes the network connection and unsubscribes from the local storage events
syncer.close()
// syncer state looks like this.
// this is useful for rendering UI that shows what's going on.
let exampleState: SyncerState = {
    isPushStreaming: false,
    isPullStreaming: false,
    isBulkPulling: false, // pullOnce()
    isBulkPushing: false, // pushOnce()
    isBulkSyncing: false, // the overall progress of syncOnce(), which is a wrapper around pullOnce() and pushOnce().
    closed: false,
    lastCompletedBulkPush: 0, // timestamps in microseconds
    lastCompletedBulkPull: 0,
}
How it works
For pull streaming, we use SSE (server-sent events) to send documents down from the pub as they change. This is a single long-running HTTP call which trickles information slowly as it occurs. The browser will maintain the connection and re-establish it if needed. The pub also sends keep-alive messages every 28 seconds.
For push streaming, each document is uploaded to the pub in a separate POST request, just like with batch push, except now it's one document at a time. This should be optimized to collect documents into a small array and send them every 500ms in a batch.
Demo
- Open up Earthstar Foyer in a regular tab and a private browsing tab, or use two browsers. (*)
- Check the "Live" checkbox on both tabs
- Make a change on one tab and it will quickly appear on the other tab
(*) You have to use 2 browsers because otherwise the tabs will share a LocalStorage instance, which causes trouble.
v5.5.0
commit: fec8354
Minor change to IStorage.set() behavior: bumping timestamps forward
Background
When setting a document, Earthstar can bump the document timestamp forward if necessary so it becomes the winning, latest document for that path. This can happen when there's clock skew between peers and another peer has given you a document "from the future" which you want to overwrite.
We do this for two reasons:
- To avoid surprises where you set a document and then immediately read a different one back
- If you already have other documents in your storage for that path, you "know about them" so your document should have a higher timestamp to reflect this potential causality between documents.
Change
Previously, this bumping behavior always happened. Now it only happens if you omit the timestamp to set(). If you provide a timestamp, we will no longer change it for you.
// automatically choose a timestamp
// which will be max(now, highest_existing_timestamp_in_this_path)
storage.set({ path: '/foo', content: 'bar' })
// this manually provided timestamp will not be altered
storage.set({ path: '/foo', content: 'bar', timestamp: Date.now()*1000 })
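The rule can be written down as a small function. `chooseTimestamp` is a hypothetical helper, not something exported by Earthstar, and it follows the max(now, highest existing) formula from the comment above literally; the library's own logic may differ in detail.

```typescript
// All values are in microseconds.
// `highestExisting` is the highest timestamp already stored for the path,
// or undefined if the path has no documents yet.
function chooseTimestamp(
    nowMicroseconds: number,
    highestExisting: number | undefined,
    provided?: number,
): number {
    // A manually provided timestamp is used as-is.
    if (provided !== undefined) {
        return provided;
    }
    // Nothing stored for this path yet: just use the current time.
    if (highestExisting === undefined) {
        return nowMicroseconds;
    }
    // Otherwise bump forward past clock-skewed documents if needed.
    return Math.max(nowMicroseconds, highestExisting);
}

const bumpedForward = chooseTimestamp(1000, 5000);    // 5000: an existing doc is "from the future"
const notBumped = chooseTimestamp(1000, 500);         // 1000: the clock is already ahead
const leftAlone = chooseTimestamp(1000, 5000, 42);    // 42: provided by the caller, unchanged
```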
Better timestamp error checking for set()
set() now also detects invalid timestamps and returns a ValidationError. This usually happens if you've forgotten to multiply your timestamp by 1000 -- remember that all timestamps in Earthstar are in microseconds, not milliseconds.
Like most Earthstar errors, it's returned, not thrown. Detect the error like this:
let result = storage.set({ path: '/foo', content: 'bar', timestamp: Date.now() });
if (isErr(result)) {
// handle error
}
You can also do a more specific check: if (result instanceof ValidationError).
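The "returned, not thrown" style can be tried out with the stand-alone sketch below. The names ending in `Sketch` and the lower bound are assumptions for illustration; the bound is chosen only so that present-day millisecond values fall below it, and it is not taken from Earthstar's source.

```typescript
// Illustrative lower bound in microseconds (10^13). Not Earthstar's constant.
const MIN_TIMESTAMP_SKETCH = 10000000000000;

// Stand-in for the real ValidationError.
class ValidationErrorSketch extends Error {}

// Stand-in for isErr: errors are ordinary return values here.
function isErrSketch(value: unknown): value is Error {
    return value instanceof Error;
}

// Returns true for a plausible timestamp, or an error object otherwise.
function checkTimestampSketch(timestamp: number): true | ValidationErrorSketch {
    if (!Number.isInteger(timestamp) || timestamp < MIN_TIMESTAMP_SKETCH) {
        return new ValidationErrorSketch("timestamp too small; is it in milliseconds?");
    }
    return true;
}

const someMilliseconds = 1600000000000; // what Date.now() returned at some point in 2020
const forgotToMultiply = checkTimestampSketch(someMilliseconds);
const inMicroseconds = checkTimestampSketch(someMilliseconds * 1000);

if (isErrSketch(forgotToMultiply)) {
    // handle the error; nothing was thrown
}
```

Returning the error keeps the happy path free of try/catch, at the cost that callers must remember to check the result.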
v5.4.0
More detailed write events
Commit 8682832: WriteEvents have a new property, isLatest, which tells you if the written document is the latest one for that path. In other words, is it the "head", the one that will be returned from getDocument(path)? This is sometimes false when we obtain older synced documents which update an old item back in the history for a given path.
If your application ignores history documents and only uses the default latest document (e.g. as returned by getDocument), you probably want to ignore WriteEvents that have isLatest: false.
In settings where you're using history documents and doing your own conflict resolution (e.g. with documents({ includeHistory: true })), you will be interested in all of the WriteEvents.
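For the first case, an app that keeps a simple path-to-content view might filter events like this. The `LatestWriteEvent` shape and the `applyIfLatest` helper are hypothetical and cut down to what the example needs.

```typescript
// Hypothetical, minimal event shape for illustration.
interface LatestWriteEvent {
    document: { path: string; content: string };
    isLatest: boolean;
}

// Keep a "latest content per path" view up to date, ignoring
// writes that only fill in older history.
function applyIfLatest(view: Map<string, string>, event: LatestWriteEvent): void {
    if (!event.isLatest) {
        return;
    }
    view.set(event.document.path, event.document.content);
}

const latestView = new Map<string, string>();
applyIfLatest(latestView, { document: { path: "/a", content: "new" }, isLatest: true });
// An older document for the same path arrives later via sync:
applyIfLatest(latestView, { document: { path: "/a", content: "old" }, isLatest: false });
// latestView still maps "/a" to "new"
```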
Enforced immutability of document objects
Commit 92ea0bf: Whenever an IStorage touches a document object, it calls Object.freeze(doc) on it. This happens when ingesting documents and also when returning them from getDocument() etc.
Document objects should always be treated as immutable; this just enforces that rule.
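In practice this means that changing a document always goes through a copy. The object below is a plain stand-in with made-up fields, not a real Earthstar document.

```typescript
// A stand-in document, frozen the same way a storage would freeze it.
const frozenDoc = Object.freeze({ path: "/about/foo.txt", content: "hello" });

// TypeScript types the result as Readonly<...>, so assigning to
// frozenDoc.content is rejected at compile time.

// To "edit" a document, build a modified copy instead.
const editedCopy = { ...frozenDoc, content: "goodbye" };

const originalIsFrozen = Object.isFrozen(frozenDoc); // true
const copyIsFrozen = Object.isFrozen(editedCopy);    // false: the copy is a fresh object
```

Note that Object.freeze is shallow, so it only protects the top-level properties of the object it is called on.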