Listen to chain events and update twin on requests #194
```diff
@@ -71,4 +71,6 @@ message Envelope {
     bytes plain = 13;
     bytes cipher = 14;
   }
+
+  optional string relays = 17;
 }
```
```diff
@@ -4,6 +4,7 @@ use std::time::Duration;
 use anyhow::{Context, Result};
 use clap::{builder::ArgAction, Parser};
 use rmb::cache::RedisCache;
+use rmb::events;
 use rmb::redis;
 use rmb::relay::{
     self,
```
```diff
@@ -142,14 +143,11 @@ async fn app(args: Args) -> Result<()> {
         .await
         .context("failed to initialize redis pool")?;

-    // we use 6 hours cache for twin information because twin id will not change anyway
-    // and we only need twin public key for validation only.
-    let twins = SubstrateTwinDB::<RedisCache>::new(
-        args.substrate,
-        RedisCache::new(pool.clone(), "twin", Duration::from_secs(args.cache * 60)),
-    )
-    .await
-    .context("cannot create substrate twin db object")?;
+    let redis_cache = RedisCache::new(pool.clone(), "twin");
+
+    let twins = SubstrateTwinDB::<RedisCache>::new(args.substrate.clone(), redis_cache.clone())
+        .await
+        .context("cannot create substrate twin db object")?;

     let max_users = args.workers as usize * args.user_per_worker as usize;
     let opt = relay::SwitchOptions::new(pool.clone())
```
```diff
@@ -175,6 +173,14 @@ async fn app(args: Args) -> Result<()> {
     let r = relay::Relay::new(&args.domain, twins, opt, federation, limiter, ranker)
         .await
         .unwrap();
+
+    let mut l = events::Listener::new(args.substrate, redis_cache).await?;
+    tokio::spawn(async move {
+        if let Err(e) = l.listen().await {
```
I see a problem here. If the listener fails for any reason (and returns an error here), it's not good for the system that the relay keeps running, because it would rely on outdated data forever.

I checked the listener code and there are many points where it can return an error, for example `cache.flush()`. But what if Redis was temporarily unavailable or restarting for some reason? That would cause the system to keep working with a cache that is never updated. IMHO the best approach is to make the listener infallible. This can be accomplished by basically never giving up on errors.
```diff
+            log::error!("failed to listen to events: {:#}", e);
+        }
+    });

     r.start(&args.listen).await.unwrap();
     Ok(())
 }
```
@@ -0,0 +1,105 @@ (new file)

```rust
use std::collections::LinkedList;

use crate::{cache::Cache, tfchain::tfchain, twin::Twin};
use anyhow::Result;
use futures::StreamExt;
use log;
use subxt::{OnlineClient, PolkadotConfig};

#[derive(Clone)]
pub struct Listener<C>
where
    C: Cache<Twin>,
{
    cache: C,
    api: OnlineClient<PolkadotConfig>,
    substrate_urls: LinkedList<String>,
}

impl<C> Listener<C>
where
    C: Cache<Twin> + Clone,
{
    pub async fn new(substrate_urls: Vec<String>, cache: C) -> Result<Self> {
        let mut urls = LinkedList::new();
```
I am sure you can create a linked list directly from a `Vec`: `LinkedList::from_iter(substrate_urls)`. Just saying.
```rust
        for url in substrate_urls {
            urls.push_back(url);
        }

        let api = Self::connect(&mut urls).await?;

        cache.flush().await?;
        Ok(Listener {
            api,
            cache,
            substrate_urls: urls,
        })
    }

    async fn connect(urls: &mut LinkedList<String>) -> Result<OnlineClient<PolkadotConfig>> {
        let trials = urls.len() * 2;
        for _ in 0..trials {
            let url = match urls.front() {
                Some(url) => url,
                None => anyhow::bail!("substrate urls list is empty"),
            };

            match OnlineClient::<PolkadotConfig>::from_url(url).await {
                Ok(client) => return Ok(client),
                Err(err) => {
                    log::error!(
                        "failed to create substrate client with url \"{}\": {}",
                        url,
                        err
                    );
                }
            }

            if let Some(front) = urls.pop_front() {
                urls.push_back(front);
            }
        }

        anyhow::bail!("failed to connect to substrate using the provided urls")
    }

    pub async fn listen(&mut self) -> Result<()> {
        loop {
            // always flush in case some blocks were finalized before reconnecting
            self.cache.flush().await?;
            match self.handle_events().await {
                Err(err) => {
                    if let Some(subxt::Error::Rpc(_)) = err.downcast_ref::<subxt::Error>() {
                        self.api = Self::connect(&mut self.substrate_urls).await?;
                    } else {
                        return Err(err);
                    }
                }
                Ok(_) => {
                    // reconnect here too?
                    self.api = Self::connect(&mut self.substrate_urls).await?;
                }
            }
        }
    }

    async fn handle_events(&self) -> Result<()> {
        log::info!("started chain events listener");
        let mut blocks_sub = self.api.blocks().subscribe_finalized().await?;
        while let Some(block) = blocks_sub.next().await {
            let events = block?.events().await?;
            for evt in events.iter() {
                let evt = evt?;
```
Why do you assume this is a connection error here? It can also be an encoding error (most probably), which means you would be flushing the entire db and reconnecting after receiving a single bad event type. I am wondering if we'd better log this error and continue instead? What do you think?
```rust
                if let Ok(Some(twin)) = evt.as_event::<tfchain::tfgrid_module::events::TwinStored>()
                {
                    self.cache.set(twin.0.id, twin.0.into()).await?;
                } else if let Ok(Some(twin)) =
                    evt.as_event::<tfchain::tfgrid_module::events::TwinUpdated>()
                {
                    self.cache.set(twin.0.id, twin.0.into()).await?;
                }
            }
        }
        Ok(())
    }
}
```
```diff
@@ -0,0 +1,3 @@
+mod events;
+
+pub use events::Listener;
```
Check the warning from
Why is this just a single string, not a list of strings? It would be better to send it as a `[]string` IMHO.
If we're going to re-introduce this again, why not reuse the old `federation` field? Is that to keep compatibility with old peers that may interpret it differently?

Also, it is expected to have a delay of some seconds after switching the relay before it can receive messages. That won't be as bad as DNS propagation time, which can take anywhere from a few hours up to 48 hours to propagate a new domain name for a website worldwide :D

So even if we didn't go with this path (adding the `relays` field) we would still be just fine. How often would a node switch its relays, given that we are storing multiple relays per twin on-chain? Is that even supported in zos nodes?

@muhamadazmy @AbdelrahmanElawady
Yes, it is currently used to set the relay of the twin destination in `rmb-sdk-go` here, so it would break compatibility with older versions.
Also, on the delay part: for clients that use this new feature there should be no delay, as the cache will be updated with the requests made. For clients that are not using this field and are changing relays, there should be a delay of up to 6 seconds until new blocks are produced.

This is already an improvement over the currently used relay warmer, which adds a delay of up to 10 minutes; previously, before the warmer, it could take up to 60 seconds.