feat: sonda tool #2893

Merged
merged 34 commits into from
Jul 12, 2024
34 commits
ecc6d6c
adding initial files
gabrielmer Jul 5, 2024
e751b12
adding makefile
gabrielmer Jul 5, 2024
5de37ce
fixing compilation error
gabrielmer Jul 8, 2024
5ecd12b
creating waku node configuration
gabrielmer Jul 8, 2024
3170bf6
initializing node
gabrielmer Jul 8, 2024
ecf74df
adding docker compose
gabrielmer Jul 8, 2024
878a3e3
fixing env variables
gabrielmer Jul 9, 2024
637d032
configuring bootstrap nodes and shard
gabrielmer Jul 9, 2024
bdc418e
adding store nodes env and fixing script logs
gabrielmer Jul 9, 2024
35ddf99
adding query period and sending sonda message
gabrielmer Jul 9, 2024
a330aa0
adding message querying
gabrielmer Jul 9, 2024
1114828
adding timestamp to messages and queries
gabrielmer Jul 9, 2024
096a6d6
initial prometheus integration
gabrielmer Jul 9, 2024
c0e5f6d
changing name and removin unnecessary flag
gabrielmer Jul 9, 2024
9e08daa
adding env example
gabrielmer Jul 9, 2024
9deb676
adding labels
gabrielmer Jul 9, 2024
cbd9752
refactor
gabrielmer Jul 9, 2024
619615d
debugging
gabrielmer Jul 9, 2024
b4e1c68
creating sonda dashboard and fixing bug
gabrielmer Jul 10, 2024
c561cc9
adding visualizations
gabrielmer Jul 10, 2024
c58f737
fixing dashboard
gabrielmer Jul 10, 2024
6590684
updating env variable name
gabrielmer Jul 10, 2024
44f67ba
rename again env variables
gabrielmer Jul 10, 2024
0de03b3
improving grafana legends
gabrielmer Jul 10, 2024
22c1106
file cleanup
gabrielmer Jul 10, 2024
71744f9
refactors
gabrielmer Jul 10, 2024
3527083
adding health metric
gabrielmer Jul 10, 2024
ba84d3c
updating .env example
gabrielmer Jul 10, 2024
106eb34
removing sonda from make configs
gabrielmer Jul 10, 2024
23eca9e
removing debug logs
gabrielmer Jul 10, 2024
86ac059
changing delay in example
gabrielmer Jul 10, 2024
07bee43
removing comments
gabrielmer Jul 10, 2024
d72f130
added README
gabrielmer Jul 10, 2024
fe9a875
fixed puntuation
gabrielmer Jul 10, 2024
33 changes: 33 additions & 0 deletions apps/sonda/.env.example
@@ -0,0 +1,33 @@
# RPC URL for accessing testnet via HTTP.
# e.g. https://sepolia.infura.io/v3/123aa110320f4aec179150fba1e1b1b1
RLN_RELAY_ETH_CLIENT_ADDRESS=

# Private key of the testnet account holding the Sepolia ETH that will be staked into the RLN contract.
# Note: make sure you don't use the '0x' prefix.
# e.g. 0116196e9a8abed42dd1a22eb63fa2a5a17b0c27d716b87ded2c54f1bf192a0b
ETH_TESTNET_KEY=

# Password you would like to use to protect your RLN membership.
RLN_RELAY_CRED_PASSWORD=

# Advanced. Can be left empty in normal use cases.
NWAKU_IMAGE=
NODEKEY=
DOMAIN=
EXTRA_ARGS=
RLN_RELAY_CONTRACT_ADDRESS=

# -------------------- SONDA CONFIG ------------------
CLUSTER_ID=16
SHARD=32
# Comma separated list of store nodes to poll
STORE_NODES="/dns4/store-01.do-ams3.shards.test.status.im/tcp/30303/p2p/16Uiu2HAmAUdrQ3uwzuE4Gy4D56hX6uLKEeerJAnhKEHZ3DxF1EfT,\
/dns4/store-02.do-ams3.shards.test.status.im/tcp/30303/p2p/16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R,\
/dns4/store-01.gc-us-central1-a.shards.test.status.im/tcp/30303/p2p/16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ,\
/dns4/store-02.gc-us-central1-a.shards.test.status.im/tcp/30303/p2p/16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM,\
/dns4/store-01.ac-cn-hongkong-c.shards.test.status.im/tcp/30303/p2p/16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT,\
/dns4/store-02.ac-cn-hongkong-c.shards.test.status.im/tcp/30303/p2p/16Uiu2HAm9CQhsuwPR54q27kNj9iaQVfyRzTGKrhFmr94oD8ujU6P"
Comment on lines +21 to +29
Contributor Author

Should I change the .env example to use cluster-id 1 and some other store nodes? If so, which ones? I didn't see any store nodes for TWN at https://fleets.status.im/

Collaborator

I think it's fine that way 🥳

# Wait time in seconds between two consecutive queries
QUERY_DELAY=60
# Consecutive successful store requests to consider a store node healthy
HEALTH_THRESHOLD=5
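
The `HEALTH_THRESHOLD` comment describes a streak-based rule: a store node counts as healthy only after a number of consecutive successful store requests. A minimal sketch of that rule follows. The `HealthTracker` class and its method names are hypothetical and are not taken from `sonda.py`. The sketch also assumes that a single failed request resets the streak, which the comment implies but does not state.

```python
from collections import defaultdict


class HealthTracker:
    """Hypothetical helper: counts consecutive successful store queries per node."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.streaks = defaultdict(int)  # node -> current success streak

    def record(self, node: str, success: bool) -> bool:
        """Record one query result and return whether the node is now healthy."""
        # Assumption: any failure resets the streak; a success extends it.
        self.streaks[node] = self.streaks[node] + 1 if success else 0
        return self.is_healthy(node)

    def is_healthy(self, node: str) -> bool:
        return self.streaks[node] >= self.threshold


tracker = HealthTracker(threshold=5)
results = [tracker.record("store-01", True) for _ in range(5)]
print(results)                             # [False, False, False, False, True]
print(tracker.record("store-01", False))   # False
```

With `HEALTH_THRESHOLD=5` and `QUERY_DELAY=60`, a node would need roughly five minutes of uninterrupted successful responses before it is reported healthy under this reading.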
4 changes: 4 additions & 0 deletions apps/sonda/.gitignore
@@ -0,0 +1,4 @@
.env
keystore
rln_tree
.env
3 changes: 3 additions & 0 deletions apps/sonda/Dockerfile.sonda
@@ -0,0 +1,3 @@
FROM python:3.9.18-alpine3.18

RUN pip install requests argparse prometheus_client
52 changes: 52 additions & 0 deletions apps/sonda/README.md
@@ -0,0 +1,52 @@
# Sonda

Sonda is a tool to monitor store nodes and measure their performance.

It works by running a `nwaku` node, publishing a message from it at a fixed interval, and then sending a store query to every store node being monitored to check that each one returns the most recently published message.
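
The publish-then-query cycle described above can be sketched roughly as follows. This is only an illustration and not the code in `sonda.py`: `publish` and `query_store` are hypothetical callables standing in for the requests made to the local `nwaku` node, and `run_round` and `monitor` are names invented for this example.

```python
import time
from typing import Callable, Dict, List, Sequence


def run_round(
    publish: Callable[[], str],
    query_store: Callable[[str, str], bool],
    store_nodes: Sequence[str],
) -> Dict[str, bool]:
    """Publish one message, then ask every store node whether it holds it.

    `publish` returns an identifier for the message just sent, and
    `query_store(node, message_id)` returns True if `node` reports it.
    Both are placeholders for calls to the local nwaku node.
    """
    message_id = publish()
    return {node: query_store(node, message_id) for node in store_nodes}


def monitor(
    publish: Callable[[], str],
    query_store: Callable[[str, str], bool],
    store_nodes: Sequence[str],
    delay_seconds: float,
    rounds: int,
) -> List[Dict[str, bool]]:
    """Run `rounds` publish/query cycles, sleeping `delay_seconds` between them."""
    history = []
    for i in range(rounds):
        history.append(run_round(publish, query_store, store_nodes))
        if i < rounds - 1:
            time.sleep(delay_seconds)
    return history
```

Passing the two operations in as callables keeps the loop independent of any particular HTTP endpoint, so it can be exercised with stubs before pointing it at a real node.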

## Instructions

1. Create a `.env` file containing the configuration parameters.
You can start by copying `.env.example` and adapting it to your use case:

```
cp .env.example .env
${EDITOR} .env
```

The variables that must be set for Sonda are:

```
CLUSTER_ID=
SHARD=
# Comma separated list of store nodes to poll
STORE_NODES=
# Wait time in seconds between two consecutive queries
QUERY_DELAY=
# Consecutive successful store requests to consider a store node healthy
HEALTH_THRESHOLD=
```

2. If you want to query nodes in `cluster-id` 1, you must first register an RLN membership. Otherwise, you can skip this step.

To register, you need:
* An Ethereum Sepolia HTTP RPC endpoint (the format shown in `.env.example`). Get one free from [Infura](https://www.infura.io/).
* An Ethereum Sepolia account with a small balance (<0.01 ETH). Get some [here](https://www.infura.io/faucet/sepolia).
* A password to protect your RLN membership.

Fill in the `RLN_RELAY_ETH_CLIENT_ADDRESS`, `ETH_TESTNET_KEY` and `RLN_RELAY_CRED_PASSWORD` env variables and run:

```
./register_rln.sh
```

3. Start Sonda by running:

```
docker-compose up -d
```

4. Browse to http://localhost:3000/dashboards and monitor the performance.

There are two Grafana dashboards: `nwaku-monitoring`, which tracks the stats of your own node that publishes messages and performs queries, and `sonda-monitoring`, which tracks the responses of the store nodes.
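
As a rough illustration of how the Sonda variables from step 1 fit together, the sketch below reads them from the environment and derives the pubsub topic. The function names are hypothetical, and the defaults are assumptions copied from `docker-compose.yml`. The topic layout `/waku/2/rs/<CLUSTER_ID>/<SHARD>` also follows the `--pubsub-topic` argument in that file.

```python
import os
from typing import List, Mapping


def pubsub_topic(cluster_id: str, shard: str) -> str:
    # Same layout as the --pubsub-topic argument in docker-compose.yml.
    return f"/waku/2/rs/{cluster_id}/{shard}"


def split_store_nodes(raw: str) -> List[str]:
    # Tolerate surrounding whitespace and empty entries (e.g. a trailing comma).
    return [node.strip() for node in raw.split(",") if node.strip()]


def load_config(env: Mapping[str, str] = os.environ) -> dict:
    # `or` treats an empty value like a missing one; the defaults are
    # assumptions taken from the compose file, not from sonda.py.
    return {
        "pubsub_topic": pubsub_topic(env.get("CLUSTER_ID") or "1", env.get("SHARD") or "0"),
        "store_nodes": split_store_nodes(env.get("STORE_NODES") or ""),
        "query_delay": int(env.get("QUERY_DELAY") or "60"),
        "health_threshold": int(env.get("HEALTH_THRESHOLD") or "5"),
    }


config = load_config({"CLUSTER_ID": "16", "SHARD": "32", "STORE_NODES": "node-a, node-b,"})
print(config["pubsub_topic"])  # /waku/2/rs/16/32
print(config["store_nodes"])   # ['node-a', 'node-b']
```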

107 changes: 107 additions & 0 deletions apps/sonda/docker-compose.yml
@@ -0,0 +1,107 @@

version: "3.7"
x-logging: &logging
  logging:
    driver: json-file
    options:
      max-size: 1000m

# Environment variable definitions
x-rln-relay-eth-client-address: &rln_relay_eth_client_address ${RLN_RELAY_ETH_CLIENT_ADDRESS:-} # Add your RLN_RELAY_ETH_CLIENT_ADDRESS after the "-"

x-rln-environment: &rln_env
  RLN_RELAY_CONTRACT_ADDRESS: ${RLN_RELAY_CONTRACT_ADDRESS:-0xCB33Aa5B38d79E3D9Fa8B10afF38AA201399a7e3}
  RLN_RELAY_CRED_PATH: ${RLN_RELAY_CRED_PATH:-} # Optional: Add your RLN_RELAY_CRED_PATH after the "-"
  RLN_RELAY_CRED_PASSWORD: ${RLN_RELAY_CRED_PASSWORD:-} # Optional: Add your RLN_RELAY_CRED_PASSWORD after the "-"

x-sonda-env: &sonda_env
  CLUSTER_ID: ${CLUSTER_ID:-1}
  SHARD: ${SHARD:-0}
  STORE_NODES: ${STORE_NODES:-}
  QUERY_DELAY: ${QUERY_DELAY-60}
  HEALTH_THRESHOLD: ${HEALTH_THRESHOLD-5}

# Services definitions
services:
  nwaku:
    image: ${NWAKU_IMAGE:-harbor.status.im/wakuorg/nwaku:v0.30.1}
    restart: on-failure
    ports:
      - 30304:30304/tcp
      - 30304:30304/udp
      - 9005:9005/udp
      - 127.0.0.1:8003:8003
      - 80:80 #Let's Encrypt
      - 8000:8000/tcp #WSS
      - 127.0.0.1:8645:8645
    <<:
      - *logging
    environment:
      DOMAIN: ${DOMAIN}
      NODEKEY: ${NODEKEY}
      RLN_RELAY_CRED_PASSWORD: "${RLN_RELAY_CRED_PASSWORD}"
      RLN_RELAY_ETH_CLIENT_ADDRESS: *rln_relay_eth_client_address
      EXTRA_ARGS: ${EXTRA_ARGS}
      STORAGE_SIZE: ${STORAGE_SIZE}
      <<:
        - *rln_env
        - *sonda_env
    volumes:
      - ./run_node.sh:/opt/run_node.sh:Z
      - ${CERTS_DIR:-./certs}:/etc/letsencrypt/:Z
      - ./rln_tree:/etc/rln_tree/:Z
      - ./keystore:/keystore:Z
    entrypoint: sh
    command:
      - /opt/run_node.sh

  sonda:
    build:
      context: .
      dockerfile: Dockerfile.sonda
    ports:
      - 127.0.0.1:8004:8004
    environment:
      <<:
        - *sonda_env
    command: >
      python -u /opt/sonda.py
      --delay-seconds=${QUERY_DELAY}
      --pubsub-topic=/waku/2/rs/${CLUSTER_ID}/${SHARD}
      --store-nodes=${STORE_NODES}
      --health-threshold=${HEALTH_THRESHOLD}
    volumes:
      - ./sonda.py:/opt/sonda.py:Z
    depends_on:
      - nwaku

  prometheus:
    image: docker.io/prom/prometheus:latest
    volumes:
      - ./monitoring/prometheus-config.yml:/etc/prometheus/prometheus.yml:Z
    command:
      - --config.file=/etc/prometheus/prometheus.yml
    # ports:
    #   - 127.0.0.1:9090:9090
    restart: on-failure:5
    depends_on:
      - nwaku

  grafana:
    image: docker.io/grafana/grafana:latest
    env_file:
      - ./monitoring/configuration/grafana-plugins.env
    volumes:
      - ./monitoring/configuration/grafana.ini:/etc/grafana/grafana.ini:Z
      - ./monitoring/configuration/dashboards.yaml:/etc/grafana/provisioning/dashboards/dashboards.yaml:Z
      - ./monitoring/configuration/datasources.yaml:/etc/grafana/provisioning/datasources/datasources.yaml:Z
      - ./monitoring/configuration/dashboards:/var/lib/grafana/dashboards/:Z
      - ./monitoring/configuration/customizations/custom-logo.svg:/usr/share/grafana/public/img/grafana_icon.svg:Z
      - ./monitoring/configuration/customizations/custom-logo.svg:/usr/share/grafana/public/img/grafana_typelogo.svg:Z
      - ./monitoring/configuration/customizations/custom-logo.png:/usr/share/grafana/public/img/fav32.png:Z
    ports:
      - 0.0.0.0:3000:3000
    restart: on-failure:5
    depends_on:
      - prometheus
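
The `sonda` service above passes four flags to `sonda.py`, which is not shown on this page. A sketch of a command-line parser that accepts exactly those flags might look like the following. Only the flag names come from the compose file. The value types, the defaults, and the comma-splitting of `--store-nodes` are assumptions.

```python
import argparse
from typing import List, Optional


def _comma_list(value: str) -> List[str]:
    # Assumption: store nodes arrive as one comma-separated string.
    return [item for item in value.split(",") if item]


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Sonda CLI sketch (illustrative only)")
    parser.add_argument("--delay-seconds", type=int, default=60,
                        help="wait time between two consecutive queries")
    parser.add_argument("--pubsub-topic", type=str, default="/waku/2/rs/1/0",
                        help="topic to publish to and query for")
    parser.add_argument("--store-nodes", type=_comma_list, default=[],
                        help="comma-separated multiaddresses of store nodes")
    parser.add_argument("--health-threshold", type=int, default=5,
                        help="consecutive successes needed to count as healthy")
    return parser.parse_args(argv)


args = parse_args(["--delay-seconds=30", "--store-nodes=node-a,node-b"])
print(args.delay_seconds, args.store_nodes)  # 30 ['node-a', 'node-b']
```

Because `argparse` is part of the Python standard library, a parser like this needs no extra package in the image.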
