From 22d8b4869fa7a28fc0e4e9124e78cb5f66ac04d8 Mon Sep 17 00:00:00 2001
From: anomit ghosh <anomit.ghosh@gmail.com>
Date: Wed, 29 Nov 2023 20:53:59 +0530
Subject: [PATCH] details of data markets, sources, project IDs

---
 .../Snapshotter/implementations.md            | 13 ++++
 .../Snapshotter/snapshot_build.md             | 69 ++++++++++++++++++-
 docs/Protocol/data_sources.md                 |  6 ++
 3 files changed, 87 insertions(+), 1 deletion(-)

diff --git a/docs/Protocol/Specifications/Snapshotter/implementations.md b/docs/Protocol/Specifications/Snapshotter/implementations.md
index 5755bba..e4277dd 100644
--- a/docs/Protocol/Specifications/Snapshotter/implementations.md
+++ b/docs/Protocol/Specifications/Snapshotter/implementations.md
@@ -22,6 +22,12 @@ Use case specific logic of generating snapshots as well as other configuration a
 
 The architecture has been designed to facilitate the seamless interchange of configuration and modules. Adapting the system to different use cases is as straightforward as changing a Git branch.
 
+The corresponding branches can be found in the [snapshotter-configs](https://github.com/PowerLoom/snapshotter-configs/) and [snapshotter-computes](https://github.com/PowerLoom/snapshotter-computes/) repos:
+
+* `eth_uniswapv2`
+* `eth_uniswapv2_5_pairs` in `snapshotter-configs`, which corresponds to `eth_uniswapv2_lite` in `snapshotter-computes`
+* `zkevm_quests`
+
 ### Configuration Files
 
 Configuration files, located in the `/config` directory are linked to [snapshotter-configs](https://github.com/PowerLoom/snapshotter-configs/) repo, play a pivotal role in defining project types, specifying paths for individual compute modules, and managing various project-related settings.
@@ -30,3 +36,10 @@ Configuration files, located in the `/config` directory are linked to [snapshott
 ### Compute Modules
 
 The heart of the system resides in the `snapshotter/modules` directory that's linked to [snapshotter-computes](https://github.com/PowerLoom/snapshotter-computes/), where the actual computation logic for each project type is defined. These modules drive the snapshot generation process for specific project types.
+
+
+# Useful links
+
+* [Snapshot generation specifications](/docs/Protocol/Specifications/Snapshotter/snapshot_build.md)
+* [Data markets and sources](/docs/Protocol/data_sources.md)
+* [Composition of snapshots and higher order datapoints](/docs/Protocol/data_composition.md)
\ No newline at end of file
diff --git a/docs/Protocol/Specifications/Snapshotter/snapshot_build.md b/docs/Protocol/Specifications/Snapshotter/snapshot_build.md
index ec74c8d..57d06e1 100644
--- a/docs/Protocol/Specifications/Snapshotter/snapshot_build.md
+++ b/docs/Protocol/Specifications/Snapshotter/snapshot_build.md
@@ -14,6 +14,73 @@ An important advantage of Bulk Mode is that, since all transaction receipts are
 
 Every time a new project is added for either of these two types on the protocol state smart contract by an off-chain data source-detector and signaller, a ProjectUpdated event is emitted according to the following data model
 
+## Snapshot computation modules
+
+As briefly introduced in the section on snapshotter implementations that [leverage Git Submodules for specific computation logic](/docs/Protocol/Specifications/Snapshotter/implementations.md), the modules are specified in the configuration for project types under the key `processor`.
+
+```json reference
+https://github.com/PowerLoom/snapshotter-configs/blob/39e4713cdd96fff99d100f1dea7fb7332df9e491/projects.example.json#L15-L28
+```
+
+Let us take the example of the snapshot builder configured for the project type `zkevm:owlto_bridge` and locate it in the `zkevm_quests` branch of the `snapshotter-computes` repo.
+
+```python reference
+https://github.com/PowerLoom/snapshotter-computes/blob/29199feab449ad0361b5867efcaae9854992966f/owlto_bridge.py#L1-L31
+```
+
+As you can observe, it implements the `compute()` interface expected from snapshotter implementations inheriting `GenericProcessorSnapshot`.
+
+```python reference
+https://github.com/PowerLoom/pooler/blob/634610801a7fcbd8d863f2e72a04aa8204d27d03/snapshotter/utils/callback_helpers.py#L179-L196
+```
+
+
 ## Base snapshots
 
-## Aggregate snapshots
\ No newline at end of file
+Callback workers calculate base snapshots against an `epochId`, which corresponds to the collection of state observations and event logs between the block heights in the range `begin, end`. They call the use-case-specific computation logic as configured in the [computation modules section](#snapshot-computation-modules).
+
+The data sources are determined according to the following specification for the `projects` key in the configuration:
+
+* An empty array against the `projects` key indicates that no specific data source is defined, if `bulk_mode` is `False`
+  * the snapshotter node attempts to retrieve data sources corresponding to the `projects` key from the protocol state
+
+```python reference
+https://github.com/PowerLoom/pooler/blob/634610801a7fcbd8d863f2e72a04aa8204d27d03/snapshotter/processor_distributor.py#L304-L332
+```
+
+* If the `projects` key is non-existent
+  * data sources can also be dynamically added on the protocol state contract which the [processor distributor](#processor-distributor) [syncs with](https://github.com/PowerLoom/pooler/blob/d8b7be32ad329e8dcf0a7e5c1b27862894bc990a/snapshotter/processor_distributor.py#L1107)
+
+```python reference
+https://github.com/PowerLoom/pooler/blob/634610801a7fcbd8d863f2e72a04aa8204d27d03/snapshotter/processor_distributor.py#L738-L751
+```
+
+
+* Otherwise, the `projects` key can hold a [static list of contracts](/docs/Protocol/data_sources.md#static-data-sources)
+
+### Format of data sources added as `projects`
+* EVM-compatible wallet address strings
+* `"<addr1>_<addr2>"` strings that denote the relationship between two EVM addresses (e.g., the ERC20 balance of `addr2` against a token contract `addr1`)
+
+
+### Project ID Generation
+
+```python reference
+https://github.com/PowerLoom/pooler/blob/634610801a7fcbd8d863f2e72a04aa8204d27d03/snapshotter/utils/snapshot_worker.py#L51-L71
+```
+
+## Aggregate snapshots
+
+Aggregate and higher-order snapshots build on computed base snapshots. They are configured in their respective repos, as in the following example from our Uniswap V2 Dashboard use case.
+
+```json reference
+https://github.com/PowerLoom/snapshotter-configs/blob/fcf9b852bac9694258d7afcd8beeaa4cf961c65f/aggregator.example.json#L1-L29
+```
+
+The order and dependencies of these compositions are specified by the `aggregate_on` key.
+
+### `SingleProject` aggregation type
+
+
+
+### `MultiProject` aggregation type
\ No newline at end of file
diff --git a/docs/Protocol/data_sources.md b/docs/Protocol/data_sources.md
index 5b41b58..d570ce5 100644
--- a/docs/Protocol/data_sources.md
+++ b/docs/Protocol/data_sources.md
@@ -39,9 +39,15 @@ Data sources can be dynamically added to the contract according to the role of c
 
 In the present implementation of the use case that tracks wallet activity for Quests on Polygon zkEVM, such wallets are added from a data feed supplied by Mercle that consists of wallets that signup on their platform. Only these wallet addresses are of interest to the Quest platform on Mercle for their activities to be tracked across DEXs and asset bridges.
 
+Read more about it in the [snapshotter specs on bulk mode](/docs/Protocol/Specifications/Snapshotter/snapshot_build.md#bulk-mode).
+
 
 ## Project types and IDs
 
+All data sources are tracked with a project ID on the protocol. Think of a project ID as identifying a stream of datasets, finalized by consensus against [each epoch released](/docs/Protocol/Specifications/Epoch.md#1-epoch_released) on the protocol.
+
+Find more details on this in the [specifications of snapshot generation](/docs/Protocol/Specifications/Snapshotter/snapshot_build.md).
+
 
 ## Useful links and concepts