Add a sb3 algo + policy for domains with graph observations #441

nhuet · 2024-11-19T21:23:58Z

We reuse our stable_baselines3 wrapper.
The policy extracts features from the graph with a GNN.
The GNN uses pytorch-geometric.
We subclass:
- ActorCriticPolicy:
  - feature extractor = gnn
  - custom observation-to-torch conversion that produces torch_geometric.data.Data objects
- PPO, to properly handle:
  - observation conversion
  - rollout buffer
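The observation conversion mentioned above can be sketched in plain Python, without torch, to show only the change of layout. This is an illustration and not the code of this PR: `GraphObs` and `to_pyg_fields` are hypothetical names, and the sketch assumes the observation stores its edges as one `[source, target]` pair per edge (as gymnasium's `GraphInstance.edge_links` does), whereas torch_geometric expects `edge_index` with one row of sources and one row of targets.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GraphObs:
    """Hypothetical stand-in for a graph observation.

    Field names mirror gymnasium's GraphInstance (nodes, edges, edge_links).
    """

    nodes: List[List[float]]  # one feature row per node
    edge_links: List[List[int]]  # one [source, target] pair per edge
    edges: Optional[List[List[float]]] = None  # one feature row per edge, if any


def to_pyg_fields(obs: GraphObs) -> dict:
    """Rearrange an observation into the x / edge_index / edge_attr layout.

    The real conversion would wrap these in tensors and in a
    torch_geometric.data.Data object; only the layout is shown here.
    """
    sources = [link[0] for link in obs.edge_links]
    targets = [link[1] for link in obs.edge_links]
    return {
        "x": obs.nodes,
        "edge_index": [sources, targets],  # shape (2, num_edges)
        "edge_attr": obs.edges,
    }


obs = GraphObs(
    nodes=[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    edge_links=[[0, 1], [0, 2], [2, 1]],
    edges=[[1.0], [2.0], [3.0]],
)
fields = to_pyg_fields(obs)
print(fields["edge_index"])  # [[0, 0, 2], [1, 2, 1]]
```

In a torch-based version, the same transposition is what turns an `(num_edges, 2)` array into the `(2, num_edges)` tensor that torch_geometric layers consume.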
Current limitations:
- for now we extract a fixed number of features (independent of the number of nodes and edges), because the GNN ends with a feature reduction layer connected to a classic MLP that knows nothing about the current graph structure
User input: the user can define the following (defaults are used otherwise):
- the GNN (defaults to a 2-layer GCN), whose inputs follow torch_geometric conventions:
  - x: node features
  - edge_index: edge indices or sparse transposed adjacency matrix
  - edge_attr (optional): edge features
  - edge_weight (optional): edge weights (taken from the first dimension of edge_attr)
- the feature reduction layer from the GNN output to the fixed number of features (defaults to global_max_pool + linear layer + ReLU)
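To illustrate why the default reduction yields a fixed number of features whatever the graph size, here is a minimal plain-Python sketch of "global max pool + linear layer + ReLU". It is not the implementation in this PR (which relies on torch and torch_geometric); the function names, weights, and embeddings below are made up for the example.

```python
from typing import List


def global_max_pool(node_embeddings: List[List[float]]) -> List[float]:
    """Feature-wise maximum over all nodes of a single graph."""
    return [max(column) for column in zip(*node_embeddings)]


def linear_relu(
    vector: List[float], weights: List[List[float]], bias: List[float]
) -> List[float]:
    """Dense layer followed by ReLU; one output per row of `weights`."""
    return [
        max(0.0, sum(w * v for w, v in zip(row, vector)) + b)
        for row, b in zip(weights, bias)
    ]


def reduce_features(
    node_embeddings: List[List[float]],
    weights: List[List[float]],
    bias: List[float],
) -> List[float]:
    """Pool the per-node GNN outputs, then project to a fixed-size vector."""
    return linear_relu(global_max_pool(node_embeddings), weights, bias)


# Illustrative parameters: embedding size 2, 3 output features.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
bias = [0.0, 0.0, 0.5]

small_graph = [[0.2, 0.1], [0.4, 0.3]]  # 2 nodes
large_graph = [[0.2, 0.1], [0.4, 0.3], [0.0, 0.9], [0.7, 0.2], [0.1, 0.1]]  # 5 nodes

small_features = reduce_features(small_graph, weights, bias)
large_features = reduce_features(large_graph, weights, bias)
print(len(small_features), len(large_features))  # 3 3
```

The pooling step removes the node dimension, so the MLP that follows always receives the same input size; this is also why it no longer sees the graph structure, as noted under the current limitations.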

We also introduce a multi-input policy to take into account, for instance, static graph features. In that case the observation space is a DictSpace whose subspaces can include Graph spaces.


nhuet force-pushed the gnn-sb3 branch 9 times, most recently from ba35de0 to 204c1ed, on November 26, 2024 16:20

nhuet force-pushed the gnn-sb3 branch 3 times, most recently from 993b819 to ecfd289, on December 5, 2024 09:18

nhuet force-pushed the gnn-sb3 branch from ecfd289 to a217d40 on December 12, 2024 20:35

nhuet mentioned this pull request Dec 13, 2024

Add possibility to GraphPPO for multi inputs with Dict spaces (including graphs) #446

Closed
